Introduction
If you ever tried to return an Iterator
from a function and had difficulties
with it, this article might help.
Let’s introduce a context first. Imagine for a minute you’re trying to define a trait to list the colors of anything. For example, you could implement this for vegetables, logos or even flags.
You’ll probably define the trait like this.
#[derive(Clone, Copy, Eq, Hash)]
struct Color(u8, u8, u8);
trait Colors {
fn colors(&self) -> Vec<Color>;
}
Very simple right? But what about the zero-cost abstraction that Rust
embraces. Why would we want to return a Vec<Color>
if when we use it, we only
need the first one for example? No need to allocate memory space for an entire
Vec
.
First try: return an Iterator
type
Fine, let’s use an
Iterator
instead:
the one returned by
Vec::iter()
.
trait Colors<'a> {
fn colors(&'a self) -> std::slice::Iter<'a, Self>;
}
At this point, we are enforcing which type of Iterator
any implementor of
Colors
would need to use. If you don’t see the problem, let’s understand what
is wrong with it before going forward.
Imagine the following implementation for struct Flag
.
struct Flag {
country: String,
colors: Vec<Color>,
}
impl Colors<'a> for Flag {
fn colors(&'a self) -> std::slice::Iter<'a, Color> {
self.colors.iter()
}
}
Everything looks fine so far. But why do we use a Vec<Color>
for the list of
colors. We shouldn’t have twice the same color in this list so let’s replace
Vec<Color>
with HashSet<Color>
(Eq
and Hash
are implemented on Color
).
You will get the following compilation error because self.colors.iter()
doesn’t produce a std::slice::Iter
but a std::collections::hash_set::Iter
.
error[E0308]: mismatched types
--> src/lib.rs:15:9
|
15 | self.colors.iter()
| ^^^^^^^^^^^^^^^^^^ expected struct `std::slice::Iter`, found struct `std::collections::hash_set::Iter`
|
= note: expected struct `std::slice::Iter<'a, Color>`
found struct `std::collections::hash_set::Iter<'_, Color>`
Second try: associated type
Let’s change a bit our trait definition so the implementor can decide the
Iterator
type.
trait Colors<'a> {
type ColorsIter: Iterator<Item = &'a Color>;
fn colors(&'a self) -> Self::ColorsIter;
}
And then the implementation would become the following.
impl<'a> Colors<'a> for Flag {
type ColorsIter = std::collections::hash_set::Iter<'a, Color>;
fn colors(&'a self) -> Self::ColorsIter {
self.colors.iter()
}
}
Perfect, we’re done. Are we, really? Something itches me. Indeed, we basically
push the complexity to the implementor. If I could, I’d like to have an API as
simple as possible. On top of that, the complexity here might actually become
close to a nightmare. Let’s say for example that Colors
trait is documented as
follow “list all the colors of an object
(black
is not considered a color)”.
But you’re the implementor of struct Flag
and you do not really care of the
consideration of some scientists to consider that black is not a color: some
flags contains black and you have to store it. But that’s fine, you can still
implement it by filtering out the black.
impl<'a> Colors<'a> for Flag {
type ColorsIter = std::collections::hash_set::Iter<'a, Color>;
fn colors(&'a self) -> Self::ColorsIter {
self.colors
.iter()
.filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0)
}
}
Here is the compilation error you’ll get.
error[E0308]: mismatched types
--> src/lib.rs:17:9
|
16 | fn colors(&'a self) -> Self::ColorsIter {
| ---------------- expected `std::collections::hash_set::Iter<'_, Color>` because of return type
17 | / self.colors
18 | | .iter()
19 | | .filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0)
| |_________________________________________________________________________^ expected struct `std::collections::hash_set::Iter`, found struct `std::iter::Filter`
|
= note: expected struct `std::collections::hash_set::Iter<'_, _>`
found struct `std::iter::Filter<std::collections::hash_set::Iter<'_, _>, [closure@src/lib.rs:19:21: 19:73]>`
Of course, using .filter()
doesn’t get you
a std::collections::hash_set::Iter
anymore but a std::iter::Filter
. Lucky
for you, the compiler is nice and tell you the type you need to use… or is it?
Did you notice the part that looks like [closure@src/lib.rs:19:21: 19:73]
?
How do you define that?
Either you’re experienced enough and know how to get out of this situation, or
you’re not and feel like maybe, you’re not going to use this Colors
trait
afterall (read fourth try below, it might
help!). In any case, I hope you’re convinced our API leaked some bad complexity
to the implementor. Can we do better?
Third try: impl Iterator
Since Rust 1.26, we
can use the
impl
keyword in function parameters and return types. Let’s try it.
trait Colors<'a> {
fn colors(&'a self) -> impl Iterator<Item = &'a Color>;
}
impl<'a> Colors<'a> for Flag {
fn colors(&'a self) -> impl Iterator<Item = &'a Color> {
self.colors
.iter()
.filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0)
}
}
Well, here is what the compiler will tell you.
error[E0562]: `impl Trait` not allowed outside of function and inherent method return types
--> src/lib.rs:6:28
|
6 | fn colors(&'a self) -> impl Iterator<Item = &'a Color>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
And indeed, if we jump to the RFC1522 that introduced this nice behavior, there is a limitation (might be lifted in the future but not now).
`impl Trait` may only be written within the return type of a freestanding or inherent-impl function, not in trait definitions or any non-return type position. They may also not appear in the return type of closure traits or function pointers, unless these are themselves part of a legal return type.
Fork! It seems like the ideal solution but is not yet supported in the language.
Fourth try: Box<dyn Iterator>
Associated types were not a nice solution because we let the implementor deal
with the complexity. But maybe there is a solution not so complex for the
implementor. We know we want to return something that implements
Iterator
. Trait-objects
might be the solution here.
Let’s get back to our previous definition using an associated type with a little tweak.
trait Colors<'a> {
type ColorsIter: Iterator<Item = &'a Color> + 'a;
fn colors(&'a self) -> Self::ColorsIter;
}
Note that since we iterate over references of colors, the Iterator
itself must
live at least as long as these references, which explains why we need to add +
'a
.
Then we now can use some boxed trait-object to solve our problems.
impl<'a> Colors<'a> for Flag {
type ColorsIter = Box<dyn Iterator<Item = &'a Color> + 'a>;
fn colors(&'a self) -> Self::ColorsIter {
Box::new(
self.colors
.iter()
.filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0),
)
}
}
This works just fine. Using trait-object, the implementor don’t need to know the
exact Iterator
type, just that it implements Iterator
.
Fifth try: wrapping Box<dyn Iterator>
Are we not done here? Well, here is me splitting hairs. Sure we’ve found
something manageable but can we simplify a bit the life of a potential
implementor by hiding some of these Box
and dyn
? How about we create a type
to hide this complexity.
pub struct DynIter<'iter, V> {
iter: Box<dyn Iterator<Item = V> + 'iter>,
}
And then we need to implement Iterator
on it.
impl<'iter, V> Iterator for DynIter<'iter, V> {
type Item = V;
fn next(&mut self) -> Option<Self::Item> {
self.iter.next()
}
}
Let’s also create a new()
method to remove some of the boilerplate of
creating a DynIter
.
impl<'iter, V> DynIter<'iter, V> {
pub fn new<I>(iter: I) -> Self
where
I: Iterator<Item = V> + 'iter,
{
Self {
iter: Box::new(iter),
}
}
}
And here we are, no more dyn
or Box
.
impl<'a> Colors<'a> for Flag {
type ColorsIter = DynIter<'a, &'a Color>;
fn colors(&'a self) -> Self::ColorsIter {
DynIter::new(
self.colors
.iter()
.filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0),
)
}
}
Conclusion
Since I’ve found out that the above solution could be reusable, I actually made
a crate out of it: dyn-iter
. Please
report issue or improvement in the
project repository.
I hope you learned something. At least, I’ve learned a lot by trying to make all of this work. I’m sure there is still some hidden corners that I don’t master yet. If you found any mistake in what I’m saying, or some ideas to improve this blog post, please let me know on the blog repository.
Take second try and third try, shake, and here we go
After the first publication of this article,
chrysn provided some
more information about
RFC2515
that actually brings the impl
keyword into associated types. At the time of
writing [1], this is already available in
nightly compiler.
#![feature(type_alias_impl_trait)]
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Color(u8, u8, u8);
struct Flag {
country: String,
colors: Vec<Color>,
}
trait Colors<'a> {
type ColorsIter: Iterator<Item = &'a Color>;
fn colors(&'a self) -> Self::ColorsIter;
}
impl<'a> Colors<'a> for Flag {
type ColorsIter = impl Iterator<Item = &'a Color>;
fn colors(&'a self) -> Self::ColorsIter {
self.colors
.iter()
.filter(|color| color.0 != 0 && color.1 != 0 && color.2 != 0)
}
}
Once this is stable, I’ll pretty much forget about
dyn-iter
crate and use this new
solution instead.