Chris Morgan

I’m a software developer, dealing mostly in web things.

I’m also (more importantly) a committed Christian—please talk to me at any time about that.

Elsewhere

Rust proposal: Result‐based iteration and for..else

This idea has been floating loosely about in the void between my ears for some time; I fleshed it out further in my mind in response to the “reader.lines() swallows io errors” thread but is certainly not tied down to just that scope.

Prelude: the Python iteration protocol

It will be of value for me to explain Python’s iteration protocol for those that do not know it, because I’ve borrowed certain parts of it later on.

Here are the syntax and relevant semantics of Python’s for loop:

for_stmt ::=  "for" target_list "in" expression_list ":" suite
["else" ":" suite]

The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list. The suite is then executed once for each item provided by the iterator, in the order of ascending indices. Each item in turn is assigned to the target list using the standard rules for assignments (see Assignment statements), and then the suite is executed. When the items are exhausted (which is immediately when the sequence is empty or an iterator raises a StopIteration exception), the suite in the else clause, if present, is executed, and the loop terminates.

A break statement executed in the first suite terminates the loop without executing the else clause’s suite. A continue statement executed in the first suite skips the rest of the suite and continues with the next item, or with the else clause if there was no next item.

The part I wish to draw out is the else clause. Its semantics are that it is executed iff the iterator is exhausted without the loop being broken out of.

for..else has been in Python since the first public release. The nameelse” may not be the nicest thing around, but the semantics are definitely useful.

My proposal involves taking this behaviour and improving upon it in a way that Python could not have due to its statement doctrine (statements and blocks do not have values) while Rust with its expression doctrine (blocks are statements, &c.) can. For simplicity, I will retain the use of the keyword else; please, no bikeshedding on whether it is the best keyword to use or not.

The first step: for..else in Rust, with Option semantics

This is the syntax and these are the relevant current semantics of the for loop:

for_expr : "for" pat "in" expr '{' block '}';

Here is the new syntax as it would be with the theft of Python’s for..else:

for_expr : "for" pat "in" expr '{' block '}'
[ "else" '{' block '}' ];

Now for the semantics. There are two ways this could be done; I will of course go for the second.

Firstly, we might have the simple approach, modelling Python’s technique:

use std::rand::random;
 
for i in range(4, 8) {
if random() {
println!("Breaking early");
break;
}
} else {
println!("Iteration finished");
}

But here in Rust we can start introducing an output value for the expression. First of all we must redefine the syntax of break; currently

break_expr : "break" [ lifetime ];

we will redefine it to

break_expr : "break" ( lifetime
| lifetime "," expr
| expr )? ;

with the restriction that the expr is only permitted where the break applies to a for loop, and it defaults to () if not specified (as with all blocks).

Now we can define the semantics of the for loop with regards to a to‐be‐resolved type U:

This is completely backwards‐compatible by virtue of U effectively defaulting to () (as E defaults to ()). However, for cases where E is not (), you will no longer be able to use just break; without there being an appropriate else clause which also outputs ().

The example of the simple technique will work just fine, but we can now also introduce a value to the else clause and the break expression:

use std::rand::random;
 
let n = for i in range(4, 8) {
if random() {
println!("Breaking early");
break i;
}
} else {
println!("Iteration finished");
17
}

This will break early leaving n the value 4, 5, 6 or 7, or finish iteration and leave n with the value 17.

Yes, this is a toy example, but it demonstrates how we can effectively use the expression orientation of Rust. Such a thing is not possible in Python.

As stated, this part of the proposal is completely backwards compatible. But while I believe this would be useful on its own, it is but the first step in my proposal. Let’s get on to the backwards‐incompatible parts.

The second step: the concept of Result‐based iteration

The other thing I’ve been wondering about is changing the semantics of our iteration and for loop from using Option<T> to using Result<T, E>.

The simple translation for existing code would be from Iterator<T> with next() -> Option<T> to Iterator<T, ()> with next() -> Result<T, ()>; after that, () can be replaced with error types as appropriate. For example, the Reader string split iterators would implement Iterator<~str, IoError>, thus immediately resolving the issue of reader.lines() swallowing I/O errors. (Since I started writing this article, Lines was changed to implement Iterator<IoResult<~str>>, which works, but is rather verbose; this would allow much more natural code to be written.)

This allows iterators to return a special value at the end, which may be useful. IoError is of course the first example that springs to mind, but it's not the only case where it would be valuable. Other iterators may be able to use this to advantage also.

Changes to the for loop

Result-based iteration depends upon for..else, as shown earlier, but with some further changes. Here are the changes, with regards to Iterator<T, E> and a to‐be‐resolved type U:

Unchanged things:

As mentioned, the else clause now needs to take a value; here is the for_expr grammar with a suitable amendment:

for_expr : "for" pat "in" expr '{' block '}'
[ "else" ( '|' pat '|' )? '{' block '}' ];

(The else pattern is surrounded by pipes to avoid grammatical ambiguity from the { caused by the possibility of struct matching. I’m not absolutely set on |; it’s open to bikeshedding just as the else keyword is.)

One could allow multiple else clauses and thus permit a refutable pattern, but for simplicity I have not done that; my proposal requires an irrefutable pattern. That’s probably stuffing too much into what should be a match. (It also makes it look more like exception handling in other languages, which might not be a good thing.)

Some examples

The semantics of my proposal can be seen at https://gist.github.com/chris-morgan/9105842.

Here are some examples of this at work:

for line in file.lines() {
// Process the line
} else |err| {
// Do something with that IoError
}

for msg in receiver.iter() {
// Process the message
} else {
// No need of a pattern here, Messages implements Iterator<T, ()>.
// The fact that we get here means the channel was closed.
}

See also the examples that follow in the proof of concept below.

A proof‐of‐concept, macro implementation with runnable examples

This implementation is not complete; certain parts of it can’t be implemented inside structural macros. But it should be enough to get the idea across.

#[feature(macro_rules)];
 
// There must be two variants of the macro for this simple prototype, each of
// them lacking one important feature and with one remaining feature completely
// infeasible in a macro_rules! macro. Done properly in the compiler these
// would be entirely reasonable.
 
// A variant with support for an output value from the else clause, but with no
// support for breaking out of the `for` block.
macro_rules! forr1 (
(for $p1:pat in $e1:expr $b1:expr else |$p2:pat| $b2:expr) => ({
// b1 and b2 should be block, but if they were then I would need to use
// `if true $b1` in the macro body and that would prevent non-() value.
// And sorry, but something must go between $p2 and $b2 or the latter's
// ``{`` may be perceived as part of the pattern
let mut e1 = $e1;
let mut out;
loop {
match e1.next2() {
Ok($p1) => $b1,
Err($p2) => {
out = $b2;
break;
},
}
}
out
})
)
 
// A variant with no support for an output value from the else clause, so that
// the `for` part *can* break.
macro_rules! forr2 (
(for $p1:pat in $e1:expr $b1:expr else |$p2:pat| $b2:expr) => ({
// b1 and b2 should be block, but if they were then I would need to use
// `if true $b1` in the macro body and that would prevent non-() value.
// And sorry, but something must go between $p2 and $b2 or the latter's
// ``{`` may be perceived as part of the pattern
let mut e1 = $e1;
loop {
match e1.next2() {
Ok($p1) => $b1,
Err($p2) => {
$b2;
break;
},
}
}
})
)
 
pub trait Iterator<T, E> {
fn next2(&mut self) -> Result<T, E>;
// Yeah, I've skipped size_hint and all the helper methods, and I haven't
// really considered the impact this would have on the helper methods and
// how they handle error types. I guess they'd need to have the same E
// until we get something like https://github.com/mozilla/rust/issues/8277.
}
 
/* I'd do this, but it would prevent other implementations of it
impl<T, I: ::std::iter::Iterator<T>> Iterator<T, ()> for I {
fn next2(&mut self) -> Result<T, ()> {
match self.next() {
Some(t) => Ok(t),
None => Err(()),
}
}
}*/
 
impl<A: Add<A, A> + Ord + Clone + ToPrimitive> Iterator<A, ()>
for ::std::iter::Range<A> {
fn next2(&mut self) -> Result<A, ()> {
match self.next() {
Some(a) => Ok(a),
None => Err(()),
}
}
}
 
fn main() {
println!("===== DEMO 1 =====");
demo1();
println!("===== DEMO 2 =====");
demo2();
println!("===== DEMO 3 =====");
demo3();
println!("====== DONE ======");
}
 
fn demo1() {
println!("That there for loop's else clause returned {}", forr1!(
for x in range(1, 3) {
println!("x = {}", x);
// Theoretically, `break 9;` would work here, provided the value is
// the same type as the else block's type. (A simple `break;` will
// of course *not* work when the else branch evaluates to non-().)
// In practice, tweaking the macro to allow it is… rather complex.
} else |()| {
println!("You didn't break!");
42
}
));
}
 
struct X {
steps: int,
value: int,
}
struct Done(int);
impl Iterator<int, Done> for X {
fn next2(&mut self) -> Result<int, Done> {
if self.steps == self.value {
Err(Done(self.steps))
} else {
self.value += 1;
Ok(self.value)
}
}
}
 
fn demo2() {forr2!(
for x in X { steps: 4, value: 1 } {
println!("x = {}", x);
if x == 3 {
println!("Breaking early");
break;
}
} else |Done(n)| {
println!("Iteration finished at {}", n);
}
)}
 
fn demo3() {forr2!(
for x in X { steps: 4, value: 1 } {
println!("x = {}", x);
} else |Done(n)| {
println!("Iteration finished at {}", n);
}
)}

I propose new semantics for the `for` loop with a Result-based iterator: iteration stops once an Err is returned by the iterator. The `for` statement evaluates to Result<O, E>; O could be defined as being () or we could even allow `break` to include a value and infer from that. I prefer the second of those options.

Another path that could be followed is something more along the lines of Python's for..else construct¹. (For simplicity I'll keep the "else" keyword—it's by no means set in concrete, but *please* do not bikeshed it yet.) The semantics would be much the same, but we would need to add a pattern in after the "else" keyword to bind the Err to (with Python it's just a simple StopIteration, no value—Guido went with the Option<T> path rather than the Result<T, E> path).

Now for some sample code for the several schemes:

`for` with `Result<T, E>` semantics and a `Result<(), E>` return value: if_ok!(for line in file.lines() { print!("{}", line); })

`for` with `Result<T, E>` semantics and an `else` block—return value uncertain, probably simplest as `()`: for line in file.lines() { print!("{}", line); } else err { return Err(err); }

(In that example, the `err` after `else` is a pattern, as though the else block were `match Err(x) { Ok(_) => unreachable!(), Err($pat) => $block }`.)

(the Ok . the block The semantics of the `for` loop would be that iteration stops as soon as an Err is returned, and then one of two things:

The advantage of this scheme is that it allows you to actually handle errors. Python has its for..else construct and has from the first public release.

¹ The for..else construct in Python has been there from the first public release. I don't much like the name `else` and have heard complaints from some others about it, but its semantics are very useful: the `else` suite is executed if the for loop is not broken out of, i.e. if the iterator raises StopIteration.

Comments? Questions? Corrections? If you want to contact me about anything in this post, write to me at @__chrismorgan or email me.