[Pitch] Clarify end-of-iteration behavior for AsyncSequence

nnnnnnnn · March 10, 2021, 6:53pm

Hi all —

Here’s a pitch to codify end-of-iteration behavior for AsyncSequence and its iterators to match the existing Sequence protocol.

Introduction

The recent proposal for asynchronous sequences left open a question about how asynchronous iterators must behave when reaching the end of their elements. We should amend the proposal to clarify that after returning nil or throwing an error from an iterator’s next() method, all subsequent calls to next() must return nil.

Motivation

As currently written, once an asynchronous iterator terminates iteration, the behavior of any future calls to next() is unspecified and up to the implementation of any particular iterator. This isn’t ideal, as it makes it harder to use asynchronous iterators in a generic context and can make it impossible to know whether an iterator has exhausted its elements.

The existing (synchronous) IteratorProtocol includes a requirement that once an iterator returns nil from its next() method, all subsequent calls to next() must also return nil. This requirement was added in SE-0052, eliminating a precondition on the next() method that it must not be called after returning nil.

This requirement is quite important, as there’s no way to tell if an iterator has finished iterating other than calling its next() method. For an example of this in practice, let’s look at the UnsafeMutableBufferPointer.initialize(from:) method, which initializes a buffer from a sequence and returns an iterator to any elements of the sequence that didn’t fit in the buffer. As the caller of this method, your only way to know whether all the sequence’s elements were written into memory is to call next() on the returned iterator. Without the “forever nil” guarantee on the iterator, the initialize(from:) method would need to try to consume an additional unneeded element, and return more information to the caller.
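As a rough illustration of that caller-side check, here is a minimal sketch. The helper name `copyIntoBuffer` and its return labels are made up for this example; only `initialize(from:)` and `next()` come from the standard library.

```swift
// Hypothetical helper (not a standard library API): copies `source` into a
// temporary buffer and reports how many elements were written and whether
// the whole sequence fit.
func copyIntoBuffer<S: Sequence>(
    _ source: S,
    capacity: Int
) -> (written: Int, everythingFit: Bool) where S.Element == Int {
    let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: capacity)
    defer { buffer.deallocate() }

    // initialize(from:) returns an iterator over whatever did not fit,
    // plus the index just past the last element written.
    let (unwritten, written) = buffer.initialize(from: source)
    var leftover = unwritten

    // The only way to learn whether anything was left over is to ask for
    // one more element. If the source was already exhausted, this relies on
    // next() continuing to return nil.
    let everythingFit = leftover.next() == nil

    // Balance the initialization before the buffer is deallocated.
    _ = buffer.baseAddress?.deinitialize(count: written)
    return (written, everythingFit)
}

// Five elements into a three-slot buffer: two are left over.
let partial = copyIntoBuffer(sequence(first: 1) { $0 < 5 ? $0 + 1 : nil }, capacity: 3)
print(partial)  // (written: 3, everythingFit: false)

// Exactly three elements: the returned iterator yields nil.
let exact = copyIntoBuffer(sequence(first: 1) { $0 < 3 ? $0 + 1 : nil }, capacity: 3)
print(exact)    // (written: 3, everythingFit: true)
```

The example sources are built with `sequence(first:next:)` so that they do not report an element count up front; that choice is only to keep the sketch simple.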

These same kinds of issues will affect async sequences and iterators, with the added complexity of throwing as a way of ending iteration.

Proposed Solution

To clarify this, we should require that AsyncIteratorProtocol carry the same requirement as IteratorProtocol, extended to treat throwing an error the same as returning nil. That is, once an async iterator has terminated iteration by throwing an error or returning nil, all subsequent calls to next() must return nil.
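To make the rule concrete, here is a small sketch of an iterator that follows it. `FailingCounter`, `CounterError`, and `drainFailingCounter` are hypothetical names invented for this illustration, not part of any proposal or library.

```swift
// Hypothetical error type for the illustration.
struct CounterError: Error {}

// Hypothetical iterator: yields 0, 1, ... and throws once when it reaches
// `failAt`. After that throw, every call to next() returns nil.
struct FailingCounter: AsyncIteratorProtocol {
    private var current = 0
    private let failAt: Int
    private var finished = false

    init(failAt: Int) { self.failAt = failAt }

    mutating func next() async throws -> Int? {
        if finished { return nil }      // terminated earlier: nil forever
        if current == failAt {
            finished = true             // throwing counts as termination
            throw CounterError()
        }
        defer { current += 1 }
        return current
    }
}

// Calls next() five times and records what each call did.
func drainFailingCounter() async -> [String] {
    var iterator = FailingCounter(failAt: 2)
    var log: [String] = []
    for _ in 0..<5 {
        do {
            if let value = try await iterator.next() {
                log.append("\(value)")
            } else {
                log.append("nil")
            }
        } catch {
            log.append("error")
        }
    }
    return log
}
// drainFailingCounter() produces ["0", "1", "error", "nil", "nil"]
```

The error is thrown exactly once; a generic caller that keeps calling next() afterwards sees the same "finished" signal it would get from an iterator that ended normally.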

Detailed design

This change adds minimal complexity to iterator implementations. Iterators that wrap an upstream iterator can generally rely on their upstream iterator’s adherence to this requirement to get the correct behavior for free. If an iterator terminates iteration before its upstream source, it will only need to track that termination as state and prevent returning more elements from future calls to next().
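A sketch of both cases in one wrapper, assuming a prefix-while style iterator. `PrefixWhileIterator` and `demoPrefixWhile` are hypothetical names for this example; the upstream is an `AsyncStream` only for convenience.

```swift
// Hypothetical wrapper: passes elements through while `predicate` holds.
struct PrefixWhileIterator<Base: AsyncIteratorProtocol>: AsyncIteratorProtocol {
    typealias Element = Base.Element

    private var base: Base
    private let predicate: (Base.Element) -> Bool
    private var finished = false

    init(_ base: Base, while predicate: @escaping (Base.Element) -> Bool) {
        self.base = base
        self.predicate = predicate
    }

    mutating func next() async throws -> Base.Element? {
        // We may have stopped before the upstream did, so this flag is the
        // extra state the wrapper has to carry.
        if finished { return nil }

        // If the upstream returns nil or throws, no flag is needed here:
        // the upstream itself promises nil on later calls.
        guard let element = try await base.next() else { return nil }

        if predicate(element) { return element }
        finished = true
        return nil
    }
}

func demoPrefixWhile() async throws -> [Int?] {
    let upstream = AsyncStream<Int> { continuation in
        for i in 1...5 { continuation.yield(i) }
        continuation.finish()
    }
    var iterator = PrefixWhileIterator(upstream.makeAsyncIterator(), while: { $0 < 3 })
    var results: [Int?] = []
    for _ in 0..<4 {
        let value = try await iterator.next()
        results.append(value)
    }
    return results
}
// demoPrefixWhile() produces [1, 2, nil, nil], even though the upstream
// still holds 4 and 5 after the predicate fails on 3.
```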

When an iterator wraps a closure, that additional state can frequently be handled by marking the closure as optional. This doesn’t require any additional storage, and allows the closure to be set to nil upon termination, freeing up any resources that may have been captured.
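The optional-closure approach might look roughly like this. `ClosureIterator`, `CallCounter`, `ClosureDemoError`, and `demoClosureIterator` are hypothetical names for the sketch.

```swift
// Hypothetical closure-backed iterator. The closure is stored as an
// optional, and "terminated" is represented by that optional being nil.
struct ClosureIterator<Element>: AsyncIteratorProtocol {
    private var produce: (() async throws -> Element?)?

    init(_ produce: @escaping () async throws -> Element?) {
        self.produce = produce
    }

    mutating func next() async throws -> Element? {
        // Already terminated: the closure is gone, so answer nil.
        guard let produce = produce else { return nil }
        do {
            if let element = try await produce() { return element }
            self.produce = nil   // normal end: release captured resources
            return nil
        } catch {
            self.produce = nil   // error end: same cleanup, then rethrow
            throw error
        }
    }
}

struct ClosureDemoError: Error {}
final class CallCounter { var calls = 0 }

func demoClosureIterator() async -> (log: [String], calls: Int) {
    let counter = CallCounter()
    var iterator = ClosureIterator<Int> {
        counter.calls += 1
        if counter.calls == 3 { throw ClosureDemoError() }
        return counter.calls
    }
    var log: [String] = []
    for _ in 0..<5 {
        do {
            if let value = try await iterator.next() {
                log.append("\(value)")
            } else {
                log.append("nil")
            }
        } catch {
            log.append("error")
        }
    }
    return (log, counter.calls)
}
// demoClosureIterator() produces
// (log: ["1", "2", "error", "nil", "nil"], calls: 3):
// the closure is never invoked again after it throws.
```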

Source compatibility

None — no async iterators have shipped yet.

Effect on ABI stability

None.

Effect on API resilience

None.

Alternatives considered

None.

jawbroken · March 19, 2021, 2:38pm

Seems uncontroversial to me, and from the lack of other replies I would guess that is generally true. I don't see a good reason why the guarantee should be different for AsyncIteratorProtocol and IteratorProtocol.

ktoso · March 19, 2021, 2:42pm

Oh, I somehow totally missed this pitch, thanks for digging it up, @jawbroken.

+100, absolutely agree on the proposed amendment! I’m shocked to realize it isn’t so already to be honest.

It is also an assumption that the TaskGroup conformance to AsyncSequence is running under.

It is also a strong requirement for any reactive-streams-style stream implementation (Combine is one of them) to be able to tear things down as they “finish” (with error or completion).

Michael_Ilseman · March 19, 2021, 4:50pm

Strong +1 from me.

Srdan_Rasic · March 20, 2021, 10:25pm

Definitely +1.

asdf · March 21, 2021, 5:43am

What is the reason to throw an error instead of returning Result?
Why not return Result?

benrimmington · March 27, 2021, 7:53pm

The first sentence (quoted above) seems to imply that all subsequent calls can also throw an error.
Should the proposed solution be amended as follows?

… once an async iterator has terminated iteration by throwing an error or returning nil, all subsequent calls to next() must throw an error or return nil.

nnnnnnnn · March 29, 2021, 6:09am

Ah, no — the clarification should be in the other direction: