[Pitch] Generalize `AsyncSequence` and `AsyncIteratorProtocol`

Hello, Swift evolution!

I wrote up a dedicated pitch detailing solutions to two fundamental issues with AsyncSequence and AsyncIteratorProtocol: the bespoke-and-limited @rethrows attribute, and issues with Sendable checking when iterating over an AsyncSequence. This includes the typed throws adoption in AsyncSequence and AsyncIteratorProtocol described in [Pitch] Typed throws in the Concurrency module, and adoption of isolated parameters so that AsyncSequence and AsyncIteratorProtocol are polymorphic over both the thrown error type and actor isolation.

You can view the proposal draft on GitHub at swift-evolution/proposals/NNNN-generalize-async-sequence.md at generalize-async-sequence · hborla/swift-evolution · GitHub.

Please leave editorial feedback on the swift-evolution PR at Add a proposal to generalize `AsyncSequence` and `AsyncIteratorProtocol`. by hborla · Pull Request #2264 · apple/swift-evolution · GitHub.

I welcome your questions, thoughts, and other constructive feedback!

-Holly

24 Likes

First off: I’m very happy about this proposal, and feel a bit apologetic about the need for it, since the original design it’s correcting was partly my doing :sweat_smile: so thank you for that!

My main practical concern is making sure that the “always-inline next() to avoid executor hops” trick that AsyncBytes relies on still works. I don’t see any reason to believe it wouldn’t, but it’s been fragile to compiler changes in the past, so we should verify that it’s still ok.

2 Likes

The motivation is sound and requires little elaboration; this is in a sense 'just' fixing bugs. Important ones. It's great to see them addressed.

Next next

As with quite a number of recent pitches, there's this next2 problem. I'm not thrilled with the proposed proliferation of _newNext, betterNext, nextNewTechnology, etc. There's the inconsistency for a start, but more importantly it's messy to have these 'duplicate' methods, and very confusing to anyone coming in new to this code. On this point:

  1. Is there really no way to update the existing method in-place? I vaguely recall that there's a mechanism to keep emitting a compatibility version of a method while adopting a new one under a custom mangled name, or something like that? Especially since you can't fully benefit from the new & improved implementation unless your minimum deployment target is the cut-over version (5.11) anyway.
  2. If there must be a detached new method, can the name at least be somewhat suggestive of purpose if not also superiority? e.g. concurrencySafeNext or fullyTypedNext or whatever.

Relating to the first point, I see the pragmatic appeal of bridging the existing next to the new implementation by default, even though it violates Swift's safety goals in a [continued] buried way… and I don't see a way around that. It's worth noting that it's unfortunate, though.

Is there any opportunity in Swift 6 to 'flag day' a lot of this retroactively, removing all the deprecated stuff and returning the API to a clean state?

(any Actor)?

I don't mind it, other than the slightly awkward syntax (it seems like any Actor? conveys the same intent - maybe that syntax could be refined in a separate proposal?).

Conceptually it seems fine & appropriate to use nil to represent non-isolated use. Having a magic value (in essence) to signify that seems strictly less elegant and against the grain of optionals.

Tangentially, I'm still working on coming to terms with this method of dynamic actor affinity. It feels like it should be something more intrinsic to method calling (like generic parameters), than a parameter that can be arbitrarily specified in code (even if the intended common case is that the compiler populates it automagically).

1 Like

Thanks for taking this part of the Typed throws in the Concurrency module pitch. This is definitely one of the more complicated pieces of the pitch and I agree that having a dedicated pitch for it is very sensible.

Now to some feedback on the pitch.

mutating func nextElement(_ actor: isolated (any Actor)?) async throws(Failure) -> Element?

This is the first method where we are adopting the isolated parameter in the standard library if I am not mistaken. I don't think we have written any API design guidelines around the naming of isolated parameters and I find the proposed call site spelling a bit confusing:

var iterator: some AsyncIteratorProtocol<Int, Never> = ...
let element = await iterator.nextElement(nil)

Just from looking at the code it isn't obvious what nil means here. Similarly if I would be passing self if I am inside an actor. What if we add an external parameter label here e.g. nextElement(isolatedTo actor: ...)?
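To make the call-site question concrete, here is a minimal sketch, assuming a toolchain that accepts optional isolated parameters as pitched in this thread. CountdownIterator, Collector and collectNonisolated are hypothetical names for illustration, and the isolatedTo label is the spelling suggested above, not settled API.

```swift
// Hypothetical iterator using the labeled spelling suggested above.
// Assumption: the compiler accepts `isolated (any Actor)?` parameters.
struct CountdownIterator {
    var remaining: Int

    // Runs on the given actor, or on the generic executor when passed nil.
    mutating func nextElement(isolatedTo actor: isolated (any Actor)?) async -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

actor Collector {
    var seen: [Int] = []

    // Inside an actor, passing `self` keeps each step on this actor.
    func run() async -> [Int] {
        var iterator = CountdownIterator(remaining: 3)
        while let value = await iterator.nextElement(isolatedTo: self) {
            seen.append(value)
        }
        return seen
    }
}

// Outside any actor, `nil` explicitly requests nonisolated execution.
func collectNonisolated() async -> [Int] {
    var iterator = CountdownIterator(remaining: 2)
    var result: [Int] = []
    while let value = await iterator.nextElement(isolatedTo: nil) {
        result.append(value)
    }
    return result
}
```

With the label, both nextElement(isolatedTo: self) and nextElement(isolatedTo: nil) say what the argument is for, which a bare nextElement(nil) does not.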

To avoid silently allowing conformances that implement neither requirement, and to facilitate the transition of conformances from next() to nextElement(), we add a new availability rule where the witness checker diagnoses a protocol conformance that uses a deprecated, obsoleted, or unavailable default witness implementation. Deprecated implementations will produce a warning, while obsoleted and unavailable implementations will produce an error.

This is really cool and something that makes evolving protocols forward with minimal adopter friction amazing! @ktoso maybe we can use this trick for the SerialExecutor protocols as well once it landed.
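For readers who have not seen the mutual-default pattern, here is a minimal sketch of it. It assumes a synchronous stand-in protocol to keep things short; SketchIterator, LegacyCountdown, ModernCountdown and drain are hypothetical names, not standard library API. Whether a given compiler warns about the conformance that relies on the deprecated default depends on the witness-checker rule quoted above.

```swift
// Hypothetical, synchronous stand-in for AsyncIteratorProtocol.
protocol SketchIterator {
    associatedtype Element
    mutating func next() -> Element?          // old requirement
    mutating func nextElement() -> Element?   // new requirement
}

extension SketchIterator {
    // Default for the old requirement forwards to the new one.
    mutating func next() -> Element? { nextElement() }

    // Default for the new requirement forwards to the old one. It is
    // deprecated so that conformances relying on it can be diagnosed.
    @available(*, deprecated, message: "Implement nextElement() directly")
    mutating func nextElement() -> Element? { next() }
}

// An existing conformance that only knows about next().
struct LegacyCountdown: SketchIterator {
    typealias Element = Int
    var remaining: Int
    mutating func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

// A migrated conformance that implements only the new requirement.
struct ModernCountdown: SketchIterator {
    typealias Element = Int
    var remaining: Int
    mutating func nextElement() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

// Generic code calls the new requirement and works with both kinds.
func drain<I: SketchIterator>(_ iterator: I) -> [I.Element] {
    var iterator = iterator
    var result: [I.Element] = []
    while let element = iterator.nextElement() {
        result.append(element)
    }
    return result
}
```

A type that implemented neither method would satisfy both requirements through the defaults and recurse forever, which is the case the diagnostic is meant to catch.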

Once the concrete AsyncIteratorProtocol types in the standard library, such as Async{Throwing}Stream.Iterator, implement nextElement() directly, code that iterates over those concrete AsyncSequence types in an actor-isolated context may exhibit fewer hops to the generic executor at runtime.

This is quite an important part of the pitch that has a larger impact than just fewer hops, and IMO it deserves a bit more discussion. With SE-0338 we changed the execution semantics of non-isolated methods to eagerly hop to the global concurrent executor, since we noticed that users of Swift Concurrency were often hogging the MainActor with non-isolated methods. With this pitch we introduce the first method in the standard library where we aggressively try to stick to the current context's isolation; hence, we introduce a new area where developers might hog the MainActor.
Personally, I don't think this is a bad thing, but it might result in performance changes for applications once they recompile with a newer Swift version and the nextElement(_:) implementation gets picked by the compiler.

Note that the use of an existential type (any Actor)? means that embedded Swift would need to support class existentials in order to use nextElement().

Do we think this is a reasonable thing for Embedded Swift to support in the future? It would be sad if we would limit the possible usage of AsyncSequence for Embedded because of the usage of any Actor here.

Lastly, the original pitch of @ktoso and me included changes to the various async sequence algorithms in the standard library. Do we foresee any problems if we are only landing the protocol/compiler changes for typed throws but not adopt them in the algorithms?

4 Likes

I think @scanon may remember the details better than I can, but there are existing places in the standard library with default implementations that call each other where we've figured out how to use availability attributes to throw up the proper diagnostics, so it'd be good to check that we actually need to add a "new availability rule" here versus applying existing techniques.

3 Likes

Can't wait for this! So glad you're tackling this issue!

This detail sounds really useful to me, even outside the context of this pitch:

… to facilitate the transition of conformances from next() to nextElement(), we add a new availability rule where the witness checker diagnoses a protocol conformance that uses a deprecated, obsoleted, or unavailable default witness implementation.

I have my own use cases for this very feature. I'll be really glad that inheriting deprecated defaults will be diagnosed as a warning.

On the isolated actor parameter:
I see there are some overlapping ideas with @John_McCall's pitch "Inheriting the caller's actor isolation".
Do you plan to integrate more parts of that pitch? Like, to use = #isolation as a default argument for nextElement or even the @inheritsIsolation attribute? If not, could you enlighten us?

Could this pitch also touch on the problem of cooperative cancellation?
More on the topic - AsyncSequences and cooperative task cancellation

There is no way to avoid adding a new entry point because the method needs to accept a new parameter, and the typed throws ABI (not just the mangling) is different from the throws/rethrows ABI.

I'm absolutely open to alternative names for nextElement(). We could even just call it next() since it accepts a parameter and the original doesn't. If we do that, we'd need to make sure that if we ever do add a default argument for the isolated parameter, calls to next() in a context with Swift 5.11 availability resolve to the new one and not the old one.

Unfortunate indeed!

After writing this proposal, I'm more convinced than ever that requiring parentheses around optional opaque and existential types was a mistake. I think it should be a separate proposal, though. (And if anyone is interested in writing that proposal, I'll provide an implementation!)

Yeah, abstracting over actor isolation is inherently value dependent. @John_McCall describes this well in his proposal for inheriting actor isolation. Here's an excerpt from the detailed design section:

Since isolation is unavoidably value-dependent (an actor method is isolated to a specific actor reference, not just any actor of that type), polymorphism over it can't be expressed with just generics. The natural next choice is to just use a parameter of polymorphic type, such as (any Actor)?. This matches SE-0313's isolated parameter feature, except that isolated parameters are currently required to be non-optional actor types: either a concrete actor type or a protocol type which implies Actor. Generalizing this is straightforward and gives us the ability to make functions explicitly polymorphic over an arbitrary isolation.

While we're talking about the isolation inheritance pitch...

Yes, the generalization of isolated parameters came directly from John's proposal. I did that because it's separable from isolation inheritance, and because I think the implementation for this AsyncSequence and AsyncIteratorProtocol pitch is very close to being ready for review, while the full isolation inheritance implementation is not.

I do not plan to incorporate the rest of the isolation inheritance pitch into this one. If this pitch here moves into review, the isolated parameter generalization will just get subsetted out of the isolation inheritance pitch because it will have already been reviewed.

4 Likes

This is great to see. The non-sendability of (most) AsyncIteratorProtocol conformers remains one of the few issues with adopting Swift concurrency safety that does not have at least a somewhat reasonable answer.

I have a vague memory that when this issue came up in the context of region-based isolation, it was suggested that a solution might live in allowing next() to be marked as not merging its parameter/results with the generic context. Could you discuss a bit why you took this direction instead? Is it just that this additionally helps us avoid unnecessary hops, or are there other benefits?

1 Like

I think that's the best possible solution here, in lieu of the dream I have of upgrading the existing method in place. Long-term the old overload will eventually fade into obscurity (if not literally be removed). In the interim there is a little potential for confusion, as to which one should be preferred, but that's easily addressed with documentation (anyone implementing, or thinking of implementing, these methods will need to be reading the documentation anyway in order to understand the required semantics).

It's about time I tried writing a proposal rather than just critiquing others'. :laughing:

Send me a DM, if you like, and we'll figure out a plan (which is to say, you can outline for me what I need / you want me to do :smile:).

3 Likes

I had a bunch of questions that have already been asked ^_^ but I just wanted to say how excited I am for this. For me, this is the real problem solved by typed throws, and it will finally provide us with a path from Combine to AsyncSequence.

My biggest concern is the back-deployment option. I understand that it's technically very difficult, but if this ends up requiring iOS 18, it'll be 3+ years before I can actually use it. That's too long to be stuck on Combine for!

Are there other options which might allow back-deployment of more of the proposal? In particular, the ability to use any AsyncSequence<T, any Error> and any AsyncSequence<T, Never> back as far as possible would be invaluable.

1 Like

One thing I'd like to draw attention to is that I think we're going to need to perform some fairly major surgery on these types soon in order for them to support non-copyable (and non-escaping) element types.

Fundamentally, next() cannot return an element; it needs to allow its caller to borrow an element.

In the language we have, you'd express this using a closure (rough example for illustration purposes):

protocol AsyncIteratorProtocol {
  associatedtype Element: Sendable
  associatedtype Failure: Error

  mutating func next<T>(
    _ yield: (Element) async throws(T) -> IterationResult
  ) async throws(errorUnion(Failure, T))

  /* + isolation parameter in above */
}

enum IterationResult {
  // continue and break are keywords, so the case names need backticks.
  case `continue`
  case `break`
}

And then you'd write your iterator like so:

mutating func next<T>(
  _ yield: ([UInt8]) async throws(T) -> IterationResult
) async throws(errorUnion(Failure, T)) {

  var buffer = [UInt8]()
  while await populateBuffer(&buffer) {
    guard try await yield(buffer) == .continue else { break }
  }
}

And now that we're yielding borrows of the element rather than returning copies of it, the element can instead be a non-copyable type -- and even a non-escaping type like some kind of stack buffer.

Ultimately I'm sure we're going to want a real coroutine/generator interface in the language rather than literally adding an enum and asking people to write closure parameters everywhere, but this demonstrates the concept.
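For anyone who wants to try the control flow today, here is a minimal sketch that compiles without typed throws, assuming plain rethrows in place of the error-union notation. ChunkProducer, forEachChunk and countBytes are hypothetical names. It uses ordinary copyable arrays, so it only demonstrates the push-style shape of the API, not the borrowing itself.

```swift
enum IterationResult {
    case `continue`
    case `break`
}

// Hypothetical producer that hands each chunk to the caller's closure
// instead of returning it from a next()-style method.
struct ChunkProducer {
    var pending: [[UInt8]]

    mutating func forEachChunk(
        _ yield: ([UInt8]) async throws -> IterationResult
    ) async rethrows {
        while !pending.isEmpty {
            let chunk = pending.removeFirst()
            // Stop as soon as the consumer asks to break.
            guard try await yield(chunk) == .continue else { break }
        }
    }
}

// Consumes chunks until at least `limit` bytes have been seen, then
// reports the byte count and how many chunks were left untouched.
func countBytes(upTo limit: Int, in chunks: [[UInt8]]) async -> (bytes: Int, chunksLeft: Int) {
    var producer = ChunkProducer(pending: chunks)
    var bytes = 0
    await producer.forEachChunk { chunk in
        bytes += chunk.count
        return bytes >= limit ? .break : .continue
    }
    return (bytes, producer.pending.count)
}
```

The consumer decides when iteration stops by returning .break, which is the role the for-await loop body plays with a conventional iterator.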

Noncopyable types have already shipped, there is a pitch and early implementation for using them in generics, and I've seen early implementation work for non-escaping types start to land in the compiler. So this doesn't seem to me like a distant fantasy any more; it might actually be a realistic thing that happens in the near future.

And so if we're going to have to do this sort of major surgery soon anyway (e.g. possibly outright deprecating AsyncIteratorProtocol in favour of a generator interface), I wonder if it's really worth going through all of this now.

Can anybody illuminate the plans for supporting non-copyable and non-escaping types in async streams?

7 Likes

I appreciate the will to tackle this, but I feel like this proposal isn't the right place for this task. The base protocol is just the requirement on how to create custom async sequences, not on how their cancellation should behave. If anything, I think we should generally revisit all existing async sequences in terms of their cancellation in a standalone proposal.

3 Likes

I like the isolatedTo label. I agree that it's worth having some API design guidance around adopting isolated parameters, especially because I anticipate there will be other functions that will accept a parameter and do nothing with it other than isolate the function to it.

It's worth noting that freeing up actors was not the only motivation for SE-0338. The other major motivation was Sendable checking: without SE-0338, a non-Sendable parameter to a nonisolated function could cross an isolation boundary at a suspension point within that function, because the function could resume on some other executor, and so non-Sendable state could end up being inadvertently shared.

I agree that it's worth elaborating on the possible performance changes in the proposal. In some cases, eliminating unnecessary hops to the generic executor could end up improving performance despite having fewer suspension points when iterating over an AsyncSequence on the MainActor.

Also, if somebody does notice a performance problem in their AsyncSequence iteration, and they can force the source sequence to be in a disconnected region, they can explicitly call nextElement() and pass in nil so that nextElement() runs on the generic executor. I think that's an important aspect of this proposal, and the reason I chose to use an explicit isolated parameter instead of something like isolated(caller) from the actor inheritance proposal.

I think so, because any Actor is a class-constrained existential, and embedded Swift code that uses actors has to be in the allocating subset anyway.

I can't immediately think of any problems that wouldn't already exist if we adopted typed throws in the AsyncSequence algorithms as part of this proposal. We need to consider the impact on overload resolution in contexts where both the old and new overload are available. The overloads will also need to be tied to the availability of the Failure associated type regardless of whether they're added now or later.

We discussed this in the Language Steering Group meeting. The witness checker already diagnoses using default implementations that are unavailable or obsoleted, so the fact that deprecation isn't considered is just a compiler bug. I also learned that you can tie availability to language mode, so I think we want the default implementation of nextElement() to be unconditionally deprecated, and obsoleted in Swift 6.

There was originally a question in one of the isolation region discussions about whether that proposal would just allow the non-Sendable iterator to be passed across isolation boundaries for the next() call, but that only works if the iterator is in a disconnected region. That generally will not be true for non-Sendable async sequences unless they're created and used entirely locally, which is not how most code that uses AsyncSequence in an actor-isolated context works.

I do think that the future directions of region isolation may be useful for AsyncSequence and other concurrency APIs independent of this proposal. For example, today it's possible to use CheckedContinuation, AsyncStream.Continuation, etc to smuggle non-Sendable values across isolation boundaries using yield. We could require yield to accept a Sendable parameter, but that'd be extremely prohibitive. Instead, yield should probably accept a transferring parameter, which can be satisfied either by a Sendable value or a non-Sendable value in a disconnected region. If we do that, it may be possible to mark the result type of nextElement() as transferring / "returns isolated", which would make it valid to iterate over an AsyncSequence of non-Sendable values in an actor isolated context while passing nil to nextElement() because you explicitly want it to run on the generic executor.

But even if we could make all the concurrency warnings go away with just region isolation, that wouldn't solve the issue of excessive hops to the generic executor. For example, in cases where the computation for the next element is isolated to some other actor, next() would hop to the generic executor only to turn around and immediately hop to some other executor. I think isolating next() to the calling actor is the right default, while still providing tools for either isolating the element-producing closure, or explicitly isolating nextElement with the isolated argument.

None of us are happy with the back deployment restrictions in the proposal, but we have not come up with a solution that's actually feasible. Even if we did come up with a way to let you use primary associated types, they would have to be limited in what you can do with them. For example, you would not be able to dynamic cast to type any AsyncSequence<T, Never>.

5 Likes

Personally, in many cases this would be a fair tradeoff to make. It's just a gut feeling, but I don't think that casting into or from the existential is such a big need compared to just wrapping all sorts of async sequence algorithms into complex chains. Therefore I would second the exploration of potential back deployment as well.

By the way, as we are now introducing a concrete failure type, shouldn't we also provide a set of algorithms to transform the failure type (mapFailure etc.)?

Right but there might be an embedded system that wants to use AsyncSequences but not actors. With this change we are tying both together.

1 Like

Embedded systems that use concurrency (likely ones that have multiple cores) are much more likely to have enough RAM to afford heap allocations. If we saw fit to add existentials for reference types (including Actor), that could potentially be done without requiring all type metadata.

However those particular cases are rather limited in utility; where I would expect that many of those cases would be fine w/ that trade-off. This type of decision obviously should not be made trivially and can be deferred off the proposal such that we can make a better choice later on when that use case is better defined.

The only alternative I see is that we add two methods: one that takes some Actor, and another that takes some sort of nil/default argument to specify that isolation is not required. The hacky option (but perhaps smart... I haven't decided which one it is yet) is to have the overload of _ isolation: some Actor be _ : Never?, where the only valid argument is nil.
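A rough sketch of that two-overload idea, assuming the compiler accepts an isolated parameter of opaque actor type. Worker and isolationKind are hypothetical names, and the returned strings only exist to show which overload was chosen.

```swift
// Hypothetical actor used only as an argument.
actor Worker {}

// Overload 1: isolated to whichever actor is passed in.
func isolationKind(_ isolation: isolated some Actor) async -> String {
    "isolated to an actor"
}

// Overload 2: Never? has exactly one value, nil, so this overload can
// only be selected by passing nil and runs without actor isolation.
func isolationKind(_ isolation: Never?) async -> String {
    "nonisolated"
}
```

Calling isolationKind(Worker()) picks the first overload and isolationKind(nil) picks the second, so no existential is involved at the call site. The cost is that every such API needs a pair of entry points.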

I definitely don't need to use as? or as! for this to be useful. Basically, I want to be able to

  • write protocols where requirements return any AsyncSequence<T, E>,
  • use any AsyncSequence<T, E> as an async sequence, including iterating it, or transforming it (eg. with map)
  • write new async sequences (like AsyncMapSequence) which conditionally throw depending on whether their input sequences do

Though thinking about that, I don't know how dynamic the "opening the existential" operation actually is... maybe that's equivalent to a dynamic cast and this is all in the "impossible" basket?

Small catch on the (any Actor)? of nextElement(isolatedTo:) -- this should be (any AnyActor)?, because iteration inside a distributed actor should work the same way as in an actor. Or alternatively make it a generic parameter?

It's also tempting to go ahead and perhaps implement the #isolated being pitched in [Pitch] Inheriting the caller's actor isolation, or the moral equivalent of it, so we can spell the parameter as isolatedTo actor: A = #isolated (or whatever alternative name it might get). That way we don't have to invent a special way to summon the actor value and can rely on the # default parameter instead :thinking:

4 Likes