[Pitch] Typed throws in the Concurrency module

ktoso · November 9, 2023, 7:08am

Sadly it is a pretty annoying special case, so we'd like to keep those to a minimum, but doing so for a core type like AsyncSequence would justify the pain if we have to... Especially if we could at the same time fix some isolation issues that currently prevent AsyncSequence from being used properly inside actors (due to the iterator not being Sendable; however, perhaps this may be solved by send-not-sendable semantics).

The difficult thing here is getting the level of deprecations "just right", and we struggled with this for quite a while with Executor -- we'd want to issue warnings, but not so many that people with otherwise good implementations get warnings they cannot resolve, etc.

We're getting to the third instance of such a thing now with "next2", so it might be time to generalize, but I don't think this would be a public feature, just an internal attribute really -- at least for now.

KeithBauerANZ · December 21, 2023, 3:37am

The async sequence combinators currently always throw the Failure type of the receiving sequence. I think there should be a second variant which allows introducing a failure type where there was not already one (this mirrors similar cases in Combine, for example):

// proposed above
  public func map<Transformed>(
    _ transform: @Sendable @escaping (Element) async throws(Failure) -> Transformed
  ) -> some AsyncSequence<Transformed, Failure>

// we should also add
  public func map<Transformed, NewFailure>(
    _ transform: @Sendable @escaping (Element) async throws(NewFailure) -> Transformed
  ) -> some AsyncSequence<Transformed, NewFailure> where Failure == Never

(unfortunately this explodes things with the current idea to offer multiple overloads to allow mixing Sendable and opaque result types, leaving 4 overloads for most of these functions. Perhaps it would be better to return to explicitly named result types to allow collapsing the Sendable variants?)

pyrtsa · December 21, 2023, 4:55am

I think it'd be best to achieve this not by introducing overloads but the equivalent of setFailureType(to:) for Never-throwing async sequences. (Or a mapFailure for transforming error types from one, e.g. Never, to another.)

KeithBauerANZ · December 21, 2023, 9:26pm

I think both those are important things to add too, but I think it'd be quite unexpected that eg.

AsyncStream(sequence: [1, 3]) // a trivial enough extension to define...
    .map { i in
        if i % 2 == 0 {
            throw IHateEvenNumbers()
        }
        return i
    }

should be rejected by the compiler. And I don't think that many people would think to look to

AsyncStream(sequence: [1, 3])
    .setFailureType(to: IHateEvenNumbers.self)
    .map { i in
        if i % 2 == 0 {
            throw IHateEvenNumbers()
        }
        return i
    }

as the solution, when they do hit that rejection!

1-877-547-7272 · December 22, 2023, 6:28am

Furthermore, a problem that has been brought up is the @discardableResult annotation on the initializers. This annotation silences any warning if the resulting Task is not stored; however, the only cases where this really makes sense are Task<Void, Never> or Task<Never, Never> . We propose to add the following new initializers

While it would be nice to have @discardableResult for both Task<Void, Never> and Task<Never, Never>, it should be noted that overloading a function/initializer with a function type returning Void or Never can cause errors due to ambiguity.

func f(_: () -> Void) {}
func f(_: () -> Never) {}
f { while true {} }
// error: ambiguous use of 'f'

In order to avoid breaking currently-working code, the Task<Never, Never> initializer will have to be marked as @_disfavoredOverload. In addition, to prevent the type checker from choosing them over the Task<Never, Never> initializer, all of the non-discardable-result initializers will have to be marked @_disfavoredOverload as well. The same goes for the detached methods.

Or, alternatively, the type checker could be changed and given a new rule to choose a favored overload in cases like this. It should be noted that for cases without overloads like this:

let x = { while true {} }

the type checker already defaults to giving x the type () -> Void.
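A minimal sketch of how the disfavoring would play out, assuming the underscored @_disfavoredOverload attribute keeps behaving as it does in current toolchains (it is not an official language feature). Here f is only a stand-in for the Task initializers, and it returns a label purely so the selected overload can be observed:

```swift
// Stand-in for the initializer taking a Void-returning operation.
func f(_ body: () -> Void) -> String { "Void" }

// Stand-in for the initializer taking a Never-returning operation.
// Disfavoring it lets the type checker break the tie toward the Void overload.
@_disfavoredOverload
func f(_ body: () -> Never) -> String { "Never" }

// The closure is never invoked; only overload resolution matters here.
let chosen = f { while true {} }

// A closure explicitly annotated as returning Never still reaches
// the disfavored overload, because it cannot match () -> Void.
let explicit = f { () -> Never in fatalError("not invoked in this sketch") }
```

Without the attribute, the first call is the ambiguous one shown above.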

pyrtsa · December 22, 2023, 7:40am

KeithBauerANZ:

I think it'd be quite unexpected that eg.

AsyncStream(sequence: [1, 3]) // a trivial enough extension to define...
    .map { i in
        if i % 2 == 0 {
            throw IHateEvenNumbers()
        }
        return i
    }

should be rejected by the compiler.

Oh yes, that's problematic indeed. That made me wonder if instead of limiting the API to the Failure type of self, it would be possible to do a type conversion on failure types as well; so instead of the proposed new map(_:) API

extension AsyncSequence {
  public func map<Transformed>(
    _ transform: @Sendable @escaping (Element) async throws(Failure) -> Transformed
  ) -> some AsyncSequence<Transformed, Failure>
}

we could get back

  ) -> some AsyncSequence<Transformed, errorUnion(Failure, NewFailure)>

I'm not sure if the errorUnion(E1, E2, ..., EN) type function introduced by SE-0413 was actually meant to be usable in type expressions, but here it would be useful IMO [1].

And I don't think that many people would think to look to
AsyncStream(sequence: [1, 3])
    .setFailureType(to: IHateEvenNumbers.self)
    .map { i in ... }
as the solution, when they do hit that rejection!

Can't disagree on that either!

[1] As I understood it, the semantics of errorUnion were meant to be:
(1) errorUnion(E, E) = E
(2) errorUnion(E, Never) = E
(3) errorUnion(E1, E2) = any Error where E1 != E2
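errorUnion cannot be written in source, but the same three rules can be observed through the error type inferred for a do...catch block. The following is a sketch under that assumption, for a Swift 6 toolchain; ParseError, NetworkError and the helper functions are hypothetical names made up for illustration:

```swift
struct ParseError: Error {}
struct NetworkError: Error {}

func parse(failing: Bool) throws(ParseError) {
  if failing { throw ParseError() }
}

func fetch(failing: Bool) throws(NetworkError) {
  if failing { throw NetworkError() }
}

// Does not throw, i.e. its error type is effectively Never.
func log() {}

// (1) errorUnion(E, E) = E: `error` is a ParseError, so it can be
// rethrown from a throws(ParseError) function without a cast.
func parseTwice() throws(ParseError) {
  do {
    try parse(failing: false)
    try parse(failing: true)
  } catch {
    throw error
  }
}

// (2) errorUnion(E, Never) = E: the non-throwing call changes nothing.
func parseThenLog() throws(ParseError) {
  do {
    log()
    try parse(failing: true)
  } catch {
    throw error
  }
}

// (3) errorUnion(E1, E2) = any Error: two distinct error types are
// erased, so this function can only be declared as plain `throws`.
func parseThenFetch() throws {
  do {
    try parse(failing: false)
    try fetch(failing: true)
  } catch {
    throw error
  }
}
```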

Douglas_Gregor · January 2, 2024, 10:02pm

Hey all,

I took a stab at implementing the AsyncSequence part of this proposal, to assess whether it's possible to introduce typed throws in a manner that is both backward compatible and achieves the composability we want, with support for any AsyncSequence<Element, Failure> and such. The implementation in the compiler and library is in this pull request, along with toolchains to play with, but the salient details are below.

tl;dr we can stage in the new Failure associated type without breaking existing code, and the unfinished/unofficial @rethrows can be removed over time.

With my implementation, AsyncSequence and AsyncIteratorProtocol both get Failure associated types and adopt primary associated types, as in the proposal. AsyncIteratorProtocol gets a new function requirement _nextElement() that is a typed-throws version of next():

protocol AsyncIteratorProtocol<Element, Failure> {
  associatedtype Element
  associatedtype Failure: Error = any Error
  mutating func next() async throws -> Element?
  mutating func _nextElement() async throws(Failure) -> Element?
}

public protocol AsyncSequence<Element, Failure> {
  associatedtype AsyncIterator: AsyncIteratorProtocol
  associatedtype Element where AsyncIterator.Element == Element
  associatedtype Failure = AsyncIterator.Failure where AsyncIterator.Failure == Failure
  func makeAsyncIterator() -> AsyncIterator
}

Because existing AsyncIteratorProtocol-conforming types only implement next(), we need to provide a default implementation of _nextElement:

extension AsyncIteratorProtocol {
  /// Default implementation of `_nextElement()` in terms of `next()`, which is
  /// required to maintain backward compatibility with existing async iterators.
  public mutating func _nextElement() async throws(Failure) -> Element? {
    do {
      return try await next()
    } catch {
      throw error as! Failure
    }
  }
}

I didn't also implement next() in terms of _nextElement(), but we'd want to do that so that new async sequences could implement just _nextElement().
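To make the "implement either one" idea concrete, here is a small sketch using a synchronous toy protocol rather than the real AsyncIteratorProtocol. TypedIterator, nextElement() without the underscore, and the two conforming types are hypothetical names; the associated types are spelled out explicitly so the sketch does not depend on inference:

```swift
// Toy, synchronous stand-in for AsyncIteratorProtocol.
protocol TypedIterator<Element, Failure> {
  associatedtype Element
  associatedtype Failure: Error = any Error
  mutating func next() throws -> Element?
  mutating func nextElement() throws(Failure) -> Element?
}

extension TypedIterator {
  // Lets new conformers implement only nextElement().
  mutating func next() throws -> Element? {
    try nextElement()
  }

  // Lets existing conformers keep implementing only next().
  mutating func nextElement() throws(Failure) -> Element? {
    do {
      return try next()
    } catch {
      throw error as! Failure
    }
  }
}

// "Old" conformer: only next(), Failure stays at the default.
struct LegacyCounter: TypedIterator {
  typealias Element = Int
  typealias Failure = any Error
  var remaining = [1, 2]
  mutating func next() throws -> Int? {
    remaining.isEmpty ? nil : remaining.removeFirst()
  }
}

struct Exhausted: Error {}

// "New" conformer: only the typed-throws requirement.
struct ModernCounter: TypedIterator {
  typealias Element = Int
  typealias Failure = Exhausted
  var remaining = [1]
  mutating func nextElement() throws(Exhausted) -> Int? {
    guard !remaining.isEmpty else { throw Exhausted() }
    return remaining.removeFirst()
  }
}
```

One caveat of this arrangement: a conformer that implements neither method still compiles, and the two defaults then call each other forever.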

Now, one of the harder problems is how to get the right Failure type for existing async sequences. If the async sequence doesn't get recompiled, it'll get the default Failure type of any Error at runtime. This is fine---either it doesn't throw anything in practice, or its clients will see the any Error instance.

When the async sequence does get recompiled, we want to pick an appropriate Failure type even when there is no explicitly-specified one. I ended up using the following inference logic based on the next() implementation:

If next() throws nothing, Failure is inferred to be Never.
If next() throws, Failure is inferred to be any Error.
If next() rethrows, Failure is inferred to be T.Failure, where T is the first type parameter with a conformance to either AsyncSequence or AsyncIteratorProtocol. If there are multiple such requirements, take the errorUnion of them all.

The async for..in loop switches from using next() to using _nextElement(), so iteration over an async sequence throws its Failure type. This subsumes the specialized behavior for @rethrows (if Failure is Never, you don't need the try because nothing is thrown), and gives us typed-throws behavior for iteration.

@rethrows protocols had another bit of special behavior, which is that conformance requirements to @rethrows protocols can be considered as sources of errors for rethrowing. So, you can currently write a rethrows function like this:

extension AsyncSequence {
  func contains(_ predicate: (Element) async throws -> Bool) rethrows -> Bool { ... }
}

and this function can throw if either the AsyncSequence throws (i.e., its Failure type is not Never) or if the predicate throws. @pyrtsa noted this issue. I've partially addressed the problem by introducing a specific rule that allows requirements on AsyncSequence and AsyncIteratorProtocol to be involved in rethrows checking, so existing code that uses rethrows in this manner with async sequences will continue to work.

However, that doesn't address the fact that we can't write a proper typed throws signature for contains. As @pyrtsa noted, we could elevate errorUnion to an actual type in the type system, so we could write, e.g.:

extension AsyncSequence {
  func contains<E: Error>(_ predicate: (Element) async throws(E) -> Bool) throws(errorUnion(Failure, E)) -> Bool { ... }
}

That's effectively what I've turned rethrows into, implicitly. We'd need to do something like this to fully replace the current rethrows behavior.
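For comparison, the one case that can be spelled today is the one where iteration itself cannot fail, so errorUnion(Never, E) collapses to E. A sketch on the synchronous Sequence protocol, assuming a Swift 6 toolchain; containsTyped and NegativeNumber are hypothetical names chosen to avoid clashing with the existing contains(where:):

```swift
extension Sequence {
  // Iterating a synchronous Sequence never throws, so the only
  // possible error is whatever the predicate throws.
  func containsTyped<E: Error>(
    _ predicate: (Element) throws(E) -> Bool
  ) throws(E) -> Bool {
    for element in self {
      if try predicate(element) { return true }
    }
    return false
  }
}

struct NegativeNumber: Error {}

// Non-throwing predicate: E is inferred as Never, so no `try` is needed.
let found = [1, 2, 3].containsTyped { $0 == 2 }

// Throwing predicate: the closure states its error type explicitly.
var rejected = false
do {
  _ = try [1, -2, 3].containsTyped { (n: Int) throws(NegativeNumber) -> Bool in
    if n < 0 { throw NegativeNumber() }
    return n == 3
  }
} catch {
  // The only error that can arrive here is a NegativeNumber.
  rejected = true
}
```

As soon as the sequence has its own non-Never Failure, this signature has nowhere to put the second error type, which is the gap errorUnion would fill.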

Doug

FranzBusch · January 3, 2024, 1:18pm

That's great progress. Thanks for putting in the work @Douglas_Gregor. Just a small naming bikeshed: I don't think we should underscore prefix the new method _nextElement() since it is quite common for developers to manually create an iterator and call next() on it. Having the new method underscored would hide it from code completion.

greggwon · January 9, 2024, 2:58pm

I really like chained exceptions for this kind of thing. Every exception should have a constructor that includes a message and an inner exception. This allows the type system to remain constant because only the recognized exception type is ever visible to the caller where a throw occurred.

ole · January 15, 2024, 2:06pm

Task cancellation

A few of the Task APIs are documented to only throw CancellationError and can adopt typed throws. For example, checkCancellation:
public static func checkCancellation() throws(CancellationError)

Small editorial note: I find this phrasing problematic for a proposal. Are there more Task APIs that will be changed in this way? "A few" and "For example" seem to imply so. In that case, we should list all of them explicitly.

ktoso · January 15, 2024, 3:07pm

Yeah wording should be precise here.

Strictly speaking it is just:

checkCancellation that can ONLY throw CancellationError

And we could argue about Task.sleep, since in practice this is the only error it will throw nowadays and we document that like this:

  /// If the task is canceled before the time ends,
  /// this function throws `CancellationError`.

The only other places are the sleep methods. The static Task.sleep methods...

Whether we should strictly guarantee only this specific error type for those is debatable -- I'd argue no.

A specific Clock may want to throw for various reasons though. And unless I'm misreading the sources, the Clock.sleep method has no documentation at all, so... we didn't promise we'll just throw a specific error there it seems.

ole · January 15, 2024, 3:11pm

Yes, the Clock.sleep issue is mentioned under Alternatives considered:

Most of the clock implementations only throw a CancellationError from their sleep method; however, nothing enforces this right now and there might be implementations out there that throw a different error. Restricting the protocol to only throw CancellationErrors would be a breaking change.

wadetregaskis · January 15, 2024, 4:13pm

ktoso:

And we could argue about Task.sleep, since in practice this is the only error it will throw nowadays and we document that like this:
  /// If the task is canceled before the time ends,
  /// this function throws `CancellationError`.

I think those horses have left the barn. I rely on that documented behaviour, as does quite a lot of 3rd party code I see. Changing it now will break a lot of existing code, in ways that are somewhat subtle and hard to anticipate and locate.

As such, I think the sleep methods should adopt throws(CancellationError) - might as well have the compiler ensure the documented invariant lots of people are relying on already.
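Until something like that ships, callers who rely on the documented behaviour can get the same compile-time guarantee with a thin wrapper. This is only a sketch, assuming a Swift 6 toolchain: sleepOrCancel is a hypothetical name, and folding any unexpected error into CancellationError is an assumption of this example, not something the library promises:

```swift
// Hypothetical wrapper; not part of the standard library.
func sleepOrCancel(nanoseconds: UInt64) async throws(CancellationError) {
  do {
    try await Task.sleep(nanoseconds: nanoseconds)
  } catch let error as CancellationError {
    throw error
  } catch {
    // Assumption of this sketch: Task.sleep is documented to throw only
    // CancellationError, so anything else is treated as a cancellation.
    throw CancellationError()
  }
}

// Usage: cancel the task and observe that the sleep ends early.
let sleeper = Task { () async -> Bool in
  do {
    try await sleepOrCancel(nanoseconds: 5_000_000_000)
    return false
  } catch {
    // The only error that can arrive here is a CancellationError.
    return true
  }
}
sleeper.cancel()
let wasCancelled = await sleeper.value
```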

wadetregaskis · January 16, 2024, 5:52am

And I meant to add: it'd be easy enough to add another variant of sleep which permits other types of errors, that could be opted-into by interested parties.

e.g. the classic reason why usleep bails prematurely is that a signal was received, unless you manually remember to adjust the current thread's signal mask first (which (a) almost nobody ever does and (b) presumably doesn't work for Swift Concurrency, since the underlying thread is ill-defined and could change at any suspension point).

I'm led to believe Task.sleep ignores signals, but - even with typed throws aside - it would be handy to have a variant which doesn't, e.g.:

enum Signal {
    case abort
    case alarm
    …
    case hangup
    …
    case userDefined1
    case userDefined2
    …
}

enum SleepError: Error {
    case cancelled
    case interruptedBySignal(Signal)
}

static func sleep<C>(
    until deadline: C.Instant,
    orSignal signals: Set<Signal>,
    tolerance: C.Instant.Duration? = nil,
    clock: C = ContinuousClock()
) async throws(SleepError) where C : Clock

(unfortunately Swift currently requires that manual SleepError sum type, but ideally one day we'll get built-in support such that it could just say throws(CancellationError | InterruptedBySignalError))
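To illustrate what the typed signature buys the caller, here is a sketch with trimmed-down, hypothetical versions of the types above and a simulatedSleep function that only fakes the outcome (it does not sleep or touch real signals). With a Swift 6 toolchain the catch block can switch over the error exhaustively, without a default case:

```swift
// Hypothetical, trimmed-down versions of the types above.
enum Signal: Hashable {
  case alarm
  case hangup
}

enum SleepError: Error {
  case cancelled
  case interruptedBySignal(Signal)
}

// Stand-in for the proposed sleep variant; it only simulates the outcome.
func simulatedSleep(deliver signal: Signal?, cancelled: Bool) throws(SleepError) {
  if cancelled { throw SleepError.cancelled }
  if let signal { throw SleepError.interruptedBySignal(signal) }
}

func describe(signal: Signal?, cancelled: Bool) -> String {
  do {
    try simulatedSleep(deliver: signal, cancelled: cancelled)
    return "slept"
  } catch {
    // `error` is a SleepError here, so every case must be handled.
    switch error {
    case .cancelled:
      return "cancelled"
    case .interruptedBySignal(let signal):
      return "interrupted by \(signal)"
    }
  }
}
```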