Clarification needed on UnsafeContinuation documentation

To further illustrate what I mean, this is the code I'm actually using, because the documentation makes me afraid the code will suspend at the wrong point:

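func join() async {
    return await withCheckedContinuation { newCompletion in
        if let oldCompletion = completion {
            // Another caller is already waiting: release both of us.
            self.completion = nil
            oldCompletion.resume()
            newCompletion.resume()
        } else {
            // First caller: store the continuation and stay suspended.
            self.completion = newCompletion
        }
    }
}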

The closure is executed synchronously, without allowing any interleaving on the actor; your first code is correct.

This scheduling behavior is actually a special power of the with*Continuation functions ever since SE-0338. We intend to generalize that so that other functions can opt in to that behavior, but we haven't done so yet.
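A minimal sketch of what "executed synchronously, without interleaving on the actor" means in practice. The `Recorder` actor and its `run()` method are hypothetical names used only for illustration; the sketch assumes a toolchain and SDK where the with*Continuation functions inherit the caller's executor as described above.

```swift
// Hypothetical actor that records the order of events around the continuation.
actor Recorder {
    private(set) var events: [String] = []

    func run() async -> [String] {
        events.append("before")
        await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
            // The closure runs synchronously in the caller's context,
            // so it may touch actor-isolated state directly.
            events.append("inside closure")
            continuation.resume()
        }
        events.append("after")
        return events
    }
}
```

Because the closure is not `@escaping` and runs in place, nothing else can run on the actor between "before" and "inside closure".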


I think this is where my misunderstanding comes from. My mental model is that the task continues running, executing the continuation closure, and that the task suspends when the continuation closure returns. (This raises the question of what happens when the continuation closure resumes the continuation synchronously, which would effectively resume the task before it suspended, so maybe this mental model isn't ideal.)

This interpretation better fits my mental model. Another way to detect this seems to be withUnsafeCurrentTask:

func doSomething() async {
  withUnsafeCurrentTask { unsafeTask in
    print("Task before continuation: \(unsafeTask!.hashValue)")
  }
  let _: Void = await withCheckedContinuation { continuation in
    withUnsafeCurrentTask { unsafeTask in
      print("Task inside continuation closure: \(unsafeTask!.hashValue)")
    }
    continuation.resume()
  }
}

This will print the same hash value for both unsafe tasks. (hashValue isn't ideal to verify these are identical, but close enough; another way could be to use withUnsafeCurrentTask to cancel the current task before creating the continuation, then check Task.isCancelled inside the continuation closure.)
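The cancellation-based check mentioned in the parenthesis could look roughly like the following sketch. The function name `checkSameTask` is made up for illustration; it cancels whatever task it runs in, so it is best called from a throwaway task.

```swift
// Hypothetical helper: returns true if the continuation closure
// observes the cancellation of the task that called it.
func checkSameTask() async -> Bool {
    // Cancel the current task through its unsafe handle.
    withUnsafeCurrentTask { unsafeTask in
        unsafeTask?.cancel()
    }
    return await withCheckedContinuation { (continuation: CheckedContinuation<Bool, Never>) in
        // Task.isCancelled reports the cancellation state of the task
        // this closure is currently running on.
        continuation.resume(returning: Task.isCancelled)
    }
}
```

If the closure ran on some other task, `Task.isCancelled` inside it would not see the cancellation, so a `true` result supports the same conclusion as the hash value comparison.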


Interesting, thanks for this information. I didn't consider how SE-0338 would change things, but knowing that with…Continuation needs special handling to preserve its semantics makes things clearer.

Let me expand on your post to verify my understanding:

Among other things, SE-0338 prescribes:

non-actor-isolated async functions never formally run on any actor's executor

I.e. if an actor calls a non-actor-isolated async func, the runtime must switch executors immediately. The executor hop may (not must) suspend the current task, e.g. if the target executor is busy.

There is a special exception for await with*Continuation (implemented via @_unsafeInheritExecutor, I think) that opts out of the new SE-0338 semantics and continues to execute these functions on the calling executor.

Correct?


That’s correct, yes.

The semantic rule has always been that the task is not resumed until both resume is called and the closure has returned.

If resume is called synchronously, the task is not suspended at all.
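As a small illustration of that rule (a sketch with a hypothetical function name, not code from this thread): calling resume first and doing more work in the closure afterwards still finishes the closure before the code after the await runs.

```swift
// Hypothetical helper that records what happens after a synchronous resume.
func orderOfEvents() async -> [String] {
    var events: [String] = []
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        // Resume synchronously, before the closure has returned.
        continuation.resume()
        // The closure keeps running; the awaiting code has not continued yet.
        events.append("closure still running after resume")
    }
    events.append("after await")
    return events
}
```

The awaiting code only continues once the closure has returned, even though resume was called earlier.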


Thanks John!

I think that in your sentence, "the task" is the task that calls withUnsafeContinuation. There is a second task that enters the equation, and it is the eventual task that is resumed by withUnsafeContinuation (as a suspension point).

This second task can start running before the closure has returned. For all we know, it may even start running on the same thread as the first task, before the closure starts. This uncertainty raises questions: should we assume that methods that call withUnsafeContinuation must support reentrancy? To be explicit: if the second resumed task immediately calls the same method that is still inside withUnsafeContinuation, waiting for its continuation, then this method must support reentrancy. And what if we don't want to support reentrancy? And which expectations will break once users are able to write their own executors?

Unsafe continuations are still not sufficiently documented and described. This is all very confusing, and a lot of people are writing buggy code while thinking it is safe. One bug around unsafe continuations that I found today in the Swift runtime: "`withUnsafeContinuation` can break actor isolation" (Issue #61485, apple/swift on GitHub).


Personally, I find the documentation for concurrency stuff quite inadequate and poorly written. Sometimes I feel that it is written for those who are familiar with the internals of the system, not for those who actually write code to solve real-world problems.


Hmm. It is important to distinguish the language and its standard library from the ecosystem that can flourish, based on them.

Sure, the concurrency aspects of the language and the stdlib are insufficiently documented, leaving too much room for interpretation and nagging doubts. This will improve, I suppose, with time, and also with the discovered flaws that will help ideas to settle. One does not create a robust concurrency system in a few months. It takes more time.

What you call "real world problems" are supposed to be solved in the ecosystem, not in the language+stdlib. This is not a fact or an opinion, this is my interpretation of what I see.

The ecosystem is slow to provide the tools we need. And recent progress is limited to the latest Xcode beta (looking at you, apple/swift-async-algorithms, with its dependency on the future Clock), which means that we may never get back-deployable APIs when the tools we need ship. To me, this is the most frustrating part.

The ecosystem is slow to adapt to the new concurrency APIs, and we're stuck with a marketing motto, "fearless concurrency", which lacks building blocks. 🤷‍♂️


The bug you found may be exclusive to macOS + Xcode 14.0. Check out my post from a few weeks ago.


Maybe!

The closure argument of withUnsafeContinuation is documented to run "immediately", and it is not @escaping, so by any reasonable line of thought it has to run on the same thread as the caller.

Now, I don't know if I would assume that it runs on the same dispatch queue. We know that DispatchQueue.sync can reuse the same thread, but changes the outcome of dispatchPrecondition(condition: .onQueue(*)).

I think that the bug I found is more related to the task that is resumed from the withUnsafeContinuation suspension point (not the current task, but the task that has an opportunity to resume, which I call "the second task" in this post) - but this is just my interpretation.

All right, instead of spending more time trying to make sense of all of this, compiler bugs included, let's have a nice weekend 😎

I'm not totally sure how to respond to this. You're arguing that there's a lot of confusion about Swift concurrency, and that's very convincing, because your post also asserts a lot of stuff that's wrong. I think you've misunderstood some of the basic terms in use in Swift concurrency, so let me try to clear things up.

It sounds like you're using "task" as if it's basically a scheduling unit — the amount of code that would be indivisibly scheduled by a single call to, say, dispatch_async. In Swift concurrency, we use the term "job" or "partial task" for that. A "task" is an asynchronous thread, which is ultimately executed as a sequence of scheduling units; those units never execute concurrently with one another, and are in fact totally sequential, and their execution is formally well-ordered with respect to concurrency so that the events in one unit must all happen-before the events in the next.

Continuations are not an exception to this. withUnsafeContinuation does not return until both something has called resume on the continuation and the function passed to withUnsafeContinuation has returned, and that is also formally well-ordered with respect to concurrency. So it is absolutely not the case that there are somehow two tasks involved with continuations or that the "second task" can start running before the closure has returned.
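This ordering can be observed even when resume is called from another thread while the closure is still running. The following is only a sketch: `EventLog` and `resumeFromAnotherThread` are hypothetical names, and the closure blocks on a Dispatch semaphore purely for demonstration, which is not something to do in production code.

```swift
import Foundation

// Hypothetical thread-safe event log, needed because events arrive from two threads.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: [String] = []

    func add(_ event: String) {
        lock.lock()
        storage.append(event)
        lock.unlock()
    }

    var events: [String] {
        lock.lock()
        defer { lock.unlock() }
        return storage
    }
}

func resumeFromAnotherThread() async -> [String] {
    let log = EventLog()
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        let resumed = DispatchSemaphore(value: 0)
        DispatchQueue.global().async {
            // resume is called on another thread while the closure is still running.
            continuation.resume()
            log.add("resume called")
            resumed.signal()
        }
        // Demonstration only: hold the closure open until resume has been called.
        resumed.wait()
        log.add("closure returning")
    }
    // Reached only after resume was called AND the closure returned.
    log.add("after await")
    return log.events
}
```

Even though resume happens first, the code after the await does not run until the closure has returned, which matches the rule stated above.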

Now, there is a bug in Xcode 14 when compiling for macOS because it ships with an old macOS SDK. That bug doesn't actually break any of the ordering properties above. It does, however, break Swift's data isolation guarantees because it causes withUnsafeContinuation, when called from an actor-isolated context, to send a non-Sendable function to a non-isolated executor and then call it, which is completely against the rules. And in fact, if you turn strict sendability checking on when compiling against that SDK, you will get a diagnostic about calling withUnsafeContinuation because it thinks that you're violating the rules (because withUnsafeContinuation doesn't properly inherit the execution context of its caller).

But that has nothing to do with the basic correctness of the order of execution on a task, and its only relation to the scheduling of partial tasks is that it incorrectly creates suspension points at the call to and return from withUnsafeContinuation, forcing more partial tasks to be scheduled. (There is not otherwise necessarily a suspension point on the return from withUnsafeContinuation — if the function passed in manages to call resume on the continuation before it returns, then withUnsafeContinuation will return without the task ever having been suspended.)


You sound like you're trying to make me look stupid.

If you want to be useful, please chime in this discussion. I'm trying to build a counting semaphore on top of Swift concurrency (after all, even Microsoft thinks that awaiting a semaphore is not a stupid idea), and we have a few questions that need a practical answer - the main one being: is the current implementation correct?
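For readers who want something concrete to picture, here is a minimal sketch of a counting semaphore built on an actor and checked continuations. It is not the implementation discussed in the linked thread; `SketchSemaphore` is a hypothetical name, cancellation is not handled, and the sketch relies on the continuation closure running synchronously on the actor, as described earlier in this thread.

```swift
// Hypothetical counting semaphore; not the implementation under discussion.
actor SketchSemaphore {
    // A negative value means that many callers are currently waiting.
    private(set) var value: Int
    private var waiters: [CheckedContinuation<Void, Never>] = []

    init(value: Int) {
        self.value = value
    }

    func wait() async {
        value -= 1
        if value >= 0 { return }
        await withCheckedContinuation { continuation in
            // Assumed to run synchronously on the actor, so no signal()
            // can slip in between the decrement and this append.
            waiters.append(continuation)
        }
    }

    func signal() {
        value += 1
        if !waiters.isEmpty {
            // Wake the longest-waiting caller.
            waiters.removeFirst().resume()
        }
    }
}
```

On a toolchain affected by the Xcode 14.0 macOS issue mentioned in this thread, the assumption in the comment inside `wait()` may not hold, so a sketch like this should not be trusted there.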

I am not trying to make you look stupid. I responded to the reply you made to me to try to clarify what I think you have misunderstood, which is important not just for your benefit, but for the benefit of other people who might find this thread.

I'll go look at that thread. If you thought that that was the right way for me to be useful, though, you might have linked it at some point above instead of just popping off.


Thank you John for shedding light on the dangers of the Xcode 14 + macOS combination, and for helping clarify the behavior of unsafe continuations in the context of AsyncSemaphore.

Look, we now have a reliable semaphore that we can await. And it back-deploys as far as it can. And it even works on the unstable Xcode 14 + macOS combo. Isn't that good news? Yes, it is good news. The actor reentrancy problem reported by some people is solved, for example. I wish more people knew that. Thank you!


@John_McCall If you don't mind my asking, was this bug known inside Apple before the final release of Xcode 14.0? Or were Apple folks surprised because nobody thought through the implications of matching the Swift 5.7 compiler with the Swift 5.6 standard library's module interface? Or was it known but not considered a serious issue? I don't mean to blame anyone. As evidenced by this thread, I was aware of the special behavior of withUnsafeContinuation and didn't think it through, either.

I'm asking because the compiler generating code that breaks concurrency invariants is a serious problem, and I'm surprised by the lack of communication from Apple about it:

  • I can't find any mention of it in the Xcode 14.0 release notes or Xcode 14.0.1 release notes. In my opinion, something like this warrants a big red warning at the top of the release notes: "Don't use Xcode 14.0 to build macOS targets that use concurrency!"

  • The bug is hard or impossible for third-party developers to catch during the SDK beta phase because Xcode betas ship with the beta macOS SDK, only to revert to last year's SDK for the final release. This is another argument for Apple to communicate it proactively.

  • The thread that first (AFAIK) mentioned this problem on this forum, Concurrency is broken in Xcode 14 for macOS (2022-09-14), received little engagement or acknowledgement of the issue. (I know that no-one can read everything, so again no blame!)


That seems better suited to Apple's feedback mechanisms. It doesn't have that much to do with Swift itself; it has to do with how release notes for Apple's proprietary tools (Xcode and the SDKs) are decided and generated. (Even though I agree it is important feedback, I don't think it is fair to call out an individual engineer to answer for what is likely a process problem and defend that in public. At least that's how it looks from the outside.) I'd file a Feedback (and possibly post the reference to it to allow for follow-up by individual engineers who might want to give more feedback internally at Apple).


I asked when this bug will be fixed. According to this answer, the bug is specific to Xcode 14.0.x, is fixed in the Xcode 14.1 beta, and will remain fixed in future Xcode releases from 14.1 onward.

Yes, because the bug is caused by the interplay of the Swift 5.7 compiler with the Swift 5.6 standard library's module interface. It will "fix itself" with Xcode 14.1 because 14.1 will ship with the macOS 13 SDK, which will contain the Swift 5.7 standard library's module interface.

I understand your position, but I disagree.

This issue very much pertains to the Swift open source project because Xcode is the way Swift is distributed on Apple platforms. You pretty much have to use Xcode to use Swift on macOS. I find it very disturbing that the current official Swift release generates wrong code on one of Swift's most important platforms and that neither Apple nor the Swift team are communicating this prominently.

A big red warning in the release notes is the minimum I’d have expected. Even better, the warning should be directly in Xcode, or Xcode 14.0 should outright refuse to build macOS targets that use concurrency.

Xcode 14.0 was released a month ago, and if the rumors are true, it will be another ~2 weeks until we see Xcode 14.1. That means people will have been building and releasing Mac apps with a broken compiler for ~6 weeks.

I want this discussion to be in the open. In my view, not much is gained by my filing private feedback. (Plus, I admit that I'm not willing to participate in Apple's dysfunctional — to outsiders — feedback process if I can help it.)

I apologize to @John_McCall for calling you out individually, which wasn't my intention. I understand if Apple folks don't want to or can't have this discussion in public, but I'd love to be proven wrong.


I wholeheartedly agree with Ole. This needs to be discussed in public.
