[Returned for revision] SE-0472: Starting tasks synchronously from caller context

allevato · April 16, 2025, 6:43pm

Hello Swift community,

The initial review of SE-0472: Starting tasks synchronously from caller context concluded on April 10, 2025. Feedback was positive on the direction that the proposal is taking to allow the creation of tasks that start running synchronously in the same context before their first suspension, and commenters pointed to use cases where this solves real-world problems, especially in async code called from UIs that has a fast path that doesn't suspend (e.g., retrieving cached data instead of loading it from an external resource).

The Language Steering Group agrees that this direction is worth pursuing. We also agree with a design change that was proposed during review, which would lift the restriction that the task's closure cannot have a different isolation than the calling context. The behavior of the proposed API depends on where the first suspension in the closure occurs dynamically. This can sometimes be tricky to predict—it may not be the first await expression because that callee may not suspend. Introducing a possible hop to a different actor at the beginning of the closure does not make this any harder to predict, and that hop would still be avoided (i.e., the task would start synchronously) when the caller is already isolated to the same actor as the closure.
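As a rough illustration of why the first `await` is not necessarily the first suspension, here is a minimal sketch of the cached fast path mentioned above. `TitleStore`, `fetchTitle`, and `title(for:)` are hypothetical names made up for this example and are not part of the proposal.

```swift
@MainActor
final class TitleStore {
    private var cache: [Int: String] = [:]

    // Hypothetical slow path; stands in for loading from an external resource.
    private func fetchTitle(_ id: Int) async -> String {
        try? await Task.sleep(nanoseconds: 1_000_000) // this await really suspends
        return "Title \(id)"
    }

    func title(for id: Int) async -> String {
        if let cached = cache[id] {
            // Fast path: returns without reaching any suspension point.
            return cached
        }
        // Slow path: the first dynamic suspension happens inside fetchTitle.
        let loaded = await fetchTitle(id)
        cache[id] = loaded
        return loaded
    }
}
```

A caller that is already on the main actor writes `await store.title(for: 1)` in both cases, but only the first, uncached call actually suspends.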

For this reason, we are returning the proposal for revision in order to formalize that design change. We also agree that the name startSynchronously no longer captures this nuance, and we expect that this API will be fairly widely used (especially in UI code as mentioned above), so we have asked the author to update the name of that API (and the related variants for detached tasks and task groups) to better reflect the new semantics and to be more lightweight at the usage site.

A second review of the proposal with these changes will be run shortly.

Thanks to everyone who participated in the review!

—Tony Allevato
Review manager

John_McCall · April 16, 2025, 10:21pm

I'll respond to @ktoso's comment from the review here, just to maintain a linear conversation.

SE-0472: Starting tasks synchronously from caller context

So the signature in the proposal is specifically about the originally pitched idea here: "never allow non-synchronous execution", which we are now considering changing into "try to run synchronously". I was working out yesterday what the signature will have to become, and it seems it would be as follows:
@_implicitSelfCapture _ operation: __owned sending @isolated(any) @escaping () async throws -> Success
(which uncovered a bug in interface printing, so we're working to fix that while we're at it). We're not relying on @_inheritActorContext here, but on comparing the current and target executors and doing the synchronous run if able to.

Okay. I think you probably still want something like @_inheritActorContext here, because it's important for usability that that function pick up on the current isolation when you know it. If you don't, then closures passed in here are going to default to being nonisolated, and programmers will have to explicitly make them isolated in order to do things that they'll expect to be able to do with the current isolation. (This is basically the same type signature and scenario that happens with task groups.) I think it's going to be very common to use this from an isolated context, and it would be unwelcome and surprising for the new task to start nonisolated.
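For comparison, the existing task-creation APIs already show the usability difference being described. The sketch below uses only `Task.init` (whose closure inherits the enclosing actor isolation) and `Task.detached` (whose closure does not); `Counter` and its methods are hypothetical names for illustration, and nothing here uses the API under review.

```swift
@MainActor
final class Counter {
    var value = 0

    func increment() { value += 1 }

    func bumpInheriting() -> Task<Void, Never> {
        // The closure inherits @MainActor from the enclosing context,
        // so it can touch `value` directly without `await`.
        Task { self.value += 1 }
    }

    func bumpDetached() -> Task<Void, Never> {
        // The closure is nonisolated, so reaching main-actor state
        // requires an explicit hop back with `await`.
        Task.detached { await self.increment() }
    }
}
```

Without isolation inheritance, every closure passed to the new API from an isolated context would look like the detached variant unless the programmer annotated it explicitly.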

You're absolutely right that you can't rely on the function being isolated the same as the current context, but that's not why I'm suggesting it.

Agreed, and I think it's fair for us to hold off on doing the warning work to see if it's a real problem in practice.

Hmm, this is interesting; let's think it through.

First off, I don't think the runtime should be playing much of a role here. The task function is an async function, and async functions are generally assumed to handle their own isolation. The caller doesn't need to (and generally should not) eagerly switch to the callee's isolation before making the call. It looks like you've got the runtime proactively deciding whether to run the initial task funclet synchronously or enqueue it, but I don't think that's really useful: regardless of what the runtime does, the first thing that that funclet's going to do is turn around and ask the runtime to switch to the right executor. The result is that you might as well just run the funclet synchronously, and it'll presumably make its executor request, and if that needs to suspend, it'll suspend and enqueue the task. The result is that you're essentially just forcing the executor check to be done twice. (You do need to run the initial funclet in a special runtime context that disallows switching executors on the current thread, though.)

Now, it's an interesting question what the function will try to do. If we infer it to be nonisolated, under SE-0461 that should mean it preserves its caller's isolation, right? And maybe under that logic it shouldn't be trying to switch to the generic executor and thus immediately suspending because the current context is isolated. However, I don't think that's actually how nonisolated currently interacts with @isolated(any); I believe a nonisolated function that's converted to @isolated(any) effectively becomes @concurrent. So it will switch to the generic executor.

Yes, I agree that capturing the isolation would be cleaner even for Task.init. However, @hborla has some very reasonable concerns that this could lead to new reference cycles, which means we can't rely on a change there being viable, and even if it happens, it will need real investigation first.

Assuming that I'm correctly analyzing the isolation and scheduling rules above, I think we have a more urgent need for this API to preserve isolation than for Task.init. The problem is that we can end up reliably violating the explicitly-expressed intent of this API in the default, unannotated case:

actor A {
  func foo() {
    print("Hello, ")
    Task.startSynchronously {
      // doesn't capture self
      print("world!")
    }
  }
}

foo is statically known to be isolated. If the task function does not inherit that isolation and therefore (by the argument above) becomes effectively @concurrent, it will start by trying to switch to the generic executor, and so the entire function will run asynchronously. I think that's a major problem.

ktoso · April 17, 2025, 12:26pm

Thanks for the discussion John!

Yes, good point that the runtime trickery would end up doing the check twice. And that the nonisolated inference will get in the way... I keep forgetting about that interaction and it always ends up messing up those "don't hop" patterns.

John_McCall:

First off, I don't think the runtime should be playing much of a role here. The task function is an async function, and async functions are generally assumed to handle their own isolation. The caller doesn't need to (and generally should not) eagerly switch to the callee's isolation before making the call. It looks like you've got the runtime proactively deciding whether to run the initial task funclet synchronously or enqueue it, but I don't think that's really useful: regardless of what the runtime does, the first thing that that funclet's going to do is turn around and ask the runtime to switch to the right executor. The result is that you might as well just run the funclet synchronously, and it'll presumably make its executor request, and if that needs to suspend, it'll suspend and enqueue the task. The result is that you're essentially just forcing the executor check to be done twice. (You do need to run the initial funclet in a special runtime context that disallows switching executors on the current thread, though.)

I see what you're saying: ultimately, everything the runtime might try to do before we run the task will be defeated by the nonisolated inference anyway, since the funclet will try to switch. Instead we need to adjust the inference and the switching behavior. The last example in your writeup is indeed how we'd blow up in this scheme. I was "lucky" while adapting the existing tests to this new behavior, in that the context and closure were both nonisolated or both isolated.

Okay, so let's approach this by fixing the initial hop in those closure funclets instead.

I very much agree that fixing the inference rule to not require the explicit capturing here is even more important than in Task.init. Let's do that for this API; if we manage to bring it to Task.init later, that'd be nice, but it's separate work.

So, there are a few cases now:

- the closure is statically isolated to the same context as the caller
  - the "new" isolation inference rule would take care of that; the funclet just hops to that inferred isolation, notices it's already there, and we're good.
- the closure passed to Task.immediate is isolated to something OTHER than the dynamic caller isolation
  - this seems like it would just be taken care of by the task_switch as usual: we're on a different executor than expected, so we'll enqueue.
  - we won't enqueue the task at first, but try to run it; task_switch would enqueue to the target if it has to.
- the current context has no executor at all, we're "dynamically nonisolated"
  - this is a new special case here; the first task_switch would have to recognize that's fine and just run inline, without hopping off to the global pool, but subsequently do hop to that pool... we'd probably use some "fake" executor to signal this to the runtime.
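The case analysis above can be sketched as a small decision function. This is a toy model under stated assumptions, not the runtime's implementation: executors are represented as plain strings, `FirstStep` and `firstStep` are hypothetical names, and treating "caller has no executor, closure is isolated" as an enqueue is just one reading of the second case.

```swift
// Toy model only: executors are plain strings, not real runtime types.
enum FirstStep: Equatable {
    case runInline                   // caller is already on the closure's executor
    case enqueue(on: String)         // the first task_switch has to hop elsewhere
    case runInlineThenUseGlobalPool  // neither side has an executor
    case notCoveredAbove             // combination the list does not address
}

func firstStep(callerExecutor: String?, closureExecutor: String?) -> FirstStep {
    switch (callerExecutor, closureExecutor) {
    case let (caller?, target?) where caller == target:
        return .runInline
    case let (_, target?):
        return .enqueue(on: target)
    case (nil, nil):
        return .runInlineThenUseGlobalPool
    case (_?, nil):
        return .notCoveredAbove
    }
}
```

The last case (an isolated caller with a nonisolated closure) is the one the inference fix is meant to make rare, since the closure would pick up the caller's isolation instead.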

This is getting a bit too deep into the implementation weeds here; we can continue elsewhere, but it's been good to bring those up here. Thanks!

I'm sure I missed some cases, but overall I think that'll work out -- and I'm very excited to "just" fix the isolation inference rule. That'll be a very welcome win for the understandability of this API.