I apologize for posting this before I post the concurrency proposal, but I think these questions can be usefully discussed now.
It seems certain that Swift will add `async`/`await`. One question, then, is how to represent these functions in SIL. For that, we need to discuss the ways that `async` functions are different from normal functions:
## Function splitting
The most important difference is that `async` functions are expected to be broken up into multiple partial functions in a coroutine-style transformation. But we explicitly don't want to model this in SIL, because it would interfere with optimizations that we very much want to do. For example, we should be able to do normal copy/destroy optimizations within `async` functions by analyzing things from the perspective of the `async` function. We maintain the normal function structure until we do coroutine splitting, which happens all the way down in IRGen, so SIL doesn't need to care about it at all.
The only subtlety here is the possibility of well-defined interleaving of side effects at the function's potential suspension points, but that's not fundamentally different from the possibility of well-defined interleaving of side effects during the calls the function makes. We just need to make sure that all potential suspension points are treated as conservatively as calls. Since suspension points are generally calls, that shouldn't be a problem.
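To make the point concrete, here's a minimal Swift sketch (the `Box` type and `increment` function are invented for this example) of why a potential suspension point has to be treated as conservatively as a call:

```swift
final class Box {
    var value = 0
}

func increment(_ box: Box) async -> Int {
    box.value += 1
    // Potential suspension point: other tasks may run here and mutate `box`,
    // just as unknown code could during an ordinary synchronous call, so an
    // optimizer must not assume `box.value` is unchanged across the `await`.
    await Task.yield()
    return box.value
}
```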
## `async` calls
`async` functions can also call other `async` functions. At the SIL level, we (mostly) don't need to treat these differently from calls to synchronous functions, because by default these calls are synchronous from the perspective of the calling function, which means they can be adequately modeled with ordinary `apply`, `try_apply`, `begin_apply` (should we decide to support that), etc. And that's really good, because the last thing we want to do is introduce yet another orthogonal axis of function application.
## Actor references
`async` functions are implicitly parameterized by an (optional) actor that they have to run on. This actor reference is carried dynamically by first-class `async` function values. For calls that aren't to first-class values, we need to derive it from the context of the call somehow, in some function-specific way. What that probably means for SIL is that we need something more than `function_ref` to get a reference to an `async` function. We then treat the actor reference as something we can derive from an `async` function value in SIL, and in IRGen we just track the actor reference as part of the lowered value.
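For illustration, here's a small Swift sketch of a first-class `async` function value whose behavior is tied to an actor (the `Counter` actor and the `bump` closure are invented for this example):

```swift
actor Counter {
    private var value = 0
    func increment() -> Int {
        value += 1
        return value
    }
}

let counter = Counter()

// A first-class async function value. Under the model sketched above, the
// (optional) reference to the actor it interacts with (here, `counter`) is
// carried dynamically as part of the function value itself.
let bump: () async -> Int = { await counter.increment() }
```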
## Actor switching optimization
We want to be able to optimize `async` functions intelligently so that we don't enqueue work unnecessarily onto actors. For example, if an `async` function that's semantically tied to an actor starts by making a call to a different actor, we want calls to that function to just initiate that call without switching actors. I think the right idea here is probably just to (1) make sure that we can easily recover from SIL when code needs to be running on an actor and then (2) represent that in IRGen in a way that coroutine splitting can recover and turn into the right form to be used dynamically. If (1) is possible, we don't need extra support from SIL here. I don't know conclusively whether (1) is possible.
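To illustrate the situation being optimized, here's a hypothetical Swift example (the `Logger` and `Worker` actors are invented here):

```swift
actor Logger {
    private(set) var count = 0
    func log(_ message: String) {
        count += 1
    }
}

actor Worker {
    private let logger: Logger
    init(logger: Logger) { self.logger = logger }

    // run() is semantically tied to Worker, but its first operation is a call
    // to a different actor. The optimization described above would let a
    // caller initiate the log(_:) call directly instead of first enqueuing
    // work onto Worker's executor and immediately switching away again.
    func run() async {
        await logger.log("starting")
        // ... work that genuinely needs Worker's isolation ...
    }
}
```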
## Actor function constraints
We're going to have to infer when functions are constrained to run on a specific actor because of the things they do. It's possible that this is doable entirely in the type-checker, but it's also possible that we'll need to do it as a SIL analysis (if it's interprocedural / data-flow dependent?). So maybe this combines with the point above about optimization to make it a harder requirement that we represent actor dependencies explicitly in SIL.
## Task management
I think task creation, cancellation info, etc. should all be pretty straightforward to embed with intrinsics, but maybe we'll need to be more intelligent about nesting.
## Accessing continuations
`async` functions need to be able to cleanly interoperate with functions that use callbacks. The current design proposes doing this with some sort of `withUnsafeContinuation` library function, with a prototype like this:

```swift
func withUnsafeContinuation<T>(operation: (UnsafeContinuation<T>) -> ()) async -> T
```
Note that `operation` is a synchronous function that's allowed to do whatever it wants to the continuation value as long as it eventually resumes it exactly once. Resuming the continuation value logically transfers control to the "continuation point", i.e. the point of returning from `withUnsafeContinuation`. Generally, `operation` is also supposed to return, or else it ties up the thread uselessly.
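As a usage sketch (the callback-based `fetchValue(completion:)` API is hypothetical), an `async` wrapper over a callback API might look like this:

```swift
import Dispatch

// A hypothetical callback-based API we want to bridge into async code.
func fetchValue(completion: @escaping (Int) -> Void) {
    DispatchQueue.global().async {
        completion(42)
    }
}

// The synchronous `operation` closure hands the continuation to the callback,
// which eventually resumes it exactly once; resuming transfers control to the
// point of returning from withUnsafeContinuation.
func fetchValue() async -> Int {
    await withUnsafeContinuation { continuation in
        fetchValue { value in
            continuation.resume(returning: value)
        }
    }
}
```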
I was going around in circles about how to represent this in SIL without completely blocking optimization. The problem is that somewhere within `operation` we have control flow to the continuation point, and the start of that is basically opaque: we have no idea when we might trigger the continuation to run. But then I realized that there's a second major problem here: if resuming the continuation immediately starts running code associated with the continuation point, we're actually potentially running that concurrently with `operation`, which means we might be doing terrible things to captured local variables. So we really need to block resumption of the continuation until we know we've returned out of `operation`. But hey, if we do that, then from the perspective of the `async` function the control flow is totally synchronous:
- We start running `operation`.
- `operation` might kick off arbitrary asynchronous work, but so might any other call we make.
- We return out of `operation`.
- We block waiting for the continuation to get called.
- We get resumed at the continuation point.
So we just need the representation to:
- tie the start of `operation` to the continuation point,
- allow us to insert the code necessary to block resuming the continuation until `operation` returns,
- reflect that the continuation point has arbitrary side effects like a call, and
- make sure we don't do `async` stuff in the middle, which would really mess us up.
So it can be something like:
```
(%token, %continuation) = begin_async_with_continuation $T
apply %operation(%continuation) : $(UnsafeContinuation<T>) -> () // totally inlinable!
%result = end_async_with_continuation %token
```
And then we structurally disallow suspension points, `return`, and `throw` after `begin_async_with_continuation` but before `end_async_with_continuation`. If somehow the `end_async_with_continuation` becomes unreachable, that's fine. I mean, it probably means the user's code is really broken, but we just need to recognize it and generate a bogus continuation, and that still semantically makes sense, because the rule is that the continuation can't start running until `operation` returns, and hey, for some reason it didn't return.