[Pitch] Inherit isolation by default for async functions

Jumhyn · September 23, 2024, 5:25pm

I don't think that's a fair rephrasing: the programmer who wrote the called function gets to decide, while the caller of that function gets to rely on language mechanisms to ensure that they're using their own callees as expected. Obviously that's the ideal case and this proposal calls out some of the sharp corners, but I do think that in cases of e.g. main actor isolation the Swift Concurrency model is a huge improvement over GCD queues.

nkbelov · September 23, 2024, 5:27pm

I'd argue in defense of SE-0338 here: I find it strange to rely on the assumption that specifically the prologue of an async function should continue running in the same context. For one, it should be opaque to the caller anyway, otherwise the API design is leaky. Second, it could just await Task.yield() in its first line for all it cares, so it would achieve no meaningful logic in its short time on the calling context.^[1]

It is much more consistent (and a much simpler mental model) to either offer to keep the context for the full duration of the callee (this is what isolated (any Actor)? = #isolation achieves), or to forgo it completely.

[1] Even in callback-based APIs, the callee could just call someOtherQueue.async immediately; whatever happens in the prologue is just an implementation detail.
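As a rough illustration of the first option (keeping the caller's context for the whole callee), here is a minimal sketch of the `isolated (any Actor)? = #isolation` pattern, assuming a Swift 6.0 toolchain and the SE-0338 rules as they stand today. The function names are hypothetical and exist only for this example.

```swift
// Hypothetical helper: because of the defaulted isolated parameter,
// the whole body runs in whatever isolation the caller has.
func inheritsIsolation(
    isolation: isolated (any Actor)? = #isolation
) async -> Bool {
    // True when the caller was isolated to some actor, false otherwise.
    return isolation != nil
}

// Hypothetical caller on the main actor: #isolation expands to the
// main actor here, so the callee stays on it.
@MainActor
func callFromMainActor() async -> Bool {
    await inheritsIsolation()
}

// Hypothetical nonisolated caller: under the current (SE-0338) rules
// #isolation is nil here, so the callee is not tied to any actor.
func callFromNowhere() async -> Bool {
    await inheritsIsolation()
}
```

The cost discussed in this thread is visible in the signature: every function that wants this behavior has to repeat the extra parameter.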

Jumhyn · September 23, 2024, 5:28pm

Yeah, I think this part of SE-0338 is pretty clearly the correct decision, and it's worth noting that in this regard the pitch at hand isn't just a reversal of SE-0338.

hborla · September 23, 2024, 5:39pm

Thanks everyone for the discussion so far! I will reply to the other comments above a bit later, but I want to quickly clarify this:

That's right, switching to the caller's actor both in the function prologue and after any async calls is a critical part of SE-0338 for correctness / data-race safety, and this proposal is not changing that. What's proposed here is not any less safe than the current rule. I should probably remove all language in the proposal that suggests this is a "reversal" of SE-0338 to avoid this misconception.

kean · September 23, 2024, 6:14pm

I just wanted to note that that's not what I meant, but I probably also wasn't clear and went a bit off-topic. It's about a mental model around async functions in general and not about "prologue".

vns · September 23, 2024, 6:22pm

This is a great default (and therefore I'm not sure this needs to be changed), yet Swift lacks simple tooling for altering it on the caller side when you need to, for instance when the programmer who wrote the function made a mistake and you have no way to fix it (or at least not in a predictable time). Currently, IIRC, using a task executor preference is the only way to address this, which requires access to the executor and (I'm not sure on this part, since I haven't used the feature a lot) has a weaker relationship with isolation than other mechanisms such as an isolated parameter.

sliemeobn · September 23, 2024, 6:24pm

I thought a bit more about the @concurrent spelling, which I agree with the posters above is not ideal. The keyword does not make the function more or less concurrent, and concurrency is everywhere, so this is a bit misleading.

I first liked the async(something) suggestions, but IIUC the proposal hints at how sync functions could also one day get the "run on the global task pool, never on an actor" annotation.

Partaking in the bike-shedding, since what we are looking for is in a way the counterpart to @SomeGlobalActor:

how about @noactor?
(It is not as pretty a word (@concurrent is a fine-looking word, rrrrr) but it feels more correct.)

Andropov · September 23, 2024, 6:27pm

The SE-0338 behavior surprised me a lot when I was first learning Swift Concurrency. It took some time until I fully internalized that the callee decides the isolation. But once I did, it became one of the things I liked the most about the language. Given how hard concurrency is, it's really neat to be able to reason about it locally.

Inheriting isolation by default for async functions gets rid of this key benefit of Swift Concurrency. We'd be back to having to worry about where the function is called, which to me sounds like a massive drawback. I do not miss having to trace back all the callers of a function to reason about the thread-safety of my code. Or having to debug a half-second stutter in an app because some code path is unintentionally being called in the Main Actor.

It's impossible for me to try to imagine the ramifications this change would have, but I feel like in the alternate universe in which the behavior described in this pitch had been chosen for SE-0338, we'd now have an equally compelling pitch, equally full of very reasonable points going for not inheriting isolation by default in async functions.

Also, I'm dubious about how often you'd see @concurrent used in real codebases. I fear it could end up like one of those performance-only attributes that is mostly ignored by app developers, while now we have this very nice default that nudges code towards using multiple threads by default.

mattie · September 23, 2024, 6:54pm

Just two points I want to stress.

You currently do need to trace call paths to reason about the thread-safety of code if you are suppressing warnings or using Obj-C async translations. I'm pretty sure this proposal will make code exclusively safer than it is today.

It will have an impact on long-running synchronous code if no isolated parameters are involved. And those are now typically invisible to callers.

Edit:

The concern around blocking threads is very real, but I don't think there's any negative impact to safety. Is there?

malhal · September 23, 2024, 7:24pm

My question is: in the first sample in the pitch, why did you use class NotSendable and not struct NotSendable? It seems to me that this is the cause of the "Sending 'self.x' risks causing data races" Swift 6 error. Since the class has no properties, and thus no shared mutable state, there's no need for a class.

Andropov · September 23, 2024, 7:37pm

But this not-so-simple rule already breaks down when you think about other function-like things. For example, what about closures? If I understood the proposal correctly, they will hop off the current actor to run if the closure type is @Sendable. This may also be quite surprising to new developers, particularly if they're used to function calls inheriting isolation.

The current rules may not be so simple either (you have things like #isolation...) but at least, due to how the default is to hop off the current actor, most developers learn not to have an expectation about where a given piece of code will run unless the enclosing scope (function, closure...) is annotated with those isolation requirements. Inheriting isolation by default in nonisolated async functions may create some expectations about where code will run in other similar constructs, which will not hold true.

The mention of long-running synchronous code is interesting. I've been thinking about this since I first read the proposal. We may all have different ideas about what "long-running" means too. This is relevant in the context of when @concurrent should be used. You shouldn't even put really long-running synchronous code in the Swift Concurrency thread pool. Once I made the mistake of parallelizing a numerical simulation using TaskGroups, and it led to some long head-scratching debugging about my now unresponsive app until I realized that I had exhausted all the threads in the thread pool.

So really heavy synchronous code doesn't belong in the concurrency pool, and very short code can run in the actor just fine, so... when should we use @concurrent? Just for mildly long running code? That's fine by me, but then code that fits in that category now may be blazing fast in a few years...

Well, people write bad code too. Previously, if you had an async function that at some point called an old, non-annotated function that required being called on the main thread, they'd be forced to do something like this:

@MainActor func callOldAPI() async {
    // Some async stuff first...
    await doAsyncStuff()
    // Calling something that requires Main Actor but isn't properly annotated
    oldAPI()
}

Otherwise, a non-@MainActor async function wouldn't run on the main thread. But now, they could drop the @MainActor annotation from the function and rely on always calling it from the main thread.

This may seem like a contrived example, but as far as I can tell this particular flavor of bad code couldn't be written before: Swift forced you to add a proper annotation (or the code would fail ~100% of the time). I think that's a very good thing.
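For comparison, here is a sketch of the unannotated variant described above. The function names `doAsyncStuff`, `oldAPI`, `callOldAPIUnannotated` and `caller` are hypothetical stand-ins, and the comments describe my understanding of the two rule sets rather than verified behavior.

```swift
// Hypothetical stand-ins for the APIs named in the post.
func doAsyncStuff() async {}

// Imagine this must run on the main thread but carries no annotation.
func oldAPI() {}

// The same function with the @MainActor annotation dropped.
// Under the current SE-0338 rule, this body runs off the main actor,
// so the oldAPI() call would misbehave right away.
// Under the pitched rule, it would inherit the caller's isolation and
// only work as long as every caller happens to be on the main actor.
func callOldAPIUnannotated() async {
    await doAsyncStuff()
    oldAPI()
}

// A main-actor caller, which is the case the unannotated variant
// silently depends on.
@MainActor
func caller() async {
    await callOldAPIUnannotated()
}
```

Nothing in the declaration of `callOldAPIUnannotated` records the main-thread requirement, which is the loss of local reasoning being described.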

John_McCall · September 23, 2024, 7:51pm

Architecturally, I think it's best for most computation work to still be done in the cooperative thread pool. You don't want to have a lot of arbitrary extra threads providing long-term competition with the thread pool for CPUs; that should be a tool best reserved for specific goals, like reserving the high-priority main thread so that the UI can always update even when other work is happening. So a better solution to the responsiveness problem is just for long-running computations to periodically yield and allow other work to interleave.
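A minimal sketch of what "periodically yield" could look like in practice. The function name, the chunk size and the summing workload are all made up for illustration; real code would pick a yield frequency based on measurements.

```swift
// Hypothetical CPU-bound work: sums the integers in 0..<count and
// yields to the scheduler after every `chunk` iterations so that
// other tasks on the same pool can interleave.
func cooperativeSum(upTo count: Int, chunk: Int = 10_000) async -> Int {
    precondition(chunk > 0, "chunk must be positive")
    var total = 0
    for i in 0..<count {
        total += i
        // Suspension point at the end of each chunk: the task gives up
        // its thread here and is resumed later.
        if i % chunk == chunk - 1 {
            await Task.yield()
        }
    }
    return total
}
```

Because each yield is a suspension point, any state shared with other tasks may change between chunks, so this approach suits computations that keep their working state local.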

David_Smith · September 23, 2024, 8:15pm

Prior to libdispatch existing, this is pretty much how all Mac applications worked^[1], and it honestly did quite well even on the much more limited hardware of the day.

[1] Although the tool available at the time was -performSelector:afterDelay:, which had a lot more sharp edges.

mattie · September 23, 2024, 8:16pm

You'll find all kinds of stuff like this in proposals. The goal is to illustrate a problem using a concise example.

nkbelov · September 23, 2024, 8:23pm

Having thought about this further, I'll ultimately come out in support of keeping the SE-0338 behaviour as the default (or as one of the equally weighted options, see below). However, I do agree that when this is not the desired behaviour, the ergonomics can suffer.

Speaking from the language user perspective, when designing concurrent code, these are the rules that I've developed for myself so far:

I use actors only for highly concurrent data structure-y kind of objects and develop their APIs as if I were to write Array or Dictionary from scratch: only the basic mutation and accessor methods, all in a way that ensures the transactionality and in the smallest possible volume that just has to touch enough of the inner mutable state and be done with it.
Everything else (higher-order business logic, as well as the code that interacts with multiple actors at the same time) goes elsewhere, and basically the only "elsewhere" left is the global executor: since all the ops on my actors are already atomic, and their methods rightfully upkeep their isolation, there's no further need for executor restrictions. All that code relies on SE-0338.

The above comprises 90% of my code as an app developer, and I'll subscribe to @John_McCall's point above (generalizing his statement) that architecturally code is best to default to the global pool / default executor.

There's only one category of objects where I explicitly don't want this default: some data structures like various caches/deduplicators/queues/streams/what-have-you that provide async API (or both async and sync API) and yet have to sit within the actor, and that being the sole reason why it's inadequate for their async functions to leave the actor in the first place, as they're part of its state.

These are mostly utility structs or classes that would suffer from both being a separate actor in their own right and "actorless" through deferring all their ops to the global executor — in other words, declaring a different isolation is detrimental to them as a feature. IMO there's simply a design gap in the support of such features — and not that SE-0338's default is faulty, which is why I've expressed above that we might simply be missing a third mode.

There's an expressivity imbalance in how this third mode has to be achieved: while the global executor preference requires virtually no spelling, the "I'd like to inherit the actor" part requires the bulky isolation: isolated (any Actor)? = #isolation parameter in every function.

Perhaps not only should we promote this option to be equivalently accessible on individual functions, but it also could be worth exploring if it would make sense as a type-level annotation.
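To make the "third mode" concrete, here is a sketch of such a utility under the current rules, assuming a Swift 6.0 toolchain. `Deduplicator` and `Store` are hypothetical types invented for this example, and `Task.yield()` stands in for real asynchronous work.

```swift
// Hypothetical non-Sendable utility meant to live inside an actor's state.
final class Deduplicator {
    private var seen: Set<String> = []

    // The defaulted isolated parameter keeps this async method on the
    // caller's actor, so `seen` is only touched from that actor.
    func insert(
        _ key: String,
        isolation: isolated (any Actor)? = #isolation
    ) async -> Bool {
        await Task.yield() // stand-in for real async work
        return seen.insert(key).inserted
    }
}

// Hypothetical actor that owns the utility as part of its state.
actor Store {
    private let dedup = Deduplicator()

    // Returns true the first time a key is recorded, false afterwards.
    func record(_ key: String) async -> Bool {
        await dedup.insert(key)
    }
}
```

Every async method on `Deduplicator` would need the same extra parameter, which is the expressivity imbalance described above; a type-level annotation would let the type say this once.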

juanarzola · September 23, 2024, 8:28pm

I like the new "inheriting" behavior proposed for nonisolated. So if it always inherits, then why is it called "nonisolated"? It may be isolated to an actor, depending on what it inherits. It's confusing to declare something nonisolated but 99% of the time it's isolated because it's running in some actor.

That said, here's an idea for simplifying some concepts, kind of similar to what @nkbelov was proposing but with different names:

The default isolation of functions is no longer called nonisolated. The default is unspecified.
unspecified behaves the way this proposal is changing nonisolated to be. There's no keyword for it, this is just how we refer to it in English.
nonisolated still means "hop off the actor" like it does today (and yes, it does have the drawback that nonisolated async is different from nonisolated sync, but most of the time you wouldn't be using nonisolated, you'd be using the default unspecified anyway).

The idea of calling it "unspecified" is almost like talking about "nil", "not set", it's not a new concept to learn, it's just not specified yet, and will be later via inheritance.

I think that this has the benefit of being slightly more backwards compatible because explicit "nonisolated" (the one I'd typically use to hop off the actor) would work the same as before, it's only the implicit default (now unspecified) that would change.

John_McCall · September 23, 2024, 8:32pm

unspecified by itself doesn't suggest it has anything to do with actors.

jhammer · September 23, 2024, 8:39pm

One aspect of SE-0338 that I find very valuable is being able to reason locally about the isolation of a function just by looking at its declaration (i.e. without having to track down all the callers). Whether it's concurrency or value types, local reasoning seems to be in the ethos of Swift and is a strength of the language, in my opinion.

juanarzola · September 23, 2024, 8:39pm

I think that I see what you are saying, but I also think it's not confusing to declare a type with unspecified isolation - that would mean that I am not specifying it when declaring the function or class, it'll take the isolation of where it's being used.

John_McCall · September 23, 2024, 8:43pm

Just try to imagine what the code would look like:

unspecified func foo() async {
  ...
}

and remember that you're reading this by itself and not in the context of having just looked at this proposal. What, exactly, is the reader supposed to know is unspecified about this function?

I think your idea is salvageable, but it needs a different keyword for sure.