Swift Concurrency Roadmap

Lantua · October 31, 2020, 12:18am

Yes.

xwu · October 31, 2020, 12:39am

This makes sense to me. I think what would be helpful at this point, though, is some more detail on what considerations have already been accounted for that give you confidence that such lower-level facilities can come later without feeling unduly bolted on or being unintentionally hampered by decisions today.

An approach like this was taken with the ownership model, where the law of exclusivity was landed first before ABI stability, but the feature was placed in the context of how it makes possible later additions.

In the same way that the direction of these current proposals isn't a surprise because they're making concrete many of the ideas of a previous manifesto, I think it'd certainly be possible to devote some discussion on how the design dovetails with the ideas explored in related areas (for example, the ownership manifesto) without necessarily committing to anything in those areas. It'd be more acknowledging that this particular design makes space for, or at least doesn't intentionally stomp over, related long-anticipated plans and reasonably foreseeable use cases, even if it doesn't take steps towards realizing them presently.

Karl · October 31, 2020, 12:47am

I’m having a little difficulty with this: is it correct that types are marked actor-local, rather than variables/instances?

ktoso · October 31, 2020, 1:24am

That is correct for the current shape of the actor proposal, and that is how the "can't deadlock" is achieved.

In comparison with other actor runtimes, this is an "actors are reentrant by default" design, rather than the default taken by Orleans [1], which is "actors are NOT reentrant by default, but can be flipped to be so by annotating them".

We are taking a step into a more permissive direction than other actor runtimes here in the design.

In Akka style one has both the "don't receive any other message until this returns" as well as "do receive other messages until this returns and continue then". We built those in the library and they surface e.g. in Akka Persistence [2], where there is persist(...) { ... } (no other message can interleave the actor's execution between the persist call and the callback it offers) and persistAsync(...) { ... }, where the actor can receive other messages (is reentrant) while it waits for the persistAsync to complete before running that callback. Swift's actors today offer only the second, "don't block the actor from processing other things", semantics.

In reality I do think we'll need the ability to write uninterrupted/atomically { ... } or something like that... but this has not been designed yet.

Alternatively, a pretty "actor way to think about it" is to spawn more child actors for every "linear" execution one needs, so that's also a world we could end up in that would not be too weird to be honest.
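To make the "reentrant by default" behaviour concrete, here is a minimal sketch. It assumes the `actor` syntax that later shipped in Swift 5.5, which may differ from the draft under discussion; `Mailbox`, `receive`, `deliver`, and `runMailboxDemo` are hypothetical names invented for the illustration. One call suspends inside the actor, and a second call on the same actor is what wakes it up. With a non-reentrant actor the second call could never run.

```swift
// Sketch only: assumes the actor syntax that shipped in Swift 5.5.
// All names here are hypothetical.
actor Mailbox {
    private var pending: CheckedContinuation<String, Never>?
    private(set) var events: [String] = []

    var isWaiting: Bool { pending != nil }

    // Suspends until some other call on this same actor delivers a message.
    func receive() async -> String {
        events.append("receive started")
        let message = await withCheckedContinuation { (continuation: CheckedContinuation<String, Never>) in
            self.pending = continuation
        }
        events.append("receive resumed")
        return message
    }

    // Can run only because the actor is free while `receive` is suspended.
    func deliver(_ message: String) {
        events.append("deliver ran")
        pending?.resume(returning: message)
        pending = nil
    }
}

func runMailboxDemo() async -> (message: String, events: [String]) {
    let mailbox = Mailbox()
    let receiver = Task { await mailbox.receive() }

    // Poll until `receive` has stored its continuation and suspended.
    while true {
        let waiting = await mailbox.isWaiting
        if waiting { break }
        await Task.yield()
    }

    await mailbox.deliver("hello")
    let message = await receiver.value
    let events = await mailbox.events
    return (message, events)
}

let mailboxResult = await runMailboxDemo()
print(mailboxResult.events)
```

The recorded order is "receive started", "deliver ran", "receive resumed": the second message is processed in the middle of the first one, which is the persistAsync-like behaviour described above.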

[1] Reentrancy | Microsoft Orleans Documentation
[2] Classic Persistence • Akka Documentation

John_McCall · October 31, 2020, 1:57am

That's the current idea, yes. The enforcement is static via the type system, so values of actor-local type cannot be shared across actor boundaries, including by being stored in memory that can be accessed by multiple actors, such as a property of a non-actor-local class. And you cannot "escape" a value of actor-local type into a type that's not actor-local; for example, you would not be able to convert it to Any (but you could convert it to actorlocal Any). Whether this will be acceptable for arbitrary classes is something we're going to have to figure out; there are other ways to do it.

masters3d · October 31, 2020, 2:22am

I'm still reading, but does full actor isolation mean that actors can crash/die and a Swift program will keep going with other actors? If so, can actors be spawned to replace dead ones?

John_McCall · October 31, 2020, 2:25am

The intent of the full actor isolation is to define away data races by preventing code from concurrently executing conflicting accesses to the same non-atomic memory. That has always been the goal. To the extent that we achieve it, we will be able to assume a single-threaded model for non-atomic memory; to the extent that we fail to achieve it, we will continue to rely on the undefined behavior of such accesses, and so again we will be able to assume a single-threaded model for non-atomic memory. In no case do we have any intention of defining semantics for concurrent conflicting accesses to the same non-atomic memory.

That then leaves us with two basic questions:

When is memory "non-atomic"?
When are two locations in memory "the same"?

Both of these have had answers for a while now, although maybe we haven't communicated them effectively:

All memory is non-atomic except as accessed through explicitly atomic facilities. Swift offers only extremely modest atomic facilities right now, but we expect to fill these out in time. Those general atomic facilities will almost certainly be declaration-driven. The restrictions that we intend to impose as part of data isolation will also be declaration-driven. So we expect that it will be straightforward to identify a general principle that allows atomic declarations to be safely accessed from multiple actors/threads when other declarations can't be.
Two locations in memory are generally "the same" in this sense if they are part of the same containing object, which is to say, a local variable, a global or static variable, or a class property. Different properties of the same struct are generally "the same memory" for the purposes of judging this. Note that this is, not coincidentally, strongly analogous to the rule used by exclusivity.

Note in particular that our actor isolation model does not depend on deep-copying objects and preventing any memory from being shared between actors. We intend to allow memory to be shared as long as there is something ensuring that it is used safely:

- it could be statically a unique reference
- it could be immutable if we don't know dynamically that it's a unique reference
- it could have only fields that are somehow safe:
  - perhaps they're restricted to only be accessed by a particular actor
  - perhaps they're atomic
  - perhaps they're explicitly unsafe

So that's why we don't think anything we're planning to do will prevent us from implementing a more complete atomics library.
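Two of the sharing conditions listed above (immutable memory, and fields restricted to one actor) can be sketched as follows. This is an illustration under assumptions, not the design described in the post: it uses the `Sendable` protocol and `actor` syntax that shipped later in Swift 5.5, and `Settings` and `RetryTracker` are hypothetical names.

```swift
// Sketch only: assumes Swift 5.5+ `Sendable` and `actor`. Names are hypothetical.

// Immutable reference type: every stored property is a `let` of Sendable type,
// so the same instance can be shared between actors without copying.
final class Settings: Sendable {
    let endpoint: String
    let maxRetries: Int

    init(endpoint: String, maxRetries: Int) {
        self.endpoint = endpoint
        self.maxRetries = maxRetries
    }
}

// Mutable state restricted to a single actor: `attempts` can only be touched
// from code running on this actor.
actor RetryTracker {
    let settings: Settings
    private var attempts = 0

    init(settings: Settings) {
        self.settings = settings
    }

    // Returns false once the shared limit has been reached.
    func recordAttempt() -> Bool {
        guard attempts < settings.maxRetries else { return false }
        attempts += 1
        return true
    }
}

let sharedSettings = Settings(endpoint: "https://example.invalid", maxRetries: 2)
let tracker = RetryTracker(settings: sharedSettings)
let otherTracker = RetryTracker(settings: sharedSettings)

let firstAttempt = await tracker.recordAttempt()
let secondAttempt = await tracker.recordAttempt()
let thirdAttempt = await tracker.recordAttempt()
// Both actors hold the very same Settings object; nothing was deep-copied.
let sameObject = tracker.settings === otherTracker.settings
print(firstAttempt, secondAttempt, thirdAttempt, sameObject)
```

The two trackers share one `Settings` instance by reference, while each keeps its own counter private to itself.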

John_McCall · October 31, 2020, 2:27am

No, that's not how we're using that term. It's an interesting future direction, but we think it would be far too much to try to impose on the standard Swift ecosystem; it would need to be opt-in in some way. So it's in our mind's eye, but we're not looking to design it now.

kneekey23 · October 31, 2020, 2:35am

I'd like to second this comment. So excited for this!

Chris_Lattner3 · October 31, 2020, 5:49am

Awesome, I am very excited to see this coming together after so much work over the years on things like exclusivity and the other fundamentals that have gone into this. I'm also happy to see that it is generally aligned with the rough concurrency manifesto outlined earlier. The approach is better in several specific ways than that outline, e.g. the elimination of deadlock is very interesting and could be great (but also needs to be carefully considered, because there are tradeoffs).

Also fantastic, I think this is a really great programming model which will lift the programming experience in Swift and provide a strong conceptual foundation for concurrent programming.

As I related to multiple core team members in previous communication, I think that splitting this into two phases like this (and the proposed approach for the second stage) is extremely concerning for several reasons:

A major point of introducing actors is to get rid of shared mutable state -- and the corresponding bugs that go with them. This is the key to providing a safe concurrent programming model, and one of the major failures of actor implementations like Akka. Stage #1 doesn't achieve this.
Actors will be adopted by the community very rapidly, and a lot of code will be written against "stage 1" of the design. Doing a hard source break in a subsequent release of the language is going to fracture the community, and cause unnecessary problems for adoption.
The proposed "Stage 2" solution to memory isolation (actorlocal, mutableIfUnique, et al) doesn't solve the general use case (it covers a few specific subcases) which means that "stage 2" will be less expressive than "Stage 1". This means that it may be very difficult to adopt even for people who are willing to rewrite their code.
The proposed approaches for achieving memory isolation are type system intensive (making the language much more complex), will be difficult to explain to non-super-expert Swift programmers, and will have some fairly concerning semantics implications that may make them not adoptable in general. We should provide a simple model that is easy to explain.
The description above makes it sound like you're not interested in the extremely important case of data structures that use fine-grained locking and lock-free structures internally. While I agree that it is good to push programmers (by default) to just use actors to protect shared mutable state, I also think it is important that we allow expert programmers to build powerful libraries that compose with actors correctly.
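As an illustration of the last point, here is a minimal sketch of a lock-protected type shared by several actors. It is not the solution referred to in this post: it assumes the `@unchecked Sendable` escape hatch that shipped later in Swift 5.5, and `LockedCounter`, `Worker`, and `runWorkers` are hypothetical names.

```swift
import Foundation

// Sketch only: assumes Swift 5.5+. Names are hypothetical.
// The author, not the compiler, vouches for thread safety here; that is what
// `@unchecked Sendable` expresses.
final class LockedCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var count = 0

    func increment() {
        lock.lock()
        defer { lock.unlock() }
        count += 1
    }

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}

// Several actors share one counter. The lock is only held inside synchronous
// methods, never across an `await`.
actor Worker {
    let counter: LockedCounter

    init(counter: LockedCounter) {
        self.counter = counter
    }

    func work(times: Int) {
        for _ in 0..<times {
            counter.increment()
        }
    }
}

func runWorkers() async -> Int {
    let counter = LockedCounter()
    let workers = (0..<4).map { _ in Worker(counter: counter) }
    await withTaskGroup(of: Void.self) { group in
        for worker in workers {
            group.addTask { await worker.work(times: 1_000) }
        }
    }
    // All child tasks have finished once the group returns.
    return counter.value
}

let total = await runWorkers()
print(total) // 4000
```

Four actors run concurrently against the same object, and the lock rather than actor isolation keeps the count consistent.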

I believe that there is a simple solution here, which I have been discussing with JohnMC. The description above does a good job of summarizing the two issues, which are cleanly separable:

I will try to find some time this weekend to write the reference type issue up, it seems completely additive to the model proposed here, and will resolve the concerns above. If that is well received, we can talk about globals.

I will also try to review the initial proposal drafts in detail to provide a round of feedback when I can. It is good to see these coming together, but there are a lot of details that will require a significant amount of iteration. While it is super useful to see a draft of these all at once (to understand how they fit together) it will be challenging to keep all the moving parts in my head as the design changes over time. I hope we can serialize some of the formal reviews bottom-up (e.g. starting with async functions).

-Chris

Sajjon · October 31, 2020, 11:17am

Awesome! Very exciting indeed! I have a few questions

Question 1:
Can we use await in test methods? How would timeouts work?

Question 2:
(addressing devs working at Apple) Are there any plans to make it easy to create a Combine Publisher by passing in an async function? I think this is possible in JavaScript land, creating a Promise (part of JS itself, not RxJS, I believe) by passing in an async method.

Question 3:
How will async/await live alongside FRP frameworks such as Combine and how should I reason when choosing between these two rather different tools/solutions when designing asynchronous parts of my code?

If the answer to Q1 and Q2 is YES, and we can easily use async methods with XCTest, then it seems to make sense to design my code around easy-to-test async methods and then, “as late as possible”, turn them into reactive streams when I see a need for Merge/Combine/Map etc. What are your thoughts on the matter?

Question 4:
(Relating to Q3 maybe, also relevant for Apple employees) do you plan to make use of async/await and Actors in SwiftUI?
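On the timeout part of Question 1, one possible shape is to race the awaited operation against a timer. The sketch below is an assumption-laden illustration, not an XCTest API: it uses the task group functions that later shipped in Swift 5.5, and `withTimeout` and `TimedOut` are hypothetical names.

```swift
// Sketch only: assumes Swift 5.5+ task groups. `withTimeout` and `TimedOut`
// are hypothetical names, not part of XCTest or the standard library.
struct TimedOut: Error {}

func withTimeout<T: Sendable>(
    nanoseconds: UInt64,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Child 1: the real work.
        group.addTask { try await operation() }
        // Child 2: a timer that fails the group if it finishes first.
        group.addTask {
            try await Task.sleep(nanoseconds: nanoseconds)
            throw TimedOut()
        }
        // Whichever child finishes first decides the outcome.
        let result = try await group.next()!
        group.cancelAll()
        return result
    }
}

// Finishes well within the limit.
let fast = try await withTimeout(nanoseconds: 1_000_000_000) { 42 }

// Exceeds the limit, so the timer child throws first.
var timedOut = false
do {
    _ = try await withTimeout(nanoseconds: 50_000_000) { () async throws -> String in
        try await Task.sleep(nanoseconds: 5_000_000_000)
        return "too late"
    }
} catch is TimedOut {
    timedOut = true
}
print(fast, timedOut)
```

When the timer wins, the group cancels the remaining child and waits for it to finish, so this only returns promptly if the operation cooperates with cancellation (as `Task.sleep` does).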

michelf · October 31, 2020, 1:38pm

I get a similar feeling. Introducing a feature and then adding restrictions to it a short while thereafter seems like a recipe for discontent.

Perhaps the two stages should happen in parallel. For instance, Swift 6 and 5.x language modes could be introduced simultaneously, with 6 coming with full actor isolation while 5.x comes with the same thing minus breaking changes (and is therefore not fully actor-isolated). This way projects can migrate to full isolation at their own pace while still being able to interoperate with other modules using actors. Actor support in 5.x would then be seen as a compatibility compromise for using a Swift 6 feature in an existing code base and would produce warnings (perhaps via a flag) for things that'd be an error in Swift 6. This way you can migrate code progressively while keeping it in a working and testable state.

If some code can't be migrated to 6 because of missing things, it can continue using 5.x for a while. Then when 6.1 makes things better, its improvements also become available as 5.x.y for backward compatibility and the progressive migration process can continue. This feedback loop can continue for a while until we're confident all the important use cases have a good migration path.

Chris_Lattner3 · October 31, 2020, 4:11pm

I don't think there is any need for a massive source break -- both async and actors are additive features to the language. There may be some minor issues at the edge (e.g. taking await as a keyword), but nothing that should cause systemic incompatibility. I'll try to complete a writeup in the next day or so.

-Chris

rvsrvs · November 1, 2020, 3:45pm

I gather that the first two (queues and threads) are sort of the difference between say, Combine and NIO. The thing I'd love to see for actors is a green threads implementation. And that all those "modes": (dispatch queue, pthreads, green threads and whatever other scheme there may be out there that I don't understand) be surfaced in such a way that application programmers can pick the one most suitable to their needs.

It seems reasonable to me (as someone who won't have to implement it) to have async/await explicitly support several m:n concurrency models as "structured concurrency". In the context of this discussion then it could allow me to thread my actor types in a way most suitable to my needs. I could imagine a declarative model here (à la SwiftUI) where the modifiers allowed me to change the mode of a task from one m:n form to another.

I've worked in systems where all of the above were available, but never all at the same time. I've always sort of had to live with what the underlying OS was willing to provide. I'm excited for this because it has always seemed to me like the compiler could do so much more to optimize my code than I can.

Chris_Lattner3 · November 1, 2020, 6:37pm

Just to follow up on this, I posted the proposal on this thread in the pitches section.

dabrahams · November 1, 2020, 11:49pm

I'd like to understand why we should be confident that this step results in a usable programming model.

While reentrant actors still technically eliminate data races, they create a very racy programming experience, akin to sharing a bunch of atomic properties across multiple threads. True, it's slightly different, because your thread of execution can only be interrupted where it issues an await instead of at arbitrary points, but if the actor's properties have any important relationships, before any await you have to be sure invariants are intact, and you have to be prepared for the actor's mutable properties to have “shifted underneath you” when the await is over. We use locks to avoid both of these effects in multithreaded code, and it seems the same basic kind of care is needed here.

IIUC part of the reason async/await has been so successful in other languages is that it lets us reason about async code using the same tools that apply to synchronous code, but reentrancy seems to substantially break that illusion. Common idioms, like using the properties of an instance to hold the state of a single larger computation, will break most spectacularly and with exactly the same kinds of unpredictability as if there were an actual data race, if deployed in the usual ways in reentrant actors.
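The hazard described above can be sketched in a few lines. This assumes the reentrant `actor` semantics that later shipped in Swift 5.5; `Ledger`, `withdrawRacy`, `withdraw`, and the `audit` parameter are hypothetical names. The racy version checks an invariant before the `await` and acts on it afterwards; the safer version completes its state change before suspending.

```swift
// Sketch only: assumes Swift 5.5+ reentrant actors. Names are hypothetical.
actor Ledger {
    private(set) var balance: Int

    init(balance: Int) {
        self.balance = balance
    }

    // Racy: the guard runs before the suspension point, the subtraction after
    // it. Another withdrawal can run on this actor in between.
    func withdrawRacy(_ amount: Int, audit: @Sendable () async -> Void) async -> Bool {
        guard balance >= amount else { return false }
        await audit()
        balance -= amount
        return true
    }

    // Safer: the check and the mutation happen together, before any await.
    func withdraw(_ amount: Int, audit: @Sendable () async -> Void) async -> Bool {
        guard balance >= amount else { return false }
        balance -= amount
        await audit()
        return true
    }
}

func runLedgerDemo() async -> (racy: Int, safe: Int) {
    // The audit step itself triggers a second withdrawal on the same actor,
    // which runs while the outer call is suspended.
    let racyLedger = Ledger(balance: 100)
    _ = await racyLedger.withdrawRacy(80) {
        _ = await racyLedger.withdrawRacy(80) { }
    }

    let safeLedger = Ledger(balance: 100)
    _ = await safeLedger.withdraw(80) {
        _ = await safeLedger.withdraw(80) { }
    }

    let racyBalance = await racyLedger.balance
    let safeBalance = await safeLedger.balance
    return (racyBalance, safeBalance)
}

let ledgerResult = await runLedgerDemo()
print(ledgerResult.racy, ledgerResult.safe) // -60 20
```

No data race occurs in either version, yet the racy one ends with a negative balance because the value it checked had shifted by the time the `await` returned.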

I have yet to read through all the documents (a quick scan finds no mentions of reentrancy), so apologies if I've missed it, but it seems to me we at least need a set of “reentrant actor programming guidelines” if we're going to adopt this model. Does that exist somewhere?

anandabits · November 2, 2020, 4:38am

I think this mis-states the motivation for weak self, at least in many cases. The reason weak self is usually used, in my experience, is to avoid extending the lifetime of an object. For example, when fetching data from the network it is usually not desirable to keep an object alive to wait for the response. Instead, if the object is deinitialized there is no longer a need to load the data and the network request should be cancelled.

I've given quite a bit of thought to this aspect of the design. It definitely makes it a lot easier to unintentionally write code that will keep an object alive waiting for the result of a task which could take an arbitrarily long amount of time. On balance, I think the tradeoffs are probably right. But I think there need to be clear guidelines for best practices around when to use a detached task to separate task lifetime from object lifetime.

I'm having a hard time understanding exactly what this means. Some example code would be very helpful.

I agree that re-entrancy will result in code that doesn't behave the way it looks. It will be necessary to drop all assumptions about the state of an actor at every await, including the assumption that locals derived from the state of an actor still reflect its state. That said, it looks to me like the advantages of this approach are significant. It looks like this is a tough tradeoff we have to make where both choices have serious downsides.

lukasa · November 2, 2020, 8:11am

We definitely see lots of weak self in SwiftNIO programs that simply do not require it.

Joe_Groff · November 2, 2020, 3:41pm

anandabits:

I think this mis-states the motivation for weak self, at least in many cases. The reason weak self is usually used, in my experience, is to avoid extending the lifetime of an object. For example, when fetching data from the network it is usually not desirable to keep an object alive to wait for the response. Instead, if the object is deinitialized there is no longer a need to load the data and the network request should be cancelled.

I've given quite a bit of thought to this aspect of the design. It definitely makes it a lot easier to unintentionally write code that will keep an object alive waiting for the result of a task which could take an arbitrarily long amount of time. On balance, I think the tradeoffs are probably right. But I think there need to be clear guidelines for best practices around when to use a detached task to separate task lifetime from object lifetime.

One thing I think we definitely need is a way to tie cancellation of a task to the lifetime of an object. There are a couple of ways I could see this being done:

Have a cancellation token that carries a weak reference to the object. Checking isCancelled can check whether the reference is still nonnull.
Have a form of task nursery that can be used as a stored property in a class, instead of being scoped, so that child tasks spawned in that nursery all get canceled when the object is destroyed and cleans up its nursery.
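A rough approximation of the second idea can be sketched with an unstructured task handle stored in the object and cancelled in `deinit`. This is not a nursery and not a proposed API: it assumes the `Task` type that later shipped in Swift 5.5, and `Probe`, `Loader`, and `runLoaderDemo` are hypothetical names. It works only because the task's closure never captures the object itself.

```swift
// Sketch only: assumes Swift 5.5+ `Task`. Names are hypothetical.

// Records whether the background work observed its cancellation.
actor Probe {
    private(set) var sawCancellation = false
    func markCancelled() { sawCancellation = true }
}

final class Loader {
    private let task: Task<Void, Never>

    init(probe: Probe) {
        // The closure captures only `probe`, never `self`, so the running
        // task does not keep the Loader alive.
        task = Task {
            while !Task.isCancelled {
                // Stand-in for real work; wakes early when cancelled.
                try? await Task.sleep(nanoseconds: 10_000_000)
            }
            await probe.markCancelled()
        }
    }

    deinit {
        // Tie the task's lifetime to the object's lifetime.
        task.cancel()
    }
}

func runLoaderDemo() async -> Bool {
    let probe = Probe()
    do {
        let loader = Loader(probe: probe)
        _ = loader
    } // The only reference is gone here, so deinit cancels the task.

    // Give the task a bounded amount of time to notice the cancellation.
    for _ in 0..<200 {
        let cancelled = await probe.sawCancellation
        if cancelled { return true }
        try? await Task.sleep(nanoseconds: 10_000_000)
    }
    return false
}

let cancelledAfterDeinit = await runLoaderDemo()
print(cancelledAfterDeinit)
```

If the closure captured `self` strongly, `deinit` would never run while the task was alive, so the cancellation would never fire.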

John_McCall · November 2, 2020, 4:22pm

I agree with the goal behind your statement, but I don’t know that reference-based solutions would work, since any tasks would often be keeping the object alive. You need a higher-level “I’m done with this” step.