Preventing Data Races in the Swift Concurrency Model

Jean-Daniel · January 4, 2021, 8:54am

I'm not sure what Chris means by "explosion of attributes". Is it concern about the number of new attributes, or the fact the a lots of declarations everywhere will now have to be declared with attributes.

Renaming the attributes will only solve the first concern, not the fact that too much declarations will require attributes.

Chris_Lattner3 · January 4, 2021, 9:51pm

I'm concerned about the later - similar to how marking one parameter const in a c++ program means you may need to go spread const everywhere.

Chris_Lattner3 · January 4, 2021, 9:58pm

Hi Doug,

In an attempt to hill climb on a solution here, I think it would make sense to start by talking about the closure part of this design. Are you opposed to having a @concurrent attribute on function types, which is orthogonal to but works similar to @escaping?

The semantics of an escaping closure would be that it would be subject to ConcurrentlySharable checking (both for captures and arguments), I think it would default to capturing by value instead of by reference, and we could even make them default to async.

If we go this direction, it has a number of advantages because it is a new additive features that is opt-in. For example:

It allows modeling the full cross product of "concurrent but not escaping" "escaping but not concurrent" "neither" and "both".
It would eliminate one of the major sources of source breakage in this direction, because you wouldn't impose sharability checks on existing code.
It implies that local functions would only be concurrent if declared as a @concurrent func foo() marker.
It works great with actors and structured concurrency alike, and be a unifying theory across the two.

What do you think about this?

-Chris

Douglas_Gregor · January 7, 2021, 12:38am

Chris_Lattner3:

Naming bikeshed for ActorSendable : +1

The protocol-based actor isolation pitch proposes the use of a protocol ( ActorSendable ) to describe types that are safe to transfer across actor boundaries. It is a promising model; we're using the term "shareable" here because the problem isn't limited to crossing actor boundaries.

I'm really happy to see embracing something like the ActorSendable idea. I agree with your premise that we should find a better name than ActorSendable for this concept, given that it should be applies to structured concurrency as well. I'm not sure that Sharable is descriptive enough though. Would you be open to a longer name like ConcurrentlySharable or something else to scope this and make it more obvious that this is a specific kind of sharing?

Yes, I'm open to longer names to make things more clear here. In my experimental implementation of this idea I ended up going with ConcurrentValue, because the fundamental property here is that values of the type can be used concurrently. I think @Joe_Groff's formulation as being about reachability to unsynchronized shared mutable state is the thing to capture.

Yes.

Chris_Lattner3:

Non-shareable types + Closures: +1

Agreed on this section , just one nit pick:

In general, the problem here is one of escape analysis : we need to determine whether a given non-shareable value can "escape" out of the current context and into a context where they might be used concurrently. The suggestions above likely to not account for all of the ways in which a value of non-shareable type can escape.

This isn't really about escape analysis. This is just applying the ActorSendable type restrictions to the capture list of the closure -- the compiler already forms the capture list, so it isn't a new sort of escape analysis. As mentioned in my comments below, you only apply these restrictions to concurrent closures, so compatibility with existing code shouldn't be affected.

Good point, thanks.

I disagree fairly strongly with the main direction of that paper, as I noted in my response. Implicitly turning global variables into per-actor-instance variables is a huge semantic change, and I don't think it's going in the right direction. Technically not "source breaking", because your code will still compile, but will behavior very strangely the moment anyone introduces an actor that touches a global---even for correctly-synchronized global variables, such as a singleton that handles synchronization internally (it would no longer be a singleton in your proposed). At least with a source break, you know that semantics have changed.

Do you want to debate that point further, or are there other models you want to discuss?

This is incorrect. You can evolve a let into an @actorIndependent var without breaking API or ABI.

In this model, "only certain kinds of async functions" can be called from another actor because of the ActorSendable/ConcurrentValue story. Either the whole model has this problem or it's not a real problem; there is no argument specific to let here.

I've been writing code against this model, and it is so very, very useful in practice. This current pitch on Preventing Data Races takes away the special case from actors and makes all let's with ActorSendable/ConcurrentValue/Shareable/wathever types accessible concurrently.

I truly don't understand where you're coming from with this line of reasoning. Immutability is perhaps the simplest and most pervasive tool we have for making concurrent code safe from data races. Swift has nudged programmers toward using let because immutability makes code simpler to reason about, and it also makes it safe to use concurrently when working with ActorSendable/ConcurrentValue types---something Swift has also nudged programmers worked with pervasive value semantics. Why would we not embrace this wholeheartedly in our concurrency design?

Nor are all nonescaping closures non-concurrent. Per your hill-climbing question:

I'm not opposed to having a @concurrent attribute on function types. I think we'll need them regardless.

The prototype and my pitch take the approach of having new code---code that is within an actor or async code---opt in to the checking based on the "escaping closures are assumed concurrent" notion. Having @concurrent types be completely opt-in leaves large holes in the checking model that lets us diagnose data races in code that uses actors with existing APIs. Most examples are more subtle, but let's be blatant:

actor class MyActor {
  var count: Int

  func f(array: [Int]) {
    runLater {
      self.count = self.count + 1
    } 
  }
}

Where we got runLater from some library:

func runLater(_ body: @escaping () -> Void) {
  // toss this on a timer somewhere to run
}

My proposed scheme and the prototype will complain about this because escaping implies concurrent by default. But it allows non-escaping closures to just work

extension MyActor {
  func silly(array: [Int]) {
    array.forEach { i in count += i } // okay because closure is non-escaping, so it's in the actor's domain
  }
}

With @concurrent being a purely opt-in feature, we'll either silently admit the data race in the first example, or have to reject the second example.

Having escaping imply @concurrent by default doesn't mean that escaping and concurrent aren't orthogonal; rather, the defaults are lined up and you have different spellings for the exceptional cases:

@escaping () -> Void: escaping, concurrent
@escaping @nonconcurrent () -> Void: escaping, non-concurrent (expected to be very, very rare)
() -> Void: non-escaping, non-concurrent
@concurrent () -> Void: non-escaping, concurrent (parallel/concurrent algorithms would use this)

The first and third are the most common cases, and with the approach I'm describing, they don't have to change at all (neither ABI nor API). But we can still address the uncommon cases to fill out the matrix.

Because it took me longer to figure out that this can be checked directly.

It would still be implicit, because most closures will become @concurrent implicitly based on context. That's probably okay in practice.

Doug

Chris_Lattner3 · January 7, 2021, 6:57am

Douglas_Gregor:

Chris_Lattner3:

I'm really happy to see embracing something like the ActorSendable idea. I agree with your premise that we should find a better name than ActorSendable for this concept, given that it should be applies to structured concurrency as well. I'm not sure that Sharable is descriptive enough though. Would you be open to a longer name like ConcurrentlySharable or something else to scope this and make it more obvious that this is a specific kind of sharing?

Yes, I'm open to longer names to make things more clear here. In my experimental implementation of this idea I ended up going with ConcurrentValue , because the fundamental property here is that values of the type can be used concurrently. I think @Joe_Groff's formulation as being about reachability to unsynchronized shared mutable state is the thing to capture.

Cool, ConcurrentValue sounds great to me!

Yes I absolutely do, but I think there are several other parts of the model that are worth converging on first (which also have nothing to do with global variable semantics), so I suggest we focus effort there and come back to this topic.

Ok, the confusion is that @actorIntependent hasn't come up in this particular thread.

To recap the other threads, I have not seen clear motivation on the design of @actorIntependent. My understanding is that it has something to do with wanting actors to conform to protocols with sync requirements. If that is the case, then we need to discuss the topic and possible solutions, because there are better (in my opinion) ways to solve that problem.

However, it is entirely possible I've misread the goals here (again, because I haven't seen the motivation). In any case, we need to discuss this further.

I'd love to know more about the use cases, because it has simply been presented as "the design" without motivation in the proposal. I offered to write up a doc exploring the issue but you requested that I not do that. I gave a short version in response. I haven't had a chance to get back to that thread, so it is on me I guess, but I'd prefer to see actual rationale in a doc that explains @actorIndependent so I'm not trying to understand and discuss a moving target.

In any case, the entire design of actors + sync requirements has alternate designs that haven't been evaluated (in the forums). I don't see it as a forgone conclusion that @actorIndependent is a good design, and if it doesn't stick, then the let resilience / API migration issue is a very important concern. I suggest that we split this issue out to its own discussion thread.

Douglas_Gregor:

Chris_Lattner3:

In short, I'm not sure why initialization-only data is worth special casing like this: it's not very expressive to only allow cross-actor/task sharing of init-only data, and there are alternate (much more general ways) to allow actors to provide sync requirements, including vars and function requirements. I can elaborate if you'd like.

I truly don't understand where you're coming from with this line of reasoning. Immutability is perhaps the simplest and most pervasive tool we have for making concurrent code safe from data races. Swift has nudged programmers toward using let because immutability makes code simpler to reason about, and it also makes it safe to use concurrently when working with ActorSendable/ConcurrentValue types---something Swift has also nudged programmers worked with pervasive value semantics. Why would we not embrace this wholeheartedly in our concurrency design?

I'm a huge fan of immutability and immutable types, even immutable reference semantic types (that's one of the motivations for the ConcurrentValue proposal). However that isn't what this proposal does or is about. This concept is about allowing actors - which are inherently bags of mutable state that are accessed asynchronously - to have narrow initialization semantics with special case sync access behavior.

This muddies the water by complicating the model, and isn't motivated anywhere that I've seen. Arguing that this is motivated by "immutability being good" doesn't seem sufficient: while we can agree that immutability is good :-) we also need to look at the effect of the design decisions on the resultant programming model because there are many factors to weigh.

This is fantastic news. I'd love to see this get nailed down, I think it will help clarify a bit chunk of the global concurrency model.

I don't understand this, can you please explain more? There are two designs possible here (the "matrix" and the "tower" approach). I don't have a strong opinion either way, but I'll use the matrix to explain my understanding of how this works.

In the matrix design, @concurrent is a function type attribute that is orthogonal to @escaping and you can have the full cross product of attributes. This means that you get a natural subtype relationship as well: @concurrent () -> () can implicitly convert to () -> () just like @escaping () -> () implicitly converts to () -> ().

Given a design where @concurrent is a new opt-in feature, you get no data race and don't reject the second example. The first example fails to build when you implement runLater with an actor:

func runLater(_ body: @escaping () -> Void) {
  // toss this on a timer somewhere to run
  otherActor.enqueueAfterTime(1.0, body)
  // ^ error, cannot pass non-concurrent closure across actor boundaries because it isn't a `ConcurrentValue`.
}

or with structured concurrency:

func runLater(_ body: @escaping () -> Void) {
  // toss this on a timer somewhere to run
  taskGroup.spawn {stuff(body) }
  // ^ error, cannot capture 'body' in a concurrent closure because non-concurrent closures aren't `ConcurrentValue`s.
}

This is because spawn (or async let, or whatever) takes a concurrent closure, and they require ConcurrentValue conformance of anything they capture and pass/return as argument/results.

In your second example:

extension MyActor {
  func silly(array: [Int]) {
    array.forEach { i in count += i } // okay because closure is non-escaping, so it's in the actor's domain
  }
}

This is totally fine. forEach doesn't take a concurrent closure, so it doesn't have ConcurrentValue checks etc. It is just local synchronous code, protected by the actor as you'd expect. There's no change to type checking for existing code.

I don't have time to read your large patch, can you please explain what behavior you are proposing now?

The obvious and simple thing is to require the @concurrent attribute on the local function to keep it consistent with closures: the decl attribute would change the type of the local function.

Trying to infer syntactically might be possible, but it seems like it will cause problems with non-trivial cases. That said, I don't know what you're proposal is and shouldn't speculate.

Yeah totally agree, it should work just like the existing @escaping logic, where "the escaping bit" (and thus, "the concurrent bit") is inferred and checked based on the type inferred by the closure if there is context. This is very common since you're often passing the closure to a function.

As a concrete example, in parallelMap(stuff, {...}) the closure is inferred to be @concurrent, so it gets ConcurrentValue checks applied to it.

-Chris

crontab · January 7, 2021, 8:47pm

A very common use case of unprotected lets is, say a server app that loads its configuration at startup and lets everyone read it. Ideally it would enforce immutability of this data and also unprotected access.

One caveat here is that this data should be synchronized once before other threads can read it. If tasks that access this immutable data start after its creation then afaik it's guaranteed to be synchronized.

Another caveat like Doug already mentioned, is unprotected let object references that allow modification.

Not sure how the above can be resolved, but other than that immutable unprotected data seems like a very useful thing.

Chris_Lattner3 · January 8, 2021, 6:04am

Right - I am very +1 on immutability and immutable types. Just make something like that a class or struct with a bunch of let properties, and make it be a ConcurrentValue and you're done.

My point is that there is no reason to make such a bag of immutable state an actors itself - actors are about protecting other mutable state from concurrent accesses. If the state is immutable, then you don't need actors / queues / async etc, they can directly share it without wrapping it in actor.

-Chris

crontab · January 8, 2021, 9:46am

Yes but if an app wants to be 100% correct it will take the "everything is an Actor" model as a basis, I presume. Therefore all data should belong somewhere, either MainActor or any Actor.

I might have a case against this. Immutable states still need synchronization guarantees. In a lower-level multithreaded world this can be achieved by ensuring that shared immutable data is created before the other threads are created since each pthread_create() issues a sync.

I'd be most comfortable if say Swift's Actor constructors issued an implicit sync at the end of their execution (similar to what pthread_create() does) and let me access immutable data within them, unprotected. Because without this I'd be limited to defining e.g. my server app's configuration in global non-actor vars and somehow ensure their non-laziness.

Am I asking too much, or is there a way to achieve this that I'm missing?

Chris_Lattner3 · January 11, 2021, 6:02am

No, this is a common misunderstanding, akin to saying "classes should be eschewed, real swift programmers only use structs and protocols". While it could be possible to use actors for everything, that wouldn't be a very efficient way to solve all concurrency problems. Some things are really best solved by shared mutable state with synchronization (e.g. concurrent hash tables) or by shared immutable types. The model already supports both of those.

Immutable state doesn't need synchronization by definition: nothing mutates it, so you can't have read/mutate races. Read/read races aren't concurrency problems. I think you may be concerned that the pointer to an partially initialized instance might escape before it is fully set up?

The important observation here is that Swift already guarantees that the self pointer for a class/struct/etc cannot escape until all of the members of the type are initialized. If all of the members are immutable, then this means that getting a pointer to an instance guarantees it is fully initialized and (since it is immutable) you know it won't/can't change.

This isn't an accident: this has been a key design point of the initialization model from the beginning. This is a key aspect of the "designated initialization model" and why there are different rules before and after the "super init" call.

-Chris

crontab · January 11, 2021, 9:28am

No, it's a different problem. Immutable data, too, requires synchronization though only once. When you write something to memory on a multicore system it's not guaranteed to become immediately visible to all cores.

However you know it's immutable and you don't want to waste cycles by syncing it every time you read it. So what you usually do in lower-level multithreading is: you ensure your immutable data is initialized before the threads that will be reading it are created, since thread creation issues a sync and therefore it will be safe to read from those threads, unprotected.

Now putting the pieces together, what I was proposing: if say there's a guarantee Actor issues a sync at the end of its constructor (not familiar with the internals yet, but there's some chance it's what actually happens) it means any immutable data defined in the actor is safe to read unprotected.

If on the other hand, actors don't expose their immutable data then what options do I have with say my scenario with the server config structures that will be initialized once and then read a lot? Think of the gigantic configs that Apache or nginx load at startup and then use all the time.

You are suggesting that I create unprotected globals (also ensure they are not lazy!), then what is the guarantee that they are synced and safe to read? There's a good chance some synchronization will happen somewhere before this data is read, but then a rare chance it won't be.

Of course exposing actor lets should be considered very carefully anyway, but on the surface it seems like a good idea that will allow me to follow "everything is an actor" model where it would be easier to provide a formal proof of correctness of the program, and have immutable unprotected data with faster access, too.

dabrahams · January 11, 2021, 4:45pm

Oh, but that is mostly true.