Swift Concurrency Roadmap

John_McCall · December 13, 2020, 4:49am

At least one bug I remember was an actor-like class not remembering to dispatch to its queue in a completion handler before accessing a stored property. It was eliminated by construction simply by converting that class to be an actor.
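
A minimal sketch of that kind of fix, written with the actor syntax that later shipped in Swift 5.5 (which may differ in details from the pitch discussed in this thread). The `Counter` type and the `bumpConcurrently` helper are hypothetical stand-ins, since the real class isn't shown here:

```swift
// Hypothetical example; names are illustrative only.
actor Counter {
    // Actor-isolated state: code outside the actor cannot touch it synchronously.
    private(set) var value = 0

    func increment() {
        value += 1
    }
}

// Many concurrent child tasks mutate the same actor.
// Every access from outside has to go through `await`, so the compiler
// rejects any code that forgets to hop onto the actor first.
func bumpConcurrently(_ counter: Counter, times: Int) async -> Int {
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<times {
            group.addTask { await counter.increment() }
        }
        // The group waits for all child tasks before the closure returns.
    }
    return await counter.value
}
```

Writing `counter.value += 1` from outside the actor would not compile, which is the "eliminated by construction" part.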

dabrahams · December 13, 2020, 5:05am

That's because it essentially makes everything an error propagation point, and if everything can throw, it becomes impossible to restore temporarily broken invariants. However, this leads me to an interesting observation: it would be fine to implicitly insert a cancellation check/throw at any existing error propagation point. I'm not sure that actually gets us anything we want, but it's a tractable middle ground.

ktoso · December 13, 2020, 6:47am

Yeah the bugs we found so far are in the category of "forgetting to protect the state" (i.e. by using some specific queue).

A few were found just because the code would not compile anymore (after making some classes into actors), which pointed out that there were bugs there. In some other cases, removing the callback boilerplate revealed that a function wasn't really doing anything useful once the boilerplate was gone, so we could remove it completely. This stems from developers being so worried about getting async code wrong that they'd over-protect things, and the amount of boilerplate made it hard to tell whether the protection was really needed or was just copy-pasted from whatever all the other functions in the file were doing. In some cases it was the latter.

Many pieces of code we looked at had an insane number of nesting levels, back-and-forth hopping between queues in very specific places, and even reinvented Future types and types to "block until all that async stuff inside there has completed". I think my favourite example was ~100 lines of code with 2 or 3 queues, a blocking "synchronizer" type, and a super intricate dance between all of those. Once we converted it to async we realized... "that's just an (async) map()!" Before the conversion it was impossible to tell, because the patterns involved were hiding what was actually going on there.

The codebases we looked at didn't use many smart or intricate async patterns (e.g. ones where reentrancy would definitely have caused issues).

Since you mention reentrancy as well: you'll be happy to know that I have a writeup incoming about reentrancy for the actors proposal, so I hope we can pick up that specific topic in the upcoming week.
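
To illustrate what "just an (async) map()" could look like, here is a small sketch in Swift 5.5 syntax. The `asyncMap` helper and the `length(of:)` function are hypothetical; neither is part of the standard library or of the code described above:

```swift
extension Sequence {
    // Hypothetical helper: applies an async transform to each element in order.
    func asyncMap<T>(
        _ transform: (Element) async throws -> T
    ) async rethrows -> [T] {
        var results: [T] = []
        for element in self {
            // Each call may suspend; no queues or "synchronizer" types are needed.
            let value = try await transform(element)
            results.append(value)
        }
        return results
    }
}

// Hypothetical stand-in for some asynchronous piece of work.
func length(of word: String) async -> Int {
    await Task.yield()
    return word.count
}

func lengths(of words: [String]) async -> [Int] {
    await words.asyncMap { await length(of: $0) }
}
```

This version processes elements one at a time; a task group would be the natural choice if the transforms should run concurrently.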

ambrosia_florum · January 17, 2021, 11:31pm

I might lack a view of the bigger picture, but what convinced the implementors of this feature to pursue adding async/await? To me, it seems that a similar effect is possible if curried functions are used:

let semantically_async_but_this_is_irrelecant: (URL) -> () -> [User] = { url in
    let group = DispatchGroup ...
    var result = ...
    // spawn tasks
    // the returned closure blocks until the group is done, then hands back the result
    return { group.wait(); return result }
}

Another thing I can't understand is whether a feature to ensure ownership was presented or not. If there isn't any way to say who can change what, then the entire model is useless, no?

Also, why bake this into the compiler if the same effect could be implemented with just ownership and the various concurrency primitives available in Foundation? Is this like the function builders thing?

David_Smith · January 18, 2021, 12:07am

It looks like you’re assuming that “await” blocks like group.wait() does. It does not; it suspends the function and allows other things to run on the thread. That’s quite different from what can be accomplished easily with current library solutions.

ambrosia_florum · January 18, 2021, 1:08am

Is it still much different from what other primitives on OS X can offer, like pthreads for example?
Also, little is said about the implementation of this. Could you say some more about it? I am sure it would shed more light on what it is all about.

Jon_Shier · January 18, 2021, 3:01am

I suggest you read the async/await proposal, as it explains more about the feature.

ambrosia_florum · January 18, 2021, 3:41am

Thanks, I looked at it, but it did not improve my comprehension by much. In fact, I think there is an error, because the document says "Because asynchronous functions must be able to abandon their thread, and synchronous functions don’t know how to abandon a thread, a synchronous function can’t ordinarily call an asynchronous function",
yet later this is given as a correct example: func collect(function: () async -> Int) { ... }. A little more consistency would be nice.

Sadly, it doesn't say much about ownership, or why partial application is a worse alternative. It would be nice to hear from someone who is involved in the implementation.

So two questions: 1) where's the ownership control, and 2) why is CPS not considered?

Avi · January 18, 2021, 4:51am

Async/await is for concurrency. Curried functions and ownership (of what?) have nothing to do with that. The various proposals all have to do with how to efficiently give up control in the middle of a function so that other, unspecified (by the waiting code) functions can make progress.

A common example is waiting for a network resource. A bit of UI code can say "get me that image from this URL" and then (a)wait for it such that other code can run on the caller's thread while that waiting is taking place.

ambrosia_florum · January 18, 2021, 5:11am

You should take a closer look at this code.

ambrosia_florum:

let semantically_async_but_this_is_irrelecant: (URL) -> () -> [User] = { url in
    let group = DispatchGroup ...
    var result = ...
    // spawn tasks
    // the returned closure blocks until the group is done, then hands back the result
    return { group.wait(); return result }
}

Now tell me, how is that different from the 'suspension points' this proposal mentions?
To be clear:

let ongoingProcess = semantically_async_but_this_is_irrelecant (URL.init(...)) //at this point function starts performing part of the work
... some other stuff happening ...
let result = ongoingProcess () //func should have done all the work. Retrieve it or wait

That's how I see it.

In a concurrent setting there is a notion of mutable resources. To ensure that mutations happen in the required order, all participants must agree on who is allowed to change the data. The feature that controls this is called ownership. Recall pthread's condition variable, if that tells you anything.

Nothing requires the calling thread to block all other concurrent processes. Why add this new feature, again?

Avi · January 18, 2021, 6:14am

It's different because group.wait() blocks the calling thread until the group is left as many times as it was entered. With async/await, the caller of semantically_async_but_this_is_irrelecant would await the result, which would free up the thread to execute other code. The essential difference is whether the suspension is blocking or not. The whole point of concurrency is that it's not blocking.
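
As a rough sketch of the non-blocking counterpart, here is how the curried example might look with async/await in Swift 5.5 syntax. The `User` type, `fetchUser(named:)` (a stand-in for a network call, simulated with a short sleep), and `fetchUsers(named:)` are all assumptions made for illustration:

```swift
struct User: Equatable, Sendable {
    let name: String
}

// Hypothetical stand-in for a network request.
func fetchUser(named name: String) async -> User {
    // Suspends the task without blocking the thread it was running on.
    try? await Task.sleep(nanoseconds: 10_000_000)
    return User(name: name)
}

func fetchUsers(named names: [String]) async -> [User] {
    await withTaskGroup(of: (Int, User).self, returning: [User].self) { group in
        // Spawn one child task per name; they may run concurrently.
        for (index, name) in names.enumerated() {
            group.addTask { (index, await fetchUser(named: name)) }
        }
        // Collect results as they complete, restoring the input order.
        var slots = [User?](repeating: nil, count: names.count)
        for await pair in group {
            slots[pair.0] = pair.1
        }
        return slots.compactMap { $0 }
    }
}
```

A caller writes `let users = await fetchUsers(named: ...)`. Where `group.wait()` would hold the calling thread until all work is done, the `await` here suspends the caller and leaves its thread free to run other work.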

I'm not sure what you're trying to say here. Without async/await, what mechanism allows the calling thread to continue with other work while the function which invoked the async method runs to completion?

Concurrency isn't multi-threading. It's about allowing a single thread to interleave work from different functions. On top of that, we use multiple threads for efficiency.

ambrosia_florum · January 18, 2021, 6:47am

There are things that do not do it. A custom non-blocking implementation is possible based on NSThread, OperationQueue, pthreads, you name it.

You mean that a function can await multiple points at the same time, right? Isn't that irrelevant, since all modern CPUs would exploit any opportunity for parallelism?

wowbagger · January 18, 2021, 6:52am

These 3 proposals should cover it:

Actors

ConcurrentValue and @concurrent closures (previous revision: Protocol-based Actor Isolation: Draft #2)

Preventing Data Races in the Swift Concurrency Model

Avi · January 18, 2021, 7:00am

Today, we use completion handlers. The proposals and/or the manifesto give a good explanation of why completion handlers are suboptimal, from both an execution standpoint and from a developer standpoint.

No. A function is suspended when it invokes await. It can only be suspended at one point at a time.

Using multiple threads, whether executing concurrently on multiple cores, or interleaved on the same core, is not always desired. Think of the main thread for UI. UI operations must occur on this thread, but sometimes you need to invoke an async process. The proposed async/await allows the main thread to not block (which is very bad for UX) while that async work is ongoing. It doesn't matter whether that async work is itself running on the main thread. It's about the main thread itself not being blocked.
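
A small sketch of that main-thread scenario in Swift 5.5 syntax. The `Screen` class, its methods, and the `demo()` function are hypothetical, and the sleeps merely simulate a slow download:

```swift
// Hypothetical stand-in for UI state that must only be touched on the main thread.
@MainActor
final class Screen {
    private(set) var events: [String] = []

    func download() async {
        events.append("download started")
        // Suspension point: the main actor is released while we wait.
        try? await Task.sleep(nanoseconds: 200_000_000)
        events.append("download finished")
    }

    func handleTap() {
        events.append("tap handled")
    }
}

@MainActor
func demo() async -> [String] {
    let screen = Screen()
    // The unstructured task inherits the main actor from this function.
    let task = Task { await screen.download() }
    // Give download() time to start and reach its suspension point.
    try? await Task.sleep(nanoseconds: 50_000_000)
    // Runs on the main actor even though download() has not finished yet.
    screen.handleTap()
    await task.value
    return screen.events
}
```

With these timings the tap would typically be recorded between the two download events, all on the main actor. Had `download()` blocked instead of suspending, the tap could only be handled after the download finished.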

Max_Desiatov · January 18, 2021, 8:26am

On architectures like WebAssembly, or AVR, or even bare metal on any CPU, multi-threading is a high-level construct. Those environments are mostly single-threaded.

In addition, threads are expensive when compared to async/await tasks. You should be able to spawn thousands of tasks and actors, while creating a thousand OS threads is inefficient because time is wasted on context switching. The other relevant difference is that concurrency built with OS threads is preemptive, while async/await is cooperative, which is conceptually much simpler, and easier to implement and debug.

Most importantly, the cooperative concurrency possible with async/await is orthogonal to multi-threading, not mutually exclusive with it. You can have multi-threaded executors, or a single-threaded one, like this one for WebAssembly, which integrates with the JavaScript event loop, which is single-threaded. When all browsers support WebAssembly atomics, we'll get a multi-threaded executor to schedule async/await tasks on multiple web workers (i.e. OS threads).

ambrosia_florum · January 18, 2021, 9:59am

I was thinking about speculative/out-of-order execution/ILP because I thought that async functions could await multiple points simultaneously. But I was told that they don't, so I vaguely understand what the idea of cooperativity ought to imply here.

I once heard that context switches do indeed cost a lot, but that the same is not true for thread switches. I don't know for certain, but DispatchGroup is something like a single context with many threads put in it. Are these suspensions really that much more efficient?

ambrosia_florum · January 18, 2021, 10:12am

Maybe I didn't emphasize it enough, but this pseudocode looks just like this whole suspension thing.

let load_and_decode: ([URL]) -> (Decoder) -> (Filter) -> () -> [Image] = { urls in
  // spawn a local thread pool
  // start loading from the network asynchronously using threads
  return { decoder in
    // run decoding asynchronously
    return { filter in
      // concurrently apply the filter to the images
      return {
        // wait for all threads in the pool to finish their work
        return result
      }
    }
  }
}

Can anybody please show what the goal of the OP is? Some examples demonstrating the differences from regular thread pools would be much appreciated.

Max_Desiatov · January 18, 2021, 10:30am

The difference is that you don't need a thread pool and don't need new closure scopes with async/await. I suggest you look at the implementation of async/await in other languages (esp. single-threaded ones like JavaScript, where multi-threading caveats don't get in the way) to get the feel of it. async/await is not "just rewriting it as a bunch of chained closures under the hood", the transformation is more complex. I'm not 100% sure that async/await is built on top of LLVM coroutines, but I suspect that the code transformation happening in the compiler is of that nature. If you need a higher level example, have a look at the Regenerator transformer for JavaScript, which transforms suspending generator functions (on top of which JavaScript's async/await is built, replace yield with await there when reading) into a state machine.
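
To give a feel for what "a state machine" means here, below is a hand-written illustration in Swift. It is not what the Swift compiler or Regenerator actually emits; `SumOfTwoMachine` is a hypothetical type that mimics a function of the shape `let a = await first(); let b = await second(); return a + b`:

```swift
// Illustration only: a function with two suspension points,
// rewritten by hand as a resumable state machine.
struct SumOfTwoMachine {
    enum State: Equatable {
        case awaitingFirst              // suspended at the first `await`
        case awaitingSecond(first: Int) // suspended at the second `await`
        case finished(result: Int)      // ran to completion
    }

    private(set) var state: State = .awaitingFirst

    // Called by whoever produced the awaited value. Runs the code between
    // the current suspension point and the next one, then returns.
    mutating func resume(with value: Int) {
        switch state {
        case .awaitingFirst:
            state = .awaitingSecond(first: value)
        case .awaitingSecond(let first):
            state = .finished(result: first + value)
        case .finished:
            break // nothing left to run
        }
    }
}
```

Between calls to `resume(with:)` the machine is just a value sitting in memory and occupies no thread, which is roughly what distinguishes a suspended function from a blocked one.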

ambrosia_florum · January 18, 2021, 11:34am

I have no idea why you showed it to me. There is no mention of yielding being implemented in the proposal.

This excerpt is from the doc:
Because potential suspension points can only appear at points explicitly marked within an asynchronous function, long computations can still block threads. This might happen when calling an asynchronous function that just does a lot of work, or when encountering a particularly intense computational loop written directly in an asynchronous function. In either case, the thread cannot interleave code while these computations are running, which is usually the right choice for correctness, but can also become a scalability problem.

Which hardly suggests any cooperativity. It just blocks the calling thread! Doesn't it look like DispatchGroup? I thought that this feature was about implementation tricks that would make it a killer feature, but it seems to be just another way to do stuff. You may know more than I do about this stuff, but to me it looks like Swift is about to get a ton of new useless sugar.

Max_Desiatov · January 18, 2021, 11:40am

I'm not sure what else I can provide to clarify my point. The quote you gave explicitly refers to yielding and cooperative multitasking here:

"potential suspension points" = await markers required to make calls to async functions. Long computations (i.e. CPU-bound tasks, as opposed to IO-bound tasks like networking or reading a file) can still block threads precisely because concurrency is cooperative here. With preemptive concurrency a thread "blocking" a CPU core can be rescheduled at an arbitrary point of execution. With async/await a task can "unblock" an OS thread by suspending ("yielding" if we make a connection to generators here) and allowing other tasks to run on the same thread.
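
As a small sketch of what "cooperatively suspending" can look like for a CPU-bound loop, using `Task.yield()` from the concurrency library as it shipped in Swift 5.5. The `sumOfSquares` function and its `yieldEvery` parameter are made up for illustration:

```swift
// Hypothetical CPU-bound task that periodically offers to give up its thread.
func sumOfSquares(upTo n: Int, yieldEvery interval: Int = 1_000) async -> Int {
    guard n >= 1, interval >= 1 else { return 0 }
    var total = 0
    for i in 1...n {
        total += i * i
        if i % interval == 0 {
            // Explicit suspension point: other tasks may run on this thread now.
            await Task.yield()
        }
    }
    return total
}
```

Without the `await Task.yield()` the loop would hold its thread until it finished, which is the scalability concern the quoted passage describes. Nothing preempts the task; it has to offer the suspension itself.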

The quote from your last post describes basically just that, read "cannot interleave" as "cannot preempt and needs the task to cooperatively suspend in an explicit way to avoid blocking".

I understand, I had a similar knee-jerk reaction when I stumbled upon promises and generators, and later async/await when they were introduced to JavaScript, but it all clicks together after investigating how it all works under the hood. I recommend checking out the implementation details and experience of developers of other languages (not only JavaScript, but Python, C#, and Rust) that went through a similar transition, at least before making strong claims of how it's going to work out for Swift.