Async/await status?

Oof :sweat_smile: I could write ten pages on that question easily.

Since you asked what I think personally: I think green threads are generally a bad idea. They are often a portability play in language runtimes/VMs, as in early Java (which long ago moved away from green threads), and they make ports an order of magnitude harder as well.

Async/await is easier to implement. IMHO, Golang’s use of green thread techniques came from a feature requirement on how parallelism is represented to the developer.

To the main question - I think emulating coroutines using native threads can provide "acceptable" performance for a lot of applications, but it will hardly compare to a mechanism handled by the compiler.

It is hard to believe that, given how many years it took C++, Rust, etc. to do coroutines. Looking at Gor Nishanov's C++ presentations on coroutines gives the impression it was anything but easy to implement.

Did anyone here try to use C/C++ coroutines from Swift? E.g. wrap them inside an Obj-C wrapper class and call that from Swift; would that fly?

1 Like

Yes, I'll fully admit I took into account the existing work out there to build upon when formulating my opinion.

I have found this: GitHub - belozierov/SwiftCoroutine (Swift coroutines for iOS, macOS and Linux).

3 Likes

That one is "green threads"-based (setjmp/longjmp).

It would be interesting to see the (upcoming) C++ coroutines (AFAIK available now in Clang) used from within Swift... they are claimed to be "the best coroutines ever" and to be interoperable with pretty much everything.

An interesting article that favours the (green) thread approach (as in goroutines):
http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/

1 Like

I would like to see Swift adopt, and pioneer for mainstream languages, what is called "Structured Synchronous Reactive Programming" (SSRP) in Céu: http://www.ceu-lang.org/

As opposed to "Functional Reactive Programming" (FRP), with Combine as a good example, which helps to model reactive data flows via functional transformers, SSRP helps to model the control flows that should happen as a reaction to events, using structural language constructs. await is only one of several such constructs, alongside watching, every, parallel, and more.

SSRP is a synchronous reactive approach which means that the reaction to events is logically immediate. This helps to precisely specify the behavior even in the case of cancellation of parallel control flows.
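To make those constructs concrete, here is a short Céu-style pseudocode sketch. It is only illustrative: the construct names follow the Céu documentation, but the event names are invented and the exact syntax may differ.

```
// Structured reactive control flow: the two branches run "in parallel",
// and the whole block is aborted, logically immediately, when BUTTON
// fires -- cancellation is part of the structure, not a callback.
watching BUTTON do
    par do
        every 1s do
            emit LED_TOGGLE;        // periodic reaction to a timer event
        end
    with
        await NETWORK_READY;        // suspend until a single event occurs
        emit STATUS_CONNECTED;
    end
end
```

Because reactions are logically instantaneous, the point at which the parallel branches are cancelled is fully determined by the program structure.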

This synchronous reactive model of computation could be restricted to special Actors, with Actors themselves communicating in a more asynchronous fashion, as proposed by the "Globally Asynchronous, Locally Synchronous" (GALS) approach.

SSRP is especially helpful in the real-time control of hardware and other time-sensitive domains. These could be domains for Swift to conquer!

18 Likes

Just to follow on from this talk, here are a few more in-depth details about how it's implemented in Rust: https://www.youtube.com/watch?v=NNwK5ZPAJCk

3 Likes

It seems that Java is going that way with "virtual threads" in Project Loom. The article below is interesting and it makes me wonder: why have the async and await keywords at all?

https://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part1.html

5 Likes

These virtual threads look pretty neat:

Key Takeaways

  • A virtual thread is a Thread — in code, at runtime, in the debugger and in the profiler.
  • A virtual thread is not a wrapper around an OS thread, but a Java entity.
  • Creating a virtual thread is cheap — have millions, and don’t pool them!
  • Blocking a virtual thread is cheap — be synchronous!
  • No language changes are needed.
  • Pluggable schedulers offer the flexibility of asynchronous programming.

Sounds like a nice approach that could work well both for Swift on iOS (or any other client-side) and server-side.
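A minimal sketch of what those takeaways look like in code, assuming JDK 21+ (where virtual threads shipped as final; the Loom preview API discussed in the article differed in places):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        final int COUNT = 100_000; // far more than OS threads could sustain
        AtomicInteger done = new AtomicInteger();
        Thread[] threads = new Thread[COUNT];
        for (int i = 0; i < COUNT; i++) {
            // Each virtual thread blocks briefly; parking is cheap because
            // only the virtual thread is suspended, not its carrier OS thread.
            threads[i] = Thread.ofVirtual().start(() -> {
                try { Thread.sleep(1); } catch (InterruptedException ignored) { }
                done.incrementAndGet();
            });
        }
        for (Thread t : threads) t.join(); // a virtual thread is a Thread
        System.out.println(done.get());    // prints 100000
    }
}
```

Note that the code is written in the plain blocking style the takeaways advocate; there is no callback or async keyword anywhere.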

1 Like

At first read through, it does have a strong DispatchWorkItem on concurrent queues vibe I can’t shake. Hmm, maybe I’ll need to think about it some more. :thinking:

1 Like

I'm confused.

You need to wait for something to happen without wasting precious resources? Forget about callbacks or reactive stream chaining — just block.

All that virtual threads do is add another layer of virtualisation on top of the OS's. I don't understand how that eliminates the need for asynchronous programming.

1 Like

Excuse the plug, but we have had Rust's futures in Swift for a while now :)

That's a great, well-written article, but it's clear that they haven't actually settled on an implementation. Their prototype includes at least two different designs, probably a function-global frame allocation vs. a narrower continuation allocation. Whatever they do, it's going to have serious trade-offs that I haven't seen any balanced evaluations of yet — which is fair, it's still fairly early days.

You cannot have truly lightweight threads while still managing stacks in a traditional way. Traditional stacks require a substantial reservation of virtual address space plus at least one page of actual memory; even with 4KB pages (and ARM64 uses 16KB), that level of overhead means you'll struggle with hundreds of thousands of threads, much less millions. Instead, local state must either be allocated separately from the traditional stack (which means a lot of extra allocator traffic) or be migratable away from it (which makes thread-switching quite expensive, and so runs counter to the overall goals of lightweight threads). Since the JVM already has a great allocator and GC, I assume they're doing the former, but that's going to introduce a lot of new GC pressure, which is not something I'd think most heavy JVM users will be fans of.
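To put rough numbers on the reservation overhead described above, here is a back-of-the-envelope sketch; the one-million thread count and the 16 KB page size are taken from the paragraph itself:

```java
public class StackOverheadEstimate {
    public static void main(String[] args) {
        long threads = 1_000_000L;   // "millions" of lightweight threads
        long pageBytes = 16L * 1024; // one committed page per stack (ARM64 page size)
        long minResidentGiB = threads * pageBytes / (1024L * 1024 * 1024);
        // Just one resident page per traditional stack already costs ~15 GiB
        // of real memory, before counting the much larger virtual address
        // space reservation each stack also needs.
        System.out.println(minResidentGiB + " GiB");
    }
}
```

Even with 4 KB pages the same arithmetic gives roughly 3.8 GiB, which is why contiguous stacks and truly lightweight threads don't mix.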

If you don't have "colored" async functions, you have to do that for every function that isn't completely trivial. That allocated context then has to be threaded through calls and returns the same way the traditional stack pointer is. Since Swift doesn't already do that for ordinary functions, and we've declared our ABI stable on Apple platforms, this really just isn't an option, even if there were no other downsides.

16 Likes

There was a paper recently published by @kavon and John Reppy with a more in-depth measurement of the performance of different stack and continuation implementations, which might be interesting: https://kavon.farvard.in/papers/pldi20-stacks.pdf

11 Likes

Loom is indeed quite a hot topic in JVM land, and has been for quite some time... It's shaping up quite well, but the "weight" of (virtual) threads remains a bit unclear, as John alludes to.

There's an interesting circle the JVM has walked here. Way back then it had green threads (eons ago... in 1.1), however those were M:1 mapped, so all java.lang.Thread instances would share the same underlying OS thread. Obviously, this was limiting for real parallelism (and multi-core CPUs), so Java made the switch to mapping its Thread 1:1 to OS threads. That has the nice benefit of mapping "directly" onto calling native code etc. It's quite heavy though: 500–1000 KB per Thread used to be the rough estimate, though AFAIR things have improved in JDK 11, which I've not used in anger. But in any case, Loom's fibers/virtual threads are definitely going to be "light", at least as compared to present-day j.l.Thread :thinking:

Needless to say, relying on today's Thread directly is too heavy for reactive frameworks, so runtimes like Netty, Akka, Reactor, and Reactive Streams implementations (anything, really) end up scheduling many fine-grained tasks in user land onto those heavy threads, i.e. scheduling M:N (M entities onto N real threads). All reactive or async libraries effectively do this today.
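The M:N idea can be sketched with a plain fixed-size pool. This is a toy illustration of the scheduling shape, not any particular framework's scheduler:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class MNSchedulingDemo {
    public static void main(String[] args) throws InterruptedException {
        final int N = 4;       // N "real" (heavy) OS-backed threads
        final int M = 10_000;  // M fine-grained user-land tasks
        ExecutorService pool = Executors.newFixedThreadPool(N);
        CountDownLatch latch = new CountDownLatch(M);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < M; i++) {
            // Each submitted task is a small unit of work; the pool
            // multiplexes all M of them onto its N threads.
            pool.submit(() -> {
                completed.incrementAndGet();
                latch.countDown();
            });
        }
        latch.await();
        pool.shutdown();
        System.out.println(completed.get() + " tasks on " + N + " threads");
    }
}
```

Every reactive library's event loop or dispatcher is, at its core, a more sophisticated version of this loop, which is exactly the machinery Loom pushes down into the runtime.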

Loom is interesting since it flips the mapping around again: what libraries used to have to do because Thread is too heavy, Loom now does itself (basically it will do exactly the same thing in terms of scheduling as those reactive libs do today), mapping M "virtual" threads onto N real threads. So... it's going back to green threading, but with M:N (and not M:1 like it historically had).

I remain a bit iffy about the "weight" question for Loom... perhaps they'll figure it out somehow with VM trickery though. The simple thing about stream or actor runtimes on the JVM is that they simply "give up the thread", and when they're scheduled again they start afresh; there's no need to keep around any stack for those models to work well. So I wonder how stackful (threads) will lend themselves to such lighter execution models. Yet another library scheduler on top of virtual threads sounds a bit silly -- two layers of user-land scheduling seem a bit weird -- yet leaving it as a plain "lib concepts : directly virtual thread" mapping will be interesting to see, if it really is light enough... (One could argue the shape of such APIs will change dramatically though :thinking:)

// Thanks for the paper @Joe_Groff, that's a topic I'd love to learn more about, will dig into it!

3 Likes

Thanks for sharing! If anyone has questions about the paper I'm happy to answer.

Depending on how these layers are set up, they could make sense. Manticore features "nested schedulers", which consist of a stack of schedulers that is user-controllable: http://manticore.cs.uchicago.edu/papers/icfp08-sched.pdf

That scheduler paper is based on ideas from this (wonderful) paper: https://www.eecis.udel.edu/~cavazos/cisc879-spring2008/papers/cps-threads.pdf

9 Likes

Thanks for the links, this sounds quite interesting -- added to my queue :bookmark:

1 Like

I see they are using heap allocated stack chunks, and they have a Continuation class that they say might become public API in later releases.

That makes sense. But does Swift effectively do the same thing in async/await code, since it has to heap-allocate the closure (the continuation) with the captured local values?

I wonder if the two styles boil down to the same thing under the hood, after static analysis and optimization?

I see the point about having to pass that context through all functions. I wonder how the Java devs are solving that. Maybe they JIT-compile two versions of each function.

Does Swift do something like that with generics? I.e., you write foo<T> and get separate runtime copies for foo<Int> and foo<Float>?

It does seem like an explicit await keyword in front of a function call fits Swift's style, just as we have an explicit try keyword (in contrast to Java), which also marks a modified calling convention.

Yes, it will have to use some non-contiguous implementation. It can be isolated to just async functions, though, rather than impacting everything. That is why all the languages that use colored async functions do so.

That would certainly be possible if you wanted to make lightweight threads that were pinned to an OS thread take advantage of the contiguous stack. It would be a lot of extra complexity for the JIT, though.

We can, but we don't have to.

1 Like