Swift Concurrency threading model questions

hassila · June 11, 2021, 7:54am

First off, I would like to congratulate everyone involved in bringing all the concurrency features to the table - it looks like an awesome first step. I think the model where we trade context switches and potential thread explosions (and blocking) for heap allocator pressure is great - will be very interesting to see how it performs.

I did view the behind-the-scenes talk and have tried to keep up with the discussions in various threads, but there are a few questions I have:

It is still unclear to me how the default executor keeps its pool of threads around, and how threads are created and woken up to start working. Is this fundamentally just a pool of pthreads which are woken up with the usual mechanisms, or something else? I understand the non-blocking (possibly out-of-order) continuation execution, but how are things bootstrapped at a lower level? Where can I find out more about that (source pointers are fine :-) )? I am curious because there was a comment about the performance characteristics being unknown on e.g. Linux, and I’d like to understand more and see whether some work needs to be done there (possibly requiring support for custom executors).
The same goes for thread shutdown when no more work is around (so basically, how are the threads managed in the default executor?).

I care quite a bit about the initial latency of getting more threads running, not only about efficient execution under load, so I want to understand those characteristics better. (I believe the new model has the potential to work very nicely under load.)

Overall, thread-per-core and non-blocking, non-context-switching by default is just awesome.

John_McCall · June 11, 2021, 6:01pm

The pool itself is implemented in libdispatch, and Swift's integration with the pool is mostly just calling a function to submit jobs. The libdispatch maintainers don't do most of their work in open source. In the past, they have periodically updated a public repository with the current state of the project, but I don't believe they've updated that yet with this work, and I shouldn't speculate about if/when they will. Unfortunately, that means I can't give you a code pointer.

The best public documentation on it right now is probably the "Swift concurrency: Behind the scenes" WWDC session by Rokhini Prabhu. I believe the threads are just ordinary pthreads; I don't know the answers to most of the rest of your questions.

The technical infrastructure is in place for custom executors, and we're currently using that infrastructure to integrate the main actor. The design hasn't yet gone through evolution. I don't think this affects the non-Darwin implementation of the default global executor, though. I believe the performance concern there is just that the Linux dispatch implementation is a relatively naive and unoptimized thread pool; the way we use it should still be width-limited, which is really the most important property we need.
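To make "width-limited" a little more concrete, here is a minimal sketch that observes how many child tasks are actually running at the same moment on the default executor. The names `ConcurrencyGauge`, `busyWork`, and `measurePeakParallelism` are made up for illustration and are not part of the runtime or of libdispatch. It only records an observed peak; it does not prove any bound, and the number will vary by platform and machine.

```swift
import Foundation

// Hypothetical helper (not a runtime API): records how many child tasks
// are inside the measured region at once, and the highest value seen.
final class ConcurrencyGauge: @unchecked Sendable {
    private let lock = NSLock()
    private var running = 0
    private var peak = 0

    func enter() {
        lock.lock()
        defer { lock.unlock() }
        running += 1
        peak = max(peak, running)
    }

    func leave() {
        lock.lock()
        defer { lock.unlock() }
        running -= 1
    }

    var observedPeak: Int {
        lock.lock()
        defer { lock.unlock() }
        return peak
    }
}

// CPU-bound filler with no suspension points, so the task keeps its thread.
func busyWork(iterations: Int) -> UInt64 {
    var accumulator: UInt64 = 0
    for i in 0..<max(0, iterations) {
        accumulator = accumulator &+ UInt64(i)
    }
    return accumulator
}

// Spawns many child tasks and reports the peak number seen running together.
func measurePeakParallelism(taskCount: Int, spins: Int) async -> Int {
    let gauge = ConcurrencyGauge()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<taskCount {
            group.addTask {
                gauge.enter()
                _ = busyWork(iterations: spins)
                gauge.leave()
            }
        }
    }
    return gauge.observedPeak
}
```

Calling `await measurePeakParallelism(taskCount: 1_000, spins: 1_000_000)` and comparing the result with `ProcessInfo.processInfo.activeProcessorCount` gives a rough feel for the pool width on a given machine. If the pool is width-limited as described, the peak should stay far below the number of tasks submitted.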

David_Smith · June 12, 2021, 12:28am

Dispatch workqueue threads are not quite ordinary pthreads; for example a decent amount of pthread API isn’t expected to work on them. I would expect the cooperative pool threads to be similar to other dispatch threads though.

hassila · June 12, 2021, 8:29am

Thanks @John_McCall and @David_Smith,

That actually helps quite a bit in understanding - in a previous life I spent some time helping port libdispatch to Solaris and also helped out with libpthread_workqueue, which implemented the non-portable workqueue interface, so I'm somewhat familiar with the implementation (as it looked then at least...). Things worked decently well on Linux/Solaris with libdispatch when used with libpthread_workqueue, even if it never performed the full system thread/core management that the kernel interface on Darwin does.

I haven't looked at the current libdispatch yet (I assume it does not use libkqueue/libpwq), but I can start poking there. If it just uses a simple pthread work pool, as the non-pwq dispatch did, there is possibly room for improvement.

We don't really have much need to use pthread APIs on the threads, but we'd like to be able to pin threads to cores and to optionally support low-latency thread wakeup.

So, way back then (the context being Linux, server-side, low-latency, no power constraints, full control of the machines, and a lot of cores set up with isolated CPU sets...), we ended up supporting optional busy-waiting threads in the portable pthread workqueue implementation to cut latency significantly. This proved to be a very significant improvement both in labs and in production, with external high-resolution measurements of the system's response times using Endace DAG cards.

I think it is very reasonable that the Swift default executor implementation leverages the existing libdispatch. After running some benchmarks, I'll try to look at whether there's room to provide PRs to improve the libdispatch / Linux implementation if needed (assuming patches to libdispatch are practical if it's fundamentally timed snapshots of an internal dev tree...), or whether it'd be easier to just build a new executor in the future when/if custom executors are exposed.

Once again, major kudos for the new concurrency infrastructure, very happy with the direction.

Genaro-Chris · July 11, 2022, 6:45pm

Sorry if this is a little off topic, but which type of scheduling does Swift's new concurrency model use: pre-emptive or co-operative?

John_McCall · July 11, 2022, 7:21pm

Those terms originate from an era when computing was predominantly single-core. They still have some meaning, but no system with true parallel execution can be understood as being either preemptive or cooperative. Things can actually happen simultaneously now, and to handle that, programmers must embrace practices that work in the modern world.

The technical answer is that Swift tasks are scheduled cooperatively onto a lower-level model of platform threads (which is a complex topic on some platforms, but Swift doesn't care and just assumes they exist) which the system scheduler is free to schedule as it pleases. By "cooperatively", I mean that Swift tasks occupy a thread until they reach a suspension point. Whether that thread is preempted or runs in parallel with other threads is the thread system's business.

That means that, for example, if task A is spinning on a spin lock and task B will release that lock, then:

if B is not currently scheduled onto a thread, there is no guarantee of progress; whereas
if B is currently scheduled onto a thread, whether there is a guarantee of progress is up to the thread scheduler.

But Swift's style of cooperative scheduling does mean that you can know whether a task is scheduled onto a thread: for example, if a function contains code to acquire and release a lock, and it does not suspend between those points, then it is guaranteed that any task running that function which has acquired the lock and not yet released it has been assigned a thread (and specifically the same thread that it was assigned when it acquired the lock, which is usually a precondition for correct lock use).
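A minimal sketch of that pattern, assuming Foundation's `NSLock` is used as the lock. `LockedCounter` and `countConcurrently` are hypothetical names chosen for illustration. The lock is acquired and released inside a synchronous method, so no suspension point can fall between the two calls; the only suspension point in each task's loop (`await Task.yield()`) sits outside the critical section.

```swift
import Foundation

// Hypothetical type for illustration: a counter guarded by a lock.
final class LockedCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var value = 0

    // Synchronous on purpose: an `await` cannot appear in this body, so a
    // task that holds the lock stays on the thread it acquired the lock on.
    func increment() {
        lock.lock()
        defer { lock.unlock() }
        value += 1
    }

    var current: Int {
        lock.lock()
        defer { lock.unlock() }
        return value
    }
}

// Runs several child tasks that all increment the same counter.
func countConcurrently(tasks: Int, incrementsPerTask: Int) async -> Int {
    let counter = LockedCounter()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<tasks {
            group.addTask {
                for _ in 0..<incrementsPerTask {
                    counter.increment()
                    // Suspension point, deliberately placed after the lock
                    // has been released: the task may give up its thread here.
                    await Task.yield()
                }
            }
        }
    }
    // The group has finished all child tasks by the time we get here.
    return counter.current
}
```

For example, `await countConcurrently(tasks: 8, incrementsPerTask: 50)` should return 400. By contrast, holding the lock across an `await` would give up the guarantee described above, because the task could be suspended while holding the lock and later resume on a different thread.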

Genaro-Chris · July 11, 2022, 10:00pm

Thanks so much