Actors 101

No, I'm not confusing anything. A detached task is a task that runs in parallel, whereas a non-isolated function may or may not run in parallel depending on how it's called. These two concepts are not competing; they are different things for different purposes.

Personally, I wouldn't subscribe too strongly to the "detached tasks should not be used" idea, but indeed, with my current understanding, I'd only use them in a very specific circumstance (sketched after the list below):

  • I need to spawn an unstructured task in the first place
  • This happens lexically in an actor's method
  • The logic touches that actor's state very little, if at all (or I don't have any transactionality concerns)
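
For example, a minimal sketch of that circumstance (the Cache actor and Logger here are made-up names, not anything from the discussion):

import Foundation

actor Cache {
  private var entries: [String: Data] = [:]

  func store(_ data: Data, for key: String) {
    entries[key] = data
    // Fire-and-forget work that never touches `entries`, so inheriting
    // the actor's isolation would only add contention on the actor.
    Task.detached {
      Logger.shared.log("stored \(key)")
    }
  }
}

// Hypothetical logger used above.
final class Logger: Sendable {
  static let shared = Logger()
  func log(_ message: String) { print(message) }
}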

I think it's worth extending the thought: not only is Swift Concurrency very new, with people still exploring how to work with actors, but the concerns of task cancellation and priority are even less explored (as they aren't as critical to program correctness), so there's even less good common knowledge on how and when to apply those.

3 Likes

Yes, and the compiler will make sure you don't touch your current actor's state, or that you only do so in a safe manner; hence my suggestion to use detached tasks wherever the compiler allows you to, which won't be many situations anyway.

This situation, however, is exactly what a nonisolated function + a regular non-detached Task can accommodate. The detached part clears out task-local state. And because you do not need to do that, I think you actually don't want to detach.

Sorry, I wasn't clear. I'm making a distinction between a Task and a Task.detached, not a non-isolated function used on its own.

At that point I think that Task.detached just provides clearer semantics: it's easier to reason about "I'm hopping off the actor for good" when it's explicitly spelled out than with a regular Task.init and the need to trace the isolation of the called functions.

Got it, but still, the crucial difference between the two is that an ordinary task is executed serially with respect to your current actor, whereas a detached task is executed in parallel. You can "feel" the difference by converting some task to detached and seeing what new warnings or errors the compiler gives.
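
A rough illustration of that difference, with a toy Counter actor:

actor Counter {
  var value = 0

  func demo() {
    Task {
      // Inherits the actor's isolation: runs serially with respect to
      // Counter, so touching its state directly is allowed.
      value += 1
    }
    Task.detached {
      // Runs in parallel with the actor; direct access to `value` is a
      // compile-time error here, so you must hop back with `await`.
      await self.increment()
    }
  }

  func increment() { value += 1 }
}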

Yes this is true! And I think you are right - this is one of the reasons that you see Task.detached show up so much. However, I do think it will hinder developing a better sense of how to control your isolation explicitly via types/signatures. And that can be very problematic.

So, yeah, you certainly can use Task.detached. But I still think a Task + a non-isolated function is a better approach. And, with a little practice, it feels equally explicit to me. And then, the whole thing about Task.detached mutating the task-local state feels extremely strange to use in its place.

1 Like

This only matters for synchronous functions though. And I don't want to trivialize that; it is an important and common thing! But a non-isolated async function + a Task will behave identically.
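
A quick sketch of that pattern (the ImageStore actor and its fetch function are hypothetical):

import Foundation

actor ImageStore {
  private var images: [URL: Data] = [:]

  func refresh(from url: URL) {
    Task {
      // The Task still inherits ImageStore's isolation, but the
      // nonisolated async function below runs off the actor...
      let data = await fetch(url)
      // ...and we hop back onto the actor here to update state.
      images[url] = data
    }
  }

  nonisolated func fetch(_ url: URL) async -> Data {
    // Since SE-0338, nonisolated async functions run on the global
    // concurrent executor, never on the calling actor.
    (try? await URLSession.shared.data(from: url))?.0 ?? Data()
  }
}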

Oh, and while you cannot do this today, eventually I hope that this proposal gets accepted. Then we'll be able to write this to explicitly remove isolation at the definition site of a closure:

Task { nonisolated in
  print("nonisolated")
}

2 Likes

I agree with mattie, and this is one of quite a number of reasons I encourage people to avoid explicit Tasks where possible[1] and use async functions instead. Accidentally dropping your priority boost and vouchers from your caller on the floor just because you wanted to avoid actor inheritance is not what anyone thinks they're doing, or what they wanted.

Fun fact I'm curious if people know: libdispatch ALSO has a "detached" concept, with the same semantics. So far whenever I've brought this up, everyone arguing for using detached a lot in Swift had never heard of DISPATCH_BLOCK_DETACHED.
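
For reference, the Swift overlay spells it like this (the .detached flag corresponds to DISPATCH_BLOCK_DETACHED):

import Dispatch

let queue = DispatchQueue(label: "example")
// .detached strips the submitting context's QoS, activity, and voucher
// state from the block, much as Task.detached drops task-local state.
let work = DispatchWorkItem(flags: .detached) {
  print("running without inherited QoS or vouchers")
}
queue.async(execute: work)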


  1. obviously sometimes a Task is the right thing to do, I just think it's about 75% less often than I see from my semi-random sample of code in the wild ↩︎

7 Likes

I'm not sure if I'm even correct, but I have a way I've been thinking about this that I find helpful. Maybe this will help, or maybe someone can correct me; either would be great :sweat_smile:

When Swift introduced Optional, this was great, but the real big benefit is that the existence of Optional also created the concept of non-optional values, which is where the real good stuff is. And eventually we learned to only reach for Optional where it's truly needed.

Actors and isolation are (potentially) great, but the real good stuff is that they create the concept of nonisolated code, because nonisolated code is provably safe for the system to execute concurrently if it so chooses.

I could be wrong, but I think what this means is that if new platforms or devices appear with larger concurrent pools, audited code gets faster automatically. In a provably safe way.

So wrt the original topic, I'm in the "only reach for isolation and actors when needed" camp for now. Mostly thanks to the understanding I've gained from @mattie's great recent work. (And hopefully this post is not incorrect; if so that's all on me :joy:)

4 Likes

"island of serialization in a sea of concurrency" :slight_smile:

4 Likes

Tasks are not the only part of Swift Concurrency; you can create a custom executor for long-running jobs.
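
For instance, here's a minimal sketch using SE-0392's custom actor executors (Swift 5.9+; the names are made up):

import Dispatch

// A serial executor backed by its own dispatch queue, so long-running
// jobs never occupy a thread from the shared cooperative pool.
final class DedicatedExecutor: SerialExecutor {
  static let shared = DedicatedExecutor()
  private let queue = DispatchQueue(label: "long-running-jobs")

  func enqueue(_ job: UnownedJob) {
    queue.async {
      job.runSynchronously(on: self.asUnownedSerialExecutor())
    }
  }

  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}

actor LongRunningWorker {
  // Route all of this actor's work onto the dedicated executor.
  nonisolated var unownedExecutor: UnownedSerialExecutor {
    DedicatedExecutor.shared.asUnownedSerialExecutor()
  }
}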

2 Likes

In many cases though, it's preferable to not do this, and instead to yield periodically from the long-running task. Having more threads than CPU cores uses both more memory[1] and more processing power[2], and custom executors don't fully support priority donation.
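
A small sketch of that yielding pattern (the WorkItem type and the interval of 100 are arbitrary placeholders):

// Hypothetical CPU-bound work item, for illustration only.
struct WorkItem {
  func process() { /* CPU-bound work */ }
}

func processAll(_ items: [WorkItem]) async {
  for (index, item) in items.enumerated() {
    item.process()
    // Suspend periodically so other pending work can use this pool
    // thread between chunks.
    if index.isMultiple(of: 100) {
      await Task.yield()
    }
  }
}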

Even if you can't yield, it's not always the case that occupying a thread in the pool with long-running work is the wrong choice: it will only cause issues[3] if all of the following are true:

  • there is other pending non-MainActor work that needs to run
  • all threads in the pool are occupied
  • the work that needs to run is more important than continuing to make progress on the long running work.

Long-running non-yielding work is definitely more challenging to do safely though.

People are very fond of coming up with simple rules that can be applied mechanistically without thinking about the specifics of the situation, but it's simply not possible to do that for all aspects of a topic as complex as concurrency and asynchrony. Cooperative and preemptive multitasking offer different tradeoffs, and both are situationally useful.


  1. mostly due to having to allocate space for the stack, but also memory in the kernel for the scheduler to track the thread ↩︎

  2. due both to the time spent creating and destroying the thread, and to context switches as it runs ↩︎

  3. Leaving aside the special case of synchronously waiting for asynchronous work that itself needs a pool thread, which is simply incorrect ↩︎

7 Likes

Thank you, @David_Smith.

After reading your post, I felt that I should learn more about threading architectures.

Quoted from Thread Management

Thread Management

Each process (application) in OS X or iOS is made up of one or more threads, each of which represents a single path of execution through the application's code. Every application starts with a single thread, which runs the application's main function. Applications can spawn additional threads, each of which executes the code of a specific function.

When an application spawns a new thread, that thread becomes an independent entity inside of the application's process space. Each thread has its own execution stack and is scheduled for runtime separately by the kernel. A thread can communicate with other threads and other processes, perform I/O operations, and do anything else you might need it to do. Because they are inside the same process space, however, all threads in a single application share the same virtual memory space and have the same access rights as the process itself.

This chapter provides an overview of the thread technologies available in OS X and iOS along with examples of how to use those technologies in your applications.

Note: For a historical look at the threading architecture of Mac OS, and for additional background information on threads, see Technical Note TN2028, "Threading Architectures".

I can't find that Technical Note anywhere. Do you know where I can find it?

Also, Thread Management was last updated on 2014-07-15. Is it still relevant?

PS: Because I am an old dog, I prefer reading good technical documentation. :slight_smile:

1 Like

Technical Note TN2028: Threading Architectures (though I can't say how useful it will be; that's quite an old document)

I share your preference for reading over watching, but, for better or worse, two of the best recent resources you'll find are WWDC videos: "Swift Concurrency: Behind The Scenes" from WWDC 2021 and "Modernizing Grand Central Dispatch Usage" from WWDC 2017.

Really though, you may be happiest just finding a good operating systems textbook and starting there. The details vary a lot from system to system but the core concepts should be generally similar.

3 Likes

Yeah, completely agree. Tbh, last time I was thinking hard about when you'd actually need such a case, and it's hard to justify, as the regular machinery will just work in 99.99% of cases. My response was mostly about what you can actually do with Swift concurrency.

Anyway, always good to learn something new, thx for the input :slight_smile:

P.S. Maybe a bit of a stupid idea, but I wonder if we could theoretically not create a thread as such, but rather occupy one CPU thread for long-running jobs? :thinking:

By the way, any suggestions on textbooks?

1 Like

An approach I've used for this is making a singleton actor for them and multiplexing everything onto that. That way N-1 pool threads are available for general use. This has other downsides though (e.g. if you actually block that thread, all your other long-running stuff will just hang).
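
A rough sketch of that shape, written as a global actor (names made up):

// Funnel all long-running jobs through one global actor, so at most
// one cooperative-pool thread is ever occupied by them.
@globalActor
actor LongRunningJobs {
  static let shared = LongRunningJobs()
}

@LongRunningJobs
func crunchNumbers() {
  // Runs serially with every other @LongRunningJobs function, leaving
  // the remaining pool threads free for general use.
}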

I took a very roundabout hands-on path to learning about this stuff, so no idea, sorry!

3 Likes

I replied here with why I think this is an artificial and ultimately meaningless distinction.

What if it turns out those two lines in the top fork are actually switching on and off very rapidly, but so rapidly you can't see it (technically they are, because computer displays use subpixels for red, green and blue that can't be literally on top of each other)? Would it matter if the tiny slivers of red happened to line up vertically, instead of being staggered at a resolution so small you'd never be able to see them?

"Parallel" are "concurrent" are literally synonyms in English (if you don't believe me, Google their definitions). If the software industry has introduced a distinction between the two, it is highly suspicious because apparently there wasn't a better word to identify the two supposedly different concepts. Maybe that's because there is no distinction after all.

When and how could it matter for code that it was created and tested on single-core hardware, but now it's being prepared for use on multi-core hardware? The answer is "it doesn't"... except for one small part: the synchronization primitives that you use to ensure forked threads meet up again at an agreed-upon point now have to be implemented at least partially in hardware instead of being purely software constructs. On a single-core machine, a Darwin lock would just be a boolean flag the OS kernel stores on a thread context, and when its scheduling loop is picking the next thread to run for a time slice, it will skip any that are waiting on locks they don't own. On multiple cores, the parallelism is no longer implemented (just) in the OS kernel but below it, in the hardware. Therefore the hardware, too, needs to protect its "shared state" with locks, and those have to be supplied by the CPU itself in the form of atomic instructions.

If you're not an OS kernel developer, this should be largely irrelevant to you... except if you're writing performance-critical code and want to make sure your locks are implemented as atomics instead of mutexes, because one is much faster than the other (but more limited: it can only synchronize a single memory access, not an arbitrary block of instructions).

If that's not your concern, and you've noticed your multithreaded code doesn't work once you start running it on multi-core hardware, all that's happened is that your code has race-condition or re-entrancy bugs (it relied on a relative order of execution where none was guaranteed), and the probability of encountering those bugs jumped from 0.001% on a single-core machine (not 0%) to 1%, and you finally won that lottery.

That bug didn't become a bug by supporting multi-core hardware. It was always there, it just had a low enough reproducibility rate you never noticed.

My point here is that you should stop thinking about hardware. That's not what you're coding to (how hardware actually executes your code is insanely complicated and not at all what we probably picture: it's slicing it up, reordering stuff, staggering it across superscalar cores with multiple execution units, executing ahead with branch prediction, doing all sorts of super complex caching and guessing where you're going to read from memory next, etc.). You're coding to a virtual machine that presents a logical execution environment for your code. When you introduce a Thread or a Task, you are introducing parallelism/concurrency into this logical execution. That is all that matters. Once you introduce concurrency, you have asked for all guarantees of in-order execution (between the instructions in two different threads/tasks) to be removed.

If you're trying to rely on a difference in execution between single and multi-core environments, you're asking for race conditions to accidentally never be encountered.

1 Like

We are talking about terminology in a certain domain – computer science – not linguistic definitions. We can refer to the wiki on the matter, which states:

Note that in computer science, parallelism and concurrency are two different things: a parallel program uses multiple CPU cores, each core performing a task independently.

Sorry, but you are the only one talking at the hardware scale – you brought that in, while nobody else mentioned it in the discussion at all. The difference between the CPU-instruction level and the thread/task level is tremendous from the high-level-language perspective. We don't operate at that level, and exactly how the subatomic particles do the job is just irrelevant to the topic.

With parallelism you have distinct resources dedicated to doing the job; concurrency itself doesn't require this. It is like saying that you show two movies on one projector in parallel by alternating frames back and forth instead of getting a second projector. That's the level of distinction we operate on, and at this level concurrency and parallelism are different things in the same way as showing two movies with one projector or two.

2 Likes