[Concurrency] Actors & actor isolation

wear_here · October 30, 2020, 9:28pm

I see how this is a separate axis, yes, and how actor state is restricted even beyond what private would imply. From the link you share:

synchronous functions may only be invoked by the specific actor instance itself, and not even by any other instance of the same actor class.

(my emphasis)

What I am wondering is if these axes are orthogonal: is it at all meaningful that mutableArray is internal here? I'm not sure that access control modifiers really matter at all for actor state. Perhaps they would matter if the state was annotated @actorIndependent? I wonder if the language or the developer tools might clarify this at all.
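
A minimal sketch of the two axes, assuming the `actor` spelling that later shipped in Swift 5.5 rather than the `actor class` syntax pitched in this thread. The `Counter` type and its members are hypothetical. The property is `internal`, so access control lets the rest of the module name it, yet isolation still forces any other instance to go through `await`:

```swift
// Hypothetical actor for illustration; not taken from the proposal.
actor Counter {
    // Access control: internal (the default), so visible module-wide.
    // Isolation: still actor-isolated, so only `self` may touch it synchronously.
    var value = 0

    func increment() {
        value += 1            // same instance: synchronous access is fine
    }

    func absorb(_ other: Counter) async {
        // `other` is a different instance of the same actor type, so its
        // state can only be read asynchronously, even though it is internal.
        let theirs = await other.value
        value += theirs
    }
}
```

Under these assumptions the access modifier decides who may name `value`, while isolation decides how it may be reached.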

Lantua · October 30, 2020, 9:29pm

Maybe we can jump over to the discussion thread? I don't think this fits the roadmap thread.

(This was moved from the roadmap thread upon request; the question was answered a few posts above.)

Karl · October 30, 2020, 10:00pm

Something about the name actor class feels weird to me. Will there ever be actor structs or actor enums? If not, and actors will always be classes, why mention it at every definition?

I mean, the proposal says:

Actor classes behave like classes in most respects: they can inherit (from other actor classes), have methods, properties, and subscripts. They can be extended and conform to protocols, be generic, and be used with generics.

But then goes on to show a bunch of examples which would be valid in a class but not in an actor class. It's worth noting that methods, properties and subscripts are in no way unique to classes, nor are extensions, protocols or generics.

The only thing actors can do which is in any way class-like is support inheritance. To be honest, inheritance is often a massive pain (especially when considering things like subclasses and Equatable), so most of my actors will be final. I wouldn't even mind if they didn't support inheritance.

I see actors as a fundamentally new thing. They enforce data isolation in a way that is wholly unique in the language, and if anything is more like the grouped-exclusivity rules of value types than the very liberal rules which apply to stored properties of classes.

Douglas_Gregor · October 30, 2020, 10:28pm

They're also reference types, and therefore have reference identity. They satisfy AnyObject constraints. We call them actor class because they are a restricted form of class, and nearly every intuition one has about classes also applies for actor classes.

Doug
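
A small sketch of the reference-identity point, assuming the `actor` syntax that later shipped in Swift 5.5. `Mailbox` and `isSameInstance` are hypothetical names used only for illustration:

```swift
// Hypothetical actor used only for illustration.
actor Mailbox {
    private var messages: [String] = []

    func post(_ message: String) { messages.append(message) }
    func count() -> Int { messages.count }
}

// Generic code constrained to reference types accepts actor instances,
// and identity comparison (===) works as it does for classes.
func isSameInstance<T: AnyObject>(_ lhs: T, _ rhs: T) -> Bool {
    lhs === rhs
}
```

Comparing a `Mailbox` with an alias of itself yields `true`, while comparing it with a freshly created `Mailbox` yields `false`.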

Karl · October 30, 2020, 10:36pm

Doesn’t this pretty much just boil down to reference types with reference identity, though? The other things, like concurrent access to stored properties, don’t apply.

For me, I’d prefer a shorter syntax, since AIUI this is the fundamental unit of data synchronisation. Data within an actor is always synchronised with respect to the other data members, and if you want one piece to live on its own timeline you’d encapsulate it in its own actor.

Lantua · October 30, 2020, 10:50pm

I've been trying to figure this out. If I have an old custom executor, say DispatchQueue or some kind of RunLoop, and would like to wrap it as a (global) actor, how should I go about doing it? There doesn't seem to be a direct way to call an async function from inside a sync function, and the old executor would accept only sync ones. It feels like it would need to open up PartialAsyncTask somehow.

Jumhyn · October 30, 2020, 11:18pm

I think that's a totally reasonable interpretation (and undoubtedly how a not-insignificant portion of the Swift community uses the terms), but we should make sure we're being consistent in how we're using this terminology (i.e., fix TSPL if we want to change how these terms are expected to be used). Also, if "value type" and "type with value semantics" are synonymous, do we need a general term for "type defined by a struct/enum/tuple"?

IIRC I've been corrected myself on this usage by @dabrahams, who may have stronger feelings than I about the terminology here. In any case, I've opened a PR to reword the Escaping reference types section in terms of "semantics." If that doesn't feel as though it more precisely communicates the intended meaning, feel free to decline!

Douglas_Gregor · October 30, 2020, 11:20pm

PartialAsyncTask will have some kind of synchronous run() operation that should allow one to do this. For DispatchQueue, we will probably want to add some API to allow you to run an async operation on that particular queue. The details here will evolve as more of the pieces of the prototype implementation come together.

Doug
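
The API described above was still undecided at the time of this post, so the following is only a sketch of a related bridge that later shipped in Swift 5.5: exposing queue-protected, callback-based code to async callers through `withCheckedContinuation`. It does not show running async work on a queue, and `LegacyStore`, its queue label, and its methods are hypothetical:

```swift
import Dispatch

// Hypothetical legacy type whose state is protected by a serial queue.
// @unchecked Sendable is an assumption: the queue is what makes it safe.
final class LegacyStore: @unchecked Sendable {
    private let queue = DispatchQueue(label: "example.legacy-store")
    private var items: [String] = []

    // Existing callback-based API: the work runs on the private queue.
    func add(_ item: String, completion: @escaping @Sendable (Int) -> Void) {
        queue.async {
            self.items.append(item)
            completion(self.items.count)
        }
    }
}

extension LegacyStore {
    // Async wrapper: suspends the caller until the queue calls back.
    func add(_ item: String) async -> Int {
        await withCheckedContinuation { continuation in
            // The continuation must be resumed exactly once.
            self.add(item) { count in
                continuation.resume(returning: count)
            }
        }
    }
}
```

An async caller can then write `let count = await store.add("a")` while the mutation itself still happens on the original queue.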

Lantua · October 30, 2020, 11:33pm

It doesn't sound that much different from converting it to a first-class function (including the call-once restriction), which seems to be contrary to what @John_McCall said earlier (quote below). Or do you plan to have some fast path for the default execute implementation?

nonsensery · October 30, 2020, 11:34pm

Regarding this (code) comment, from the Actor Isolation section (near the end):

// Safe: this operation is the only one that has access to the actor's local
// state right now, and there have not been any suspension points between
// the place where we checked for sufficient funds and here.

Does this mean that, if one actor method call suspends (because it calls an async function), then other method calls on that same actor could run while the original is suspended? That is, could separate method calls on an actor be interleaved?

From the rest of the proposal, I would have expected that a given actor method call would run completely before any others were allowed to run.

Lantua · October 30, 2020, 11:38pm

Yes, it's in the Async Function pitch.

This design currently provides no way to prevent the current context from interleaving code while an asynchronous function is waiting for an operation in a different context. This omission is intentional: allowing for the prevention of interleaving is inherently prone to deadlock.

nonsensery · October 30, 2020, 11:48pm

Yeah, I saw that for async functions. It seemed like Actors were meant to provide a level of serialization above plain async function calls.

For example, this bit:

If we wanted to make a deposit to a given bank account account, we could make a call to a method deposit(amount:), and that call would be placed on the queue. The executor would pull tasks from the queue one-by-one … and would eventually process the deposit.

It's not clear whether "task" above means the Task representing the entire method call, or the PartialAsyncTasks that make up its actual execution. To me, it seems to say that actor method calls would not be interleaved.

Edit: Seeing that the only method in the Actor protocol is enqueue(partialTask: PartialAsyncTask), it probably means the partial tasks.

Douglas_Gregor · October 30, 2020, 11:59pm

nonsensery:

Regarding this (code) comment, from the Actor Isolation section (near the end):

// Safe: this operation is the only one that has access to the actor's local
// state right now, and there have not been any suspension points between
// the place where we checked for sufficient funds and here.

Does this mean that, if one actor method call suspends (because it calls an async function), then other method calls on that same actor could run while the original is suspended? That is, could separate method calls on an actor be interleaved?

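Yes.

From the rest of the proposal, I would have expected that a given actor method call would run completely before any others were allowed to run.

That's not correct. Each async call is potentially a suspension point where other code could be interleaved on the actor. This prevents deadlocks. It's also why we consider it important to mark these in the code with await.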

(I think we need to call this out specifically in the proposal)

Doug
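
A minimal sketch of what interleaving at a suspension point means in practice, assuming the `actor` syntax that later shipped in Swift 5.5. `BankAccount`, `Failure`, and `authorize` are hypothetical stand-ins, not the proposal's code:

```swift
// Hypothetical actor illustrating interleaving at suspension points.
actor BankAccount {
    enum Failure: Error { case insufficientFunds }

    private(set) var balance: Int

    init(balance: Int) { self.balance = balance }

    // Stand-in for any async work done elsewhere (assumption for the sketch).
    private func authorize(_ amount: Int) async {
        await Task.yield()
    }

    func withdraw(_ amount: Int) async throws {
        guard balance >= amount else { throw Failure.insufficientFunds }

        // Suspension point: while this call waits, other calls on the same
        // actor may run and change `balance`.
        await authorize(amount)

        // So the earlier check can no longer be trusted; check again.
        guard balance >= amount else { throw Failure.insufficientFunds }

        // No await between the second check and the mutation, so nothing
        // can interleave here.
        balance -= amount
    }
}
```

The `await` keyword marks the one place in `withdraw` where assumptions about the actor's state have to be re-validated.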

Lantua · October 31, 2020, 12:01am

I think it's important enough to be repeated on relevant pitches (which would at least be this one & async function). I needed to find that for half of this pitch to even begin to make sense.

anandabits · October 31, 2020, 12:50am

I’m a big fan of the actor model so I’m happy to see this direction. Thanks to everyone who has been working on it!

I have implemented a library that includes an actor-based concurrency model that encodes the serialization context in the type system. In this design, a class is able to abstract over the serialization context and generic code is able to constrain a type parameter based on serialization context. This has been very useful in some parts of the library. As one example, a UI layer of the library constrains type parameters to the main serialization context.

It doesn’t look like this kind of abstraction is possible in the current proposal. Did you consider a solution that would support this? You include an Actor protocol. If this protocol included an associated type representing the actor’s queue, this would become possible. This would be an anonymous compiler-synthesized type by default and the actor’s global actor when one is specified.

Lantua · October 31, 2020, 1:02am

Could you provide an example? It feels like it should be possible with this pitch's actor as a foundation, but I can't imagine the scenario you mentioned just yet.

xwu · October 31, 2020, 1:03am

I came here to express the same intuition, but I see that @Karl has already expressed it better.

I agree that being a reference type and satisfying AnyObject intuitively come as a package. In this language, where we have multiple different sorts of value types, it's perfectly understandable to have multiple different sorts of reference types. Why might that be useful here? Well:

Adopting @Karl's viewpoint allows us to decouple user expectations of reference types from user expectations of classes. That means we can more critically evaluate whether actors need to, for example, support inheritance or if instead it'd be nearly or just as powerful if they didn't.

But even if we don't make any changes to the design after such re-evaluation, the restrictions that apply to actors but not to "non-actor classes" or vice versa would feel natural in a design where actors aren't treated as "restricted" classes. Take, for example, the rule that actor classes can only inherit from actor classes and non-actor classes from non-actor classes: this would require no explanation at all if actors aren't classes, only reference types.

John_McCall · October 31, 2020, 1:22am

The default actor executor will be more efficient than enqueuing something as a block on a DispatchQueue. If that's all you're doing, you should endeavor to switch it to an actor. But if you do have a DispatchQueue you can't just eliminate, it's not unlikely that we could provide an adapter that, with the right OS support, could also do better than enqueuing something as a block.

nh7a · October 31, 2020, 3:01am

Agreed. The first thing I wondered was why it has to be actor class, not simply actor. And how does inheritance work with it?

class Foo {}
actor class Bar: Foo {}

Is this allowed? Or does it depend on how Foo is implemented?

actor class ActorFoo {}
class Bar: ActorFoo {}

Is Bar an actor class, or does it have to be spelled out as actor class?

class Foo {}
actor Bar: Foo {}  // error
actor ActorFoo {}
actor Bar: ActorFoo {}

I think this is much simpler to read and understand, whether inheritance will be allowed or not.

tclementdev · October 31, 2020, 8:44am

My worry is that actors seem to be mostly encouraging a number of design patterns that are known to be problematic:

Actors encourage going wide by default. Developers will create many actors, each backed by its own private queue. But we have learned that this is a mistake. Applying concurrency without care leads to terrible performance. The better approach seems to be to go serial first, then apply concurrency as needed with great care.
Actors encourage protecting shared state with queues instead of locks. Dispatching small tasks to queues is inefficient. It's unclear how actors will distinguish between, say, a function that merely does an insertion into a dictionary and a function that performs a long-running task. Moving the first kind of method onto a queue will be very inefficient.
Actors encourage writing more async methods. While async methods are fine, they also make programs more complex and introduce subtle out-of-order bugs. For example, an actor method that is suspended mid-execution can be interleaved with other calls on the same actor, which causes hard-to-debug bugs. It's also not obvious how such bugs should then be addressed once you only have async methods to call on other actors. This is usually a sign that too much asynchronicity exists in the program and that a lot of code should probably have been written synchronously in the first place and moved onto a background queue/thread at a higher level.
Async methods are also contaminating, in that awaiting them requires turning the caller into an async method itself, which can rapidly turn the whole program into an async mess. Rather, some methods should really just be synchronous and use locks to protect state.
Actors encourage developers to not think about threads. Whether we like it or not, we cannot ignore the reality of the underlying OS and hardware that programs are running on. I have seen many developers throwing a lot of queues at the OS/hardware (and I've done it myself) with terrible results.
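
To make the lock-versus-actor contrast above concrete, here is a minimal sketch assuming the `actor` syntax that later shipped in Swift 5.5. `LockedCache` and `ActorCache` are hypothetical names, and the sketch makes no claim about relative performance:

```swift
import Foundation

// Lock-based: callers stay synchronous; the lock guards a tiny critical section.
// @unchecked Sendable is an assumption: the lock is what makes sharing safe.
final class LockedCache: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: [String: Int] = [:]

    func set(_ value: Int, for key: String) {
        lock.lock()
        defer { lock.unlock() }
        storage[key] = value
    }

    func value(for key: String) -> Int? {
        lock.lock()
        defer { lock.unlock() }
        return storage[key]
    }
}

// Actor-based: the same state, but every external caller must await,
// which in turn requires the caller to be in an async context.
actor ActorCache {
    private var storage: [String: Int] = [:]

    func set(_ value: Int, for key: String) { storage[key] = value }
    func value(for key: String) -> Int? { storage[key] }
}
```

With `LockedCache` a plain synchronous function can call `set` and `value(for:)` directly; with `ActorCache` the same calls need `await`, which is the "contaminating" effect described above.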

I'm worried because this feels like a reenactment of the problems that appeared following the introduction of libdispatch, when we were told not to worry about threads and that it was OK to create hundreds, even thousands, of queues. Much later we were told that, in fact, we should use a very limited number of queues, consider them as "execution contexts" in the program (which all of a sudden sounds like we should care about threads) and apply concurrency very sparingly. Ten years later we are still seeing developers making these same mistakes in their libdispatch code; this is deeply entrenched, for the worse. We need to be very careful here because once something like this is out, it will be used widely and without limit.

I understand that there may be optimizations/tricks that could help alleviate these problems but I haven't seen them explained yet. I'd love to hear more.