[Pitch] Custom Actor Executors

Nothing useful to add other than this is extremely exciting! :tada:

1 Like

I assume the backwards-deployment story is “cannot be backwards-deployed” because of the new protocols and such. If that’s the case, what will happen if someone sets up a whole actor network with custom executors and tries to run it on an existing runtime? Do we think the necessary availability guards are enough of a tip-off that people won’t try to do that?

(Alternately, if it can be backwards-deployed at all, then “neat” and also “how will that work for the introduction of new protocols and requirements?”)

1 Like

Looks like a solid pitch to me! Just one question/concern/addition:

Currently, if you annotate two functions as @MainActor and one calls the other, it doesn't need to await the result (assuming the function isn't explicitly marked as async). What happens if I adopt the same SerialExecutor in two actors? Would Swift be able to check that the call is already running on the same SerialExecutor, or would I still need to mark my calls between these two pieces of code with await?

4 Likes

I think @John_McCall answered exactly this in another discussion thread yesterday:


Isn't that what @_alwaysEmitIntoClient is for, or does it not work on whole types or protocols?

2 Likes

It only works for functions.

2 Likes

The proposed ways for actors to opt in to custom executors are brittle, in the sense that a typo or some similar error could accidentally leave the actor using the default executor. This could be fully mitigated by requiring actors to explicitly opt in to using the default executor; however, that would be an unacceptable burden on the common case. Short of that, it would be possible to have a modifier that marks a declaration as having special significance, and then complain if the compiler doesn't recognize that significance. However, there are a number of existing features that use a name-sensitive design like this, such as dynamic member lookup (SE-0195). A "special significance" modifier should be designed and considered more holistically.

There was a pitch about this, a few months ago: Pre-Pitch: Explicit protocol fulfilment with the 'conformance' keyword

1 Like

I'll update the doc to stick to using Job more in the proposal text, but for what it's worth: they're the same, except for ownership/safety. Job is move-only, consumed by running it, and therefore generally safe; UnownedJob's lifetime is not managed, and it is unsafe to access after it was "consumed" (e.g. by runJobSynchronously(theJob)).

That is what is happening here. We have an existing API today that exists but wasn't documented, accepting an UnownedJob: Executor.enqueue(UnownedJob). This API is becoming deprecated in favor of the version accepting the move-only Job, Executor.enqueue(Job).

At the same time, the current limitations of the early version of move-only types (which are being pitched, but have not reached a formal review yet) can be restrictive enough that users may need to resort to using UnownedJob. The type is therefore not deprecated, and should be treated as an escape hatch that you may need when a job is used in a generic context (e.g., as the proposal mentions: you cannot store Job in an array, but you can store UnownedJob). So the entry point is deprecated, leading you towards the safe API, but the escape-hatch type is not deprecated.

Hope this clarifies why the enqueue method is deprecated, but the type is not.
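To make that concrete, here is a minimal sketch using the pitched names, assuming an UnownedJob(_:) conversion from Job as the escape-hatch discussion implies (exact signatures may differ in the final proposal; synchronization of the executor's internal state is omitted):

final class QueueingExecutor: Executor {
  // [Job] would not be allowed here, since Job is move-only; UnownedJob is the escape hatch.
  private var pending: [UnownedJob] = []

  func enqueue(_ job: __owned Job) {
    // Converting to UnownedJob gives up lifetime management; `job` must not be
    // accessed again after this point.
    pending.append(UnownedJob(job))
  }
}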

This is the method to "run a job" synchronously, immediately, when and where this method is invoked:

// in a SerialExecutor:
self.runJobSynchronously(job)

Happens-before and happens-after are the typical terminology in concurrent systems for expressing ordering guarantees; here we just mean that the effects of an enqueue must be visible to the run. If any synchronization is necessary to make this happen, the executor must take care of it.

No; it is as the name implies "run this job now, synchronously, on the current thread".
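For example, a minimal sketch of a dispatch-queue-backed executor, assuming the pitched names and an UnownedJob-accepting runJobSynchronously variant (as the back-deployment discussion below suggests): the required happens-before edge between enqueue(_:) and the later run is already provided by DispatchQueue.async, so no extra synchronization is needed.

import Dispatch

final class QueueExecutor: SerialExecutor {
  private let queue = DispatchQueue(label: "example.queue-executor")

  func enqueue(_ job: __owned Job) {
    let unownedJob = UnownedJob(job)
    queue.async {
      // Runs the job right here, synchronously, on the queue's thread.
      // The enqueue above happens-before this call, courtesy of DispatchQueue.async.
      self.runJobSynchronously(unownedJob)
    }
  }

  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}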

The proposal includes several existing (!) types that are @available(5.1) because they have already been back-deployed ever since Swift concurrency itself was back-deployed. We never formalized those types through an SE review, so we're doing this now, and while doing so, adding the new APIs.

AFAICS back-deployed things will continue to work as they have until today, and some of the APIs we can probably back-deploy. For example, there never was an official API to "run a job" before this proposal, but the runtime function to do so obviously existed back then (it is what Swift itself uses to run jobs), so I think we can try to back-deploy the runJobSynchronously method. That way, old implementations using only UnownedJob could at least run a job using an official API rather than an underscored one.

I don't have a full picture of what we can and cannot do though with this.

@DevAndArtist is right that @John_McCall just explained this elsewhere actually: `nonisolated lazy let` on an actor - #13 by John_McCall

You'd have to write await, but it wouldn't actually suspend if they shared the same serial executor. To get this guarantee into the type system, we'd need what is discussed in the proposal's Future Directions: DelegateActor property section:

The previous pitch of custom executors included a concept of a delegateActor, which allowed an actor to declare a var delegateActor: Actor { get } property allowing a given actor to execute on the same executor as another actor instance. At the same time, this would provide enough information to the compiler at compile time that both actors can be assumed to be within the same isolation domain, and awaits between those actors could be skipped (!). A property (as in, "runtime behavior") that with custom executors holds dynamically would this way be reinforced statically by the compiler and type system.
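Until such a delegateActor feature exists, sharing an executor already gives the runtime behavior (no actual hop), just without the static guarantee. A rough sketch, assuming the pitched APIs; SharedExecutor, Cache, and Loader are illustrative names:

final class SharedExecutor: SerialExecutor {
  func enqueue(_ job: __owned Job) {
    // run or schedule the job serially; details omitted
  }
  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}

let sharedExecutor = SharedExecutor()

actor Cache {
  nonisolated var unownedExecutor: UnownedSerialExecutor {
    sharedExecutor.asUnownedSerialExecutor()
  }
  func lookup() { /* ... */ }
}

actor Loader {
  nonisolated var unownedExecutor: UnownedSerialExecutor {
    sharedExecutor.asUnownedSerialExecutor()
  }

  func load(from cache: Cache) async {
    // The compiler still requires `await`, but at runtime there is no executor
    // hop, because both actors share `sharedExecutor`.
    await cache.lookup()
  }
}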

3 Likes

I'd like to try not to dive too deep into not-yet-completely-designed future directions discussions in this thread, but a few quick comments:

We're not sure about how this would be best exposed. It might be a tricky tradeoff dance and may need various ways to opt into this "sticky" behavior. Perhaps it is a property of an executor, rather than just how one passes it somewhere?

We have not thought enough about this space yet, but it is clear that a "sticky executor" (where we always hop back to it, rather than to the global pool) would definitely be of interest for things like event-loop based systems. But then again, wouldn't that just be a Task on an actor that has that specific executor? :thinking:

We've not thought it through in depth yet, so I'd like to be careful and not promise anything - I don't know if, when or how we'd surface these semantics.

3 Likes

How does “enqueue” imply synchronicity?

2 Likes

Executors are required to follow certain ordering rules when executing their jobs:

  • The call to SerialExecutor.runJobSynchronously(_:) must happen-after the call to enqueue(_:).

I get most of what you say, but I'm still confused about this. What is ordered here? The execution?

If the runJobSynchronously(_:) must happen-after the call to enqueue(_:), then aren't you saying that the enqueued job executes before the synchronously run job? How is that "immediately"?

Let me try saying this another way, since I'm presumably just misunderstanding you:

Since the custom executor can run an enqueue()-ed job at a time of its choosing, there seem to be 3 possible outcomes when runJobSynchronously(_:) is subsequently called:

  1. The enqueued job has finished executing already. In that case, the synchronous job can run "now", and it will therefore happen-after.

  2. The enqueued job is still executing. In that case the synchronous job has to wait for the enqueued job to finish. The synchronous job would still happen-after, but not "now", only "next".

  3. The enqueued job hasn't started executing. The synchronous job can run "now", and therefore it will happen-before the enqueued job, not after.

Is my understanding flawed here?

I think Quincey’s confusion suggests a good argument for why this API shouldn’t be on SerialExecutor. This function is a necessary function within the internal implementation of an executor, but putting it on SerialExecutor means it will present as a public function of every conforming type, which in this case means at least every default actor. I wouldn’t want someone to see this method on an actor and think that it’s essentially dispatch_sync.

3 Likes

Yeah it seems the move of that method from a free (undocumented) func onto the executor made it more confusing than helpful.

We can put it directly on the job instead, like this: Job.runSynchronously(some SerialExecutor), which should cause less confusion, I hope.

Edit: I just confirmed that I had simply missed that __consuming already works, so yep, we can express it on Job, which is probably the best place :+1:
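As a sketch of how that might read at the use site (assuming the consuming method lands roughly as discussed above; the exact parameter shape and label are not final):

final class InlineExecutor: SerialExecutor {
  func enqueue(_ job: __owned Job) {
    // Consumes the job: it cannot be accessed again after this call.
    job.runSynchronously(self)
  }
  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}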

3 Likes

This looks awesome! I anticipate the day when NIO EventLoopFutures are no more. :slightly_smiling_face:

Regarding this code snippet…

@available(SwiftStdlib 5.9, *)
extension Job {
  // TODO: A JobPriority technically is the same in value as a TaskPriority,
  //       but it feels wrong to expose "Task..." named APIs on Job which 
  //       may be not only tasks. 
  //
  // TODO: Alternatively, we could typealias `Priority = TaskPriority` here
  public struct Priority {
    public typealias RawValue = UInt8
    public var rawValue: RawValue

    /// Convert this ``UnownedJob/Priority`` to a ``TaskPriority``.
    public var asTaskPriority: TaskPriority? { ... }
    
    public var description: String { ... }
  }
}

…wouldn’t it make more sense to type-alias TaskPriority to Job.Priority? As is stated, the point of declaring Job.Priority separately instead of directly referencing TaskPriority is to decouple jobs from tasks from a naming standpoint since jobs can, in theory, generalize beyond tasks. Why not let Job.Priority be the “source of truth”, so to speak, instead of TaskPriority?

typealias TaskPriority = Job.Priority

…instead of…

extension Job {
    typealias Priority = TaskPriority
}
1 Like

I don’t think we want to emphasize Job to general users of Swift Concurrency. I can very easily imagine people thinking they should be making their own Jobs to run specific functions, which is how you interact with concurrency libraries in a lot of other systems, including Dispatch and Java.

2 Likes

This is great; async/await does not play well with frameworks that rely on thread locality. I imagine both Core Data and Realm can benefit from this. I'm also curious about the backporting story here, especially for the people who have already been relying on the underscored functions to implement their own executors. Another question I have is about the stickiness of where a task is executed. For instance, if we wanted to adapt Core Data's managed object context to execute jobs, the only way I've come up with to always hop back to a specific actor's executor is to pass an isolated parameter between function calls.

Core Data managed objects are only valid inside a managed object context's perform block, which makes it really useful to hop back to the managed object context's queue after every suspension point. In the example below, is there an easier way to express this currently?


actor DB {

    private let _unownedExecutor: CoreDataExecutor
    // ...

    func transaction<T>(resultType: T.Type = T.self, body: @escaping (isolated DB) async throws -> T) async throws -> T {
        // ...
    }
}

func registerUser() async throws {
    try await db.transaction { context in
        await networkService.makeReq()     // scheduled on the global executor
        // hop back to the db's executor
        try repo.insert(context: context)  // continues on the db's executor, because we pass in an isolated DB instance
    }
}

I recognize this is outside of the current pitch, but I would be in favour of having task executors be sticky to the executor they were started on, or at least have that option in the API. This would also open up the possibility of using task locals to inject database connections (managed object contexts) instead of needing to explicitly pass an isolated instance around.

There are several different ways to handle a "model object"-style database that I can see, depending on the answers to a pair of questions:

  • Can the database be concurrently modified from outside of your program?
  • Does the database need to process separable transactions concurrently?

If the answer to both is "no", then a lot of complexity disappears because suddenly transactions can't really fail. This might apply if, say, you're just using the database for persistence. I tend to think that this is the only scenario where you should consider modeling the database as an actor, because actor operations also can't "fail", at least not at the most basic level. (As a programmer, of course, you can define a higher level of "failure" where e.g. an actor method checks some preconditions, discovers they're no longer true, and exits early. But this doesn't happen implicitly, whereas it's pretty intrinsic to concurrent databases.)

In this case, you'll be using the actor's normal scheduling to manage transactionality. Normally, Swift's actors are reentrant, which means any await will break up the transaction. I'm not sure you can fix that at the executor level — the information just isn't there to tell you whether a suspending task has finished executing or not. To run an async operation as an atomic transaction, you'd have to have a function like your transaction() that can register the current task and block anything else from running. But it's at least a fair question whether this is something you should even try to do: if transactions can suspend to await arbitrary async operations, that can leave the database blocked for a very long time, or even lead to deadlock.

Regardless, you should make model objects non-Sendable so that you can't escape them from the actor, and then you can just manage them as internal state of the actor. And you can store all sorts of other useful state directly on the actor object, since it's all guarded by the same exclusive executor.

By modeling the database as an actor and using actor isolation as the transactional tool, you strongly encourage transactions that are more complex than a single closure to be written as extensions on the actor type. I think that's fine, though — it doesn't violate encapsulation in either direction to do that in a private extension, and it makes it very clear which parts of your code are database operations and which aren't.

Once you start answering "yes" to either of the questions above, I think an actor becomes a less appropriate model for the database, because transaction failure becomes an unavoidable concern. Swift isn't going to roll back your changes to the actor's isolated properties when the database operation fails. In this case, you still want your model objects to be non-Sendable, but you'll probably be creating them fresh for each transaction. That's still a kind of data isolation, but there's no additional isolation that you're relying on from an actor.

If you've got a distributed or high-performance database, I personally am fairly skeptical of using a model object approach at all. In this case, I think you need to think much more carefully about transactions and conflict resolution, and being able to write naive code that silently turns into an unnecessarily sweeping transaction is probably an anti-pattern.
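As a rough illustration of the actor-modeled case described above (illustrative names only, not from the proposal): model objects stay non-Sendable and live entirely inside the actor, so every access to them is guarded by the actor's serial executor.

import Foundation

// Deliberately non-Sendable, so instances cannot escape the actor.
final class ModelObject {
  var name: String
  init(name: String) { self.name = name }
}

actor PersistenceStore {
  private var objects: [UUID: ModelObject] = [:]

  func insert(name: String) -> UUID {
    let id = UUID()
    objects[id] = ModelObject(name: name)
    return id
  }

  func rename(_ id: UUID, to newName: String) {
    objects[id]?.name = newName   // only ever touched on this actor
  }
}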

2 Likes

Thanks for the detailed response! I think my example may have been misleading; DB was a really bad name for that actor :/. My real intention here is to have a more ergonomic experience while using Core Data with async/await. Transactions in Core Data are not typical DB transactions: all changes are made in memory and then committed, usually at the end of a managed object context's perform block, by calling save. Usually one would spin up a completely new managed object context for every database operation. I think using custom executors to manage transactionality would be hard to achieve; however, I think it would be useful to ensure all database operations happen inside a managed object context's perform block, which is a runtime contract for Core Data. In the example I posted above, I'm really only using an actor as a means to run code on a specific thread after a suspension point, not really for isolation, as a new one would be created for every "transaction" that needed to be performed. However, using an actor is currently the only way to specify what executor a block of code should run on.

One day, if we have task-level APIs to specify an executor that is sticky across suspension points, Core Data's APIs might be able to be spelled differently.

Instead of

public func perform<T>(_ block: @escaping () throws -> T) async rethrows -> T

it would be possible to have

public func perform<T>(_ block: @escaping () async throws -> T) async rethrows -> T

This would kick off a new task that would run all of its code on the MOC's executor. Apologies if this was off topic; working with Core Data and async/await isn't the easiest right now, and it would be great if custom executors could ease some of that pain.
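For illustration, hypothetical usage of such an async-closure perform (this overload does not exist today; fetchLatestData stands in for any async work):

import CoreData

func refreshUser(_ objectID: NSManagedObjectID,
                 in context: NSManagedObjectContext) async throws {
  // Hypothetical overload: the whole block, including code after the await,
  // would run on the managed object context's executor.
  try await context.perform {
    let user = try context.existingObject(with: objectID)
    let data = await fetchLatestData()      // suspends, runs elsewhere
    user.setValue(data, forKey: "payload")  // resumes back on the MOC's executor
  }
}

func fetchLatestData() async -> Data { Data() }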

Though if you're "just" after thread-safety of accesses to such a "database context object", then by having operations be methods on an actor you'd always be hopping back to that actor, even if the method was async; so any API that vends those operations could be made non-Sendable and therefore prevented from escaping "the database actor", like John mentioned. You won't get "transactionality" from just doing that, though, because of reentrancy.

Another thing I wanted to clarify:

Task locals are orthogonal to execution contexts.

They set values on a task, and wherever (on whatever executor) it happens to be running, the locals remain available to it. You're free to use them today for things like injecting connections, although I would not recommend putting data necessary for basic operation, such as connections, into task locals (because a missing one will be hard to debug); I'd recommend limiting their use to things like traces or metadata that enhances the context in which a thing executes.
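For example, a minimal sketch of a task-local value carried across executors (Tracing and traceID are hypothetical names):

enum Tracing {
  @TaskLocal static var traceID: String?
}

func handleRequest() async {
  await Tracing.$traceID.withValue("req-42") {
    // The bound value stays with the task no matter which executor the
    // following calls end up running on.
    await doWork()
  }
}

func doWork() async {
  print(Tracing.traceID ?? "no trace")
}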

This looks good. The use case I think about most is a library that requires particular threads / periodic thread-local-related cleanup calls as mentioned in the pitch.

It would be nice to have another couple of words on "The default global concurrent executor is currently not replaceable", if only to say whether this is still a future direction of the project or if you suspect replacing actor executors is sufficient.

2 Likes

Some scattered comments, mostly on API design:

The SomeType.asSomeOtherType... APIs are rather unusual per Swift API naming conventions. Usually, these are expressed as an initializer (e.g. String(_: Int)) and sometimes there's an equivalent property (Int.description).

It seems both could be fine here—SerialExecutor.unowned reads pretty nicely for example, so I'm unsure why the proposal goes for something more ad hoc than that. Indeed, later in the text, an example seems to show that TaskPriority(_: Job.Priority) exists, and yet Job.Priority.asTaskPriority is proposed as a distinct API.


What does the argument label mean in UnownedSerialExecutor.init(ordinary: SerialExecutor)? Are there extraordinary conversions from one to the other—and even if so, why does the "ordinary" one need a label here?


If Job.Priority is not meant to diverge in the values it can represent from TaskPriority (and it does not seem that it would make sense), then indeed one should alias the other. Or alternatively, how would one feel about just renaming this ConcurrencyPriority (leaving only a legacy type alias for TaskPriority)?


I was going to make a comment that runJobSynchronously(_: Job) could be runSynchronously(_: Job) or synchronouslyRun(_: Job) per API naming guidelines not to repeat the type of the argument. If as discussed above this is best made a consuming function on Job spelled runSynchronously then that naturally addresses the issue, though in that case it'd best have a label for the argument (Job.runSynchronously(on:) maybe?).


Where the design includes both preconditionTaskIsOnExecutor and assertTaskIsOnExecutor, is this not composable using the existing precondition and assert functions with a function Task.isOnExecutor(_:) -> Bool? Something like this at the use site:

precondition(Task.isOnExecutor(foo))
4 Likes