Is this serial executor implementation semantically valid?

I've occasionally used this sort of construct for debugging purposes:

final class Exec: SerialExecutor, TaskExecutor {
  func enqueue(_ job: consuming ExecutorJob) {
    // add some log statements here etc...
    job.runSynchronously(
      isolatedTo: self.asUnownedSerialExecutor(),
      taskExecutor: self.asUnownedTaskExecutor()
    )
    // ...and maybe some more logs here.
  }
}

I was wondering if this is a semantically valid implementation of SerialExecutor. The evolution document states the requirements as follows:

Executors are required to follow certain ordering rules when executing their jobs:

  • The call to ExecutorJob.runSynchronously(on:) must happen-after the call to enqueue(_:).
  • If the executor is a serial executor, then the execution of all jobs must be totally ordered: for any two different jobs A and B submitted to the same executor with enqueue(_:), it must be true that either all events in A happen-before all events in B or all events in B happen-before all events in A.

The first point seems like it pretty clearly holds, but I'm not totally[1] sure about the second. What exactly are the "events" in a job? Does this executor implementation ensure the "happens-before" relation is upheld in the intended manner? My intuition is that it may not, but I can't precisely articulate why...


  1. Pun? ↩︎

2 Likes

My understanding of the second point is that, unlike a TaskExecutor, a SerialExecutor must guarantee that it executes only one job at a time and fully completes that job before starting the next one. In other words, it must not "magically" interrupt the currently running job or begin executing a different job before the current job has finished.

For example, a SerialExecutor should not use multiple threads as its backing execution resource, since they may execute multiple jobs concurrently.

3 Likes

Thank you, @NotTheNHK.

Can you also please help me here? I am bewildered by that statement. :slight_smile:

Not totally certain myself, but my interpretation is that you shouldn't execute a job on an executor other than the one it was originally enqueued on. For example, a job enqueued on executor A shouldn't be executed on executor B. This may be a bit far-fetched, but it would make sense to me.

1 Like

Thanks, this seems like a good characterization of the requirements. I think part of the challenge for me is understanding where exactly the "boundaries" of the jobs are. An example I have played with that I think "breaks the rules" would be something like this:

@globalActor
actor GA {
    static let shared = GA()

    let exec = Exec()

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        exec.asUnownedSerialExecutor()
    }
}

@GA
func test() async {
    // job A: start
    print("a")

    Task { @GA in
        // job B
        print("b")
    }

    print("c")
    // job A: end
}

Under "normal" circumstances, the Task body print statement should occur after those in its caller. But with this custom executor implementation, the unstructured Task body is run before the calling function completes which results in the jobs "overlapping".

Indeed. And in this implementation we could end up with data races if the executor had jobs enqueued from different threads, since the implementation just co-opts whatever the calling thread is to run the job.

It does make me wonder a bit though about the scenario where we have a single managed thread for the executor, but still had the implementation immediately run jobs when we're already on that thread. This would presumably prevent data races, but would break the "one job at a time" constraint. I'm curious what sorts of problems could happen in that case... though I suppose there are lots of things that can break if ordering expectations aren't upheld (e.g. you free a resource but it's not done being used).

1 Like

I'm not sure if that's the appropriate conclusion to draw. It seems to me that it may be okay to have a sort of proxy executor that passed jobs through to another underlying implementation.

I think this may be more about the "C++ style" memory model requirements. I'm not super familiar with the details, but my impression is that if an executor implementation receives an enqueue call on one thread but runs the job on a different thread, this is trying to say that that cross-thread coordination must be appropriately synchronized.

1 Like

You're right, the Exec example would be an invalid executor under these requirements. Since the enqueue method is non-isolated and runSynchronously executes on the calling thread.

Could you elaborate on what you had in mind?

This was totally just my interpretation, but if that's what the text is trying to express, it's a rather convoluted way of doing so, IMO.

Tangential note

All of my custom executors store enqueued jobs in a data structure before they are consumed by a dedicated thread. As you said, this obviously requires synchronization. Preferably lock-free, but atomic data structures beyond the most basic queues are quite hard to get right (at least for me).

On that note, I'm curious what kinds of schedulers everyone uses for their custom executors.

This is an interesting example. I would leave this up to the system and focus only on the jobs the executor receives, while upholding the "one job at a time" rule. Otherwise, every executor could be considered invalid.

The Exec implementation does not follow the contract correctly. I think this demo shows that the enqueue function itself is re-entrant, so there's no way we can have a total order.

Later in the original proposal, the author gave a somewhat related example:

/// A naive "run on calling thread" job executor. 
/// Generally executors should enqueue and process the job on another thread instead.
/// Ways to efficiently avoid hops when not necessary, will be offered as part of the 
/// "executor switching" feature, that is not part of this proposal.
final class InlineExecutor: SpecifiedExecutor, CustomStringConvertible {
  public func enqueue(_ job: __owned ExecutorJob) {
    runJobSynchronously(job)
  }
}

This code can fulfill the "total order" contract if the implementation of runJobSynchronously uses locks or DispatchQueue.sync, but I interpret the author's comments to mean that this should be avoided.

FYI, demo links to an empty Godbolt editor.

Thanks, now fixed. :sweat_smile:

At the risk of being wrong, I have a different take than you guys. IMO enqueuing a job and executing a job are two different things. While we often see simple examples in which job.runSynchronously is called directly inside enqueue, I think that's a special case and not normal. It's very likely that enqueue in a general implementation store the job in an internal data structure rather than execute it immediately. In other words, eunqueing a job is not an event (that is, operation) inside the job.

Based on this understanding, the two rules make sense to me.

  • The first rule is confusing because ExecutorJob.runSynchronously(on:) is part of enqueue(_:) in simple examples. If you realize they are two separate things in general, it makes sense they should be ordered as specified in the first rule.

  • The second rule is confusing because it explicitly mentions "two different jobs A and B submitted to the same executor with enqueue(_:)", which may mislead readers that enqueuing is part of the events. I believe that's not the case. Actually, if enqueuing a job was considered events of that job, the following notes in the second rule doesn't make sense:

    Do note that this allows the executor to reorder A and B–for example, if one job had a higher priority than the other–however they each independently must run to completion before the other one is allowed to run.

    Reorder means "enqueued first but executed later", which is only meaningful if enqueuing a job isn't considered a job event.

1 Like

Right. At a low level — the level you need to think of things in order to implement an executor — a job is just a synchronous function. The events of the job are the events performed by that synchronous function. In principle, you could just call that function directly, but the concurrency runtime needs to manage some global state around it, so ExecutorJob encapsulates the function, and we tell you to call runSynchronously instead. But really it's just a synchronous function that needs to be called.

2 Likes

I wonder if the deadlock example by @FranzBusch demonstrated that two jobs did overlap when enqueue called job.runSynchronously directly?