[Pitch] Custom Main and Global Executors

This is a new pitch for the ability to write custom default executors (both the main executor, as used for the main actor, and the global executor, used by default for Swift Tasks).

There is a draft Custom Main and Global Executors PR that you can review.

This is still a work in progress, but we think it's useful to bring it to the community in its present form to get the discussion started. Editorial feedback — typos and the like — should be raised directly against the PR, but please provide any design feedback in the pitch thread below.

Interested to hear what others think.

I realize this may be out of scope, but just probing the implications here—would there be any expectation that libdispatch or swift-corelibs-libdispatch would respect this setting as well? That is, if a cross-platform codebase had uses of both @MainActor and DispatchQueue.main.async, could they expect to make use of this feature and have things work as expected?

Does having the following on ExecutorJob pose a problem for Embedded? Or does it just mean that I have to supply some type named String on my own in order to meet the conformance?

  /// Get a description of this job.  We don't conform to
  /// `CustomStringConvertible` because this is a move-only type.
  public var description: String { get }

That's an interesting question, though we're talking here about an existing type rather than anything that's part of this proposal per se. I'm not sure what the deal is with String in Embedded Swift today. @kubamracek would know what the situation is with this, I imagine.

String and CustomStringConvertible are available in Embedded Swift.

The intent is that the proposal will include the tools necessary for libraries to schedule work on a custom main executor, at which point this is really a question about adoption of that feature by the library in question, which is out of scope for this pitch thread.

My current thinking is that we'll likely add a level-triggered event mechanism to the proposal as currently written, which would allow libraries like Dispatch to hook into the custom main executor.

Super excited about this pitch. In the server ecosystem we have been hooking the global and main executors for some time now. I have some questions about the details of the proposal.

SerialRunLoopExecutor

Do we really need the SerialRunLoopExecutor protocol? The only place we use it is in public static var executor: any SerialRunLoopExecutor, but that seems like it could just be defined as public static var executor: any (RunLoopExecutor & SerialExecutor). Is there a benefit to having the protocol here?
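To make the trade-off concrete, here is a minimal sketch of the two spellings. All `Sketch…` names and their requirements are hypothetical stand-ins for illustration, not the declarations from the draft PR. The point it tries to show is that a composition accepts any type conforming to both protocols, while a dedicated refinement protocol requires types to opt in by name.

```swift
// Hypothetical stand-ins for the pitched protocols; names and requirements are assumptions.
protocol SketchSerialExecutor: AnyObject {
    func enqueue(_ work: @escaping () -> Void)
}

protocol SketchRunLoopExecutor: AnyObject {
    func run()
}

// Option 1: a dedicated refinement protocol. Types must opt in by name.
protocol SketchSerialRunLoopExecutor: SketchRunLoopExecutor, SketchSerialExecutor {}

// Option 2: a composition. Any type conforming to both protocols qualifies automatically.
typealias SketchComposedExecutor = any SketchRunLoopExecutor & SketchSerialExecutor

final class InlineLoopExecutor: SketchRunLoopExecutor, SketchSerialExecutor {
    private var pending: [() -> Void] = []
    private(set) var completed = 0

    func enqueue(_ work: @escaping () -> Void) {
        pending.append(work)
    }

    // Drain the queue in FIFO order, including work enqueued while running.
    func run() {
        while !pending.isEmpty {
            let work = pending.removeFirst()
            work()
            completed += 1
        }
    }
}

// Compiles without any extra conformance:
let composed: SketchComposedExecutor = InlineLoopExecutor()
// Would not compile unless InlineLoopExecutor also declared SketchSerialRunLoopExecutor:
// let refined: any SketchSerialRunLoopExecutor = InlineLoopExecutor()
```

Under these assumptions the dedicated protocol buys a shorter spelling at the cost of one more conformance that every main-executor type has to declare.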

Default executor

extension Task {
  /// The default or global executor, which is the default place in which
  /// we run tasks.
  ///
  /// Attempting to set this after the first `enqueue` on the global
  /// executor is a fatal error.
  public static var defaultExecutor: any Executor { get set }
}

Should this property rather be any TaskExecutor? This way we could use it with all the APIs that take a taskExecutorPreference like withTaskExecutorPreference.

Replacing the main & default executor

The defaultExecutor and MainActor.executor properties are both settable, and the proposal says they must only be set before the first enqueue. That makes sense, but I'm interested in how this is enforced. Is this something that the runtime enforces? Also, could we add a small section to the proposal showing this being done?

Adding a Task.preferredTaskExecutor

I know this is a bit orthogonal to this proposal, but currently there is no way to get the preferred task executor for the current task. There are some escape hatches on UnsafeCurrentTask, but those are often not usable. For example, in swift-async-algorithms we are required to spawn a few unstructured tasks and need to inherit the task executor of the caller (see "Support task executors", Issue #341 in apple/swift-async-algorithms on GitHub). My understanding is that we should be able to provide a Task.preferredTaskExecutor property now without going into unsafe land. cc @ktoso

Private storage

  /// Storage reserved for the scheduler (exactly two UInts in size)
  var schedulerPrivate: some Collection<UInt>

If this is only two UInts in size why don't we provide this as a tuple of them?
Naming nit: Shouldn't this be called executorPrivate or rather executorPrivateStorage?

ExecutorJobKind

Can we make this a nested struct inside the ExecutorJob so it is ExecutorJob.Kind?
I was also wondering if we could have used an enum with associated values here so that we could move all of the kind specific APIs to that. Something like this

switch executorJob.kind {
  case .task(let task):
    task.allocate(...)
}
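A slightly fuller sketch of that enum-with-payload shape might look like the following. `SketchJob`, `SketchTaskHandle` and `reserveScratch` are hypothetical names used only for illustration; the real ExecutorJob is a move-only runtime type, so this models the API shape rather than anything in the draft PR.

```swift
// Hypothetical model; the real ExecutorJob is a move-only runtime type.
struct SketchTaskHandle {
    var allocatedBytes = 0

    mutating func allocate(capacity: Int) {
        allocatedBytes += capacity
    }
}

struct SketchJob {
    // Nested so that it reads as SketchJob.Kind, mirroring the ExecutorJob.Kind suggestion.
    enum Kind {
        case task(SketchTaskHandle)
        case other(rawValue: UInt8)
    }

    var kind: Kind
}

/// Returns the total bytes reserved so far, or nil for jobs that are not tasks.
func reserveScratch(for job: inout SketchJob, bytes: Int) -> Int? {
    switch job.kind {
    case .task(var task):
        task.allocate(capacity: bytes)
        job.kind = .task(task)  // write the updated payload back into the job
        return task.allocatedBytes
    case .other:
        // Kind-specific APIs such as allocation are simply unreachable here.
        return nil
    }
}
```

The appeal of this shape is that task-only operations cannot be called on a job of another kind, because they live on the payload rather than on the job itself.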

Memory management APIs

I would love to understand those more and how those are expected to be used. Who is going to call the various allocate/deallocate methods? If an executor should do this, could we provide some samples of how that's done? I was kind of expecting those APIs on Executor, since I was assuming the Task itself calls the executor, but this looks like it is the other way around.

Building support for clocks into Executor

In reality, timer-based scheduling can be handled through some
appropriate platform-specific mechanism,

While this is true, systems often want the executor to also be responsible for timer scheduling, since that allows for optimisations in the executor instead of delegating to an external system that then informs the executor to enqueue the job. For example, in NIO's event loop we manage our own event_fd and kqueue/epoll to handle the execution of jobs, but also to handle execution of jobs at a specific time.
As it stands, when calling Task.sleep we always go through the global default executor even when a task-preferred executor is set. This has already turned out to be problematic in server applications, since it causes unnecessary context switches.
I understand that clock handling is a complicated topic and it probably deserves its own proposal.
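As a rough illustration of why an event-loop executor may want to own its timers, here is a minimal single-threaded sketch. `SketchEventLoop` and its methods are hypothetical and are not NIO's implementation or any pitched API; time is modelled as a plain `UInt64` tick count. The value returned by `nextTimeout(now:)` stands in for the timeout a real loop would hand to kqueue or epoll, so timed jobs need no second thread or external timer service.

```swift
// Hypothetical single-threaded event-loop executor that owns its own timer queue.
final class SketchEventLoop {
    private var ready: [() -> Void] = []
    private var timers: [(deadline: UInt64, work: () -> Void)] = []

    /// Enqueue work to run on the next loop turn.
    func enqueue(_ work: @escaping () -> Void) {
        ready.append(work)
    }

    /// Enqueue work to run once `deadline` has been reached.
    func enqueue(at deadline: UInt64, _ work: @escaping () -> Void) {
        // Keep timers sorted by deadline; equal deadlines stay in FIFO order.
        let index = timers.firstIndex { $0.deadline > deadline } ?? timers.endIndex
        timers.insert((deadline: deadline, work: work), at: index)
    }

    /// The timeout a real loop would pass to its wait call; nil means "block indefinitely".
    func nextTimeout(now: UInt64) -> UInt64? {
        if !ready.isEmpty { return 0 }
        guard let first = timers.first else { return nil }
        return first.deadline > now ? first.deadline - now : 0
    }

    /// One loop turn: promote expired timers, then run everything that is ready.
    func tick(now: UInt64) {
        while let first = timers.first, first.deadline <= now {
            ready.append(timers.removeFirst().work)
        }
        let batch = ready
        ready.removeAll()
        for work in batch { work() }
    }
}
```

In this model a sleeping job is resumed on the same loop that was already going to wake up, which is the context-switch saving described above.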

So setting it "again" must result in a crash. The exact implementation strategy is still TBD, I think. There are some tools in the runtime to achieve such behavior (maybe swift_once). Effectively, you can think of it as an atomic swap that crashes if an override was already set.

The reason we're lacking this is unrelated to this proposal. It is about being able, or not, to get the right witness table for the type...? But AFAICS this should (have been) doable, unless I'm missing something.

I think we discussed this at some point, but @John_McCall had recommended we don't do APIs to "get the (....)Executor". It is true that this proposal is the first time we have an API that returns an executor, so since we're doing it here for the default ones, perhaps it's time to reconsider.

I don't think this API should exist on Task if that's what you meant there?

It'd have to be some other "this job is a task thingy" like TaskJob that then has those internal APIs... Worth considering, but we really have to be careful not to accidentally cause allocations here; even a cast may introduce allocations, which could cause an unexpected slowdown in an executor that wanted to task-allocate :thinking:

This is API for the task-local "stack" allocator. This is how task-locals and other records associated with a task are allocated, so for these kinds of purposes.

Primarily, I'd expect an executor to use these if it needed to allocate some executor-private state and use those schedulerPrivate fields to store it. Provided the executor is able to maintain stack discipline for the deallocation, it can then use the task allocator rather than incurring a complete heap allocation.

It's pretty open-ended what an executor may want to store in there, which is why we thought to expose allocate/deallocate rather than something tied to a specific kind of task record; it's entirely up to executors to figure out how to use them.
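For readers unfamiliar with the term, "stack discipline" means allocations must be released in the reverse order they were made. The following is a minimal model of such an allocator, not the runtime's implementation: `SketchStackAllocator`, the single fixed-size slab and the 16-byte rounding are all assumptions made for illustration.

```swift
// A toy bump allocator that enforces last-in, first-out deallocation.
final class SketchStackAllocator {
    private let slab: UnsafeMutableRawBufferPointer
    private(set) var used = 0

    init(slabSize: Int) {
        slab = UnsafeMutableRawBufferPointer.allocate(byteCount: slabSize, alignment: 16)
    }

    deinit {
        slab.deallocate()
    }

    // Round sizes up so every allocation starts on a 16-byte boundary (an assumption).
    private func rounded(_ byteCount: Int) -> Int {
        (byteCount + 15) & ~15
    }

    /// Hand out the next chunk of the slab; traps when the slab is exhausted.
    func allocate(capacity: Int) -> UnsafeMutableRawBufferPointer {
        let size = rounded(capacity)
        precondition(used + size <= slab.count, "sketch slab exhausted")
        let start = slab.baseAddress! + used
        used += size
        return UnsafeMutableRawBufferPointer(start: start, count: capacity)
    }

    /// Release a chunk; traps unless it is the most recent live allocation.
    func deallocate(_ buffer: UnsafeMutableRawBufferPointer) {
        let size = rounded(buffer.count)
        precondition(size <= used && buffer.baseAddress == slab.baseAddress! + (used - size),
                     "allocations must be freed in reverse order")
        used -= size
    }
}
```

An executor using something of this shape would allocate its private state when it takes ownership of a job and free it before handing the job back, so that its allocations never interleave with those made by the task itself.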

The location of the API though could perhaps be on Executor actually... Executor.allocate(on:capacity:) would be an ok way to spell it as well :thinking:

No specific benefit other than being able to write it slightly more concisely when it comes up.

Yes.

I wasn't terribly keen on putting a tuple into the API surface here because of the lack of normal indexing and their general oddness, but thinking about it further I think it's equally weird using a Collection (since you can't really delete an element from it), so I've changed to a tuple. As for the name, I think you're right — I had taken the name from the existing C++ code, but your suggestion makes sense.

Yes.

While I like the idea, I don't think that's quite as simple as it sounds as the values are used from the C++ side as well.

That's an interesting thought.

I don't think there's anything stopping an executor implementation from providing timer scheduling; we just aren't specifying it here. Part of the reason for this is that while it's true that some executor implementations (like those based around kqueue(), select(), WaitForMultipleObjects() and so on) have a natural way to run a timer queue, there are other situations (like embedded systems) where the timer queue might be directly driven by some ISR, and for those, having the executor handle it seems weird.

I wonder whether the allocate() APIs should instead look like this:

extension ExecutorJob {
  var allocator: TaskAllocator? { get }
}

struct TaskAllocator {
  /// Allocate a specified number of bytes of uninitialized memory.
  public func allocate(capacity: Int) -> UnsafeMutableRawBufferPointer?

  ...
}

i.e. you'd be able to write something like

if let chunk = job.allocator?.allocate(capacity: 1024) {
  ...
}

That’s also quite nice. No need to tie it to executor like my suggestion above, this is much clearer than the currently proposed shape though so +1 from me :slight_smile:

I also think it's nice because it means if we add another kind of job at some point that does support allocation, or we change whether an existing job type has allocation support, we don't need to adjust the API at all. I'll update the draft PR.

Why do the allocate() APIs return optional pointers? Under which circumstances can they fail to allocate memory, and how is that different than other allocating APIs that do not return optional values?

Excellent question. Looking at the existing code, it appears that the allocator actually will always return memory or crash, so maybe we don't need them to be optional pointers after all.

How do we handle the situation when multiple frameworks or libraries both want to control the default or main executor?

The proposal currently says that it's a fatal error to set the executor after the first enqueue, but actually that's hard to do because when it's a custom executor we won't see the enqueue call. I'm wondering whether the solution is to make it so that the executor can be set once and only once, and so that reading the executor implicitly sets it to the platform default if not already set.

Then (a) multiple libraries attempting to set it to different things would trigger a fatal error, and (b) we could guarantee that it hasn't been set after the first enqueue since to get to the executor to call it, we'd have to read it (thus setting it, if as-yet unset).
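The "set once, and reading freezes it" rule could be modelled roughly as below. This is a sketch under stated assumptions, not the runtime's mechanism: `SetOnceSlot` is a hypothetical name, it uses an `NSLock` where the runtime might use something like swift_once or an atomic swap, and `set` returns `false` where the runtime would presumably trap, purely so the behavior can be exercised.

```swift
import Foundation

// A slot that can be set at most once, and that freezes to a default on first read.
final class SetOnceSlot<Value> {
    private let lock = NSLock()
    private var stored: Value?
    private let makeFallback: () -> Value

    init(fallback makeFallback: @escaping () -> Value) {
        self.makeFallback = makeFallback
    }

    /// Returns false if a value was already installed, either explicitly or by a read.
    /// A real runtime would presumably crash here instead.
    func set(_ value: Value) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        guard stored == nil else { return false }
        stored = value
        return true
    }

    /// Reading installs the platform default if nothing has been set yet.
    func get() -> Value {
        lock.lock()
        defer { lock.unlock() }
        if let value = stored { return value }
        let value = makeFallback()
        stored = value
        return value
    }
}
```

Because every enqueue has to read the slot to find the executor, the first enqueue freezes the choice without the runtime ever needing to observe the enqueue call itself.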

The proposal notes that an embedded target can disable the ability to set the executors at all, leaving it up to the runtime implementer to control the concurrency environment via the runtime C API. An intermediate option might be to provide a way for a Swift implementation of the default executors to express statically that it should be linked as one of the default executors, which could be done in such a way that it becomes a link error when multiple libraries try to provide a default executor at the same time.

On dynamically-linked platforms, we wouldn't be able to do such a thing with static linking, but it might be possible to use static annotations to drive load-time resolution of the default executor in a more controlled way as well. Similar to how we have the runtime consult the __swiftNN_hooks section in order to install back-compatibility hooks in a controlled way, rather than leave that up to dynamic symbol resolution hacks, maybe we could store an image's desired default executors in a section, and the Swift runtime can fail at load time if multiple images claim the default executor in incompatible ways.

I guess you were thinking we could have e.g. a global strongly-defined symbol that we define when an executor is marked as the default somehow, and then the linker would error out if it found multiple definitions?

The downside of trying to determine which executor to use statically at compile/link time is that there genuinely are situations where you want to be able to decide what to use at run time. The case in point is on Linux, where you might have an executor that is backed by io_uring, and another that uses epoll. The io_uring one may be faster, but if you're running in some kind of container system somewhere, or on some VPSs, io_uring is disabled because of security concerns. So you'd really like to be able to tell your program which to use at run time.
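A minimal sketch of that kind of run-time decision follows. `SketchBackend` and `chooseBackend` are hypothetical names; `override` stands in for some user-supplied setting (for instance an environment variable) and `ioUringUsable` for the result of a startup probe, neither of which is specified by the pitch.

```swift
// The two hypothetical executor backends being chosen between.
enum SketchBackend: Equatable {
    case ioUring
    case epoll
}

/// Pick a backend at startup, before any work has been enqueued.
/// An explicit request for epoll always wins; otherwise prefer io_uring
/// when the probe says it is usable, and fall back to epoll when it is not.
func chooseBackend(override: String?, ioUringUsable: Bool) -> SketchBackend {
    if override == "epoll" { return .epoll }
    return ioUringUsable ? .ioUring : .epoll
}
```

The main program would run something like this first and then install the chosen executor, which is a decision a purely link-time scheme cannot express.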

Perhaps. I think we were thinking that the main program should be in charge of which executors it wants to use, rather than people trying to get their library code to automatically set an executor somehow — though I can see why e.g. NIO might consider doing the latter.

"Only the main program gets to decide" seems like a reasonable rule (and that's the rule we impose for compatibility hooking—only the main executable image can provide hooks). The rule that the executor can't change once any async work has been enqueued without the program crashing also seems like a reasonable practical way to enforce that, since only the main program can reliably run code when the program is in that state. I see the need for program-driven logic to pick an executor, though I think it's nice to be able to promote runtime issues to compile- or link-time issues when that level of flexibility isn't needed (and removing any possibility of mutable dispatch vectors being hijacked could be especially nice for small-scale, high-security embedded/static binaries in particular).

I was thinking about this more. Maybe the right thing is Span<UInt8> nowadays?

That's interesting, but I personally find it sad that we would have to make the API less Swifty for the sake of the C++ side. Could we implement parts of this in the runtime in Swift, using C++ interop, to get a better Swift API?

I really like this idea, minus the optional pointer, since so far Swift always returns or crashes. This might be a larger topic, though, since in some embedded environments you might actually want to do something other than crash if you can't allocate.

I must say this confuses me more. Are you saying that those allocate/deallocate APIs are used to store stuff in the schedulerPrivate storage? Or are you saying that those are used to allocate additional things inside the task that don't fit into the schedulerPrivate storage?

I would really appreciate if the proposal would lay out a concrete example of how this is supposed to be used. So folks implementing executors can judge if they want to adjust their implementation or how they can use this feature.

Thanks. Those are good arguments and I agree that timers need to be handled in a separate proposal.