SE-0505: Delayed Enqueuing for Executors

Hello, Swift community.

The review of SE-0505: Delayed Enqueuing for Executors begins now and runs through January 27th, 2026.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager via either email or forums DM. When messaging me directly, please put "[SE-0505]" at the start of the subject line.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available on the evolution website.

Thank you,

Freddy Kellison-Linn
Review Manager

6 Likes

Thanks for putting together the proposal. To me, it's directionally the right thing to do, but unfortunately it fails to address one issue that could even enable denial-of-service attacks against services: the impossibility of full resource reclamation, which arguably violates Structured Concurrency.

Concretely, the issue is that swift_task_enqueueGlobalWith* will consume resources in the executor that cannot be freed until the timer/deadline expires. To address this, the proposal should return a handle to the scheduled timer that can be discarded when no longer needed.

Practically, this means that a service which reasonably implements request handling by wrapping the actual work in a withDeadline function would accumulate resources for already-finished requests.

Pseudocode example:

func handleRequest(_ request: Request) async throws -> Response {
    // All the APIs used in here are for demonstrative purposes only, they don't actually exist.
    let now = Clock.now()
    let clampedDeadline = clamped(
        request.deadline,
        now + .seconds(1) ..< now + .minutes(5)
    )

    return try await withDeadline(clampedDeadline) { // (1) Triggers cancellation when deadline is reached
        try await handleRequestImpl(request) // (2)
    } // (3)
}

The issue here is that even if handleRequestImpl does its job very efficiently, we still keep around the deadline timer that withDeadline had to schedule. withDeadline must schedule that timer in case handleRequestImpl takes too long; for example, it might reach out to other services that are too slow.

Under Structured Concurrency this would be non-compliant because

The core concept is the encapsulation of concurrent threads of execution (here encompassing kernel and userland threads and processes(*)) by way of control flow constructs that have clear entry and exit points and that ensure all spawned threads have completed before exit.

(*) in Swift Concurrency parlance that'd be a task.

In the above example code, what I would like to see is the following:

  • (1) should under the hood call swift_task_enqueueGlobalWithDeadline with the appropriate deadline
  • So if (2) completes quickly, ...
  • ... (3) should call a dispose/cancel/... method that can remove the enqueued global deadline from the executor -- This step is impossible under the current proposal
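To make (3) concrete, here is a rough sketch of the kind of API shape I mean. Everything here is hypothetical: TimerHandle, scheduleDeadlineJob, and this withDeadline are invented for illustration and exist in neither the proposal nor the standard library.

```swift
// Hypothetical sketch only -- none of these names exist today.
protocol TimerHandle: Sendable {
    /// Release the executor resources backing the scheduled deadline.
    /// Safe to call whether or not the timer has already fired.
    func dispose()
}

func withDeadline<R>(
    _ deadline: ContinuousClock.Instant,
    operation: () async throws -> R
) async throws -> R {
    // (1) Schedule the deadline job; the executor hands back a handle.
    let handle: TimerHandle = scheduleDeadlineJob(at: deadline) // hypothetical
    defer {
        // (3) If the operation finished early, reclaim the timer's
        // resources immediately instead of waiting out the deadline.
        handle.dispose()
    }
    return try await operation() // (2)
}
```

The key point is the defer: the timer's resources are released at scope exit, not when the deadline eventually expires.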

If it helps, to make the problem visible today, you could run this

cat > /tmp/repro.swift <<"EOF"
@main
struct Example {
    static func main() async throws {
        while true {
            do {
                async let _ = Task.sleep(for: .seconds(1000000000000000))
                try? await Task.sleep(for: .nanoseconds(0))
            } // EDIT: This is non-obvious but this `}` _will cancel_ all unawaited `async let`s such
              // as the `async let _ = Task.sleep(for: .seconds(1000000000000000))` above.
              // And yet: It accumulates resources --> Structured Concurrency violation
        }
    }
}
EOF

and then

swiftc -O -parse-as-library -o repro repro.swift
./repro # wait 30 seconds, then look at the memory consumption of this binary

On my machine for example it produces 2 GB of timer waste in 30 seconds:

$ /usr/bin/time -l /tmp/repro & sleep 30 ; pkill repro
[1] 58273
johannes:~
$ time: command terminated abnormally
       30.04 real        14.34 user        36.25 sys
[...]
          2077149008  peak memory footprint
10 Likes
protocol Clock {
  func run(_ job: consuming ExecutorJob,
           at instant: Instant, tolerance: Duration?)
  // ...
}

Having Clock define methods for running executor jobs feels unintuitive to me. In the standard library today there are only two concrete clocks—ContinuousClock and SuspendingClock—and as a user I naturally expect a Clock API to be about time: instants, durations, measurement, and related concepts.

This proposal instead makes Clock part of executor internals, which is where my concern lies. To be clear, I’m not opposed to the existence of a Clock protocol itself. The issue is that it becomes tightly coupled to executors rather than remaining a general-purpose time abstraction.

From that perspective, it might be cleaner to:

  1. Keep a public Clock protocol that models time and is used by ContinuousClock, SuspendingClock, etc.
  2. Introduce an internal ExecutorClock: Clock protocol that adds the executor-specific functionality needed for scheduling jobs.

Alternatively, I’m curious whether delayed job execution could be implemented without introducing an executor-specific clock protocol at all.
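As a rough illustration of the split I have in mind (the run/enqueue signatures are copied from the quoted pitch; the protocol shapes are otherwise just a sketch):

```swift
// A general-purpose time abstraction, roughly as Clock is today.
protocol Clock<Duration>: Sendable {
    associatedtype Duration
    associatedtype Instant: InstantProtocol where Instant.Duration == Duration
    var now: Instant { get }
    // ... time-related requirements only ...
}

// Executor-specific scheduling lives in a refinement, keeping
// job execution out of the base Clock protocol.
protocol ExecutorClock: Clock {
    func run(_ job: consuming ExecutorJob,
             at instant: Instant, tolerance: Duration?)

    func enqueue(_ job: consuming ExecutorJob,
                 on executor: some Executor,
                 at instant: Instant, tolerance: Duration?)
}
```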

8 Likes

How can we then discard tasks that sleep? A similar situation arises with iteration over async sequences:

Task {
   for await element in sequence {
     consume(element)
   }
}

Such a sequence can be potentially infinite.

My assumption is that systems that need this ability might have to implement it on their own, using throwing and cancellation.

How can we then discard tasks that sleep?

By cancelling. Under Structured Concurrency this is automatic for async lets and for TaskGroups you call group.cancelAll().
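For example, a sleeping child in a task group can be discarded explicitly once the useful work is done (a minimal sketch; doActualWork is a hypothetical stand-in for real work):

```swift
try await withThrowingTaskGroup(of: Void.self) { group in
    group.addTask {
        // A very long sleep standing in for a deadline timer.
        try await Task.sleep(for: .seconds(3600))
    }
    group.addTask {
        try await doActualWork() // hypothetical work function
    }
    // As soon as the first child finishes, cancel the rest so the
    // sleeper doesn't keep the group alive for the full hour.
    _ = try await group.next()
    group.cancelAll()
}
```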

How can we then discard tasks that sleep? Similar situation is with iteration over async sequences:

Task {
[...]

Correct, Task { is the escape hatch from Structured Concurrency. Whenever you write Task { ... } you're leaving Structured Concurrency behind, along with all the guarantees and benefits it provides. General advice: avoid Task { ... } like the plague; it's the analogue of goto in Structured Programming.

It might not be obvious but my example above does cancel the tasks and does use Structured Concurrency without escaping it. However the implementation of timers in Swift Concurrency does not follow its own rules -- which is the issue I'm pointing out. (I edited my original post to point this out.)

        while true {
            do {
                async let _ = Task.sleep(for: .seconds(1000000000000000))
                try? await Task.sleep(for: .nanoseconds(0))
            } // <- this `}` will cancel the `async let _ = Task.sleep(...)` above(!)
        }
1 Like

+1 I'm still deliberating this proposal, but exposing ExecutorJob in the Clock protocol does seem confusing and out of place.

3 Likes

Yes.

Could there be a better way to do this, without touching Clock itself?

protocol ExecutorClock: Clock {
  // ...
  func run(_ job: consuming ExecutorJob,
           at instant: Instant, tolerance: Duration?)

  func enqueue(_ job: consuming ExecutorJob,
               on executor: some Executor,
               at instant: Instant, tolerance: Duration?)
}

protocol SchedulingExecutor: Executor {
  // ...
  func enqueue<C: ExecutorClock>(_ job: consuming ExecutorJob, ...)
  // ...
}
2 Likes

The pitched API surface:

protocol Clock {
  func run(_ job: consuming ExecutorJob,
           at instant: Instant, tolerance: Duration?)

  func enqueue(_ job: consuming ExecutorJob,
               on executor: some Executor,
               at instant: Instant, tolerance: Duration?)
}

This does not really make sense for a generic clock. Shouldn't it be this instead?

protocol Clock {
  func run(_ job: consuming ExecutorJob,
           at instant: Instant, tolerance: Instant.Duration?)

  func enqueue(_ job: consuming ExecutorJob,
               on executor: some Executor,
               at instant: Instant, tolerance: Instant.Duration?)
}

The clock's instant has a duration type and that can differ from Duration.

The SchedulingExecutor protocol should really only have one entry point for scheduling: the deadline variant that executes at a specific instant. Not only is the math relatively easy to do (given the operators required for the instants/durations associated with the clock), but that is already the funnel point for clocks. This interface is a more "advanced" feature, so requiring a small amount of calculation (when it would even be warranted) is a reasonable ask and an appropriate level of progressive disclosure. Having just that one interface leaves less potential for confusion about which entry point is intended and what the conceptual burden of a clock is at its core. Additionally, it paves the way more firmly for showing that mechanisms like deadlines are preferable to timeouts, etc.
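For illustration, the calculation a timeout-style caller would need on top of a deadline-only entry point is tiny. A sketch (these free functions are hypothetical, not proposal API):

```swift
// Hypothetical deadline-only funnel point.
func enqueue<C: Clock>(_ job: consuming ExecutorJob,
                       at deadline: C.Instant,
                       tolerance: C.Duration?,
                       clock: C) { /* ... */ }

// A timeout-style call reduces to reading the clock once and
// advancing it -- the only math, using the advanced(by:) operation
// that InstantProtocol already requires.
func enqueue<C: Clock>(_ job: consuming ExecutorJob,
                       after delay: C.Duration,
                       tolerance: C.Duration?,
                       clock: C) {
    enqueue(job, at: clock.now.advanced(by: delay),
            tolerance: tolerance, clock: clock)
}
```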

This is more of a general issue with executors as a whole though. There's no way presently to cancel a job, and jobs consume resources until the executor tries to run them; this isn't specific to jobs enqueued with a delay — it affects all jobs, including cancellable Tasks. I don't think we should hold up the delayed enqueuing proposal in order to fix that. Granted, that might mean some new APIs in the future.

I'm also not sure about all the mentions of swift_task_enqueueGlobalWith*. That API is being re-cored on top of this; it isn't the underlying implementation.

This was extensively discussed in two of the pitch threads. See for instance
[Pitch 2] Custom Main and Global Executors - #41 by John_McCall.

The definition of Clock says:

public protocol Clock<Duration>: Sendable {
  associatedtype Duration
  associatedtype Instant: InstantProtocol where Instant.Duration == Duration
  // ...
}

so no, Duration cannot differ from Instant.Duration.

This also came up during the pitch phase. The reason for having both options in the protocol is that there are situations, particularly in embedded systems, where having to compute an instant is unnecessary overhead — in particular, reading a clock may not be a cheap operation — while they are able to easily wait for a duration.

It's also possible in some cases that, even on the same clock, the two operations are subtly different in character. For example, waiting for a duration might be immune to system clock changes, where waiting for an instant isn't.

There isn't a requirement to implement both of the methods, mind; if you implement one, the library code will do the mathematics to make the other one work.
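Sketching how such a fallback might be derived (this is illustrative, not the actual library implementation, and it borrows the ExecutorClock name suggested upthread):

```swift
extension ExecutorClock { // hypothetical protocol name from upthread
    // If only the instant-based requirement is implemented, a
    // duration-based variant can be synthesized by reading the
    // clock once and advancing it.
    func run(_ job: consuming ExecutorJob,
             after delay: Duration, tolerance: Duration?) {
        run(job, at: now.advanced(by: delay), tolerance: tolerance)
    }
}
```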

2 Likes

I think there is a meaningful semantic difference between task jobs and the jobs created for sleeps.

Tasks always have to run to completion even in the case that they are cancelled. Not only because there currently is no concept of cancelling the underlying task's job, but more importantly, they have to run to completion for the correctness of the user's code inside the task. So I don't see a need for jobs created for running tasks to ever not get executed.

The ad-hoc jobs created for implementing sleeps that are then enqueued onto the executor are entirely different. There is no requirement that those ever run since the sleep implementation is resuming the task's job once it notices cancellation.

1 Like

Fully agree with Franz's point.

I don't disagree, but I think we should fix the general problem of not being able to cancel a job once it's handed off to an executor, and that fix doesn't really belong in this proposal (IMO).

It does need fixing in both cases, and while I understand your point that it seems worse for Task.sleep() because of the potentially long delay, it is the same fundamental problem under the hood, and the kinds of fixes we might apply are also the same.

1 Like

Thank you everyone who participated in this review. The LSG has decided to return the proposal for revision.