Unexpected dynamic suspensions & executor hops in concurrency runtime functions

thinking about the discussion here made me realize there are still a number of runtime APIs that can introduce executor switches, which may result in unexpected or undesirable behavior.

IIUC, the issue from the linked thread is that if the default concurrent pool is saturated (for a given priority), then a call to a nonisolated async function from certain execution contexts (e.g. from actor-isolated code) can end up 'stuck' waiting to switch to the concurrent executor before it can do anything at all. notably, this deferred execution can occur even if the concurrent pool's resources aren't needed to perform the immediate work. in the scenario from the thread, that led to a strange issue where Task.sleep couldn't register its delayed resumption until it had switched to the generic executor, even though it was both called from, and would resume on, the main actor, which was not saturated with work.
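to make the hop concrete, here's a minimal sketch of the shape of the problem. the names `runtimeEntryPoint` and `hopCaller` are made up for illustration and are not real runtime API, and the comments assume the default behavior for nonisolated async functions (i.e. without the `NonisolatedNonsendingByDefault` upcoming feature enabled):

```swift
// hypothetical stand-in for a runtime function such as Task.sleep;
// not a real standard library API.
nonisolated func runtimeEntryPoint() async {
  // assumption: with default nonisolated async semantics, this body runs
  // on the global concurrent executor, so the caller has to wait for a
  // pool thread before even this first line can execute.
}

@MainActor
func hopCaller() async {
  // suspension point: main actor -> concurrent pool -> back to main actor.
  // if the pool is saturated, the first switch can be delayed even though
  // the main actor itself is idle.
  await runtimeEntryPoint()
}
```

nothing in `hopCaller` needs the concurrent pool, but the call still has to go through it.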

skimming through the runtime, it seems there are a number of existing APIs that may be subject to similar issues. the ones that stood out to me are:

  1. Task.sleep()
  2. Task.yield()
  3. withUnsafeCurrentTask (async overload)

additionally there are async getters in some places, though i'm not sure how exactly their semantics differ from async functions, so we'll set them aside.

i think there's an argument that all of these APIs should probably have nonisolated(nonsending) semantics. i'll briefly make the case for each:

Task.sleep()

see motivating thread

Task.yield()

the documentation says this:

If this task is the highest-priority task in the system, the executor immediately resumes execution of the same task.

but given the current implementation this is not necessarily true, or is at least somewhat misleading. for example, in the following case the higher-priority Task will yield to a lower-priority one, despite those being the only two (user-created) Tasks in 'the system':

import Testing

@Test
@MainActor // or use a custom global actor
func yield_test() async throws {
  var doneFirst: TaskPriority?

  var highDone: Bool = false
  Task(priority: .high) {
    print("start high pri")
    print("high pri yielding")
    await Task.yield()
    print("end high pri")
    if doneFirst == nil { doneFirst = .high }
    highDone = true
  }

  var lowDone: Bool = false
  Task(priority: .low) {
    print("start low pri")
    print("end low pri")
    if doneFirst == nil { doneFirst = .low }
    lowDone = true
  }

  while !(highDone && lowDone) {
    try? await Task.sleep(for: .milliseconds(1))
  }

  #expect(doneFirst == .high) // 🛑
}

/*
 prints:

 start high pri
 high pri yielding
 start low pri
 end low pri
 end high pri
 */

personally, i'd also be okay with changing the documentation to resolve this apparent inconsistency, but the current behavior does seem like it might be undesirable in its own right: switching executors just to enqueue a continuation resumption seems potentially inefficient.

withUnsafeCurrentTask (async overload)

this has a similar issue to Task.yield(). if you replace the yield in the prior example with something like this:

await withUnsafeCurrentTask { @MainActor (_) async -> Void in
  print("in UCT")
}

then you get the same behavior, where the high-priority Task will suspend despite there being no clear reason to do so. granted, i don't think there are any promises that this would never happen, but it still seems a bit surprising and possibly inefficient.


thinking about this a bit more has made me wonder: are there functions in the concurrency runtime that should not inherit their caller's isolation? hand waving away the nontrivial complexities around ABI & source compatibility, are there other reasons against adopting such behavior throughout the runtime?

IIRC when the default isolated parameter feature was rolled out, the runtime was audited for cases which should migrate to that behavior, but the APIs mentioned above were not changed. curious if anyone has thoughts on either these particular cases, or the broader question of general adoption.
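for reference, this is roughly the shape i have in mind when i say 'default isolated parameter': a defaulted `isolated (any Actor)?` parameter using `#isolation`. the function names below are hypothetical and only sketch the pattern; this is not how the runtime functions are actually declared:

```swift
// hypothetical helper, not a standard library API.
// the defaulted isolated parameter makes the function run on whatever
// actor (if any) the caller is isolated to.
func isolationInheritingEntryPoint(
  isolation: isolated (any Actor)? = #isolation
) async {
  // when called from main-actor-isolated code, this body runs on the
  // main actor, so entering it does not require an executor switch.
}

@MainActor
func inheritingCaller() async {
  // #isolation evaluates to MainActor.shared at this call site.
  await isolationInheritingEntryPoint()
}
```

the call is still a potential suspension point, but it no longer depends on the concurrent pool having a free thread.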
