[Pitch] API to run a closure off of an actor (on the concurrent executor)

Happy Friday! I thought I'd split out an idea from the review of SE-0461 into a separate mini pitch thread.

Consider the following code:

// This is really slow. You should never call it on an actor.
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    someLongExpensiveOperation()
  }
}

someLongExpensiveOperation is expensive, and I don't want to run it on MyActor, the main actor, or any other actor.

Today, if you want to move a bit of code off of an actor and onto the concurrent executor, you have three options:

  1. Wrap the code in a nonisolated and async function, and call that function.
func someLongExpensiveOperation() { ... }

nonisolated func someLongExpensiveOperationAsync() async {
  someLongExpensiveOperation()
}

actor MyActor {
  func waitForLongOperation() async {
    await someLongExpensiveOperationAsync()
  }
}
  2. Wrap the code in await Task.detached { ... }.value.
// This is really slow. You should never call it on an actor.
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    await Task.detached {
      someLongExpensiveOperation()
    }.value
  }
}
  3. Use a task executor preference.
// This is really slow. You should never call it on an actor.
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    await withTaskExecutorPreference(globalConcurrentExecutor) {
      someLongExpensiveOperation()
    }
  }
}

Option 1 is not very convenient, because you have to declare a function that you may end up calling in only one place. Option 2 breaks task structure in unnecessary ways. Option 3 does not work if the enclosing actor has a custom executor, such as the main actor. None of the options is terribly intuitive or ergonomic.

Instead, we could add an API to the concurrency library that accepts a closure, and runs the closure off of an actor. The API would look something like this (using the explicit, subject-to-change syntax from SE-0461):

@_alwaysEmitIntoClient
@execution(concurrent)
nonisolated public func runConcurrently<E, Result>(
  _ fn: @escaping @execution(concurrent) () async throws(E) -> Result
) async throws(E) -> Result {
  try await fn()
}

It could be called like this:

// This is really slow. You should never call it on an actor.
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    // Offload this work from the actor.
    await runConcurrently {
      someLongExpensiveOperation()
    }
  }
}

The name of this API is terrible - please give me some better ideas :slight_smile:

Thoughts?


I forgot to mention the third option, which is to use a task executor preference. However, I don't think that task executor preferences can solve the problem outlined here. Whether or not a task executor preference applies is a dynamic property, and it depends on whether the current actor has a custom executor. This means that a task executor preference can never apply when calling a nonisolated function from the main actor, for example. SE-0461 will also make the task executor preference apply in fewer scenarios -- only for @execution(concurrent) functions. What I think we need here is something that statically switches isolation domains, so that the closure is guaranteed to always run off of the calling actor.

Could you provide snippets demonstrating how these existing options would look in the example from the first post?

I think that would be helpful for people to better understand and compare the possibilities.


Done


On the surface level, the proposed API is sugar for Task.detached { ... }.value, since you're writing the exact same code inside of the closure expression.

Can you elaborate what you mean here about "break[ing] task structure in unnecessary ways" and how the proposed API doesn't do that?

Wrapping the code you want to offload in Task.detached creates a new task to run that code. The new task does not propagate cancellation, priority, task locals, etc. There's no need for a task at all if you're going to await on the work. The proposed API does not create a new task. It's basically Option 1. of declaring your own wrapper function, but lifted into the Concurrency library as a generic API instead of everybody writing their own little async functions just to offload work from an actor.
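To make the difference concrete, here is a minimal sketch of one of those effects: task-local values are visible in an async function that runs in the caller's task, but not inside Task.detached. The names RequestContext, requestID, readInSameTask, and compareTaskLocalPropagation are hypothetical and exist only for this illustration.

```swift
enum RequestContext {
  // Hypothetical task-local value used only for this illustration.
  @TaskLocal static var requestID: String = "none"
}

// An ordinary async function: it runs as part of the caller's task,
// so it sees the caller's task-local values.
func readInSameTask() async -> String {
  RequestContext.requestID
}

func compareTaskLocalPropagation() async -> (sameTask: String, detached: String) {
  return await RequestContext.$requestID.withValue("abc") {
    // Same task: the bound value is visible.
    let sameTask = await readInSameTask()
    // Detached task: a brand-new task that does not inherit task locals,
    // so it reads the default value instead.
    let detached = await Task.detached { RequestContext.requestID }.value
    return (sameTask: sameTask, detached: detached)
  }
}
```

Calling compareTaskLocalPropagation() should yield "abc" for the same-task read and the default "none" for the detached read. Cancellation and priority are dropped by the detached task in a similar way.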


Super—this makes a lot of sense. In case others are as slow as I am on the uptake, I think concretely explaining this distinction rather than lumping it under the more nebulous "breaking task structure" might be helpful :)

At the use site (which is what we're optimizing), I think just concurrent reads pretty well:

await concurrent {
  someLongExpensiveOperation()
}

That could be trimmed even more:

await {
  someLongExpensiveOperation()
}

I'm definitely sympathetic to the use case here. A part of me would be sad to see further departure from the 'callee decides where to run' philosophy, but I'm also sympathetic to the concerns raised by @gwendal.roue about @concurrent (proposed as @execution(concurrent)) being 'different' in some sense with regards to whose 'problem' it is to decide where something runs. Actor isolation is fundamentally about correctness and the author is absolutely best-positioned to be the arbiter there, but whether particular calls need to be moved to the concurrent pool is much more about specific execution environment, which authors likely aren't well positioned to know or care about.

My suspicion is more or less what Gwendal lays out in the linked post. Library authors won't want to make things concurrent by design because that limits downstream flexibility. But from the app developer side, I still stand by what I wrote in that thread: what I usually want to do is take a synchronous, computationally expensive API and wrap it in a concurrent async function to use as the 'blessed' entry point for approximately the entire app. I specifically don't want ad hoc use sites to have to remember to wrap in runConcurrently { ... } or similar, because approximately 'nobody' downstream should be using the synchronous entry point.

Which is to say, I think the current solution of 'pull this out into a nonisolated async function' has some benefits, and I worry that introducing a lightweight way to just say 'move this off to the concurrent pool' will allow/encourage a bit more laziness in deciding when such a move is actually appropriate. Of course, if people are already substantially reaching for Task.detached { ... }.value as a first line of defense (as opposed to option 1), then maybe we do just need to make this pattern more accessible.


Other thought: this is a very short wrapper for someone to write themselves. Does it clear the 'trivially composable' bar? I think that, if SE-0461 is accepted, then yes, because the 'obvious' way to write this:

nonisolated public func runConcurrently<E, Result>(
  _ fn: @escaping () async throws(E) -> Result
) async throws(E) -> Result {
  try await fn()
}

would surprisingly not accomplish the goal here at all.
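For contrast, here is a sketch of why a hand-written wrapper works today. Under the rules that predate SE-0461 (SE-0338), a nonisolated async function always runs off the caller's actor, so a helper like the hypothetical offload below does move the work to the concurrent executor. With SE-0461's proposed default, the same source would instead run on the caller's actor. The name offload, the Sendable constraints, and the Int-returning stand-in for someLongExpensiveOperation are assumptions made for this example.

```swift
// Hypothetical helper; not part of the Concurrency library.
// Assumes pre-SE-0461 semantics (SE-0338), where a nonisolated async
// function runs on the global concurrent executor rather than on the
// caller's actor.
func offload<T: Sendable>(
  _ work: @Sendable () async throws -> T
) async rethrows -> T {
  try await work()
}

// Stand-in for the expensive synchronous work from the first post.
func someLongExpensiveOperation() -> Int {
  (1...10_000).reduce(0, +)
}

actor MyActor {
  func waitForLongOperation() async -> Int {
    // The @Sendable closure is not isolated to MyActor, and no new task
    // is created: the work stays part of the current task.
    await offload { someLongExpensiveOperation() }
  }
}
```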


I don't think we should be adding "run..." methods to the standard library; these semantics are expressed by executors and executor preferences/requirements IMHO. This is the inverse of the other thread: this does specifically concern itself with executors and not just isolation.

I do understand the use case, but I think this should be expressed as an executor "requirement", as opposed to the "executor preference" we already have today.

This is effectively a block of code that we require to execute on some specific (task) executor, in this case the global one, but it could be any task executor.

A very common thing in actors is to offload some work to an IO executor which has a limited set of dedicated threads; this would be spelled as

await withTaskExecutor(io) {
  // definitely on io here
}

// existing API:
// await withTaskExecutorPreference(...) { }

There could be an API that hops off to global expressed as:

await withTaskExecutor(.global) {} 
// tbh trying to stop using the word "concurrent" for these
// but this is the ".concurrent" global pool

This is different from the task executor preference API, because it's not a preference but a requirement to run on that task executor, and therefore it wins over actor executors: we explicitly shed isolation here.

We could debate how to express if that requirement should be inherited or not, but I think we could say that requirements are not inherited but preferences are!

We can also offer Task/currentTaskExecutor and Task/currentTaskExecutorPreference to query and propagate these when necessary.


I think this is a good idea and we should totally do this. A withTaskExecutor API will also help with the feedback that SE-0461 makes task executor preferences less useful, because withTaskExecutor gives you a way to guarantee that an async function which would otherwise run on the caller's actor runs on the specified executor instead, even if you're making the call from an actor with a custom executor.

The only thing that gives me pause is that I still wonder if we need something simpler for people who don't understand concurrency at the level of executors, and all they're looking to do is get some code off the main actor, because I see people use this pattern all the time:

I worry that people will still reach for Task.detached { ... }.value over withTaskExecutor(.global) { ... } because they understand what Task.detached means, and task executors are a level deeper than that. I'd love to hear what others think about this.


Yeah, I'd agree that executors are a level deeper than detached tasks, and wanting to "get code off the main actor" probably needs some way of being expressed before folks are there yet, and possibly even before they've really internalized what detached tasks are.


My strong impression is that we should do something here, but I'm at a loss as to what. Using Task.detached as a way to disable actor inheritance without being aware of its other semantics appears to be so ubiquitous that I'm tempted to say we should deprecate it, except a) increasing migration pain further seems unpalatable, and b) we'd still need something that does what it does for the rare cases people actually want to detach in.

Empirically, zero people I've asked so far have been aware of DISPATCH_BLOCK_DETACHED, the libdispatch equivalent, which suggests that there was not previously much desire from the outside-of-Apple community for these semantics.

(edit) or I suppose it might be something people wanted but didn't know how to get. Hmm. I think I'm still reasonably confident in my original interpretation.


I would estimate that effectively all use of Task.detached { } is as a replacement for DispatchQueue.async()—that is, to perform concurrent work from a non-concurrent context. There is no such thing as “structured Dispatch”, and I suspect quite a few people treat async/await as simply being syntactic sugar over completion-handler-based concurrency. The idea that they are fundamentally cutting against a different abstraction isn’t even a consideration.


I believe Mr Milchick would be positively enamored by the lexical discourse here, Miss Huang however would agree with those striving for a simpler approach.

How about

func someLongExpensiveOperation() async { }

actor MyActor {
  func waitForLongOperation() async {
    nonisolated await someLongExpensiveOperation()
  }
}

would an async let binding also be an option?


I've been using async let for just this purpose in my code. My @MainActor isolated classes use it to run expensive logic on not-the-main thread, using a closure async let x: Int = { … }() and later let y = await x.
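A minimal sketch of that pattern, as I read the description: the async let initializer runs in a child task that is not isolated to the main actor, and the result is awaited later. The ViewModel class, its total() method, and the summation workload are hypothetical placeholders.

```swift
@MainActor
final class ViewModel {
  func total() async -> Int {
    // The immediately evaluated closure is the async let initializer,
    // so it runs in a child task rather than on the main actor.
    async let sum: Int = {
      (1...1_000).reduce(0, +)
    }()

    // Other main-actor work could be interleaved here.

    // Awaiting the binding suspends until the child task finishes.
    let y = await sum
    return y
  }
}
```

Because the child task is structured, it still inherits priority, task locals, and cancellation from the enclosing task, unlike Task.detached.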


Yes, you can use async let for this purpose, but there are a few downsides:

  1. It's awkward if you want to immediately await on the work but there is no result. You need to explicitly write a type annotation if the result of the function is Void to resolve a compiler warning, and you need to write a variable name just to await on it. Applying it to the motivating example:
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    async let result: Void = someLongExpensiveOperation()
    await result
  }
}
  2. If you want to move more than one expression, you need to write an immediately evaluated closure:
func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    async let result: Void = {
      someLongExpensiveOperation()
      // some other work
    }()
    await result
  }
}

Alternatively, you can use an explicit task group, which is significantly more boilerplate:

func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    await withDiscardingTaskGroup { group in
      group.addTask {
        someLongExpensiveOperation()
        // some other work
      }
    }
  }
}

In any case, I'll add this to the list of options so it's in the text.


I’ve used this pattern a lot, and some concerns are resolved if we ever get multi-line do-expressions in the language. If we loosened some rules to not have to spell out Void, would this be a good alternative to what’s being proposed?

func someLongExpensiveOperation() { ... }

actor MyActor {
  func waitForLongOperation() async {
    async let result = do {
      someLongExpensiveOperation()
      // some other work
    }
    await result
  }
}

You lose the ability to specify a specific executor on the same line, but it’s certainly far more approachable to concurrency newbies than introducing another primitive. I also find most of the time I’m using this pattern I specifically want to try to interleave work on the executor instead of awaiting immediately, so it’s nice to have the flexibility for when the await occurs.


Would a more convenient spelling of that be:


await do {
  someLongExpensiveOperation()
  // some other work
}

In other words, like Holly’s original “runConcurrently” but using the “do” block we are already familiar with?