Inheritance of actor isolation for TaskGroup

It seems the rules for inheriting actor isolation are not the same for closures passed to Task and for closures added to a TaskGroup with addTask.

I'm referring to the rules stated in SE-0420, specifically:

According to SE-0304, closures passed directly to the Task initializer (i.e. Task { /*here*/ }) inherit the statically-specified isolation of the current context if:

the current context is non-isolated,
the current context is isolated to a global actor, or
the current context has an isolated parameter (including the implicit self of an actor method) and that parameter is strongly captured by the closure.

(emphasis mine)

So why does TaskGroup behave differently? Specifically, addTask seems to ignore (or fail to compile) the case "and that parameter is strongly captured by the closure".

Here's an example that demonstrates my observations:

enum Test {
    
    // All fine!
    static func testTask<T>(
        value: T,
        isolated: isolated any Actor = #isolation
    ) {
        print(isolated)
        Task {
            _ = isolated
            await sending(value)
        }
    }
    
    static func testTaskGroup<T>(
        value: T,
        isolated: isolated any Actor = #isolation
    ) async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            group.addTask { // Error: Passing closure as a 'sending' parameter risks causing data races between 'isolated'-isolated code and concurrent execution of the closure
                _ = isolated
                await sending(value) // Sending 'value' risks causing data races
            }
            while ((try await group.next()) != nil) {}
        }
    }
    
    
    static func sending<T>(_ value: sending T) async {}
}

The broader context is:

I'm providing static utility functions which have an isolated parameter. These are defined in a library. Users can specify the actor context, either by calling from code isolated to a global actor or by calling from an actor instance.

I also tried to use async let, assigning it the result of an async function with an isolated any Actor parameter. But there's also an issue, which is somewhat related:

It seems the issue is due to current limitations in the compiler. Unfortunately, this excludes a whole class of useful applications.

I'm aware that upcoming features which give better control of the isolation may provide a solution in the future. But currently, the TaskGroup as a whole is not working in this usage scenario at all.

Is there a workaround for this scenario?

you may find the discussion in this thread of interest, as i believe it is pretty similar. in particular, Konrad's reply here, which states:

TaskGroup is designed specifically to introduce concurrency and run child tasks in parallel. If we just inferred [isolation] by capture we'd constantly linearize their execution almost by accident one might say.

i'm not entirely sure what the intent of your actual code is, so it's unclear exactly what an appropriate workaround or solution would be. one thing to note though is that if you're going to pass value as a sending parameter, then presumably it should be marked as sending in the function parameter as well (otherwise how can it be known that it's safe to send?). i believe your sample code can be made to compile with various annotations and unsafe opt-outs, but it's perhaps better to first ask what your ultimate goal for the code is.

Just coming back from a break so short reply from a phone:

This is very much by design and has been the case ever since the initial introduction of Swift concurrency. And yes, we’ll be able to express what you’re asking for, but only once we implement “closure isolation control” (there’s a pitch, but due to unfortunate life happenings we didn’t manage to implement it yet; it remains an important piece of the puzzle we’ll want to solve though).

The rationale is basically that both structured concurrency APIs: task group and async let, are intended to introduce tasks running in parallel with the enclosing actor. This was a choice we made in the design because these APIs are explicitly used to “fan out work” and it wouldn’t be very useful to have them all run on the same actor by default.

Do use cases exist for all child tasks to run on the same actor as the surrounding context? Yes, but it’s not the default case.

In your specific example though… the sending() function is sending the value, but the value is not marked sending in the testTaskGroup so that’s not right — if you’re passing along a parameter like that it’ll have to be sending also in the testTaskGroup so it is sent from caller, through testTaskGroup to the sending func.
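To illustrate the point about marking the value as sending along the whole path, here is a minimal sketch. The names Message, Inbox, forward and demo are hypothetical and not from the original code, and it assumes a Swift 6 toolchain where the sending keyword is available. As far as I understand the region-based isolation rules, dropping sending from forward's parameter would make the call to receive an error, because the value would then be tied to the caller's task.

```swift
// A deliberately non-Sendable type (hypothetical, for illustration only).
final class Message {
    var text = ""
}

actor Inbox {
    private var stored: [Message] = []

    // The actor takes ownership of the message, so the parameter is 'sending'.
    func receive(_ message: sending Message) {
        stored.append(message)
    }

    var count: Int { stored.count }
}

// The intermediate function must also declare the parameter 'sending',
// so the value is sent from the caller, through forward, into the actor.
func forward(_ message: sending Message, to inbox: Inbox) async {
    await inbox.receive(message)
}

func demo() async -> Int {
    let inbox = Inbox()
    let message = Message()
    message.text = "hi"
    // 'message' must not be used by the caller after this point.
    await forward(message, to: inbox)
    return await inbox.count
}
```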

Thanks for both answers. OK, understood: the task group is specifically designed to run child tasks in parallel off of the calling isolation.

i'm not entirely sure what the intent of your actual code is

My usage scenario is different, in that it wants to run two or more child tasks concurrently on the same actor that is provided by the caller as an "isolated any Actor" parameter.

The functions which are the operations for the child tasks all have an "isolated any Actor" parameter. The functions are expected to have prolonged execution times, but do suspend a lot. Basically they run an async loop, waiting for some event to become available, then execute a very short computation - which mutates the value, then wait for the next event.

So, basically all computations run on the given actor. The value can be mutated by all child tasks, but no data races can happen.

My intention with TaskGroup was to manage the tasks better. My current working solution uses separate unstructured tasks, which I honestly do not really like. Ideally, I would have used async let, exactly the same way as Simon Leeb already posted in his thread.

In your specific example though… the sending() function is sending the value, but the value is not marked sending in the testTaskGroup

Yes, the function testTaskGroup is not compiling anyway because we can't use the "isolated any Actor" parameter.

However, the other function, testTask does compile as it is.
It does not mark the parameter as "sending" because (I believe) nothing is actually sent: the task uses the same isolation. So the function testTask should compile and be correct even when passing a non-Sendable value, and even when mutating it. Is my assumption correct?

this i find a bit confusing. if the value can be mutated by child tasks, but those child tasks may be running concurrently with respect to each other, won't that be a race the system should try to prevent? in your example i don't see how allowing concurrent calls to the sending<T>() method could be safe, unless the generic T is also required to be Sendable because that method runs off the actor[1] (though if the value is only ever sent once, then maybe it's okay?). anyway, i'm curious to better understand your actual problem domain a bit more concretely, as i feel i'm still misunderstanding somewhat.

as for workarounds – the primary difference between the Task initializer and the TaskGroup.addTask() method is that the former uses the unofficial @_inheritActorContext attribute on its closure parameter. if you wish to propagate the isolation in a similar manner[2], i think a utility extension might do the trick:

extension TaskGroup {
  // N.B. adding child tasks with the same isolation undermines some of the design
  // goals of TaskGroup by causing work to be serialized.
  mutating func addIsolatedTask(
    isolation: isolated (any Actor) = #isolation,
    @_inheritActorContext
    _ operation: sending @escaping @isolated(any) () async -> ChildTaskResult
  ) {
    // ideally add some assertion about operation.isolation and isolation here,
    // (they should have the same value), but it currently seems to crash
    // the compiler
    self.addTask(operation: operation)
  }
}

// ...

  static func testIsolatedTaskGroup<T>(
    isolation: isolated any Actor = #isolation,
    value: T,
  ) async {
    await withTaskGroup(of: Void.self) { group in
      group.addIsolatedTask {
        _ = isolation // must still explicitly capture the isolated parameter
        isolation.assertIsolated("missing isolation")
        await sending(value)
      }
      await group.waitForAll()
    }
  }

  1. unless the NonisolatedNonsendingByDefault feature is enabled, though that would still leave the use of a sending parameter a bit confusing IMO

  2. the standard disclaimers about relying on language features that haven't been formalized apply

Sorry, but is this even possible?

Sure, this is why "concurrency != parallelism". You can have a task "on the actor" and it suspends, and then another task runs "on the actor". These tasks execute concurrently on the actor, however not in parallel. This is also known as actor reentrancy.
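A small sketch of that interleaving, using a plain actor rather than an isolated parameter. EventLog and runReentrancyDemo are made-up names for illustration. Both child tasks hop onto the same actor; whenever one suspends at Task.yield(), the actor is free to run the other. The exact order of the entries is not guaranteed, so none is shown here.

```swift
actor EventLog {
    private(set) var entries: [String] = []

    func work(_ name: String) async {
        entries.append("\(name) start")
        // Suspension point: the actor is released and may run other work.
        await Task.yield()
        entries.append("\(name) end")
    }
}

func runReentrancyDemo() async -> [String] {
    let log = EventLog()
    await withTaskGroup(of: Void.self) { group in
        // The child tasks start off the actor, but every call to work(_:)
        // executes on the actor, one at a time and never in parallel.
        group.addTask { await log.work("A") }
        group.addTask { await log.work("B") }
    }
    return await log.entries
}
```

All mutations of entries happen on the actor, so there is no data race even if the two calls interleave.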

Today you could achieve such execution semantics by making a custom actor executor that is also a task executor, and using that task executor for the child tasks. However, task executors do not provide isolation, so the compiler would not accept this as "safe" isolation-wise, even though execution-wise at runtime it would be. I vaguely remember we may have a bug with this exact scenario, though, which is on my list to look into... so please double check before applying this pattern.

100% this. Being able to express this would make it so much easier to work with non-Sendable types. For example, in server frameworks I think it would be nice to maintain some "request local" state in a simple non-Sendable class. It can be easily passed around while handling the request. Since the context is non-Sendable, Swift will prevent concurrent access to it, so no locking is required. This model actually works quite well today already and allows for an intuitive API, but it currently falls apart completely when you want to do IO-bound work concurrently. For example:


protocol Connection: Sendable {
    func write(_ value: String) async throws
}

class ConnectionRef {
    var value: String = ""

    private let connection: any Connection

    init(connection: any Connection) {
        self.connection = connection
    }

    func send() async throws {
        // Simulate some I/O work
        try await self.connection.write(value)
    }
}

final class Server {
    func accept(_ connection: any Connection) async throws {
        let connectionRef = ConnectionRef(connection: connection)
        try await handle(connectionRef)
    }
}


func handle(_ connection: ConnectionRef) async throws {
    connection.value = "Hello, World!"
    try await connection.send()

    try await withThrowingTaskGroup(of: Void.self) { group in
        group.addTask {
                //    `- error: passing closure as a 'sending' parameter risks causing data races between code in the current task and concurrent execution of the closure
            try await connection.send() // not possible because connection is not Sendable, but safe if we could inherit isolation
                //    `- note: closure captures 'connection' which is accessible to code in the current task
        }
        group.addTask {
            try await Task.sleep(for: .seconds(10))
        }
        _ = try await group.next()
    }
}

The compiler is correct: by default, a task group schedules its child tasks on the concurrent pool, so "connection" crosses an isolation boundary and could be accessed from different threads at the same time.

If I could tell the compiler not to switch the execution and isolation context, this pattern would be safe. And yes, both tasks would be able to make progress, since once the first one suspends, the other can take over and vice versa. Obviously, if the tasks are 100% CPU bound they would just run one after another.

Thank you very much for the reply! I will try to provide a little more background to my issue.

The executions are intended to run on the same actor. Values (any) should always be isolated to the same actor. Nothing should run "off" the actor.

Each child task executes a long-running async throwing function that suspends frequently and potentially for long durations. Actually, it's an event loop. Internally, the implementation of the async throwing function obtains events from an AsyncStream. When processing an event, the actual CPU load is very small (µs range), while the async function might not terminate at all (it's enclosed within a Task, so it can be cancelled, but this is a further detail).

So, data races should not happen (Swift Concurrency should prevent this). Of course, there's concurrent (but no parallel) access to the values, as Konrad already pointed out in his reply.
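For comparison, here is a rough sketch of the same shape where the mutable state lives inside an actor instead of being isolated via an "isolated any Actor" parameter. This is not my library code: Accumulator and runEventLoops are hypothetical names, and the event values are made up. It does compile with TaskGroup today, because only Sendable values (the actor reference and the streams) cross into the child tasks.

```swift
actor Accumulator {
    private(set) var total = 0

    // One event loop: suspends while waiting for the next event,
    // then performs a very short mutation on the actor.
    func consume(_ events: AsyncStream<Int>) async {
        for await event in events {
            total += event
        }
    }
}

func runEventLoops() async -> Int {
    let accumulator = Accumulator()
    let (first, firstContinuation) = AsyncStream.makeStream(of: Int.self)
    let (second, secondContinuation) = AsyncStream.makeStream(of: Int.self)

    // Events are buffered (unbounded by default) until the loops consume them.
    for value in 1...3 { firstContinuation.yield(value) }
    firstContinuation.finish()
    for value in 10...12 { secondContinuation.yield(value) }
    secondContinuation.finish()

    await withTaskGroup(of: Void.self) { group in
        // Both loops run concurrently, but all mutations of 'total'
        // are serialized on the actor.
        group.addTask { await accumulator.consume(first) }
        group.addTask { await accumulator.consume(second) }
    }
    return await accumulator.total // 6 + 33 = 39
}
```

The drawback for my use case is that the state has to be owned by an actor defined in the library, whereas I want the caller to choose the actor and pass non-Sendable values.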

Maybe the term "concurrent" access is ambiguous. We should distinguish between "safe" concurrent access (which may still have race conditions in the logical sense) and "unsafe" concurrent access (which may cause data races, i.e. crashes). Swift should not let us do these unsafe things. ;)

your example i don't see how allowing concurrent calls to the sending<T>() method could be safe

I believe the following code is safe, because the value will not be accessed from different isolations. Am I missing something?

This is fine for the compiler:

    static func testTask<T>(
        value: T,
        isolated: isolated any Actor = #isolation
    ) {
        print(isolated)
        Task {
            _ = isolated
            await sending(value)
        }
    }

Intuitively, I believe this is safe. But I could be wrong :)
Notice that the isolated parameter is explicitly captured in the task closure, so that the task is isolated to the actor. I believe I do not need to add sending to the parameter value in func testTask. Well, Swift 6.2 is happy.

These are some technical details I have so far. The background of all this is a library. And most types like T in the example, are provided by the user and there should be no Sendable conformance required.

I'll look into your workaround using @_inheritActorContext for now, until we have the opportunity to specify actor instances for closures when passing them as parameters (upcoming feature).