Passing large amounts of data between actors without making copies

Luckily for me, the UI that shows/edits the large amount of data is not accessible during either of the long-running tasks. (During those times, there's just an AsyncStream of progress updates between the actor and the main thread.)

1 Like

I just mean, add those methods and don't otherwise use products directly. But the Disconnected approach of encapsulating them in a separate type is honestly a better one.

2 Likes

Out of curiosity, why avoid making it sendable?

I've had this come up where the class also needs to be @Observable, or conform to ReferenceFileDocument. Either of those makes sendability conformance significantly not fun (or even impossible - I'm not sure I'm equipped to tell the difference).

1 Like

What happens with the UI (of the real app) if it wants to re-render (for some reason) and needs accessing products ?

I have a similar scenario to John - in my case it's that my actor is spending lots of time doing things with a C library. When I'm doing the long-running operations on the actor, I disable the UI.

To that end, it would be super interesting if there were a type which can be safely passed back and forth, and the UI code on @MainActor would see it as nil while it's being worked on by the background actor.

I guess in that case you might as well do that long-running work on the main actor...

I tried that and could not get it to work. The call to processProducts() still fails to compile with the error "Non-sendable result type '[WorkProduct]' cannot be sent from actor-isolated context in call to instance method 'processProducts'"

If you have gotten this to work on your end, I'd love to see a diff or pull request against my sample project repo.

I guess in that case you might as well do that long-running work on the main actor...

I would say no, because it's only one window that's got disabled UI during that work, the other windows are still fully responsive.

Disconnecting the local variables like @John_McCall suggested worked for me (see Disconnect 'self'-isolated local vars, so they can be sent. by nsc · Pull Request #1 · siracusa/Worker · GitHub)

6 Likes

The need to unsafely disconnect products within processProducts points to a second missing feature: you need to be able to say that a function maintains the disconnectedness of a particular parameter. If processProduct were declared this way, products would still be in its own region and could be safely transferred out.

Both of these are worth issue reports; they seem like common needs.

10 Likes

I wonder if Swift could provide a helper function that checks at runtime that a value is in its own region:

func assumeDisconnected<T>(_ value: T) -> sending T?

It would have to walk the object graph of value and find all contained references. It would then compare the reference counts of each referenced object to the number of discovered references. If the numbers match the value can be considered disconnected. Otherwise it would return nil.

4 Likes

That sounds like a super interesting idea. I feel like the discovery part could be possible with Mirror alone. By just iterating through the stored properties and checking for AnyObject conformance. Not at my laptop to test though.

1 Like

Ah ha, very nice! I was missing the nonisolated(unsafe) let products = products bit in processProducts()

But this is just an actor. There shouldn't be any reason to code the above up directly anymore.

2 Likes

To me this looks more like a mutex since it calls queue.sync. If it was queue.async then it would be closer to an actor.

(It's interesting that DispatchQueue supports both sync and async operations, something actors can't do.)

Not equivalent through, e.g. thread pool size is different and no await is needed so callers themselves could remain non async.

Also, of course a class where only the getters and setters are protected by locks and nothing else is rarely the right thing to do. For example this is a race:

class C: @unchecked Sendable {
  var n: Int { get { lock, read, unlock } set { lock, write, unlock } }
}

let c = C()

Task {
  c.n += 1
}

Task {
  c.n += 1
}
3 Likes

Right, this was my original point. You can definitely write a thread-safe type with a carefully thought out interface that provides exactly the transactional operations that its clients need, but that takes a significant amount of thoughtful design work, not just up front but whenever it needs to evolve in the future. It is generally much easier — and much more compositional — to write a simple aggregate with simple, orthogonal properties. The latter is not the same as the former, and you can’t actually make it thread-safe with fine-grained locking; it’s a misunderstanding of what data races actually are.

6 Likes

Note that I was careful NOT suggesting this...

Among what's been suggested:

  1. make WorkProduct sendable via protection with a lock + closure

which indeed could cause a major redesign due to API change.

  1. make WorkProduct sendable via protection with a serial queue

This has a nice property of not changing its API.

But this is just an actor. There shouldn't be any reason to code the above up directly anymore.

It is superficially similar to what actor is doing but making it an actor is an API change (sync -> async) which could also amount to an unwanted major redesign due to async virality.

Using a serial dispatch queue exclusively with sync like that is essentially just using an expensive mutex. If used from multiple concurrent contexts, it is prone to the same high-level data races as any other fine-grained locking solution. I cannot stress enough that these problems are fundamental to concurrency (at least, concurrency with any measure of true parallelism) and not to anything about Swift’s presentation of it.

5 Likes

Unfortunately, actors are not a silver bullet for resolving high-level data races either :-(

In a way they could give "an illusion of correctness" because they eliminated all low-level data races (the presence of which (e.g. in a form of runtime traps) would otherwise bring developer's attention to potential issues).

Illustrating example
// Edit: simplified a bit

actor State {
    private var state = 0
    func get() async -> Int {
        await someLogStatement()
        return state
    }
    func set(_ newValue: Int) async {
        await someLogStatement()
        state = newValue
    }
}

let state = State()

Task {
    // data race
    await state.set(state.get() + 1)
}
Task {
    // data race
    await state.set(state.get() + 1)
}
Task {
    try await Task.sleep(nanoseconds: 100_000_000)
    let result = await state.get()
    print("resulting value is \(result)")
}

func someLogStatement() async {
    try! await Task.sleep(nanoseconds: .random(in: 0 ..< 1000_000))
}

RunLoop.main.run(until: .distantFuture)

Concurrency is hard.