Actor Races

I'm not sure that you need an AsyncStream for this particular case; alongside the task itself, you could also store some metadata indicating what triggered it (a timestamp of when it started, etc.).

A server notification may be a high-priority event which always cancels the running task and starts a new one (here I'm just using a simple force boolean parameter):

actor OneTaskOnly {
  private var runningTask: Task<Void, Never>?
  /*   private var runningTask: (Task<Void, Never>, Reason)? */

  func runTask(_ block: @escaping @Sendable () async throws -> Void, force: Bool = false) {
    func launchTask() {
      self.runningTask = Task {
        do {
          try await block()
          self.runningTask = nil
        } catch {
          self.runningTask = nil
        }
      }
    }

    if runningTask == nil {
      launchTask()
    } else if force {
      runningTask!.cancel()
      launchTask()
    }
  }
}
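A hypothetical usage sketch (syncWithServer is just a stand-in for whatever work the task performs): a user-initiated refresh only runs when nothing is in flight, while a server notification forces a restart of whatever is currently running.

func syncWithServer() async throws { /* the actual work */ }

let coordinator = OneTaskOnly()

// User-initiated refresh: ignored if a task is already running.
func userTappedRefresh() async {
    await coordinator.runTask { try await syncWithServer() }
}

// Server notification: cancels any running task and starts over.
func serverNotificationReceived() async {
    await coordinator.runTask({ try await syncWithServer() }, force: true)
}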

I just don't see re-entrancy as the underlying problem here: the tasks aren't trying to mutate the same variables and interleaving at suspension points. You're looking for a higher-level kind of mutual exclusion, where the granularity of that exclusion depends on the kinds of requests your server offers, how the data is modelled in the database, and how strict the consistency requirements are as different portions of the database get updated by those various requests.

3 Likes

Interesting example indeed. Can you think of a fix? For simplicity, let's assume it was the old-style closure-based API:

func downloadAndStore(execute: @escaping () -> Void) {
    loadWebResource("data.txt") { download in
        database.store(download) {
            execute()
        }
    }
}

It's a race here as well, so how would you avoid it?
And if there is a good solution here, how would you best back-port it to the async/await version?

1 Like

The way I have solved that is with a serial queue. You then use an AsyncOperation class to hold the contents of the func, and submit that to the queue. Something like this:

func downloadAndStore(completion: @escaping () -> Void) {
    serialQueue.addOperation(
        AsyncOperation { finishHandler in
            loadWebResource("data.txt") { download in
                database.store(download) {
                    completion()
                    finishHandler()
                }
            }
        }
    )
}

There is no AsyncOperation in Cocoa, but the Operation class supports the concept of an asynchronous operation, so it is quite easy to make a subclass (e.g. LLVS/AsynchronousOperation.swift at master · mentalfaculty/LLVS · GitHub)
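For readers who haven't written one before, here is a minimal sketch of what such a subclass can look like (state synchronization is omitted for brevity; the linked LLVS implementation is more complete, and the AsyncOperation name and finish-handler shape are just assumptions to match the snippet above):

import Foundation

class AsyncOperation: Operation {
    private let block: (_ finish: @escaping () -> Void) -> Void
    private var _executing = false
    private var _finished = false

    init(_ block: @escaping (_ finish: @escaping () -> Void) -> Void) {
        self.block = block
        super.init()
    }

    override var isAsynchronous: Bool { true }
    override var isExecuting: Bool { _executing }
    override var isFinished: Bool { _finished }

    override func start() {
        guard !isCancelled else { finish(); return }
        willChangeValue(forKey: "isExecuting")
        _executing = true
        didChangeValue(forKey: "isExecuting")
        // Kick off the asynchronous work; the operation stays "executing"
        // until the block invokes the finish handler.
        block { [weak self] in self?.finish() }
    }

    private func finish() {
        willChangeValue(forKey: "isExecuting")
        willChangeValue(forKey: "isFinished")
        _executing = false
        _finished = true
        didChangeValue(forKey: "isExecuting")
        didChangeValue(forKey: "isFinished")
    }
}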

How would this look in async/await? I suspect you can do this with AsyncStream (How do you use AsyncStream to make Task execution deterministic?), but it is not really any more elegant than the AsyncOperation approach above, and not that accessible to inexperienced developers.
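For reference, the AsyncStream version might look roughly like this (a sketch only, with made-up names; AsyncStream.makeStream requires Swift 5.9, older code would capture the continuation from the initializer instead). Jobs are yielded into a stream and a single long-running task drains them one at a time, which serializes execution without an Operation subclass:

typealias Job = @Sendable () async -> Void

final class SerialJobRunner {
    private let continuation: AsyncStream<Job>.Continuation

    init() {
        let (stream, continuation) = AsyncStream.makeStream(of: Job.self)
        self.continuation = continuation
        Task {
            // Jobs are consumed strictly one at a time, in the order they were yielded.
            for await job in stream {
                await job()
            }
        }
    }

    func submit(_ job: @escaping Job) {
        continuation.yield(job)
    }
}

One caveat, which comes up again further down the thread: this only guarantees the order of jobs once they are in the stream; callers racing to reach submit still have no ordering guarantee between themselves.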

For this reason, I think it would be nice if actors could take care of the streams, queues or whatever they need to make this happen, so the developer can just write something like this

actor DataManager {
    func downloadAndStore() atomic {
         let download = await loadWebResource("data.txt")
         await database.store(download)
    }
}
4 Likes

I see, thank you.

Would the following fly as an interim / hybrid approach?

func downloadAndStore() async {
    await serialQueue.addAsynchronousOperation {
        let download = await loadWebResource("data.txt")
        await database.store(download)
    }
}
Here's a proof-of-concept implementation of the hybrid approach:
import Foundation

let serialQueue = OperationQueue()
serialQueue.maxConcurrentOperationCount = 1

func foo(_ name: String, timeout: UInt64) async {
    await serialQueue.addAsynchronousOperation {
        print("\(name) op started")
        try await Task.sleep(nanoseconds: timeout)
        print("\(name) op finished")
    }
}

func test() async {
    await foo("first", timeout: 2_000_000_000)
    await foo("second", timeout: 1_000_000_000)
}

print("Start")
Task {
    await test()
}
RunLoop.main.run(until: .distantFuture)
print("Stop")
print()

extension OperationQueue {
    func addAsynchronousOperation<T>(_ block: @escaping () async throws -> T) async {
        // Note: this returns once the operation has been enqueued; it is the serial
        // queue (maxConcurrentOperationCount = 1), not the await, that keeps the
        // blocks from overlapping.
        addAsynchronousOperation { finish in
            Task {
                // Mark the operation finished even if `block` throws, otherwise
                // the queue would stall on the failed operation.
                defer { finish() }
                return try await block()
            }
        }
    }
}

where "AsynchronousOperation" is taken from the link you provided above.

The test gives the desired output:

Start
first op started
first op finished
second op started
second op finished
1 Like

I do agree that this is a surprising and pernicious pitfall — not just with Swift, but with all languages that support async / await (or anything coroutine-shaped, really) and also support shared mutable state in any form. While Swift concurrency is a vast improvement over the state of the art, the existence of the difficult-to-spot partial task (i.e. the stretch of control flow until the next await) has serious downsides for our ability to reason about state and control flow.

I wrote about my concerns over this pitfall (and also here) during the reviews of the concurrency features. I still wonder if we might do anything about it.

As @ktoso noted above, non-reentrant actors could partially solve this problem…but with serious deadlock and performance downsides. Beyond those downsides, would they in fact completely solve the issue? It seems to me that the problem here is not only with actors, but with anywhere an await appears.

Two other thoughts:

  1. Are there useful warnings the compiler could emit? Could we, for example, warn if there are modifications to potentially shared state on both sides of an await within a function? Does the compiler have enough information to emit such a warning?

    If so, there could be a kind of explicit “commit work” operation that compiles to a noop (!), but lets the compiler know that the programmer believes they have left shared state in a good state and therefore a subsequent await is acceptable. (A sketch of the kind of code this might flag follows after this list.)

  2. Might we consider Eiffel-like invariant checks?

    For those unfamiliar, the basic idea is that a type has function(s) that verify that all the class’s invariants hold, and the compiler automatically adds a call to the appropriate invariant check function(s) every time a public method returns — or before every await. The function is only a check; in optimized builds, it does not run.

    Those willing to add invariant checks to their types would presumably be able to catch misreasoning about await sooner.
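To make idea (1) concrete, here is a sketch of the kind of code such a warning might flag (all names here are hypothetical): actor state is mutated both before and after an await in the same method, so other callers can observe the intermediate state while the method is suspended.

actor AuditLog {
    private var entries: [String] = []
    func record(_ entry: String) { entries.append(entry) }
}

let auditLog = AuditLog()

actor Account {
    var balance = 100
    var pendingWithdrawals = 0

    func withdraw(_ amount: Int) async {
        pendingWithdrawals += 1                       // shared state mutated before the await…
        // …a hypothetical commitWork() no-op here would tell the compiler this
        // intermediate state is intentional and the following await is acceptable.
        await auditLog.record("withdraw \(amount)")   // suspension point: other callers can interleave
        balance -= amount                             // …and mutated again after the await
        pendingWithdrawals -= 1
    }
}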

4 Likes

Maybe an opt-in atomic marker on functions? I don't know if something like this is doable at all. I really welcome avoiding deadlocks and improving performance, but couldn't we, in some cases, nudge the scheduling to execute to our liking?

1 Like

It sounds like what's really being requested here is a TaskQueue in the standard library, analogous to a serial OperationQueue from Foundation (or a serial DispatchQueue from GCD). I'm in support of this; it's not terribly difficult to mock up the basic idea of what it should look like, but in practice it's a bit tricky (here's my crack at a draft). I could see it being either an actor or a class that uses a locking mechanism of some sort.
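For illustration, a bare-bones version of that basic idea could look like the following (my own naming, not the linked draft): an actor that chains each submitted job onto the previously submitted one, so jobs execute strictly one at a time in submission order.

actor TaskQueue {
    private var lastTask: Task<Void, Never>?

    func enqueue(_ job: @escaping @Sendable () async -> Void) {
        let previous = lastTask
        lastTask = Task {
            await previous?.value   // wait for the previously enqueued job to finish
            await job()
        }
    }
}

Note that this only guarantees ordering between jobs once they have been accepted by enqueue; as pointed out a couple of posts below, callers racing to reach the actor in the first place still have no FIFO guarantee, and there is no back-pressure: jobs pile up without bound if they are submitted faster than they complete.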

7 Likes

In between the TaskQueue and the reentrant actor, isn't there some room for non-reentrant actors?

It looks to me like non-reentrant actors would make it trivial to implement a TaskQueue, while only providing a TaskQueue would make people like me bypass the actor concept entirely and implement a poor man's version of a non-reentrant actor using TaskQueues (which would be a bit of a shame?)...

One of the biggest issues I personally have with the actor model is its non-determinism. Enqueueing something into the TaskQueue gives zero guarantees about which addOperation call wins when multiple calls happen at the same time. While two tasks could await the chance to enqueue an operation, there is no guarantee that the task that came first will win. We really need some strong guarantees for a FIFO-like TaskQueue here.

3 Likes

One way to look at a Swift actor is that it’s a class-like thing with an implicit mutex around its operations. In this view, an “atomic” or “non-reentrant” actor method is equivalent to taking a mutex in a method that makes an async call, and releasing it in a callback¹… which is the sort of thing that sends experienced thread wranglers running for a panic button.

It would be preferable if we could find an idiom for these use cases that isn’t a differently-shaped foot cannon.

¹ With the distinction that it doesn’t matter if the continuation happens on a different thread

3 Likes

I assume the design documents about actors already went through all the tradeoffs involved in the various flavors of actor systems, but although I'm absolutely not an expert, I had the feeling that non-reentrant actors did exist in other systems (either by locking, or by implementing some sort of mailbox for buffering calls) and made local actor-state invariants much, much easier to reason about.

EDIT: I may have read your message wrong. You may have been talking about what a Swift-specific implementation of non-reentrant actors would look like in the current state of the language, in its most straightforward form. In which case you may be 100% right, as I have no idea how the current system is implemented under the hood.

This is true, but it comes at the cost of making non-local interactions harder to reason about, hence the foot cannon description above. I'm glad we're discussing improving this situation but I agree with @jayton that we should be wary of just trading one problem for another.

2 Likes

I see what you mean; however, one could easily argue that, at least in the general case, it is impossible to have a good global understanding of a system whose components' local state is hard to figure out: if each actor's local state is hard to make sense of because of reentrancy, then how would one expect to make sense of a system with tens or hundreds of actors?

1 Like

Indeed, the "right answer" is not especially obvious, since all the obvious ones have clear tradeoffs. I'm very curious to see what ends up being designed for this.

To be perfectly honest, I don't think there's any design which is going to make this easy. Programmers working in concurrent systems need to learn to think transactionally, which is already a stretch, because there are inevitably going to be ways to compose transactions together non-transactionally. Swift can stop you from writing await actor.foo = actor.foo + 1, and maybe it can advise you that await actor.setFoo(actor.foo + 1) still looks really questionable, but it can't actually force you to add a proper incrementFoo() method or whatever makes sense transactionally for your situation, and that's always going to be the biggest problem here.
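A small illustration of that point (Counter is a made-up example):

actor Counter {
    private var foo = 0

    func getFoo() -> Int { foo }
    func setFoo(_ value: Int) { foo = value }

    // The transactional version: the read-modify-write happens in a single
    // actor "turn", so no other caller can interleave between read and write.
    func incrementFoo() { foo += 1 }
}

func racyIncrement(_ counter: Counter) async {
    // Non-transactional: another task can run between reading and writing,
    // so concurrent calls can lose increments.
    await counter.setFoo(counter.getFoo() + 1)
}

func safeIncrement(_ counter: Counter) async {
    await counter.incrementFoo()
}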

Composing actor operations by preventing interleaving during async actor functions seems like it's just trading one problem for another, because now every async actor operation is a potentially unbounded serialization point, and if scalability is important, that can easily be just as wrong as the potential non-atomicity, and you have to chase all of those points down and rework the code. I've seen so many towers of awful workarounds based on recursive locks and careful unlock/relock regimens. Eventually you're back in the situation that Swift puts you in: calls to peer actor operations have to be treated as re-entrant because they might need to unblock actor progress — either they're written that way currently or they will be in the future.

I agree with Adrian's point that the more pressing issue is not having a way to maintain FIFO-ness except by basically switching to channels with AsyncSequence.

18 Likes

Well, that’s basically all mutable properties touched by a task, since the compiler doesn’t know whether some piece of data is shared or not.

In my opinion, actors will be very common in most applications. I think it is rare that you can just give immutable input to a task and let it work in its own little world without it needing to coordinate with some state on the outside.

1 Like

Not necessarily. My vote would be that this is opt-in (hence my use of an "atomic" keyword). By default, it would work just as it does now. A developer could choose to opt very specific operations in. Yes, they would have to understand the repercussions of that. It would be nice if Swift could detect a case of re-entrance at run time and at least raise an error for it.

Doesn't this have the same issues? If a running task submits to the same queue, presumably it will deadlock. In this sense, a queue is exactly the same as an actor.

1 Like

I wonder if a "problem" (once again, absolutely not an expert here, so take no offense) of the current design isn't that the communication medium between two actors isn't represented anywhere.

The tradeoff between blocking vs. non-blocking reminds me of channels in Go. When establishing a channel between two sequential processes, one can decide whether the call will block or not, and if not, how large the buffer storing the calls is allowed to grow. I know Go is using CSP and not actors per se, but it makes me wonder whether @atomic vs. reentrant, etc., wouldn't rather be properties of the layer "under" actors, and whether, once that layer becomes visible, it wouldn't open the path to even more interesting customisation properties.
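A loose Swift analogue of that Go-channel point, for what it's worth: AsyncStream (via Swift 5.9's AsyncStream.makeStream) already lets the "medium" itself carry a buffering policy, though unlike a Go channel its yield never blocks the sender — it buffers or drops instead. The numbers below are made up.

let (numbers, continuation) = AsyncStream.makeStream(
    of: Int.self,
    bufferingPolicy: .bufferingNewest(16)   // keep at most the 16 newest unconsumed values
)

Task {
    for await n in numbers {   // the consumer drains values in order
        print(n)
    }
}

continuation.yield(1)   // returns immediately; never suspends the producer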

QFT:

That’s it. That’s the whole game right there.

I too see the appeal of having a TaskQueue-like something-or-other in the standard library. The thing is, the generally non-blocking nature of Swift’s concurrency model is a huge and hard-won strength. Task queues and atomic methods are both tempting abstractions, but they reintroduce all the problems of thread pool exhaustion and performance-degrading blocking and deadlock that Swift’s concurrency model is assiduously designed to avoid.

I agree with John: the problem here isn’t Swift’s concurrency implementation, or a lack of features; the problem is that coarse atomicity is a fundamentally problematic approach to concurrency. The problem isn’t in the language or the library; it’s in our heads. Transactional thinking is hard. (I suspect that anybody who takes on this problem is going to find themselves reinventing the last 50 years of work on databases.)

I would be in favor of Swift finding an ecosystem of language features + library features + idioms — the last of those probably being the most important — that encourage transactional programming patterns without going “full relational DB.”

One of the problems I see in the current model is that there are potential transaction boundaries at every await, and those are hard to spot. I proposed a couple of ideas about that upthread. (I’d be curious in particular to hear thoughts on whether (1) in that message is a dead end.) Neither of those ideas solves the larger transactionality problems John talks about in his message, but they might at least help surface the problems.

8 Likes