Writing tests to demonstrate actor re-entrance bugs

Yeah I'm still thinking you have actors in the wrong place. Making the World an actor and your game objects structs sounds like it is both creating an artificial bottleneck and failing to make things properly transactional.

I imagine it would be like a microservices system with lots of small databases that each manage separate data, and communicate with each other asynchronously through event streams. Instead of each individual database having its own lock keeping it transactional, you install one global lock on the whole system. This both needlessly constrains the system and makes it really hard to keep it properly transactional.

The reason you'd think to do this is because you want to define a single atomic "transaction" that touches multiple microservices. But as soon as you start wanting something like that, you've either proven you sliced up your microservices incorrectly, or you actually don't want microservices at all (your system is really monolithic, it can't function if one part goes down so microservices are just needlessly complicated).

The purpose of making each game entity an actor is to ensure that it can receive messages concurrently but its state isn't unsafely accessed. But making the World an actor and "opening up" each individual Entity by making them non-isolated (not actors) is like putting your lock around the whole group of databases. Two entities can't process messages in parallel even though they should, and the world has unsafe access to each entity's state while inside its lock.

I wonder if the operations you want to be atomic transactions would become as such by switching to the World being a struct and each Entity being an actor. If you follow this pattern, your methods on each Entity probably don't need to be async, because they just access their own state and can do so synchronously. In your example of one entity getting a KO message and then a move message, well the KO message would not need to be async, even though another entity or the world sending that message would need to await it. So that guarantees the whole KO message is processed so subsequent messages will see that the entity is dead.
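A minimal sketch of that idea, assuming a hypothetical GameEntity actor (the type, its methods and the integer position are illustrative and not taken from the actual project):

```swift
// Hypothetical entity actor: all names here are illustrative.
actor GameEntity {
  private(set) var isKnockedOut = false
  private(set) var position = 0

  // Synchronous: there is no suspension point inside, so the whole KO is
  // applied before the actor processes any other message.
  func knockOut() {
    isKnockedOut = true
  }

  // Also synchronous: it only touches the entity's own state.
  // Returns false when the move is rejected.
  func move(by delta: Int) -> Bool {
    guard !isKnockedOut else { return false }
    position += delta
    return true
  }
}

// Callers outside the actor still have to 'await' each message.
func knockOutThenMove(_ entity: GameEntity) async -> (moved: Bool, position: Int) {
  await entity.knockOut()
  let moved = await entity.move(by: 1)
  return (moved, await entity.position)
}
```

Because neither method contains an await, a move that is processed after the KO always sees the entity as knocked out.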


Hi Dan,

I would expect a game design to make the entities actors and the World a struct (where the "world" is defined to be the static/immutable part of the game and "entities" are the dynamic parts that interact and change state during the game session).

I am going to take your suggestion to switch the actor/struct roles. :+1:t2:

And I will come back to you with my experiences.

KR Maarten


So I swapped the actor role between world and entities. And I agree this makes more sense:

  • the chance of values getting overwritten because of "outdated" copies of structs hanging around is a lot less.
  • I don't have to queue commands and wait for the next tick to execute them making the game feel more responsive.

There are now await statements everywhere, so I still don't have a lot of confidence about the code's resilience. However, the "blast" radius should be less as typically individual entities or properties of the entity are impacted, instead of the entire world.

Btw: most time was spent converting the tests. The XCTAssert* functions don't seem to like async code/values. Meaning I needed to convert code like:
XCTAssertEqual(entity.position, .zero)

To:

let newPosition = await entity.position
XCTAssertEqual(newPosition, .zero)

I really find the explorations you are doing here fascinating! Thanks for sharing and being up for trying stuff.

I know nothing at all about game development. But I do have a question. I assume that all of the world state is tied to player interaction which originates on the MainActor. Why isn't the world state on the MainActor too?

Because it's a multiplayer game. :blush:

The entire simulation runs on a Vapor-based server.


Ah I'm sorry! You included that right at the beginning.

The mutations to your world come from a non-isolated context, and you need to provide that isolation yourself. Makes perfect sense!

Yeah, I believe adopting Swift Concurrency means you need to determine where the isolation boundaries lie. The first approach (World as actor) creates one big 'continent' that is an isolated island. Easy to reason about, but leads to a monolith and less flexibility.

The approach with entities as actors leads to an archipelago of islands of isolation. That makes sense if they are more or less independent of each other.

As the game is focussing more on PvE instead of PvP, I think the latter approach is a better fit.

Perhaps we should close this topic, as it is evolving more into a topic about multiplayer game architecture using Swift Concurrency. If you find it useful, I don't mind creating a separate topic for that.

Yes, your aSyncContext example will work. In practice, I have found queues that allow synchronous enqueuing to have no downsides and more flexibility. But by no means are they a requirement!


Concurrency is not Parallelism by Rob Pike is a nice talk about this topic.

Concurrent does not mean parallel. It does not even imply parallel. Those are unrelated concepts. I would say the "concurrent" gives you the possibility of being parallel, but it does not mean that this would actually be the case.

When you are using a lock in Swift you have an indentation:

self.lock.withLock {
  self.count += 1
}

Actors do not have indentation, which makes it very difficult to remember that this is an isolated critical section between 2 async calls. Given how much surrounding "noise" there is in every method, those 2 tiny async keywords just blend in with the rest of the code.
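To make those invisible critical sections concrete, here is a small sketch (Counter and incrementAfter are made-up names for illustration). The closure passed in re-enters the actor while the method is suspended, so the interleaving is deterministic:

```swift
actor Counter {
  private(set) var count = 0

  func increment() {
    count += 1
  }

  func incrementAfter(_ work: @Sendable () async -> Void) async {
    // Critical section 1: runs without interleaving.
    let snapshot = count
    // Suspension point: other messages may run on this actor here.
    await work()
    // Critical section 2: 'snapshot' may be stale by now.
    count = snapshot + 1
  }
}

func lostUpdateDemo() async -> Int {
  let counter = Counter()
  // The closure calls back into the actor while 'incrementAfter' is suspended.
  await counter.incrementAfter { await counter.increment() }
  return await counter.count
}
```

Two increments were requested, but lostUpdateDemo() returns 1: the write in the second critical section is based on a value read before the suspension.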

Obviously blocking the actor for the whole duration of the method execution would deadlock. People know this, they are 100% aware of this fact. I have written my fair share of Swift concurrency code, but there is not even a single project where I would be confident that there are no re-entrancy related problems.

Yep. For this you would need a meta-scheduler that would look at all of the jobs scheduled at a given time, run independent executions with all of the orderings, and fail if at least 1 execution failed. This is not possible as Jobs can schedule new Jobs and pretty soon we will have a combinatorial explosion.

That said there are ways to manually control the execution order. I think the best thread would be "Reliably testing code that adopts Swift Concurrency?" by @stephencelis. It was posted 2 years ago (May 2022), and not a lot has changed since then. From what I remember their conclusions were (btw. I do not follow their pointfree.co podcast, so maybe something has changed in the meantime):

For me those 2 things are hacks. Task.megaYield does not guarantee a yield, and MainSerialExecutor is just too complicated. Below I will share my own ways of dealing with these things.

State machines

State machines work great and are a natural fit for concurrency.

Basically, with actor re-entrance we have a fully distributed system. There is no global state, we only see our local limited scope. It may happen that 2 outside entities independently request the same action:

  1. Incoming request to perform operation "A"
  2. Operation "A" awaits for some other operation
  3. Another incoming request to perform operation "A"

An example would be: every cache that ever existed. You have probably seen a Swift implementation already. The 1st result in DuckDuckGo is donnywals.com: "Using Swift's async/await to build an image loader" (non-affiliated).
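A rough sketch of such a cache, written for this post rather than copied from the linked article (Cache, Entry and the load closure are assumed names). The important part is that the in-progress Task is stored before the first await, so a second request for the same key joins it instead of starting another load:

```swift
actor Cache<Key: Hashable & Sendable, Value: Sendable> {
  private enum Entry {
    case inProgress(Task<Value, Never>)
    case ready(Value)
  }

  private var entries = [Key: Entry]()
  private let load: @Sendable (Key) async -> Value

  init(load: @escaping @Sendable (Key) async -> Value) {
    self.load = load
  }

  func value(for key: Key) async -> Value {
    if let entry = entries[key] {
      switch entry {
      case .ready(let value):
        return value
      case .inProgress(let task):
        // Someone else already requested this key: wait for their result.
        return await task.value
      }
    }

    let load = self.load
    let task = Task { await load(key) }
    // Record the state BEFORE the suspension point below.
    entries[key] = .inProgress(task)
    let value = await task.value
    entries[key] = .ready(value)
    return value
  }
}
```

With this shape, two overlapping calls to value(for:) with the same key run the load closure only once.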

Another example would be some kind of finish/close (pseudo-code, not tested):

actor Thingy {
  enum CloseState {
    case open
    case inProgress(Task<Void, Never>)
    case closed
  }

  private var closeState = CloseState.open

  func close() async {
    switch self.closeState {
    case .open: …
    case .inProgress(let task): await task.value
    case .closed: break
    }
  }
}

The above code could also be rewritten with just Task?. It kind of suggests that the Task is a state machine.
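The Task? variant could look roughly like this (a sketch with invented names: Connection, teardown and teardownCount only exist to make the behavior observable). nil plays the role of "open", and a stored task covers both "in progress" and "closed", because awaiting a finished task returns immediately:

```swift
actor Connection {
  // nil = open; non-nil = closing or already closed.
  private var closeTask: Task<Void, Never>?
  private(set) var teardownCount = 0

  func close() async {
    let task: Task<Void, Never>
    if let existing = closeTask {
      task = existing
    } else {
      task = Task { await self.teardown() }
      // Stored before the first suspension point, so re-entrant calls see it.
      closeTask = task
    }
    await task.value
  }

  private func teardown() async {
    await Task.yield() // stand-in for real asynchronous cleanup
    teardownCount += 1
  }
}
```

However many callers invoke close() concurrently, teardown runs once and every caller returns only after it has finished.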

swift-async-algorithms also has some more examples: ChannelStateMachine.swift etc.

Building blocks

Let's go back to the bad old days of just using threads. My way of approaching the situation was: "you are not clever enough, find an already implemented data structure that does things for you".

For example I used an actor-like pattern:

  • each thread has a ThreadProxy class
  • to run a method in the thread we call a method on the proxy
  • to communicate between threads we use 2 ConcurrentQueues: inbox and outbox

With this our UI is always responsive, and if the user clicks very fast (faster than the background operations) we just buffer the actions in the inbox: ConcurrentQueue.

Anyway, going back to Swift concurrency: what are the building blocks here?

In a way we design our architecture based on those building blocks, instead of writing synchronous Swift, but with actor instead of class. For example instead of actor mutation we can send a message on a channel. This is easily testable because now we have an observable effect (the message). Representing behavior as data is a fairly interesting concept.
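As a tiny illustration of "behavior as data" (Door and DoorEvent are invented for this sketch, and AsyncStream.makeStream needs Swift 5.9 or later): the component does not mutate anything the test has to peek into, it just emits messages that the test can collect.

```swift
enum DoorEvent: Equatable, Sendable {
  case opened
  case closed
}

// The component only emits messages; it owns no observable mutable state.
struct Door: Sendable {
  let events: AsyncStream<DoorEvent>.Continuation

  func open() { events.yield(.opened) }
  func close() { events.yield(.closed) }
}

func collectDoorEvents() async -> [DoorEvent] {
  let (stream, continuation) = AsyncStream<DoorEvent>.makeStream()
  let door = Door(events: continuation)

  door.open()
  door.close()
  continuation.finish() // ends the 'for await' loop below

  // The stream buffers everything that was yielded before iteration starts.
  var received = [DoorEvent]()
  for await event in stream {
    received.append(event)
  }
  return received
}
```

A test then only has to compare the collected events with the expected sequence, here opened followed by closed.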

Below is an example of this. Note that I do NOT propose this architecture to solve OP's problem, it is just that some form of a "game" is a good way of illustrating things. I will use AsyncPubSub, which facilitates fan-out many-to-many communication (we will look at the implementation later):

  • messages can be sent to the AsyncPubSub
  • subscribers can subscribe to the AsyncPubSub
  • each message is forwarded to all of the subscribers

(I feel like this data type should already be inside swift-async-algorithms, but for some reason it is not. I have already written the proposal a few months ago, I can post it if anybody is interested.)

struct Move: Sendable { let playerId: String; … }
struct GameState: Sendable { let currentPlayerId: String; let board: Board; … }

actor Player {
  private let id: String
  private let moves: AsyncStream<Move>.Continuation
  private let gameStates: AsyncPubSub<GameState>

  // 'async' because of the 'for await' loop.
  func run() async {
    for await state in self.gameStates.subscribe() {
      guard state.currentPlayerId == self.id else { continue }
      let move = self.calculateMove(board: state.board)
      self.moves.yield(move)
    }
  }
}

actor GameLogic {
  private let moves: AsyncStream<Move>
  private let gameStates: AsyncPubSub<GameState>

  func run() async {
    // Emit the initial state.
    self.gameStates.send(self.state)

    for await move in self.moves {
      // You have to wait for your turn!
      guard move.playerId == self.state.currentPlayerId else { continue }
      // We can do an 'await' call.
      // If a player yields a move during the 'database.store' it will be added
      // to the 'AsyncStream<Move>' buffer. No re-entrancy.
      await self.database.store(move: move)
      self.updateState(move: move)
      // Notify everyone about the new state.
      self.gameStates.send(self.state)

      if self.state.winner != nil {
        break
      }
    }

    // Cleanup, store game result in database etc.
    self.gameStates.finish()
  }
}

let (moves, movesContinuation) = AsyncStream<Move>.makeStream()
let states = AsyncPubSub<GameState>()

let playerA = Player(id: "A", moves: movesContinuation, states: states)
let playerB = Player(id: "B", moves: movesContinuation, states: states)
let logic = GameLogic(moves: moves, states: states)

await withTaskGroup(of: Void.self) { group in
  // TODO: Subscribe to the 'states' in your UI.
  // TODO: For debug: 'print' every 'state' message.
  group.addTask { await playerA.run() }
  group.addTask { await playerB.run() }
  group.addTask { await logic.run() }
}

This is a skeleton of what one can do. It is trivial to test. In reality there is a race condition where we may emit the initial state before our players subscribe, to solve it:

  • instead of AsyncStream<Move> use AsyncStream<PlayerAction> where PlayerAction = connect(playerId) | move(…)
  • in GameLogic use (you guessed it) a state machine: State = waitingForPlayers(connectedPlayerIds) | inGame(GameState). Wait for all of the players to connect before starting the game.

We may also want to add PlayerAction.disconnect(playerId) (forfeit?) for when the user just exits the game. This will immediately notify the other player that they won. With this the whole thing becomes a standard client-server application, we are just using typed enums instead of JSON over WebSocket.
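A sketch of how that state machine might look as plain, synchronous data (Lobby, LogicState and the payloads are assumptions made for the example; the real move payload is left out on purpose):

```swift
enum PlayerAction: Sendable {
  case connect(playerId: String)
  case move(playerId: String) // the real payload is omitted in this sketch
  case disconnect(playerId: String)
}

enum LogicState: Equatable {
  case waitingForPlayers(connected: Set<String>)
  case inGame
  case finished(winner: String?)
}

struct Lobby {
  let expectedPlayers: Set<String>
  private(set) var state = LogicState.waitingForPlayers(connected: [])

  init(expectedPlayers: Set<String>) {
    self.expectedPlayers = expectedPlayers
  }

  mutating func handle(_ action: PlayerAction) {
    switch (state, action) {
    case (.waitingForPlayers(var connected), .connect(let id)):
      guard expectedPlayers.contains(id) else { return }
      connected.insert(id)
      // Start the game only when everybody has subscribed.
      state = connected == expectedPlayers
        ? .inGame
        : .waitingForPlayers(connected: connected)
    case (.inGame, .disconnect(let id)):
      // Forfeit: in a 2-player game the remaining player wins.
      state = .finished(winner: expectedPlayers.subtracting([id]).first)
    default:
      // Everything else (e.g. a move before the game has started) is ignored.
      break
    }
  }
}
```

Since handle(_:) is synchronous and has no dependencies, each transition can be unit-tested without any concurrency machinery.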

Side-note: the whole design can be made sync via protocol Player { func getMove(state:) -> Move }, and the GameLogic would just call it. This is just an illustration of a pattern of using "building blocks", NOT A SOLUTION TO ANY PARTICULAR PROBLEM.

A happens-before/after relationship between the messages may also be needed:

  • player A makes a move
  • game logic starts new state calculation
  • player B makes a move
  • game logic finishes state calculation after player A move
  • I guess we already have a player B move, so we will use it <- this move was based on the old state (before the player A move) it may not be correct

To solve it we can just add stateId and then each move will include a stateId representing a state on which it was calculated. Game logic will reject all of the moves based on the old state. This gives us ordering of the messages.
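The stateId check can be sketched like this (TurnValidator and basedOnStateId are hypothetical names; the actual board update is left out):

```swift
import Foundation

struct Move {
  let playerId: String
  // Id of the state this move was calculated from.
  let basedOnStateId: UUID
}

struct TurnValidator {
  private(set) var currentStateId: UUID

  init() {
    self.currentStateId = UUID()
  }

  // Accepts a move only if it was calculated from the current state.
  mutating func accept(_ move: Move) -> Bool {
    guard move.basedOnStateId == currentStateId else { return false }
    // Every accepted move produces a new state with a fresh random id.
    currentStateId = UUID()
    return true
  }
}
```

If players A and B both calculate a move from the same state, only the first one to arrive is accepted; the second one carries an outdated stateId and is rejected.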

Remember:

  • all of the ids should be random GUIDs - we don't want player A to guess player B's id. (aimbot for chess?)
  • get off the actor as soon as possible - if calculateMove is sync then it can be moved outside of the actor -> easier tests.

If we are implementing an AI bot (the "computer" player) then there is a fancy thing we can do: as soon as our bot emits a move it will try to predict the player's move and pre-calculate its next move. This way when the player actually makes the move we will already have calculated the response. This makes our bot very fast at the expense of energy. We may want to disable this feature when the user's battery is below 20%. Also, remember to use background priority for prediction. The design is concurrent, but we may not get a parallel execution.

Skeleton for prediction:

actor Player {
  private let id: String
  private let moves: AsyncStream<PlayerAction>.Continuation
  private let gameStates: AsyncPubSub<GameState>
  private var boardToPredictedMove = [Board:Move]()

  func run() async {
    let subscription = self.gameStates.subscribe()
    // Notify 'GameLogic' that we are ready.
    self.moves.yield(.connect(playerId: self.id))

    for await state in subscription {
      guard state.currentPlayerId == self.id else { continue }

      self.cancelUserMovePrediction()

      let move: Move

      if let predicted = self.boardToPredictedMove[state.board] {
        move = predicted
      } else {
        move = self.calculateMove(board: state.board)
      }

      self.boardToPredictedMove.removeAll()
      // 'boardAfterOurMove' is 'state.board' with 'move' applied (not shown).
      self.startPredictingUserMoves(state: boardAfterOurMove)
      self.moves.yield(.move(move))
    }
  }
}

Anyway, this all goes back to the Concurrency is not Parallelism by Rob Pike: you don't think about running things in parallel, you think about how to break the problem down into independent components that you can separate and get right, and then compose to solve the whole problem together. Basically: keep the gophers running, otherwise they are unemployed and their families starve.

As for the distributed systems: I don't remember the details (I read it 4 years ago) but "Designing Data-Intensive Applications" by Martin Kleppmann was a pretty good summary of how to design those. For example it talks about the importance of idempotence.

Tests

Depends. From my experience a lot of code bases use a set of "core" data types. I think that it would be beneficial to test them thoroughly, because they are used in so many places. And for that we have to test different scenarios/orders to fully say: "this code works".

To illustrate the example let's quickly implement the AsyncPubSub that I mentioned before. The real implementation would probably be:

  • store a [de]queue of Messages
  • each Message holds:
    • yieldCount: Int - number of consumers at the time of 'yield/send'
    • consumeCount: Int - number of next calls that consumed this message
  • when consumeCount == yieldCount -> [de]queue.popMessage()

This is a bit too complicated for this post, so we will just create a separate AsyncStream for each consumer.

final class AsyncPubSub<Message: Sendable>: Sendable {

  typealias SubscriptionId = UInt64

  final class Subscription: Sendable, AsyncSequence {
    let id: SubscriptionId
    let stream: AsyncStream<Message>
    let pubSub: AsyncPubSub<Message>

    func makeAsyncIterator() -> AsyncStream<Message>.Iterator { self.stream.makeAsyncIterator() }
    func finish() { self.pubSub.unsubscribe(self) }
    deinit { self.finish() }
  }

  private struct State {
    fileprivate var nextId = SubscriptionId.zero
    fileprivate var isFinished = false
    fileprivate var subscribers = [SubscriptionId:AsyncStream<Message>.Continuation]()
  }

  private let state = Locked(State())

  deinit { self.finish() }

  func send(_ event: Message) {
    self.state.lock { state in
      // If 'state.isFinished' -> dictionary is empty.
      for (_, continuation) in state.subscribers {
        continuation.yield(event)
      }
    }
  }

  func subscribe() -> Subscription {
    return self.state.lock { state in
      let id = state.nextId
      state.nextId += 1

      let (stream, continuation) = AsyncStream<Message>.makeStream() // unbounded
      let subscription = Subscription(…)

      if state.isFinished {
        continuation.finish()
      } else {
        state.subscribers[id] = continuation
      }

      return subscription
    }
  }

  func unsubscribe(_ subscription: Subscription) { … }
  func finish() { … }
}

We have the following cases to test:

  • subscribe -> send message -> message is received
  • send message -> subscribe -> nothing is received as we do not send past messages
  • AsyncPubSub.finish -> subscribe -> empty
  • Subscription.finish -> send message -> nothing is received

To test all of this we need some form of "order in the wild west of concurrency". For this I have my own _test_event library:

  • _test_event(object, message) - emit an event
  • try await _test_event_wait(object, message, count: Int = 1) - wait for an event to occur count times. try is for timeout.

A single test looks like this:

func test_subscribe_thenProduce_omNomNom() async throws {
  let bus = AsyncPubSub<Int>()

  // Subscribe -> collect
  try Task.withTimeout {
    let events = bus.subscribe()
    _test_event(bus, "SUBSCRIBE")
    let all = try await events.collect()
    XCTAssertEqual(all, [5, 42, -3])
    _test_event(bus, "DONE")
  }

  // Subscribe -> collect
  try Task.withTimeout {
    let events = bus.subscribe()
    _test_event(bus, "SUBSCRIBE")
    let all = try await events.collect()
    XCTAssertEqual(all, [5, 42, -3])
    _test_event(bus, "DONE")
  }

  // Wait for 2 subscriptions -> send messages
  try Task.withTimeout {
    try await _test_event_wait(bus, "SUBSCRIBE", count: 2)
    bus.send(5)
    bus.send(42)
    bus.send(-3)
    bus.finish()
  }

  try await _test_event_wait(bus, "DONE", count: 2)
}

Unfortunately not everything is as simple as the test above, so sometimes we have to emit a _test_event in the production code. This is "ok", as there is a #if DEBUG check inside. It does not happen often.

Isolate mutable state

Anyway, going back to the actor re-entrance: if the problem is only with state mutation in another actor then sort your properties by mutation correlation, as in "those 2 properties are always mutated together":

actor Thingy {
  let prop1: String
  let prop2: String

  var group1_bool: Bool
  var group1_array: [String]

  var group2_int: Int
}

group1_bool and group1_array are always mutated together, so let's move them to a separate type called BoolArray.

There is a certain movement in programming that adheres to the "Make invalid states unrepresentable" mantra. The idea is that we model our data so that it is not possible to enter an invalid state -> the compiler catches the bugs for us.

Let's say that in our BoolArray example there is a state that will never occur (for example (false, empty array)), we model our newly created type to make it impossible to arrive there.

This way we never forget to modify a property before doing an await call. This means that if we re-enter, our state is always valid. It is not pretty, but it scales nicely with the number of programmers in the project.
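One possible modelling, assuming the only forbidden combination is (false, empty array) as in the example above (the case names and NonEmptyStrings are invented for this sketch):

```swift
// A list of strings that cannot be empty by construction.
struct NonEmptyStrings: Equatable {
  let first: String
  var rest: [String] = []

  var all: [String] { [first] + rest }
}

// Replaces the pair (group1_bool, group1_array).
// There is no way to build the forbidden (false, []) combination.
enum BoolArray: Equatable {
  case on([String])         // flag == true, any array (including empty)
  case off(NonEmptyStrings) // flag == false, at least one element

  var flag: Bool {
    if case .on = self { return true }
    return false
  }

  var items: [String] {
    switch self {
    case .on(let items): return items
    case .off(let items): return items.all
    }
  }
}
```

Both values now change in a single assignment of the enum, so there is no intermediate moment where one has been updated and the other has not.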

Obviously our newly created type (BoolArray) is an actor. Actually… not really. We can make it Sendable with Lock/Mutex.

final class SessionRegistry: Sendable, Sequence {

  private let idToSession = Locked([Session.Id:Session]())

  func get(id: Session.Id) -> Session? {
    self.idToSession.lock { $0[id] }
  }
}

(I will ignore the performance benefits of not using actors, performance NEVER matters. 99% of optimizations are premature.)

Swift stdlib will get Mutex soon (it took 3 years for something that is basically a must-have), but there are tons of custom implementations available. You can grab swift-concurrency-extras/LockIsolated from pointfree.co. (Btw. just a reminder: never ever hold a lock during an async call.)

Anyway, locks are pretty great with Swift concurrency. They are also cheap. In certain cases you will be able to replace actor with final class which removes all of those pesky await calls.
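For reference, a minimal sketch of what a Locked helper like the one used in these snippets could look like, built on NSLock (this is an assumption about its shape, not the actual implementation; HitCounter is just a demo type):

```swift
import Foundation

// '@unchecked Sendable': the compiler cannot verify this, we promise that
// 'value' is only ever touched while the lock is held.
final class Locked<Value>: @unchecked Sendable {
  private let mutex = NSLock()
  private var value: Value

  init(_ value: Value) {
    self.value = value
  }

  // The closure is synchronous on purpose: no 'await' while holding the lock.
  func lock<T>(_ body: (inout Value) throws -> T) rethrows -> T {
    mutex.lock()
    defer { mutex.unlock() }
    return try body(&value)
  }
}

// A 'final class' instead of an 'actor': callers do not need 'await'.
final class HitCounter: Sendable {
  private let hits = Locked(0)

  func record() {
    hits.lock { $0 += 1 }
  }

  var total: Int {
    hits.lock { $0 }
  }
}
```

HitCounter can be shared between tasks and called synchronously, which is where the missing await calls come from.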

It may depend on the coding style, but somehow in my case var properties are extremely rare. Maybe it is because I write a lot of unit tests, and having a mutable state makes testing difficult. IDK. It happened multiple times that I had an actor with a bunch of Sendable let properties which can be converted to final class without any problems.

Global actors

You do not have to await if you run on the same actor.

I have seen people using @MyGlobalActor on a per-property/function basis. For me this is an anti-pattern. It may make sense when you are writing the code, but try going back to it after a few months. It is just sooo… difficult to reason about.

I do not use global actors a lot, but when I do I tend to assign the whole domain to a single actor, so that everything is synchronized. I also include the actor name in the class name.

Example:

@globalActor
actor DatabaseActor: GlobalActor {
  static let shared = DatabaseActor()
}

@DatabaseActor
class Database {}

@DatabaseActor
class DatabaseRead {}
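To show the "no await on the same actor" part, here is a separate small sketch with made-up types (StorageActor, Store, StoreReader), so it does not pretend to be the Database classes above:

```swift
@globalActor
actor StorageActor {
  static let shared = StorageActor()
}

@StorageActor
final class Store {
  private(set) var rows: [String] = []

  func insert(_ row: String) {
    rows.append(row)
  }
}

@StorageActor
final class StoreReader {
  private let store: Store

  init(store: Store) {
    self.store = store
  }

  // Same global actor as 'Store': a plain synchronous call, no 'await'.
  func rowCount() -> Int {
    store.rows.count
  }
}

// Everything inside runs on 'StorageActor', so there is not a single 'await'.
@StorageActor
func seedAndCount() -> Int {
  let store = Store()
  store.insert("a")
  store.insert("b")
  return StoreReader(store: store).rowCount()
}
```

Only code outside the domain has to write await seedAndCount(); inside the domain it reads like ordinary synchronous code.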

Data structures

There is a whole world of data structures designed for concurrent access. A truly massive amount of academic work.

That said, they are not available in Swift, and I would advise not implementing them by hand.


Task.megaYield most certainly is a hack, but when the executor is controlled its behavior is rather predictable: it does seem to guarantee a suspension point and scheduling of the next task.

I don't think I'd call withMainSerialExecutor a hack, but it is dangerous to employ outside of tests unless you know what you're doing. The more recent executor preference APIs will be preferred in the future, but they are still incomplete.

Not according to the dictionary.

What else could "concurrent" mean besides parallel execution? Can you give an example of code where "concurrency" is introduced but it doesn't imply parallel execution of what was previously serial?

Consider a single-threaded environment: it is possible to execute code there concurrently, but never in parallel. In programming, concurrency and parallelism are two distinct, established terms that mean different things. I've shared here my favorite depiction of this difference. I also recommend the talk by Rob Pike on the topic, linked in the post you quote; there are great illustrations.
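A small Swift sketch of the same idea, using an actor as the "single thread" (Worker and runTwoJobs are illustrative names). Both jobs are in flight at the same time and may interleave at every await, yet the actor never runs two of its steps in parallel:

```swift
actor Worker {
  private(set) var log: [String] = []

  func run(_ name: String, steps: Int) async {
    for step in 1...steps {
      // Runs exclusively: the actor executes one piece of work at a time.
      log.append("\(name) step \(step)")
      // Suspension point: the other job may make progress here.
      await Task.yield()
    }
  }
}

func runTwoJobs() async -> [String] {
  let worker = Worker()
  async let jobA: Void = worker.run("A", steps: 3)
  async let jobB: Void = worker.run("B", steps: 3)
  _ = await (jobA, jobB)
  return await worker.log
}
```

The order in which the A and B entries appear in the log is up to the scheduler, but the steps of each job stay in order and no two steps ever overlap.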


I have relatively recently come to the conclusion that both yielding (or sleeping for arbitrary time periods), and replacing schedulers with serial/single thread ones in tests is missing the opportunity to "properly" test asynchronous behavior by directly emulating what users (or clients of the code under test) do. When an asynchronous job is spun off in code (and it's done internally, so you can't await the resulting Task in a test... and presumably you shouldn't, which is why that Task is not exposed to the public), someone cares about what that task does (otherwise why start it?), and therefore has to be notified of when it has completed or produced its result. The test should therefore hook into this same notification. The only caveat is you probably want to put in a timeout so that if the code is broken and the task never finishes, it won't run forever. Technically that's a problem with any test, even fully synchronous ones, it's just rare for a synchronous function to risk blocking forever. It's too bad XCTest doesn't allow the configured timeout to be less than a minute.
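The timeout part can be sketched with a task group (withTimeout and TimeoutError are hypothetical helpers, not an XCTest API). One caveat of this shape: the operation has to react to cancellation, otherwise the group keeps waiting for it after the timeout fires.

```swift
struct TimeoutError: Error {}

func withTimeout<T: Sendable>(
  seconds: Double,
  operation: @escaping @Sendable () async throws -> T
) async throws -> T {
  try await withThrowingTaskGroup(of: T.self) { group in
    group.addTask { try await operation() }
    group.addTask {
      try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
      throw TimeoutError()
    }
    // Whichever child loses the race gets cancelled.
    defer { group.cancelAll() }
    // The first child to finish decides the outcome.
    guard let result = try await group.next() else { throw TimeoutError() }
    return result
  }
}
```

A test would wrap the wait for the notification in withTimeout(seconds:), so a broken notification fails the test quickly instead of hanging.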

It's easy to forget that the notification mechanism exists and is public, and thus available to the test. For example, any spun off task that eventually updates UI, using standard SwiftUI tools, uses the objectWillChange publisher on a view model (usually the object under test), and that is available for the test to subscribe to. If the task finishing doesn't trigger this publisher, the code is broken and the test should fail. Guessing how long it takes to finish, or forcing serial execution of everything, wouldn't catch that.

I wrote about this here (the first parts are about rolling your own fakeable scheduler abstractions, so skip to "Solution 2: Rethink the Problem". Sorry I don't have links to sections, I should probably figure out how to do that).

Parallel doesn't mean multiple cores. It means two tasks are worked on simultaneously, even if that means switching rapidly between them. Are single threaded superscalar processors "parallel" because they have multiple ALUs (and they do in fact execute multiple instructions "simultaneously", but carefully ensure it's always logically equivalent to serial execution)? And when there's a single memory unit, even a multi-core machine is still rapidly switching between tasks in some parts (maybe I'm getting something wrong about how exactly memory bus hardware works, but I'm completely ignorant of those details because they make no difference to how code logically executes).

Threads (and Tasks in Swift Concurrency) are software parallelism, cores are hardware parallelism. The only difference that makes to code is how locking primitives are implemented. In software parallelism, locks can be implemented in software as mutexes. In hardware parallelism, the hardware has to supply locks as atomic instructions. In either situation, code that creates multiple threads/tasks is abandoning guaranteed in-order execution, whether that is achieved through a scheduler loop in software or extra hardware resources.

It remains that "parallel" and "concurrent" are synonyms. Introducing "concurrency" into code can only mean removing order guarantees that were there before introducing concurrency. I once again ask what could it possibly mean to make code "concurrent" but that doesn't mean removing some order guarantee?

@maartene on the topic of actor re-entrancy: I've been thinking about those cases again, and while I agree that writing tests for such cases is a hard if not impossible task (at least if the tests are to be robust), you can design your code and some of the tests to ensure that your code behaves correctly despite any reentrancy.

The key thing is that reentrancy hits around suspension points, so you need to address two cases:

  • avoid relying on state across suspension points, either by eliminating them or by structuring the code in a way that won't be affected by them
  • validate state across suspension points

With the first, you mostly try to avoid introducing suspension points into actor methods as far as possible, and if you have to, design them to sit at the end of the method, or avoid using state after them. This is not something you can really test, but rather a framework for design. I have an assumption (for now just theoretical) that it also might lead to better design decisions.

The second is what you possibly would be able to test, but not necessarily with unit tests. Your goal is to ensure that your program checks the invariant it expects to hold across suspension points; the best tool for the job is assertions, of course. If a violation of the invariant makes it impossible to complete the normal path in the method, then abort, say with an error (I prefer not to make any operation die silently).

In other words, the concern is to ensure that you operate on a valid state at each time, not test for reentrancy itself. If you can be sure that the function is robust against state change across suspensions, you can be sure it will be fine with reentrancy.
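A sketch of validating state across a suspension point (Fighter and its members are made up for illustration; the planning closure stands in for any awaited work):

```swift
actor Fighter {
  enum ActionError: Error, Equatable {
    case knockedOut
  }

  private(set) var isKnockedOut = false
  private(set) var position = 0

  func knockOut() {
    isKnockedOut = true
  }

  func move(by delta: Int, after planning: @Sendable () async -> Void) async throws {
    guard !isKnockedOut else { throw ActionError.knockedOut }
    // Suspension point: a KO message may be processed here.
    await planning()
    // Re-validate the invariant instead of trusting the check above,
    // and fail loudly rather than silently skipping the move.
    guard !isKnockedOut else { throw ActionError.knockedOut }
    position += delta
  }
}
```

This can even be exercised deterministically: pass a planning closure that calls knockOut() on the same fighter, and check that move throws and the position stays unchanged.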

As I've been writing this, I've remembered that Swift has support for single-threaded behaviour of the concurrency runtime via a compilation flag (needs a lookup), which is good for checking on concurrency issues, so I wonder if that one can also be utilized for this goal? Again, that's just an idea that needs exploration.

There is a typo in the linked post: it should say single-threaded. Parallelism at the level of processor instructions is a somewhat different abstraction from what we usually have to deal with. I'm no expert at this level of detail, but for me this is still concurrent execution: processors can execute such a tremendous number of instructions that we cannot distinguish it from actually parallel execution, where different resources each work on a single task without interruptions from others. So, as said, you can have concurrent execution that is not necessarily parallel.

Regarding your illustration, the irony is that because of how computer displays work, that top line is actually very rapidly switching between red and black (the green and blue subpixels next to the red subpixel of each pixel are off), but it is happening with such a high (spatial) frequency, it looks like a solid red line to you.

The difference between those two lines is of degree, not of kind. Both of them are switching on and off. One of them is doing so less frequently. It's a difference to be sure, but there is no objective frequency that, when crossed, changes the nature of it. Any distinction drawn between a line that switches on and off very rapidly, and one that switches on and off less rapidly, is arbitrary (especially when the places where the switches occur can't be controlled). There are no clearly defined two categories here.

The same is true in time just as it is in space. Movies played on a projector are rapidly switching on and off, but it happens so fast it appears as a steady projection. If I could move fast enough, I could perform two tasks for you "simultaneously" by switching between them so fast you actually see two copies of me in two different places "at once" (though I guess they'd both have 50% opacity).

And what if you studied computer hardware and discovered that multi-core machines work by the clock crystal sending an electromagnetic pulse through the conductors that moves through each core, one at a time, so that all N cores perform a clock cycle's worth of work on each clock cycle, but very zoomed in analysis of the timing reveals that there is serial execution, where core 1 performs on the first 1/N slice of the clock cycle, then the second core performs on the second 1/N slice of the clock cycle, etc.?

Would that realization that computer hardware actually works that way on a physical level change everything and reveal that no computers ever perform "parallel" execution?

I understand the difference. I'm saying it doesn't matter and might even be an artificial fabrication of our minds. The stipulation that parallel execution means "truly" simultaneous in some absolute sense requires us to make assertions about physics I'm not even sure we ever could. What if the universe is a single core simulation that executes each "tick" of time by nudging each physical object forward one at a time, but the time unit is a tiny fraction of the Planck time unit, so we'll certainly never be able to watch this serial execution? The only thing that matters is that all objects make forward progress "together". That's what it means that all objects move "in parallel".


This is also how I tend to model things. Concurrency is about taking a total ordering of events and finding places to weaken it to a partial ordering without sacrificing the desired results; it's not a coincidence that the lowest level synchronization primitives generally take a "memory ordering" parameter.


I think this is the source of your confusion. These terms are not defined over the real world, they are defined over models and are not concerned with how processors work or anything like that. Models are inherently abstract.

The unfortunate naming also doesn't help to "feel" the distinction. You can replace the word "concurrent" with "overlapping", which, I think, conveys the accepted meaning of that word in CS better. The execution of two concurrent tasks can overlap, but that does not necessitate or imply parallelism.
