Reliably testing code that adopts Swift Concurrency?

It's definitely true that, in the span of a few microseconds between the view appearing and the subscription starting, a screenshot could be taken and not counted. I think I'm OK with my test ignoring that edge case and focusing only on the happy path, where screenshots happen only after the subscription is started.

That's definitely true of a Wild West world of concurrency and non-determinism, but the example posted in this thread, in my opinion, does not fall into that category. Neither do a lot of very common situations one comes across in application development. For example, none of the case studies and demos in TCA (of which there are dozens, many involving long-living and interleaving effects) fall into this category either.

However, if I were writing a low-level systems library that could have thousands of concurrent users, then I would not use a serial executor and assume it reflects real-life behavior. That would be a great use case for property-based testing or fuzz testing.

4 Likes

That does seem to point to a general problem that's also captured by Stephen's original example: perhaps the way SwiftUI's task modifier is designed makes it too easy to entangle the "setup" phase with the "execution" phase for testing purposes. Ideally, buffering shouldn't be something you need to rely on for testing either. A developer can factor the task code into two separate async methods: one that sets up the subscriptions and doesn't return until they're ready to go, and another that runs the continuous business logic consuming those subscriptions. That would allow a test to call the setup phase, schedule its own test events, and then execute the business logic, but it requires some forethought on the user's part. Maybe a better design would be a variant of the task modifier that takes a value conforming to a protocol with distinct setUp and run methods, to encourage clients to factor their code that way up front.

1 Like

We definitely agree that AsyncStream isn't 100% ideal, but it's gotten us far with the least amount of production code invasiveness. Assuming folks know the rough edges and expectations of how mock streams are interacted with in tests, I think it's a fair trade-off, and the buffering seems essential to testing these things without any other serial guarantees (or the ability to "flush" an executor's queue).
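To make the buffering point concrete, here is a minimal, hypothetical sketch (not the thread's exact code): a test-owned AsyncStream stands in for the real event source, and its default unbounded buffering is what keeps an early event from being dropped.

```swift
// Hypothetical sketch: a test-owned stream of "screenshot" events.
// AsyncStream uses an unbounded buffering policy by default, so a value
// yielded before anyone iterates is held and delivered once iteration begins.
var continuation: AsyncStream<Void>.Continuation!
let screenshots = AsyncStream<Void> { continuation = $0 }

// Yielded before the consumer exists: buffered, not lost.
continuation.yield(())
continuation.finish()

var count = 0
for await _ in screenshots {
    count += 1
}
// count is now 1, even though the yield preceded iteration.
```

This is exactly the property mock streams in tests lean on when there are no other serial guarantees.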

Forgive me for putting more work on you to explain, but I'm not quite seeing how this would help. Can you sketch out some code or pseudocode that shows how it could apply to the screenshot example?

In your original example, you could factor task() into something like this:

import AsyncAlgorithms  // for AsyncChannel (swift-async-algorithms package)
import Combine          // for ObservableObject
import UIKit            // for UIApplication

class ViewModel: ObservableObject {
  @Published var count = 0
  var progress: AsyncChannel<()>?

  @MainActor
  func setUpTask() async -> NotificationCenter.Notifications {
    return NotificationCenter.default.notifications(
      named: UIApplication.userDidTakeScreenshotNotification
    )
  }

  @MainActor
  func runTask(screenshots: NotificationCenter.Notifications) async {
    for await _ in screenshots {
      self.count += 1
      await progress?.send(())
    }
  }

  func task() async {
    await runTask(screenshots: setUpTask())
  }
}

which would allow you to write the test setup as:

@MainActor
func testBasics() async throws {
  let vm = ViewModel()
  let vmProgress = AsyncChannel<()>()
  vm.progress = vmProgress

  // Set up the subscription in the ViewModel.
  let screenshots = await vm.setUpTask()
  XCTAssertEqual(vm.count, 0)

  // Kick off the main loop of the ViewModel task.
  async let task = vm.runTask(screenshots: screenshots)

  // Simulate a screen shot being taken.
  NotificationCenter.default.post(
    name: UIApplication.userDidTakeScreenshotNotification, object: nil
  )

  // Wait for the task to post progress through the loop.
  var progress = vmProgress.makeAsyncIterator()
  _ = await progress.next()

  XCTAssertEqual(vm.count, 1)
}

Since the setUp is a separate async function, your test can call it and let it complete before posting its own notifications, allowing you to avoid the first implicit scheduling dependency. I preserved the AsyncChannel for posting progress in the main loop to deal with the second scheduling dependency. But at this point runTask should also be straightforward to express as an AsyncStream that takes the notification stream as input and turns it into a running count, like you might in Combine, which would let you test the stream's behavior in isolation.

1 Like

Unfortunately, this doesn't solve the problem. The post can still happen before the subscription, at which point the channel will get stuck and hang the test forever, which took 78 iterations for me to hit:

…
Test Case '-[Tests testBasics]' started (Iteration 77 of 10000).
Test Case '-[Tests testBasics]' passed (0.002 seconds).
Test Case '-[Tests testBasics]' started (Iteration 78 of 10000).

That's unfortunate. As I said before, we should provide an API that lets code wait for the subscription to be ready before proceeding, for testing purposes:

If you were going to factor runTask into an AsyncStream transformation, though, then you could also make this easier to test by making the input stream a generic AsyncSequence instead of requiring it to be a NotificationCenter.Notifications sequence, which would let unit tests provide their own stream under their control.
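As a rough sketch of that generic factoring (hypothetical names, not code from earlier in the thread), the counting logic can be written against any AsyncSequence, so a unit test can drive it with a stream it fully controls:

```swift
// Hypothetical sketch: the business logic as a transformation over any
// AsyncSequence, with no dependency on NotificationCenter.
func runningCount<S: AsyncSequence>(of events: S) -> AsyncStream<Int> {
    AsyncStream { continuation in
        let task = Task {
            var count = 0
            do {
                for try await _ in events {
                    count += 1
                    continuation.yield(count)
                }
            } catch {}  // a generic AsyncSequence's iteration may throw
            continuation.finish()
        }
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

A test can then feed it fake events via an AsyncStream and assert on the counts it emits, with no notifications involved.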

I'm not quite sure how that would work under the hood, but if this particular API could be fixed that'd be nice!

I think this depends on any AsyncSequence<Element> with throws/nothrows variants (or AnyAsyncSequence/AnyThrowingAsyncSequence) to be viable. Adding a generic parameter per async-sequence dependency to each view and view model quickly becomes untenable.

FWIW I've defined a helper that effectively uses Async(Throwing)Stream as a type-eraser for a given async sequence, so converting NotificationCenter.Notifications is a one-liner at least. It's just not as safe as it could be, since the @rethrows effect can't be tracked right now.
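For reference, one possible shape for such a helper (a hypothetical reconstruction, not the poster's actual code). As noted, it is lossy about errors: because @rethrows can't be tracked, a throwing sequence's error just ends the stream.

```swift
extension AsyncStream {
    /// Type-erases any AsyncSequence with matching elements by draining it
    /// in a child task. A thrown error silently finishes the stream.
    init<S: AsyncSequence>(erasing sequence: S) where S.Element == Element {
        self.init { continuation in
            let task = Task {
                do {
                    for try await element in sequence {
                        continuation.yield(element)
                    }
                } catch {}
                continuation.finish()
            }
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}

// Converting the notification sequence is then a one-liner, e.g.:
// let screenshots = AsyncStream(
//   erasing: NotificationCenter.default.notifications(
//     named: UIApplication.userDidTakeScreenshotNotification
//   )
// )
```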

2 Likes

@stephencelis Have you considered passing, say, an UnsafeContinuation<Void, Never> into the Task and resuming it when task startup has completed? It's the only way I've found to reliably get a Task to notify on startup, and it's what I'm using in the CheckedExpectation I posted a link to way up thread.
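A rough, platform-neutral sketch of that handshake (hypothetical names, simplified from the CheckedExpectation approach): the caller suspends on a continuation that the task resumes as soon as it's running, so by the time the call returns, the task has definitely started.

```swift
// Hypothetical sketch of the startup handshake.
func startCounting(
    events: AsyncStream<Void>,
    onEvent: @escaping @Sendable () -> Void
) async -> Task<Void, Never> {
    var monitor: Task<Void, Never>!
    await withUnsafeContinuation { (started: UnsafeContinuation<Void, Never>) in
        monitor = Task {
            // Signal the caller that the task is running; iteration begins
            // immediately afterward, and AsyncStream's buffering covers the
            // remaining gap before the first await.
            started.resume()
            for await _ in events {
                onEvent()
            }
        }
    }
    return monitor
}
```

Note that "running" is not quite the same as "subscribed," so this only fully closes the race when the event source buffers.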

I think for the example I posted there's a chicken-and-egg problem if you intend to pass an optional continuation to the vm.task() endpoint. An AsyncStream.Continuation or AsyncChannel gets around the problem, but that's basically back to Joe's solution.

True, I was basically only getting around the initial setup and not trying to test the entire stream. Basically, a whole layer of assertions around AsyncChannel is going to be the only way to do this right now.

I think this is absolutely key. What the example and the investigations in this thread have established, I'd suggest, is that there is a clear flaw in the implementation of AsyncSequence.

It is a reasonable expectation of this type that when you await its initialization, it does not return until it is listening for the events it will handle. What is required is a bug report/evolution proposal requesting that AsyncSequence behave as would be expected in an async/await world. With that, no additional testing tools or infrastructure would be required for the AsyncSequence example.

Later: On reflection, it seems more likely that the fault lies with NotificationCenter's implementation of AsyncSequence in NotificationCenter.Notifications rather than in the AsyncSequence protocol per se.

Yeah, I think this is the case. AsyncSequence as a protocol can admit varying implementations, and it appears to be an undocumented quirk of NotificationCenter.Notifications that the underlying subscription isn't established before the sequence value is returned to the caller, making it impossible to order test notifications in relation to the subscription.

1 Like

A good story often has an epilogue.

As things seemed to point to NotificationCenter.Notifications as the culprit in the hard-to-test code here, I wrote a minimal test to demonstrate the bug, intending to submit a Feedback report to the team. It's not difficult to exercise in a nicely encapsulated test:

func test_notificationCentreNotifications_doesNotDisplayRaceCondition() async throws {
    // Setup test expectation
    let expectation = self.expectation(description: "Notification received.")
    let fulfillExpectation = { expectation.fulfill() }
    
    // Start iterating the NotificationCenter.Notifications AsyncSequence
    let asyncSequenceTask = Task {
        let screenshotSequence = NotificationCenter.default.notifications(
            named: UIApplication.userDidTakeScreenshotNotification
        )
        for await _ in screenshotSequence {
            fulfillExpectation()
        }
    }
    
    // Post the notification created when a screenshot occurs.
    _ = await MainActor.run {
        Task.detached(priority: .userInitiated) {
            NotificationCenter.default.post(name: UIApplication.userDidTakeScreenshotNotification, object: nil)
        }
    }

    // Test expectations
    await self.waitForExpectations(timeout: 1)
    asyncSequenceTask.cancel()
}

Just one "problem":

Executed 100000 tests, with 0 failures (0 unexpected) in 138.589 (203.695) seconds

OK. So our diagnosis above, pointing the finger at the internal initialization of NotificationCenter.Notifications, seemed plausible, but clearly isn't correct.

So we’re back to the original topic of the thread, the testability of async/await, and the original question: “So how are we supposed to test code like this?” Effective problem solving requires clear problem definition. And while it might not be the initialization of NotificationCenter.Notifications that was at fault, it still seems that an initialization issue was what made the code untestable.

This is the insight. While the ViewModel was initialized synchronously, it originally only created the AsyncSequence that monitors for screenshots when the async method task() was called. That could be the cause of a race condition. What if we moved the creation of that AsyncSequence into the ViewModel's initialization, and also kicked off monitoring during init()?

import Combine  // for ObservableObject
import UIKit    // for UIApplication

@MainActor
class ViewModel: ObservableObject {
    @Published var count = 0
    let screenshots = NotificationCenter.default.notifications(named: UIApplication.userDidTakeScreenshotNotification)
    var screenshotMonitor: Task<(), Never>?
    
    init() {
        Task { await monitorForScreenshots() }
    }

    func monitorForScreenshots() async {
        screenshotMonitor = Task {
            for await _ in screenshots {
                self.count += 1
            }
        }
    }
    
    func cancelMonitoring() {
        screenshotMonitor?.cancel()
    }
}

The test for this is then pretty straightforward: it directly examines the count property on the ViewModel and does not use dependency injection:

func test_selfInitializedViewModel_doesNotDisplayRaceCondition() async throws {
    // Initialize the ViewModel
    let viewModel = await ViewModel()
    
    // Baseline expectations
    await MainActor.run { // MainActor.run required because ViewModel is @MainActor given @Published count
        XCTAssertEqual(viewModel.count, 0, "Expected 0 count, found \(viewModel.count)")
    }
    
    // Post the notification created when a screenshot occurs, then wait until it has been received.
    _ = expectation(forNotification: UIApplication.userDidTakeScreenshotNotification, object: nil)
    NotificationCenter.default.post(name: UIApplication.userDidTakeScreenshotNotification, object: nil)
    await waitForExpectations(timeout: 1)
    
    // Test expectations
    await MainActor.run { // MainActor.run required because ViewModel is @MainActor given @Published count
        XCTAssertEqual(viewModel.count, 1, "Expected 1 count, found \(viewModel.count)")
    }
    
    // Cleanup
    await viewModel.cancelMonitoring()
}

Let’s be really sure. Result of running this test one million times:

Executed 1000000 tests, with 0 failures (0 unexpected) in 2274.146 (3139.190) seconds

What I’ve learned is: to ensure AsyncSequence code is testable, your production code should create its AsyncSequence synchronously, as part of its owner’s initialization. This also makes the production code slightly more reliable (it eliminates the possibility of missing an early screenshot during that async startup), though functionally this would probably be a non-issue.

8 Likes

Not resurrecting this thread to suggest that relying on Task.yield() is a good idea (though I imagine folks will rely on it till there are better tools to assist testing async code), but I did notice that this behavior has gotten much worse in the Xcode 14 betas. Previously, a single Task.yield() would result in a passing test 99% of the time, but now it fails far more often than not. I need to do a "mega" yield for it to consistently pass on my machine:

await Task.detached(priority: .low) { await Task.yield() }.value
await Task.detached(priority: .low) { await Task.yield() }.value
await Task.detached(priority: .low) { await Task.yield() }.value

This is just to let the team know that there may be performance regressions in the latest concurrency runtime.

4 Likes