Reliably testing code that adopts Swift Concurrency?

That does seem to point to a general problem that's also captured by Stephen's original example—perhaps the way SwiftUI's task modifier is designed makes it too easy to entangle the "setup" phase from the "execution" phase for testing purposes. Buffering should ideally not be something you need to rely on for testing either. A developer can factor the task code into two separate async methods—one which sets up the subscriptions and doesn't return until they're ready to go, and another that runs the continuous business logic that consumes the subscriptions—and that would allow for a test to call the setup phase, schedule its own test events, and then execute the business logic, but that requires some forethought on the user's behalf. Maybe a better design would be for there to be a variant of task modifier that takes a value conforming to a protocol with distinct setUp and run methods, to encourage clients to factor their code that way up front.

1 Like

We definitely agree that AsyncStream isn't 100% ideal, but it's gotten us far with the least amount of production code invasiveness. Assuming folks know the rough edges and expectations of how mock streams are interacted with in tests, I think it's a fair trade-off, and the buffering seems essential to testing these things without any other serial guarantees (or the ability to "flush" an executor's queue).

Forgive me for putting more work on you to explain, but I'm not quite seeing how this would help. Can you sketch out some code or pseudocode that shows how it could apply to the screenshot example?

In your original example, you could factor task() into something like this:

class ViewModel: ObservableObject {
  @Published var count = 0
  var progress: AsyncChannel<()>?

  @MainActor
  func setUpTask() async -> NotificationCenter.Notifications {
    return NotificationCenter.default.notifications(
      named: UIApplication.userDidTakeScreenshotNotification
    )
  }

  @MainActor
  func runTask(screenshots: NotificationCenter.Notifications) async {
    for await _ in screenshots {
      self.count += 1
      await progress?.send(())
    }
  }

  func task() async {
    runTask(screenshots: await setUpTask())
  }
}

which would allow you to write the test setup as:

@MainActor
func testBasics() async throws {
  let vm = ViewModel()
  let vmProgress = AsyncChannel<()>()
  vm.progress = vmProgress

  // Set up the subscription in the ViewModel.
  let screenshots = await vm.setUpTask()
  XCTAssertEqual(vm.count, 0)

  // Kick off the main loop of the ViewModel task.
  async let task = vm.runTask(screenshots: screenshots)

  // Simulate a screen shot being taken.
  NotificationCenter.default.post(
    name: UIApplication.userDidTakeScreenshotNotification, object: nil
  )

  // Wait for the task to post progress through the loop.
  await vmProgress.next()

  XCTAssertEqual(vm.count, 1)
}

Since the setUp is a separate async function, your test can call it and let it complete before posting its own notifications, allowing you to avoid the first implicit scheduling dependency. I preserved the AsyncChannel for posting progress in the main loop to deal with the second scheduling dependency here, but runTask should also at this point be straightforward to express as an AsyncStream that takes the notification stream as an input and turns it into a running count, like you might in Combine, which would let you test the stream's behavior in isolation.

1 Like

Unfortunately, this doesn't solve the problem. The post can still happen before the subscription, at which point the channel will get stuck and hang the test forever, which took 78 iterations for me to hit:

…
Test Case '-[Tests testBasics]' started (Iteration 77 of 10000).
Test Case '-[Tests testBasics]' passed (0.002 seconds).
Test Case '-[Tests testBasics]' started (Iteration 78 of 10000).

That's unfortunate. Like I said before, we should provide an API that lets code wait for the subscription to be ready before proceeding for testing purposes:

If you were going to factor runTask into an AsyncStream transformation, though, then you could also make this easier to test by having the input stream a generic AsyncSequence instead of requiring it to be an NSNotification.Notifications sequence, which would let unit tests provide their own stream under their control.

I'm not quite sure how that would work under the hood, but if this particular API could be fixed that'd be nice!

I think this depends on any AsyncSequence<Element> throws/nothrows (or AnyAsyncSequence/AnyThrowingAsyncSequence) to be viable. Adding a generic per async sequence dependency to each view and view model becomes quickly untenable.

FWIW I've defined a helper that effectively uses Async(Throwing)Stream as type-erasers given an async sequence, so converting Notification.Notifications is a one-liner at least. It's just not as safe as it could be since the @rethrows effect can't be tracked right now.

2 Likes

@stephencelis Have you considered passing, say, an UnsafeContinuation<Void, Never> into the the Task and resuming it when task start up has completed? This is the only way I've been able to get a Task to to notify on startup and is what I'm using in the CheckedExpectation I posted a link to way up thread. It's the only way I've found to reliably assure notification of task startup completion.

I think for the example I posted there's a chicken-egg problem if you intend to pass an optional continuation to the vm.task() endpoint. An AsyncStream.Continuation or AsyncChannel get around the problem, but that's basically back to Joe's solution.

True, I was basically only getting around the initial setup and not trying to test the entire stream. Basically, a whole layer of assertions around AsyncChannel is going to be the only way to do this right now.

I think this is absolutely key. I suggest what the example and investigations here in the thread have established is that there is a clear flaw in the implementation of AsyncSequence.

It is a reasonable expectation of this type that when you await its initialization, it does not return until it is listening to the events it will handle. What is required is a bug report/evolution proposal to request that AsyncSequence behaves as would be expected in an async-await world. With that, no additional testing tools or infrastructure would be required for the AsyncSequence example.

Later: On reflection, it seems more likely that the implementation fault might lie with NotificationCentre's implementation of AsyncSequence in NotificationCenter.Notifications rather than in the AsyncSequence protocol per se.

Yeah, I think this is the case. AsyncSequence as a protocol can admit varying implementations, and it appears to be an undocumented quirk of NotificationCenter.Notifications that the underlying subscription isn't established before the sequence value is returned to the caller, making it impossible to order test notifications in relation to the subscription.

2 Likes

A good story often has an epilogue.

As things seemed to point to NotificationCentre.Notifications as the culprit in the hard-to-test code here, I wrote a minimal test to demonstrate the bug, intending to submit a Feedback report to the team. Not difficult to test in a nicely encapsulated test:

func test_notificationCentreNotifications_doesNotDisplayRaceCondition() async throws {
    // Setup test expectation
    let expectation = self.expectation(description: "Notification received.")
    let fulfillExpectation = { expectation.fulfill() }
    
    // Initiate the NotificationCentre.Notifications AsyncSequence
    let asyncSequenceTask = Task {
        let screenshotSequence = await NotificationCenter.default.notifications(
            named: UIApplication.userDidTakeScreenshotNotification
        )
        for await _ in screenshotSequence {
            fulfillExpectation()
        }
    }
    
    // Post the notification created when a screenshot occurs.
    _ = await MainActor.run {
        Task.detached(priority:.userInitiated, operation: {
            await NotificationCenter.default.post(name: UIApplication.userDidTakeScreenshotNotification, object: nil)
        })
    }

    // Test expectations
    await self.waitForExpectations(timeout: 1)
    asyncSequenceTask.cancel()
}

Just one "problem":

Executed 100000 tests, with 0 failures (0 unexpected) in 138.589 (203.695) seconds

OK. So, our diagnosis above, pointing the finger at the internal initialization of NotificationCentre.Notifications seemed plausible, but clearly isn't correct.

So we’re back answering the original topic of the thread, a focus on the testability of async-await, with the original question “So how are we supposed to test code like this?”. Effective problem solving requires clear problem definition. And while it might not be initialization of NotificationCentre.Notifications that was the issue, it still seems like an initialization issue was the problem that was making the code untestable.

This is the insight. While the ViewModel was initialized synchronously, it originally only created its AsyncSequence to monitor for screenshots when the async method task() was called. That could be the cause of a race condition. What if we moved the creation of that AsyncSequence to occur as part of the initialization of the ViewModel, and also kick off monitoring during init()?

@MainActor
class ViewModel: ObservableObject {
    @Published var count = 0
    let screenshots = NotificationCenter.default.notifications(named: UIApplication.userDidTakeScreenshotNotification)
    var screenshotMonitor: Task<(), Never>?
    
    init() {
        Task { await monitorForScreenshots() }
    }

    func monitorForScreenshots() async {
        screenshotMonitor = Task {
            for await _ in screenshots {
                self.count += 1
            }
        }
    }
    
    func cancelMonitoring() {
        screenshotMonitor?.cancel()
    }
}

The test for this is then pretty straightfoward, directly examines the count property on the ViewModel, and does not use dependency injection:

func test_selfInitializedViewModel_doesNotDisplayRaceCondition() async throws {
    // Initialize the ViewModel
    let viewModel = await ViewModel()
    
    // Baseline expectations
    await MainActor.run { // MainActor.run required because ViewModel is @MainActor given @Published count
        XCTAssertEqual(viewModel.count, 0, "Expected 0 count, found \(viewModel.count)")
    }
    
    // Post the notification created when a screenshot occurs, then wait till it has been issued.
    _ = await expectation(forNotification: UIApplication.userDidTakeScreenshotNotification, object: nil)
    await NotificationCenter.default.post(name: UIApplication.userDidTakeScreenshotNotification, object: nil)
    await waitForExpectations(timeout: 1)
    
    // Test expectations
    await MainActor.run { // MainActor.run required because ViewModel is @MainActor given @Published count
        XCTAssertEqual(viewModel.count, 1, "Expected 1 count, found \(viewModel.count)")
    }
    
    // Cleanup
    await viewModel.cancelMonitoring()
}

Let’s be really sure. Result of running this test one million times:

Executed 1000000 tests, with 0 failures (0 unexpected) in 2274.146 (3139.190) seconds

What I’ve learned is: to ensure AsyncSequence code is testable, your production code should initialize your AsyncSequence synchronously, as part of its owner’s initiailzation. This also makes the production code slightly more reliable (eliminates the possibility of missing an early screenshot during that async startup), though functionally this probably would be a non-issue.

8 Likes

Not resurrecting this thread to suggest relying on Task.yield() is a good idea (though I imagine folks will rely on it till there are better tools to assist testing async code), but I did notice that this behavior has gotten much worse in the Xcode 14 betas. Previously, the single Task.yield() would result in a passing test 99% of the time, but now if fails far more often than not. I need to do a "mega" yield for it to consistently pass on my machine:

await Task.detached(priority: .low) { await Task.yield() }.value
await Task.detached(priority: .low) { await Task.yield() }.value
await Task.detached(priority: .low) { await Task.yield() }.value

This is just to let the team know that there may be performance regressions in the latest concurrency runtime.

9 Likes

Unfortunately, this became an even bigger problem in Xcode 14.3 RC (14E222a). My tests that previously (Xcode 14.2) passed without the need to use await Task.yield() - now fails randomly.

What's worse, the yield workaround proposed by @stephencelis does not work anymore for me on Xcode 14.3 RC. Previously I used yield workaround + a one-second timeout for some of my tests, which fixed issues on a slow CI server. Now, even with the timeout increased to 10 seconds and thousands of yields, my tests are failing from time to time on M1 Max MacBook.

Before swift-concurrency it was hard to test asynchronous code. Now, it's impossible.

3 Likes

With AsyncSequence you don’t really know when the counter is incremented. And that’s actually a part of the ViewModel’s contract. It is not correct to test that ViewModel increments the counter right after the notification, because it’s not what it is doing. Instead what ViewModel can promise is that it will increment the counter some time later, and will let you know when it does.

And it notifies using objectWillChange. So, I think test should be waiting for objectWillChange to fire. In the end it is not really testing the async stuff, it tests the view model with sync public API.

class ObjectDidChangeExpectation: XCTestExpectation {
    var cancellable: AnyCancellable?
    var wasScheduled: Bool = false
    
    init<T: ObservableObject>(_ object: T) {
        super.init(description: "ObjectDidChangeExpectation(\(object))")
        self.cancellable = object.objectWillChange.sink(receiveValue: { [weak self] _ in
            guard let self = self else { return }
            if self.wasScheduled { return }
            self.wasScheduled = true
            DispatchQueue.main.async { [weak self] in
                guard let self = self else { return }
                self.fulfill()
                self.cancellable = nil
            }
        })
    }
}

final class ViewModelTests: XCTestCase {
    func testBasics() {
        let vm = ViewModel()
        let completion = XCTestExpectation(description: "End of task")
        let task = Task {
            await vm.task()
            completion.fulfill()
        }
        
        XCTAssertEqual(vm.count, 0)
        
        let ex = ObjectDidChangeExpectation(vm)
        
        // Simulate a screen shot being taken.
        NotificationCenter.default.post(
            name: UIApplication.userDidTakeScreenshotNotification, object: nil
        )
        
        self.wait(for: [ex], timeout: 1)
        
        XCTAssertEqual(vm.count, 1)
        
        task.cancel()
        self.wait(for: [completion], timeout: 1)
    }
}

But, it does not work. My first suspicion was that XCTestCase.wait(for:timeout:) does not drain main loop. So, I reimplemented it using continuations instead of expectations:

extension ObservableObject {
    @MainActor
    func waitingForChange(_ action: () -> Void) async {
        await withUnsafeContinuation { (continuation: UnsafeContinuation<Void, Never>) in
            var pendingContinuation: Optional<_> = continuation
            var cancellable: AnyCancellable?
            cancellable = self.objectWillChange.sink(receiveValue: { _ in
                if let cont = pendingContinuation {
                    pendingContinuation = nil
                    DispatchQueue.main.async {
                        cancellable?.cancel()
                        cancellable = nil
                        cont.resume()
                    }
                }
            })
            action()
        }
    }
}

final class ViewModelTests: XCTestCase {    
    @MainActor
    func testBasics() async {
        let vm = ViewModel()
        let task = Task { await vm.task() }
        
        XCTAssertEqual(vm.count, 0)
        
        await vm.waitingForChange {
            // Simulate a screen shot being taken.
            NotificationCenter.default.post(
                name: UIApplication.userDidTakeScreenshotNotification, object: nil
            )
        }
        
        XCTAssertEqual(vm.count, 1)
        
        task.cancel()
        await task.value
    }
}

But it does not work either. After debugging the code, I've realised that the actual problem is that starting a task to observe notifications is racy. Any notifications posted before vm.task() starts to execute are ignored. So both versions of the test above are working fine. They are failing because there is a bug in the SUT. The fix would be to use Combine or raw NotificationCenter API. I don't think this bug can be fixed using async API.

My learning from this case, it to not use NotificationCenter.Notifications at all.

1 Like

Is this problem even avoidable by a hypothetical better implementation of NotificationCenter.Notifications or is it inherent to all "hot" AsyncSequences and the fact that there's no way for an awaited async function to perform some initial logic synchronously?

Probably similar question, without any acknowledgment that there is an issue, or any suggested technique or workaround: [Pitch] Observation (Revised) - #43 by gwendal.roue

3 Likes

Thanks for cross-referencing that, and yes I think it's the same root issue and a real shortcoming of async/await.

It is difficult to know when observation of an AsyncSequence has started. Typically, even from an async context, you spin up a Task to do the observation. And you don't know when that task starts. We've unfortunately reverted to Combine and publishers for some of this.

2 Likes

Just wanted to share an update, as we've continued to write more and more tests for code that uses Swift concurrency. Introducing Swift concurrency to a feature continues to be the biggest cause of test flakiness in our experience, but we have managed to reduce flakiness dramatically by cribbing some helpers from swift-async-algorithms, in particular, a "C async support" module, which exposes a global hook override for task enqueuing, which we use to redirect everything to the main, serial executor:

Whenever we have a flakey continuous integration failure, we wrap the test with this helper and don't typically have a concurrency-based problem with that test again. As a bonus, the test runs more quickly.

The solution is far from perfect, but has saved us from a ton of pain, and we think it basically makes async code behave more like Combine code (i.e. well-defined "subscription" order), and hence becomes a lot more reliable to test.

I believe it will probably only work for code that does not use custom executors, but that should be the case for most folks right now. We also haven't tried to include this code in library/application code yet, but if Swift doesn't provide a public solution to the problem, we'll likely look into extracting this helper into its own package, which should make it easier to drop into a project.

20 Likes