A New Approach to Testing in Swift

My first exposure to the idea of tests living next to implementations was D's unittest {} blocks. From what I recall, they had an argument for including tests in release and production builds: partly to test for configuration-dependent issues, as Max noted, and partly because tests can be a useful diagnostic tool in deployed environments, where something about the environment may break things in a way that automated tests could catch if they can be run there directly. Clearly there should be the possibility of leaving tests out of the build, but that seems like another good argument for keeping the test configuration separable from debug/release.

5 Likes

I suggest not tying test compilation / inclusion to build mode. Sometimes you can only reproduce a problem by building your unit tests in release mode. Some tests only make sense in release mode (e.g. benchmarks).

6 Likes

It's great to see that Swift is going to receive a first-class testing solution! :)

I have two topics which, in my opinion, would improve working with it on a daily basis.

Simplified test setup

Currently, there are two common approaches for test setup:

Using setUp and tearDown methods:

private var networkClientSpy: NetworkClientSpy!
private var service: Service!

override func setUp() {
    networkClientSpy = NetworkClientSpy()
    service = Service(client: networkClientSpy)
}

// Instance-level tearDown: the class-level variant cannot reach instance properties.
override func tearDown() {
    networkClientSpy = nil
    service = nil
}

func testWhenFetchDataThenClientRequestCalled() async {
    // when
    await service.fetchData()

    // then
    XCTAssertEqual(networkClientSpy.requestCalls, 1)
}

However, it creates boilerplate code that is required even for trivial setups (like the one above).
This approach relies heavily on implicitly unwrapped optionals, which:

  • are prone to errors (a developer may accidentally forget to assign a value to the property in the setUp method)
  • may be flagged as false positives by linters such as SwiftLint
  • hide mistakes from the compiler (the ideal would be: if it compiles, it works)
  • require explicit cleanup (setting the properties to nil) to keep memory usage from growing (a property initialised for a test method otherwise stays in memory until the whole test suite completes).

Initialise XCTestCase's properties directly

private let networkClientSpy = NetworkClientSpy()
private lazy var service = Service(client: networkClientSpy)

func testWhenFetchDataThenClientRequestCalled() async {
    // when
    await service.fetchData()

    // then
    XCTAssertEqual(networkClientSpy.requestCalls, 1)
}

Because XCTestCase initialises a new instance of itself for every test method run, each networkClientSpy and service is unique to a given test method. Unfortunately, it has one drawback: the instance is not freed from memory after the test method finishes. The XCTestCase instance is freed only when the whole test suite is done.

The second approach significantly reduces the boilerplate, but it will only be practical once the XCTestCase instance is freed from memory immediately after the test method finishes.
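
As a rough illustration of the behaviour the second approach needs, here is a minimal sketch that does not use XCTest at all. ServiceFixture, InstanceCounter and runTest are hypothetical names invented for this example, and request() is made synchronous to keep it short. Each test gets a fresh fixture, and the fixture is released as soon as the test body returns.

```swift
protocol NetworkClient {
    func request()
}

final class NetworkClientSpy: NetworkClient {
    private(set) var requestCalls = 0
    func request() { requestCalls += 1 }
}

final class Service {
    private let client: NetworkClient
    init(client: NetworkClient) { self.client = client }
    func fetchData() { client.request() }
}

// Hypothetical helper: tracks how many fixtures are currently alive.
final class InstanceCounter {
    var live = 0
}

// Hypothetical stand-in for a test case instance with directly initialised properties.
final class ServiceFixture {
    let networkClientSpy = NetworkClientSpy()
    lazy var service = Service(client: networkClientSpy)
    private let counter: InstanceCounter

    init(counter: InstanceCounter) {
        self.counter = counter
        counter.live += 1
    }

    deinit { counter.live -= 1 }
}

// Hypothetical runner: creates a fresh fixture per test and drops it on return.
func runTest(counter: InstanceCounter, _ body: (ServiceFixture) -> Bool) -> Bool {
    let fixture = ServiceFixture(counter: counter)
    return body(fixture)
}
```

With a runner shaped like this, no tearDown is needed: once runTest returns, nothing references the fixture any more, so the spy and the service are deallocated with it.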

Automated test doubles creation

Use macros to create test doubles automatically. It is also important to follow good practice and use correct test double names (here is a good reference by Martin Fowler), rather than generalising everything under a Mock name.
I would focus on Stub and Spy (and maybe Dummy), as they give context about the test dependencies.
Dummy, Stub and Spy can be generated by a macro because their roles are strictly defined. A Mock, as it has custom logic, must be created manually by a developer.

Dummy:

  • calls fatalError for every method and property call
  • for protocols, it creates a new type which matches the given protocol
  • for classes, it creates a subclass and overrides methods and properties
  • for enums and structs, it has to create a random value of that type

Stub:

  • allows creating an object with randomly generated values. It could work in the same way as Sendable (each property type has to conform to the Stub protocol, and primitives conform to it by default)
  • for protocols, it creates a new type which matches the given protocol
  • for classes, it creates a random instance of that type
  • for enums and structs, it creates a random value of that type
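
One way to picture the Sendable-like composition is the hand-written sketch below. The Stub protocol, its stub() requirement and the User type are all hypothetical names made up for this example, not an existing API; a macro could in principle generate the User conformance once every property type conforms.

```swift
// Hypothetical protocol: a type that can produce a randomly generated value of itself.
protocol Stub {
    static func stub() -> Self
}

// Primitives conform out of the box.
extension Int: Stub {
    static func stub() -> Int { Int.random(in: 0..<1_000) }
}

extension Bool: Stub {
    static func stub() -> Bool { Bool.random() }
}

extension String: Stub {
    static func stub() -> String { "stub-\(Int.random(in: 0..<1_000))" }
}

// A struct conforms by stubbing each of its properties in turn.
struct User: Stub {
    let id: Int
    let name: String
    let isAdmin: Bool

    static func stub() -> User {
        User(id: .stub(), name: .stub(), isAdmin: .stub())
    }
}
```

A test that only needs some User, and does not care about its contents, could then write let user = User.stub().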

Spy:

  • for protocols, it creates a new type which matches the given protocol
  • for classes, it creates a subclass and overrides methods with additional counter logic
  • for enums and structs, it fails, as the implementation cannot be modified

Example

Disclaimer:

I have not spent much time with macros, so please forgive any mistakes in the examples below:

For a given protocol:

public protocol NetworkClient {

    func request() async
}

In the test target (to keep code generated for testing purposes isolated from production):

private let networkClientSpy = #spy(of: NetworkClient.self)
private lazy var service = Service(client: networkClientSpy)

func testWhenFetchDataThenClientRequestCalled() async {
    // when
    await service.fetchData()

    // then
    // the occurrence check accepts an exact count or a range
    networkClientSpy.verify(call: .request, occurrence: .atLeastOnce)
    networkClientSpy.verify(call: .request, occurrence: 1)
    networkClientSpy.verify(call: .request, occurrence: 1...)
    networkClientSpy.verify(call: .request, occurrence: 1...3)
    
    // or
    XCTAssertVerify(networkClientSpy, call: .request, occurrence: 1)
    // etc...
}
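
To make the verify(call:occurrence:) idea more concrete, here is a hand-written sketch of what a generated spy might look like. Nothing here comes from an existing macro: Occurrence is a hypothetical helper type, request() is synchronous for brevity, and verify returns a Bool instead of recording a test failure.

```swift
protocol NetworkClient {
    func request()
}

// Hypothetical helper: describes how many calls are acceptable.
struct Occurrence: ExpressibleByIntegerLiteral {
    let matches: (Int) -> Bool

    init(matches: @escaping (Int) -> Bool) {
        self.matches = matches
    }

    // Allows `occurrence: 1` to mean "exactly once".
    init(integerLiteral value: Int) {
        self.init(matches: { $0 == value })
    }

    // Allows any integer range, such as `1...` or `1...3`.
    init<R: RangeExpression>(_ range: R) where R.Bound == Int {
        self.init(matches: { range.contains($0) })
    }

    static var atLeastOnce: Occurrence { Occurrence(1...) }
}

final class NetworkClientSpy: NetworkClient {
    enum Call { case request }

    private(set) var calls: [Call] = []

    func request() { calls.append(.request) }

    // Checks the recorded number of `call` against an Occurrence.
    func verify(call: Call, occurrence: Occurrence) -> Bool {
        occurrence.matches(calls.filter { $0 == call }.count)
    }

    // Overload so that ranges can be passed directly.
    func verify<R: RangeExpression>(call: Call, occurrence range: R) -> Bool
    where R.Bound == Int {
        verify(call: call, occurrence: Occurrence(range))
    }
}
```

The two overloads are a design choice in this sketch: the literal and .atLeastOnce forms go through Occurrence, while range arguments are wrapped into one, so all four call styles from the example share a single check.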

I achieved this several years ago using Sourcery. Here is my work: GitHub - HLTech/Papuga.
However, the repository is outdated, and instead of treating Spy and Stub separately it merges them both under the Mock name (which is not great).

1 Like

There's an open source library called swift-spyable that creates spies for given protocols using Swift macros. It would be cool to see something like this get built into this new testing framework, maybe with support for mock and dummy creation as well.

1 Like

i have many tests that are release-mode only, as they simply do not complete in a reasonable amount of time otherwise. this is a major reason why i use executableTarget and not test for tests. testing modes should absolutely be orthogonal to optimization modes.

1 Like

this can be done today with Swift macros. there's some great community support specifically for mock generation. it would be awesome to see first class support for mock generation, but i think that's a topic that's already been solved in numerous acceptable ways

I believe these new ideas for testing are a great direction. I'm particularly looking forward to the new parametrized testing possibilities. At Xebia we needed parametrized testing a lot, but of a kind that would play nicely with Xcode and assertion outputs. So I wrote a small macro with @mkowalski87 that generates new test methods based on input and (if needed) output params. You can check our solution here: GitHub - PGSSoft/XCTestParametrizedMacro: Swift macro for parametrizing unit test methods. I'm wondering how the new parametrized testing will be integrated with Xcode.

1 Like

XCTest is a horrible mess to work with, but in my project we have some pretty sophisticated testing needs and have managed to report all failures through XCTest so that Xcode integration just works. Someone wanting to make this work could probably do it with no changes to Xcode or XCTest. The open source XCTest has some missing pieces; feel free to steal ours.

In the meantime, one of the best ways to ensure that you'll be compatible with many tools out of the box is to follow the GNU coding standard for error message formatting. That, for example, will get you Emacs integration OOTB.

4 Likes

tl;dr: Testing preconditions is important. We should be able to do it in Swift without contorting our tests to fit the language.

Strongly agree. I'm bummed to not see much else in this thread about testing preconditions/fatalErrors/assertions, etc.

It's an important use case and certainly fits with the "flexible and capable of accommodating many needs" principle laid out in the vision document, as well as the "validate expected behaviors or outcomes" bullet under "Approachability."

Consider making a custom collection, which is a use case where being able to test preconditions is incredibly important. For example:

precondition(i >= startIndex && i <= endIndex, "index out of bounds")

Even in a precondition that simple, it's easy to mess up the comparison operators (did you see the bug?).
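
For readers who want to check their answer, here is a small sketch that pulls the comparison out into plain functions so it can be exercised without trapping. The function names are made up for this example, and it assumes the intended rule for subscripting is the usual half-open range startIndex..<endIndex.

```swift
// The condition exactly as written in the precondition above.
func boundsCheckAsWritten(_ i: Int, startIndex: Int, endIndex: Int) -> Bool {
    i >= startIndex && i <= endIndex
}

// The half-open check, assuming endIndex is "one past the last element".
func boundsCheckHalfOpen(_ i: Int, startIndex: Int, endIndex: Int) -> Bool {
    i >= startIndex && i < endIndex
}

// For a three-element collection: startIndex == 0, endIndex == 3.
let acceptsEndAsWritten = boundsCheckAsWritten(3, startIndex: 0, endIndex: 3)
let acceptsEndHalfOpen = boundsCheckHalfOpen(3, startIndex: 0, endIndex: 3)
```

Under that assumption the version as written lets endIndex through, which is exactly the kind of off-by-one that only a test of the failing path would catch.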

Another common pattern is indices that are tied to a specific value, e.g. only being able to use a String.Index in the string it was born from, or an index into a tree that stores weak or unowned references to specific nodes for O(1) subscripting or navigation (index(after:) and friends). In these cases you need something like

precondition(i.isValidFor(self))

This could have some pretty complicated logic. You can test isValidFor in isolation, which helps, but not being able to test subscripting a collection with an invalid index from end to end leaves me with a distinct lack of confidence in my test suite.
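
As a minimal sketch of an index that is tied to the value it came from, consider the following. Shelf, OwnerToken and element(at:) are hypothetical names invented for this example. A real collection would call precondition(i.isValidFor(self)) in its subscript; this sketch returns nil instead so that it can run without trapping.

```swift
// Identity token shared between a Shelf and the indices it hands out.
final class OwnerToken {}

struct Shelf {
    struct Index {
        fileprivate let owner: OwnerToken
        fileprivate let offset: Int

        // Valid only for the shelf that created it, and only while in bounds.
        func isValidFor(_ shelf: Shelf) -> Bool {
            owner === shelf.token && shelf.items.indices.contains(offset)
        }
    }

    fileprivate let token = OwnerToken()
    fileprivate var items: [String]

    init(_ items: [String]) {
        self.items = items
    }

    var startIndex: Index { Index(owner: token, offset: 0) }

    // Stand-in for a subscript guarded by precondition(i.isValidFor(self)).
    func element(at i: Index) -> String? {
        i.isValidFor(self) ? items[i.offset] : nil
    }
}
```

Testing isValidFor on its own is possible here, but the end-to-end behaviour (what the subscript does when handed another shelf's index) is the part that needs crash testing once the nil is replaced by a precondition.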

In Testing fatalError and friends it seems that the consensus is "run the test in a child process and detect the crash," which is what the Swift standard library tests do. While I wish there was room for a larger rethink of how Swift deals with traps (perhaps something closer to Go's panic/recover mechanism), it would be a very heavy lift for such a mature language, and might not be possible at all due to decisions that have been locked down by ABI stability. There were some other suggestions in that thread that could be interesting – using distributed actors, or even having an in-process distributed actor with an isolated heap that could panic in a way that could be detectable or even catchable.

Assuming the path forward is "run the test in a child process and detect the crash," there's one particular use case that's worth considering: multiple assertPreconditionViolation expectations in a single test. Consider the following:

struct MyArray: Collection, ExpressibleByArrayLiteral {
}

func testSubscripting() {
    var a: MyArray = ["a", "b", "c"]
    assertPreconditionViolation(a[-1], with: "index out of bounds") // assume the first argument is @autoclosure 
    assertEqual(a[0], "a")
    assertEqual(a[1], "b")
    assertEqual(a[2], "c")
    assertPreconditionViolation(a[3], with: "index out of bounds")

    a = []
    assertPreconditionViolation(a[a.startIndex], with: "index out of bounds")
    assertPreconditionViolation(a[a.endIndex], with: "index out of bounds")
}

This is how I'd like to write this test (YMMV of course). I want to have both edges of each edge case (e.g. a[-1] and a[0]) next to each other. It makes it easier to see at a glance that I'm testing exhaustively.

In the naive implementation of "detect a crash in the child," the child would crash during the very first assertion. That assertion would pass, but none of the others would run at all. In practice, that means you can only have one "expect it to crash" assertion per test, and it must be the last assertion.

A more sophisticated implementation might run the test N times, once per assertPreconditionViolation, and only run a single assertPreconditionViolation per test run. But that implies that each non-crashing assertEqual in the above example would be run M times, where M is the number of calls to assertPreconditionViolation that appear after the assertEqual in question. If subscripting MyArray is slow for some reason, or if allocating new instances of MyArray is expensive, the extra calls can really slow things down. I think it's something resembling a factor of O(n^2).
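
To put rough numbers on that, here is a small counting sketch. It models a test as a list of steps and assumes one re-run per crash assertion, with each run skipping the other crash assertions and stopping at its own. The Step type and the function name are invented for this example, and assertions after the final crash assertion are ignored.

```swift
enum Step {
    case equal   // a non-crashing assertion such as assertEqual
    case crash   // an assertPreconditionViolation
}

// Total assertion evaluations when the test is re-run once per crash assertion.
func evaluationsWhenRerunning(_ steps: [Step]) -> Int {
    var total = 0
    var equalsSeen = 0
    for step in steps {
        switch step {
        case .equal:
            equalsSeen += 1
        case .crash:
            // This run repeats every earlier non-crashing step, then crashes here.
            total += equalsSeen + 1
        }
    }
    return total
}

// The testSubscripting example: crash, three equals, then three crashes.
let subscriptingTest: [Step] = [.crash, .equal, .equal, .equal, .crash, .crash, .crash]
let subscriptingCost = evaluationsWhenRerunning(subscriptingTest)

// A worst-case shape: 100 alternating (equal, crash) pairs, 200 steps in total.
let alternating = Array(repeating: [Step.equal, .crash], count: 100).flatMap { $0 }
let alternatingCost = evaluationsWhenRerunning(alternating)
```

In this model the seven-step example costs 13 evaluations instead of 7, and the 200-step alternating test costs 5150, which is consistent with quadratic growth in the number of steps.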

You can imagine a version where fork(2) is called once per assertPreconditionViolation, which would be equivalent to the idealized functionality of being able to recover from traps – the test runs top to bottom with each line only run once – but that comes with all of the problems of using fork, and I'm sure isn't workable.

You can make a reasonable argument that getting the above test running in a straight line top to bottom is too big a lift, or perhaps not worth the effort. And even something like Rust's #[should_panic], which seems to essentially make any #[should_panic] test a single assertion, would be a big improvement over what we have now.

The big picture, though, is that if we're going to allow for testing precondition violations (which I think we should!), the less Swift's implementation constrains what you can express in your test, the better. And the test above seems to push pretty far up against what the language allows.

5 Likes

Overall, great work, there is only one thing that I think needs a change:

The arguments parameter of Test(_:_:arguments:) should be a () async throws -> C closure; otherwise parameterisation of tests is very limited.
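
A rough sketch of the shape I mean, written as a plain function rather than the actual macro. runParameterized and LoadError are hypothetical names, and the closure is only throws (not async) to keep the sketch short. The point is that the arguments are produced when the test runs, so they can be computed or loaded, and a failure to produce them can be reported.

```swift
// Hypothetical error for a failed argument source.
struct LoadError: Error {}

// Hypothetical runner: evaluates the arguments lazily, then runs the test once per element.
func runParameterized<C: Collection>(
    arguments: () throws -> C,
    test: (C.Element) -> Bool
) -> [Bool] {
    // A real framework would report the error; this sketch records one failure.
    guard let inputs = try? arguments() else { return [false] }
    return inputs.map(test)
}

// Arguments computed at run time, for example parsed from text.
let parsedResults = runParameterized(
    arguments: { "2,4,6".split(separator: ",").compactMap { Int($0) } },
    test: { $0 % 2 == 0 }
)

// Arguments whose source fails.
let failedResults = runParameterized(
    arguments: { () throws -> [Int] in throw LoadError() },
    test: { $0 > 0 }
)
```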

Hi everyone,

I see a couple of suggestions pop up in this thread to make testing (using Swift-Testing) easier. For example:

  • Use a closure for the test code (instead of having to make - and name - a function)
  • Nicer test output
  • Easier integration for test doubles
  • Testing fatalError, assert, precondition (has its own thread)

There is currently a question on this forum for GSoC projects. I could write up a project to implement (some of) these improvements.

Would this be something you would support? I'm especially asking the creators/maintainers of this library (@smontgomery).

5 Likes

Hi Maarten! Let's start up a new thread under the swift-testing category and we can discuss there. Thanks!

2 Likes