[Accepted] A New Direction for Testing in Swift

Hello, Swift community.

I'm pleased to announce that the Swift project has accepted a vision document for A New Direction for Testing in Swift:

A key requirement for the success of any developer platform is a way to use automated testing to identify software defects. Better APIs and tools for testing can greatly improve a platform’s quality. Below, we propose a new direction for testing in Swift.

This vision is the natural continuation of the work done on the swift-testing project, which for the last six months has been discussed in its own category on these forums. A new Swift Testing Workgroup will be created to oversee testing efforts within the Swift project, which will eventually fall under the authority of the Ecosystem Steering Group. For more information on the evolution and governance of swift-testing and the work laid out in this vision, please see the vision document.

Since the Ecosystem Steering Group does not yet exist, the Core Team asked the Language Steering Group to review this vision with the assistance of the Platform Steering Group. Having conducted that review, we are all in agreement that this document represents an exciting and compelling vision for the future of testing in Swift. Accordingly, this vision has been accepted.

As with all vision documents, the acceptance of this vision is a strong endorsement of the goals it lays out, a general endorsement of its basic design approach, but only a weak endorsement of any concrete designs described in the document. It is expected that these designs will be brought to the community for discussion and review. The exact review process will be up to the Swift Testing Workgroup, but it will presumably include discussion threads on these forums, much like ordinary evolution review.

Please feel free to discuss this vision in this thread.

John McCall
Language Steering Group

42 Likes

There are some nice things here. I like that it cleans up some of the ceremony we inherited from Objective-C's XCTest framework, and the move to a platform-independent package is very welcome (there are important XCTest interfaces which to this day only exist on Apple platforms).

But I feel that calling this a "new direction" for testing is overselling it. The changes seem to be mostly superficial - of the four principles mentioned in the document, the first two (approachability and expressivity) seem focussed on ergonomics. None of them seem focussed on delivering new capabilities or testing paradigms.

While ergonomics are important, this seems to me to lack the ambition required to solve the biggest problems developers have when testing their code. For instance, mocking is entirely unaddressed - and I think it is a very fair criticism that a vision document titled "A New Direction for Testing in Swift" does not mention the word "mock" even once.

In my experience, mocking is the biggest thing that developers struggle with - in many ways it contradicts the usual strict typing and resolution rules Swift is built on, and so developers perform extraordinary contortions to try and make things mockable. I've seen projects where essentially every major type was hidden behind a protocol existential so it could be substituted in test code, which obviously becomes a huge burden, not only to runtime performance but to the productivity of developers trying to build the program. That's an area where I'd like language integration to actually deliver new capabilities, not just a package with nicer syntax.

So while it is nice and I appreciate the work that has been done, in my opinion we need more and bigger ideas to truly deliver a "new direction" for testing in Swift.

24 Likes

I disagree on the importance of mocking. I try to stay away from mocking as much as possible in my testing. Testing my code against something that is fabricated solely to satisfy the testing setup doesn't give me much confidence that the code will perform correctly when running against the real dependencies. I prefer integration tests (testing my code against the real dependencies as much as possible) over mocking.

8 Likes

to me mocking is largely about ergonomics, and this is an area where i feel that Swift tends to struggle. consider the following API:

extension US
{
    struct CityState:LosslessStringConvertible
    {
        init?(_ string:String)
    }
}

it’s quite difficult to write clean, readable test cases without having to force your way through layers and layers of optionals. for example, imagine writing tests for the US.CityState parser. you cannot easily express the expectations, without going through the parser/validator itself! instead, you need some kind of underscored or “unchecked” API to actually instantiate the expectations.
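For illustration, here is a minimal sketch of how such a test might look with swift-testing. The stored properties `city` and `state`, the "City, ST" format, and the `init(uncheckedCity:state:)` initializer are all assumptions invented for this example; the API above only shows the failable initializer.

```swift
import Testing

enum US {}

extension US {
    struct CityState: LosslessStringConvertible, Equatable {
        let city: String
        let state: String

        // Hypothetical "unchecked" initializer: builds expected values
        // without going through the parser under test.
        init(uncheckedCity city: String, state: String) {
            self.city = city
            self.state = state
        }

        // Assumed format: "City, ST" with a two-letter uppercase state code.
        init?(_ string: String) {
            guard let comma = string.firstIndex(of: ",") else { return nil }
            let city = String(string[..<comma])
            let rest = string[string.index(after: comma)...]
            guard rest.hasPrefix(" ") else { return nil }
            let state = String(rest.dropFirst())
            guard !city.isEmpty, state.count == 2,
                  state.allSatisfy({ $0.isUppercase }) else { return nil }
            self.init(uncheckedCity: city, state: state)
        }

        var description: String { "\(city), \(state)" }
    }
}

@Test func parsesCityAndState() throws {
    // `#require` unwraps the optional or fails the test at this line.
    let parsed = try #require(US.CityState("Austin, TX"))
    #expect(parsed == US.CityState(uncheckedCity: "Austin", state: "TX"))
}

@Test func rejectsMissingState() {
    #expect(US.CityState("Austin") == nil)
}
```

Note that the expected value on the right-hand side still needs the unchecked initializer, which is the ergonomic cost being described.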

I think, ultimately, this is one of those unfortunate consequences arising from ambiguities of the English language.

The focus of this document is in describing a new library, workgroup, and process where the vision is ultimately to supplant XCTest on all supported platforms. These are a new direction in the sense that they are not Ship-of-Theseus improvements of what has come before: it is not new in the sense of (necessarily) "more and bigger."

An analogy might be how Mac OS X Public Beta was a new direction as compared to Mac OS 9, as opposed to a "new" dishwasher model being a "more and bigger" version of last year's dishwasher model.

4 Likes

The number of things you need to mock in order to test a piece of code is actually a good indicator of the design decisions that went into it. You are right that if you mock every bit around it, the environment is too sterile to produce meaningful results, yet having mocks is important. Integration tests are slower by design, so they might catch errors much later. As unit tests still prevail (and I’d argue they will remain for a long time), these questions are important to address, but in the long term.

Swift (and XCTest as well) lacks solutions for that, but with macros we have seen improvement in that direction, and I think it’s only a matter of time before this becomes part of the Swift toolkit by default. Still, I don’t believe this should be part of the vision document.


Really nice to see testing move toward a more Swift-like approach and lose the burden of XCTest. No magic function names, no inheritance. The structure of tests is pretty similar to what can be seen in other languages, with the additional upside of being more expressive, in my experience.

I don’t think a fundamentally new approach is actually needed here, nor does the vision need to cover all the details and evolution paths. The last time I tried the swift-testing library, my only concern was matching (and eventually improving on) XCTest’s selection of asserts, as some of them were really handy but unavailable in the library at that time. And mocking, even though it is a valuable feature (Python’s tooling for that is the best I have used, but Swift can clearly do better, and in a type-safe manner), is IMO not the thing the tool will benefit from most at the start.

3 Likes

Just speaking for myself, I consider mocking to be a last-resort method of testing. Unit testing — writing a test against the public API of an isolated piece of code and verifying that it generates the desired results — is always preferable when possible. If you properly unit-test a piece of code, you can then unit-test code that builds on top of it without needing to invasively rearchitect it to allow the underlying code to be mocked out. The only problem is that not all code is easy to unit test, like when it fundamentally does world-stateful operations like network requests. Even then, I think people turn to mocking too quickly without considering approaches like isolating more of their core logic or standing up a mock server.
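As a rough sketch of what "isolating more of their core logic" could look like: the `User` model, `UserParsing`, and `UserService` names below are hypothetical, invented for this example. The idea is that the decoding logic is a pure function that can be unit-tested directly, while the world-stateful part shrinks to "fetch some bytes".

```swift
import Foundation
import Testing

// Hypothetical model, invented for this sketch.
struct User: Codable, Equatable {
    var name: String
    var age: Int
}

enum UserParsing {
    // Pure function: no network or global state, so no test double is needed.
    static func decodeUser(from data: Data) throws -> User {
        try JSONDecoder().decode(User.self, from: data)
    }
}

struct UserService {
    // The only world-stateful dependency is a thin "fetch bytes" step.
    var fetch: (URL) async throws -> Data

    func user(at url: URL) async throws -> User {
        let data = try await fetch(url)
        return try UserParsing.decodeUser(from: data)
    }
}

@Test func decodesUser() throws {
    let json = Data(#"{"name":"bob","age":42}"#.utf8)
    let user = try UserParsing.decodeUser(from: json)
    #expect(user == User(name: "bob", age: 42))
}
```

With this split, most of the behavior worth testing lives in `decodeUser(from:)`, and the remaining shell is small enough to cover with an integration test or a mock server.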

22 Likes

I don't even know what's wrong with XCTest. I'm all keen to congratulate some team, as long as we don't lose anything, but why?

The vision touches on this in Today's solution: XCTest:

When Swift was introduced, XCTest was extended further to support the new language while maintaining its core APIs and overall approach. This allowed developers familiar with using XCTest in Objective-C to quickly get up to speed, but certain aspects of its design no longer embody modern best practices in Swift, and some have become problematic and prevented enhancements. Examples include its dependence on the Objective-C runtime for test discovery; its reliance on APIs like NSInvocation which are unavailable in Swift; the frequent need for implicitly-unwrapped optional (IUO) properties in test subclasses; and its difficulty integrating seamlessly with Swift Concurrency.

8 Likes

FWIW… I often notice that "mocking" has colloquially been overloaded to encompass all varieties of test doubles (Stubs, Spies, Fakes, and "true" Mocks). It's (unfortunately) ambiguous when I hear engineers make value statements about mocking: it's not always clear whether they mean true "London"-style Mockist TDD or any and all varieties of test double (and are suggesting that unit tests should run against production implementations and production types).

3 Likes

I personally agree with this. However, I believe language dominance has taken over and all test doubles are mocks now. That's the world we're in and we have to accept it.

I tend to use the other types of test doubles as verbs to describe what the mock is doing ("This mock is spying on this value, and also stubs the return value of foo"). This tends to go over better since few people really know about all the test doubles by name, but have an intuition for how to use them.
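A small sketch of a double that both spies and stubs, in the sense described above. The `Greeter` protocol, `MockGreeter`, and `welcomeBanner(for:using:)` are hypothetical names invented for this example.

```swift
import Testing

// Hypothetical dependency, invented for this sketch.
protocol Greeter {
    func greeting(for name: String) -> String
}

final class MockGreeter: Greeter {
    // Stub: the canned value handed back to the code under test.
    var stubbedGreeting = "hi"
    // Spy: a record of the arguments the double was called with.
    private(set) var receivedNames: [String] = []

    func greeting(for name: String) -> String {
        receivedNames.append(name)
        return stubbedGreeting
    }
}

// Code under test: depends only on the protocol.
func welcomeBanner(for name: String, using greeter: some Greeter) -> String {
    "*** \(greeter.greeting(for: name)) ***"
}

@Test func bannerUsesGreeter() {
    let mock = MockGreeter()
    mock.stubbedGreeting = "hello, bob"

    let banner = welcomeBanner(for: "bob", using: mock)

    #expect(banner == "*** hello, bob ***")  // checks the stubbed return value
    #expect(mock.receivedNames == ["bob"])   // checks what the spy recorded
}
```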

3 Likes

Even then, I think people turn to mocking too quickly without considering approaches like isolating more of their core logic or standing up a mock server.

Yes, yes, 100 times this. I could not agree more. How can this "new direction for testing" smooth the way to things like "standing up a mock server"? I am looking for examples that do that using this new system. We desperately need to mock less, inject less, and test the real code against the thinnest abstraction layers we can so we can see the bugs that come from the interactions. And a new direction for testing document should show what that might look like.

Network interactions are a great example. I would imagine things like:

#networkProxy(when: "https:*/get/user/bob", return: makeRecord("bob").json)
@Test func testFetchBob() { 
    #expect(await myService.get(user: "bob") == makeRecord("bob"))
}

Obviously that's a bikeshed kind of syntax, and building it well is very hard. But to call it a new direction it should engage with this kind of real problem. What would solutions even look like? How can the language enable frameworks that enable this?

Every day I face code that is not perfectly isolated for testing, about 50% because I didn't do it right, about 50% because Apple didn't do it right. I can't fix much of the first, and I can't fix any of the second. A testing direction has to engage with that. Something really basic: sometimes you have to poll. I need to be able to say things like:

#expect(within: .seconds(1), that: await service.status == "done")

And I need that to work even when service provides no way to observe it. Yes, that probably, horribly, means polling. I need polling.
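For illustration, a polling check along these lines can be approximated today with a hand-written helper. This is only a sketch: `poll(attempts:intervalNanoseconds:until:)` and the `Service` actor are hypothetical and not part of swift-testing, and any fixed attempt budget like this can be flaky on slow machines.

```swift
import Testing

// Hypothetical helper, not part of swift-testing: re-checks `condition`
// up to `attempts` times, sleeping between checks.
func poll(
    attempts: Int = 100,
    intervalNanoseconds: UInt64 = 10_000_000,  // 10 ms between checks
    until condition: () async -> Bool
) async -> Bool {
    for _ in 0..<attempts {
        if await condition() { return true }
        try? await Task.sleep(nanoseconds: intervalNanoseconds)
    }
    // One final check after the last sleep.
    return await condition()
}

// Hypothetical service that offers no way to observe status changes.
actor Service {
    private(set) var status = "pending"
    func finish() { status = "done" }
}

@Test func statusEventuallyBecomesDone() async {
    let service = Service()
    Task { await service.finish() }

    let done = await poll { await service.status == "done" }
    #expect(done)
}
```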

Swift is at its best when it reimagines things in a brand new way. I don't want to copy one of the thousand mocking frameworks. I want a new answer, not just a more elegant syntax for the same limitations.

8 Likes

Declaring that a particular test should run with a mock network server is a great example use case! The vision document describes a concept called Traits which aims to address this kind of problem, and swift-testing includes a specific protocol named CustomExecutionTrait for performing custom logic around the invocation of a test. The protocol is currently experimental, but we intend to formalize it and propose promoting it to API. It enables trait extensibility, which the vision document discusses as well:

A flexible test library should allow certain behaviors to be extended by test authors. A common example is running logic before or after a test: if every test in a certain group requires the same steps beforehand, those steps could be placed in a single method in that group rather than expressed as an option on a particular test. However, if only a few tests within a group require those steps, it may make sense to leverage a test trait to mark those tests individually.

Test traits should provide the ability to extend behaviors to support this workflow. For example, it should be possible to define a custom test trait, and implement hooks that allow it to run custom code before or after a test or group.

Here's an example of how the CustomExecutionTrait protocol could be used in the network mocking example:

struct MockServerResponseTrait {
  /// The path of the HTTP request to mock.
  var path: String

  /// The mock response data to provide.
  var response: Data
}

extension MockServerResponseTrait: CustomExecutionTrait {
  func execute(
    _ function: @escaping @Sendable () async throws -> Void,
    for test: Test, testCase: Test.Case?
  ) async throws {
    // Start mock server, handle request for `path`, respond with `response`
    defer { /* Stop mock server */ }
    try await function()
  }
}

extension Trait where Self == MockServerResponseTrait {
  static func mockedRequest(path: String, response: Data) -> Self {
    Self(path: path, response: response)
  }
}

@Test(.mockedRequest(path: "https:*/get/user/bob", response: makeRecord("bob").json))
func fetchBob() async throws {
  let response = try await myService.get(user: "bob")
  #expect(response == makeRecord("bob"))
}

Traits enable a lot of functionality that was not possible in XCTest. The above example shows a custom trait, but even the built-in ones provide new capabilities, such as assigning custom tags to tests, customizing their display name, or specifying runtime conditions.
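As a hedged sketch of the built-in traits mentioned above, using the spellings in swift-testing as I understand them at the time of writing (they may change): the `networking` tag, the `isNetworkAvailable` flag, and `normalizedUserName(_:)` are hypothetical names invented for this example.

```swift
import Testing

// Declares a custom tag that tests can be grouped and filtered by.
extension Tag {
    @Tag static var networking: Self
}

// Hypothetical runtime condition; a real project would compute this.
let isNetworkAvailable = true

// Hypothetical function under test.
func normalizedUserName(_ name: String) -> String {
    name.lowercased()
}

@Test(
    "User names are normalized before lookup",  // custom display name
    .tags(.networking),                         // custom tag
    .enabled(if: isNetworkAvailable)            // runtime condition
)
func normalizesUserName() {
    #expect(normalizedUserName("Bob") == "bob")
}
```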

11 Likes

Swift in general does not encourage polling or busy-waiting because the language and runtime can't reason about when the waiting thread will become available for more work—it's the textbook definition of the halting problem! :sweat_smile: Instead, if the API you're using has something like an onStatusChanged callback mechanism, use it along with withCheckedContinuation() to transform your code into an awaitable expression and await it in your test function.
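A minimal sketch of that approach, assuming a hypothetical callback-based `Downloader` type (not from any real library) whose `start(onStatusChanged:)` calls back exactly once:

```swift
import Testing

// Hypothetical callback-based API, invented for this sketch.
struct Downloader {
    // Invokes `onStatusChanged` exactly once, when the work finishes.
    func start(onStatusChanged: @escaping @Sendable (String) -> Void) {
        Task {
            onStatusChanged("done")
        }
    }
}

// Wraps the callback in a continuation so callers can simply `await` it.
func finalStatus(of downloader: Downloader) async -> String {
    await withCheckedContinuation { continuation in
        downloader.start { status in
            // A checked continuation must be resumed exactly once.
            continuation.resume(returning: status)
        }
    }
}

@Test func downloaderReportsDone() async {
    let status = await finalStatus(of: Downloader())
    #expect(status == "done")
}
```

If the real callback can fire more than once, the wrapper needs to guard against resuming the continuation twice, for example by only resuming on a terminal status.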

We've also learned from years of XCTest feedback that timeouts on individual expectations are flaky. Work that takes one second on your computer may take two seconds on your colleague's, or 10 seconds in a virtualized CI environment. This is why we don't offer something like #expect(within:that:) despite the hypothetical implementation being reasonably straightforward.

I'd be happy to discuss polling, timeouts, etc. with you in a separate thread. Perhaps with more real-world information about the problem you're trying to solve, we can find a solution that meets your needs. :slight_smile:

9 Likes

Just an observation related to testing in general:

I find that code that is easy to test (i.e., not much overhead necessary to set up a focused, easy-to-read, automated test -- whether that's a unit test in the strict sense, or some form of integration test operating on a small object graph) is also easy to maintain. And writing maintainable code (i.e., producing code that is easy to change when requirements change) pretty much defines our job as software engineers.

Moreover, I find that testability and maintainability are basically two sides of the same coin - they are two ways of looking at the same problem.

And the fundamental problem with both maintainability and testability is coupling. Code that is hard to test likely suffers from coupling issues, and the way to make it easier to test is to reduce dependencies and coupling.

Coupling, dependency (and also ownership) issues are relatively easy to spot for a seasoned engineer, but often go completely unnoticed by more novice engineers. Often, they likely won't even realize the connection between coupling and testability, and will rather go to great lengths to successfully write a test for their code -- often at a level that is much too high (e.g. end-to-end / UI tests), or by adding lots of complicated machinery, which in turn makes the tests harder to maintain.

What I am trying to get at here is that maybe the degree of coupling (size of the object graph?) in a test setup could be visualized by tooling in some form when writing tests. If the coupling is high, refactoring might be in order before trying to write the tests.

[I am well aware that metrics can be problematic, in that they can lead to a desire to optimize heavily for that metric (looking at you, code coverage).]

3 Likes

The nicest testing framework I've used is Spock which is a Groovy/Java test framework. It's based on a DSL and the tests mostly look like regular code. I have to say this Swift test framework has too many @'s and #'s. It certainly doesn't look like my code.
I don't know if Swift's DSL capabilities could be used to generate something like Spock but I don't see any mention of Spock anywhere in the design document.

I think focusing on the punctuation is missing the big picture. Let's say you know how to write a function in Swift:

func square(_ x: Int) -> Int {
  return x * x
}

In order to write a test for that function, you only have to learn two new words—@Test and #expect:

@Test
func squareAcceptsZero() {
  #expect(square(0) == 0)
}

That's a great deal of power in a small package. Anything that uses a "DSL" might look "more like Swift" if you focus on punctuation, but would be its own unique language that folks would have to learn. To deliver the same amount of power as swift-testing in that form, I imagine it would be a lot more complex. It's important to note that macros are a key part of the implementation, so you'd have to replace all of that functionality with something.

7 Likes

This is an interesting point. Java and Swift have similar syntaxes and a shared inheritance from C and Smalltalk, and it may be tempting to infer that because a Java feature looks a certain way, the equivalent Swift feature ought to look the same.

Although we don't discuss it in the vision document, we did briefly consider implementing tests as a compiler pass. We didn't pursue this approach for more than a day or so (if memory serves) because it would be impractical and fragile to bake the sort of functionality needed for swift-testing into the compiler.

It also became clear quickly that Swift macros would provide the sort of functionality we want, as they operate as a de facto compiler pass. The trade-off (and we think it's worth it!) is that macros have a fixed syntax in Swift that uses @ or #, as you've noted. I will point out that Java, in the form of JUnit, also uses @.

I hope this helps!

9 Likes

Does a DSL also act like a compiler pass? And by definition it allows you to design your own syntax. I have to say the example for a parameterized test in the vision document seems incomplete. Where's the expected value? Parameterized tests in Spock are one of its highlights.
Consider:

def "test square method"(int a, int b) {
    expect:
    square(a) == b

    where:
    a | b
    0 | 0
    1 | 1
    10 | 100
}

Not an @ or # in sight. And I suspect that even if you're not a Spock/Groovy expert you can figure out how it works.

1 Like

[quote="phoneyDev, post:20, topic:72309"]
Does a DSL also act like a compiler pass?[/quote]

Yes, although I don't presume to know how Spock is implemented specifically. In order to turn what you have into a valid Java program, you need to perform a transformation on it. That transformation happens before or during compilation of the program. In terms of Swift, to turn what you have into something the compiler can reason about, it would need to be transformed into either Swift source, SIL, or (least likely) LLVM IR.

Fun historical fact: the earliest C++ and Objective-C compilers were implemented as transformation passes that converted the C++ or Objective-C code into equivalent C code.

[quote="phoneyDev, post:20, topic:72309"]
Where's the expected value?[/quote]

Here's a trivial/contrived example of specifying an input and expected output using swift-testing:

@Test(arguments: [
  (2, 20), (3, 30), (50, 500)
]) func timesTen(_ i: Int, result: Int) {
  #expect(i * 10 == result)
}
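For a closer comparison, the Spock table from earlier in the thread could be sketched like this. The test names are made up, and the `zip` variant assumes the two-collection form of `arguments:` that swift-testing offers at the time of writing.

```swift
import Testing

func square(_ x: Int) -> Int {
  return x * x
}

// Each tuple plays the role of one row of the `where:` table: (a, b).
@Test(arguments: [
  (0, 0), (1, 1), (10, 100)
]) func squareMatchesTable(_ a: Int, expected b: Int) {
  #expect(square(a) == b)
}

// The same rows, written as an input column and an expected column.
@Test(arguments: zip([0, 1, 10], [0, 1, 100]))
func squareMatchesColumns(_ a: Int, expected b: Int) {
  #expect(square(a) == b)
}
```

Each row runs as its own test case, so a failure points at the specific input pair.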

Edit: not sure what happened with the quotes there.

1 Like