A New Approach to Testing in Swift

I've played around with using Result Builders to define a new DSL for Quick, but didn't have time to polish it off.

When macros came out, it got me wondering: What's the correct mechanism for meta-programming? Are there rules of thumb for choosing between result builders and macros?

How does having so many macros in a system affect performance?

I would naively assume that result builders can be faster because they run in-process in the compiler (and because there's an incentive to optimize them heavily, given how many SwiftUI views a typical project contains), and don't need to start up a bunch of subprocesses to run the macros. Is that the case?

1 Like

I think the issue isn't feasibility, it's that each of us is duplicating that effort. Perhaps it would help if Apple provided those protocols alongside their APIs.

5 Likes

Hi Laszlo! Our colleagues are actively investigating a general-purpose solution for runtime symbol discovery and we're eager to adopt it once it becomes available. For more information on that effort, check out Kuba's pitch here.

4 Likes

Hi Tony! We know there's interest in having custom matchers that use the #expect() macro, and it's definitely something we'd like to support. While we have some ideas here already, we want to hear more from the community about what would be useful.

With regard to string comparisons, we have built in the ability to compare any two collections (using the Swift standard library's CollectionDifference API). Strings are special-cased and opt out of this extra handling right now, but we hope to change that in the near term to something similar to diff.
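
For illustration, here's the standard library API in question. This is plain Swift, not swift-testing's internal diffing code:

let expected = ["burrito", "taco", "quesadilla"]
let actual = ["burrito", "quesadilla", "nachos"]

// CollectionDifference describes the insertions and removals needed to
// turn one collection into the other.
let difference = actual.difference(from: expected)
for change in difference {
    switch change {
    case .insert(let offset, let element, _):
        print("+ \(element) at offset \(offset)")
    case .remove(let offset, let element, _):
        print("- \(element) at offset \(offset)")
    }
}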

I'd love to discuss this further with you—mind starting up a separate thread under the swift-testing subcategory and we can brainstorm?

5 Likes

While developing swift-testing we also experimented with using Result Builders for test definition, but encountered several significant challenges which I described under Alternatives Considered in our vision document draft. Indeed, one of those challenges was the burden we knew this approach would place on the compiler's type-checker, especially since test code can often be quite lengthy compared to other APIs which use result builders. There were other notable difficulties beyond that, too.

By contrast, we believe our current macros-based approach is much simpler from a type-checking perspective since the @Test and @Suite macros expand to more ordinary Swift code consisting of named functions, properties, and types. Although it's true that macros can involve inter-process communication between the compiler and a macro plugin, in practice we've found that type-checking is the more relevant factor when examining our tests' build times.
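
To make the type-checking point concrete, here is a deliberately simplified, hypothetical sketch (not the macros' literal expansion; TestRecord and the generated names are invented for illustration). The key property is that the output is ordinary named declarations, which the compiler checks independently rather than as one large nested expression:

// Hypothetical sketch only; not swift-testing's real expansion.
struct TestRecord {
    let displayName: String
    let body: () async throws -> Void
}

// The author writes:
//     @Test("Burritos are available") func burritosAvailable() { ... }
// Conceptually, the macro adds plain peer declarations like:
func burritosAvailable() async throws { /* test body */ }

let __burritosAvailableRecord = TestRecord(
    displayName: "Burritos are available",
    body: burritosAvailable
)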

8 Likes

I briefly reviewed the documentation and sources, and I didn't find any information regarding support for expected fatal errors (i.e. asserting that the code under test calls fatalError). Did I miss something?

1 Like

Looks interesting; I look forward to tinkering with this. Now, if you could somehow integrate a Gherkin parser to enable BDD testing! Current frameworks out there like Cucumberish or XCTest-Gherkin are quite outdated (still very useful, though).

1 Like

This is very exciting – a huge thank you for rethinking XCTest!

Over the years I've gone back and forth between Quick/Nimble and XCTest, and likewise between RSpec and Minitest in the Ruby/Rails world. But I keep coming back to the simplicity of XCTest and Minitest.

Nested contexts, before/after blocks, subjects, and shared examples always seem to confuse me more than help. Sure, it looks great and I have DRY test code. But I come back to the suite a week or month later and lose myself in my own code.

I would hate for XCTest's successor to leave that simplicity behind and become more of a "BDD-like" test framework. I see some inklings of that, but also some stuff that I love in the proposal so far.

My two biggest gripes with XCTest are:

  1. Naming tests via function names. It looks like @Test() solves this right away (see the sketch after this list). I am most excited for this! I would be just as happy if Xcode 16 launched with this single improvement to XCTest. I wrote about a workaround, but it has the problem @ratkins mentioned: Xcode can't discover each test.
  2. The limited usability of matchers and output. Without a ton of custom helper methods, test failures usually read "expected this to happen and it didn't", which isn't super helpful compared to other test frameworks.
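
For reference, here's the shape of that naming improvement from the proposal (the display name string echoes the pitch's example; FoodTruck is just a stand-in type to make the sketch self-contained):

import Testing

// FoodTruck is a placeholder, not a real API.
struct FoodTruck {
    static let shared = FoodTruck()
    var burritoCount = 12
}

// The display name lives in the attribute, so the function name can stay short.
@Test("The Food Truck has enough burritos")
func foodAvailable() async throws {
    #expect(FoodTruck.shared.burritoCount >= 10)
}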

That said, it looks like I'll be very happy with this when it launches! Looking forward to following along.

6 Likes

This is very exciting! I especially like the possibility of doing parameter-based testing. Is there any chance of (or interest in) extending this to also accommodate property-based testing à la SwiftCheck? Or at least allowing third parties to integrate and provide this functionality? Maybe, in addition to providing the parameters directly, the parameters could be given by a type conforming to ParameterProvider or similar (see the sketch below). Then third-party frameworks could provide composable value generators that test authors could use to generate parameters. The other half would be the property definition, which would stop and reduce the failing input to the simplest failing case and then report the simplified failure case.
This would make property testing a lot more accessible to people than what currently exists in the ecosystem.
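
A rough sketch of what such a ParameterProvider could look like (every name here is hypothetical; nothing like this exists in swift-testing today):

// Hypothetical protocol; not part of swift-testing.
protocol ParameterProvider {
    associatedtype Value
    // Produce the inputs to feed into a parameterized test.
    func makeParameters() -> [Value]
    // Step toward a simpler failing input; nil when no simpler case exists.
    func shrink(_ failing: Value) -> Value?
}

// Example provider: random integers that shrink toward zero.
struct RandomInts: ParameterProvider {
    func makeParameters() -> [Int] {
        (0..<100).map { _ in Int.random(in: -1_000_000...1_000_000) }
    }
    func shrink(_ failing: Int) -> Int? {
        failing == 0 ? nil : failing / 2
    }
}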

5 Likes

I created a swift.org account just so I could come here and say… YAY! YAY! :star_struck:

Items on my wishlist you've already tackled:

  • @Test annotation for discovery and control :white_check_mark:
  • natural language description of test :white_check_mark:
  • parameterized tests! :white_check_mark:

A fulfilled wish I didn't know I had:

  • Avoid early instantiation of all test cases at once, which creates so much confusion for Swift devs around test property lifetime. Instantiate for single test case execution instead :white_check_mark:

Here are my further wishes…

Naming

  • I'd like to avoid creating a function name. Having to name foodAvailable is cognitive overhead when we've just provided the name "The Food Truck has enough burritos".

Assertions

  • A way to provide custom #expect statements. That is, I want to write helper assertions.
  • In addition to x == y where the order doesn't matter, some way to write assertions that clearly identify the expected value vs. the actual value.
  • I love composable matchers (Hamcrest) for their power. But I think most folks prefer AssertJ-style "fluent" matchers for left-to-right reading (see the sketch after this list).
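
A toy illustration of the fluent style (a hypothetical wrapper, not a proposed API; a real version would record a test issue instead of printing):

// Hypothetical AssertJ-style fluent wrapper.
struct Expectation<T: Equatable> {
    let actual: T

    @discardableResult
    func toEqual(_ expected: T) -> Self {
        if actual != expected {
            // A real implementation would record a test issue here.
            print("expected \(expected) but got \(actual)")
        }
        return self
    }
}

func expect<T: Equatable>(_ actual: T) -> Expectation<T> {
    Expectation(actual: actual)
}

// Reads left to right, and makes expected vs. actual unambiguous:
expect(2 + 2).toEqual(4)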

Test Runner

  • Have the test runner remember the last failing test suites (and the failing tests within those suites) and run them first. This is really helpful for faster feedback when running all tests.
  • Similarly, keep track of test times. After running any previously failing tests, run the fastest test cases (and suites).
  • Randomized XCTest order never got the ability to re-run with a specified seed. Would love to see, "Gosh, this randomized test run failed. It says it used seed BLAH. Here, let's re-run with seed BLAH to reproduce."

30 Likes

I have the following pain-points for your consideration:

  • Networking - As others have said, testing network code is painful, and I would love to see a generalized solution for this somehow. I know architectural decisions have a fair bit of impact here, so it may be difficult to please everyone, but it should be something you think about from the earliest planning stages, IMO. (My current solution involves using Mocker to mock URLSession responses; see the sketch after this list.)
  • Speed - This is a pretty big issue to me. When the tests take longer than 5-10 seconds, I tend not to run them until I modify them, because that's time spent just staring at the screen and not actually doing anything. Compile time is also pretty important.
  • UI Testing - At least partly because of its relation to speed, UI testing is a huge pain for me right now. I experimented with a few UI testing methodologies, and my current solution is using pointfree's swift-snapshot-testing. Unfortunately, while it's very easy to add new tests, running them takes too long. I've had to separate out the UI tests from the more general unit tests, and I only run the UI tests once or twice a week (or when I make UI changes). And notably, they also do not run in CI. (I have about 260 screen comparisons, and the UI tests take 90 seconds on my MacBook Pro with an M1 Max and 64 GB of RAM.) If these tests were faster, it would be fantastic to be able to run them in CI at least.
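
On the networking point: for context, here's the standard URLProtocol-based stubbing pattern that libraries like Mocker build on (a minimal sketch, not Mocker's actual code):

import Foundation

// A URLProtocol subclass that intercepts every request and returns canned data.
final class StubURLProtocol: URLProtocol {
    static var stubbedData = Data()

    override class func canInit(with request: URLRequest) -> Bool { true }
    override class func canonicalRequest(for request: URLRequest) -> URLRequest { request }

    override func startLoading() {
        let response = HTTPURLResponse(url: request.url!, statusCode: 200,
                                       httpVersion: nil, headerFields: nil)!
        client?.urlProtocol(self, didReceive: response, cacheStoragePolicy: .notAllowed)
        client?.urlProtocol(self, didLoad: Self.stubbedData)
        client?.urlProtocolDidFinishLoading(self)
    }

    override func stopLoading() {}
}

// Install it on a session configuration so no real network traffic occurs.
let configuration = URLSessionConfiguration.ephemeral
configuration.protocolClasses = [StubURLProtocol.self]
let session = URLSession(configuration: configuration)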

I did scan through the vision document and didn't see any reference to network testing or UI testing. I realize these may be more specific than what you're aiming for, but I also feel that any new general-purpose testing framework has to address these current pain points.

Finally, I'll also second this:

I mostly leave randomization out of tests for this reason. It would be great to have access to the overall test seed for use in your actual test functions as well! (In the meantime, a seedable generator is the usual workaround; see the sketch below.)
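
A minimal sketch of that workaround, using the SplitMix64 algorithm (a common pattern, not a swift-testing feature):

// A deterministic RandomNumberGenerator (SplitMix64) so a failing
// randomized test can be reproduced from its seed.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64

    init(seed: UInt64) { self.state = seed }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Same seed, same sequence; log the seed on failure and replay it.
var rng = SeededGenerator(seed: 42)
let value = Int.random(in: 0..<100, using: &rng)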

Thanks for taking our concerns/needs into consideration, it's very much appreciated!

3 Likes

Very glad to see this, having recently been bitten by several of the quirks of XCTest mentioned in the vision document. It's definitely the right time for a modern Swift testing library.

I'm also very glad to see that the plan is to use custom type metadata for test discovery; I think a lot of large codebases would benefit from the ability to define their own versions of the @Test macro that handle specific use cases, like ensuring that all snapshot tests are automatically parameterized for multiple screen sizes, localizations, etc.

A few things I'd like to suggest:

Allow for dynamic test parameterization

From the code samples I've seen so far, test authors have to know at compile time what the parameters of a test are. But sometimes it's impractical to gather this information. For example, I work in a codebase with a lot of feature flags. This can lead to problems with unit tests; often, engineers forget to test a flag, or are unaware that their code is calling library code that is flagged. A project I've been hoping to tackle at some point is the ability to schedule a re-run of a test whenever it "discovers" a new flag (where discovery means "the code under test attempting to access it"), with that flag set to a different value.

The only way I could do this in XCTest would be with a closure-based helper or some other form of macro:

func test_thing_with_many_flags() {
    testingAllFlaggedCodePaths {
        // `myFlag`, and various other flags I don't know about,
        // might be read during this run.
        let result = myFunction()
        if myFlag.isOn {
            XCTAssertEqual(result, something)
        } else {
            XCTAssertEqual(result, somethingElse)
        }
    }
}

It would be great if there were APIs to communicate back to the test runner that could be integrated into libraries (under #if DEBUG, of course) to make this possible.

Add the ability for tests to specify the way IDEs should show results to the user

Many test failures are difficult or impossible to describe in words. For example, many iOS codebases use tests to verify view hierarchies by producing a bitmap image, which is then compared to a pre-recorded reference. A test of the Accessibility/VoiceOver attributes of a view hierarchy might want to produce an image calling out a specific section of the hierarchy that is user-interactable but has no accessibilityLabel set.

Today, tests are limited to output like "reference images not equal; run the command imagediff path-to-failure path-to-reference".

It would be cool if this test library defined a protocol for locating and invoking something like a macOS Quick Look preview, but for unit tests, which vendors could implement in their IDEs. The most basic implementation would be something similar to LLDB's debugQuickLookObject(), which simply produces a bitmap. A more sophisticated system might allow the IDE to define what type of object it expects to receive: examples might include HTML in an IDE built with Electron, or an executable that produces a SwiftUI view hierarchy in an IDE built for macOS with the native stack.
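
A bare-bones sketch of what such a hook could look like (entirely hypothetical; no such protocol exists in swift-testing):

import Foundation

// Hypothetical protocol for attaching a rich preview to a test failure.
protocol TestResultPreviewable {
    // A content type (e.g. a UTI) the IDE can use to pick a renderer.
    var previewContentType: String { get }
    // The raw payload: PNG bytes for an image diff, HTML for a report, etc.
    var previewData: Data { get }
}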

11 Likes

There doesn't seem to be much mention of UI tests in the vision doc.

UI testing feels a bit like a second class citizen now: You have to create a separate target for it, test recording is often sketchy, and the tests are brittle.

Is improving UI testing a separate project entirely?

3 Likes

I would like to second this. beforeEach blocks can easily cause confusion because it can be difficult to keep track of what state your test is in before it executes. I've seen many Quick test suites where the nesting gets two or three levels deep, with multiple beforeEach blocks far away from the test. I think it's far better to avoid nesting completely and, when you have common setup code, to create a function for it and call it explicitly at the beginning of your test.

5 Likes

Amazing! And it is pretty!

So my 3.5-year-old wish is now almost fulfilled!

(Screenshots: yarn's test output alongside swift-testing's current output.)

Would be cool if the terminal could update and replace the line "Test X started" with "Test X passed", since it's no longer relevant to see "Test X started" at that point. Is that something you have considered?

Also, I think it's a missed visual-aid opportunity not to use a different color/symbol/highlighting for a passed suite, like the "PASS" badge yarn shows. (A sketch of the line-replacing technique follows below.)
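
For what it's worth, the underlying terminal technique is just ANSI escape codes (a generic sketch, not swift-testing's output code):

// Print a status line, then move the cursor up, erase, and rewrite it.
print("Test X started")
// ... run the test ...
let up = "\u{1B}[1A"      // move the cursor up one line
let erase = "\u{1B}[2K"   // erase that entire line
print("\(up)\(erase)\u{1B}[32mTest X passed\u{1B}[0m")  // rewrite it in green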

10 Likes

One downside of the new #expect macro: when testing that a throwing initializer (or a throwing function not marked @discardableResult) does not throw, we need to insert _ =, which we did not need before. Not a huuuuuge issue, but still something that is less elegant than it was before, IMO. That is, we have to write:

import K1
import Testing

@Suite("PrivateKey Generation")
struct PrivateKeyGenerationTests {
	
	@Test
	func testGenerationWorks() throws {
		#expect(throws: Never.self) { _ = try K1.ECDSA.PrivateKey() }
	}
}

But I wanna write just #expect(throws: Never.self) { try K1.ECDSA.PrivateKey() }

1 Like

One very important config is missing: continueAfterFailure. Or at least I found no mention of it in MigratingFromXCTest.md.

The behaviour of swift-testing corresponds to continueAfterFailure = true. This prevents us from writing tests that, e.g., use a subscript to read elements out of a collection whose count we want to assert/expect first.

i.e.

@Test("skywalkers")
func skywalkers() {
	struct Person {
		let givenName: String
		let familyName: String
	}
	let siblings: [Person] = [
		"Luke",
		// "Leia" // OPS accidentally commented out!
	].map {
		Person(givenName: $0, familyName: "Skywalker")
	}
	
	#expect(siblings.count >= 2)
	
	// convenient to access with subscript since we have already asserted count >= 2
	let luke = siblings[0]
	let leia = siblings[1]
	#expect(luke.givenName == "Luke")
	#expect(leia.givenName == "Leia")
}

The above will result in a fatal error:

􀢄  Test "skywalkers" recorded an issue at APITest.swift:21:2: Expectation failed: (siblings.count → 1) >= 2
Swift/ContiguousArrayBuffer.swift:600: Fatal error: Index out of range
error: Exited with signal code 5

This is not what we want, since it prevents further tests from being run.

So I propose:

  1. Either we ensure we have an equivalent of continueAfterFailure in swift-testing, or
  2. We make #expect return a Bool instead of Void. In fact, this might be the more powerful option anyway, since it gives a little bit more control. This would allow us to write a guard like so:
    guard #expect(siblings.count >= 2) else { return }. Or one might want to use an alternative macro, for a better failure message, like so:

#expect(guard: siblings.count >= 2), which would expand to guard ... else { return } and record a similar issue to a failing #expect, but with additional info about the test returning before it completed.

3 Likes

The analogous API in swift-testing to XCTest’s continueAfterFailure is another macro which is similar to #expect but stricter, spelled #require. Both #expect and #require are considered expectations in our terminology, and they both accept expressions, but their failure handling behavior is different:

  • If an #expect fails, it records the failure but does not throw an error or halt further execution of the currently-running test.
  • If a #require fails, it records the failure and throws a special Error type, which halts further execution of the test.

So the following in XCTest:

override func setUp() {
  self.continueAfterFailure = false
}

func testXYZ() {
  XCTAssertEqual(1, 2)
}

is roughly analogous to the following in swift-testing:

@Test func xyz() throws {
  try #require(1 == 2)
}

We think #require has several benefits: it's clearer at each usage site how a failure at that particular location will be handled, and the test author can decide on a case-by-case basis whether a particular failure ought to skip the remainder of the test. Its implementation also avoids using exceptions for control flow, which the Objective-C-based version of XCTest relies on and which does not work reliably with Swift concurrency.
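
(As an aside, #require also doubles as a throwing optional-unwrapper in the current sources, which composes nicely with this halt-on-failure behavior:)

import Testing

@Test func unwrapExample() throws {
  let numbers = [1, 2, 3]
  // If `first` were nil, #require would record an issue and throw,
  // halting the rest of the test.
  let first = try #require(numbers.first)
  #expect(first == 1)
}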

Thanks for pointing out that this information is missing from MigratingFromXCTest.md — I’ll file an Issue to track mentioning that there. (Edit: Filed this as #18).

3 Likes

Btw, I found a bug related to using a throwing expression on the RHS of #expect; I created an issue.

TL;DR:

This works (as expected):


func zero() throws -> Int { 0 }

@Test
func works_LHS_throws() throws {
	try #expect(zero() == 0)
}

But if I use the throwing expression on the RHS of ==, it does not compile, which I find strange; I believe it is a bug:

@Test
func does_not_work_RHS_throws_outer_try() throws {
	try #expect(0 == zero())
}

2 Likes

I've made a POC of a "progress" indicator; see the DRAFT PR, and here is the tiny POC project using the source branch of that PR.

I've implemented it so that the test runner emits a "tick", and I replace the "Test foo started." line with an animated ":first_quarter_moon: Test foo running." message.

(GIF: the animation running at 50 fps.)

What do you think? Could be pretty nice for long-running tests?!

21 Likes