How to get accurate test duration measurements for Testing and XCTest?

Hello,

I'm currently working on a project that compares test performance between XCTest and Testing. However, I've found myself a bit stuck, specifically on how to ensure the results are as accurate as possible.

To provide a bit more context, here are a few approaches I initially wanted to take, although they didn’t work out as expected. First, each project was intended to include a Package.swift file with a structure resembling the following:

let package = Package(
  ...,
  targets: [
    ...,
    .testTarget(
      name: "FooTests",
      dependencies: [...]
    ),
    .testTarget(
      name: "FooXCTests",
      dependencies: [...]
    ),
    ...
  ]
)

Here FooTests and FooXCTests would use Testing and XCTest respectively, without mixing the two in the same target, and would "share" a set of tests, each written using the framework's native APIs (where possible).

I considered the approach below for getting time measurements with the swift test command, which works great for Testing but not so great for XCTest.

swift test \
--filter "FooXCTests|...|..XCTests" \
--enable-xctest \
--disable-swift-testing \
--xunit-output ./output.xml

After running the above command, no output file was generated (I explicitly passed --disable-swift-testing to avoid generating an empty report for Swift Testing). Looking at SwiftTestCommand.swift, I found that the report is only written when the --parallel flag is passed. At this point (maybe wrongly) I started thinking that xcodebuild would be the better approach, since it would always let me test the projects, whereas swift test would require them to support macOS as well as parallel testing.
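
A minimal sketch of the --parallel variant mentioned above, wrapped in a shell function. The function name run_xctest_report and its two arguments are hypothetical. Adding --num-workers 1 is an assumption: the idea is to satisfy the --parallel requirement while keeping tests from actually running concurrently, and its effect on the reported durations has not been verified here.

```shell
# Hypothetical helper: run only the XCTest targets matching a filter and
# write an xUnit report. --parallel is passed because the report is only
# written for XCTest when that flag is present.
run_xctest_report() {
  filter=$1   # e.g. "FooXCTests"
  output=$2   # e.g. ./output.xml
  swift test \
    --filter "$filter" \
    --enable-xctest \
    --disable-swift-testing \
    --parallel \
    --num-workers 1 \
    --xunit-output "$output"
}
```

It would be called as: run_xctest_report "FooXCTests" ./output.xml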

My next idea was to abandon swift test and use xcodebuild. That is not much of a refactor: I could simply create two test plans, each containing only the test targets written with either Testing or XCTest.

xcodebuild \
-scheme swift-foo \
-destination "platform=macOS,arch=arm64,name=My Mac" \
-testPlan foo-swift-testing \
-resultBundlePath ./output.xcresult \
-derivedDataPath ./DerivedData/ \
clean \
test

Another command, same story :). This works like a charm, except when the test plan contains only tests written using Testing. In that case, the output.xcresult file does not seem to capture valid test durations: they are either missing or always reported as one second.

Taking these approaches and their results into account (neither command produces reliable results), would it be valid to capture the test run name and duration directly from stdout (e.g. from xcodebuild) instead of, for example, using the .xcresult file? Or are there perhaps better methods to extract this information?

Thank you for taking the time to read this. I would greatly appreciate any insights on how to approach this situation.

Be aware: parallelized testing is, fundamentally, incompatible with performance testing and test timing, because if you have two tests contending on the same resource (which may be a lock, a file, the CPU itself…), then one test will slow down the other.

Swift Testing generally takes less wall-clock time than XCTest to run the same tests because it can schedule multiple tests on different cores, and it runs those tests in a single process (whereas XCTest must spin up multiple processes to run them in parallel, and that's expensive.) However, the actual time each takes is heavily dependent on the tests themselves and what they do, so YMMV.

--xunit-output, when used with XCTest, is implemented up in Swift Package Manager and is indeed dependent on --parallel being passed. I've got a plan to fix that, but it's not a simple one and will take time to implement.

If you want to gather timing information for your tests, the XML output will give you misleading results because of the contention issue I mentioned earlier. It may be simpler to just measure the total wall-clock time for your test run(s).
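
Measuring the total wall-clock time of a run could look roughly like the sketch below. time_run is a hypothetical helper, and the whole-second resolution of date +%s is assumed to be good enough for complete test runs.

```shell
# Hypothetical helper: run a command, discard its output, and print the
# elapsed wall-clock time in whole seconds. The command's exit status is
# passed through so failed runs can be told apart.
time_run() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s)
  echo "$((end - start))"
  return "$status"
}

# Usage (not executed here):
#   time_run swift test --enable-xctest --disable-swift-testing
```

Repeating the call several times and recording each printed value gives a set of samples per framework that can be compared afterwards.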

I believe this issue (all tests claiming to take one second) has been reported to Apple already and is a bug in Xcode. I don't have more information about it.

Perhaps we can take a step back: what information are you actually trying to gather and compare?

I totally missed that point - thanks. I assumed that running a test plan for, e.g., 100 iterations (clean builds) on a relatively resource-unconstrained machine would give me somewhat accurate results. However, I forgot that the differences arise not only from executing tests via xcodebuild vs. swift test, but also from fundamental differences in how the two frameworks work.

What I’m trying to achieve is not to measure test performance myself, but rather to capture the duration of test runs and the individual test cases within them. This will be done across multiple projects and machines. I’m just curious how the new Testing framework compares to the old XCTest. Using the information mentioned earlier, we could calculate median values, distributions, and outliers, and see how parallelization affects performance. It’s simply an empirical experiment, which I want to make as accurate as possible, taking into account the differences between these frameworks.
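
As an illustration of the median step, here is a minimal sketch that reads one duration per line from standard input. The function name median is hypothetical, and the input format (plain numbers, one per line) is an assumption about how the samples would be stored.

```shell
# Hypothetical helper: print the median of the numbers read from stdin
# (one per line). Exits with status 1 if there is no input.
median() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) print v[(NR + 1) / 2]
      else print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}
```

For example, durations collected into a file could be summarized with: median < durations.txt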

I don't need benchmark-accurate timing, but it would be nice if I could at least get individual test timing, rather than the overall test timing of the whole batch. Mainly I just need to see if tests are unexpectedly slow. e.g. I don't care about 0.2 vs. 0.3s, I care about 0.2 vs. 2s.
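
A rough sketch of that kind of check against an xUnit report, listing only tests at or above a threshold. slow_tests is a hypothetical helper, and it assumes each testcase element sits on a single line with a name attribute followed by a time attribute; a report laid out differently would need a real XML parser.

```shell
# Hypothetical helper: print "<seconds> <test name>" for every testcase in
# an xUnit report whose time is at or above the given threshold.
# Assumes one <testcase ...> element per line, with name before time.
slow_tests() {
  threshold=$1   # seconds, e.g. 1
  report=$2      # path to the XML report
  grep '<testcase ' "$report" |
    sed -n -E 's/.*[[:space:]]name="([^"]*)".*[[:space:]]time="([^"]*)".*/\2 \1/p' |
    awk -v t="$threshold" '$1 + 0 >= t + 0 { print }'
}
```

With a threshold of 1, a test reported at 0.2 seconds is skipped and one reported at 2.0 seconds is listed.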

@Jon_Shier I'm prototyping the necessary runtime changes here.
