[Pre-Pitch] Data-Dependent Test Serialization

Hi folks! I'm prototyping a new feature for Swift Testing and could use feedback from the community!

Background

When writing test suites over complex codebases, especially those that have dependencies on other languages like C, you may find your tests touch shared mutable state that isn't guarded by Swift concurrency primitives such as actors. Canonical examples of such state are the standard I/O streams and the environment block, either of which can be mutated by any thread at any time outside the control of Swift's structured concurrency.

We generally recommend that you rewrite your Swift code, where possible, to avoid using such state. If you cannot avoid it, Swift Testing offers the .serialized trait, which tells Swift Testing to run the tests and test cases it contains one at a time. You can apply this trait to your test suites and parameterized tests:

@Suite(.serialized)
struct `Environment tests` {
  @Test func peek() {
    let ev = getenv("ENVVAR")
    ...
  }

  @Test func poke() {
    setenv("ENVVAR", "VALUE", 1)
    ...
  }

  ...
}

This works in contexts where all the tests that might touch such state are closely related and can live in a single suite. But what if you have some tests that rely on the environment block, some that rely on stdout, and some that rely on both? :scream:

Today, the only way to ensure that all these tests run serially is to nest them all in a single suite marked .serialized. This can limit your ability to structure your tests to your liking and, worse, may not be possible as the number of dependencies grows. This requirement has also proven to be confusing as it's not clear that .serialized only applies locally rather than globally.

Whatever shall we do, @grynspan?

I'm working on an enhancement to .serialized that I'm very tentatively calling data-dependent test serialization[1].

This feature will introduce new variants of .serialized that take a uniquely identifiable argument, currently either a key path, a non-ephemeral pointer, or a testing tag (declared elsewhere):

@Suite(.serialized(for: \ProcessInfo.environment))
struct `Environment tests` {
  ...
}

@Suite(unsafe .serialized(for: stdout))
struct `Standard I/O tests` {
  ...
}

extension Tag {
  @Tag static var changesLocale: Self
}

@Suite(.serialized(for: .changesLocale))
struct `System locale tests` {
  ...
}

At runtime, Swift Testing will serialize all tests with a given dependency. So, for example, all tests that declare a dependency on ProcessInfo.environment will be serialized regardless of where they appear in the test plan and in relation to each other. These tests will still be free to run in parallel with tests that do not declare any dependencies, or declare dependencies on unrelated values such as stdout or Locale.current.

If a test or suite declares a dependency on more than one datum:

@Suite(
  .serialized(for: \ProcessInfo.environment),
  unsafe .serialized(for: stdout),
  .serialized(for: .changesLocale)
)
struct `Environment AND standard I/O AND locale tests` {
  ...
}

Then Swift Testing will order that test or suite serially with tests that are dependent on any (or all) of said data.
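
To make the rule above concrete, here is a minimal sketch of the overlap check it implies. This is only a model for discussion, not Swift Testing's implementation: the `TestDependencies` type, its string keys, and `mustSerialize(with:)` are all hypothetical names.

```swift
// Hypothetical model: each test carries a set of dependency keys.
// Plain strings stand in for whatever identity the real feature would use.
struct TestDependencies {
    var keys: Set<String>

    /// Two tests must not overlap in time if they share at least one key.
    func mustSerialize(with other: TestDependencies) -> Bool {
        !keys.isDisjoint(with: other.keys)
    }
}

let environmentOnly = TestDependencies(keys: ["environment"])
let stdoutOnly = TestDependencies(keys: ["stdout"])
let everything = TestDependencies(keys: ["environment", "stdout", "locale"])
let independent = TestDependencies(keys: [])
```

In this model, `environmentOnly` and `stdoutOnly` may run in parallel with each other, but neither may overlap with `everything`. A test with no declared dependencies never conflicts with anything.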

For particularly complex tests where the test author cannot reasonably enumerate all their dependencies, you can also specify a wildcard dependency[2]:

@Test(.serialized(for: *))
func `Dependent on everything`() {
  ...
}

What's next?

Community feedback will be key to the success of this feature. Open questions include but are not limited to:

  • What do we name the new symbols we're introducing? We don't have much in the way of guidelines for naming traits.
  • What kinds of data do you think you'd need to pass as dependencies? Data that acts as a dependency needs to have identity in order for us to determine that two unrelated tests share it.
  • How do you see yourself using this feature? What sort of tests are you unable to write today without it, or at least without a lot of effort?
  • Should the existing .serialized trait become a synonym for .serialized(for: *)? That would mean that a test marked .serialized is serialized with respect to all other such tests rather than just those in the same suite.

  1. Naming is hard, okay? :cry: "Dependency" in particular is not a great name because dependency injection is a commonly-used pattern when writing tests, and this feature isn't directly related to that pattern. I'm happy to take ideas for better names here!

  2. Yes, this is valid Swift syntax! I suspect we would spell it differently in the final proposal. Just making sure you're paying attention! :wink:


Super excited to see this as I've been reaching for this concept quite frequently in the swift-foundation tests.

A few thoughts/answers to some of your open questions:

Data that Foundation would likely use this for would include: the current working directory, the current environment, and the current internationalization preferences (a combination of one or more of the following: the current locale, the current calendar, the current timezone, the default timezone).

In Foundation I see us using this to guard tests that depend on the global state mentioned above. For example, FileManager-related tests (which temporarily set the CWD to a temporary directory) need to be serialized, ProcessInfo-related tests which mutate the current environment need to be serialized, and FormatStyle-related tests need to be serialized. Today, we perform all of this serialization via a global actor with a utility async function that enqueues the test work on that global actor, but having this built into a test trait would make this much more manageable.
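
For readers unfamiliar with that workaround, here is a rough sketch of the general shape of a global-actor helper. The names `GlobalStateActor` and `withSerializedGlobalState` are made up for illustration and are not swift-foundation's actual helpers. The sketch also assumes the test body is synchronous: an async body could suspend and let other work interleave on the actor.

```swift
// Hypothetical global actor used purely as a mutual-exclusion domain.
@globalActor
actor GlobalStateActor {
    static let shared = GlobalStateActor()
}

/// Runs `body` on the global actor. Because `body` is synchronous, two calls
/// to this function can never execute their bodies at the same time.
func withSerializedGlobalState<T: Sendable>(
    _ body: @GlobalStateActor @Sendable () throws -> T
) async rethrows -> T {
    try await body()
}
```

A test would wrap its accesses to the shared state in a call such as `await withSerializedGlobalState { ... }`. Every test author has to remember to do this by hand, which is the bookkeeping a built-in trait would remove.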

Personally I think it might differ depending on whether it's a test trait or a suite trait. For a single test annotated with .serialized, I think it unambiguously means .serialized(for: *). However, a Suite marked as .serialized could mean .serialized(for: *) or it could mean that its test functions need to be serialized but it can run in parallel to other suites, so I'm not certain if the existing suite trait should be taken to mean the new interpretation.

And lastly, a few questions of my own that I had while reading this proposal:

How is this "equality of dependency" determined? Is it based effectively on KeyPath equality, or is there another mechanism to determine when dependencies are equivalent? I wonder if it might be better to allow for named dependencies (like static properties). For example, in Foundation TimeZone.default is global state that may change during a test, and any test that uses it or changes it must be serialized. Calendar.current is in a similar boat. However, Calendar.current uses TimeZone.default. So if one test depended on \TimeZone.default and another on \Calendar.current, we'd need these two tests to be serialized together. It might be better for us to define some .serialized(for: .internationalizationPreferences) dependency that encompasses the full set of (Locale|Calendar|TimeZone).(current|default|autoupdatingCurrent). Do you have any ideas for how this type of dependency might fit into this design?

cc @itingliu in case you have any thoughts on this too since we've been discussing this recently :slight_smile:


It's equality of the key path's root type. It could be equality of the full key path, but we can't decompose key paths so we can't see that e.g. the key path \ProcessInfo.environment["KEY"] is dependent on \ProcessInfo.environment, so we assume all key paths under ProcessInfo share a dependency.
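
As a rough illustration of what keying off the root type could look like (a sketch only, not necessarily how my branch does it), the standard library exposes a key path's root type through `AnyKeyPath.rootType`. The `dependencyKey(for:)` function below is a hypothetical name, and it uses `String` and `Int` key paths so that it needs no imports.

```swift
/// Hypothetical helper: derives a dependency key from a key path's root type,
/// ignoring every component after the root.
func dependencyKey(for keyPath: AnyKeyPath) -> ObjectIdentifier {
    ObjectIdentifier(type(of: keyPath).rootType)
}

let countKey = dependencyKey(for: \String.count)
let isEmptyKey = dependencyKey(for: \String.isEmpty)
let intKey = dependencyKey(for: \Int.description)
```

Here `countKey` and `isEmptyKey` compare equal because both key paths are rooted at `String`, while `intKey` differs because its root is `Int`.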

This sort of internal dependency is, of course, invisible to the testing library.

The current design doesn't have a "bespoke" overload, but you could use a key path for this purpose:

enum Internationalization {}

@Test(.serialized(for: \Internationalization.self))
func f() { ... }

I don't want to get into the pattern we have with tags where to declare a new one you have to say "tag" three times and clap your hands:

extension Tag {
  @Tag static var internationalization: Tag
}

So I'd be disinclined to add a dedicated type for this purpose. However, it would be straightforward to add an overload of serialized(for:) that takes a tag, so you could declare the above tag and then write:

@Test(.serialized(for: .internationalization))
func f() { ... }

Edit: Updated the original post and my branch to support tags in this fashion.

I think using keyPaths is a bit awkward for what could essentially be a unique identifier. It makes it seem like Swift Testing is somehow looking at your test and detecting the use of that API or something. I feel like it would raise questions like

Why is \ProcessInfo.environment["KEY"] being serialized with \ProcessInfo.environment["UNRELATED_KEY"]? They're unrelated!

Why are my tests \ProcessInfo.something, \TypealiasOfProcessInfo.something and \WrapperOfProcessInfo.something running in parallel since they refer to the same thing?

Also, I personally haven't found the Tag repetition to be cumbersome at all. Here's a completely unthought example of something I think would cover the current needs without looking too out-of-place.

// swift-testing
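import Foundation  // for UUID

public struct SerializationScope: Equatable, ExpressibleByStringLiteral, ExpressibleByArrayLiteral {
    let identities: Set<String>
    private init(identities: Set<String>) { self.identities = identities }
    // A fresh scope gets a unique identity.
    public init() { self.init(identities: [UUID().uuidString]) }
    public init(_ discriminator: String) { self.init(identities: [UUID().uuidString, discriminator]) }
    // A combined scope is the union of its parts' identities.
    public init(_ scopes: SerializationScope...) {
        self.init(identities: scopes.reduce(into: []) { $0.formUnion($1.identities) })
    }
    public init(stringLiteral value: String) { self.init(value) }
    public init(arrayLiteral elements: SerializationScope...) {
        self.init(identities: elements.reduce(into: []) { $0.formUnion($1.identities) })
    }

    public subscript(_ identifier: String) -> SerializationScope {
        SerializationScope(identities: identities.union([identifier]))
    }
}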

// Foundation_Testing
extension SerializationScope {
    static let stdin = SerializationScope()
    static let stdout = SerializationScope()
    static let stderr = SerializationScope()
    static let cd = SerializationScope()
    static let stdio: SerializationScope = [.stdin, .stdout, .stderr]

    static let environment = SerializationScope()
    static let currentLocale = SerializationScope()

    static let defaultTimezone = SerializationScope()

    private static let _calendar = SerializationScope()
    static let currentCalendar = SerializationScope(._calendar, .defaultTimezone)
}

// test targets:
extension SerializationScope {
    static let myCustomScope = SerializationScope()
    // different from the parent scope but will compare equal if recreated elsewhere with the same arguments
    static let myDerivedScope = myCustomScope["example"]
}

@Test(.serialized(.stdin, .environment["MY_KEY"], .currentLocale, .currentCalendar))
// The final set of resource access is [stdin, environment.MY_KEY, currentLocale, defaultTimezone, currentCalendar]

@Test(.serialized(.stdout, .cd))
// Can be run in parallel with the other test, would probably not have been possible with keypaths

@Test(.serialized(.stdio))
// Uses WrapperOfProcessInfo but when it fails it's obvious the mistake is ours

Thoughts?


Ah, but they're not unrelated. They access the same mutable global state (the environment block, which thanks to POSIX is a big unavoidable ball of use-after-free bugs.)

If you use completely distinct names for the same referent, there's not much we can do to help you. "Do these two completely unrelated symbols describe the same referent?" is undecidable in the general case. This problem isn't specific to key paths and would exist with any other type of label.

The fundamental problem with doing something like this is that it is too easy to accidentally instead write:

private static let stdin = SerializationScope()

And then write the same thing in another file. The two instances don't have any real identity and aren't interchangeable, so you wouldn't actually get the serialization you expect. The API needs to be designed in a way that minimizes the risk of bugs and accidental misuse.
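
To illustrate the pitfall, here is a small sketch. `Scope`, `FileA`, and `FileB` are hypothetical stand-ins: the two enums play the role of two source files that each declare their own private `stdin` scope.

```swift
import Foundation

// Hypothetical UUID-backed scope: every instance gets a fresh identity.
struct Scope: Equatable {
    let id = UUID()
}

// Stand-ins for two files that each declare "the same" scope independently.
enum FileA { static let stdin = Scope() }
enum FileB { static let stdin = Scope() }

let sameScope = (FileA.stdin == FileB.stdin)
```

`sameScope` is `false`: the two declarations share a name but not an identity, so tests keyed on one would not be serialized against tests keyed on the other.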

I didn’t know about the environment, seems obvious now that you say it! In that case you’d just have a single environment scope, then no one could get confused or misuse it!

In that case my suggestion could probably be reduced to making SerializationScope an opaque UUID wrapper. The only misuse I could think of is someone using it in a computed property, but then again it makes sense given Swift’s semantics: a computed property executes each time it’s called, producing a new distinct scope.

In your access control example I think that’s a feature not a bug: you declared it private and it’s not recognized as the same scope. Makes sense given Swift’s semantics for private. It would be confusing if it somehow ignored access control.

I just feel like using keypaths while ignoring all but the root type is confusing, I’d be much more comfortable if only the type identity was provided.

enum Internationalization: SerializationScope {}
@Test(.serialized(for: Internationalization.self))

This essentially does the same thing except it uses the type as its identity rather than a UUID. It’s simple, extensible and greatly reduces the chance of misuse.

I’d probably vouch for a protocol like SerializationScope just so you can’t accidentally pass framework types unless intended.
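
As a minimal sketch of that idea (all names here are hypothetical, and `scopeKey(for:)` is not an existing Swift Testing API), a marker protocol plus `ObjectIdentifier` is enough to turn a type into a stable key:

```swift
// Hypothetical marker protocol: only types that opt in can act as scopes.
protocol SerializationScope {}

enum Internationalization: SerializationScope {}
enum StandardIO: SerializationScope {}

/// Uses the metatype itself as the identity of the scope.
func scopeKey<S: SerializationScope>(for scope: S.Type) -> ObjectIdentifier {
    ObjectIdentifier(scope)
}

let first = scopeKey(for: Internationalization.self)
let second = scopeKey(for: Internationalization.self)
let other = scopeKey(for: StandardIO.self)
```

`first` and `second` compare equal wherever they are computed, `other` differs, and something like `scopeKey(for: String.self)` does not compile because `String` never opted in.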

I just don’t see the added value in referring directly to framework types, and even less in specifying (ignored) keypaths.

Agreed, I wouldn't expect the Testing library to pick up on this. But I do think it is important that the API is designed in such a way that this doesn't become a common pitfall (either by making this connection visible to the testing library, or using a dependency mechanism that does not easily lead to this confusion - and the latter seems much easier).

I originally had not realized or expected this behavior at all, so I was very surprised by it and agree with @Lancelotbronner's sentiment: if the dependencies only look at the root type, should the API just take a type instead of a KeyPath, since the actual KeyPath components provide no value beyond a label that is unexpectedly ignored by the library? I think the tag solution works well (I don't have a strong opinion on whether dependencies should be a separate type or reuse existing tags). Does it make sense to mark a tag as serialized such that all tests with a specific tag are serialized amongst other tests with that tag? That's somewhat similar to the type approach but using tags (which seem slightly preferable to me over creating empty sentinel types, since creating tags is already a feature of the library).

Well the thinking was that in the future we could decompose key paths (with new API) and see these relationships. But since that's not possible, just keying off the type itself would work too, and we could add key paths in the future.

We don't want to imbue tags with narrowly-focused superpowers or their own parallel (no pun intended) trait system. They do provide us with a way to generate arbitrary, unique keys though, so composing on top of them is reasonable.

Hi Jonathan :man_mage:

Thank you for this, this would be very helpful.

BTW it almost feels like magic that such a declarative API should work. :magic_wand: