[Pitch] Custom reflection during testing

Since we introduced Swift Testing, Xcode has had the ability to break down
expressions produced by an expectation when it fails. These breakdowns can help
test authors quickly figure out exactly why a test failed. For example, given
the following test:

struct MonsterTruck: Equatable {
  var color: Color
  var numberOfWheels: Int
}

@Test func `Monster trucks`() {
  let crushinator = MonsterTruck(color: .red, numberOfWheels: 4)
  let truckasaurusRex = MonsterTruck(color: .green, numberOfWheels: 5)
  #expect(crushinator == truckasaurusRex)
}

Xcode provides a breakdown of the operands to the failed == comparison
(crushinator and truckasaurusRex in this example). As of recent Swift 6.4
development toolchains, the console output from swift test includes a similar
breakdown. For the test above, test authors will now see something like:

◇ Test "Monster trucks" started.
✘ Test "Monster trucks" recorded an issue at [...]: Expectation failed: crushinator == truckasaurusRex
↳ crushinator == truckasaurusRex → false
↳   crushinator → MonsterTruck(color: Color.red, numberOfWheels: 4)
↳     color → .red
↳     numberOfWheels → 4
↳   truckasaurusRex → MonsterTruck(color: Color.green, numberOfWheels: 5)
↳     color → .green
↳     numberOfWheels → 5
✘ Test "Monster trucks" failed after 0.005 seconds with 1 issue.

I propose adding a customization point to Swift Testing, a new protocol called CustomTestReflectable, to let developers customize this output.

Read the full proposal here.

Trying it out

To try this feature out, add a dependency on the main branch of swift-testing to your package:

...
dependencies: [
  ...
  .package(url: "https://github.com/swiftlang/swift-testing.git", branch: "main"),
]

Then, add a target dependency to your test target:

.testTarget(
  ...
  dependencies: [
    ...
    .product(name: "Testing", package: "swift-testing"),
  ]
)

Finally, import Swift Testing using @_spi(Experimental) import Testing.

7 Likes

Yes please!

1 Like

In the past there was some interest in taking inspiration from the tools in our CustomDump library to provide nicer test failure messages. Is this still the case?

As an example, the currently proposed test failure message:

Expectation failed: crushinator == truckasaurusRex
↳  crushinator == truckasaurusRex → false
↳    crushinator → MonsterTruck(color: MyLibraryTests.MonsterTruck.Color.red, numberOfWheels: 4)
↳      color → .red
↳      numberOfWheels → 4
↳    truckasaurusRex → MonsterTruck(color: MyLibraryTests.MonsterTruck.Color.green, numberOfWheels: 5)
↳      color → .green
↳      numberOfWheels → 5

…becomes this:

Issue recorded: Difference: …

    MonsterTruck(
  −   color: .red,
  +   color: .green,
  −   numberOfWheels: 4
  +   numberOfWheels: 5
    )

(First: −, Second: +)

The difference is more pronounced when a type has more fields. Taking a real-world example from one of our demos, this test failure:

↳  Row(player: Player(id: UUID(-3), gameID: UUID(-1), name: "Blob Jr", score: 2)) == Row(player: Player(id: UUID(-3), gameID: UUID(-1), name: "Blob Jr.", score: 2)) → false
↳    Row(player: Player(id: UUID(-3), gameID: UUID(-1), name: "Blob Jr", score: 2)) → Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr", score: 2, nickname: "", recordHighScore: 0, age: 0, credits: 0), imageData: nil)
↳      player → Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr", score: 2, nickname: "", recordHighScore: 0, age: 0, credits: 0)
↳        id → 00000000-0000-0001-0000-000000000003
↳        gameID → 00000000-0000-0001-0000-000000000001
↳        name → "Blob Jr"
↳        score → 2
↳        nickname → ""
↳        recordHighScore → 0
↳        age → 0
↳        credits → 0
↳      imageData → nil
↳    Row(player: Player(id: UUID(-3), gameID: UUID(-1), name: "Blob Jr.", score: 2)) → Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr.", score: 2, nickname: "", recordHighScore: 0, age: 0, credits: 0), imageData: nil)
↳      player → Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr.", score: 2, nickname: "", recordHighScore: 0, age: 0, credits: 0)
↳        id → 00000000-0000-0001-0000-000000000003
↳        gameID → 00000000-0000-0001-0000-000000000001
↳        name → "Blob Jr."
↳        score → 2
↳        nickname → ""
↳        recordHighScore → 0
↳        age → 0
↳        credits → 0
↳      imageData → nil

…becomes this:

Issue recorded: Difference: …

    Row(
      player: Player(
        id: UUID(00000000-0000-0001-0000-000000000003),
        gameID: UUID(00000000-0000-0001-0000-000000000001),
  −     name: "Blob Jr",
  +     name: "Blob Jr.",
        score: 2,
        nickname: "",
        recordHighScore: 0,
        age: 0,
        credits: 0
      ),
      imageData: nil
    )

(First: −, Second: +)

Further, we can expand the nice printing of failure messages to more complex data structures, like arrays and dictionaries. So, a test failure like this:

↳  lhs == rhs → false
↳    lhs → [MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000002, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Sr", score: 3), imageData: nil), MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr", score: 2), imageData: nil), MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000001, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob", score: 1), imageData: nil)]
↳    rhs → [MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000002, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Sr", score: 3), imageData: nil), MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000003, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob Jr.", score: 2), imageData: nil), MyLibraryTests.Row(player: MyLibraryTests.Player(id: 00000000-0000-0001-0000-000000000001, gameID: 00000000-0000-0001-0000-000000000001, name: "Blob", score: 1), imageData: nil)]

…becomes this:

Issue recorded: Difference: …

    [
      [0]: Row(
        player: Player(
          id: UUID(00000000-0000-0001-0000-000000000002),
          gameID: UUID(00000000-0000-0001-0000-000000000001),
  −       name: "Blob Sr",
  +       name: "Blob Sr.",
          score: 3
        ),
        imageData: nil
      ),
      … (2 unchanged)
    ]

(First: −, Second: +)

I also think it'd be nice for this protocol to live outside of Swift Testing. That would make it possible to expose private/internal data to the error messages, and would pave the way for better debugging tools in non-testing environments.

6 Likes

Short answer: yes.

Longer answer: we must walk before we can run. We need to build out the underlying infrastructure first.

The necessary protocol already exists in the form of CustomReflectable. This pitch covers a protocol of the same nature for generating reflections of values specifically for testing.
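For readers less familiar with CustomReflectable, here is a minimal sketch of how the existing standard-library protocol shapes a value's mirror. It uses only the standard library; the Account type and its fields are hypothetical and chosen purely for illustration.

```swift
// Hypothetical type for illustration; only standard-library APIs are used.
struct Account: CustomReflectable {
  var id: Int
  var name: String
  var passwordHash: String

  // Expose only the members worth showing and leave passwordHash out.
  var customMirror: Mirror {
    Mirror(self, children: ["id": id, "name": name], displayStyle: .struct)
  }
}

let account = Account(id: 3, name: "Blob Jr", passwordHash: "abc123")

// Mirror(reflecting:) picks up customMirror for CustomReflectable types.
let labels = Mirror(reflecting: account).children.compactMap { $0.label }
print(labels)
```

Because customMirror affects every consumer of Mirror (including dump and debuggers), a testing-specific counterpart lets a type present itself differently in test output without changing its reflection elsewhere.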

In our library we found that the additional Custom…Reflectable protocol wasn't quite enough, and there needed to be more variation in printing values to have the nicest possible formatting. That's why we have CustomDumpRepresentable and CustomDumpStringConvertible. I don't think the additional protocols are necessary, but it would be nice to consider the problems they are solving.

Yes, but how would one expose private/internal members for test failures if you can only conform to the protocol in test targets?

// MyModule
struct MyStruct {
  let id: UUID
  private var isFlagged = false
}

// MyModuleTests
extension MyStruct: CustomTestReflectable {
  var customTestMirror: Mirror {
    Mirror(
      self,
      children: [
        (label: "id", value: id as Any),
        (label: "isFlagged?", value: isFlagged),  // 🛑
      ]
    )
  }
}

It wouldn't be appropriate to override a type's customMirror just to get some nicer test formatting in place. That is why it's nice to have this additional layer, CustomTestReflectable, but ideally in a non-testing context.

And in general, the entire pretty diff printer apparatus would ideally be accessible outside of testing since it's not a purely testing concern.

1 Like

Swift Testing already has a CustomTestStringConvertible protocol that serves that purpose, unless I'm misunderstanding.

A developer in this bind could write the following in their primary target:

extension MyStruct {
  package var customTestMirror: Mirror {
    // ...
  }
}

And then in their test target:

extension MyStruct: CustomTestReflectable {}

The logic for printing the diff in Swift Testing is going to be testing-specific simply because the logic for printing test output is not applicable to the general case. The logic for constructing something like a ValueDifference<T> structure is certainly something I'd love to see in the standard library.
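To make the ValueDifference<T> idea a little more concrete, here is a rough sketch of member-wise differencing built on Mirror. Everything in it is an assumption for illustration: MemberDifference and differences(_:_:path:) are hypothetical names, not part of Swift Testing or the standard library, and comparing leaves by their string descriptions is a simplification rather than a proposed design.

```swift
// Hypothetical, simplified stand-in for the ValueDifference<T> idea.
struct MemberDifference: Equatable {
  var path: String
  var first: String
  var second: String
}

// Walks two values in parallel via Mirror and records the leaves that differ.
func differences(_ lhs: Any, _ rhs: Any, path: String = "self") -> [MemberDifference] {
  let lhsChildren = Array(Mirror(reflecting: lhs).children)
  let rhsChildren = Array(Mirror(reflecting: rhs).children)

  // Leaves, and values whose shapes do not match, are compared by description.
  guard !lhsChildren.isEmpty, lhsChildren.count == rhsChildren.count else {
    let l = String(reflecting: lhs)
    let r = String(reflecting: rhs)
    return l == r ? [] : [MemberDifference(path: path, first: l, second: r)]
  }

  // Otherwise recurse into each pair of children, extending the path.
  var result: [MemberDifference] = []
  for (index, (l, r)) in zip(lhsChildren, rhsChildren).enumerated() {
    let label = l.label ?? "[\(index)]"
    result += differences(l.value, r.value, path: "\(path).\(label)")
  }
  return result
}

// Usage with a type shaped like the one in the pitch.
enum Color { case red, green }
struct MonsterTruck {
  var color: Color
  var numberOfWheels: Int
}

let diffs = differences(
  MonsterTruck(color: .red, numberOfWheels: 4),
  MonsterTruck(color: .green, numberOfWheels: 5)
)
for diff in diffs {
  print("\(diff.path): \(diff.first) vs \(diff.second)")
}
```

A structure like this carries no formatting decisions, which is the part that could plausibly live outside a testing library; how the differences are rendered stays with whoever prints them.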

Yes, but ideally without this extra nesting:

↳        gameID → Game.ID(00000000-0000-0001-0000-000000000001)
↳          rawValue → 00000000-0000-0001-0000-000000000001

This is what is printed when defining a wrapper struct ID for an identifiable type.

1 Like

We can reasonably suppress mirroring in cases where, for example, T.self is RawRepresentable.Type && !T.hasCustomTestDescription. Conformance to CustomTestReflectable would probably imply that we should capture the full mirror of the value.
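As a rough sketch of that check, the function below decides whether to expand a value's children. It is not Swift Testing's implementation: shouldExpandChildren(of:) is a hypothetical name, and since hasCustomTestDescription is not a real API, conformance to CustomStringConvertible is used here as an assumed stand-in.

```swift
// Hypothetical helper; mirrors the condition described above.
func shouldExpandChildren(of value: Any) -> Bool {
  let isRawRepresentable = value is any RawRepresentable
  // Assumed stand-in for "has a custom test description".
  let hasCustomDescription = value is any CustomStringConvertible
  // Suppress mirroring only for raw-representable wrappers without a custom description.
  return !(isRawRepresentable && !hasCustomDescription)
}

// A wrapper ID type like the one that produced the extra rawValue line.
struct GameID: RawRepresentable {
  var rawValue: String
}

struct Game {
  var id: GameID
  var title: String
}

let wrapper = GameID(rawValue: "00000000-0000-0001-0000-000000000001")
let game = Game(id: wrapper, title: "Blob Ball")

print(shouldExpandChildren(of: wrapper))
print(shouldExpandChildren(of: game))
```

Under these assumptions the wrapper is printed on one line while ordinary structs are still broken down member by member.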