[Micro Pitch] Standard Output/Error UTF8 Lines

Motivation

Exit tests have been a wonderful addition to our testing toolbox, but the existing recommendations for validating the results of those tests fail to help the author when an expectation fails. For instance, consider the following test:

let result = await #expect(processExitsWith: .failure, observing: [\.standardErrorContent]) {
    assertionFailure("Oh no")
}
#expect(result?.standardErrorContent.contains("Oh yes".utf8) == true)

This returns Expectation failed: (result?.standardErrorContent.contains("Oh yes".utf8) → false) == true, with no indication of what was actually returned.

Proposed solution

I propose a new standardErrorUTF8Lines property on ExitTest.Result that can be accessed directly depending on the author's needs:

let result = await #expect(processExitsWith: .failure, observing: [\.standardErrorContent]) {
    assertionFailure("Oh no")
}
#expect(result?.standardErrorUTF8Lines.last == "Oh yes")

This instead returns Expectation failed: (result?.standardErrorUTF8Lines.last → "Ohno.swift:4: Assertion failed: Oh no") == "Oh yes", which directly helps the author either update the test with the proper assertion output, or more usefully directs them towards what changed since the last time the test succeeded.

Implementation

extension ExitTest.Result {
    /// All UTF8-decoded lines written to the standard error stream of the exit test before
    /// it exited.
    ///
    /// The value of this property may contain any arbitrary sequence of bytes. If you are
    /// interested in confirming the raw error content, see ``standardErrorContent``
    /// instead.
    ///
    /// When checking the value of this property, keep in mind that the standard error
    /// stream is globally accessible, and any code running in an exit test may write to it
    /// including the operating system and any third-party dependencies you have
    /// declared in your package. Consider comparing the `.last` value of this property
    /// with [`==`](https://developer.apple.com/documentation/swift/array/==(_:_:))
    /// in your expectation or requirement:
    ///
    /// ```swift
    /// let result = await #expect(processExitsWith: .failure, observing: [\.standardErrorContent]) {
    ///     assertionFailure("Oh no")
    /// }
    /// #expect(result?.standardErrorUTF8Lines.last == "Ohno.swift:4: Assertion failed: Oh no")
    /// ```
    ///
    /// To enable gathering output from the standard error stream during an exit
    /// test, pass `\.standardErrorContent` in the `observedValues` argument of
    /// ``expect(processExitsWith:observing:_:sourceLocation:performing:)`` or
    /// ``require(processExitsWith:observing:_:sourceLocation:performing:)``.
    ///
    /// If you did not request standard error content when running an exit test,
    /// the value of this property is the empty array.
    public var standardErrorUTF8Lines: [Substring] {
        String(decoding: standardErrorContent, as: UTF8.self).split { $0.isNewline }
    }
    
    /// All UTF8-decoded lines written to the standard output stream of the exit test before
    /// it exited.
    ///
    /// The value of this property may contain any arbitrary sequence of bytes. If you are
    /// interested in confirming the raw output content, see ``standardOutputContent``
    /// instead.
    ///
    /// When checking the value of this property, keep in mind that the standard output
    /// stream is globally accessible, and any code running in an exit test may write to it
    /// including the operating system and any third-party dependencies you have
    /// declared in your package. Consider comparing the `.last` value of this property
    /// with [`==`](https://developer.apple.com/documentation/swift/array/==(_:_:)) in
    /// your expectation or requirement:
    ///
    /// ```swift
    /// let result = await #expect(processExitsWith: .success, observing: [\.standardOutputContent]) {
    ///     print("Oh good")
    /// }
    /// #expect(result?.standardOutputUTF8Lines.last == "Oh good")
    /// ```
    ///
    /// To enable gathering output from the standard output stream during an exit
    /// test, pass `\.standardOutputContent` in the `observedValues` argument of
    /// ``expect(processExitsWith:observing:_:sourceLocation:performing:)`` or
    /// ``require(processExitsWith:observing:_:sourceLocation:performing:)``.
    ///
    /// If you did not request standard output content when running an exit
    /// test, the value of this property is the empty array.
    public var standardOutputUTF8Lines: [Substring] {
        String(decoding: standardOutputContent, as: UTF8.self).split { $0.isNewline }
    }
}
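To make the splitting behavior concrete, here is a minimal standalone sketch. The free function `utf8Lines(of:)` is hypothetical and only mirrors the body of the proposed properties so it can run outside an exit test; the sample bytes are made up for illustration.

```swift
// Hypothetical helper with the same body as the proposed properties,
// so the behavior can be tried without running an exit test.
func utf8Lines(of bytes: [UInt8]) -> [Substring] {
    String(decoding: bytes, as: UTF8.self).split { $0.isNewline }
}

// Made-up stderr content: unrelated noise first, a CRLF line ending,
// an empty line, and a trailing newline after the assertion message.
let noisy = Array("some unrelated warning\r\n\nOhno.swift:4: Assertion failed: Oh no\n".utf8)
let lines = utf8Lines(of: noisy)

// split(whereSeparator:) omits empty subsequences by default, so the blank
// line and the trailing newline do not produce empty entries.
print(lines.count)
print(lines.last ?? "<no lines>")

// With no captured content, the result is the empty array and `.last` is nil.
let empty = utf8Lines(of: [])
print(empty.isEmpty)
```

Because empty subsequences are dropped, `.last` is the final non-empty line even when the stream ends with a newline, which is what makes the `.last ==` comparison in the doc comments practical.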

Thanks for the pitch, Dimitri!

It is intentional that we don't provide an interface to get the stdout or stderr streams as strings: since they are just byte streams, it is often the case that they are not valid Unicode. This is covered in the documentation for ExitTest.Result. The output written by the test author may be mixed with non-Unicode output from other sources such as system libraries or Swift Testing itself.

If the test author is sure the entire stdout or stderr stream is valid Unicode, it isn't hard to construct a string from either one. Is this proposal a significant improvement, ergonomically, over:

let result = try await #require(processExitsWith: .failure, observing: [\.standardErrorContent]) {
    assertionFailure("Oh no")
}
let stderrLines = String(validating: result.standardErrorContent, as: UTF8.self)?
  .split(whereSeparator: \.isNewline)
#expect(stderrLines?.last == "Oh yes")

?

Hard-coding UTF-8 into the API contract is also probably not the right abstraction. UTF-8 will not always be the correct encoding; no supported platform mandates that stdout and stderr be one specific encoding and in practice they allow a wide variety of encodings via the LANG environment variable.

This is fine, and the right baseline to provide. I would like to propose that many cases are strings, and of those cases, the majority are likely to be UTF8.

I think this new API better supports the majority of authors who treat console output as text output in the default encoding, without leaving behind those that need access to the raw byte stream content through the existing properties.

This is especially the case because the last line is almost always the cause of the exit, considering that most Swift programs making use of exit tests will likely be checking for the presence of assert(), precondition(), or fatalError().

The main benefit is that it leads test writers into being able to check for the things they likely want to test, in a way that allows them to catch issues that come up.

I opted for String(decoding:as:) instead, so that a) we always get some diagnostic output (rather than nil), and b) we don't waste the minuscule amount of time validating a string that will fail in the next #expect(==) clause anyway. Additionally, if the expect fails with nil on the left, we don't know if the last line failed because it decoded to gibberish, or if it failed because there were no lines at all. Using the proposed standardErrorLines will help reassure the tester that they will only ever get nil when the content is empty, and a value to compare (and potentially fail) against if it isn't.
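A small standalone sketch of that difference, using made-up bytes rather than a real exit test. It assumes a Swift 6 toolchain, since String(validating:as:) is not available in earlier standard libraries.

```swift
// Made-up stream content: a valid line followed by a lone 0xFF byte,
// which can never appear in well-formed UTF-8.
let bytes: [UInt8] = Array("Oh no\n".utf8) + [0xFF]

// Repairing initializer: the invalid byte becomes U+FFFD, so there is
// always something to show in a failed expectation.
let repaired = String(decoding: bytes, as: UTF8.self).split { $0.isNewline }

// Validating initializer: the whole string is rejected, so the lines are nil.
let validated = String(validating: bytes, as: UTF8.self)?.split { $0.isNewline }

// An empty stream is valid UTF-8, but it has no last line either.
let emptyValidated = String(validating: [UInt8](), as: UTF8.self)?.split { $0.isNewline }

print(repaired.last == "\u{FFFD}")   // the repaired path yields a value to compare
print(validated?.last == nil)        // invalid content: nil
print(emptyValidated?.last == nil)   // empty content: also nil
```

With the validating path, invalid content and empty content both surface as nil on the left-hand side of the expectation, which is the ambiguity described above.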

I'm happy to get behind a variant that takes encoding as a parameter like you suggested in Slack, with a default assuming UTF8 that we can deprecate in the distant, distant future:

public func standardErrorLines<Encoding: _UnicodeEncoding>(decodedAs outputEncoding: Encoding.Type) -> [Substring] where Encoding.CodeUnit == UInt8 {
    String(decoding: standardErrorContent, as: outputEncoding).split { $0.isNewline }
}

public var standardErrorLines: [Substring] {
    String(decoding: standardErrorContent, as: UTF8.self).split { $0.isNewline }
}
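As a rough illustration of how the encoding parameter would change results, here is a standalone sketch. The free function `lines(of:decodedAs:)` is hypothetical and takes the bytes directly so it can run outside an exit test; the sample text is made up.

```swift
// Hypothetical free-function version of the encoding-parameterized variant.
func lines<Encoding: Unicode.Encoding>(
    of bytes: [UInt8],
    decodedAs encoding: Encoding.Type
) -> [Substring] where Encoding.CodeUnit == UInt8 {
    String(decoding: bytes, as: encoding).split { $0.isNewline }
}

// Made-up output containing one non-ASCII character (U+00E9, two bytes in UTF-8).
let bytes: [UInt8] = Array("caf\u{E9} closed\n".utf8)

let asUTF8 = lines(of: bytes, decodedAs: UTF8.self)
let asASCII = lines(of: bytes, decodedAs: Unicode.ASCII.self)

print(asUTF8.last ?? "<no lines>")
// Decoding the same bytes as ASCII repairs each non-ASCII byte to U+FFFD.
print(asASCII.last ?? "<no lines>")
```

Note that the `Encoding.CodeUnit == UInt8` constraint limits this shape to byte-oriented encodings such as UTF-8 and ASCII; other encodings would need a different approach.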

decoding vs. validating isn't the high-order bit here though. Is an API surface here (regardless of its exact implementation) significantly more ergonomic than telling developers to call String.init and split manually? (The answer may well be a resounding "yes"! But the question must be asked.)

Hi Dimitri,

(for full clarity, this is my personal opinion, not the Swift Testing Workgroup's)

I'll admit it, I too have written exit tests where I ended up asserting on some output in stdout.

However, I still believe we should not provide a helper in the standard Swift Testing library to get a UTF-8 value out of the STDOUT or STDERR streams, for the following reasons:

  1. these tests will be brittle by design as both channels are not isolated from other stuff happening on a system (as @grynspan explained).
  2. that these streams will be mostly UTF-8 is an assumption that is risky to make. What about platforms that default to UTF-16 or maybe even ASCII?
  3. (a bit more philosophical) I believe exit tests are most valuable to validate that something ended under a certain condition with a non-zero exit code. Instead of validating on output, making your app emit different exit codes for specific situations might also be a way forward. Of course this path may not always be feasible.

Long story short, I can imagine there can be situations where it's needed to assert on something in STDERR or STDOUT. In those cases I accept the risks involved and add the two lines of code needed. But for me it's really the exception, not the rule. And therefore I wouldn't want it to be part of Swift Testing out of the box.

I appreciate the viewpoint from the most general place tests can be run, but in my experience the vast majority of Swift tests are run on either macOS or Linux (usually as part of GitHub workflows). It should be acknowledged that any platform can still make use of standardErrorContent, but affordances should be made available to meet developers where they are (and will be for the foreseeable future) to help them use these tools.

Similarly, 8 out of 10 Swift developers I know only know how to trap Swift code by calling assert, precondition, or fatalError (most won’t even realize you can also check optional unwrapping until you point it out!). Sure, some libraries may use other ways of halting execution and communicating those issues, and sure, we can have (or more accurately wait for, since preconditions don’t output their reason in release builds as of yet) a purpose-built solution for just the 4 cases above, but I believe this provides a more generally useful middle ground, especially since the encoding could conceptually be passed in to handle more esoteric use cases. In all of these cases, the last message will generally be output as a line of text to standard error, so even if it’s not the last entry (in the case of another thread getting the final say for some reason during the trap?), it feels fair to build the vast majority of simple exit tests around these assumptions.

Also, do note that even for the above issues, different error codes are used on macOS vs Linux for the same call to, say, assertionFailure(). Suggesting that the everyday Swift developer write their own assert handler that outputs a specific code would honestly be more harmful, as neither that handler nor its associated check would be reliably elided automatically in release mode anymore.

Argued another way, there is currently a huge cliff between writing regular tests and writing robust exit tests for typical Swift code. The arguments so far hold that any middle ground to check the output would be harmful for the most esoteric exit test, when checking the output of exit tests is already so seldom done that it could benefit from almost any help it can get.

(Again, the bridge may seem small when you know the solution is to convert to a string, but knowing that’s the next step may as well be a “not worth it” canyon for those that are unaware.)

It’s not up to me to decide if this comes in the form of a highly targeted solution for Swift’s typical cases, or a more general middle ground, but I’d like to push for such a bridge to be considered :blush:

Is your ultimate goal here to make it easier to detect calls to fatalError() and friends?

That ended up being my primary use case, but I also ended up using it to validate actual text-based output as well for an executable-based package.

For the fatalError()-based test though, I actually found it to be extremely useful to make sure a specific assertion/precondition was triggering, so I ended up matching the whole string, but I can see cases where folks only care to check the file, line number, type of failure, or just the message.

Could we somehow preserve the original error information? Pseudo code:

let error: Error? = result.standardError

For some failures (e.g. uncaught throws) that would be the actual error, for things like assert failures it could be a synthesised error object (like NSError or similar).

A swiftError property would probably be my ideal specific use case as well, though I can also see it requiring more robust IPC and Swift runtime changes to reliably rebuild the file and line number information, rather than parsing it from a string of the form <module>/<file>.swift:<line>: <trap type>: <message>\n (though honestly parsing with a regex does get us 98% of the way there)
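For what it's worth, the regex route might look something like the sketch below. The `TrapMessage` type, the `parseTrapMessage(_:)` function, and the pattern itself are all hypothetical, written against the message shape quoted above rather than any documented format, and it assumes a toolchain and OS with Swift Regex support.

```swift
// Hypothetical container for the pieces of a trap message.
struct TrapMessage: Equatable {
    var file: String
    var line: Int
    var kind: String
    var message: String
}

// Hypothetical parser for one line shaped like
// "<module>/<file>.swift:<line>: <trap type>: <message>".
func parseTrapMessage(_ text: Substring) -> TrapMessage? {
    // Extended delimiters (#/.../#) avoid needing the bare-slash regex flag.
    let pattern = #/(?<file>.+?):(?<line>\d+): (?<kind>[^:]+): (?<message>.*)/#
    guard let match = text.wholeMatch(of: pattern),
          let line = Int(match.output.line) else { return nil }
    return TrapMessage(
        file: String(match.output.file),
        line: line,
        kind: String(match.output.kind),
        message: String(match.output.message)
    )
}

let parsed = parseTrapMessage("Ohno/Ohno.swift:4: Assertion failed: Oh no")
let unrelated = parseTrapMessage("some unrelated warning")
print(parsed?.message ?? "<not a trap message>")
print(unrelated == nil)
```

A test could then compare only the parts it cares about (file, line, kind, or message), at the cost of depending on output text that is not guaranteed to stay stable.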

Generally speaking, Swift errors don't survive process boundaries. This is a problem that I think Swift would need to solve at a layer below Swift Testing as it impacts plenty of other IPC/serialization pathways like XPC. I imagine if we got some sort of support in the runtime/stdlib/language for throwing errors reliably across process boundaries, we could reasonably modify the behaviour of #expect(processExitsWith:) for uncaught errors.

If I am able to land swiftlang/swift pull request #83674 ("Add a `_swift_willAbort` hook"), then I think I could offer something like:

let result = await #expect(processExitsWith: .fatalError) { ... }
#expect(result?.fatalErrorMessage == ...) // this is a Swift string from the get-go!
#expect(result?.fatalErrorSourceLocation == ...)

(I will note however that the immediate body of an exit test does not currently have reliable source location information due to limits on macro expansion. That's a problem we'd need to resolve first, I think.)

For technical reasons, it is somewhat difficult to distinguish a call to fatalError() from a call to preconditionFailure(), etc. after-the-fact, although it's not an insurmountable problem.

As another angle - the original post made the following problem statement:

I think it'd be great if we could incorporate the actual info from the result automatically as part of the expectation failure details. As an attachment, or perhaps even directly inline as one of the failure messages if the content is smaller than a given threshold and does decode to a String.

Could we change that string to be, say, a JSON representation, or is that string already ABI (de facto if not de jure)?