SE-0489: Improve EncodingError and DecodingError's printed descriptions

allevato · July 23, 2025, 2:15pm

It doesn't look like CustomDebugStringConvertible has anything in its documentation about this. If folks really believe that a CustomDebugStringConvertible that has multiline output is something that should be forbidden or strongly discouraged, then separately from this proposal, can we please state that explicitly in the documentation for that protocol? Is there a reason we can't or shouldn't do that?

FranzBusch · July 23, 2025, 2:41pm

I agree that the documentation for both protocols should be made more specific about the expectations and how they are used. I was surprised to learn that CustomStringConvertible implementations sometimes call into CustomDebugStringConvertible.

AlexanderM · July 23, 2025, 2:46pm

I'm super happy that this UX is being addressed, but I don't think the proposed error messages quite hit the mark.

It uses Swift-specific jargon that would be confusing to new users
Introduces a novel format for describing paths, instead of embracing existing standards
Wastes space printing both the stringValue and intValue, when the overwhelming majority of cases would only have one or the other

On terminology

"Keyed coding container" vs "unkeyed coding container" are overly generic terms to expose to Codable users. They make sense at an API level that's trying to be format-agnostic, to an audience of library authors implementing Encoders. However, they're confusing terms to users of Codable libraries. Think of a new dev just making their first web request to sling some JSON around. They would be familiar with what an "array" or "object" (perhaps "dictionary") is, but "unkeyed coding container" is niche Swift jargon.

As a point of comparison, YAML parsers also give confusing messages, like this example:

did not find expected ',' or '}' while parsing a flow sequence at line 1 column 4

What the heck is a "flow sequence"? Apparently that's their term for an array, and "mapping" is their term for a dictionary.

I propose that we add some extension points for Decoders to customize error messages, to give richer format-specific messages. For example, the JSONDecoder could define something like:

struct JSONDecoder: Decoder {
    static let keyedContainerName = "object"
    static let unkeyedContainerName = "array"
}
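A minimal sketch of how such an extension point could look, assuming a hypothetical ContainerNaming protocol (nothing like it exists in the standard library or Foundation today). The GenericFormat and JSONFormat types and the keyNotFoundMessage helper are made up for illustration only:

```swift
// Hypothetical extension point: a format supplies its own names for containers.
protocol ContainerNaming {
    static var keyedContainerName: String { get }
    static var unkeyedContainerName: String { get }
}

// Format-agnostic defaults, used when a format does not override them.
extension ContainerNaming {
    static var keyedContainerName: String { "keyed decoding container" }
    static var unkeyedContainerName: String { "unkeyed decoding container" }
}

// A format that keeps the generic wording.
enum GenericFormat: ContainerNaming {}

// A JSON-flavored format that uses the terms JSON users already know.
enum JSONFormat: ContainerNaming {
    static let keyedContainerName = "object"
    static let unkeyedContainerName = "array"
}

// Builds the "key not found" message with the format's own vocabulary.
func keyNotFoundMessage<F: ContainerNaming>(key: String, in format: F.Type) -> String {
    "Key '\(key)' not found in \(F.keyedContainerName)."
}

print(keyNotFoundMessage(key: "population", in: GenericFormat.self))
print(keyNotFoundMessage(key: "population", in: JSONFormat.self))
```

The default implementations mean existing formats would keep today's wording unless they opt in to something friendlier.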

Coding path description

There's no need for Swift to introduce its own format for describing coding paths. This would only be a 15th competing standard, which wouldn't be compatible with the whole world of pre-existing tooling for dealing with serialized data.

Instead, we should ask the Decoder to format the coding path for us, allowing it to use the established format for that kind of data. For example:

struct JSONDecoder: Decoder {
    func describeCodingPath(codingPath: [any CodingKey]) -> String {
        // Generate a `jq` query, like `.[0].home.country`
    }
}

Other examples:

YAMLDecoder might describe coding paths in the yq query format
XMLDecoder might produce XPath
ProtobufDecoder might produce strings in the "field path" format

Users seeing these messages can just copy the path, plop it right into jq, and start examining their data from there.
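As a rough sketch of the JSON case, here is one way a coding path could be turned into a jq-style query. The jqPath function and the PathKey type are hypothetical names for illustration, and the sketch assumes keys are plain identifiers (a key containing spaces or punctuation would need jq's quoted form, which is not handled here):

```swift
// Hypothetical key type used only to build sample coding paths.
struct PathKey: CodingKey {
    var stringValue: String
    var intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }

    init(intValue: Int) {
        self.stringValue = "Index \(intValue)"
        self.intValue = intValue
    }
}

// Hypothetical formatter: renders a coding path as a jq-style query.
func jqPath(for codingPath: [any CodingKey]) -> String {
    // An empty path refers to the root value, which jq spells as ".".
    if codingPath.isEmpty { return "." }

    var result = ""
    for key in codingPath {
        if let index = key.intValue {
            // Array index: ".[0]" at the start, "[0]" after a previous component.
            result += result.isEmpty ? ".[\(index)]" : "[\(index)]"
        } else {
            // Object member: ".name"
            result += ".\(key.stringValue)"
        }
    }
    return result
}

let path: [any CodingKey] = [
    PathKey(intValue: 0),
    PathKey(stringValue: "home"),
    PathKey(stringValue: "country"),
]
print(jqPath(for: path)) // .[0].home.country
```

Checking intValue first matters because stringValue is always present on a CodingKey, even for keys that represent array indices.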

Coding key formatting

The CodingKey protocol technically models coding keys as a product type (similar to a struct), but it's effectively a sum type (similar to an enum). It guarantees two initializers, one which only sets the string value, and another which only sets the int value. It guarantees a getter for both, but no setters. Going through the CodingKey protocol alone, it's impossible to construct a coding key like CodingKey(stringValue: "a", intValue: 1).

Concrete conformers to CodingKey can add API for setting both (e.g. an init(stringValue: String, intValue: Int), or setters for the properties), but this is highly unusual.

Thus, there's no point printing both values if one of the two is almost surely nil. In the unlikely situation that both are non-nil, sure, print them both, but otherwise we can condense it down:

- CodingKeys(stringValue: "population", intValue: nil)
+ "population"
- CodingKeys(stringValue: nil, intValue: 3)
+ 3
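A small sketch of that condensing rule. The condensedDescription function and the DemoKey type are hypothetical. One wrinkle: in the CodingKey protocol, stringValue is a non-optional String and only intValue is optional, so this sketch treats a key as a pure index when its string merely restates the integer. The "Index N" spelling is an assumption about how some decoders name index keys:

```swift
// Hypothetical key type, including the unusual "both values" initializer.
struct DemoKey: CodingKey {
    var stringValue: String
    var intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }

    init(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }

    init(stringValue: String, intValue: Int) {
        self.stringValue = stringValue
        self.intValue = intValue
    }
}

// Hypothetical formatter: prints only the meaningful half of a coding key.
func condensedDescription(of key: any CodingKey) -> String {
    // No integer value: this is a plain string key, so print it quoted.
    guard let index = key.intValue else {
        return "\"\(key.stringValue)\""
    }
    // The string only restates the index: print just the index.
    if key.stringValue == "\(index)" || key.stringValue == "Index \(index)" {
        return "\(index)"
    }
    // Both values carry information: print both.
    return "\"\(key.stringValue)\" (\(index))"
}

print(condensedDescription(of: DemoKey(stringValue: "population")))         // "population"
print(condensedDescription(of: DemoKey(intValue: 3)))                       // 3
print(condensedDescription(of: DemoKey(stringValue: "third", intValue: 3))) // "third" (3)
```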

Proposed message example

Here's an example message format I propose, incorporating the three ideas above:

- Key 'population' not found in keyed decoding container.
+ Key 'population' not found in object
- Debug description: No value associated with key CodingKeys(stringValue: "population", intValue: nil) ("population").
+ Debug description: No value associated with key "population".
- Path: [0]/home/country
+ Path: .[0].home.country

dnadoba · July 23, 2025, 9:57pm

@lorentey goes into great detail about the true purpose of these protocols here:

SE-0445: Improving String.Index's printed descriptions

I find that the name and current documentation of CustomDebugStringConvertible (and its debugDescription property) are harmful and misleading, because they aren't at all reflecting their actual purpose.

From what I can tell, the real purpose of CustomDebugStringConvertible is to serve as a secondary variant of CustomStringConvertible to be used when the use of the default description may interfere with understanding, such as when generating the descriptions of aggregate types or collections.

For example, String has an implementation of description that simply returns self, while its debugDescription is careful to provide a quoted display, with properly escaped contents.
let a = "Truman, Harry S."
print(a)      // ⟹ Truman, Harry S.
debugPrint(a) // ⟹ "Truman, Harry S."

let b = "Dwight D. \"Ike\" Eisenhower"
print(b)      // ⟹ Dwight D. "Ike" Eisenhower
debugPrint(b) // ⟹ "Dwight D. \"Ike\" Eisenhower"
Meanwhile, Array always uses debugDescription to print its elements:
let c = [a, b]
print(c)      // ⟹ ["Truman, Harry S.", "Dwight D. \"Ike\" Eisenhower"]
debugPrint(c) // ⟹ ["Truman, Harry S.", "Dwight D. \"Ike\" Eisenhower"]
This is to prevent confusion; if Array did not use the "suitable for debugging" variants when printing its items, then its description could easily become impossible to understand: for example, the comma in Truman, Harry S. would be indistinguishable from the commas that separate array items:
[Truman, Harry S., Dwight D. "Ike" Eisenhower]
So, in my (quite deeply held) view, the entire purpose of debugDescription is to be a secondary variant of description that is expected to be safe to embed into syntactic/structural displays. The documentation should talk about specifically what that means -- it should be talking about the need to avoid punctuation such as "naked" spaces, newlines, commas or colons, and unpaired quotes, brackets, parentheses etc. (It is quite tricky to formally specify what a well-formed debugDescription should be, which I expect partially explains why the documentation doesn't even attempt at hinting at this as a requirement.)

Notably, debugDescription is mostly invoked when building collection/aggregate descriptions, where brevity is really important. So CustomDebugStringConvertible is not at all the right place to add information that isn't already present in description -- in fact, it may sometimes be better to omit or shorten things. (When printing an array of 100 items, we really, really do not need to see some over-detailed presentation of each item, repeated 100 times -- brevity is perhaps even more important in this context than it is for description.)

Given all that, my first instinct is to say that a type should only conform to CustomDebugStringConvertible if it already conforms to CustomStringConvertible, but its description isn't suitable for unescaped embedding into syntactic formats. (Such as the case with String.)

We should update the documentation with a version of that.

xwu · July 23, 2025, 10:05pm

The steering group previously considered the matter as part of that prior review. As I reported out in the decision notes, the conclusion of the group was:

[Accepted with modifications] SE-0445: Improving String.Index’s printed descriptions

A central point of discussion was whether the conformance in question ought to be to CustomDebugStringConvertible rather than (and not in addition to) CustomStringConvertible. This group was asked to weigh in on what these protocols are "for."

There was agreement among the group with the simple explanation that debugDescription is the most appropriate API to provide a textual representation that is specifically "for" debugging.

While it's true that, where a standard library type implements both description and debugDescription, the latter often adds or removes punctuation so as to be more suitable for structured display, it's not the formatting details which alone account for the difference between the two APIs. Instead, the group was inclined to agree with reviewer feedback that conformance to CustomDebugStringConvertible conveys the intended use of the corresponding textual representation—namely, that it is oriented specifically towards debugging, and no attempt should be made to parse, convert, or otherwise manipulate the output.

By contrast, a value's description could be suitable for more general use. In this case, for example, it's conceivable that one could have a textual representation for string indices that'd be useful for someone learning about Unicode and who's totally unfamiliar with the String.Index type or even Swift at all. The representation proposed here is explicitly not that.

GarthSnyder · July 24, 2025, 7:29am

+1

Several thoughtful comments have been posted here about code that this change might break. But I'm OK with breakage in this case.

My general sense is that no object, protocol, or method with Debug in the name should be considered part of the Swift ABI contract. These entities are metadata and metabehaviors designed for the use of developers during development. If someone chooses to rely on their specific behavior or format in production code, they do so at their own risk.

I'm not sure these items should even be subject to the normal Swift Evolution process. Just fix them!

ZevEisenberg · July 24, 2025, 11:44am

I tried! But I was asked to go through SE, and I do think it’s valuable to talk through it. I agree that debug things should not be relied upon in theory. But as noted above, the docs are pretty terse, and even with the best docs in the world, no one reads every single docs page, so it’s good to think through the implications of a change like this.

FranzBusch · July 24, 2025, 5:38pm

Thanks for citing this

I think this is an important point. Nothing should parse, convert or manipulate the debug description. Our logging systems aren’t doing that. They aren’t even calling the debug description API directly. However, as I said above, swift-log relies heavily on description, which often calls debugDescription. Furthermore, errors such as the decoding and encoding errors here are often logged. If this proposal is accepted as is, it will most likely break a few logging backends. To make matters worse, the only workaround that I see for those backends is to parse the description and sanitize it, which is too costly to do for every single log message.
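To illustrate how a debugDescription can end up in a log line without anyone calling it directly, here is a minimal sketch. MultiLineError is a made-up type, not the actual DecodingError implementation. It relies on String(describing:) and string interpolation falling back to debugDescription when a type conforms only to CustomDebugStringConvertible:

```swift
// Hypothetical error type that only provides a debug description,
// and whose debug description spans two lines.
struct MultiLineError: Error, CustomDebugStringConvertible {
    var debugDescription: String {
        "Key 'population' not found\nPath: [0]/home/country"
    }
}

let error = MultiLineError()

// Neither of these mentions debugDescription, yet both end up using it,
// because the type has no CustomStringConvertible conformance.
let described = String(describing: error)
let logMessage = "decode failed: \(error)"

print(described == error.debugDescription) // true
print(logMessage.contains("\n"))           // true
```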

ZevEisenberg · July 25, 2025, 1:02am

In case folks missed this note in the proposal:

Note 1: this proposal is not intended to specify an exact output format. The above is provided as an example, and is not a guarantee of current or future behavior. You are still free to inspect the contents of thrown errors directly if you need to detect specific problems.
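For anyone unsure what inspecting a thrown error directly looks like, here is a short sketch. The City type and the missingKey helper are made up for illustration. It matches on the DecodingError case instead of parsing any printed description:

```swift
import Foundation

// Hypothetical model type for the example.
struct City: Decodable {
    let name: String
    let population: Int
}

// Returns the name of the first missing key, or nil if decoding
// succeeded or failed for some other reason.
func missingKey(in json: Data) -> String? {
    do {
        _ = try JSONDecoder().decode([City].self, from: json)
        return nil
    } catch DecodingError.keyNotFound(let key, let context) {
        // The structured data is available without touching the description.
        print("coding path:", context.codingPath.map { $0.stringValue })
        return key.stringValue
    } catch {
        return nil
    }
}

let json = Data(#"[{"name": "Springfield"}]"#.utf8)
print(missingKey(in: json) ?? "no missing key")
```

Code written this way keeps working regardless of how the printed description changes.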

I’m glad folks have raised the newline issue, and it’s probably worth digging into and figuring out a pragmatic solution. I’m not very familiar with the inner workings of logging frameworks, but I do wonder whether they ought to be sanitizing their “inputs” (the messages being logged) if things like newlines are likely to wreak havoc?

ZevEisenberg · July 25, 2025, 1:09am

I agree: we should improve the printing of the keys. But it is out of scope for this proposal. That description is constructed by the Foundation encoders/decoders, and SE-0489 is intentionally scoped to just change the stdlib. See Future Directions for ways we might address this to make the output even better, but it’s going to require a Foundation change (which I believe has its own SE-like proposal process).

FranzBusch · July 25, 2025, 9:12am

Most logging backends are not sanitizing messages due to the performance implications. In high-performance server use cases there might be hundreds of logs generated per second across all the cores. If the logging backends parsed every single message and potentially replaced newlines, they would bring the entire system to a halt. That's why most logging backends that are capable of handling a high volume of log messages just append the message's UTF-8 bytes to some buffer that gets flushed at a regular interval.
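A deliberately simplified sketch of that append-only pattern, to show why an embedded newline is a problem. LineBuffer is a made-up type and not how swift-log or any particular backend is implemented. It assumes a line-oriented format where each record ends with a single newline:

```swift
// Hypothetical append-only log buffer: one record per line, no sanitizing.
struct LineBuffer {
    private(set) var bytes: [UInt8] = []

    // Appends the raw UTF-8 bytes of the message plus a record terminator.
    mutating func append(_ message: String) {
        bytes.append(contentsOf: message.utf8)
        bytes.append(UInt8(ascii: "\n"))
    }

    // What a line-oriented consumer would count as separate records.
    var apparentRecordCount: Int {
        bytes.filter { $0 == UInt8(ascii: "\n") }.count
    }
}

var buffer = LineBuffer()
buffer.append("request started")
buffer.append("decode failed: Key not found\nPath: [0]/home/country")

// Two messages were logged, but a consumer splitting on newlines sees three.
print(buffer.apparentRecordCount) // 3
```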

There were some discussions about whether swift-log should introduce its own protocol for types to provide a logging description. However, it was decided to leverage Swift's CustomStringConvertible for two reasons:

The protocol already exists, provides a string representation, many types conform to it, and it is a standard practice to not use newlines in description implementations
If swift-log provided a custom protocol, it would force essentially every package to add a dependency on swift-log to provide this conformance.

Overall, I am very sympathetic to solving the concrete problem at hand and improving the printed descriptions, but I would encourage us to pick a description and debugDescription that don't include newlines.

John_McCall · July 31, 2025, 9:19pm

SE-0489 has been accepted; please see the announcement for more information.

John McCall
Language Steering Group