Thank you for responding @itaiferber. I really appreciate you engaging with this discussion.
I wouldn't say JSON is the crux of my argument. As you pointed out, it is only the format of the moment. I have emphasized JSON in my argument because it is a concrete use case everyone can understand. Everything I have been saying can be generalized to any format that is conventionally prioritizes ease of human consumption above compactness.
One of the fundamental points I am making is that for aggregate types I don't think it is possible to define a default that is suitable for all formats, yet
Codable requires a type to attempt to just that.
The mantra we like to repeat in the Swift community is that protocols are not just about syntax, but also about semantics. There is a significant semantic conflict between encoding to formats that prioritize compactness and encoding to formats that prioritize human consumption and / or self-documentation.
There is also a significant semantic conflict evidenced by the discussion of whether the encoded representation of a type should be considered a private implementation detail or a public specification that receives community input. I think it is appropriate for the representation to be considered private for binary formats prioritizing compactness but not for textual formats that prioritize human consumption and self-documentation. If human consumption is a priority then the format must be specified and it is reasonable to request input by the community of humans contributing to Swift's evolution.
For these reasons,
Codable is starting to feel to me like a problematic overgeneralization that doesn't take the semantics of encoding and serialization formats seriously enough. Fortunately we have options that don't involve tying ourselves to specific formats. We can model the semantic patterns we see in the domain of encoding formats.\
These feels like an ad-hoc solution. The example of
URL is a good one. There are a lot of formats where URL should be encoded as a String. If this behavior lives in
JSONEncoder then any other encoders for similar textual formats will not benefit from that behavior by default and will need to re-implement that behavior. Worse, there are a lot of types the implementation of any given encoder won't know about.
I can understand this perspective and don't disagree within the context of the design as the team at Apple intended it. If anything, this thread has highlighted the fact that at least some of us in the community want Swift to be good out of the box at targeting textual formats which prioritize human usability (of which JSON is currently most popular). Secondarily, this thread has highlighted the fact that
Codable was not designed to meet this goal. I think many of us would also like Swift to be good out of the box at targeting formats that priories compactness. These are not specific formats, but specific priorities that matter a lot in practice when we serialize data.
I am arguing that these defaults are not actually format agnostic. You even state here that they choose an efficient and concise representation. This is fine for some formats, but also not the right choice for many other formats. In fact, it is not great for the formats that happen to be some of the most pervasive formats in the world. JSON may fall out of fashion, but for the foreseeable future I would bet quite heavily that if something new gains favor it would also be a textual format that is reasonably self-documenting.
That's fair. This thread has moved in a direction I didn't anticipate when I first replied. I still do feel it is relevant to the current review though. I would support this proposal as-is if it becomes clear that the semantics of
Codable choose to prioritize compactness and we agree that given these semantics
Codable is not an adequate solution for formats such as JSON (which should lead to work exploring improved support for textual, self-documenting formats).
As I mentioned above, this feels like a pretty unfortunate ad-hoc solution tied to a specific encoder. Whatever override we would do for JSON would likely be applicable to a very broad range of formats that share similar characteristics (XML, YAML, etc).
I'm speaking more broadly than the current design of
Codable. Imagine using attributes or other annotations at the property level that specify encoding policies. I have written a Sourcery template that supports a broad family of annotations for this purpose. This template very intentionally lowers all dates to specific string or numeric representations in order to ensure consistent serialization of values regardless of the date encoding strategy specified on the encoder.
In my experience, encoder level strategies are something to avoid. I don't want the encoded representation of my type changing based on the configuration of the encoder / decoder. This allows incompatible encoded representations to be produced by a single
Codable conformance. For this reason, it feels like it goes too far in favoring convenience over precision.
This is good to hear!
I agree. My argument is that the current design doesn't have strong enough semantics and that there are at least two broad families of serialization formats that have conflicting semantics.
Codable has in some sense chosen one set of serialization semantics without explicitly stating this and is used for formats that where this set of semantics produces out of the box results that are not great (i.e. encoding ranges as an array). So my argument is fundamentally about correctness, not convenience.
I would modify this to say that sometimes a custom encoding strategy is the way to go. Wrapper types are only one way of getting there. But I do agree that a custom strategy is sometimes required.
There is a third case here: I care, but if the default is specified and happens to align with my needs. And a fourth case: I do care but there is a mechanism to specify the policy / strategy I want to use without writing my own custom implementation.