I suspected you would say this. I would agree if `Codable` were primarily used for encoding to a proprietary binary format. However, in reality it is primarily used to encode and decode JSON. That JSON is often transmitted across networks to programs written in other languages. Therefore it cannot be a purely private implementation detail. It is visible to the world, and people will depend on it.
Like it or not, people will write code that implicitly depends on the serialization formats used by standard library types. Further, these formats will end up being used by default in JSON data exposed by REST APIs (as Swift on the server grows).
You can argue this is a bad thing and people get what they deserve if they aren't thoughtful and explicit about their own data formats. On the other hand, you can also argue that in practice the decisions made about Swift standard library and Foundation types will have real world consequences in the JSON exposed by a lot of REST APIs.
My position is that we should choose to be a good citizen of the internet by thinking carefully about the consequences of the choices we make regarding JSON serialization of our standard library and Foundation types. The best place to do that is on Swift evolution, not in pull request discussions.
That said, thank you for the link to the pull request discussions. I read them, and some of the issues I am concerned with were indeed discussed there, particularly in this thread. @dlbuckley shared an illustrative example:
> But lets say that we did use an un-keyed or single value container, this is what it would look like in terms of a struct with a date range property converted into JSON (just stating this for keeping all discussions clear in black and white):
>
> Closed Range: `{"myRange" : ["1970-01-01T02:46:40Z", "1970-01-01T05:33:20Z"]}`
> Range: `{"myRange" : ["1970-01-01T02:46:40Z", "1970-01-01T05:33:20Z"]}`
> PartialRangeUpTo: `{"myRange" : "1970-01-01T02:46:40Z"}`
> PartialRangeThrough: `{"myRange" : "1970-01-01T02:46:40Z"}`
> PartialRangeFrom: `{"myRange" : "1970-01-01T02:46:40Z"}`
>
> From the JSON perspective the only way you would be able to distinguish between the range types would be documentation for the payload. Compared to:
>
> Closed Range: `{"myRange" : {"from": "1970-01-01T02:46:40Z", "through": "1970-01-01T05:33:20Z"}}`
> Range: `{"myRange" : {"from": "1970-01-01T02:46:40Z", "upTo": "1970-01-01T05:33:20Z"}}`
> PartialRangeUpTo: `{"myRange" : {"upTo": "1970-01-01T05:33:20Z"}}`
> PartialRangeThrough: `{"myRange" : {"through": "1970-01-01T05:33:20Z"}}`
> PartialRangeFrom: `{"myRange" : {"from": "1970-01-01T05:33:20Z"}}`
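To make the first case concrete, here is a minimal sketch of how that unkeyed payload would surface in ordinary application code, assuming the conformance lands with the unkeyed design chosen in the pull request (the `Event` type and its `myRange` property are hypothetical, not anything from the PR):

```swift
import Foundation

// Hypothetical model type holding a date range, used only to reproduce
// the first payload shown above.
struct Event: Codable {
    let myRange: Range<Date>
}

let event = Event(
    myRange: Date(timeIntervalSince1970: 10_000)..<Date(timeIntervalSince1970: 20_000)
)

let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601

do {
    let data = try encoder.encode(event)
    print(String(data: data, encoding: .utf8)!)
    // Expected (whitespace aside), assuming the unkeyed-container design:
    // {"myRange":["1970-01-01T02:46:40Z","1970-01-01T05:33:20Z"]}
} catch {
    print("Encoding failed: \(error)")
}
```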
@Tony_Parker argued for the chosen design (unkeyed containers) with:
> I don't believe that we should impose the cost of carrying documentation strings in the archive itself as a primary goal. String keys can have a documentation as a side benefit. I don't think for mathematically closed types, which can gain no new keys, the documentation side benefit is worth the tradeoff.
and later:
> I'm sure we all have experience with under-documented JSON output, and that is why the idea of putting these strings in the archive is attractive. However, my opinion is that these situations are a failure of specifying the JSON correctly. An application level bug, really.
>
> We can't really fix that by using string keys for this one type. There will always be more cases where you can't figure out what the intended purpose of a value in JSON is, so this would be at best a partial solution anyway.
I agree that we cannot fix under-documented JSON. But we can acknowledge that this is a significant problem in the real world. We can also acknowledge that even when JSON is documented, choices made in the schema design have consequences for how easy or difficult that JSON is to work with (especially from other languages).
I also agree that this discussion is beyond the scope of a single type. However, I do think that acknowledging these realities should influence the choices we make about how our types are serialized to JSON. This is a complex tradeoff, but I feel the community should play a role in making the decision.
My overarching point here is that when it comes to the JSON serialization format used by Swift types I feel we have a responsibility to consider not just the experience of Swift programmers, but also the broader internet community which will inevitably encounter JSON produced using whatever serialization format we choose. This format should not be considered a private implementation detail.
One unfortunate tension that I noticed in reading the code review discussion is that there are real competing goals in different serialization contexts. Sometimes you want an optimized (perhaps binary) serialization. In other cases (such as public JSON APIs) you often want a format that is human readable and relatively clear about meaning, even though supplemental documentation is always going to be necessary.
`Codable` requires types to make a single choice about serialization, which is necessarily going to be sub-optimal in one of these contexts. One way to resolve this is to choose the optimized format and use wrappers at a higher level in the system when human-readable formats are necessary. Is this the informal policy being adopted by the standard library and Foundation?
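For clarity, here is a rough sketch of the kind of higher-level wrapper I have in mind, assuming the standard library keeps the compact unkeyed format. `KeyedRange` and its `from`/`upTo` keys are hypothetical names for illustration, not an existing API:

```swift
// A hypothetical wrapper (a sketch, not an existing API) that re-encodes a
// Range with explicit keys for use at a public API boundary, while internal
// code can keep the compact standard-library encoding.
struct KeyedRange<Bound: Comparable & Codable>: Codable {
    var range: Range<Bound>

    private enum CodingKeys: String, CodingKey {
        case from
        case upTo
    }

    init(_ range: Range<Bound>) {
        self.range = range
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let lower = try container.decode(Bound.self, forKey: .from)
        let upper = try container.decode(Bound.self, forKey: .upTo)
        guard lower <= upper else {
            throw DecodingError.dataCorruptedError(
                forKey: .upTo, in: container,
                debugDescription: "upTo must not be less than from")
        }
        range = lower..<upper
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(range.lowerBound, forKey: .from)
        try container.encode(range.upperBound, forKey: .upTo)
    }
}
```

An API layer could encode `KeyedRange(range)` at its boundary to get the self-describing keyed payload shown earlier, while everything internal keeps the compact format.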
Have you given any thought to other ways of resolving the tension between the goals of self-documenting data formats and more optimized serialization formats?