@hartbit recently revived some discussion about this topic in a PR by @norio_nomura, and I figured that instead of restricting the discussion to the PR, this would be a perfect opportunity to solicit some community feedback. As such:
Hi all,
In Swift 4.1, we introduced key strategies to JSONEncoder and JSONDecoder to make it easier to conform to some common JSON practices. Specifically, we wanted to facilitate converting keys from Swift's camelCaseGuidelinesForPropertyNames to the common_json_snake_casing_of_keys, so these key strategies allow for automatic case conversion of all CodingKeys to snake case on encode, and from snake case on decode.
However, there are some edge cases in round-tripping keys which the API does not currently address. Specifically, properties which contain words in all-caps (usually initialisms like "HTTP"/"URL"/"UUID"/etc.) don't directly round-trip due to the lossy nature of these transformations; for instance, a key for myHTTPLink will appear in the JSON as my_http_link, but when converted back from JSON, it becomes myHttpLink (because the converter has no way of knowing that "http" is a special case which should become "HTTP" and not "Http").
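To make the lossy step concrete, here is a minimal sketch of the two conversions. The functions toSnakeCase and fromSnakeCase are hypothetical, simplified stand-ins written for illustration; they are not Foundation's actual implementation and ignore details such as leading underscores and digits.

```swift
// Hypothetical, simplified converters for illustration only.
func toSnakeCase(_ key: String) -> String {
    let chars = Array(key)
    var result = ""
    for (i, c) in chars.enumerated() {
        if c.isUppercase && i > 0 {
            let prevIsLower = chars[i - 1].isLowercase
            let nextIsLower = i + 1 < chars.count && chars[i + 1].isLowercase
            // Start a new word at a lower-to-upper boundary ("y" then "H"), or where
            // an all-caps run ends and the next word begins ("P" then "Li").
            if prevIsLower || (chars[i - 1].isUppercase && nextIsLower) {
                result.append("_")
            }
        }
        result.append(contentsOf: c.lowercased())
    }
    return result
}

func fromSnakeCase(_ key: String) -> String {
    let words = key.split(separator: "_").map(String.init)
    guard let first = words.first else { return key }
    // Capitalize only the first letter of each following word; the original
    // casing of an initialism such as "HTTP" is no longer recoverable here.
    let rest = words.dropFirst().map { word -> String in
        word.prefix(1).uppercased() + String(word.dropFirst())
    }
    return ([first] + rest).joined()
}

let original = "myHTTPLink"
let snake = toSnakeCase(original)       // "my_http_link"
let roundTripped = fromSnakeCase(snake) // "myHttpLink"
print(original, snake, roundTripped)
```

The information about which letters were upper-case is discarded on the way to snake case, so no function of the snake-cased string alone can restore it.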
More concretely, given the following definition:
struct Foo : Codable {
    var myHTTPLink: URL
}
the synthesized init(from:) will try to make the following decode(_:forKey:) call:
myHTTPLink = try container.decode(URL.self, forKey: .myHTTPLink)
When a .keyDecodingStrategy other than .useDefaultKeys is applied to JSONDecoder, requests for keyed containers cause it to first map the keys in the found container:
switch decoder.options.keyDecodingStrategy {
case .useDefaultKeys:
    self.container = container
case .convertFromSnakeCase:
    // Convert the snake case keys in the container to camel case.
    // If we hit a duplicate key after conversion, then we'll use the first one we saw. Effectively an undefined behavior with JSON dictionaries.
    self.container = Dictionary(container.map {
        key, value in (JSONDecoder.KeyDecodingStrategy._convertFromSnakeCase(key), value)
    }, uniquingKeysWith: { (first, _) in first })
case .custom(let converter):
    self.container = Dictionary(container.map {
        key, value in (converter(decoder.codingPath + [_JSONKey(stringValue: key, intValue: nil)]).stringValue, value)
    }, uniquingKeysWith: { (first, _) in first })
}
This means that a JSON object which looks like
{
    "my_http_link": ...,
    "my_other_thing": ...,
}
is mapped to
[
    "myHttpLink": ...,
    "myOtherThing": ...
]
When the decode(_:forKey:) call above happens, an attempt is made to look up myHTTPLink's stringValue ("myHTTPLink") in the container; since no such key is present, the property can't be found.
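The failure described above can be reproduced with a few lines against the existing API. This is a small sketch; the payload and URL are made up for the example, and the error is captured rather than printed so the missing key is easy to inspect.

```swift
import Foundation

struct Foo: Codable {
    var myHTTPLink: URL
}

// Example payload using the snake-cased key an encoder would have produced.
let json = Data(#"{"my_http_link": "https://example.com"}"#.utf8)

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

var missingKey: String? = nil
do {
    _ = try decoder.decode(Foo.self, from: json)
} catch DecodingError.keyNotFound(let key, _) {
    // The container now holds "myHttpLink", so the lookup for the
    // CodingKey's stringValue ("myHTTPLink") finds nothing.
    missingKey = key.stringValue
} catch {
    print("unexpected error: \(error)")
}

print(missingKey ?? "decoded successfully")
```

With the behavior described in this post, decoding should fail with a keyNotFound error for myHTTPLink rather than succeed.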
The current workaround for this is to give the CodingKey a string value which matches the round-tripped string:
private enum CodingKeys : String, CodingKey {
    case myHTTPLink = "myHttpLink"
}
but of course, this is suboptimal. Besides the fact that this behavior is undiscoverable until you get bitten by it, requiring the user to define a CodingKeys enum somewhat defeats the purpose of using the strategy in the first place.
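For completeness, here is a sketch of the workaround in use with both strategies applied. The URL is a placeholder chosen for the example; everything else is existing API.

```swift
import Foundation

struct Foo: Codable {
    var myHTTPLink: URL

    // The string value is spelled the way convertFromSnakeCase will
    // reconstruct it, not the way the property is actually named.
    private enum CodingKeys: String, CodingKey {
        case myHTTPLink = "myHttpLink"
    }
}

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

let original = Foo(myHTTPLink: URL(string: "https://example.com")!)
var roundTripped: Foo? = nil
var encodedText = ""
do {
    let data = try encoder.encode(original)
    encodedText = String(decoding: data, as: UTF8.self)
    roundTripped = try decoder.decode(Foo.self, from: data)
} catch {
    print("unexpected error: \(error)")
}

print(encodedText)
print(roundTripped?.myHTTPLink == original.myHTTPLink)
```

The payload still uses my_http_link, and decoding succeeds only because the CodingKey's string value was hand-written to match the lossy result.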
There are a few approaches we could take to address this issue (each with tradeoffs), but we're interested in getting some feedback on them, especially from folks who use these strategies:
- @norio_nomura's PR offers a solution which addresses the issue of attempting to convert "my_http_link" into "myHTTPLink" by sidestepping the problem altogether. With a new key strategy (.useSnakeCasedKeys), instead of trying to convert from a String to a CodingKey, the solution is to only ever apply the transformation in one direction: from the CodingKey to the String. Containers are left alone, and when a decode(_:forKey:) call is made, the key is reconverted into a string to match what's in the payload. This is promising, but has two main wrinkles:
  - This relies on us having access to the encode-side conversion for both directions whether we're encoding or decoding, which means that it behaves differently from .custom conversions. With .custom conversions, we only ever have one side of the transformation, so writing your own .custom strategy would leave you in no better a place
  - Slightly more problematic is access to KeyedDecodingContainer.allKeys, which is a view into a container's keys based on the key type you're accessing the container with. This is done via a mapping of the actual string keys in the container to the CodingKey type: public var allKeys: [Key] { return self.container.keys.compactMap { Key(stringValue: $0) } }. This operation inherently converts from strings to keys, which means that we cannot apply a transformation of key-to-string. This means that as-is, the value of allKeys would have keys missing from it, even though you would be able to get a result by decode(_:forKey:)'ing one of those missing keys. This is problematic for types which inspect allKeys in order to dynamically decode values from the container
    - One solution to this solution could be to handle CaseIterable CodingKey types specially: if a CodingKey type is CaseIterable, we can iterate over its cases and convert those to snake case instead of going the other way. Of course, no existing CodingKeys today are CaseIterable, which means that in order to opt into this implicit behavior, you have to know to make your CodingKeys CaseIterable (and we'd likely need to make synthesized CodingKeys CaseIterable as well).
- Another solution could be to expose the underlying functions that JSONEncoder and JSONDecoder use to perform these conversions (say, as JSONEncoder.KeyEncodingStrategy.toSnakeCase(_:) and JSONDecoder.KeyDecodingStrategy.fromSnakeCase(_:)); with access to these functions, it might be easier to write your own .custom conversion which watches out for these specific cases that you might care about and want to handle specifically. This still requires you to be aware of the issue, and write your own solution
- Offer, either through a property or another strategy, a customizable list of words that are special-cased, e.g. .convertFromSnakeCaseWithExceptions(["HTTP", "URL", "UUID", ...]); these words would be used in the conversion so that when we split words on "_", words that match the list can be handled specially
  - The wrinkle here is some amount of complexity: does it matter whether the exceptions are given in upper-case, lower case, or mixed case (e.g. given "hTTp" should "my_http_link" become "myhTTpLink")? What about keys in JSON which don't exactly match the expected case (e.g. "my_hTTp_link")?
- Some combination of the above, or a different approach which we haven't considered
- Leave things as-is; this is an acceptable risk for the convenience of applying the strategy to a whole payload possibly full of edge cases. [To be clear, I don't actually believe that this is a reasonable long-term solution to the problem.]
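As a rough illustration of the first approach and its allKeys wrinkle, the sketch below models a container as a plain dictionary. Everything here is hypothetical: toSnakeCase is a simplified converter (not Foundation's), Keys stands in for a CodingKeys enum that has opted into CaseIterable, and payload stands in for the decoder's internal container.

```swift
// Hypothetical, simplified key-to-string conversion for illustration only.
func toSnakeCase(_ key: String) -> String {
    let chars = Array(key)
    var result = ""
    for (i, c) in chars.enumerated() {
        if c.isUppercase && i > 0 {
            let prevIsLower = chars[i - 1].isLowercase
            let nextIsLower = i + 1 < chars.count && chars[i + 1].isLowercase
            if prevIsLower || (chars[i - 1].isUppercase && nextIsLower) {
                result.append("_")
            }
        }
        result.append(contentsOf: c.lowercased())
    }
    return result
}

// Stand-in for a CodingKeys enum that has been made CaseIterable.
enum Keys: String, CodingKey, CaseIterable {
    case myHTTPLink
    case myOtherThing
}

// Stand-in for the untouched container: keys stay exactly as in the payload.
let payload = ["my_http_link": "https://example.com", "my_other_thing": "42"]

// decode(_:forKey:) analogue: convert the key to a string, then look it up.
func value(for key: Keys) -> String? {
    payload[toSnakeCase(key.stringValue)]
}

// allKeys as implemented today goes string-to-key, so nothing matches.
let naiveAllKeys = payload.keys.compactMap { Keys(stringValue: $0) }

// CaseIterable variant: go key-to-string over every case instead.
let caseIterableAllKeys = Keys.allCases.filter { payload[toSnakeCase($0.stringValue)] != nil }

print(value(for: .myHTTPLink) ?? "nil", naiveAllKeys.count, caseIterableAllKeys.count)
```

In this model the lookup for .myHTTPLink succeeds while the string-to-key version of allKeys comes back empty, which is the inconsistency described in the second wrinkle; iterating the cases restores both keys, but only for key types that are CaseIterable.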
All three main solutions have a bit of a discoverability problem as you won't necessarily know to apply a different strategy until you hit the problem. There are two riffs on solution 3 which might help a bit with this:
- Deprecate .convertFromSnakeCase in favor of the option which explicitly requires exceptions; this way existing code is made aware of a potential solution to a problem that the author might not know they had
- Instead of offering a new strategy, give JSONEncoder and JSONDecoder a mutable property holding the list of exceptional words. The list can come prepopulated with some reasonable defaults (for some definition of reasonable), but the list can be added to/removed from as necessary. This doesn't make the problem more discoverable, but might give a slightly better default answer. The risk here comes for folks who have already accounted for this deficiency and provided their own CodingKeys with values like myHttpLink because the current workaround will now be wrong
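The exception-list idea can be approximated today with the existing .custom strategy, which may help when weighing the options. The sketch below is one possible reading of it, not a proposed API: AnyKey and fromSnakeCase(_:exceptions:) are hypothetical helpers, and the sketch arbitrarily picks one answer to the casing question by matching exceptions case-insensitively and always emitting them in upper case.

```swift
import Foundation

// Hypothetical helper: the .custom strategy needs some CodingKey to return.
struct AnyKey: CodingKey {
    var stringValue: String
    var intValue: Int?
    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }
    init?(intValue: Int) {
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}

// Hypothetical converter: snake_case to camelCase, upper-casing listed words.
// The first word is left as-is, so "url_string" becomes "urlString".
func fromSnakeCase(_ key: String, exceptions: Set<String>) -> String {
    let normalized = Set(exceptions.map { $0.uppercased() })
    let words = key.split(separator: "_").map(String.init)
    guard let first = words.first else { return key }
    let rest = words.dropFirst().map { word -> String in
        let upper = word.uppercased()
        if normalized.contains(upper) { return upper }
        return word.prefix(1).uppercased() + String(word.dropFirst())
    }
    return ([first] + rest).joined()
}

struct Foo: Codable {
    var myHTTPLink: URL
}

let exceptions: Set<String> = ["HTTP", "URL", "UUID"]
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .custom { codingPath in
    // The last element of the path is the key currently being converted.
    let last = codingPath[codingPath.count - 1]
    return AnyKey(stringValue: fromSnakeCase(last.stringValue, exceptions: exceptions))
}

let json = Data(#"{"my_http_link": "https://example.com"}"#.utf8)
let decoded = try? decoder.decode(Foo.self, from: json)
print(decoded?.myHTTPLink.absoluteString ?? "decoding failed")
```

Note that this version also accepts a payload key like "my_hTTp_link", which is one of the open questions above; a stricter variant could require an exact lower-case match instead.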
Thoughts on this issue? Have you run into this case and had to work around it somehow? Curious to see if folks have hit this in practice.
(To be clear, at this point in the release, we are not looking to introduce new API into Swift 4.2; we are interested in feedback on a solution that would be both ergonomic and more easily correct than what we have in place today.)