Hi all,
Here's a preliminary pitch text. I have working code, but I haven't yet polished it into the series of individual PRs that I would like. There are also still two TODOs in the Detailed Design section.
But if anyone has feedback for the current state of the document, it would be very welcome!
Roundtripping key coding strategies
Introduction
Many encoders and decoders that can be used with the Codable system embrace the concept of key coding strategies. They were introduced with JSONEncoder from Foundation, but the general concept is so useful that it has been adopted by many other encoders and decoders.
A brief list of encoders and decoders that adopt the concept of key coding strategies:
Many of these implementations, including JSONEncoder, contain a flaw where not all encoded keys roundtrip correctly when using a key coding strategy.
For instance, the key imageURL will encode to image_url when used with the convertToSnakeCase keyEncodingStrategy from JSONEncoder. But that same key will be transformed to imageUrl when applying the convertFromSnakeCase keyDecodingStrategy from JSONDecoder.
The underlying issue is that there are two separate transformations involved. If there were only one transformation - from key to transformed key - then the issue could be fixed.
But isn't this just an issue with JSONEncoder and JSONDecoder from Foundation that is actually out of scope on Swift Evolution?
No, there is unfortunately an underlying reason why two transformations exist today:
KeyedDecodingContainer contains an API called allKeys that returns all the keys of the container. But internally these keys may be transformed, so a 'reverse' transformation must be applied in order to map to the key type of the KeyedDecodingContainer.
This basically means that any attempt at implementing key coding strategies must include a 'reverse' transformation, which again leads to the issue described.
Of course one could argue that this system is already broken by the introduction of key coding strategies (and I will argue that below), so something else is required if we want to fix the situation.
This proposal intends to provide alternative API to perform key coding strategies and also API to avoid key coding strategies for custom types in situations where the encoding/decoding would otherwise break.
Swift-evolution thread: [Pre-pitch] Roundtripping key coding strategies
Motivation
Today the Codable system has a leaky abstraction if used with encoders and decoders that perform transformations on their keys.
In Foundation this is currently only present in the JSONEncoder and JSONDecoder, but the issue described would be the same for any other encoders/decoders that attempt to do something similar to JSON(En/De)coder's key(En/De)codingStrategy.
The issue is that there is currently a pair of transformations present - and since the transformations are lossy, you don't necessarily get to the source key by encoding and decoding it.
For instance, if I set the keyEncodingStrategy of a JSONEncoder to .convertToSnakeCase and the corresponding keyDecodingStrategy of a JSONDecoder to .convertFromSnakeCase and use them with the following struct:
struct Person: Codable {
    var imageURL: URL
}
Then the encoding transform will produce:
{
    "image_url": "..."
}
But the decoding transform will go from snake case to camel case, trying to look up the key: imageUrl, which does not exist.
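The failure described above can be reproduced with Foundation's existing API. The following is a minimal sketch, not code from the pitch: it uses a String property instead of URL to keep the JSON short, and the helper names encodeSnakeCased and decodeSnakeCased are only illustrative.

```swift
import Foundation

struct Person: Codable {
    var imageURL: String
}

/// Encodes with `.convertToSnakeCase` and returns the produced JSON text.
func encodeSnakeCased(_ person: Person) throws -> String {
    let encoder = JSONEncoder()
    encoder.keyEncodingStrategy = .convertToSnakeCase
    let data = try encoder.encode(person)
    return String(decoding: data, as: UTF8.self)
}

/// Tries to decode with `.convertFromSnakeCase`; returns nil when decoding throws.
func decodeSnakeCased(_ json: String) -> Person? {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    return try? decoder.decode(Person.self, from: Data(json.utf8))
}
```

Encoding Person(imageURL: "a") yields the key image_url, and feeding that JSON to decodeSnakeCased returns nil, because the converted key imageUrl does not match the property name imageURL.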
This is a common source of bugs when using key coding strategies, and at least in code bases that I am familiar with, the workaround is often to add custom coding keys like:
enum CodingKeys: String, CodingKey {
    case imageURL = "imageUrl"
}
This allows the imageURL property to roundtrip when used with the snake case encoding and decoding, but this is a 'leaky abstraction', since the developer needs to be aware of the necessity for adding this key - and also this specific key is there to support a specific configuration option of a specific encoder/decoder pair.
Codable entities and the encoder/decoder they are used with are supposed to be decoupled, but in this situation the developer needs to know whether the codable entity is used with an encoder/decoder pair that uses key transformations, and also needs to remember to map the key correctly, so that it will be 'found' when converting back from snake case to camel case.
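To make the coupling concrete, here is a small sketch of the workaround in action, again with a String property for brevity. The roundTrip helper is a hypothetical name used only for this illustration.

```swift
import Foundation

struct Person: Codable, Equatable {
    var imageURL: String

    // The workaround: name the key the way convertFromSnakeCase will produce it.
    enum CodingKeys: String, CodingKey {
        case imageURL = "imageUrl"
    }
}

/// Encodes and decodes a person, optionally with the snake case strategies,
/// and returns both the JSON text and the decoded value.
func roundTrip(_ person: Person, snakeCase: Bool) throws -> (json: String, decoded: Person) {
    let encoder = JSONEncoder()
    let decoder = JSONDecoder()
    if snakeCase {
        encoder.keyEncodingStrategy = .convertToSnakeCase
        decoder.keyDecodingStrategy = .convertFromSnakeCase
    }
    let data = try encoder.encode(person)
    let decoded = try decoder.decode(Person.self, from: data)
    return (String(decoding: data, as: UTF8.self), decoded)
}
```

With snakeCase set to true the JSON key is image_url and the value roundtrips. With snakeCase set to false the same type emits imageUrl rather than imageURL, which shows how the custom key ties the type to one particular encoder configuration.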
Often I have seen attempts to 'fix' the behavior with the notation you would use if you didn't apply a key coding strategy:
enum CodingKeys: String, CodingKey {
    case imageURL = "image_url"
}
which of course is no good when used with snake case conversion, since the key that will be looked up is "imageUrl".
Other times I have seen developers thinking that the custom CodingKey implementation must be a mistake and removing it entirely, because unless you are very familiar with both the use case and the peculiarity of this mapping, the code does look a bit 'off'.
Finally having this custom coding key also means that you are in trouble if you wish to encode/decode the same entity with an encoder/decoder where you are not using a similar key transform.
As described in the introduction, this issue is not specific to JSONEncoder and JSONDecoder, since all encoder/decoder pairs are basically forced to provide two transformations in order to support the allKeys API on KeyedDecodingContainer. As soon as you have the two transformations, they are basically required to be 'lossless' in order to have any key roundtrip correctly.
An attempt to analyze the allKeys API
In order to figure out how to propose an alternative to the allKeys API, we must first analyze its use cases.
When decoding a simple fixed struct with synthesized CodingKeys, there is usually no use for allKeys.
One use case could be to count all keys to ensure that only the explicitly handled keys are present in the input. For this use case you only need the count of keys.
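A sketch of what such a key-counting check could look like with today's API. The type names AnyKey and StrictPerson and the helper decodeStrict are made up for this illustration. A second container keyed by a permissive key type is needed, since allKeys only reports keys that can be converted to the container's key type.

```swift
import Foundation

/// A key type that accepts any string, so `allKeys` reports every key in the input.
struct AnyKey: CodingKey {
    var stringValue: String
    var intValue: Int? { nil }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

struct StrictPerson: Decodable {
    var name: String

    enum CodingKeys: String, CodingKey, CaseIterable {
        case name
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)

        // Count every key in the input, not only those that map to CodingKeys.
        let everything = try decoder.container(keyedBy: AnyKey.self)
        guard everything.allKeys.count == CodingKeys.allCases.count else {
            throw DecodingError.dataCorrupted(.init(
                codingPath: decoder.codingPath,
                debugDescription: "Unexpected keys in input"))
        }
    }
}

/// Returns nil when the input is rejected.
func decodeStrict(_ json: String) -> StrictPerson? {
    try? JSONDecoder().decode(StrictPerson.self, from: Data(json.utf8))
}
```

Note that only the count is used here; the string values of the keys never matter, so this use case is unaffected by how the keys are transformed.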
Another use case is where the keys are dynamic, in the sense that they are perhaps not fully known by the author of the Codable type, but can be extended later on.
One such implementation can be seen with AttributedString here:
https://github.com/apple/swift-corelibs-foundation/blob/2db661061615dc366bd31af779d6f4551cb3197d/Sources/Foundation/AttributedString/AttributedStringCodable.swift#L493
The key type used for this KeyedDecodingContainer is AttributeKey, and it is precisely 'dynamic' in the sense that it can represent any String as its key value, and the exact use cases are unknown to the implementation since the AttributedString functionality contains an aspect of extensibility.
So what happens when encoding and decoding AttributedString using JSONEncoder and JSONDecoder with snake case key coding strategies? It fails to roundtrip text marked up with the .imageURL property. This property appears to be marked up using a key named NSImageURL. This is encoded to n_s_image_url and upon decoding this will look for a key named NSImageUrl, which does not exist.
https://forums.swift.org/t/pre-pitch-roundtripping-key-coding-strategies/52777/4
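The effect on dynamic keys can be observed without AttributedString. The sketch below uses a hypothetical DynamicKey type that accepts any string and a hypothetical KeyCollector type that records what allKeys reports when the decoder uses convertFromSnakeCase.

```swift
import Foundation

/// A fully dynamic key type: any string is a valid key.
struct DynamicKey: CodingKey {
    var stringValue: String
    var intValue: Int? { nil }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

/// Collects the keys reported by `allKeys` for the top-level object.
struct KeyCollector: Decodable {
    var keys: [String]

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: DynamicKey.self)
        keys = container.allKeys.map { $0.stringValue }.sorted()
    }
}

/// Decodes with `.convertFromSnakeCase` and returns the reported keys.
func reportedKeys(for json: String) throws -> [String] {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    return try decoder.decode(KeyCollector.self, from: Data(json.utf8)).keys
}
```

For the input {"image_url": 1} the reported key is imageUrl. A type that originally encoded the dynamic key imageURL therefore has no way to recover it from allKeys.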
Proposed Solution
The proposal is to introduce three changes. One of these is in the domain of Foundation, so it is out of scope for discussion in this forum. I do, however, feel that it is necessary to understand the complete picture, and I think that we could limit discussion about it on the forums to the question: 'do you think that it would be a good idea to create a PR containing these changes to swift-corelibs-foundation?', and of course then let Apple decide on whether or not to accept the changes.
Here are the proposed changes:
- Introduce an allRawKeys: [CodingKey] API on KeyedDecodingContainer and KeyedDecodingContainerProtocol.
  In order to not break backwards compatibility, KeyedDecodingContainerProtocol will supply a default implementation of allRawKeys that just returns allKeys, but authors of types conforming to KeyedDecodingContainerProtocol are advised to implement allRawKeys explicitly.
  Create a PR against swift-corelibs-foundation that adds support for allRawKeys for JSONDecoder.
- Introduce a protocol in the standard library named PreformattedCodingKey. Encoder and Decoder implementations that support some form of key coding strategy would be advised to implement opting out of key coding strategies for CodingKey types that conform to PreformattedCodingKey.
  Create a PR against swift-corelibs-foundation that respects the PreformattedCodingKey for JSONEncoder and JSONDecoder.
- Create a PR against swift-corelibs-foundation that deprecates JSONEncoder's keyEncodingStrategy as well as JSONDecoder's keyDecodingStrategy and introduces a common keyCodingStrategy that is a transformation in the direction from a CodingKey to an encoded key.
How do these changes help?
For use cases where the coding keys are completely dynamic, any key coding strategy will have the possibility of transforming the keys into a shape that cannot be recognized upon decoding. In that situation it could be relevant to let the CodingKey in question conform to PreformattedCodingKey in order to completely opt out of having the keys transformed upon encoding and decoding.
In order to get a peek into the decoding process, or perhaps check the number of keys, the allRawKeys API on KeyedDecodingContainer could be a solution.
In order to have your synthesized CodingKeys round trip correctly without any manual key mapping or knowledge about how keys are transformed during encoding, use a keyCodingStrategy like useSnakeCase.
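The reason a single transformation roundtrips can be sketched in a few lines: the decoder transforms the requested key in the same direction as the encoder did and looks the result up among the untransformed keys of the payload. The snippet below is only an illustration under that assumption. The toSnakeCase function is a simplified converter, not Foundation's algorithm, and value(forKey:in:) is a hypothetical stand-in for a keyed container lookup.

```swift
/// Simplified camelCase to snake_case conversion (an illustration only,
/// not Foundation's exact algorithm).
func toSnakeCase(_ key: String) -> String {
    let chars = Array(key)
    var result = ""
    for (index, char) in chars.enumerated() {
        if char.isUppercase && index > 0 {
            let previous = chars[index - 1]
            let nextIsLowercase = index + 1 < chars.count && chars[index + 1].isLowercase
            // Start a new word after a lowercase letter or digit, or when an
            // uppercase run ends and a capitalized word begins.
            if previous.isLowercase || previous.isNumber || (previous.isUppercase && nextIsLowercase) {
                result.append("_")
            }
        }
        result.append(contentsOf: char.lowercased())
    }
    return result
}

/// Single-transformation lookup: the requested key is transformed in the
/// same direction as during encoding, then looked up among the raw keys.
func value(forKey key: String, in rawContainer: [String: String]) -> String? {
    rawContainer[toSnakeCase(key)]
}
```

Because the payload keys are never converted back, the lossy nature of the transformation no longer matters: looking up imageURL in ["image_url": "a"] finds the value.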
Examples
Here is a repository demonstrating a version of JSONEncoder and JSONDecoder that deprecates keyEncodingStrategy and keyDecodingStrategy respectively and introduces a shared keyCodingStrategy instead.
It also respects conformance to the included PreformattedCodingKey protocol (although this pitch proposes that this protocol is added to the Swift standard library and not to Foundation).
https://github.com/mortenbekditlevsen/JSONCoder
Here are some of the included tests:
final class JSONEncoderTests: XCTestCase {
    func testUseSnakeCase() throws {
        struct Model: Codable {
            var imageURL: String
        }
        let encoder = JSONEncoder()
        encoder.keyCodingStrategy = .useSnakeCase
        let data = try encoder.encode(Model(imageURL: "a"))
        let expectedString = "{\"image_url\":\"a\"}"
        XCTAssertEqual(String(data: data, encoding: .utf8), expectedString)
        let decoder = JSONDecoder()
        decoder.keyCodingStrategy = .useSnakeCase
        let model = try decoder.decode(Model.self, from: data)
        XCTAssertEqual(model.imageURL, "a")
    }
    // NOTE: This only works in tests and is only included for
    // illustrative purposes.
    func testCustom() throws {
        struct Model: Codable {
            var imageURL: String
        }
        struct MyCodingKey: CodingKey {
            var stringValue: String
            var intValue: Int? { nil }
            init(stringValue: String) {
                self.stringValue = stringValue
            }
            init?(intValue: Int) {
                self.stringValue = "\(intValue)"
            }
        }
        let encoder = JSONEncoder()
        encoder.keyCodingStrategy = .custom({ codingPath in
            MyCodingKey(stringValue: "\(codingPath.last?.stringValue.hash ?? 0)")
        })
        let data = try encoder.encode(Model(imageURL: "a"))
        let expectedString = "{\"3520785955319405054\":\"a\"}"
        XCTAssertEqual(String(data: data, encoding: .utf8), expectedString)
        let decoder = JSONDecoder()
        decoder.keyCodingStrategy = encoder.keyCodingStrategy
        let model = try decoder.decode(Model.self, from: data)
        XCTAssertEqual(model.imageURL, "a")
    }
    func testPreformattedKey() throws {
        struct MyPreformattedCodingKey: PreformattedCodingKey {
            var stringValue: String
            var intValue: Int? { nil }
            init(stringValue: String) {
                self.stringValue = stringValue
            }
            init?(intValue: Int) {
                self.stringValue = "\(intValue)"
            }
        }
        struct Model: Codable {
            var imageURL: String
            func encode(to encoder: Encoder) throws {
                var container = encoder.container(keyedBy: MyPreformattedCodingKey.self)
                try container.encode(imageURL, forKey: .init(stringValue: "imageURL"))
            }
            init(imageURL: String) {
                self.imageURL = imageURL
            }
            init(from decoder: Decoder) throws {
                let container = try decoder.container(keyedBy: MyPreformattedCodingKey.self)
                self.imageURL = try container.decode(String.self, forKey: .init(stringValue: "imageURL"))
            }
        }
        let encoder = JSONEncoder()
        encoder.keyCodingStrategy = .useSnakeCase
        let data = try encoder.encode(Model(imageURL: "a"))
        // The key is emitted untransformed because it is preformatted.
        let expectedString = "{\"imageURL\":\"a\"}"
        XCTAssertEqual(String(data: data, encoding: .utf8), expectedString)
        let decoder = JSONDecoder()
        decoder.keyCodingStrategy = .useSnakeCase
        let model = try decoder.decode(Model.self, from: data)
        XCTAssertEqual(model.imageURL, "a")
    }
}
Detailed Design
Adding PreformattedCodingKey
The proposed solution adds a new protocol, PreformattedCodingKey:
/// Suggests to `Codable` encoders and decoders that no key encoding or
/// decoding ought to be performed on `CodingKey`s of this type.
@available(macOS 9999, iOS 9999, watchOS 9999, tvOS 9999, *)
public protocol PreformattedCodingKey { }
Handle PreformattedCodingKey conforming keys in JSONEncoder
private func _converted(_ key: CodingKey) -> CodingKey {
    // Use the plain key if it is preformatted
    if key is PreformattedCodingKey {
        return key
    }
    switch encoder.options.keyEncodingStrategy {
    case .useDefaultKeys:
        return key
    case .convertToSnakeCase:
        let newKeyString = JSONEncoder.KeyEncodingStrategy._convertToSnakeCase(key.stringValue)
        return _JSONKey(stringValue: newKeyString, intValue: key.intValue)
    case .custom(let converter):
        return converter(codingPath + [key])
    }
}
Handle PreformattedCodingKey conforming keys in JSONDecoder
private struct _JSONKeyedDecodingContainer<K : CodingKey> : KeyedDecodingContainerProtocol, TestKeyedDecodingContainerProtocol {
    ...
    /// Initializes `self` by referencing the given decoder and container.
    init(referencing decoder: __JSONDecoder, wrapping container: [String : Any]) {
        self.decoder = decoder
        self.codingPath = decoder.codingPath
        // Use the plain container if the keys are preformatted
        guard !(Key.self is PreformattedCodingKey.Type) else {
            self.container = container
            return
        }
        switch decoder.options.keyDecodingStrategy {
        case .useDefaultKeys:
            self.container = container
        case .convertFromSnakeCase:
            // Convert the snake case keys in the container to camel case.
            // If we hit a duplicate key after conversion, then we'll use the first one we saw. Effectively an undefined behavior with JSON dictionaries.
            self.container = Dictionary(container.map {
                key, value in (JSONDecoder.KeyDecodingStrategy._convertFromSnakeCase(key), value)
            }, uniquingKeysWith: { (first, _) in first })
        case .custom(let converter):
            self.container = Dictionary(container.map {
                key, value in (converter(decoder.codingPath + [_JSONKey(stringValue: key, intValue: nil)]).stringValue, value)
            }, uniquingKeysWith: { (first, _) in first })
        }
    }
Handling allRawKeys
TODO: Add actual suggested code here:
- Introduce new API on KeyedDecodingContainerProtocol
- Default implementation returning allKeys
- New API on the KeyedDecodingContainer
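Until the actual suggested code is added, the shape of the default implementation can be approximated outside the standard library. This is a hypothetical sketch and not the proposed implementation: in the proposal allRawKeys would be a protocol requirement, whereas here it is a plain extension member so that the snippet compiles today. DynamicKey, RawKeyCollector and rawKeys(for:) are illustrative names.

```swift
import Foundation

// Hypothetical stand-in for the proposed API: a default `allRawKeys`
// that simply falls back to `allKeys`.
extension KeyedDecodingContainerProtocol {
    var allRawKeys: [CodingKey] {
        allKeys.map { $0 as CodingKey }
    }
}

/// A fully dynamic key type: any string is a valid key.
struct DynamicKey: CodingKey {
    var stringValue: String
    var intValue: Int? { nil }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

/// Records the raw keys of the top-level object.
struct RawKeyCollector: Decodable {
    var rawKeys: [String]

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: DynamicKey.self)
        rawKeys = container.allRawKeys.map { $0.stringValue }.sorted()
    }
}

/// Decodes without any key strategy and returns the raw keys.
func rawKeys(for json: String) throws -> [String] {
    try JSONDecoder().decode(RawKeyCollector.self, from: Data(json.utf8)).rawKeys
}
```

With this fallback the result is only as good as allKeys. A decoder that applies a key strategy would need its own allRawKeys implementation to return the keys as they appear in the payload.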
Handling useSnakeCase
TODO: Add actual suggested code here:
- Introduce keyCodingStrategy on JSONEncoder and JSONDecoder in both Darwin Foundation overlay and swift-corelibs-foundation.
- Deprecate keyEncodingStrategy and keyDecodingStrategy
- Implement strategies. If a default keyCodingStrategy is used, there should be a fallback to the deprecated encoding and decoding strategies.
Impact on Existing Code
The allRawKeys API is additive and comes with a default implementation, so all existing KeyedDecodingContainerProtocol conformers will continue to compile, although it would be advisable to implement a specialized version.
The PreformattedCodingKey protocol also has no direct impact, since adoption of this protocol is additive.
Note that conforming an existing CodingKey to PreformattedCodingKey will change its encoding and decoding behavior, so that must be done with thought about how this intersects with current and future use of key coding strategies.
There will be deprecation warnings for the existing keyEncodingStrategy and keyDecodingStrategy, but opting in to a keyCodingStrategy can be done at the leisure of the user.
With regard to any current Decodable conforming type that uses allKeys from the KeyedDecodingContainer upon decoding, I have demonstrated above that it is not reliable when used together with convertFromSnakeCase. Deprecating this API will let this fact be known and allow the author to take steps to using allRawKeys or alternatively let the CodingKey in use conform to PreformattedCodingKey.
Alternatives Considered
Using Dictionary instead of PreformattedCodingKey
You can already today ensure that key coding will not be performed on your CodingKey by leveraging the fact that keys in Dictionary are treated as data and not as CodingKeys.
This knowledge can be used to circumvent key coding strategies today. As can be seen in the following gist, the ergonomics are quite horrible, so conforming your CodingKey to PreformattedCodingKey seems like a great win.
Don't touch my keys:
https://gist.github.com/mortenbekditlevsen/7918fb98638f8a9e2b017f0fad12da0b
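The underlying behavior is easy to observe in isolation. The following sketch is not taken from the gist; it only shows that Foundation's JSONEncoder leaves the keys of a [String: Int] dictionary alone even when a snake case strategy is set. The helper name encodeDictionary is illustrative.

```swift
import Foundation

/// Encodes a string-keyed dictionary with `.convertToSnakeCase` enabled
/// and returns the produced JSON text.
func encodeDictionary(_ dictionary: [String: Int]) throws -> String {
    let encoder = JSONEncoder()
    encoder.keyEncodingStrategy = .convertToSnakeCase
    let data = try encoder.encode(dictionary)
    return String(decoding: data, as: UTF8.self)
}
```

Encoding ["imageURL": 1] this way produces a JSON object whose key is still imageURL, since dictionary keys are treated as data.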
Graceful fallback for allKeys
Even when using useSnakeCase for a JSONEncoder, the allKeys method on KeyedDecodingContainer could still use the convertFromSnakeCase transformation to recover the same keys as it does today when using the convertFromSnakeCase key decoding strategy. There is, however, no obvious choice for the custom case here, and I guess that in the long run it could easily cause more confusion than benefit.
Full allKeys support for simple enum backed CodingKeys
I did a small hack based on the great work by @stephencelis and Brandon Williams in their swift-case-paths library.
This hack basically allows you to generate an array of all cases of an enum without associated values. In other words, it queries the runtime to return information that is comparable to what the CaseIterable conformance gives us at compile time.
Using that hack, the allKeys implementation could iterate over all cases of your CodingKey when that CodingKey is an enum like the synthesized ones.
Having access to these cases allows you to perform the key coding transformation correctly and return a list of keys present in the KeyedDecodingContainer.
A fallback, in case your CodingKey is not a plain enum-backed version, could be to attempt initializing the CodingKey directly from the encoded key in the KeyedDecodingContainer, or the fallback could even be to attempt the graceful fallback described in the section above.
While fun to play around with, this solution seems a bit strange, and as it mainly only fully works with plain enums, I don't consider it fit for actual use.
Acknowledgements
A huge thanks to @norio_nomura for the original PR to include useSnakeCase as a key coding strategy.
Many thanks to everyone providing feedback on the pre-pitch discussion.
Revision history
EDIT:
Removed a leftover suggestion about also deprecating allKeys. Following a suggestion from Itai Ferber above, I am currently not suggesting to deprecate it.