[Pre-pitch] Roundtripping key coding strategies

Hi all,

The current state of key coding strategies with Codable

Today the Codable system has a leaky abstraction if used with encoders and decoders that perform transformations on their keys.

In Foundation this currently only affects JSONEncoder and JSONDecoder, but the issue I'm describing applies equally to any other encoder/decoder that attempts something similar to the JSONEncoder/JSONDecoder keyEncodingStrategy/keyDecodingStrategy.

The issue is that there is currently a pair of transformations in play - and since the transformations are lossy, encoding and then decoding a key does not necessarily get you back to the source key.

For instance, if I set the keyEncodingStrategy of a JSONEncoder to .convertToSnakeCase and the keyDecodingStrategy of a JSONDecoder to .convertFromSnakeCase and use them with the following struct:

struct Person: Codable {
  var imageURL: URL
}

Then the encoding transform will produce:

{
  "image_url": "..."
}

But the decoding transform goes from snake case to camel case, trying to look up the key imageUrl, which does not exist.
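The failure can be reproduced in a few lines (the URL value is just an example):

```swift
import Foundation

struct Person: Codable {
    var imageURL: URL
}

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

let person = Person(imageURL: URL(string: "https://example.com/image.png")!)
let data = try! encoder.encode(person)
// The payload now contains the key "image_url".

// .convertFromSnakeCase turns "image_url" into "imageUrl", but the
// synthesized CodingKeys expect "imageURL" - so decoding throws keyNotFound.
var roundtripped = false
do {
    _ = try decoder.decode(Person.self, from: data)
    roundtripped = true
} catch {
}
print(roundtripped ? "roundtrip succeeded" : "roundtrip failed")
```
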

This is a common source of bugs when using key coding strategies, and at least in code bases that I am familiar with, the workaround is often to add custom coding keys like:

enum CodingKeys: String, CodingKey {
  case imageURL = "imageUrl"
}

This allows the imageURL property to roundtrip when used with snake case encoding and decoding, but it is a 'leaky abstraction'.
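For reference, here is how the workaround plays out end to end, using the example type from above:

```swift
import Foundation

struct Person: Codable {
    var imageURL: URL

    // The 'leaky' workaround: map to the camel-cased name that the
    // decoding transform will produce from "image_url".
    enum CodingKeys: String, CodingKey {
        case imageURL = "imageUrl"
    }
}

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

let data = try! encoder.encode(Person(imageURL: URL(string: "https://example.com/")!))
// "imageUrl" is snake-cased to "image_url" on the way out...
let decoded = try! decoder.decode(Person.self, from: data)
// ...and "image_url" becomes "imageUrl" on the way in, which now matches.
print(decoded.imageURL)
```
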

Codable entities and the encoder/decoder they are used with are supposed to be decoupled. But in this situation the developer needs to know whether the codable entity is used with an encoder/decoder pair that applies key transformations - and must also remember to map the key correctly, so that it will be 'found' when converting back from snake case to camel case.

Often I have seen attempts to 'fix' the behavior with the notation you would use if you didn't apply a key coding strategy:

enum CodingKeys: String, CodingKey {
  case imageURL = "image_url"
}

which of course is no good when used with snake case conversion, since the key that will be searched for is "imageUrl".

Other times I have seen developers assume that the custom CodingKey implementation must be a mistake and remove it entirely, because unless you are very familiar with both the use case and the peculiarity of this mapping, the code does look a bit 'off'.

Finally, having this custom coding key also means that you are in trouble if you wish to encode/decode the same entity with an encoder/decoder that does not apply a similar key transform.

The road to a solution

So the basic issue is that we can't make a perfect inverse of a lossy transformation.

One solution (credit goes to @norio_nomura, who made a PR proposing a solution that unfortunately got closed during a PR cleanup: Add useSnakeCasedKeys to JSONDecoder.KeyDecodingStrategy by norio-nomura · Pull Request #14039 · apple/swift · GitHub) is:

  1. Recognize that if you have two lossy functions, a and b, then b(a(input)) won't equal input for all values of input.
  2. If you only have one function and treat the coding keys as the only source of truth, then you can get a perfect mapping.
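As a sketch of point 2 (the names and the simplified snake-case transform below are my own assumptions, not Foundation code): a decoder holding the raw payload keys never needs an inverse - when asked for a key, it applies the single encoding-direction transform to the key the type requests, and looks that up.

```swift
// One transform, used during both encoding and decoding. This toy version
// splits on every uppercase letter, unlike Foundation's grouping of acronym
// runs - the point is only that both directions share the *same* function.
func encodedKey(_ key: String) -> String {
    var out = ""
    for ch in key {
        if ch.isUppercase {
            out.append("_")
            out.append(Character(ch.lowercased()))
        } else {
            out.append(ch)
        }
    }
    return out
}

// Encoding: transform the CodingKey's string before storing.
var storage: [String: String] = [:]
storage[encodedKey("imageURL")] = "https://example.com/"

// Decoding: transform the *requested* key and look it up - no inverse needed,
// so the coding keys of the type remain the single source of truth.
func decodeValue(forKey key: String, in storage: [String: String]) -> String? {
    storage[encodedKey(key)]
}

print(decodeValue(forKey: "imageURL", in: storage) ?? "missing")
```
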

Fortunately, most APIs that deal with coding keys across the entire Codable surface area take the original coding key as input. This is the brilliance of the PR above: the transformation of the coding keys during both encoding and decoding is actually a transformation of the original coding key.

Unfortunately, there's one API where this is not true - namely:

var allKeys: [KeyedDecodingContainer<K>.Key] { get }

on KeyedDecodingContainer.

Since this API only has the "encoded" coding keys as its source, it fundamentally depends on the inverse transformation that the proposed solution tries to get rid of.

Can this issue be solved?

I don't know.

One idea would be to deprecate the allKeys API and introduce a similar allRawKeys property that returns an array of anonymous CodingKeys taken directly from the input.
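The 'anonymous CodingKey' could be a trivial wrapper along these lines (RawKey is a hypothetical name, purely for illustration):

```swift
// A hypothetical anonymous key type that an allRawKeys property could
// return: it carries the payload's key string exactly as it appears,
// with no transformation applied.
struct RawKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init?(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }

    init?(intValue: Int) {
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}

let keys = ["image_url", "name"].compactMap(RawKey.init(stringValue:))
print(keys.map(\.stringValue))
```
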

For myself - in all the time I've been using Codable, I have not directly been using the allKeys API.

If I have a case of 'dynamically keyed data' I tend to always decode a Dictionary<String, Something>, since decoding a Dictionary implicitly opts out of key decoding (the Dictionary is treated as data, not as a container with coding keys as keys).

One use case I can see for this API is to test if the decoding container only contains the keys that I am interested in, and no others. But that could also be solved by counting the allRawKeys array.

My gut feeling is: If you use key coding strategies together with the allKeys API, are you not already in trouble? In what sort of dynamic situation would it be ok and even expected for you to get back an array of transformed keys using this API?

So this leads me to ask some questions in this forum:

  • How have you used the allKeys API on KeyedDecodingContainer?

  • Have you used it together with key coding strategies?

  • Have you used it for anything other than getting the count of keys?

  • Would you be able to replace your use of allKeys with something like the proposed allRawKeys API? Or even by decoding a Dictionary and dealing with that afterwards?

Final thoughts

A solution to this issue does not only benefit JSONEncoder/JSONDecoder, but any encoder/decoder attempting to map keys in the process.

With a solution in place, you would be able to use almost any function as a transform - for instance, your keyCodingStrategy could hash the keys in some way, and you would still be able to decode the hashed keys into the original coding keys. That sounds like a powerful abstraction to me. :-)

Note: I have deliberately not mentioned keys that would clash using a specific key coding strategy:

struct Person: Codable {
  var imageURL: URL
  var imageUrl: URL
}

This situation is neither improved nor made worse by only having a key transform in one direction.

I don't have much to add here beyond what's been discussed already, but some assorted thoughts/links:

  • Lots of good background in the original PR and in the followup thread, which I think are worth reading
  • Gut feeling: this is a problem mostly created by JSONEncoder for itself — I'm not sure how many other encoders allow key modification in this way. It may be entirely sufficient to confine the solution there
    • One thing to keep in mind is that the key encoding strategies currently allow for a .custom strategy, which may not follow the same rules as the strategies one might have in mind. I don't know how often .custom is used, but it is there
  • I think allRawKeys could work (and we may not need to deprecate allKeys if we can just offer the raw version), though adding it at this point would require giving it a default implementation to prevent source breakage
    • You could theoretically just have the property default to returning allKeys, though it may be a trap to forget to implement it

I'd love to be able to solve some of the inconsistency here too — we may need some input from some Foundation folks if they have time to spend on the problem, especially if the solution is constrained to Foundation-specific code (e.g. altering the JSONEncoder/JSONDecoder strategies) instead of changing all containers. (It does appear that the new AttributedString decodable implementation uses allKeys for decoding containers of attributes, so they may have some thoughts about how changes here might affect the implementation.)

Hi Itai,
Thank you for your comments - and for adding a link to the followup thread.

I am going to experiment with AttributedString to see if it may already have issues with encoding/decoding properties when using key coding strategies. It may at least have issues if you define custom attributes and use multiple uppercase letters in a row.

I think that, in time, the very best outcome would be to deprecate all existing key coding strategies and replace them with ones that only go in one direction. The .custom one, too, could be replaced with one that always goes in the direction from key to 'encoded key' on both the encoder and the decoder.

I think that even though the issue arises through the key coding strategy implementation in JSONEncoder and JSONDecoder, the issue is not actually with them, but with the fact that the allKeys API exists (and I know that it exists for a perfectly good reason) - because its existence basically requires anyone who wants key conversion to implement it as two separate functions instead of one.

A quick followup about the AttributedString Codable conformance:

It already doesn't play well with snake case key coding strategy:

let input = AttributedString("Hello").settingAttributes(.init([.imageURL : URL(string: "https://example.com/")!]))
print("Before: ", input)

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

let encoded = try! encoder.encode(input)
print("Encoded: ", String(data: encoded, encoding: .utf8)!)

let decoded = try! decoder.decode(AttributedString.self, from: encoded)
print("After: ", decoded)

Outputs the following:

Before:  Hello {
	NSImageURL = https://example.com/
}
Encoded:  ["Hello",{"n_s_image_url":{"relative":"https:\/\/example.com\/"}}]
After:  Hello {
}

So it suffers from a similar (but not quite the same) issue as the one described above - parsing doesn't fail, it just leaves an 'empty' attribute covering the "Hello" String.

I think that there may be some learning from this - I hope that I can formulate it clearly:

First of all my intuition about 'dynamic' keys still holds with this example: namely that it doesn't play well with key coding strategies.

Secondly: is there a way to fix this? The only thing I can think of is a mechanism to explicitly opt out of key coding strategies. You actually already get that when decoding dictionaries keyed by Strings - the keys are left alone. I suspect that dictionary decoding wouldn't work in this case, however, but perhaps we can think of another way to opt out of the key coding? Another 'marker' protocol on CodingKey, perhaps?

Just wanted to chime in that XMLCoder maintained by @Max_Desiatov also has the concept of key coding strategies.
The Firebase open source iOS sdk uses a JSONEncoder-inspired encoder/decoder pair, that also supports key coding.

Just to say that the general concept is definitely useful across more encoders and decoders, and I believe that a solution to the issue would be welcome in other projects too.

I’m having a bit of an issue figuring out how to focus this pitch:

  1. Just suggest the addition of new key coding strategies that basically implement the ideas presented in the PR linked above.
  2. Suggest an allRawKeys API.
  3. Suggest a way to opt out of key coding (this makes the general concept of key coding into something that the Codable API now needs to know about).

Since applying key coding strategies already breaks round tripping today for types like AttributedString, one could argue that 1 could stand alone and improve one area (fixing the leaky abstraction) without incurring additional issues (which are at least conceptually present and may be quite well hidden).

The only thing is: this has already been attempted (the linked PR), but it got stranded in internal API review.

I could go for proposing 2, but I am unsure how much it would actually solve. I don’t think using that API could fix the AttributedString round tripping issue.

Then there is 3. That could potentially be used to fix AttributedString round tripping. The cost is another marker protocol - together with a new thing to learn when implementing (quite advanced) Codable types.

What do you think?

GRDB has it as well. The implementation blindly follows Foundation (because when in doubt I grab inspiration from standards), and is also unable to roundtrip in some specific cases: fooID -> foo_id -> fooId.

2 & 3 may work together. Instead of (or in addition to) allRawKeys, you could have allKeys be available only when Key conforms to PreformattedCodingKey.

That's a really neat idea!
I'm not certain how this would look at the protocol level, though. You can extend the functionality of a concrete type based on the protocol conformance of its generic parameters, but can you similarly extend a protocol requirement based on the protocol conformance of an associated type?

And also: could this be implemented in a backwards compatible way?

Hi all,
Thank you for the comments so far! :-)

I think that I am soon ready to formalize the presented ideas.
So basically there are three parts that can be thought of as independent:

  1. Introduce key coding strategies that only go in the direction of the encoding (from CodingKey of the type in question to encoded representation). Deprecate old key encoding and decoding strategies (in Foundation!)
  2. Introduce a new protocol (I like the suggested name: PreformattedCodingKey). Skip key coding strategies for keys conforming to this protocol during both encoding and decoding.
  3. Introduce a protocol requirement to KeyedDecodingContainerProtocol for an allRawKeys: [CodingKey] property. Add a default implementation that returns allKeys for backwards compatibility.
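Parts 2 and 3 might look roughly like this at the protocol level (a sketch only - names follow the discussion above, and the real proposal would add allRawKeys as a requirement with this default, not just an extension):

```swift
import Foundation

// Part 2: a marker protocol. An encoder/decoder that applies key coding
// strategies would leave keys conforming to this protocol untouched.
protocol PreformattedCodingKey: CodingKey {}

// Part 3: a default implementation falling back to allKeys, so existing
// container implementations stay source compatible.
extension KeyedDecodingContainerProtocol {
    var allRawKeys: [CodingKey] { allKeys }
}

// Quick check that the default is visible on a real container:
var seenRawKeys: [String] = []

struct Probe: Decodable {
    init(from decoder: Decoder) throws {
        enum K: String, CodingKey { case a }
        let container = try decoder.container(keyedBy: K.self)
        seenRawKeys = container.allRawKeys.map(\.stringValue)
    }
}

_ = try! JSONDecoder().decode(Probe.self, from: Data(#"{"a":1}"#.utf8))
print(seenRawKeys)
```
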

Regarding 1:

My suggestion would be to add a new enum, KeyCodingStrategy, that is shared between JSONEncoder and JSONDecoder. The sharing would help make it evident that there is only one transform in play.

Add a new option to JSONEncoder and JSONDecoder called keyCodingStrategy, while deprecating keyEncodingStrategy and keyDecodingStrategy on the two types respectively.

The rationale is that even though individual cases can be deprecated, it should be made clear that a new .custom case also works in the direction of encoding when used for decoding. Deprecating the existing .custom case would require a new name to be invented. And the argument for sharing strategies between encoder and decoder also suggests that deprecating the existing strategies would be sensible.
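As a rough shape of the suggested type (the name and cases are only an illustration of the idea; the .custom transform always runs from the type's CodingKey towards the encoded representation):

```swift
// One strategy type shared by encoder and decoder - so there is exactly
// one transform, always applied in the encoding direction.
enum KeyCodingStrategy {
    case useDefaultKeys
    case convertToSnakeCase
    case custom((CodingKey) -> String)
}

enum K: String, CodingKey { case imageURL }

// A custom strategy only ever maps key -> encoded name; a decoder applies
// the same mapping to the key it is asked for, so no inverse is needed.
let strategy = KeyCodingStrategy.custom { $0.stringValue.lowercased() }
var encodedName = ""
if case .custom(let transform) = strategy {
    encodedName = transform(K.imageURL)
}
print(encodedName)
```
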

A rather big issue: This is all in Foundation, so basically outside of the realm of SE...

Does anyone know of a good way to approach this? The previous PR, which almost does what I am suggesting here, got stranded for two and a half years and was then closed.

Note that this step may of course be taken by any non-Foundation encoder/decoder pairs today already.

Regarding 2 and 3:

Both of these could definitely be discussed in SE, and respecting the protocol would be natural for both JSONEncoder/JSONDecoder and other encoder/decoder pairs.

It would be natural for types like AttributedString to opt in to PreformattedCodingKey for their coding keys, but again this lives in Foundation, so the call ought to be made somewhere else.

What do you think?

This all sounds pretty reasonable, and I think these suggestions are all worth pitching, at the very least. But you're right — a lot of work here would need to be done in/by Foundation, so it's worth trying to get someone from the team involved to see if they might be interested in this endeavor. (/cc @Tony_Parker)

Regarding PreformattedCodingKey — I think one major question from Codable-type authors will be: "can I get automated CodingKey synthesis but also get the keys to be treated as pre-formatted?" I think the reasonable answer is "no", and I think it's not such a big concern: if your type has strong opinions about how its keys should be handled, it's not unreasonable to need to do a bit of work to indicate that. But, just something to consider, and to spell out in a potential pitch.

I agree that automated CodingKey synthesis implies treating the keys as keys rather than as 'data' (which is sort of how I see PreformattedCodingKey: the keys are data that should be left alone).

I'll add a few other people to this thread: @drexin (since you are reviewing my PR related to SE-0320 which also regards Codable), @tomerd (since you initiated the broader discussion about improvements to Serialization in Swift).
