Roundtripping key coding strategies

Morten_Bek_Ditlevsen · August 4, 2022, 1:53pm

Hi Itai,
I'm back at the keyboard after a long (and most excellent) summer break.

This could definitely fix a large percentage of the issues that people are likely to run into today, at least when using English abbreviations. But I do think that it feels a bit like a workaround, and when you do need to extend the list, it might be unclear why that is needed.

There's also an aspect that this perhaps does not help with: decoding in a dynamic fashion using the allKeys API together with a CodingKey that returns a key for any String.

First, consider a String-based enum CodingKey like:

```swift
enum CodingKeys: String, CodingKey {
  case imageURL
}
```

and the encoded string version of the key: image_url. The default 'from snake case' conversion yields 'imageUrl', for which the enum initializer returns nil. In this case your suggested strategy could replace 'Url' with 'URL' and try initializing the coding key with 'imageURL', which would succeed.
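To make the enum case concrete, here is a minimal sketch of such a retry. The helpers `camelCase(fromSnakeCase:)` and `resolveKey(_:as:abbreviations:)` are hypothetical names made up for illustration, and the conversion is a simplified stand-in, not Foundation's actual implementation:

```swift
import Foundation

enum CodingKeys: String, CodingKey {
    case imageURL
}

// Simplified stand-in for the 'from snake case' conversion: "image_url" -> "imageUrl".
// (Assumption: ignores edge cases such as leading or trailing underscores.)
func camelCase(fromSnakeCase key: String) -> String {
    let words = key.split(separator: "_").map(String.init)
    guard let first = words.first else { return key }
    let rest = words.dropFirst().map { (word: String) -> String in
        word.prefix(1).uppercased() + String(word.dropFirst())
    }
    return first + rest.joined()
}

// Hypothetical helper: try the plain conversion first, then retry with each
// known abbreviation upper-cased ("Url" -> "URL") until the failable
// CodingKey initializer accepts a candidate.
func resolveKey<K: CodingKey>(_ encoded: String, as type: K.Type, abbreviations: [String]) -> K? {
    let converted = camelCase(fromSnakeCase: encoded)
    if let key = K(stringValue: converted) { return key }
    for abbreviation in abbreviations {
        let titleCased = abbreviation.prefix(1).uppercased() + abbreviation.dropFirst().lowercased()
        let candidate = converted.replacingOccurrences(of: titleCased, with: abbreviation.uppercased())
        if let key = K(stringValue: candidate) { return key }
    }
    return nil
}

let key = resolveKey("image_url", as: CodingKeys.self, abbreviations: ["URL"])
print(key?.stringValue ?? "nil") // prints "imageURL"
```

The retry only works because the enum's initializer fails for 'imageUrl'; that failure is the signal that another candidate should be tried.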

But if we consider the dynamic case, where the coding key is something like:

```swift
struct MyCodingKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = Int(stringValue)
    }

    init(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }
}
```

This coding key has a non-failing initializer, so the algorithm applying the list of abbreviations could never know whether the 'correct' key is the one based on the string 'imageUrl' or the one based on 'imageURL'.
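The difference between the two kinds of keys can be shown in a few lines. This is only an illustration; `accepts(_:as:)` is a hypothetical helper, not an existing API:

```swift
enum CodingKeys: String, CodingKey {
    case imageURL
}

struct MyCodingKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = Int(stringValue)
    }

    init(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }
}

// Hypothetical helper: does the key type accept this candidate string?
// This is the only signal a generic decoder has when probing candidates.
func accepts<K: CodingKey>(_ candidate: String, as type: K.Type) -> Bool {
    K(stringValue: candidate) != nil
}

print(accepts("imageUrl", as: CodingKeys.self))   // false
print(accepts("imageURL", as: CodingKeys.self))   // true
print(accepts("imageUrl", as: MyCodingKey.self))  // true
print(accepts("imageURL", as: MyCodingKey.self))  // true
```

For the enum, exactly one candidate is accepted, so the abbreviation list can disambiguate. For the dynamic key, every candidate is accepted, so there is nothing to disambiguate with.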

These considerations reinforce the view that there is a huge difference between the enum-based coding keys that you get automatically synthesized and the dynamic case, where the coding keys can hold any value and allKeys is used to inspect the payload.

With this in mind, I think that it would be a noble goal to fix the autosynthesized Codable conformance 100%, with no workarounds or extra knowledge needed. This fix will of course also work for all manual Codable conformances that do not use the allKeys API.

My personal guess is that such a solution will solve a much larger percentage of issues than the extensible list of abbreviations will solve on its own. And it needs less 'magic sauce', in my view.

What remains is the 'dynamic' decoding with flexible coding keys and the allKeys API.
Regarding that we can state:

- The combination of dynamic keys and JSONDecoder key decoding strategies is already somewhat broken (please see [Pre-pitch] Roundtripping key coding strategies - #13 by Morten_Bek_Ditlevsen)
- Adding a list of known abbreviations will not fix dynamic key decoding since we cannot know if these dynamic keys follow the 'rules' defined by this list.
- A solution that always returns an empty array for the allKeys API would make the situation for dynamic key decoding worse.
- The only way to completely fix the dynamic key coding is to not transform the keys - either by not using any key coding strategies OR by adding a protocol to the standard library that suggests to encoders and decoders that they should not encode or decode the keys even though they might have a key coding strategy set.

Combining these things I propose another solution that can be implemented entirely in JSONEncoder and JSONDecoder in Foundation, namely the solution previously suggested at: [Pre-pitch] Roundtripping key coding strategies - #13 by Morten_Bek_Ditlevsen

To recap: a .useSnakeCase key encoding strategy on JSONEncoder and a similarly named key decoding strategy on JSONDecoder. These strategies transform keys in the direction from the CodingKey to its String representation, both when encoding and when decoding, and thus there is no loss due to having two separate transforms.

In order to support the allKeys API, the solution would, however, fall back to the existing 'from snake case' transformation, thus doing exactly what the .convertFromSnakeCase key decoding strategy does for the allKeys API today.

This solution would work for all synthesized CodingKey conformances as well as all conformances that do not use allKeys. For Decodable conformances that use the allKeys API, the situation is the status quo compared to the current .convertFromSnakeCase key decoding strategy.
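As a rough sketch of the idea recapped above: the lookup transforms the CodingKey, never the payload key, and only allKeys falls back to the lossy reverse transform. `SketchContainer`, `snakeCase(_:)` and `camelCase(fromSnakeCase:)` are hypothetical names for illustration, and both conversions are simplified stand-ins rather than Foundation's actual algorithms:

```swift
// Simplified 'to snake case': "imageURL" -> "image_url".
func snakeCase(_ key: String) -> String {
    let chars = Array(key)
    var result = ""
    for (i, c) in chars.enumerated() {
        if c.isUppercase, i > 0 {
            let previousIsLower = chars[i - 1].isLowercase
            let nextIsLower = i + 1 < chars.count && chars[i + 1].isLowercase
            // Break before a new word, or before the last capital of an acronym run.
            if previousIsLower || (chars[i - 1].isUppercase && nextIsLower) {
                result.append("_")
            }
        }
        result.append(contentsOf: c.lowercased())
    }
    return result
}

// Simplified 'from snake case': "image_url" -> "imageUrl" (lossy for acronyms).
func camelCase(fromSnakeCase key: String) -> String {
    let words = key.split(separator: "_").map(String.init)
    guard let first = words.first else { return key }
    let rest = words.dropFirst().map { (word: String) -> String in
        word.prefix(1).uppercased() + String(word.dropFirst())
    }
    return first + rest.joined()
}

// Hypothetical stand-in for a keyed decoding container over a JSON object.
struct SketchContainer<K: CodingKey> {
    let payload: [String: String]

    // Proposed direction: CodingKey -> String, the same transform the encoder uses.
    func value(forKey key: K) -> String? {
        payload[snakeCase(key.stringValue)]
    }

    // Fallback for allKeys: the existing lossy String -> CodingKey direction.
    var allKeys: [K] {
        payload.keys.sorted().compactMap { K(stringValue: camelCase(fromSnakeCase: $0)) }
    }
}

enum CodingKeys: String, CodingKey {
    case imageURL
}

let container = SketchContainer<CodingKeys>(payload: ["image_url": "a.png"])
print(container.value(forKey: .imageURL) ?? "nil") // prints "a.png"
print(container.allKeys.isEmpty)                   // prints "true"
```

In this sketch the keyed lookup for imageURL succeeds without any abbreviation list, while allKeys still misses the enum case, which mirrors the status quo described above.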

It could be a future direction to try and fix the dynamic decoding situation using a 'leave my coding keys alone' marker protocol.

What do you think? Do you agree that the suggested list of abbreviations will not fix dynamic parsing, or might I be overlooking something?

I'd love to hear any thoughts that any readers of this topic may have.
Also from any members of the Foundation team. :-)
CC: @tomerd @Tony_Parker