Using RawRepresentable String and Int keys for Codable Dictionaries

Currently you can only use Strings and Ints as keys for a Dictionary if you wish to be able to encode and decode the dictionary from / to a keyed container.

With any other Codable key type, the dictionary is encoded into and decoded from an unkeyed container.

This eliminates a few use cases for serializing Dictionaries into JSON dictionaries when the keys are RawRepresentable as either String or Int.

For instance you could wish to use a String backed enum as keys in your Dictionary:

enum MyKey: String, Codable {
    case one
    case two
}

let dict: [MyKey: String] = [.one: "first", .two: "second"]

Encoding the above to JSON using the JSONEncoder makes this into an array of the four values ["one", "first", "two", "second"] instead of a dictionary/object/map.
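The difference can be reproduced with a minimal sketch like the one below. The helper `jsonString` is a name made up for this illustration, and the dictionaries hold a single entry so the output does not depend on Dictionary ordering.

```swift
import Foundation

enum MyKey: String, Codable {
    case one
    case two
}

// Hypothetical helper for this example: encode a value and return the JSON text.
func jsonString<T: Encodable>(_ value: T) throws -> String {
    let data = try JSONEncoder().encode(value)
    return String(decoding: data, as: UTF8.self)
}

let enumKeyed: [MyKey: String] = [.one: "first"]
let stringKeyed: [String: String] = ["one": "first"]

// The enum-keyed dictionary goes through an unkeyed container (a JSON array),
// while the String-keyed one goes through a keyed container (a JSON object).
let enumKeyedJSON = try jsonString(enumKeyed)
let stringKeyedJSON = try jsonString(stringKeyed)

print(enumKeyedJSON)
print(stringKeyedJSON)
```

Both dictionaries carry the same information, but only the String-keyed one comes out as a JSON object.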

Another use case is using something like Tagged (pointfreeco/swift-tagged, "A wrapper type for safer, expressive code") in order to add type safety to String-based identifiers (and more) in your code base.

Using Tagged you can get a wrapper around a String that is Codable and RawRepresentable and has the advantage that it is type safe. For instance you can model the following:

struct User {
    typealias Identifier = Tagged<User, String>
    var id: Identifier
    var name: String
}

let userMap: [User.Identifier: User]

But now userMap will be serialized as an array instead of a dictionary.

I have looked at stdlib/public/core/Codable.swift.gyb to try and see if I could extend the Dictionary conformance to Encodable and Decodable to be able to use RawRepresentable keys where the RawValue is either String or Int.

Unfortunately I have failed to implement this, since RawRepresentable can only be used as a generic constraint and you cannot add multiple conformances to Encodable (there cannot be more than one conformance, even with different conditional bounds).

Does anyone have any idea how to implement a change to the stdlib in order to enable the use of RawRepresentable keys?

And does anyone have any thoughts about whether this change would be OK? If someone already expects array serialization, then the dictionary behavior will be unexpected.

@itaiferber

For reference, this is SR-7788 (and was brought up way back when in "JSON Encoding / Decoding weird encoding of dictionary with enum values").

It's possible, though I'm not sure off the top of my head with which ABI effects, to extend Encodable/Decodable with marker requirements fulfilled by RawRepresentable types:

extension Encodable {
    func __stringRawValue() -> String? { nil }
}

extension Encodable where Self: RawRepresentable, Self.RawValue == String {
    func __stringRawValue() -> String? { return self.rawValue }
}

enum Foo: Int, Encodable {
    case a
}

enum Bar: String, Encodable {
    case b
}

print(Foo.a.__stringRawValue() ?? "nil") // nil
print(Bar.b.__stringRawValue() ?? "nil") // b

It might be possible to take advantage of protocol extensions in this way (or other runtime trickery to make it happen), but...

... the big problem with making this change right now is that it's not backwards-compatible. Even if we enhance Dictionary.init(from:) to be able to decode both forms (legacy unkeyed format, newer preferred keyed format), serialized payloads from newer versions of Swift wouldn't be decodable on older versions. For archived data, this is a serious concern.
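To make the "decode both forms" idea concrete, here is a rough sketch of the decoding side only. It uses a stand-alone wrapper rather than Dictionary.init(from:) itself; `StringKey`, `EitherFormat` and `Slot` are hypothetical names invented for this illustration. It does nothing for the real problem described above: an older runtime would still be unable to read the keyed form.

```swift
import Foundation

// Hypothetical helper: a CodingKey that accepts any string.
struct StringKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

// Hypothetical wrapper (not a stdlib type) that accepts both payload shapes.
struct EitherFormat<Key, Value>: Decodable
where Key: Hashable & RawRepresentable & Decodable, Key.RawValue == String, Value: Decodable {
    var storage: [Key: Value] = [:]

    init(from decoder: Decoder) throws {
        if let keyed = try? decoder.container(keyedBy: StringKey.self) {
            // Newer, keyed shape: {"one": "first"}
            for codingKey in keyed.allKeys {
                guard let key = Key(rawValue: codingKey.stringValue) else {
                    throw DecodingError.dataCorruptedError(
                        forKey: codingKey, in: keyed,
                        debugDescription: "Unrecognized key \(codingKey.stringValue)")
                }
                storage[key] = try keyed.decode(Value.self, forKey: codingKey)
            }
        } else {
            // Legacy, unkeyed shape: ["one", "first"] (alternating keys and values)
            var unkeyed = try decoder.unkeyedContainer()
            while !unkeyed.isAtEnd {
                let key = try unkeyed.decode(Key.self)
                storage[key] = try unkeyed.decode(Value.self)
            }
        }
    }
}

enum Slot: String, Decodable { case one, two }

let decoder = JSONDecoder()
let fromKeyed = try decoder.decode(
    EitherFormat<Slot, String>.self, from: Data(#"{"one":"first"}"#.utf8))
let fromUnkeyed = try decoder.decode(
    EitherFormat<Slot, String>.self, from: Data(#"["one","first"]"#.utf8))
```

Both payloads decode to the same dictionary, so reading old archives is the easy half; writing a format that old readers understand is the part with no good answer.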

There's unfortunately no good "switch" or option to enable somewhere to make it work. Specific encoders and decoders could override behavior for dictionaries to support this (so you could tell, for instance, JSONEncoder to prefer converting RawRepresentable keys to CodingKeys), but this would be on an encoder-by-encoder basis; there's no hook we could add to Dictionary specifically to easily turn this on or off.


There's also the practical matter to consider that could affect clients in a surprising way: if you take a type and make it RawRepresentable by an Int or String in a later version of an app/framework, you can suddenly break your data format for yourself or others, with little recourse. Without having a good way to control the behavior, I think it would be too risky and breaking to enable this now.

Thank you for the explanation and for the clever tricks!

I completely get how this can’t be fixed due to platforms already having the current behavior, but it does make me very sad that something like this is basically forever unfixable...

My current workaround is to decode String-keyed dictionaries, map the keys, and create new Dictionaries from the mapped key/value pairs. During encoding I map back again. If I forget the mapping anywhere, serialization still succeeds, but produces (for me) an unexpected result.
It feels fragile, and I am only getting this issue because I am trying to make my models more type safe... :-/

Can anyone suggest a more performant workaround than mapping and creating new Dictionaries?
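One possible direction, sketched under assumptions: wrap the dictionary in a small Codable type that writes straight into a keyed container, so no intermediate [String: Value] is built. `DynamicKey`, `RawKeyed` and `Priority` are hypothetical names for this example, and whether this is measurably faster has not been benchmarked here.

```swift
import Foundation

// Hypothetical helper: a CodingKey built from an arbitrary string.
struct DynamicKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

// Hypothetical wrapper; `RawKeyed` is not a stdlib or Foundation type.
struct RawKeyed<Key, Value>: Codable
where Key: Hashable & RawRepresentable, Key.RawValue == String, Value: Codable {
    var storage: [Key: Value]

    init(_ storage: [Key: Value]) { self.storage = storage }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: DynamicKey.self)
        for (key, value) in storage {
            // DynamicKey(stringValue:) never returns nil, so the unwrap is safe.
            try container.encode(value, forKey: DynamicKey(stringValue: key.rawValue)!)
        }
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: DynamicKey.self)
        var result: [Key: Value] = [:]
        result.reserveCapacity(container.allKeys.count)
        for codingKey in container.allKeys {
            guard let key = Key(rawValue: codingKey.stringValue) else {
                throw DecodingError.dataCorruptedError(
                    forKey: codingKey, in: container,
                    debugDescription: "Unrecognized key \(codingKey.stringValue)")
            }
            result[key] = try container.decode(Value.self, forKey: codingKey)
        }
        storage = result
    }
}

enum Priority: String { case low, high }

let wrapped = RawKeyed<Priority, Int>([.high: 1])
let data = try JSONEncoder().encode(wrapped)
let json = String(decoding: data, as: UTF8.self)
let roundTripped = try JSONDecoder().decode(RawKeyed<Priority, Int>.self, from: data)
print(json)
```

The trade-off is that models have to store or expose the wrapper instead of a plain dictionary, so the fragility of "forgetting the mapping" moves to "forgetting the wrapper".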

I think I see how a fix could be added to JSONEncoder and JSONDecoder directly. Do you think that this behavior could be added and enabled through a ‘strategy’ (rawRepresentableStringKeyedDictionaryDecodeStrategy?)

Do you think that a PR like that would be considered if I gave it a shot?

I think a nice addition would be the ability to use any type that can be represented as String or Int in a dictionary key. For example:

let response: [Date: String]

// string or int key depends on date formatting strategy
{ 
  1563298954: "foo",
  1563385354: "bar" 
}
// or
{ 
  "07-16-2019": "foo",
  "07-17-2019": "bar"
}

let response: [UUID: String]

{ 
  "ae31ad22-54c6-4674-80f4-d0beb17ceb34": "foo",
  "bfeeb141-c94c-426e-bb01-5f0100cc4141": "bar" 
}

I have a working implementation of a JSONEncoder/Decoder that does this. But it only does it for RawRepresentable types with Strings as the RawValue.

This means that it unfortunately does not work for your use case @jjanke, since neither Date nor UUID conforms to RawRepresentable... :-/

The implementation uses some hacks, like temporarily encoding the keys and pushing and popping dummy keys on the path stack, to make it work.

This could be made available through an option on JSONDecoder and JSONEncoder, which is only supported from a given Swift version.

Another option would be to add a StringKeyCodable protocol, and then you could conform types to this protocol if you wished them to result in dictionaries instead of arrays. This protocol would then also only be available from a certain Swift version.

@itaiferber do you have any opinion about whether it’s feasible to get a PR as described above accepted?

No matter what we decide, we'll have to put any new API through API review — a PR would be welcome, but changing Foundation types would require internal API review too, which we would have to account for as part of the process. There are other changes I've long wanted to make to JSONEncoder/JSONDecoder (at the very least, adding .iso8601WithOptions(...) to Date{En,De}codingStrategy is long overdue), so this would quite possibly fit in well there.

If we decide that this is a larger change we'd like to make that would go into the stdlib (e.g. StringKeyCodable/StringKeyConvertible/whatever), that would of course need to go through swift-evolution.

Thank you for your reply.
I have made a PR of a very, very WIP implementation of a workaround specifically for JSONEncoder and JSONDecoder.
I would very much like to discuss the details of this. Would it be better to do that here or on the PR itself?

https://github.com/apple/swift/pull/26257

I like the idea of getting other changes in at this time as well.

It could be checked if SR-8276 is still an issue by removing the special case for 32-bit platforms for the definitions of the other marker protocols.

Would it also be worth reconsidering @norio_nomura's fix for SR-6629 ("JSONEncoder with convertToSnakeCase encodes myURLProperty into my_url_property, but JSONDecoder with convertFromSnakeCase does not decode myURLProperty from my_url_property", Issue #3752 in apple/swift-corelibs-foundation)? The fix is "Add useSnakeCasedKeys to JSONDecoder.KeyDecodingStrategy" by norio-nomura (Pull Request #14039 in apple/swift).

I just hit that problem as well...

I see the problem of archived data. However, IMHO the current behaviour is a bug - and data archived with it would be relying on that bug.
IMHO, dictionary is clearly a keyed structure, so I don't know how you can expect it to be encoded/decoded in an unkeyed container. Similarly to what @jjanke mentioned, I'd expect Dictionary to always encode itself into a keyed container.
I see that there are limitations imposed by the resulting format, in this case JSON only being able to have string keys. But that is again something that the encoders/decoders should have to deal with. Since CodingKey already makes sure that keys of a keyed container can be represented as String and possibly Int, the encoders could make use of that.
The only remaining question is how to get from an arbitrary RawRepresentable type (whose RawValue could be any type) to something that is convertible to String and/or Int.
I see two options here. One would be to implement some kind of marker, similar to what @itaiferber already described. In this case I'd just use a "dummy" coding key internally, instead of limiting it to String. This would allow the encoder/decoder to decide what to use as key (String or Int).

Another option would be to introduce a new protocol CodingKeyRepresentable, which would allow developers to decide themselves, how a RawRepresentable type (or any type for that matter) translates into a CodingKey:

public protocol CodingKeyRepresentable {
    associatedtype CodingKey: Swift.CodingKey

    var codingKey: CodingKey { get }
}

extension RawRepresentable where Self: CodingKeyRepresentable, Self.CodingKey == Self {
    public var codingKey: CodingKey { return self }
}

extension RawRepresentable where Self: CodingKeyRepresentable, Self.CodingKey == RawValue {
    public var codingKey: CodingKey { return rawValue }
}

With Swift 5.1's new features, we can maybe even provide some default implementation for any RawRepresentable type whose RawValue is either String or Int. (Haven't tried if that would work, though).

private enum AnyCodingKey: CodingKey {
    case string(String)
    case int(Int)

    var stringValue: String {
        switch self {
        case .string(let str): return str
        case .int(let int): return String(int)
        }
    }

    var intValue: Int? {
        switch self {
        case .string(let str): return Int(str)
        case .int(let int): return int
        }
    }

    init?(stringValue: String) { self = .string(stringValue) }
    init?(intValue: Int) { self = .int(intValue) }
}

extension RawRepresentable where Self: CodingKeyRepresentable, Self.RawValue == String {
    public var codingKey: some Swift.CodingKey { return AnyCodingKey.string(rawValue) }
}

extension RawRepresentable where Self: CodingKeyRepresentable, Self.RawValue == Int {
    public var codingKey: some Swift.CodingKey { return AnyCodingKey.int(rawValue) }
}

Otherwise we could simply not constrain the CodingKeyRepresentable protocol with an associatedtype and have its codingKey property simply return an existential. In this case it would be easier to provide default implementations.
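A rough sketch of that existential variant follows. It is an illustration of the idea, not a proposed API: the protocol is given the made-up name `ExistentialCodingKeyRepresentable`, and `AnyKey`, `KeyedBox` and `Level` are likewise hypothetical. Only the encoding side is shown.

```swift
import Foundation

// Hypothetical protocol: codingKey is returned as an existential CodingKey.
protocol ExistentialCodingKeyRepresentable {
    var codingKey: CodingKey { get }
}

// Hypothetical concrete key type used to carry the string/int values.
struct AnyKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init(stringValue: String, intValue: Int?) {
        self.stringValue = stringValue
        self.intValue = intValue
    }
    init?(stringValue: String) { self.init(stringValue: stringValue, intValue: nil) }
    init?(intValue: Int) { self.init(stringValue: String(intValue), intValue: intValue) }
}

// Default implementations for String- and Int-backed types.
extension RawRepresentable where Self: ExistentialCodingKeyRepresentable, RawValue == String {
    var codingKey: CodingKey { AnyKey(stringValue: rawValue, intValue: nil) }
}

extension RawRepresentable where Self: ExistentialCodingKeyRepresentable, RawValue == Int {
    var codingKey: CodingKey { AnyKey(stringValue: String(rawValue), intValue: rawValue) }
}

// Hypothetical wrapper showing how an encoder-agnostic keyed encoding could look.
struct KeyedBox<Key: Hashable & ExistentialCodingKeyRepresentable, Value: Encodable>: Encodable {
    let storage: [Key: Value]

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: AnyKey.self)
        for (key, value) in storage {
            let base = key.codingKey
            let anyKey = AnyKey(stringValue: base.stringValue, intValue: base.intValue)
            try container.encode(value, forKey: anyKey)
        }
    }
}

// Opting in needs no extra code for an Int-backed enum.
enum Level: Int, ExistentialCodingKeyRepresentable { case low = 1, high = 2 }

let boxData = try JSONEncoder().encode(KeyedBox(storage: [Level.high: "max"]))
let boxJSON = String(decoding: boxData, as: UTF8.self)
print(boxJSON)
```

Because the key carries both a string and an optional int, each encoder remains free to pick whichever representation its format supports; JSONEncoder uses the string form.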

Personally, I'd prefer the second approach, which would have to go through swift-evolution, though.
Looking at @Morten_Bek_Ditlevsen's PR, I'm not sure we should go down that road. It needs to introduce new public API in Foundation, which means it has to go through the API review @itaiferber mentioned, but still only fixes part of the problem for only one encoder/decoder. I think if we have to go through a more in-depth review anyway, we should at least have a solution that fixes the problem for good. 🙂

Excellent analysis! Thanks!
I would also prefer that the functionality was general across encoders/decoders too - instead of just fixing a single use case.

Hi all,
I’m bumping this topic on request from @Tony_Parker on the discussion on the PR mentioned earlier: [WIP] Using RawRepresentable String keys for Dictionaries with JSONEncoder and JSONDecoder by mortenbekditlevsen · Pull Request #26257 · apple/swift · GitHub

To summarize the discussion so far:
Currently Dictionaries that are not either String or Int keyed will encode and decode as flat arrays of alternating keys and values rather than as dictionaries.
This happens even though the keys may be RawRepresentable as Strings or Ints.

It could be desirable to have those RawRepresentable keys make Dictionaries encode and decode as dictionaries - and it could even be desirable to get this functionality if the keys could just be encoded/decoded to/from single value containers containing a String or an Int.

The options as I remember them are:

  1. Do nothing
  2. Consider current behavior to be a bug and change the behavior. This is problematic since the behavior is tied to the platform (iOS 13 gets current behavior while a future iOS release could get the new). I don’t think that this option is realistic even though I would personally like to consider the current behavior to be a bug.
  3. Create a shared ‘marker’ protocol for opting in to this behavior for each key type, but for all encoders and decoders.
  4. Fix this with new encoding/decoding options per encoder - as the mentioned PR does for JSONEncoder by adding a new ‘dictionary encoding strategy’.
  5. Perhaps this could be fixed using ‘new type’ if that will become part of the language. This would not fix the general issue, but at least it could perhaps be made to work for types that are ‘new types’ of String and Int. Just speculating here, since I am not very familiar with how newtype might actually work.

For me the most elegant option would be number 2, but as mentioned I am guessing that this can’t be chosen.

Can anyone help with suggestions for this issue?

This just bit me again by silently garbling my data, and I would like to suggest that option 2 above is definitely correct.

The fact that depending on key type, a dictionary can silently encode as an array massively fails the principle of least astonishment. If an encoder can not handle a specific type of key, it should be throwing an error, not changing fundamental semantics of the data format to try and kludge it in anyway.

Dictionary conforms to Codable, but since JSON objects only support string keys, a Swift Dictionary is encoded as a JSON object only when its keys are String or Int; with any other key type, JSONEncoder produces an unkeyed (array) container. Tagged types that wrap a String are a common use case, and using them as keys has the surprising effect of producing a JSON array instead of a JSON object.

To work around this, the extensions below facilitate the conversion of a dictionary with keys that conform to RawRepresentable where the RawValue is String to a pure Dictionary<String, Value> and back upon decoding.

extension Dictionary where Key: RawRepresentable, Key.RawValue == String, Value: Encodable {
    var encodable: [String: Value] {
        var result: [String: Value] = [:]
        for key in keys {
            result[key.rawValue] = self[key]
        }
        return result
    }
}

extension Dictionary where Key == String, Value: Decodable {
    func decodable<KeyType>(_ type: KeyType.Type) -> [KeyType: Value]? where KeyType: RawRepresentable, KeyType.RawValue == String {
        var result: [KeyType: Value] = [:]
        for key in keys {
            guard let wrappedKey = KeyType(rawValue: key) else {
                return nil
            }
            result[wrappedKey] = self[key]
        }
        return result
    }
}

Example of use:

// Hashable is required so that ShortID can be used as a Dictionary key.
struct ShortID: RawRepresentable, Codable, Hashable {
    let rawValue: String
    
    init?(rawValue: String) {
        // validate if necessary
        self.rawValue = rawValue
    }
}

struct VectorClock: Codable {
    var versions: [ShortID: Int]

    // ...

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(versions.encodable)
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        guard let versions = try container.decode([String: Int].self).decodable(ShortID.self) else {
            throw DecodingError.typeMismatch(ShortID.self, DecodingError.Context(codingPath: decoder.codingPath, debugDescription: "Dictionary keys aren't all valid ShortID."))
        }
        self.versions = versions
    }
}

Encoding and decoding Swift dictionaries with custom key types