Hi all,
I would like to pitch a solution to an issue with the current state of Dictionary encoding and decoding.
Many thanks to @itaiferber for providing input and feedback, for revising the pitch and for helping me shape the overall direction.
Pitch: Allow coding of non-String/Int keyed Dictionary into a KeyedContainer
Introduction
The current conformance of Swift's Dictionary to the Codable protocols has a somewhat-surprising limitation in that dictionaries whose key type is not String or Int (values directly representable in CodingKey types) encode not as KeyedContainers but as UnkeyedContainers. This behavior has caused much confusion for users and I would like to offer a way to improve the situation.
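The limitation described above can be reproduced in a few lines. This is a minimal sketch using JSONEncoder; the Color enum and the jsonString helper are hypothetical names introduced only for illustration.

```swift
import Foundation

// Hypothetical key type: a String-backed enum. It is Codable and Hashable,
// but it is neither String nor Int.
enum Color: String, Codable {
    case red
}

// Hypothetical helper: encodes any Encodable value and returns the JSON text.
func jsonString<T: Encodable>(_ value: T) throws -> String {
    let data = try JSONEncoder().encode(value)
    return String(decoding: data, as: UTF8.self)
}

// A String-keyed dictionary encodes into a keyed container (a JSON object).
let stringKeyed = try jsonString(["red": 1])    // {"red":1}

// An enum-keyed dictionary encodes into an unkeyed container:
// a flat array of alternating keys and values.
let enumKeyed = try jsonString([Color.red: 1])  // ["red",1]
```

Single-entry dictionaries are used so the output does not depend on dictionary iteration order.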
Motivation
The primary motivation for this pitch lies in the much-discussed confusion of this default behavior:
- Dictionarys encoding strategy
- JSON Encoding / Decoding weird encoding of dictionary with enum values
- Bug or PEBKAC
- Using RawRepresentable String and Int keys for Codable Dictionaries
The common situations where people have found the behavior confusing include:
- Using enums as keys (especially when RawRepresentable, and backed by String or Int types)
- Using String wrappers (like the generic Tagged library or custom wrappers) as keys
- Using Int8 or other Int* flavours as keys
In the various discussions, there are clear and concise explanations for this behavior, but it is also mentioned that the lack of support for encoding RawRepresentable String and Int keys into keyed containers may indeed be considered a bug and an oversight in the implementation (JSON Encoding / Decoding weird encoding of dictionary with enum values, reply by Itai Ferber).
There's a bug at bugs.swift.org tracking the issue: SR-7788
Unfortunately, it is too late to change the behavior now:
- It is a breaking change with respect to existing behavior, with backwards-compatibility ramifications (new code couldn't decode old data and vice versa), and
- The behavior is tied to the Swift stdlib, so it would differ between consumers of the code depending on which OS versions they are on
Instead, I propose the addition of a new protocol to the standard library. Opting in to this protocol for the key type of a Dictionary will allow the Dictionary to encode/decode to/from a KeyedContainer.
Proposed Solution
I propose adding a new protocol to the standard library: CodingKeyRepresentable
Types conforming to CodingKeyRepresentable indicate that they can be represented by a CodingKey instance (which they can offer), allowing them to opt in to having dictionaries use their CodingKey representations in order to encode into KeyedContainers.
The opt-in can only happen for a version of Swift where the protocol is available, so the user will be in full control of the situation. For instance I am currently using my own workaround, but once I only support iOS versions running a specific future Swift version with this feature, I could skip my own workaround and rely on this behavior instead.
I have a draft PR for the proposed solution: #34458
Examples
// Same as stdlib's _DictionaryCodingKey
struct _AnyCodingKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init?(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = Int(stringValue)
    }

    init?(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }
}
struct ID: Hashable, CodingKeyRepresentable {
    static let knownID1 = ID(stringValue: "<some-identifier-1>")
    static let knownID2 = ID(stringValue: "<some-identifier-2>")

    let stringValue: String

    var codingKey: CodingKey {
        // The initializer is failable by protocol requirement but never returns nil here.
        return _AnyCodingKey(stringValue: stringValue)!
    }

    init?(codingKey: CodingKey) {
        stringValue = codingKey.stringValue
    }

    init(stringValue: String) {
        self.stringValue = stringValue
    }
}
let data: [ID: String] = [
    .knownID1: "...",
    .knownID2: "...",
]

let encoder = JSONEncoder()
try String(data: encoder.encode(data), encoding: .utf8)

/*
{
    "<some-identifier-1>": "...",
    "<some-identifier-2>": "..."
}
*/
Detailed Design
The proposed solution adds a new protocol, CodingKeyRepresentable:
/// Indicates that the conforming type can provide a `CodingKey` to be used when
/// encoding into a keyed container.
public protocol CodingKeyRepresentable {
    var codingKey: CodingKey { get }
    init?(codingKey: CodingKey)
}
In the conditional Encodable conformance on Dictionary, the following extra case can handle such conforming types:
} else if let _ = Key.self as? CodingKeyRepresentable.Type {
    // Since the keys are CodingKeyRepresentable, we can use the `codingKey`
    // to create `_DictionaryCodingKey` instances.
    var container = encoder.container(keyedBy: _DictionaryCodingKey.self)
    for (key, value) in self.dict {
        let codingKey = (key as! CodingKeyRepresentable).codingKey
        let dictionaryCodingKey = _DictionaryCodingKey(codingKey: codingKey)
        try container.encode(value, forKey: dictionaryCodingKey)
    }
} else {
    // Keys are Encodable but not Strings or Ints, so we cannot arbitrarily
In the conditional Decodable conformance on Dictionary, we can similarly handle conforming types:
} else if let codingKeyRepresentableType = Key.self as? CodingKeyRepresentable.Type {
    // The keys are CodingKeyRepresentable, so we should be able to expect a keyed container.
    let container = try decoder.container(keyedBy: _DictionaryCodingKey.self)
    for key in container.allKeys {
        let value = try container.decode(Value.self, forKey: key)
        // `init?(codingKey:)` is failable, so reject keys that cannot be converted
        // rather than force-casting a possible nil.
        guard let k = codingKeyRepresentableType.init(codingKey: key) as? Key else {
            throw DecodingError.dataCorruptedError(
                forKey: key, in: container,
                debugDescription: "Could not convert key to type \(Key.self)")
        }
        self.dict[k] = value
    }
} else {
    // We should have encoded as an array of alternating key-value pairs.
Impact on Existing Code
No direct impact, since adoption of this protocol is additive.
However, special care must be taken in adopting the protocol, since adoption on any type T which has previously been encoded as a dictionary key can introduce backwards incompatibility with archives. It is always safe to adopt CodingKeyRepresentable on new types, or types newly-conforming to Codable.
Other Considerations
Conforming stdlib types to CodingKeyRepresentable
Along the above lines, we do not propose conforming any existing stdlib or Foundation type to CodingKeyRepresentable due to backwards-compatibility concerns. Should end-user code require this conversion on existing types, we recommend writing wrapper types which conform on those types' behalf (for example, a MyUUIDWrapper which contains a UUID and conforms to CodingKeyRepresentable to allow using UUIDs as dictionary keys directly).
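A wrapper along those lines might look like the sketch below. It assumes the protocol exactly as pitched in this post, re-declared locally so the snippet is self-contained. AnyKey is a hypothetical helper modeled on the _AnyCodingKey example, and the shape of MyUUIDWrapper beyond its name is an assumption.

```swift
import Foundation

// The protocol as pitched, declared locally for the sake of this sketch.
protocol CodingKeyRepresentable {
    var codingKey: CodingKey { get }
    init?(codingKey: CodingKey)
}

// Hypothetical general-purpose key, modeled on `_AnyCodingKey`.
struct AnyKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = Int(stringValue)
    }

    init(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }
}

// Wraps a UUID and converts to and from a CodingKey via the UUID string.
struct MyUUIDWrapper: Hashable, CodingKeyRepresentable {
    let uuid: UUID

    init(_ uuid: UUID) {
        self.uuid = uuid
    }

    var codingKey: CodingKey {
        AnyKey(stringValue: uuid.uuidString)
    }

    init?(codingKey: CodingKey) {
        // Fails if the key's string is not a well-formed UUID.
        guard let uuid = UUID(uuidString: codingKey.stringValue) else { return nil }
        self.uuid = uuid
    }
}

// Converting to a CodingKey and back yields the same wrapper.
let original = MyUUIDWrapper(UUID())
let roundTripped = MyUUIDWrapper(codingKey: original.codingKey)
let rejected = MyUUIDWrapper(codingKey: AnyKey(stringValue: "not-a-uuid"))
```

The failable initializer is what lets decoding reject keys that do not parse as a UUID.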
Adding an AnyCodingKey type to the standard library
Since types that conform to CodingKeyRepresentable will need to supply a CodingKey, most likely generated dynamically from type contents, this may be a good time to introduce a general key type which can take on any String or Int value it is initialized from.
Dictionary already uses exactly such a key type internally (_DictionaryCodingKey), as do JSONEncoder / JSONDecoder with _JSONKey (and PropertyListEncoder / PropertyListDecoder with _PlistKey), so generalization could be useful. The implementation of this type could match the implementation of _AnyCodingKey provided above.
Alternatives Considered
Why not just make the type conform to CodingKey directly?
For two reasons:
- In the rare case in which a type already conforms to CodingKey, this runs the risk of behavior-breaking changes
- CodingKey requires exposing stringValue and intValue properties, which are only relevant when encoding and decoding; forcing types to expose these properties arbitrarily seems unreasonable
Why not refine RawRepresentable, or use a RawRepresentable where RawValue == CodingKey constraint?
RawRepresentable conformance for types indicates a lossless conversion between the source type and its underlying RawValue type; this conversion is often the "canonical" conversion between a source type and its underlying representation, most commonly between enums backed by raw values, and option sets similarly backed by raw values.
In contrast, we expect conversion to and from CodingKey to be incidental, and representative only of the encoding and decoding process. We wouldn't suggest (or expect) a type's canonical underlying representation to be a CodingKey, which is what a protocol CodingKeyRepresentable: RawRepresentable where RawValue == CodingKey would require. Similarly, types which are already RawRepresentable with non-CodingKey raw values couldn't adopt conformance this way, and a big impetus for this feature is allowing Int- and String-backed enums to participate as dictionary coding keys.
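Keeping the protocol separate from RawRepresentable still lets String-backed enums adopt it cheaply. The sketch below assumes the protocol as pitched (re-declared locally) and adds a default implementation through a constrained extension. That extension, the AnyKey helper, and the Theme enum are illustrative assumptions, not part of the pitch.

```swift
// The protocol as pitched, declared locally for the sake of this sketch.
protocol CodingKeyRepresentable {
    var codingKey: CodingKey { get }
    init?(codingKey: CodingKey)
}

// Hypothetical general-purpose key, modeled on `_AnyCodingKey`.
struct AnyKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = Int(stringValue)
    }

    init(intValue: Int) {
        self.stringValue = "\(intValue)"
        self.intValue = intValue
    }
}

// Assumed convenience: types whose raw value is a String derive both
// requirements from `rawValue` and `init(rawValue:)`.
extension CodingKeyRepresentable where Self: RawRepresentable, Self.RawValue == String {
    var codingKey: CodingKey {
        AnyKey(stringValue: rawValue)
    }

    init?(codingKey: CodingKey) {
        self.init(rawValue: codingKey.stringValue)
    }
}

// Opting in takes one line; the raw value stays the canonical representation.
enum Theme: String, CodingKeyRepresentable {
    case light
    case dark
}

let darkKey = Theme.dark.codingKey.stringValue                  // "dark"
let light = Theme(codingKey: AnyKey(stringValue: "light"))      // .light
let unknown = Theme(codingKey: AnyKey(stringValue: "sepia"))    // nil
```

The raw value remains the type's canonical representation, while the CodingKey conversion is derived from it only for coding purposes.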
Add workarounds to each Encoder / Decoder
Following a suggestion from @itaiferber, I previously tried to solve this issue, not in general, but by providing a DictionaryKeyEncodingStrategy for JSONEncoder: #26257
The idea there was to be able to express an opt-in to the new behavior directly in the JSONEncoder and JSONDecoder types by vending a new encoding/decoding 'strategy' configuration. I have since changed my personal opinion about this and I believe that the problem should not just be fixed for specific Encoder / Decoder pairs, but rather for all.
The implementation of this was not very pretty, involving casts and iterations over the dictionaries to be encoded/decoded.
Await design of newtype
I have heard mentions of a newtype design that basically tries to solve the issue that the Tagged library solves: namely, creating type-safe wrappers around other primitive types.
I am in no way an expert in this, and I don't know how this would be implemented, but if it were possible to tell that SomeType is a newtype of String, then this could be used to provide a new implementation in the Dictionary Codable conformance, and since this feature does not exist in older versions of Swift (provided that this is a feature that requires changes to the Swift run-time), adding this to the Dictionary Codable conformance would not be behavior breaking.
But those are an awful lot of ifs and buts, and it only solves one of the issues that people appear to run into (the wrapping issue), and not, for instance, String-based enums or Int8-based keys.
Do nothing
It is of course possible to handle this situation manually during encoding.
A rather unintrusive way of handling the situation is by using a property wrapper as suggested here: CodableKey.
This solution needs to be applied for each Dictionary and is a quite elegant workaround. But it is still a workaround for something that could be fixed in the stdlib.
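To illustrate the kind of manual handling involved, here is a minimal property wrapper sketch. It is not the CodableKey implementation linked above. KeyedByRawValue, StringCodingKey, Flag and Settings are hypothetical names, and the wrapper only covers keys whose RawValue is String.

```swift
import Foundation

// Hypothetical String-only coding key used by the wrapper.
struct StringCodingKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }

    init(stringValue: String) {
        self.stringValue = stringValue
    }

    init?(intValue: Int) {
        return nil
    }
}

// Hypothetical wrapper: codes a dictionary with String-backed keys
// through a keyed container instead of the default unkeyed one.
@propertyWrapper
struct KeyedByRawValue<Key, Value>: Codable
where Key: Hashable & RawRepresentable, Key.RawValue == String, Value: Codable {
    var wrappedValue: [Key: Value]

    init(wrappedValue: [Key: Value]) {
        self.wrappedValue = wrappedValue
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: StringCodingKey.self)
        var result: [Key: Value] = [:]
        for codingKey in container.allKeys {
            // Reject keys that do not map back to a valid Key.
            guard let key = Key(rawValue: codingKey.stringValue) else {
                throw DecodingError.dataCorruptedError(
                    forKey: codingKey, in: container,
                    debugDescription: "Unknown key \(codingKey.stringValue)")
            }
            result[key] = try container.decode(Value.self, forKey: codingKey)
        }
        wrappedValue = result
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: StringCodingKey.self)
        for (key, value) in wrappedValue {
            try container.encode(value, forKey: StringCodingKey(stringValue: key.rawValue))
        }
    }
}

enum Flag: String {
    case beta
}

struct Settings: Codable {
    @KeyedByRawValue var flags: [Flag: Bool]
}

let encodedSettings = try JSONEncoder().encode(Settings(flags: [.beta: true]))
let settingsJSON = String(decoding: encodedSettings, as: UTF8.self)
// {"flags":{"beta":true}}
let decodedSettings = try JSONDecoder().decode(Settings.self, from: encodedSettings)
```

As noted above, the annotation has to be repeated on every dictionary property that needs it.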