Let’s see… I need to page some of this back into memory.
Re `_DictionaryCodingKey`: it has existed in its current form since it was first introduced.
Why does `_DictionaryCodingKey.init(stringValue:)` attempt to parse the string value into an `Int`?
This is primarily for the benefit of decoding. Some formats (like JSON) only support `String` keys for dictionaries, which means that on encode, we have to convert `Int` keys to `String`s. On decode, however, we have to try to go the other way:
- We create a `KeyedDecodingContainer` keyed by a key type which attempts to parse `Int` keys
- We access `allKeys` on that container, and if any key couldn't be parsed as an `Int`, we throw an error
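To make this concrete, here is a minimal sketch of a key type mirroring the behavior described above. The name `DictionaryKey` is illustrative only; the stdlib's internal `_DictionaryCodingKey` may differ in its exact implementation:

```swift
// A sketch of a dictionary coding key that eagerly parses Int keys.
// Illustrative only; not the stdlib's actual `_DictionaryCodingKey` source.
struct DictionaryKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init?(stringValue: String) {
        self.stringValue = stringValue
        // Eagerly attempt to parse the string as an Int, so that decoding
        // into an Int-keyed dictionary can simply check `intValue != nil`.
        self.intValue = Int(stringValue)
    }

    init?(intValue: Int) {
        // Conversion to String happens up front, since most formats
        // (e.g. JSON) only support String keys.
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}
```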
This does mean that there is a performance hit on encode, yes — and it's absolutely worth considering splitting encoding behavior out from decoding behavior to improve performance. Note that it is possible (though I think very unlikely) that something could rely on this on encode:
- Ostensibly, you can imagine an `Encoder`/`Decoder` which would prefer to use `Int` keys where possible for efficiency (or as a requirement of the format — the opposite of something like JSON, which requires `String` keys), and given a dictionary whose keys could all be `Int`s, would prefer to convert them… But at this point, I think this is a pretty big stretch. I haven't seen any `Int`-preferring encoders/formats in the wild that would benefit from this
- Potentially more usefully, you can imagine a migration scenario in which existing code which encoded `Int`-keyed dictionaries has now been expanded to encode `String`-keyed dictionaries instead. This allows the old `Int`-keyed values to still be accessible as if they were originally `String`-keyed, and vice versa: introducing new data into an old version of an app. But this is a bit of a strange use case
Should `_DictionaryCodingKey.init(intValue:)` store the `Int` key as a `String`?
As alluded to above, I would say that many (if not most) formats do not support `Int` values as keys (e.g. JSON). Conversion to a `String` has to happen somewhere, and at the moment, the resulting value is easier to store. Could we benefit from recalculating the value every time (as opposed to converting once and storing the result)? Potentially, but I think it might be difficult to measure.
The conversion has to happen at least once for every key on encode and decode, so doing it lazily doesn't inherently offer a benefit beyond storage. To see a real benefit to storing the resulting value as we do now, you'd need to access the key's `stringValue` at least twice, and it's hard to tell how often that might happen. It at least depends on how the `Encoder`/`Decoder` touches the keys, and how often user code accesses them either via `allKeys` or `codingPath`.
So does that mean we should convert to an `enum`? Also hard to say — in order to hit the case where you want to fit a `_DictionaryCodingKey` into an existential box, you'd need to want to use such a key. But an `EitherCodingKey` type (i.e. either a `String` or an `Int`) is less generally helpful as public API than, say, an `AnyCodingKey` which takes `(String, Int?)` directly — in which case, you don't benefit from the conversion to an `enum` anyway. We could expose both types of keys and let users choose, but it feels like muddying the waters a bit…
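For illustration, here is a sketch of what such an `EitherCodingKey` might look like. This is a hypothetical type, not existing API; note that it recomputes `stringValue` on every access rather than storing it, which is the storage-vs.-recomputation trade-off discussed above:

```swift
// Hypothetical enum-based coding key: stores either a String or an Int,
// and recomputes the opposite representation on each access.
enum EitherCodingKey: CodingKey {
    case string(String)
    case int(Int)

    var stringValue: String {
        switch self {
        case .string(let s): return s
        case .int(let i): return String(i)  // recomputed every time
        }
    }

    var intValue: Int? {
        switch self {
        case .string(let s): return Int(s)  // recomputed every time
        case .int(let i): return i
        }
    }

    init?(stringValue: String) { self = .string(stringValue) }
    init?(intValue: Int) { self = .int(intValue) }
}
```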
In all, my gut feeling is that the performance impact is likely negligible in the vast majority of use cases, and I think we might be hard-pressed to measure real benefits one way or another.
I think regarding both questions: it would be interesting to see some real-world measurements of performance to see how big of an impact either one makes (potentially-wasteful `String` → `Int` conversion on encode; unnecessary memory storage on encode/decode). With dynamic code like this, it can be really difficult to try to capture a wide swath of real-world scenarios, but it might be helpful if we can find something.