Bug or PEBKAC?

itaiferber · February 14, 2020, 2:24am

As the original author of the code, I can give some background as to why the implementation behaves this way (though as I no longer work at Apple, I can't speak to any decisions about changing this in the future, if at all).

Answering questions sequentially:

Why doesn't Dictionary assert at runtime that a given key encodes as either a String or an Int in the end? For several reasons:
1. Knowing what a given key encodes as requires encoding it and then checking — there is no way to know up-front how an object prefers to encode because it can choose one of many options based on Encoder.codingPath, Encoder.userInfo, global state, etc. This means that the only way to know for sure is to encode the entire key object and then check, which means that you have to potentially set aside a container for it, encode a potentially large object graph (the object has control of program flow once you call its encode(to:) method so you have to wait until it's done), and then inspect.
  
  For some encoders, this might be prohibitive: if you want to write an encoder that streams output as soon as objects hit your containers' encode(...) methods (e.g. you want to keep potential memory footprint down as much as possible), then as soon as you hit a Dictionary, you can no longer stream its keys, because each key has to be fully encoded and inspected first.
2. Even if you decide you want to encode a key and check what type it encoded as after the fact, you don't have a guarantee that all keys in a dictionary will encode the same way. If I have a dictionary of 100 entries where the first 99 keys encode as strings but the last key encodes as an object, you have no way of knowing that would happen until after you encode the last key. This means that you have to potentially hold all encoded keys in memory until you know for sure what container you can use to hold them all, and if you've already made a decision about what container to use, you might have to go back and change containers.
3. Asserting that all encoded dictionaries must have String/Int keys, must be consistent between the two, or must have a structure representable natively by any given format is needlessly restrictive. It's possible to allow any dictionary whose keys and values are encodable and decodable to encode and decode (assuming you know their static types), so there's no real reason to prohibit that.
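
To make the first point concrete, here is a minimal sketch of a key type whose encoded shape is only decided at encode time. FlexibleKey, its compactKey userInfo key, and the encodeToString helper are all hypothetical names made up for this illustration; only JSONEncoder and the Codable machinery are real API.

```swift
import Foundation

// Hypothetical key type: it picks its representation inside encode(to:),
// based on Encoder.userInfo, so nothing can be known about it up-front.
struct FlexibleKey: Hashable, Encodable {
    // Hypothetical userInfo key, used only for this illustration.
    static let compactKey = CodingUserInfoKey(rawValue: "compact")!

    let name: String

    enum CodingKeys: String, CodingKey { case name }

    func encode(to encoder: Encoder) throws {
        if encoder.userInfo[FlexibleKey.compactKey] as? Bool == true {
            // Encode as a bare string.
            var container = encoder.singleValueContainer()
            try container.encode(name)
        } else {
            // Encode as an object with a single "name" field.
            var container = encoder.container(keyedBy: CodingKeys.self)
            try container.encode(name, forKey: .name)
        }
    }
}

// Hypothetical helper: encode the same dictionary with or without the flag.
func encodeToString(_ dict: [FlexibleKey: Int], compact: Bool) throws -> String {
    let encoder = JSONEncoder()
    encoder.userInfo[FlexibleKey.compactKey] = compact
    return String(decoding: try encoder.encode(dict), as: UTF8.self)
}

let flexibleDict = [FlexibleKey(name: "a"): 1]
let compactJSON = try encodeToString(flexibleDict, compact: true)
let verboseJSON = try encodeToString(flexibleDict, compact: false)
print(compactJSON) // ["a",1]
print(verboseJSON) // [{"name":"a"},1]
```

Because the key type is neither String nor Int, Dictionary falls back to an unkeyed container of alternating keys and values in both cases, without ever needing to inspect what the key turned into.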
Why doesn't Dictionary check for conformance to CodingKey and expand the number of types it supports beyond String and Int? For two reasons:
1. Just because an object conforms to CodingKey doesn't necessarily mean that it would prefer to encode as its .stringValue or .intValue. Any object which conforms to Codable can choose its own representation, and although you'd very rarely be encoding a Codable & CodingKey object, nothing rules it out.
2. More importantly, we wanted to deter the potential usage of [CodingKey : <Codable type>] over the use of nested containers. Syntactically, it's really easy to reach for a Dictionary, but that forgoes the strong static typing that we built with containers; it's relatively rare that you want to encode a dictionary keyed by an arbitrary set of CodingKeys, and much more common that the set of CodingKeys is known. When you own the key type, it's much easier to reach for a nested container.
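
As a rough sketch of what "reach for a nested container" looks like when you own the key type: Theme and its two key enums below are hypothetical, and the sortedKeys formatting is only there to make the printed output stable.

```swift
import Foundation

// Hypothetical type that owns its keys and uses a nested keyed container
// instead of building a [SomeCodingKey: String] dictionary.
struct Theme: Encodable {
    var foreground: String
    var background: String

    enum CodingKeys: String, CodingKey { case colors }
    // The known, statically typed set of keys for the nested object.
    enum ColorKeys: String, CodingKey { case foreground, background }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        var colors = container.nestedContainer(keyedBy: ColorKeys.self, forKey: .colors)
        try colors.encode(foreground, forKey: .foreground)
        try colors.encode(background, forKey: .background)
    }
}

let themeEncoder = JSONEncoder()
themeEncoder.outputFormatting = .sortedKeys
let themeData = try themeEncoder.encode(Theme(foreground: "black", background: "white"))
let themeJSON = String(decoding: themeData, as: UTF8.self)
print(themeJSON) // {"colors":{"background":"white","foreground":"black"}}
```

The nested object comes out keyed by strings, and a misspelled or missing key is a compile-time error rather than a runtime surprise.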
Why doesn't Dictionary check for conformance to RawRepresentable and expand the number of types it supports beyond String and Int?

As far as I'm concerned, this is a bug, but one that is very unlikely to change. There's been a lot of discussion of this in the past (see "Using RawRepresentable String and Int keys for Codable Dictionaries" and "JSON Encoding / Decoding weird encoding of dictionary with enum values") and unfortunately, it's a behavior-breaking change. Dictionary can't change its implementation without producing data which is not compatible with older versions of the stdlib.

NOTE: Why is this case different from the CodingKey case above? Because the semantics of RawRepresentable indicate that the type is truly equivalent to its raw type.

You can potentially make a similar claim about LosslessStringConvertible, but that one's a bit more iffy. Just because you can convert to a String and back doesn't mean that the type intends this.
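
For anyone who hasn't run into the RawRepresentable behavior, a small sketch (Planet is a made-up enum for illustration): the same logical data encodes as a JSON array when keyed by a String-backed enum, and as a JSON object when keyed by String directly.

```swift
import Foundation

// Hypothetical String-backed enum used as a dictionary key.
enum Planet: String, Codable, Hashable { case earth }

let byEnum: [Planet: Int] = [.earth: 3]
let byString: [String: Int] = ["earth": 3]

let rawKeyEncoder = JSONEncoder()
let enumKeyedJSON = String(decoding: try rawKeyEncoder.encode(byEnum), as: UTF8.self)
let stringKeyedJSON = String(decoding: try rawKeyEncoder.encode(byString), as: UTF8.self)

print(enumKeyedJSON)   // ["earth",3]
print(stringKeyedJSON) // {"earth":3}
```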

Since it's unlikely that Dictionary can change these things without massive breaking consequences, the best chance forward is for additional API on existing Encoders and Decoders to twiddle with these and allow dictionaries to be intercepted, like with existing encoding/decoding strategies. /cc @Morten_Bek_Ditlevsen
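
Until something like that exists on the encoders themselves, one way to get the keyed representation today is to intercept the dictionary yourself with a wrapper. The sketch below is only that: RawKeyed, AnyKey, and Moon are hypothetical names, not existing stdlib or Foundation API, and it assumes keys whose RawValue is String.

```swift
import Foundation

// Hypothetical wrapper: encodes a dictionary keyed by a String-backed
// RawRepresentable type as a keyed container instead of a flat array.
struct RawKeyed<Key, Value>: Codable
where Key: Hashable & RawRepresentable, Key.RawValue == String, Value: Codable {
    var storage: [Key: Value]

    init(_ storage: [Key: Value]) { self.storage = storage }

    // A CodingKey that can carry any string.
    private struct AnyKey: CodingKey {
        var stringValue: String
        var intValue: Int? { nil }
        init(stringValue: String) { self.stringValue = stringValue }
        init?(intValue: Int) { return nil }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: AnyKey.self)
        for (key, value) in storage {
            try container.encode(value, forKey: AnyKey(stringValue: key.rawValue))
        }
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: AnyKey.self)
        var result: [Key: Value] = [:]
        for codingKey in container.allKeys {
            // Reject keys that don't map back to a valid raw value.
            guard let key = Key(rawValue: codingKey.stringValue) else {
                throw DecodingError.dataCorruptedError(
                    forKey: codingKey, in: container,
                    debugDescription: "Unrecognized key \(codingKey.stringValue)")
            }
            result[key] = try container.decode(Value.self, forKey: codingKey)
        }
        storage = result
    }
}

// Usage with a hypothetical enum.
enum Moon: String { case luna }

let wrappedData = try JSONEncoder().encode(RawKeyed([Moon.luna: 1]))
let wrappedJSON = String(decoding: wrappedData, as: UTF8.self)
print(wrappedJSON) // {"luna":1}

let roundTripped = try JSONDecoder().decode(RawKeyed<Moon, Int>.self, from: wrappedData)
print(roundTripped.storage[.luna] == 1) // true
```

The trade-off is that every call site has to opt in by wrapping its dictionary, which is exactly the kind of thing an encoder-level strategy would avoid.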