Hi Morten, thanks for continuing the discussion on this! To answer some of your questions:
You technically can, but I think for a PR to be effective, it needs to have a primary purpose. As with all API changes, the review process requires pitching the API and going through discussion. If you start up a pitch, it'd be great to have a PR up with the associated changes, but that's not strictly necessary; putting up a PR without having an associated pitch, though, might cause some confusion.
Given that this discussion has come up a few times in the past, I think there's been enough demand to pitch it. 
To answer your more technical questions:
In my mind, if one of the goals in doing this is refactoring some of the implementation we have, one of the easier ways to structure this would be:
- Pull out all of the base functionality from _JSONEncoder/_PropertyListEncoder and _JSONDecoder/_PropertyListDecoder into new StructureEncoder and StructureDecoder classes in a new file (StructureEncoder.swift); these classes should be format-agnostic and do the minimum required to get things wrapped up
- Because PropertyListEncoder and PropertyListDecoder don't have any encoding strategies, they can use the pure StructureEncoder/StructureDecoder to convert Codable values into containers before passing off to PropertyListSerialization and back. In other words, we can eliminate _PropertyListEncoder and _PropertyListDecoder
- Any of the encoding strategies on JSONEncoder/JSONDecoder should remain specific to the JSON format; this means that we would reparent _JSONEncoder/_JSONDecoder to inherit from the new Structure types, and override only the behavior necessary to apply those strategies
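To make the "override only the behavior necessary" idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: StructureBoxing, JSONBoxing, the box(_:) hook, and the DateStrategy enum are stand-ins I made up for illustration, not the existing private types or a proposed API. It only shows the shape of a format-agnostic base with a JSON-specific subclass layering a strategy on top.

```swift
import Foundation

// Hypothetical format-agnostic base: values are boxed into containers as-is.
// A property list can store a Date natively, so no conversion happens here.
class StructureBoxing {
    func box(_ date: Date) -> Any {
        return date
    }
}

// Hypothetical JSON-specific subclass: overrides only the hook that a
// strategy needs to change, and inherits everything else.
final class JSONBoxing: StructureBoxing {
    // Illustrative strategy enum, not JSONEncoder.DateEncodingStrategy.
    enum DateStrategy {
        case secondsSince1970
        case millisecondsSince1970
    }

    var dateStrategy: DateStrategy = .secondsSince1970

    override func box(_ date: Date) -> Any {
        switch dateStrategy {
        case .secondsSince1970:
            return date.timeIntervalSince1970
        case .millisecondsSince1970:
            return date.timeIntervalSince1970 * 1000
        }
    }
}

let date = Date(timeIntervalSince1970: 2)
let plain = StructureBoxing().box(date)   // still a Date
let json = JSONBoxing().box(date)         // a Double (seconds)
```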
Because the top-level {JSON,PropertyList}{En,De}coder types are thin shells around the actual underlying encoding mechanisms, all of the underlying implementation details can change without users knowing; the above is a simple way to do this, but we have flexibility.
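As a rough illustration of the "thin shell" shape, assuming nothing about the final design: StructureEncoderStandIn and PlistShellEncoder below are hypothetical names, and since no StructureEncoder exists yet, the stand-in borrows JSONEncoder plus JSONSerialization to get from a Codable value to containers. A real implementation would build the containers directly (and would not choke on null values, which this shortcut can't pass on to a property list).

```swift
import Foundation

// Hypothetical stand-in for the proposed StructureEncoder: turns an Encodable
// value into Foundation containers. Borrowing JSONEncoder + JSONSerialization
// is only a shortcut for this sketch.
struct StructureEncoderStandIn {
    func encode<T: Encodable>(_ value: T) throws -> Any {
        let data = try JSONEncoder().encode(value)
        return try JSONSerialization.jsonObject(with: data, options: [])
    }
}

// Hypothetical thin shell: no strategies of its own, it just hands the
// containers to PropertyListSerialization.
struct PlistShellEncoder {
    var outputFormat: PropertyListSerialization.PropertyListFormat = .xml

    func encode<T: Encodable>(_ value: T) throws -> Data {
        let structure = try StructureEncoderStandIn().encode(value)
        return try PropertyListSerialization.data(
            fromPropertyList: structure,
            format: outputFormat,
            options: 0
        )
    }
}

struct Point: Codable, Equatable {
    var x: Int
    var y: Int
}

let plistData = try PlistShellEncoder().encode(Point(x: 1, y: 2))

// Round-trip through the real PropertyListDecoder to check the output.
let decoded = try PropertyListDecoder().decode(Point.self, from: plistData)
```

Users of the shell only ever see encode(_:), which is why the machinery underneath can be swapped out without them noticing.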
There are, yes — for instance, the assumption that all dictionaries must be string-keyed (because that's what JSON supports). In designing a more generalized encoder/decoder pair, we'd need to decide how to handle this:
- _JSONEncoder converts Int-keyed dictionaries to String-keyed dictionaries by stringifying the keys. We could maintain this behavior as generally useful: not all formats support Int-keyed dictionaries, so it would be a shame to need to subclass these types just to enforce that. Alternatively, we could offer an encoding strategy for it to let consumers decide
- Dictionary itself already turns non-String- and non-Int-keyed dictionaries into unkeyed containers (alternating keys and values), so we wouldn't need to handle more complex cases than that
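For reference, the current behavior is easy to observe with JSONEncoder. The snippet below uses single-entry dictionaries so that ordering doesn't matter, and Bool as an arbitrary example of a key type that is neither String nor Int; as far as I know, Dictionary encodes such keys as a flat unkeyed sequence of alternating keys and values.

```swift
import Foundation

let encoder = JSONEncoder()

// Int keys go through a keyed container, so JSON sees stringified keys.
let intKeyed: [Int: String] = [1: "one"]
let intKeyedData = try encoder.encode(intKeyed)
let intKeyedJSON = String(decoding: intKeyedData, as: UTF8.self)
print(intKeyedJSON) // {"1":"one"}

// Other key types (Bool here) are encoded by Dictionary itself as an
// unkeyed sequence: key, value, key, value, ...
let boolKeyed: [Bool: String] = [true: "yes"]
let boolKeyedData = try encoder.encode(boolKeyed)
let boolKeyedJSON = String(decoding: boolKeyedData, as: UTF8.self)
print(boolKeyedJSON) // [true,"yes"]
```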
Of course, there are other potential implicit assumptions that we'd need to audit for.
Agreed — there's no need for the application of JSON semantics to affect this more generalized structure. (In general, too, there have been requests to relax this restriction, which I'm in favor of; that will require further changes, though.)
Thanks for putting work into this! We can continue discussing specifics, but I'd like to pull back to some higher-level topics here that we haven't managed to come up with good answers for in the past. Specifically, two subjects:
- Naming/availability
- Whether or not JSONEncoder/JSONDecoder will use this new pair as their basis
The first subject is the much bigger and more important one: at the moment, the namespace here is already pretty saturated with "Encoder"- and "Decoder"-type words that can make reaching for the right tool to use a little bit difficult. Besides Encoder and Decoder themselves, there are the encoding containers, and the actual format-specific encoder types. If we think this is something we'd like to do, we need to come up with a really good name for these types that
- Doesn't invite conflation of "Structure" with struct, i.e., we don't want to run the risk of someone reaching for StructureEncoder for the wrong reason
- In general, is more difficult to reach for to begin with. These types are not necessarily all that useful on their own; it's rare to need to convert arbitrary Codable types to containers unless you're then going to pass those containers through a serialization pass of your own. This makes them good API for developers who are looking to write their own encoders and decoders, but not so good for developers looking to use encoders and decoders. StructureEncoder is currently a straw-man name to give a bit of meaning to the API, but something like ConcreteEncoderBase (or something similarly abstract) could make it more difficult to reach for
Something that has come up in the past is potentially namespacing these types to begin with to fall more in line with the principle of progressive disclosure. However, in the absence of language-supported namespaces at the moment, we'd have to resort to using something like enum EncodingUtilities {...}, which would still be imported by default when you import Foundation. (I think ideally, you'd need to import Foundation.EncodingUtilities or similar in order to see these types.) Not something insurmountable, but not quite in place yet for us to be able to use it.
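For anyone unfamiliar with the enum-as-namespace workaround, a minimal sketch looks like the following. The nested StructureEncoder is an empty placeholder using the straw-man name from above, not a proposed API.

```swift
// A caseless enum can't be instantiated, so it acts purely as a container
// for nested types.
enum EncodingUtilities {
    // Placeholder for the proposed type; the name and API are not final.
    final class StructureEncoder {
        init() {}
    }
}

// Call sites have to spell out the qualified name, which makes the type
// a little harder to stumble into by accident.
let structureEncoder = EncodingUtilities.StructureEncoder()
```

This only changes how the type is spelled; it is still visible to anyone who imports the enclosing module.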
As for JSONEncoder/JSONDecoder — while it would be really nice to factor some of this implementation out into something shareable between JSON and PropertyList, another topic that's come up is weaning JSONEncoder and JSONDecoder off of JSONSerialization and onto a different internal serializer that's closer to the whole encoding and decoding process. Not requiring a whole pass to decode the JSON via JSONSerialization first would give us better type-level access to the underlying data (e.g. see decoding a Double vs. decoding a Decimal; JSONSerialization has to choose which, but doesn't know what you'd prefer as a consumer — while JSONDecoder might and could do a better job), along with some potential performance benefits.
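To illustrate the Double vs. Decimal point: in the sketch below the same payload is decoded twice, and only the decoder knows which numeric type the consumer asked for. The AsDouble and AsDecimal types are made up for the example. Whether the Decimal result keeps the full precision of the JSON text depends on how the number was parsed before the decoder saw it, so I'm deliberately not showing expected output here.

```swift
import Foundation

// Two views of the same payload: the consumer's declared type is the only
// place where the preference between Double and Decimal is expressed.
struct AsDouble: Decodable {
    let price: Double
}

struct AsDecimal: Decodable {
    let price: Decimal
}

let payload = Data("{\"price\": 0.1}".utf8)
let decoder = JSONDecoder()

let asDouble = try decoder.decode(AsDouble.self, from: payload)
let asDecimal = try decoder.decode(AsDecimal.self, from: payload)

// A serializer that parses ahead of time has to pick one representation
// before it knows which of these two types will be requested.
print(asDouble.price, asDecimal.price)
```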
Does that obviate the need for this? I don't think so; clearly this would be useful on its own. The question is just how much infrastructure work we'd like to do to factor this out, only to then do more work to further enhance JSONEncoder/JSONDecoder. I don't have an answer for this at the moment.
In any case, I appreciate the continuing discussion! I think before we skip ahead to implementation detail, we should carefully consider the design decisions that make a big difference here, then pitch the API. The implementation detail should be "boring" in comparison to the work done up-front. 