Proper way to structure containers in new coders?

I’ve been tinkering with the whole Codable process a bit as of late, and to fully get it I’m trying to see how it would work with different formats at the same time.
My thought exercise in this case is to deal with XML and/or Protocol Buffers, while still keeping support for JSON.

So, without having to deal with this at the implementation level, I’m having trouble imagining how the encoding/decoding process would structure the containers stack to properly represent these formats. My understanding is that the container stack — which has keyed, unkeyed and single-value containers — is a generalised view over the concrete underlying format (e.g. enum JSON, struct XMLElement, etc.). So does this mean that it supports only formats that can be represented as a combination of Dictionary (with String or Int keys), Array and Any?
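For reference, here’s the shape I mean: a hand-written init(from:) only ever talks to those generic containers, regardless of the concrete format underneath (Point is just a toy model of mine).

```swift
import Foundation

struct Point: Decodable {
    let x: Int
    let y: Int

    enum CodingKeys: String, CodingKey {
        case x, y
    }

    init(from decoder: Decoder) throws {
        // The keyed container is the format-agnostic view over whatever
        // "dictionary-like" structure the format provides (a JSON object,
        // an XML element, a protobuf message, …).
        let container = try decoder.container(keyedBy: CodingKeys.self)
        x = try container.decode(Int.self, forKey: .x)
        y = try container.decode(Int.self, forKey: .y)
    }
}

let json = Data(#"{"x": 1, "y": 2}"#.utf8)
let point = try JSONDecoder().decode(Point.self, from: json)
// point.x == 1, point.y == 2
```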

Let’s take this example:

  • The Swift model
struct Parent: Codable {
    let value: Int
    let child: Child
}

struct Child: Codable {
    let childValue: String
}
  • A JSON we want to decode
{
    "value": 42,
    "child": {
        "childValue": "aValue"
    }
}
  • An XML we want to decode
<PARENT value="42">
    <CHILD childValue="aValue"/>
</PARENT>
  • A Protocol Buffers message definition of a value we want to decode (the actual binary value is not presented here as it would be less readable compared to the definition)
message Parent {
    required int64 value = 1;
    required Child child = 2;
}

message Child {
    required string childValue = 1;
}

Given all these, and that they all more or less “match”, I suppose the idea is that a Codable type should be able to support all of them with a single implementation of Decodable.init(from:).


With the JSON value there are no issues: the default synthesised implementation just works.


With the XML value I don’t see issues on the decoding side, but I think the encoding is ambiguous: how does the model tell the Encoder whether a property should go into an XML attribute or a child XML element?
Given how Codable is structured, I don’t think this should come from custom info fed into the Encoder (e.g. through userInfo); it should come from the model itself. The natural place for this seems to be CodingKeys, which specify where the properties get coded.
The protocol doesn’t leave room to define additional details about the keys, though, so I believe the only option is to define something like protocol XMLCodingKey: CodingKey which allows for that. But the Encoder will still work with a keyed container defined in terms of plain CodingKey… should the encoder fail (i.e. throw) at runtime if the key is not an XMLCodingKey?
This also means you have to give up the synthesised implementation, but I don’t think there’s a way around that…
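To make the idea concrete, the refinement I have in mind would look roughly like this (XMLCodingKey and isAttribute are hypothetical names, nothing that exists in Foundation):

```swift
// Hypothetical refinement: a key that also tells an XML coder whether
// its value belongs in an attribute or in a child element.
protocol XMLCodingKey: CodingKey {
    var isAttribute: Bool { get }
}

struct Child: Codable {
    let childValue: String
}

struct Parent: Codable {
    let value: Int
    let child: Child

    enum CodingKeys: String, XMLCodingKey {
        case value, child

        var isAttribute: Bool {
            switch self {
            case .value: return true   // <PARENT value="42">
            case .child: return false  // nested <CHILD …/> element
            }
        }
    }
}
```

The CodingKey requirements are still synthesised from the String raw values; only isAttribute has to be written by hand.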

Protocol Buffers

In this case there are mainly two issues:

  • the first, which is easily solvable, is to provide the corresponding tag for each property. CodingKey.intValue is a perfect match for it. Unfortunately the default implementation doesn’t offer anything for this (1-indexed property order would be a nice default here, but I’m not sure it fits elsewhere), but it’s still easy to fix with a manual declaration of CodingKeys
  • the second, which is the real issue, is that the Encoder/Decoder needs information about the wire type used for each specific property (i.e. which of the multiple binary representations is used for the value). This might be okay for decoding, but when encoding this information is necessary. I believe this is more or less the same issue as in XML and could be solved the same way.
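For the first point, the manual declaration is as small as this (ProtoKeys is just an illustrative name):

```swift
// Manual CodingKeys: the Int raw value doubles as the protobuf tag,
// surfaced to any coder through CodingKey.intValue. The stringValue
// is still synthesised from the case name.
enum ProtoKeys: Int, CodingKey {
    case value = 1
    case child = 2
}
```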

So, the actual question here is: am I thinking about this right? Are more specific CodingKey subprotocols the way to go when dealing with formats that need more details about the properties, throwing at runtime if a model with a non-conforming CodingKeys is passed to the Encoder/Decoder?


Sorry for the delay in responding to this — I just got back from vacation, so I’m catching up on a lot of threads here.

Formats whose contents can be represented in terms of Dictionary (with String or Int keys), Array, numbers, Strings, and null values cover the vast majority of formats. It’s also entirely possible for an Encoder to collect an object graph in this representation and transform it into a more natural representation; there isn’t a strict requirement that the object graph passed in be serialized in the same order.

But yes, if you have a format which is truly incompatible with this model, then it might not be well-suited to fit the Codable API. This is a tradeoff we made in the design: if you try to capture something representative of all serialization formats, you end up with very weak and vague API because there are ends of the spectrum which don’t align with one another at all. But I would say that ~90% of serialization formats are extremely similar to one another and can be captured well here.

This is the type of decision that I would posit is really up to the Encoder, not to what’s being encoded. XML has many, many ways to represent similar concepts, and I think the specific format choices should be up to, and controlled by, the Encoder. For instance, the XML you give above could be represented in at least a few ways off the top of my head:

  • Your original (properties in XML attributes, except for complex values):

    <PARENT value="42">
        <CHILD childValue="aValue" />
    </PARENT>
  • Properties in child nodes with custom node types:

    <PARENT>
        <value>42</value>
        <CHILD>
            <childValue>aValue</childValue>
        </CHILD>
    </PARENT>
  • Properties in child nodes with generalized node types (e.g. property-value pairs):

    <node type="PARENT">
        <property name="value">42</property>
        <node type="CHILD">
            <property name="childValue">aValue</property>
        </node>
    </node>


In all, this is up to the DTD/schema you’re working with, and how the Encoder chooses to support these representations. This isn’t a choice that the actual type should control (or even care about); a flag on the Encoder could potentially allow you to choose the representation you’d want.
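For instance, a hypothetical XMLEncoder (not an existing Foundation type) could expose that flag like this:

```swift
import Foundation

// Hypothetical encoder sketch: the representation choice is a knob on
// the encoder, invisible to the types being encoded.
final class XMLEncoder {
    enum NodeEncodingStrategy {
        case attributes     // <PARENT value="42">…</PARENT>
        case childElements  // <PARENT><value>42</value>…</PARENT>
    }

    var nodeEncodingStrategy: NodeEncodingStrategy = .attributes

    func encode<T: Encodable>(_ value: T) throws -> Data {
        // A real implementation would walk the value's containers and
        // emit XML according to nodeEncodingStrategy.
        fatalError("sketch only — not implemented")
    }
}
```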

The manual declaration of CodingKeys is indeed the answer here. We didn’t want the compiler to unwittingly introduce fragility into your CodingKeys enum by deciding on integer values on your behalf. Since all enum cases have names, string values are easy; integer values fall prey to reordering and naming concerns. String backing is relatively safe (short of renaming your property and the associated key); Int backing you’d have to do yourself.

Again, I think that depending on the model you’re going for, this can be solved either by the Encoder itself, or by the type directly without need for anything special:

  • If the desired format is such that the produced protobuf output is consistent in wire types across the board (e.g. use varints everywhere, or use fixed-width values everywhere, etc.), then there is an easy solution which is that the Encoder applies the same strategy across the board. The types themselves don’t need to care about the wire formats of their individual properties, since they’re all the same
  • If different types care about the wire formats, then the protobuf Encoder can offer specific types for those wire formats — e.g., Parent.value would be of type ProtobufEncoder.FixedInt64 (or something like that). FixedInt64 would just wrap an Int64 (and would encode as a regular Int64 through all other encoders), but would be special for ProtobufEncoder: the encoder could intercept this type specifically to get its value to write out in the fixed format as needed. ProtobufEncoder would also offer VarInt64, and the Protobuf compiler could generate the different types based on your schema
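A sketch of what such a wrapper could look like (ProtobufEncoder.FixedInt64 doesn’t exist anywhere; this is purely illustrative):

```swift
import Foundation

// Hypothetical wrapper: to generic coders it looks like a plain Int64
// (single-value container); a protobuf encoder could intercept the type
// itself and choose the fixed-width wire format instead.
struct FixedInt64: Codable, Equatable {
    var value: Int64

    init(_ value: Int64) { self.value = value }

    init(from decoder: Decoder) throws {
        value = try decoder.singleValueContainer().decode(Int64.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(value)
    }
}

// Through JSONEncoder it's indistinguishable from a bare Int64:
let data = try JSONEncoder().encode([FixedInt64(42)])
// String(data: data, encoding: .utf8) == "[42]"
```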

To get at the core of this: I don’t think refining the CodingKey protocol is necessary given the tools we already have at hand. Besides the additional complexity (conceptually, of implementation, etc.), I think between splitting some of the responsibilities here among the Encoder and the types being encoded, it’s possible to do everything here today.

With a definition of

struct Parent : Codable {
    let value: ProtobufEncoder.FixedInt64
    let child: Child

    private enum CodingKeys : Int, CodingKey {
        case value = 1
        case child = 2
    }
}

struct Child : Codable {
    let childValue: String

    private enum CodingKeys : Int, CodingKey {
        case childValue = 1
    }
}

it should be possible to write Parent to JSON, XML, and Protobuf as-is today.


Oh well, I wouldn’t consider ~20h a “delay” :) Thanks for giving such a detailed response.

I guessed this was possible, and it surely gives more flexibility.

I do agree that the most common cases should be made easier, and the design seems to nail this. Though I’m not sure — as I don’t have enough examples in mind — that the others are downright impossible. This topic is indeed about exploring how to use the existing API with less common use-cases and seeing how far it can be pushed!

I would agree with this under the assumption that an API follows consistent rules about some kind of behaviour, e.g. for the already supported strategies it makes sense because for these kinds of things (property casing, date format, etc.) you usually have consistency. Though I’m not sure the same can be said for everything… most APIs are not exactly perfect :sob:

Taking this XML example, I understand why you’re suggesting to let the Encoder choose: all your examples basically represent the same data, which in Swift would all use the same model. The point of Codable is to abstract from the specific encoding, and that includes details like these. Though there will be exceptions…

Given this situation, you can still delay the decision to the Encoder, but this means you’ll effectively end up having a different Encoder per model. This also means you might have to switch encoders while going down the nested containers… so the responsibility of choosing the right one will be coupled with the encoders themselves. Not nice at all :disappointed_relieved:
An alternative is to have the Encoder know about all the exceptions in the codebase — e.g. by passing the schemas of all supported models — but that seems a bit extreme…

Makes sense :slight_smile:

This would work, but I think it defeats the purpose of the Codable separation: this way the encoding details of a single format creep not only into the model conformance, but into the whole model structure! It also isn’t scalable: if another encoding needs a similar detail for the same property, we end up with nested wrappers :cold_sweat:

You did convince me that the CodingKey route isn’t the best, not that much for implementation costs but more for architectural reasons: it would make the models be too responsible about coding-related details, which are indeed responsibility of the coder.

At this point it looks to me like the best solution is to have “strategies” for consistent rules, and model-specific exceptions (e.g. by passing the schemas of all relevant models) provided to the coder.

Shouldn’t the CodingKeys here have both an intValue and a stringValue? In this snippet they would have only the former, and only the protobuf coder would work. Unless I’m missing something :thinking:

I think you’re missing something. In Itai’s example, the coding keys have both types of values:

let p = Parent(value: 100, child: .init(childValue: "some value"))
let e = JSONEncoder()

e.keyEncodingStrategy = .custom { keys in
    for key in keys {
        print(key.stringValue, key.intValue.map(String.init) ?? "didn't have an intValue")
    }
    return keys.last!
}

try e.encode(p)

/* Prints:
value 1
child 2
child 2
childValue 1
*/

Note how stringValue is non-optional, and intValue is optional. The intValue here is filled out by the enumeration’s integer raw value.

If there are indeed exceptions, the question is — who would best know about them and where they should be? Is it the Encoder itself? The individual objects? The entity doing the actual encoding (e.g. the code that calls XMLEncoder().encode(...))?

There are solutions to this and in the design of our APIs, we’ve tried to keep all options open, so any of these three actors can influence the final decision in concert with one another:

  1. Individual objects generally shouldn’t need to know about the wire formats/what’s going on elsewhere in encoding/decoding, but they can through the encoder’s codingPath (AKA “where are we in the process right now?”) and the userInfo dictionary (AKA “what details might I need to know from the top level down?”). So if you need to, individual objects can decide to inform the Encoder about what to do
  2. The entity doing the encoding generally shouldn’t muck around with individual objects and types they don’t know about, but they can if the Encoder provides strategies which allow overriding of individual types (like DateEncodingStrategy). A .custom strategy can offer callbacks to the top-level to influence how things are encoded
  3. An Encoder generally shouldn’t special-case various things during encoding, but can do whatever it wants or needs to do to get the payload encoded
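Point 2 is already visible in an existing strategy. For example, with JSONEncoder’s dateEncodingStrategy the top-level caller overrides how every Date is written, without the type knowing (Event is a made-up model):

```swift
import Foundation

struct Event: Codable {
    let name: String
    let date: Date
}

let encoder = JSONEncoder()
// The entity doing the encoding decides the Date representation;
// Event itself is untouched.
encoder.dateEncodingStrategy = .custom { date, encoder in
    var container = encoder.singleValueContainer()
    try container.encode(Int(date.timeIntervalSince1970))
}

let event = Event(name: "launch", date: Date(timeIntervalSince1970: 1000))
let json = String(data: try encoder.encode(event), encoding: .utf8)!
// json contains "date":1000 (key order may vary)
```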

All in all, you can attack this problem from any of the three directions. The question is: who has the information (or can provide it) to best inform how things are encoded?

Why? You don’t necessarily have to — you can have a single Encoder which changes serialization schemes as it’s going through and encoding. I don’t think there’s necessarily anything keeping the Encoder from doing a combination of several different methods, as long as it knows what’s needed.

There’s some amount of balance to strike here between idealism and pragmatism. Your type could be encoded in a variety of different formats, but will it? If so, and your type cares so strongly about the details of all possible formats that you can’t reflect this in the model structure, there are other ways of dealing with it.

One way is to switch on the output format (there are some ways of doing this, but it largely requires participation from the Encoder or the top-level entity at the moment) and to wrap at encode(to:)-time the properties you want to encode in the specific wire formats you want.
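A rough sketch of that first option, using userInfo as the switch (the "format" key and the "legacy" flag are invented for the example):

```swift
import Foundation

// Hypothetical userInfo key the top-level entity sets to announce the
// target format to every type down the encoding chain.
let formatKey = CodingUserInfoKey(rawValue: "format")!

struct Reading: Encodable {
    let value: Int64

    enum CodingKeys: String, CodingKey { case value }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        if encoder.userInfo[formatKey] as? String == "legacy" {
            // Invented exception: the "legacy" format wants strings.
            try container.encode(String(value), forKey: .value)
        } else {
            try container.encode(value, forKey: .value)
        }
    }
}

let encoder = JSONEncoder()
encoder.userInfo[formatKey] = "legacy"
let json = String(data: try encoder.encode(Reading(value: 7)), encoding: .utf8)!
// json == {"value":"7"}
```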

Another option is to incorporate Codable adaptors somehow (we haven’t yet designed this, but it’s come up a few times), letting you more easily modify the behavior of individual properties; no promises on this ATM.

Yeah, as mentioned above, this is one way to do it. There are many, but it depends on specific needs rather than hypotheticals.

As noted by @krilnon, Int-backed CodingKeys get both String and Int values (String value matching the name of the case, Int value based on its value).


Indeed, I did miss that. I was used to seeing a custom implementation like

enum CodingKeys: String, CodingKey {
    case value
    case child = "customKey"
}

and internally I thought that having the enum be RawRepresentable with a String was necessary to retrieve CodingKey.stringValue, while actually that’s only needed to override the values. It never occurred to me that the synthesised implementation doesn’t actually need the RawRepresentable bit, and could be expressed as

enum CodingKeys: CodingKey {
    case value, child
}

Yep! This is a little-known feature. If you don’t need to customize the cases, you don’t strictly need to be String-backed. There’s more info in the original proposal, right where protocol CodingKey is introduced (scroll a little bit; it’s hard to link to directly).
