Multiple calls to `encode(to:)` in the same container (a dubious Codable / `@dynamicMemberLookup` combination)

This thing I'm doing seems to work, but I'm not sure whether I can rely on it based on Codable's semantics, or whether it's a happy accident of JSONEncoder/JSONDecoder (and perhaps of the iOS/Xcode versions I'm using).

I have a bunch of types that all share some standard boilerplate properties; call that the "standard metadata". I used `@dynamicMemberLookup` to compose the standard metadata with structs holding the unique properties of the various types, like this:

public struct StandardMetadata: Codable {
    public let created: Date
    public let lastChanged: Date
    public let id: UInt
}

@dynamicMemberLookup
public struct ComesFromAPI<Model> {
    public let metadata: StandardMetadata
    let body: Model

    public subscript<Value>(dynamicMember keyPath: KeyPath<Model, Value>) -> Value {
        body[keyPath: keyPath]
    }
}

public struct SomeExampleModel: Codable {
    let name: String
}

I'm receiving these types as JSON, and the metadata is in the same container as the individual values.

{ "created": "2022-01-01T00:00:00",
  "lastChanged": "2022-03-10T02:05:23",
  "id": 3527,
  "name": "Gollum" 
}

I can give ComesFromAPI a conditional conformance like so:

extension ComesFromAPI: Codable where Model: Codable {
    public init(from decoder: Decoder) throws {
        metadata = try StandardMetadata(from: decoder)
        body = try Model(from: decoder)
    }

    public func encode(to encoder: Encoder) throws {
        try metadata.encode(to: encoder)
        try body.encode(to: encoder)
    }
}

This lets me successfully round-trip encode/decode ComesFromAPI<SomeExampleModel>, with the encoded JSON looking the way I want it to (all properties in one JSON container). (That is, it works compiled with Xcode 13.3.1 and targeting iOS 15.4.)
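For reference, here's a minimal, self-contained sketch of the round trip (my own simplified stand-ins for the types above; the Date fields are dropped so date-strategy settings don't distract from the point):

```swift
import Foundation

struct Metadata: Codable { let id: UInt }

@dynamicMemberLookup
struct Wrapped<Model> {
    let metadata: Metadata
    let body: Model
    subscript<Value>(dynamicMember keyPath: KeyPath<Model, Value>) -> Value {
        body[keyPath: keyPath]
    }
}

extension Wrapped: Codable where Model: Codable {
    init(from decoder: Decoder) throws {
        metadata = try Metadata(from: decoder)
        body = try Model(from: decoder)
    }
    func encode(to encoder: Encoder) throws {
        try metadata.encode(to: encoder)
        try body.encode(to: encoder)
    }
}

struct Example: Codable { let name: String }

let original = Wrapped(metadata: Metadata(id: 3527), body: Example(name: "Gollum"))
let data = try! JSONEncoder().encode(original)
// With JSONEncoder this produces a single flat object: {"id":3527,"name":"Gollum"}
let decoded = try! JSONDecoder().decode(Wrapped<Example>.self, from: data)
assert(decoded.metadata.id == 3527 && decoded.name == "Gollum")
```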

But I don't understand the semantic guarantees of Codable well enough to know if this is reliable; I would not have been surprised at all if the encoding had instead generated two JSON objects side by side, and while I'm delighted that it doesn't, I don't really understand why. Can anyone enlighten me?


Tens? Thousands?

Is this essential or could you stomach this?

{
  "metadata": {
      "created": "2022-01-01T00:00:00",
      "lastChanged": "2022-03-10T02:05:23",
      "id": 3527
   },
  "name": "Gollum" 
}

From personal experience: I play with creative workarounds like this, then sleep on it and take a fresh look. Usually I realise that the amount of creative code needed to reduce the boilerplate is greater than the boilerplate itself, and/or that the removed boilerplate, whilst repetitive, was quite simple to understand, whereas the creative workaround could contain bugs and is quite challenging to grasp, either for another developer or for myself in a few months' time. Then I take a deep breath and revert from the creative code back to something simple:

// option 1
struct StandardMetadata: Codable {
    let created: Date
    let lastChanged: Date
    let id: UInt
}

struct SomeExampleModel: Codable {
    let metadata: StandardMetadata
    let name: String
}

// or option 2
struct SomeExampleModel: Codable {
    let created: Date
    let lastChanged: Date
    let id: UInt
    let name: String
}

KISS!

I can't change the JSON format. If it turns out I really can't rely on the Codable behaviour, then indeed repeating the boilerplate is probably the fallback plan. (We're talking tens but not hundreds of types; it's manageable by hand.) It means introducing a protocol (for reasons outside the scope of this simplified example, but I'm sure you can imagine), which is also probably not impossible.

I'm still interested in the general question though, because the idea of "codable mixins" turns up pretty often in my thinking about patterns.

In the general case, Codable doesn't guarantee the semantics you're looking for, so the ComesFromAPI code you show isn't guaranteed to be safe; however, in a constrained scenario (using this wrapper to only encode and decode types that you control, and only using JSONEncoder/JSONDecoder), this should be okay.

The implementation of encode(to:) here has two main issues:

  1. Encoding two objects at the same "level" and expecting them to interleave the way JSONEncoder allows isn't guaranteed. Codable doesn't guarantee in general that you can call encode(to:) more than once at a given depth of the object hierarchy at all (e.g., an Encoder that would like to write output in a streaming fashion would be within its rights to precondition that this isn't allowed, since it wouldn't be able to interleave the contents of the objects).
    • There's also the matter that Codable requires that both metadata and body request the same encoding container within their encode(to:) implementations; otherwise it is a hard error, and you're likely to crash
    • A bit more detail in Aggregating two Encodable's
  2. Calling encode(to:) directly on an object and giving it an Encoder prevents the encoder from handling that object in any sort of special way. This isn't likely to be an issue if the object is of a type you control that isn't otherwise special-cased by the encoder (e.g. Date, URL, Dictionary, and several other types are handled specifically by JSONEncoder/JSONDecoder, and calling encode(to:) directly on them will yield different results than encoding them "properly" through a container)
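To illustrate point 2 with a concrete sketch (my own illustration, not code from the thread): calling encode(to:) directly on a Date bypasses JSONEncoder's dateEncodingStrategy, because the strategy is applied when a Date is encoded through a container, not inside Date's own Codable implementation:

```swift
import Foundation

struct Direct: Encodable {
    let date: Date
    func encode(to encoder: Encoder) throws {
        // Calls Date's own Codable implementation, which encodes the raw
        // timeIntervalSinceReferenceDate — the encoder never sees a Date here.
        try date.encode(to: encoder)
    }
}

struct Comparison: Encodable {
    let direct: Direct
    let throughContainer: Date
}

let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601

let date = Date(timeIntervalSinceReferenceDate: 0)
let data = try! encoder.encode(
    Comparison(direct: Direct(date: date), throughContainer: date))
print(String(data: data, encoding: .utf8)!)
// Expect something like: {"direct":0,"throughContainer":"2001-01-01T00:00:00Z"}
// — the direct path ignores the .iso8601 strategy entirely.
```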

Based on the detail you've given, it doesn't sound like either of these are concerns: it doesn't sound like you're planning on writing to a format other than JSON, and it doesn't outwardly seem like you'll be encoding or decoding anything other than ComesFromAPI<some type you control already>. That being said, I don't know if I would personally feel comfortable relying on this — it's extremely unlikely that the behavior of JSONEncoder/JSONDecoder could change to suddenly make this illegal, but as @tera says, for some, there's something to be said for writing it so simply that anyone could understand the behavior (and that it can't possibly break).

If Swift had support for hygienic macros or mixins natively in the language, I'd say that decorating your types in a way that helped automate "option 2" above would likely be my go-to approach, but in the absence of that, one alternative to writing it all out by hand would be using a source generator to do the heavy lifting.


Thanks @itaiferber , that's exactly the kind of detail I was looking for! (Indeed with this I'm confident that I'm "safe enough" for the moment, and I also have a clear picture of what issues to watch for in future.)

Chasing @tera's suggestion I eventually ran down the detail of why this mixin approach is extra appealing in my case, which might also be interesting for y'all. Turns out one of the properties in (the real version of) StandardMetadata needs a custom encoding key (for the sake of the example, suppose id was keyed "@id" in the JSON I get). So the boilerplate version also runs to explicit coding keys for all the properties of each model, even if they would otherwise be covered by the compiler-generated conformance. Obviously that's still not a blocker, it's just lifting the annoyance level a little higher.
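To make that concrete, the boilerplate that a single nonstandard key forces looks something like this (a sketch reusing the example fields from this thread):

```swift
import Foundation

struct SomeExampleModel: Codable {
    let created: Date
    let lastChanged: Date
    let id: UInt
    let name: String

    // One nonstandard key means the compiler-synthesized keys can't be
    // used, so every key must now be spelled out by hand.
    private enum CodingKeys: String, CodingKey {
        case created
        case lastChanged
        case id = "@id"
        case name
    }
}
```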

Thank you both, very useful discussion.

This particular case (or a similar case) is easy to handle.
struct Id: Codable {
    var id: Int = 123
}

extension Id {
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode("@" + String(id))
    }
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let id = try container.decode(String.self)
        self.id = Int(id.dropFirst())!
    }
}

struct SomeExampleModel: Codable {
    var id = Id()
    var created = Date()
}

func test() {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .withoutEscapingSlashes]
    let data = try! encoder.encode(SomeExampleModel())
    print(String(data: data, encoding: .utf8)!)
    /*
     {
       "id" : "@123",
       "created" : 674664631.40048802
     }
     */
    let v2 = try! JSONDecoder().decode(SomeExampleModel.self, from: data)
    print(v2)
    /*
     SomeExampleModel(
        id: JT.Id(id: 123),
        created: 2022-05-19 14:50:31 +0000
     )
     */
}
As a general suggestion, more validation on decode is highly recommended:
struct Id: Codable {
    var id: Int

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode("@\(id)")
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let idString = try container.decode(String.self)
        guard idString.first == "@",
              let id = Int(idString.dropFirst()) else {
            throw DecodingError.dataCorruptedError(in: container, debugDescription: "Unexpected ID format: \"\(idString)\"")
        }

        self.id = id
    }
}

Thanks, that’s a nice pattern! Unfortunately in my case it’s the key, not the value, that is nonstandard:

{ "@id": 784, … }

It just adds to the boilerplate, requiring an explicit CodingKey type for each model where in most cases the data properties wouldn’t need it.

I believe you can make do with a single CodingKey type for the "id" property.
    encoder.keyEncodingStrategy = .custom { keys in
        keys.last!.stringValue == "id" ? IdKey(stringValue: "@id")! : keys.last!
    }
    decoder.keyDecodingStrategy = .custom { keys in
        keys.last!.stringValue == "@id" ? IdKey(stringValue: "id")! : keys.last!
    }

struct IdKey: CodingKey {
    var stringValue: String
    var intValue: Int?

    init?(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }
    init?(intValue: Int) {
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}
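Wiring that up end to end looks something like this (my own sketch; the IdKey type is the one defined just above, repeated so the snippet is self-contained):

```swift
import Foundation

struct IdKey: CodingKey {
    var stringValue: String
    var intValue: Int?
    init?(stringValue: String) { self.stringValue = stringValue; self.intValue = nil }
    init?(intValue: Int) { self.stringValue = String(intValue); self.intValue = intValue }
}

struct SomeExampleModel: Codable {
    let id: UInt
    let name: String
}

let json = Data(#"{ "@id": 784, "name": "Gollum" }"#.utf8)

// Map the payload's "@id" key back to the property name "id" on decode.
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .custom { keys in
    keys.last!.stringValue == "@id" ? IdKey(stringValue: "id")! : keys.last!
}
let model = try! decoder.decode(SomeExampleModel.self, from: json)
assert(model.id == 784 && model.name == "Gollum")

// And the reverse mapping on encode.
let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .custom { keys in
    keys.last!.stringValue == "id" ? IdKey(stringValue: "@id")! : keys.last!
}
let roundTripped = try! encoder.encode(model)
// Produces {"@id":784,"name":"Gollum"} (key order not guaranteed)
```

Note that the strategy applies to every key named "id" at any depth of the payload, which is fine so long as "id" only ever appears where the substitution is wanted.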

(Although I won't recommend it either, for the reasons already mentioned: the size of a more complex (and thus potentially buggy) creative workaround is comparable to, or even exceeds, that of the simpler and bug-free boilerplate.)

PS. I'd have a serious talk with my DB engineer who wants me to use this JSON format...

PPS. There's always a more general JSONSerialization to consider...

Interesting technique, although it does indeed have some notable drawbacks compared to the straightforward approach. Thanks.