Automatic Codable conformance for enums with associated values that themselves conform to Codable

At least as of Swift 4.1, an enum with one or more cases with associated values cannot participate in synthesized Codable conformance. For example, the following will yield compiler errors:

enum MyEnum: Codable { // errors: does not conform to Encodable/Decodable
    case foo
    case bar(label: Int)
}

This is true even if all the associated values in said enum themselves conform to Codable. Much like the synthesized Hashable / Equatable implementations for enums whose associated values themselves conform to Hashable / Equatable, it would be useful for the Swift compiler to synthesize Codable conformance automatically for any enum with associated values, so long as every associated value type itself already conforms to Codable.

In broad strokes, I'd expect the above code snippet to yield a synthesized implementation much like this:

enum MyEnum: Codable {
    case foo
    case bar(label: Int)
}

// Begin synthesized code...

extension MyEnum {
    enum Discriminator: String, Codable {
        case foo, bar
    }
    
    enum CodingKeys: String, CodingKey {
        case discriminator
        case bar_label
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let discriminator = try container.decode(Discriminator.self, forKey: .discriminator)
        switch discriminator {
        case .foo:
            self = .foo
        case .bar:
            let label = try container.decode(Int.self, forKey: .bar_label)
            self = .bar(label: label)
        }
    }
    
    func encode(to encoder: Encoder) throws {
        // Yadda, yadda...
    }
}

I have left out the encode(to:) implementation as an exercise for the reader, since it should be an obvious inversion of the suggested init(from:) implementation.
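
For anyone who would rather not do the exercise, here is a minimal hand-written sketch of the whole thing, encode(to:) included. It is only an illustration of the layout proposed above, not what a compiler would necessarily emit; the Equatable conformance and the round-trip at the bottom are additions for checking the result.

```swift
import Foundation

enum MyEnum: Codable, Equatable {
    case foo
    case bar(label: Int)

    // Mirrors the target enum's cases, minus the associated values.
    enum Discriminator: String, Codable {
        case foo, bar
    }

    enum CodingKeys: String, CodingKey {
        case discriminator
        case bar_label
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        switch try container.decode(Discriminator.self, forKey: .discriminator) {
        case .foo:
            self = .foo
        case .bar:
            self = .bar(label: try container.decode(Int.self, forKey: .bar_label))
        }
    }

    // The inverse of init(from:): write the discriminator, then the
    // associated values for the current case only.
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .foo:
            try container.encode(Discriminator.foo, forKey: .discriminator)
        case .bar(let label):
            try container.encode(Discriminator.bar, forKey: .discriminator)
            try container.encode(label, forKey: .bar_label)
        }
    }
}

// Round-trip check (illustrative only).
let original: [MyEnum] = [.foo, .bar(label: 37)]
let data = try JSONEncoder().encode(original)
let decoded = try JSONDecoder().decode([MyEnum].self, from: data)
```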

The algorithm for this synthesis of Decodable conformance, in broad strokes:

  1. Synthesize a Discriminator enum that is a one-to-one mapping of the target enum's discriminators, but without any associated values, and using a raw String type so it can participate in automatic Codable conformance.
  2. Synthesize the CodingKeys enum necessary to use a decoding container. CodingKeys should start with a .discriminator case which will be used to decode the target enum's discriminator. For each case of the target enum that has associated values, append a case to CodingKeys for each associated value. If an associated value has a label, use the label as the key. If not, use the index of the associated value's position in the list of values (e.g. 2 for the third position). Prefix the names of all the associated value keys with something predictable—perhaps the relevant discriminator—to prevent naming collisions between a CodingKey and any associated value with an otherwise identical label (see the .anotherCase example behind the GitHub Gist link below). Using the relevant discriminator from the target enum as a key prefix also prevents naming collisions between synthesized CodingKeys; the alternative would be to enforce uniquing of the synthesized keys, re-using keys across cases as needed.
  3. Synthesize init(from:), first by decoding the discriminator. Switch on the discriminator, implementing each case so that it attempts to decode only associated values for the relevant case of the target enum. There should already be the required CodingKey cases necessary to perform this decoding. Once all associated values are decoded, if any, decoding is complete.

A more verbose example is available in this GitHub Gist.

Given the following array:

let array: [MyEnum] = [.foo, .bar(label: 37)]

The JSON encoding of array would be this:

[
  {
    "discriminator": "foo"
  },
  {
    "discriminator": "bar",
    "bar_label": 37
  }
]

However, a problem arises for JSON web service use cases: how do you map heterogeneous objects to the right cases of an enum with associated values (EWAV)? In thinking that through, it occurred to me that the only use case I can think of for such an enum is on the client side, against a web service endpoint that returns heterogeneous types (in an array or otherwise), and crucially such endpoints seem to always provide a type hint in each object. If this is a fair generalization, then a synthesized Codable conformance could work in tandem with a hypothetical .discriminatorDecodingStrategy where the developer provides a mapping to the correct enum case… (API for that to be bikeshedded to death)

By "type hints", I mean something like this, from a hypothetical web service that returns your favorited items:

// type hinting in the JSON
[{"object": "album", "id": "abc"},  {"object": "podcast", "id": "xyz"}]

So if: (1) the client developer can provide that mapping relatively painlessly and (2) all associated values have labels and (3) all associated values are Codable, then the compiler could (maybe?) synthesize everything else.
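
To make the type-hint idea a bit more concrete, here is a hand-written sketch of what such a mapping amounts to today. Everything in it other than the "object"/"id" JSON is an assumption for illustration: Album, Podcast, Favorite, TypeHint and HintKey are hypothetical names, and the hint-to-case mapping is the part a developer would presumably supply through something like the strategy mentioned above.

```swift
import Foundation

struct Album: Decodable, Equatable { let id: String }
struct Podcast: Decodable, Equatable { let id: String }

enum Favorite: Decodable, Equatable {
    case album(Album)
    case podcast(Podcast)

    // Hypothetical mapping from the service's type hint to an enum case.
    private enum TypeHint: String, Decodable { case album, podcast }
    private enum HintKey: String, CodingKey { case object }

    init(from decoder: Decoder) throws {
        let hint = try decoder.container(keyedBy: HintKey.self)
            .decode(TypeHint.self, forKey: .object)
        // The payload's fields sit next to the hint, so each payload type
        // is decoded from the same decoder.
        switch hint {
        case .album: self = .album(try Album(from: decoder))
        case .podcast: self = .podcast(try Podcast(from: decoder))
        }
    }
}

let json = #"[{"object": "album", "id": "abc"}, {"object": "podcast", "id": "xyz"}]"#
let favorites = try JSONDecoder().decode([Favorite].self, from: Data(json.utf8))
```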

But perhaps it would be acceptable not to cover those web service edge cases, since there are still other use cases that would benefit from automatic conformance:

  • Storing arbitrary data in Keychain or UserDefaults
  • Storing data locally for later use
  • Sharing peer-to-peer data with other clients that use the same encoded format

I've also seen sum types encoded in JSON as:

[ { "album": { "id": "abc" } }, { "podcast": { "id": "xyz" } } ]

That's the model used for oneof in proto3 and its JSON, at least.
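
As a rough sketch of handling that shape by hand, each case name can double as the coding key that wraps the payload. Item and Media are made-up names for illustration, and the error thrown for an object with neither key is just one possible choice.

```swift
import Foundation

struct Item: Codable, Equatable { let id: String }

enum Media: Codable, Equatable {
    case album(Item)
    case podcast(Item)

    // One key per case; the key that is present selects the case.
    enum CodingKeys: String, CodingKey { case album, podcast }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        if let item = try container.decodeIfPresent(Item.self, forKey: .album) {
            self = .album(item)
        } else if let item = try container.decodeIfPresent(Item.self, forKey: .podcast) {
            self = .podcast(item)
        } else {
            throw DecodingError.dataCorrupted(.init(
                codingPath: decoder.codingPath,
                debugDescription: "Expected one of the keys: album, podcast"))
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .album(let item): try container.encode(item, forKey: .album)
        case .podcast(let item): try container.encode(item, forKey: .podcast)
        }
    }
}

let json = #"[ { "album": { "id": "abc" } }, { "podcast": { "id": "xyz" } } ]"#
let media = try JSONDecoder().decode([Media].self, from: Data(json.utf8))
let roundTripped = try JSONDecoder().decode([Media].self, from: JSONEncoder().encode(media))
```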

As far as cases with multiple associated values go (case foo(Bar, baz: Qux)), I don't see an obvious way to map them, but that might be an acceptable cutoff point because synthesis likewise doesn't handle tuples.

I do find this a (very) worthwhile endeavor, but muddy.


As far as cases with multiple associated values go (case foo(Bar, baz: Qux)), I don’t see an obvious way to map them

I would suggest that automatic Codable conformance rules would be:

  1. All associated values must conform to Codable.
  2. Cases with associated values must either have only one associated value…
  3. …or have a label for every associated value.

With those rules, encoding would be analogous to encoding a struct or a class. Cases with only one associated value could use a sensible default coding key for the associated value, perhaps .associatedValue. Cases with more than one associated value would use the labels as the coding keys.

enum Compatible: Codable {
  case foo
  case bar(String)
  case baz(label: String, everything: [Int])
}

enum Incompatible: Codable {
  case foo
  case bar(String, String)
  case baz(String, everything: [Int])
}
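
Written out by hand, the Compatible enum under those rules might look roughly like the sketch below. The rules above don't say how the case itself is recorded, so this assumes a "discriminator" key as in the original post; the Discriminator enum and the Equatable conformance are likewise assumptions added for illustration.

```swift
import Foundation

enum Compatible: Codable, Equatable {
    case foo
    case bar(String)
    case baz(label: String, everything: [Int])

    enum Discriminator: String, Codable { case foo, bar, baz }

    enum CodingKeys: String, CodingKey {
        case discriminator
        case associatedValue    // default key for a lone unlabeled value
        case label, everything  // labels double as coding keys
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        switch try c.decode(Discriminator.self, forKey: .discriminator) {
        case .foo:
            self = .foo
        case .bar:
            self = .bar(try c.decode(String.self, forKey: .associatedValue))
        case .baz:
            self = .baz(label: try c.decode(String.self, forKey: .label),
                        everything: try c.decode([Int].self, forKey: .everything))
        }
    }

    func encode(to encoder: Encoder) throws {
        var c = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .foo:
            try c.encode(Discriminator.foo, forKey: .discriminator)
        case .bar(let value):
            try c.encode(Discriminator.bar, forKey: .discriminator)
            try c.encode(value, forKey: .associatedValue)
        case .baz(let label, let everything):
            try c.encode(Discriminator.baz, forKey: .discriminator)
            try c.encode(label, forKey: .label)
            try c.encode(everything, forKey: .everything)
        }
    }
}

let samples: [Compatible] = [.foo, .bar("hi"), .baz(label: "x", everything: [1, 2])]
let restored = try JSONDecoder().decode([Compatible].self, from: JSONEncoder().encode(samples))
```

Note that sharing one flat key space means two cases could not use the same label with different types without some prefixing, which is the collision the original post's key prefix was meant to avoid.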

Is this something that would be considered at all? I also think it's a great idea and would like to see this built into Swift.


Might be cleaner to model it something like this:

[
  {
    "discriminator": "foo"
  },
  {
    "discriminator": "bar",
    "values": [
      {"label": 37}
    ]
  }
]

Then you could do:

// case baz(String, everything: [Int])
{
  "discriminator": "baz",
  "values": [
    "foo",
    {"everything": [37, 42]}
  ]
}

or even more explicitly:

// case baz(String, everything: [Int])
{
  "discriminator": "baz",
  "values": [
    {
      "value": "foo"
    },
    {
      "label": "everything",
      "value": [37, 42]
    }
  ]
}
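
For what it's worth, the first of those two layouts (unlabeled values bare, labeled values wrapped in a single-key object) can be produced with nested containers. This is an encode-only sketch under that assumption; Example and BazKeys are hypothetical names, and .sortedKeys is set only to make the output order predictable.

```swift
import Foundation

enum Example: Encodable {
    case foo
    case baz(String, everything: [Int])

    enum CodingKeys: String, CodingKey { case discriminator, values }
    enum BazKeys: String, CodingKey { case everything }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .foo:
            try container.encode("foo", forKey: .discriminator)
        case .baz(let first, let everything):
            try container.encode("baz", forKey: .discriminator)
            // Associated values go into an array, in declaration order.
            var values = container.nestedUnkeyedContainer(forKey: .values)
            // Unlabeled value: encoded bare.
            try values.encode(first)
            // Labeled value: wrapped in an object keyed by its label.
            var labeled = values.nestedContainer(keyedBy: BazKeys.self)
            try labeled.encode(everything, forKey: .everything)
        }
    }
}

let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let data = try encoder.encode(Example.baz("foo", everything: [37, 42]))
let text = String(decoding: data, as: UTF8.self)
print(text)
```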

Before designing support for this we should decide whether the design is intended to primarily support persistence controlled by Swift code or whether it is intended to work with server APIs. We will of course never be able to support all server formats, but we might be able to support a substantial subset if we support a few different encoding strategies instead of just one.

Has anyone done any analysis of how APIs tend to encode values that can be modeled as an enum with associated values in Swift? Specifically, I am interested in whether there are common patterns used in the wild or not.


I think part of the problem you'll find taking that approach is that there are so many fundamentally different reasons to use enums with associated values in Swift... and those various orthogonal situations will by necessity be modeled in different ways by different service architectures that don't support enums the way Swift does.


I somehow missed this topic back when it started up, but this largely summarizes what I want to say on the topic. This is something I've myself wanted to address, but we haven't gotten the chance to model this space enough.

The proposal here explicitly models discriminators in the discussion, but in fact, more often than not, I find that a discriminator would not be used:

From my experience, I find the opposite to be true. In fact, the most common case I've seen requiring enums here is when a discriminator is not present — in the case of heterogeneous arrays of types which would best be described with anonymous union types (e.g. String | Double | Int). In the case of getting [3.14, "hello", 1], there's no discriminator to work with, and this is pretty common. The best solution is to attempt decoding from a single-value container as one of the above types and catching a type mismatch:

enum StringOrDoubleOrInt {
    case string(String)
    case double(Double)
    case int(Int)

    public func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .string(let string): try container.encode(string)
        case .double(let double): try container.encode(double)
        case .int(let int): try container.encode(int)
        }
    }

    // init(from:) which does the same as the above ^ — get a singleValueContainer and attempt to decode
}
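
A minimal sketch of that decoding side, for illustration. The order of attempts here is an assumption: Int is tried before Double because a JSON number such as 1 would also decode successfully as a Double. Note also that try? swallows every error, not only type mismatches, which a more careful implementation might want to distinguish. Only Decodable is shown; the encode(to:) above would complete the picture.

```swift
import Foundation

enum StringOrDoubleOrInt: Decodable, Equatable {
    case string(String)
    case double(Double)
    case int(Int)

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Attempt each payload type in turn; the first success wins.
        if let value = try? container.decode(String.self) {
            self = .string(value)
        } else if let value = try? container.decode(Int.self) {
            self = .int(value)
        } else if let value = try? container.decode(Double.self) {
            self = .double(value)
        } else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected a String, Int, or Double")
        }
    }
}

let json = #"[3.14, "hello", 1]"#
let values = try JSONDecoder().decode([StringOrDoubleOrInt].self, from: Data(json.utf8))
```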

Where I've seen discriminators most often is in enums with complex objects as payloads, where it's not immediately clear at a type-level to distinguish between these types.

Given these two conflicting use cases, I'd be interested in trying to find an approach which can cleanly handle both needs without relying on something implicit like the naming or labeling of cases.

(There's also the issue of what order to attempt to decode these non-discriminator-based types, but I think we can explore that in the same vein.)

This may well be true and if so we would want to focus on persistence that is controlled by Swift code. My point is that we should make that decision consciously, and we may want to look at how complex it would be to support common API encodings before making that decision.


One use of enum with a discriminator is with coding XDR. See section 4.15 - Discriminated Unions. I model these using enums with associated values.

An example:

public enum LedgerEntryChange: XDRDecodable {
    case LEDGER_ENTRY_CREATED (LedgerEntry)
    case LEDGER_ENTRY_UPDATED (LedgerEntry)
    case LEDGER_ENTRY_REMOVED (LedgerEntry)
    case LEDGER_ENTRY_STATE (LedgerEntry)

    public init(from decoder: XDRDecoder) throws {
        let discriminant = try decoder.decode(Int32.self)

        switch discriminant {
        case LedgerEntryChangeType.LEDGER_ENTRY_CREATED:
            self = .LEDGER_ENTRY_CREATED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_UPDATED:
            self = .LEDGER_ENTRY_UPDATED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_REMOVED:
            self = .LEDGER_ENTRY_REMOVED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_STATE:
            self = .LEDGER_ENTRY_STATE(try decoder.decode(LedgerEntry.self))
        default:
            fatalError("Unrecognized change type: \(discriminant)")
        }
    }
}

Apologies, I wrote the above while in a bit of a hurry, so let me clarify: I don't think one use case is more important than another; I think we should find a solution which is able to synthesize code both for the use cases that want a discriminator and for those that do not.

For instance, one straw-man approach to this is to generate both implementations and allow someone to use either approach via a property, e.g.

enum SynthesizedEnumDiscriminatorType {
    // No discriminator -- attempt each case in order, using a single-value container
    case none

    // Use the case name as the value-containing key, e.g.
    // {"album": {"id": "abc"}} or {"podcast": {"id": "xyz"}}
    case discriminateByCaseName

    // Use the case name as a discriminator keyed by the given key, e.g.
    // with discriminateByCaseNameForKey(CodingKeys.object), {"object": "album", "id": "abc"}
    case discriminateByCaseNameForKey(CodingKey)
}

enum MyEnum {
    case string(String)
    case double(Double)

    // Synthesized
    static var codableDiscriminatorType: SynthesizedEnumDiscriminatorType {
        return .none
    }

    init(from decoder: Decoder) throws {
        switch MyEnum.codableDiscriminatorType {
            // ... use the strategies
        }
    }
}

Essentially, add another customization point by using a property similar to how CodingKeys influences the representation of a structured type. If you want to override the discriminator strategy used, provide a discriminator type via the enum (you can even make it mutable if we decide to generate all implementations and choose dynamically).


Of course, there are design issues with the above, but what I'm trying to get at is that it's possible to come up with solutions that work across multiple use cases.


No worries. I just thought that a concrete use-case might help. It's also sometimes useful to remember that not every format is JSON or a plist 🙂.

FWIW, I have seen a number of different configurations for the discriminator and value in practice.

For example, an "out of line" discriminator:

{
    // the key for the discriminator and value can vary depending on the design of the API
    "type": "discriminator",
    // any valid json value that decodes to the associated value for the case described by discriminator
    "value": 42
}

An inline discriminator (alongside properties of the type that decodes to the associated value):

{
    // the key for the discriminator can vary based on the design of the API
    "type": "discriminator",
    "property1": "value",
    "property2": "value"
}

A discriminator key:

{
    "discriminator": "value",
    "otherProperty": "other value"
}

In this last design, there is one optional property per enum case included in the API response. The name of the property that stores the enum value is elided in favor of one of the case names. For example, we might have:

enum StringOrInt {
    case string(String)
    case int(Int)
}
struct Foo {
    var name: String
    var value: StringOrInt
}

An example JSON response for this might be:

{
    "name": "George",
    "string": "a string value"
}

That would decode to a value equal to:

Foo(name: "George", value: .string("a string value"))
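
Decoding that last design by hand could look something like the following sketch. The containing type does the work, looking for one optional key per enum case; the explicit memberwise initializer and the error for a missing value are assumptions added so the example is self-contained.

```swift
import Foundation

enum StringOrInt: Equatable {
    case string(String)
    case int(Int)
}

struct Foo: Decodable, Equatable {
    var name: String
    var value: StringOrInt

    // "name" plus one optional key per StringOrInt case.
    enum CodingKeys: String, CodingKey { case name, string, int }

    init(name: String, value: StringOrInt) {
        self.name = name
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        // Whichever case-named key is present determines the enum case.
        if let string = try container.decodeIfPresent(String.self, forKey: .string) {
            value = .string(string)
        } else if let int = try container.decodeIfPresent(Int.self, forKey: .int) {
            value = .int(int)
        } else {
            throw DecodingError.dataCorrupted(.init(
                codingPath: decoder.codingPath,
                debugDescription: "Expected one of the keys: string, int"))
        }
    }
}

let json = #"{"name": "George", "string": "a string value"}"#
let foo = try JSONDecoder().decode(Foo.self, from: Data(json.utf8))
```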

Here's a possible solution that will be able to cover all cases, as opposed to a few hard-coded ones in SynthesizedEnumDiscriminatorType: (sorry for ugly/inconsistent parameter order and naming)

protocol SynthesisedEnumDiscriminatorEncoder {
    // special case, effectively "single value container"
    static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ value: AnyEncodable) throws
    // "keyed container"
    static func encodeCase<CaseKey: CodingKey, ValueKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ values: [ValueKey : AnyEncodable]) throws
    // maybe an "unkeyed container" as well?
    // static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ values: [AnyEncodable]) throws
}
protocol SynthesisedEnumDiscriminatorDecoder {
    // first decode the discriminator to figure out what to do
    static func decodeCase<CaseKey: CodingKey>(with decoder: Decoder, ofType type: CaseKey.Type) throws -> CaseKey
    // decode from "single value container"
    static func decodeCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase case: CaseKey, ofType type: T.Type) throws -> T
    // decode an associated value by key
    static func decodeCaseValue<CaseKey: CodingKey, ValueKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase case: CaseKey, forKey: ValueKey, ofType type: T.Type) throws -> T
    // "unkeyed container" would require a mutating function to store implicit state, which isn't possible
    // mutating static func decodeNextCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase case: CaseKey, ofType type: T.Type) throws -> T
}

(note: I'm pretending AnyEncodable exists here to simplify the encoding API, but it'd be possible to come up with a similar thing that doesn't need it — variadic generics might help here)
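
For readers who want something to experiment with, a type-erasing wrapper along these lines is one common way to write such an AnyEncodable. This is only a sketch of the idea, not a type that exists in the standard library.

```swift
import Foundation

// Hypothetical type eraser: captures the wrapped value's encode(to:)
// in a closure so values of different Encodable types can be mixed.
struct AnyEncodable: Encodable {
    private let encodeValue: (Encoder) throws -> Void

    init<T: Encodable>(_ value: T) {
        encodeValue = { encoder in try value.encode(to: encoder) }
    }

    func encode(to encoder: Encoder) throws {
        try encodeValue(encoder)
    }
}

let mixed = [AnyEncodable(1), AnyEncodable("two")]
let text = String(decoding: try JSONEncoder().encode(mixed), as: UTF8.self)
print(text)
```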

Each "strategy" would then implement these protocols. Here's an example for an "out of line" discriminator:

struct SynthesisedEnumOutOfLineDiscriminatorCoder: SynthesisedEnumDiscriminatorEncoder, SynthesisedEnumDiscriminatorDecoder {

    enum CodingKeys: CodingKey {
        case type
        case value
    }
    
    static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ value: AnyEncodable) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name.stringValue, forKey: .type)
        try container.encode(value, forKey: .value)
    }

    static func encodeCase<CaseKey: CodingKey, ValueKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ values: [ValueKey : AnyEncodable]) throws {
        // similarly to above
    }

    static func decodeCase<CaseKey: CodingKey>(with decoder: Decoder, ofType type: CaseKey.Type) throws -> CaseKey {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        guard let name = type.init(stringValue: try container.decode(String.self, forKey: .type)) else { fatalError() }
        return name
    }
    
    static func decodeCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase case: CaseKey, ofType type: T.Type) throws -> T {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        return try container.decode(T.self, forKey: .value)
    }
    
    static func decodeCaseValue<CaseKey: CodingKey, ValueKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase case: CaseKey, forKey: ValueKey, ofType type: T.Type) throws -> T {
        // similarly to above
    }

}

Then a type that wants to use a strategy has to do a whole lot less work:

enum IntOrString: Codable {
    case int(Int)
    case string(String)
    
    typealias SynthesisedDiscriminatorEncoder = SynthesisedEnumOutOfLineDiscriminatorCoder
    typealias SynthesisedDiscriminatorDecoder = SynthesisedEnumOutOfLineDiscriminatorCoder
    
    // SYNTHESISED STUFF BELOW
    
    enum Discriminator: String, CodingKey {
        case int
        case string
    }
    
    init(from decoder: Decoder) throws {
        let discriminator = try SynthesisedDiscriminatorDecoder.decodeCase(with: decoder, ofType: Discriminator.self)
        switch discriminator {
        case .int:
            // the associated value for this case is exactly one, unlabelled Int, the "single value" function is used
            self = .int(try SynthesisedDiscriminatorDecoder.decodeCaseValue(with: decoder, forCase: discriminator, ofType: Int.self))
        case .string:
            self = .string(try SynthesisedDiscriminatorDecoder.decodeCaseValue(with: decoder, forCase: discriminator, ofType: String.self))
        }
    }
    
    func encode(to encoder: Encoder) throws {
        switch self {
        case .int(let value):
            try SynthesisedDiscriminatorEncoder.encodeCase(with: encoder, Discriminator.int, AnyEncodable(value))
        case .string(let value):
            try SynthesisedDiscriminatorEncoder.encodeCase(with: encoder, Discriminator.string, AnyEncodable(value))
        }
    }
}

Two simple typealiases, and boom, you have synthesised Codable conformance.

Of course, if you want to use a custom strategy, the amount of code to implement a new strategy is probably a bit more than just implementing Codable directly, but the idea here is that the Swift standard library would provide implementations for several common ones 😛

cc @anandabits @itaiferber

Am I missing something or in all the possible solutions you consider only one associated value per enum case?

Did this just die out? If it did for lack of consensus, why is it necessary to boil the ocean of handling infinite variations of server API when just persistence would be so useful? If we have to relate to servers, let there be an escape hatch for manual conformance as now required!


Is anybody interested in resurrecting this? Would be helpful for a few projects we have in the works.


Sourcery supports AutoCodable for enums.
