Automatic Codable conformance for enums with associated values that themselves conform to Codable

However, a problem arises for JSON web service use cases: how do you map heterogeneous objects to the right case of an enum with associated values (EWAV)? Thinking that through, the only use case I can come up with for such an enum is on the client side, against a web service endpoint that returns heterogeneous types (in an array or otherwise), and crucially such endpoints seem to always provide a type hint in each object. If this is a fair generalization, then a synthesized Codable conformance could work in tandem with a hypothetical .discriminatorDecodingStrategy where the developer provides a mapping to the correct enum case… (API for that to be bikeshedded to death)

By "type hints", I mean something like this, from a hypothetical web service that returns your favorited items:

// type hinting in the JSON
[{"object": "album", "id": "abc"},  {"object": "podcast", "id": "xyz"}]

So if: (1) the client developer can provide that mapping relatively painlessly and (2) all associated values have labels and (3) all associated values are Codable, then the compiler could (maybe?) synthesize everything else.
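To make the type-hint idea concrete, here is a minimal hand-written sketch of the kind of code a synthesized conformance would have to produce for the favorites JSON above. The `Album`, `Podcast`, and `Favorite` types and the `HintKey` enum are hypothetical names chosen for illustration; only the `"object"` key and the sample JSON come from the example.

```swift
import Foundation

// Hypothetical payload types for the favorites example.
struct Album: Decodable, Equatable { let id: String }
struct Podcast: Decodable, Equatable { let id: String }

enum Favorite: Decodable, Equatable {
    case album(Album)
    case podcast(Podcast)

    // The key that carries the type hint in each JSON object.
    private enum HintKey: String, CodingKey { case object }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: HintKey.self)
        let hint = try container.decode(String.self, forKey: .object)
        switch hint {
        case "album":
            // The payload sits beside the hint, so decode it from the same decoder.
            self = .album(try Album(from: decoder))
        case "podcast":
            self = .podcast(try Podcast(from: decoder))
        default:
            throw DecodingError.dataCorruptedError(
                forKey: .object, in: container,
                debugDescription: "Unknown type hint: \(hint)")
        }
    }
}

let json = #"[{"object": "album", "id": "abc"}, {"object": "podcast", "id": "xyz"}]"#
let favorites = try JSONDecoder().decode([Favorite].self, from: Data(json.utf8))
```

The mapping from hint string to case is the only part that really needs developer input; everything else is mechanical.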

But perhaps it would be acceptable not to cover those web service edge cases, since there are still other use cases that would benefit from automatic conformance:

  • Storing arbitrary data in Keychain or UserDefaults
  • Storing data locally for later use
  • Sharing peer-to-peer data with other clients that use the same encoded format

I've also seen sum types encoded in JSON as:

[ { "album": { "id": "abc" } }, { "podcast": { "id": "xyz" } } ]

That's the model used for oneof in proto3 and its JSON, at least.
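As a rough sketch of how that layout could be handled by hand, the example below uses the case name as the single key of each object. The `Item` and `Media` types and the `CaseKey` enum are made-up names for illustration, not part of any proposal.

```swift
import Foundation

// Hypothetical payload type shared by both cases.
struct Item: Codable, Equatable { let id: String }

enum Media: Codable, Equatable {
    case album(Item)
    case podcast(Item)

    // One coding key per case; exactly one is expected in each object.
    private enum CaseKey: String, CodingKey { case album, podcast }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CaseKey.self)
        guard container.allKeys.count == 1, let key = container.allKeys.first else {
            throw DecodingError.dataCorrupted(.init(
                codingPath: decoder.codingPath,
                debugDescription: "Expected exactly one case key"))
        }
        switch key {
        case .album: self = .album(try container.decode(Item.self, forKey: key))
        case .podcast: self = .podcast(try container.decode(Item.self, forKey: key))
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CaseKey.self)
        switch self {
        case .album(let item): try container.encode(item, forKey: .album)
        case .podcast(let item): try container.encode(item, forKey: .podcast)
        }
    }
}

let json = #"[{"album": {"id": "abc"}}, {"podcast": {"id": "xyz"}}]"#
let media = try JSONDecoder().decode([Media].self, from: Data(json.utf8))
```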

As far as cases with multiple associated values go (case foo(Bar, baz: Qux)), I don't see an obvious way to map them, but that might be an acceptable cutoff point because synthesis likewise doesn't handle tuples.

I do find this a (very) worthwhile endeavor, but muddy.


As far as cases with multiple associated values go (case foo(Bar, baz: Qux)), I don’t see an obvious way to map them

I would suggest that automatic Codable conformance rules would be:

  1. All associated values must conform to Codable.
  2. Cases with associated values must either have only one associated value…
  3. or have a label for every associated value.

With those rules, encoding would be analogous to encoding a struct or a class. Cases with only one associated value could use a sensible default coding key for the associated value, perhaps .associatedValue. Cases with more than one associated value would use the labels as the coding keys.

enum Compatible: Codable {
  case foo
  case bar(String)
  case baz(label: String, everything: [Int])
}

enum Incompatible: Codable {
  case foo
  case bar(String, String)
  case baz(String, everything: [Int])
}
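For illustration, this is roughly what a hand-written equivalent for Compatible could look like under those rules. The rules do not say how the case itself is identified, so the `caseName` key here is purely an assumption of this sketch, as is the private `Keys` enum; `associatedValue` follows the default key suggested above.

```swift
import Foundation

enum Compatible: Codable, Equatable {
    case foo
    case bar(String)
    case baz(label: String, everything: [Int])

    // Assumed layout: "caseName" identifies the case, payload keys sit beside it.
    private enum Keys: String, CodingKey {
        case caseName, associatedValue, label, everything
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: Keys.self)
        let name = try c.decode(String.self, forKey: .caseName)
        switch name {
        case "foo":
            self = .foo
        case "bar":
            // Single unlabelled value: stored under the default key.
            self = .bar(try c.decode(String.self, forKey: .associatedValue))
        case "baz":
            // Several values: each label doubles as its coding key.
            self = .baz(label: try c.decode(String.self, forKey: .label),
                        everything: try c.decode([Int].self, forKey: .everything))
        default:
            throw DecodingError.dataCorruptedError(
                forKey: .caseName, in: c, debugDescription: "Unknown case: \(name)")
        }
    }

    func encode(to encoder: Encoder) throws {
        var c = encoder.container(keyedBy: Keys.self)
        switch self {
        case .foo:
            try c.encode("foo", forKey: .caseName)
        case .bar(let value):
            try c.encode("bar", forKey: .caseName)
            try c.encode(value, forKey: .associatedValue)
        case .baz(let label, let everything):
            try c.encode("baz", forKey: .caseName)
            try c.encode(label, forKey: .label)
            try c.encode(everything, forKey: .everything)
        }
    }
}

let original: [Compatible] = [.foo, .bar("hi"), .baz(label: "x", everything: [1, 2])]
let data = try JSONEncoder().encode(original)
let decoded = try JSONDecoder().decode([Compatible].self, from: data)
```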

Is this something that would be considered at all? I also think it's a great idea and would like to see this built into Swift.


Might be cleaner to model it something like this:

[
  {
    "discriminator": "foo"
  },
  {
    "discriminator": "bar",
    "values": [
      {"label": 37}
    ]
  }
]

Then you could do:

// case baz(String, everything: [Int])
{
  "discriminator": "baz",
  "values": [
    "foo",
    {"everything": [37, 42]}
  ]
}

or even more explicitly:

// case baz(String, everything: [Int])
{
  "discriminator": "baz",
  "values": [
    {
      "value": "foo"
    },
    {
      "label": "everything",
      "value": [37, 42]
    }
  ]
}
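A decoder for the first of these layouts (positional values, with labelled values wrapped in a single-key object) might be sketched as follows. The `Example` enum and the `Keys` and `BazLabels` key types are hypothetical; the JSON keys `discriminator`, `values`, and `everything` are taken from the samples above.

```swift
import Foundation

enum Example: Decodable, Equatable {
    case foo
    case baz(String, everything: [Int])

    private enum Keys: String, CodingKey { case discriminator, values }
    private enum BazLabels: String, CodingKey { case everything }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: Keys.self)
        let name = try container.decode(String.self, forKey: .discriminator)
        switch name {
        case "foo":
            // No associated values, so "values" may be absent.
            self = .foo
        case "baz":
            // Associated values are read positionally from the "values" array.
            var values = try container.nestedUnkeyedContainer(forKey: .values)
            let first = try values.decode(String.self)
            // A labelled value is wrapped in an object keyed by its label.
            let labelled = try values.nestedContainer(keyedBy: BazLabels.self)
            let everything = try labelled.decode([Int].self, forKey: .everything)
            self = .baz(first, everything: everything)
        default:
            throw DecodingError.dataCorruptedError(
                forKey: .discriminator, in: container,
                debugDescription: "Unknown case: \(name)")
        }
    }
}

let json = #"""
[{"discriminator": "foo"},
 {"discriminator": "baz", "values": ["foo", {"everything": [37, 42]}]}]
"""#
let decoded = try JSONDecoder().decode([Example].self, from: Data(json.utf8))
```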

Before designing support for this we should decide whether the design is intended to primarily support persistence controlled by Swift code or whether it is intended to work with server APIs. We will of course never be able to support all server formats, but we might be able to support a substantial subset if we support a few different encoding strategies instead of just one.

Has anyone done any analysis of how APIs tend to encode values that can be modeled as an enum with associated values in Swift? Specifically, I am interested in whether there are common patterns used in the wild or not.


I think part of the problem you'll find taking that approach is that there are so many fundamentally different reasons to use enums with associated values in Swift... and those various orthogonal situations will by necessity be modeled in different ways by different service architectures that don't support enums the way Swift does.


I somehow missed this topic back when it started up, but this largely summarizes what I want to say on the topic. This is something I've myself wanted to address, but we haven't gotten the chance to model this space enough.

The proposal here explicitly models discriminators in the discussion, but in fact, more often than not, I find that a discriminator would not be used:

From my experience, I find the opposite to be true. In fact, the most common case I've seen requiring enums here is when a discriminator is not present — in the case of heterogeneous arrays of types which would best be described with anonymous union types (e.g. String | Double | Int). In the case of getting [3.14, "hello", 1], there's no discriminator to work with, and this is pretty common. The best solution is to attempt decoding from a single-value container as one of the above types and catching a type mismatch:

enum StringOrDoubleOrInt {
    case string(String)
    case double(Double)
    case int(Int)

    public func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .string(let string): try container.encode(string)
        case .double(let double): try container.encode(double)
        case .int(let int): try container.encode(int)
        }
    }

    // init(from:) which does the same as the above ^ — get a singleValueContainer and attempt to decode
}

Where I've seen discriminators most often is in enums with complex objects as payloads, where it's not immediately clear at a type-level to distinguish between these types.

Given these two conflicting use cases, I'd be interested in trying to find an approach which can cleanly handle both needs without relying on something implicit like the naming or labeling of cases.

(There's also the issue of what order to attempt to decode these non-discriminator-based types, but I think we can explore that along the same vein.)
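To illustrate why the order of attempts matters, here is a minimal sketch of a decoding initializer for the no-discriminator case. The `Scalar` name is made up for this example. It uses `try?` rather than catching one specific error, on the assumption that the exact error a decoder throws for a failed attempt can vary.

```swift
import Foundation

enum Scalar: Decodable, Equatable {
    case string(String)
    case double(Double)
    case int(Int)

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Int is tried before Double: with JSONDecoder, the JSON number 1 also
        // decodes as a Double, so the opposite order would never yield .int.
        if let value = try? container.decode(Int.self) {
            self = .int(value)
        } else if let value = try? container.decode(Double.self) {
            self = .double(value)
        } else if let value = try? container.decode(String.self) {
            self = .string(value)
        } else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected a String, Double, or Int")
        }
    }
}

let json = #"[3.14, "hello", 1]"#
let values = try JSONDecoder().decode([Scalar].self, from: Data(json.utf8))
```

Any synthesized version would need a documented, predictable order, for example declaration order, which is one reason this is harder to automate than the discriminator-based layouts.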

This may well be true and if so we would want to focus on persistence that is controlled by Swift code. My point is that we should make that decision consciously, and we may want to look at how complex it would be to support common API encodings before making that decision.


One use of enum with a discriminator is with coding XDR. See section 4.15 - Discriminated Unions. I model these using enums with associated values.

An example:

public enum LedgerEntryChange: XDRDecodable {
    case LEDGER_ENTRY_CREATED (LedgerEntry)
    case LEDGER_ENTRY_UPDATED (LedgerEntry)
    case LEDGER_ENTRY_REMOVED (LedgerEntry)
    case LEDGER_ENTRY_STATE (LedgerEntry)

    public init(from decoder: XDRDecoder) throws {
        let discriminant = try decoder.decode(Int32.self)

        switch discriminant {
        case LedgerEntryChangeType.LEDGER_ENTRY_CREATED:
            self = .LEDGER_ENTRY_CREATED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_UPDATED:
            self = .LEDGER_ENTRY_UPDATED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_REMOVED:
            self = .LEDGER_ENTRY_REMOVED(try decoder.decode(LedgerEntry.self))
        case LedgerEntryChangeType.LEDGER_ENTRY_STATE:
            self = .LEDGER_ENTRY_STATE(try decoder.decode(LedgerEntry.self))
        default:
            fatalError("Unrecognized change type: \(discriminant)")
        }
    }
}

Apologies, I wrote the above while in a bit of a hurry, so let me clarify: I don't think one use case is more important than another; I think we should find a solution which is able to synthesize code both for use cases which want a discriminator and for those which do not.

For instance, one straw-man approach to this is to generate both implementations and allow someone to use either approach via a property, e.g.

enum SynthesizedEnumDiscriminatorType {
    // No discriminator -- attempt cases in order single-value
    case none

    // Use the case name as the value-containing key, e.g.
    // {"album": {"id": "abc"}} or {"podcast": {"id": "xyz"}}
    case discriminateByCaseName

    // Use the case name as a discriminator keyed by the given key, e.g.
    // with discriminateByCaseNameForKey(CodingKeys.object), {"object": "album", "id": "abc"}
    case discriminateByCaseNameForKey(CodingKey)
}

enum MyEnum {
    case string(String)
    case double(Double)

    // Synthesized
    static var codableDiscriminatorType: SynthesizedEnumDiscriminatorType {
        return .none
    }

    init(from decoder: Decoder) throws {
        switch MyEnum.codableDiscriminatorType {
            // ... use the strategies
        }
    }
}

Essentially, add another customization point by using a property similar to how CodingKeys influences the representation of a structured type. If you want to override the discriminator strategy used, provide a discriminator type via the enum (you can even make it mutable if we decide to generate all implementations and choose dynamically).


Of course, there are design issues with the above, but what I'm trying to get at is that it's possible to come up with solutions that work across multiple use cases.


No worries. I just thought that a concrete use case might help. It's also sometimes useful to remember that not every format is JSON or a plist.

FWIW, I have seen a number of different configurations for the discriminator and value in practice.

For example, an "out of line" discriminator:

{
    // the key for the discriminator and value can vary depending on the design of the API
    "type": "discriminator",
    // any valid json value that decodes to the associated value for the case described by discriminator
    "value": 42
}

An inline discriminator (alongside properties of the type that decodes to the associated value):

{
    // the key for the discriminator can vary based on the design of the API
    "type": "discriminator",
    "property1": "value",
    "property2": "value"
}

A discriminator key:

{
    "discriminator": "value",
    "otherProperty": "other value"
}

In this last design, there is one optional property per enum case included in the API response. The name of the property that stores the enum value is elided in favor of one of the case names. For example, we might have:

enum StringOrInt {
    case string(String)
    case int(Int)
}
struct Foo {
    var name: String
    var value: StringOrInt
}

An example JSON response for this might be:

{
    "name": "George",
    "string": "a string value"
}

That would decode to a value equal to:

Foo(name: "George", value: .string("a string value"))
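A hand-written decoder for this last, flattened layout could be sketched like this. The private `Keys` enum and the memberwise initializer are additions for the sake of a self-contained example; the key names `string` and `int` are simply the enum's case names, as described above.

```swift
import Foundation

enum StringOrInt: Equatable {
    case string(String)
    case int(Int)
}

struct Foo: Decodable, Equatable {
    var name: String
    var value: StringOrInt

    // "string" and "int" are the case names, used directly as keys of Foo's object.
    private enum Keys: String, CodingKey { case name, string, int }

    init(name: String, value: StringOrInt) {
        self.name = name
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: Keys.self)
        name = try container.decode(String.self, forKey: .name)
        // Whichever case key is present decides the enum case.
        if let string = try container.decodeIfPresent(String.self, forKey: .string) {
            value = .string(string)
        } else if let int = try container.decodeIfPresent(Int.self, forKey: .int) {
            value = .int(int)
        } else {
            throw DecodingError.dataCorrupted(.init(
                codingPath: decoder.codingPath,
                debugDescription: "Expected a \"string\" or \"int\" key"))
        }
    }
}

let json = #"{"name": "George", "string": "a string value"}"#
let foo = try JSONDecoder().decode(Foo.self, from: Data(json.utf8))
```

Note that the knowledge about the enum's cases leaks into the containing type here, which is part of what makes this layout awkward to synthesize on the enum alone.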

Here's a possible solution that will be able to cover all cases, as opposed to a few hard-coded ones in SynthesizedEnumDiscriminatorType: (sorry for ugly/inconsistent parameter order and naming)

protocol SynthesisedEnumDiscriminatorEncoder {
    // special case, effectively "single value container"
    static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ value: AnyEncodable) throws
    // "keyed container"
    static func encodeCase<CaseKey: CodingKey, ValueKey: CodingKey & Hashable>(with encoder: Encoder, _ name: CaseKey, _ values: [ValueKey : AnyEncodable]) throws
    // maybe an "unkeyed container" as well?
    // static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ values: [AnyEncodable]) throws
}
protocol SynthesisedEnumDiscriminatorDecoder {
    // first decode the discriminator to figure out what to do
    static func decodeCase<CaseKey: CodingKey>(with decoder: Decoder, ofType type: CaseKey.Type) throws -> CaseKey
    // decode from "single value container"
    static func decodeCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase `case`: CaseKey, ofType type: T.Type) throws -> T
    // decode an associated value by key
    static func decodeCaseValue<CaseKey: CodingKey, ValueKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase `case`: CaseKey, forKey: ValueKey, ofType type: T.Type) throws -> T
    // "unkeyed container" would require a mutating function to store implicit state, which isn't possible
    // mutating static func decodeNextCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase `case`: CaseKey, ofType type: T.Type) throws -> T
}

(note: I'm pretending AnyEncodable exists here to simplify the encoding API, but it'd be possible to come up with a similar thing that doesn't need it — variadic generics might help here)

Each "strategy" would then implement these protocols. Here's an example for an "out of line" discriminator:

struct SynthesisedEnumOutOfLineDiscriminatorCoder: SynthesisedEnumDiscriminatorEncoder, SynthesisedEnumDiscriminatorDecoder {

    enum CodingKeys: CodingKey {
        case type
        case value
    }
    
    static func encodeCase<CaseKey: CodingKey>(with encoder: Encoder, _ name: CaseKey, _ value: AnyEncodable) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name.stringValue, forKey: .type)
        try container.encode(value, forKey: .value)
    }

    static func encodeCase<CaseKey: CodingKey, ValueKey: CodingKey & Hashable>(with encoder: Encoder, _ name: CaseKey, _ values: [ValueKey : AnyEncodable]) throws {
        // similarly to above
    }

    static func decodeCase<CaseKey: CodingKey>(with decoder: Decoder, ofType type: CaseKey.Type) throws -> CaseKey {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        guard let name = type.init(stringValue: try container.decode(String.self, forKey: .type)) else { fatalError() }
        return name
    }
    
    static func decodeCaseValue<CaseKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase `case`: CaseKey, ofType type: T.Type) throws -> T {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        return try container.decode(T.self, forKey: .value)
    }
    
    static func decodeCaseValue<CaseKey: CodingKey, ValueKey: CodingKey, T: Decodable>(with decoder: Decoder, forCase `case`: CaseKey, forKey: ValueKey, ofType type: T.Type) throws -> T {
        // similarly to above
    }

}

Then a type that wants to use a strategy has to do a whole lot less work:

enum IntOrString: Codable {
    case int(Int)
    case string(String)
    
    typealias SynthesisedDiscriminatorEncoder = SynthesisedEnumOutOfLineDiscriminatorCoder
    typealias SynthesisedDiscriminatorDecoder = SynthesisedEnumOutOfLineDiscriminatorCoder
    
    // SYNTHESISED STUFF BELOW
    
    enum Discriminator: String, CodingKey {
        case int
        case string
    }
    
    init(from decoder: Decoder) throws {
        let discriminator = try SynthesisedDiscriminatorDecoder.decodeCase(with: decoder, ofType: Discriminator.self)
        switch discriminator {
        case .int:
            // the associated value for this case is exactly one, unlabelled Int, the "single value" function is used
            self = .int(try SynthesisedDiscriminatorDecoder.decodeCaseValue(with: decoder, forCase: discriminator, ofType: Int.self))
        case .string:
            self = .string(try SynthesisedDiscriminatorDecoder.decodeCaseValue(with: decoder, forCase: discriminator, ofType: String.self))
        }
    }
    
    func encode(to encoder: Encoder) throws {
        switch self {
        case .int(let value):
            try SynthesisedDiscriminatorEncoder.encodeCase(with: encoder, Discriminator.int, AnyEncodable(value))
        case .string(let value):
            try SynthesisedDiscriminatorEncoder.encodeCase(with: encoder, Discriminator.string, AnyEncodable(value))
        }
    }
}

Two simple typealiases, and boom, you have synthesised Codable conformance.

Of course, if you want to use a custom strategy, the amount of code to implement a new strategy is probably a bit more than just implementing Codable directly, but the idea here is that the Swift standard library would provide implementations for several common ones.

cc @anandabits @itaiferber

Am I missing something, or do all the possible solutions consider only one associated value per enum case?

Did this just die out? If it did for lack of consensus, why is it necessary to boil the ocean of handling infinite variations of server API when just persistence would be so useful? If we have to relate to servers, let there be an escape hatch for manual conformance as now required!


Is anybody interested in resurrecting this? Would be helpful for a few projects we have in the works.


Sourcery supports AutoCodable for enums.


This has been implemented in Swift 5.5: see SE-0295, "Codable synthesis for enums with associated values" (swift-evolution/0295-codable-synthesis-for-enums-with-associated-values.md in apple/swift-evolution on GitHub).
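As a quick illustration of the synthesized behaviour, the sketch below needs Swift 5.5 or later and no hand-written Codable code. The `Favorite` enum is a made-up example type; the encoded shape, with the case name as the outer key and the labels as inner keys, follows the layout described in SE-0295.

```swift
import Foundation

// Requires Swift 5.5 or later: Codable is synthesized for the enum.
enum Favorite: Codable, Equatable {
    case album(id: String)
    case podcast(id: String)
}

let encoder = JSONEncoder()
encoder.outputFormatting = .sortedKeys  // stable key order for comparison
let original: [Favorite] = [.album(id: "abc"), .podcast(id: "xyz")]
let data = try encoder.encode(original)

// Each value becomes an object keyed by its case name:
// [{"album":{"id":"abc"}},{"podcast":{"id":"xyz"}}]
let text = String(decoding: data, as: UTF8.self)

let decoded = try JSONDecoder().decode([Favorite].self, from: data)
```

Layouts that use an explicit discriminator key, or no discriminator at all, still need a manual conformance.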
