Codable synthesis for enums with associated values

Introduction

Codable was introduced in SE-0166
with support for synthesizing Encodable and Decodable conformance for
class and struct types that contain only values which also conform
to the respective protocols.

Motivation

Currently, auto-synthesis does not work for enums with associated values.
There have been discussions about it in the past, but the concrete structure
of the encoded values was never agreed upon. We believe that having a solution
for this is an important quality-of-life improvement.

Proposed solution

The following enum with associated values

enum Command: Codable {
  case load(key: String)
  case store(key: String, value: Int)
}

would be encoded to

{
  "load": {
    "key": "MyKey"
  }
}

and

{
  "store": {
    "key": "MyKey",
    "value": 42
  }
}

The top-level container contains a single key that matches the name of the enum
case. That key points to a nested container that holds the associated values,
encoded as they would be for structs and classes.
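To make the proposed layout concrete, here is a hand-written conformance that produces the same shape as the JSON above. It is only a sketch of what synthesis might generate, not the pitched implementation: the key enum names (CodingKeys, LoadKeys, StoreKeys) and the "exactly one key" check are assumptions made for illustration.

```swift
import Foundation

enum Command: Equatable {
  case load(key: String)
  case store(key: String, value: Int)
}

extension Command: Codable {
  // One top-level key per enum case (enum names are assumptions of this sketch).
  private enum CodingKeys: String, CodingKey { case load, store }
  // Keys for the labeled associated values of each case.
  private enum LoadKeys: String, CodingKey { case key }
  private enum StoreKeys: String, CodingKey { case key, value }

  func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    switch self {
    case .load(let key):
      var nested = container.nestedContainer(keyedBy: LoadKeys.self, forKey: .load)
      try nested.encode(key, forKey: .key)
    case .store(let key, let value):
      var nested = container.nestedContainer(keyedBy: StoreKeys.self, forKey: .store)
      try nested.encode(key, forKey: .key)
      try nested.encode(value, forKey: .value)
    }
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    // The top-level container is expected to hold exactly one known case key.
    guard container.allKeys.count == 1, let caseKey = container.allKeys.first else {
      throw DecodingError.dataCorrupted(DecodingError.Context(
        codingPath: container.codingPath,
        debugDescription: "Expected exactly one key naming the enum case"))
    }
    switch caseKey {
    case .load:
      let nested = try container.nestedContainer(keyedBy: LoadKeys.self, forKey: .load)
      self = .load(key: try nested.decode(String.self, forKey: .key))
    case .store:
      let nested = try container.nestedContainer(keyedBy: StoreKeys.self, forKey: .store)
      let key = try nested.decode(String.self, forKey: .key)
      let value = try nested.decode(Int.self, forKey: .value)
      self = .store(key: key, value: value)
    }
  }
}
```

Encoding Command.load(key: "MyKey") with JSONEncoder and decoding the result with JSONDecoder should round-trip through the single-key structure described above.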

Associated values can also be unnamed, in which case they will be encoded into an
array instead (that needs to happen even if only one of the values is not named):

enum Command: Codable {
  case load(String)
  case store(String, value: Int)
}

would be encoded to

{
  "load": [
    "MyKey"
  ]
}

and

{
  "store": [
    "MyKey",
    42
  ]
}

This solution closely follows the default behavior of the Rust library serde.
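For the unnamed variant, the equivalent hand-written conformance would use a nested unkeyed container per case. Again, this is a sketch under assumptions (the CodingKeys name and the single-key check are illustrative), not the actual synthesized code.

```swift
import Foundation

enum Command: Equatable {
  case load(String)
  case store(String, value: Int)
}

extension Command: Codable {
  // One top-level key per enum case (name is an assumption of this sketch).
  private enum CodingKeys: String, CodingKey { case load, store }

  func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    switch self {
    case .load(let key):
      var values = container.nestedUnkeyedContainer(forKey: .load)
      try values.encode(key)
    case .store(let key, let value):
      // One unnamed value is enough to switch the whole case to an array,
      // so the "value" label does not appear in the output.
      var values = container.nestedUnkeyedContainer(forKey: .store)
      try values.encode(key)
      try values.encode(value)
    }
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    guard container.allKeys.count == 1, let caseKey = container.allKeys.first else {
      throw DecodingError.dataCorrupted(DecodingError.Context(
        codingPath: container.codingPath,
        debugDescription: "Expected exactly one key naming the enum case"))
    }
    switch caseKey {
    case .load:
      var values = try container.nestedUnkeyedContainer(forKey: .load)
      self = .load(try values.decode(String.self))
    case .store:
      // Values are decoded positionally, in declaration order.
      var values = try container.nestedUnkeyedContainer(forKey: .store)
      let key = try values.decode(String.self)
      let value = try values.decode(Int.self)
      self = .store(key, value: value)
    }
  }
}
```

Because the values are positional, reordering the associated values of a case would change the encoded format, which is one trade-off of the array representation.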


Happy to see a pitch in this area! I think this is a good start in the direction of having something in place for enums which has been missing for a long time. I'm wondering about a few cases here which are not explicitly discussed in the pitch:

  1. What about enum cases without associated values? Are those still encoded as keyed containers, or single values? For instance, .load in

    enum Command: Codable {
        case load
        case store(key: String, value: Int)
    }
    
  2. What about enum cases that share a name? For instance, both cases of

    enum Foo {
        case bar(String, id: Int)
        case bar(String, value: Double)
    }
    

    could end up with the same encoded representation ({"bar": ["abc", 123]} could match either). Would the synthesized initializer try cases in order, or how would that work?

  3. It would be nice to see some Alternatives Considered/Future Directions with some discussion about other approaches. For instance, from a lot of discussion in "Automatic Codable conformance for enums with associated values that themselves conform to Codable" and similar threads, we know that different folks have differing opinions on what they would expect to be a default implementation based on their needs.


Those are very good points, thanks for bringing them up.

  1. I think right now we have the following options:
  • Use the same structure as for the other cases, i.e. { "load": {} } or { "load": [] }
  • Make it a string, i.e. "load"

Using the default structure would certainly make code generation easier. Also, there is currently no way to check whether a key within a container points to a value or a nested container, so in this case we would need to call container.nestedContainer and catch the error if it doesn't exist. It would certainly be nicer to have a version of this function that returns an optional, so maybe that is something to consider adding.

I think one thing that speaks against using raw values as the default behavior is consistency.

  2. I'm not sure there is a good way to represent this, and I'm also not sure this is a very common case for serialization. I'm open to suggestions on this one, but I am leaning towards disallowing this for auto-synthesis for now.

  3. I'll add some thoughts on the alternatives to the pitch later. Thanks for pointing this out.

Thanks for the consideration! I think there's a lot of potential here. :smile: Some additional thoughts:

I'm not entirely sure what you mean by this; do you mind elaborating? As in, determining the difference between "load" and {"load": ...} directly from the Decoder instance itself, or something else?

Although I agree that consistency is definitely nice, one thing to consider is the evolution of enums over time. Take code which uses an enum like

enum Foo: Codable {
    case bar
    case baz
}

If it later adds a

case quux(String)

it would no longer be able to decode previously-encoded .bar and .baz values, which could have encoded as single values. It might be surprising that the addition of one enum case with associated values would change the encoded representation of all of the other values too (and potentially silently at that).

Although I've seen enums like this in the wild, I mostly brought this up as something worth explicitly calling out in the pitch and the implementation. I don't necessarily think there needs to be a different representation for these two cases, just clearly spelled-out rules about what would happen (e.g., enum cases like this are attempted in order, always). I think disallowing these altogether might introduce a bit more pain than is needed, but just my opinion.


Yes, sorry. I mean that if I have a KeyedDecodingContainer, I can't determine in a non-throwing manner whether there is a nested container under a given key. For values there are the decodeIfPresent functions; for nested containers there is only the throwing function.

Yes, that is another point in favor of always using containers instead of raw values.

I think disallowing this would be better than silently running into the wrong case. If we can find a good way to represent these cases in the future, the support can still be added, but it's hard to change once it's in.

Got it, agreed. The way to do this would be to attempt to fetch a nested container and catch an error if that failed. Synthesis could at least be smart about this and attempt only the containers expected based on the types of enum cases (e.g. if all enum cases are labeled, there's no need to attempt an unkeyed container).
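The "attempt and catch" approach could look roughly like the following. The two IfPresent helpers are hypothetical and do not exist in the standard library; they simply wrap the throwing nestedContainer(keyedBy:forKey:) and nestedUnkeyedContainer(forKey:) calls with try?. Probe, Shape and PayloadKeys are likewise illustration-only names.

```swift
import Foundation

extension KeyedDecodingContainer {
  // Hypothetical helper (not a standard library API): returns nil instead of
  // throwing when the key is missing or does not hold a keyed container.
  func nestedContainerIfPresent<NestedKey: CodingKey>(
    keyedBy type: NestedKey.Type, forKey key: Key
  ) -> KeyedDecodingContainer<NestedKey>? {
    guard contains(key) else { return nil }
    return try? nestedContainer(keyedBy: type, forKey: key)
  }

  // Hypothetical helper: same idea for unkeyed (array-like) containers.
  func nestedUnkeyedContainerIfPresent(forKey key: Key) -> UnkeyedDecodingContainer? {
    guard contains(key) else { return nil }
    return try? nestedUnkeyedContainer(forKey: key)
  }
}

// Which kind of payload was found under the "load" key.
enum Shape: Equatable { case keyed, unkeyed, neither }

// Illustration-only type that records the shape instead of building an enum value.
struct Probe: Decodable {
  enum CodingKeys: String, CodingKey { case load }
  enum PayloadKeys: String, CodingKey { case key }

  let shape: Shape

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    if container.nestedContainerIfPresent(keyedBy: PayloadKeys.self, forKey: .load) != nil {
      shape = .keyed
    } else if container.nestedUnkeyedContainerIfPresent(forKey: .load) != nil {
      shape = .unkeyed
    } else {
      shape = .neither
    }
  }
}
```

A synthesized initializer could skip the second probe entirely when every case of the enum is labeled, as suggested above.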

I was going to say that I'm not sure I agree (I was trying to make the case for the opposite side) but realized that my concern is unfounded. To be a bit clearer, I was concerned that

  1. Attempting to encode enum Foo: Codable today would encode .bar and .baz as simple strings, but with this proposal, the encoding format would change
  2. Along those lines, adding another case with associated values (not possible today) could risk changing the encoding format

I realized after posting that enum Foo: Codable as expressed above doesn't work today because Foo is not RawRepresentable, and if Foo were made RawRepresentable to compile today, you couldn't add a case with associated values without dropping the raw value anyway.
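For reference, this is the RawRepresentable behavior being compared against: an enum with a String raw type gets its Codable conformance from the raw value and encodes each case as a single string. The values are wrapped in an array here only so the example does not depend on top-level fragment support in JSONEncoder.

```swift
import Foundation

// A raw-value enum: Codable is derived from the String raw value.
enum Foo: String, Codable {
  case bar
  case baz
}

// Each case is encoded as a single string value, not as a container.
let data = try JSONEncoder().encode([Foo.bar, Foo.baz])
let json = String(decoding: data, as: UTF8.self)
// json is ["bar","baz"]
```

Adding a case such as quux(String) to this enum would not compile while the String raw type is declared, which is why the format change discussed above cannot happen silently.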

So to sum up:

  1. Having cases with no associated values encode as keyed containers won't change any behavior from today
  2. Having them encode as keyed containers wouldn't change the encoding format if you add additional cases with associated values

I'm on board with this direction.


Considering the similarities to tuples, it may make sense to reduce the scope to only provide synthesis for cases with single associated values. I would still find this very useful.

enum Action: Codable {
  case close
  case web(Auth)
  case web(link: Link)

  // do not support until tuples also have a synthesis solution 
  case both(auth: Auth, link: Link)
}

struct Auth: Codable { }
struct Link: Codable { }

Developers would naturally expect to be able to override the synthesis of CodingKeys like they can today with structs:

enum Action: Codable {
  case close
  case web(Auth)
  case web(link: Link)

  enum CodingKeys: String, CodingKey {
    case web = "auth"
    case webLink = "link"
  }
}