[Proposal] Foundation Swift Archival & Serialization

IMHO those are only “sweet” until you need to decode a value out to something other than a typed value, then it’s ambiguity city. There are many ways to solve that, but none of them are conducive to beginners. Using the metatype to seed the generic resolution is the only thing I’d get behind, personally.
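The ambiguity is easy to show with a toy version of the inferred-return-type style quoted below. All names here are hypothetical, just to illustrate the point:

```swift
// Minimal sketch (hypothetical API, not from the proposal) of a
// Marshal/Unbox-style decode that is generic on its return type.
struct FakePayload {
    let storage: [String: Any]

    // T is seeded entirely by the call site's type context.
    func decode<T>(key: String) -> T? {
        return storage[key] as? T
    }
}

let payload = FakePayload(storage: ["id": 42, "name": "Ada"])
let id: Int? = payload.decode(key: "id")        // T == Int, inferred from the annotation
let name: String? = payload.decode(key: "name") // T == String

// With no typed destination, there is nothing to infer from:
// let mystery = payload.decode(key: "id")
// error: generic parameter 'T' could not be inferred
```

As soon as the destination isn’t a typed property (or you want the value as `Any`), the compiler has nothing to seed `T` with, which is the "ambiguity city" above.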

Zach Waldowski
zach@waldowski.me

···

On Mar 16, 2017, at 3:09 AM, David Hart via swift-evolution <swift-evolution@swift.org> wrote:

2) Libraries like Marshal (https://github.com/utahiosmac/Marshal) and Unbox (https://github.com/JohnSundell/Unbox) don’t require the decoding functions to provide the type: those functions are generic on the return type and it’s automatically inferred:

func decode<T>(key: Key) -> T

self.stringProperty = decode(key: .stringProperty) // correct specialisation of the generic function chosen by the compiler

Is there a reason the proposal did not choose this solution? It’s quite sweet.

Thanks for the thorough and detailed review, Brent! Responses inline.

Hi everyone,

The following introduces a new Swift-focused archival and serialization API as part of the Foundation framework. We’re interested in improving the experience and safety of performing archival and serialization, and are happy to receive community feedback on this work.

Thanks to all of the people who've worked on this. It's a great proposal.

Specifically:

  • It aims to provide a solution for the archival of Swift struct and enum types

I see a lot of discussion here of structs and classes, and an example of an enum without associated values, but I don't see any discussion of enums with associated values. Can you sketch how you see people encoding such types?

For example, I assume that `Optional` is going to get some special treatment, but if it doesn't, how would you write its `encode(to:)` method?

`Optional` values are accepted and vended directly through the API. The `encode(_:forKey:)` methods take optional values directly, and `decodeIfPresent(_:forKey:)` vend optional values.

`Optional` is special in this way — it’s a primitive part of the system. It’s actually not possible to write an `encode(to:)` method for `Optional`, since the representation of null values is up to the encoder and the format it’s working in; `JSONEncoder`, for instance, decides on the representation of `nil` (JSON `null`). It wouldn’t be possible to ask `nil` to encode itself in a reasonable way.
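A small sketch of that behavior, written against the API as it later shipped (`JSONEncoder`/`JSONDecoder`): a `nil` optional is simply not emitted, and `decodeIfPresent` turns an absent key back into `nil`.

```swift
import Foundation

// nil optionals are handled by the encoder, not by Optional itself.
struct Profile: Codable {
    var name: String
    var nickname: String?
}

let data = try JSONEncoder().encode(Profile(name: "Ada", nickname: nil))
let json = String(data: data, encoding: .utf8)!
// The synthesized encode(to:) uses encodeIfPresent, so "nickname" is omitted entirely.

let decoded = try JSONDecoder().decode(Profile.self, from: json.data(using: .utf8)!)
// decoded.nickname == nil
```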

What about a more complex enum, like the standard library's `UnicodeDecodingResult`:

  enum UnicodeDecodingResult {
    case emptyInput
    case error
    case scalarValue(UnicodeScalar)
  }

Or, say, an `Error`-conforming type from one of my projects:

  public enum SQLError: Error {
      case connectionFailed(underlying: Error)
      case executionFailed(underlying: Error, statement: SQLStatement)
      case noRecordsFound(statement: SQLStatement)
      case extraRecordsFound(statement: SQLStatement)
      case columnInvalid(underlying: Error, key: ColumnSpecifier, statement: SQLStatement)
      case valueInvalid(underlying: Error, key: AnySQLColumnKey, statement: SQLStatement)
  }

(You can assume that all the types in the associated values are `Codable`.)

Sure — these cases specifically do not derive `Codable` conformance because the specific representation to choose is up to you. Two possible ways to write this, though there are many others (I’m simplifying these cases here a bit, but you can extrapolate this):

// Approach 1
// This produces either {"type": 0} for `.noValue`, or {"type": 1, "value": …} for `.associated`.
public enum EnumWithAssociatedValue : Codable {
     case noValue
     case associated(Int)

     private enum CodingKeys : CodingKey {
         case type
         case value
     }

     public init(from decoder: Decoder) throws {
         let container = try decoder.container(keyedBy: CodingKeys.self)
         let type = try container.decode(Int.self, forKey: .type)
         switch type {
         case 0:
             self = .noValue
         case 1:
             let value = try container.decode(Int.self, forKey: .value)
             self = .associated(value)
         default:
             throw …
         }
     }

     public func encode(to encoder: Encoder) throws {
         let container = encoder.container(keyedBy: CodingKeys.self)
         switch self {
         case .noValue:
             try container.encode(0, forKey: .type)
         case .associated(let value):
             try container.encode(1, forKey: .type)
             try container.encode(value, forKey: .value)
         }
     }
}

// Approach 2
// Produces `0`, `1`, or `2` for `.noValue1`, `.noValue2`, and `.noValue3` respectively.
// Produces {"type": 3, "value": …} and {"type": 4, "value": …} for `.associated1` and `.associated2`.
public enum EnumWithAssociatedValue : Codable {
     case noValue1
     case noValue2
     case noValue3
     case associated1(Int)
     case associated2(String)

     private enum CodingKeys : CodingKey {
         case type
         case value
     }

     public init(from decoder: Decoder) throws {
         if let container = try? decoder.singleValueContainer() {
             let type = try container.decode(Int.self)
             switch type {
             case 0: self = .noValue1
             case 1: self = .noValue2
             case 2: self = .noValue3
             default: throw …
             }
         } else {
             let container = try decoder.container(keyedBy: CodingKeys.self)
             let type = try container.decode(Int.self, forKey: .type)
             switch type {
             case 3:
                 let value = try container.decode(Int.self, forKey: .value)
                 self = .associated1(value)
             case 4:
                 let value = try container.decode(String.self, forKey: .value)
                 self = .associated2(value)
             default: throw …
             }
         }
     }
}

There are, of course, many more approaches that you could take, but these are just two examples. The first is likely simpler to read and comprehend, but may not be appropriate if you’re trying to optimize for space.
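For concreteness, here is Approach 1 compiled and round-tripped through the API as it later shipped; `DecodingError` stands in for the elided `throw …` above.

```swift
import Foundation

// Round-trip sketch of "Approach 1": a keyed container with a type tag.
enum Toggle: Codable, Equatable {
    case noValue
    case associated(Int)

    private enum CodingKeys: String, CodingKey {
        case type, value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        switch try container.decode(Int.self, forKey: .type) {
        case 0:
            self = .noValue
        case 1:
            self = .associated(try container.decode(Int.self, forKey: .value))
        default:
            throw DecodingError.dataCorruptedError(forKey: .type, in: container,
                                                   debugDescription: "Unknown type tag")
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .noValue:
            try container.encode(0, forKey: .type)
        case .associated(let value):
            try container.encode(1, forKey: .type)
            try container.encode(value, forKey: .value)
        }
    }
}

let data = try JSONEncoder().encode(Toggle.associated(7))
let decoded = try JSONDecoder().decode(Toggle.self, from: data)
// decoded == .associated(7)
```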

I don't necessarily assume that the compiler should write conformances to these sorts of complicated enums for me (though that would be nice!); I'm just wondering what the designers of this feature envision people doing in cases like these.

  • protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.

Have you given any consideration to supporting types which only need to decode? That seems likely to be common when interacting with web services.

We have. Ultimately, we decided that the introduction of several protocols to cover encodability, decodability, and both was too much of a cognitive overhead, considering the number of other types we’re also introducing. You can always implement `encode(to:)` as `fatalError()`.
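A sketch of what that looks like in practice: only `encode(to:)` is written by hand, so `init(from:)` is still synthesized, and the trap documents that the type is decode-only.

```swift
import Foundation

// Decode-only type under the single-protocol design described above.
struct FetchedUser: Codable {
    var id: Int
    var name: String

    // init(from:) is synthesized; encoding is deliberately unsupported.
    func encode(to encoder: Encoder) throws {
        fatalError("FetchedUser is decode-only")
    }
}

let json = #"{"id": 1, "name": "Ada"}"#.data(using: .utf8)!
let user = try JSONDecoder().decode(FetchedUser.self, from: json)
// user.id == 1, user.name == "Ada"
```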

  • protocol CodingKey: Adopted by types used as keys for keyed containers, replacing String keys with semantic types. Conformance may be automatically derived in most cases.
  • protocol Encoder: Adopted by types which can take Codable values and encode them into a native format.
    • class KeyedEncodingContainer<Key : CodingKey>: Subclasses of this type provide a concrete way to store encoded values by CodingKey. Types adopting Encoder should provide subclasses of KeyedEncodingContainer to vend.
    • protocol SingleValueEncodingContainer: Adopted by types which provide a concrete way to store a single encoded value. Types adopting Encoder should provide types conforming to SingleValueEncodingContainer to vend (but in many cases will be able to conform to it themselves).
  • protocol Decoder: Adopted by types which can take payloads in a native format and decode Codable values out of them.
    • class KeyedDecodingContainer<Key : CodingKey>: Subclasses of this type provide a concrete way to retrieve encoded values from storage by CodingKey. Types adopting Decoder should provide subclasses of KeyedDecodingContainer to vend.
    • protocol SingleValueDecodingContainer: Adopted by types which provide a concrete way to retrieve a single encoded value from storage. Types adopting Decoder should provide types conforming to SingleValueDecodingContainer to vend (but in many cases will be able to conform to it themselves).

I do want to note that, at this point in the proposal, I was sort of thinking you'd gone off the deep end modeling this. Having read the whole thing, I now understand what all of these things do, but this really is a very large subsystem. I think it's worth asking if some of these types can be eliminated or combined.

In the past, the concepts of `SingleValueContainer` and `Encoder` were not distinct — all of the methods on `SingleValueContainer` were just part of `Encoder`. Sure, this is a simpler system, but unfortunately promotes the wrong thing altogether. I’ll address this below.

Structured types (i.e. types which encode as a collection of properties) encode and decode their properties in a keyed manner. Keys may be String-convertible or Int-convertible (or both),

What does "may" mean here? That, at runtime, the encoder will test for the preferred key type and fall back to the other one? That seems a little bit problematic.

Yes, this is the case. A lot is left up to the `Encoder` because it can choose to do something for its format that your implementation of `encode(to:)` may not have considered.
If you try to encode something with an `Int` key in a string-keyed dictionary, the encoder may choose to stringify the integer if appropriate for the format. If not, it can reject your key, ignore the call altogether, `preconditionFailure()`, etc. It is also perfectly legitimate to write an `Encoder` which supports a flat encoding format — in that case, keys are likely ignored altogether, in which case there is no error to be had. We’d like to not arbitrarily constrain an implementation unless necessary.

FWIW, 99.9% of the time, the appropriate thing to do is to either simply throw an error, or `preconditionFailure()`. Nasal demons should not be the common case. But for some encoding formats, this is appropriate.

I'm also quite worried about how `Int`-convertible keys will interact with code synthesis. The obvious way to assign integers—declaration order—would mean that reordering declarations would invisibly break archiving, potentially (if the types were compatible) without breaking anything in an error-causing way even at runtime. You could sort the names, but then adding a new property would shift the integers of the properties "below" it. You could hash the names, but then there's no obvious relationship between the integers and key cases.

At the same time, I also think that using arbitrary integers is a poor match for ordering. If you're making an ordered container, you don't want arbitrary integers wrapped up in an abstract type. You want adjacent integers forming indices of an eventual array. (Actually, you may not want indices at all—you may just want to feed elements in one at a time!)

For these exact reasons, integer keys are not produced by code synthesis, only string keys. If you want integer keys, you’ll have to write them yourself. :)

Integer keys are fragile, as you point out yourself, and while we’d like to encourage their use as appropriate, they require explicit thought and care as to their use.
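Writing integer keys by hand looks like this (a sketch; the explicit raw values mean reordering the cases can never silently renumber the archive):

```swift
import Foundation

// Hand-written integer keys with explicit, stable raw values.
struct Record: Codable, Equatable {
    var title: String
    var count: Int

    private enum CodingKeys: Int, CodingKey {
        case title = 1
        case count = 2
    }
}

let original = Record(title: "a", count: 3)
let data = try JSONEncoder().encode(original)
let decoded = try JSONDecoder().decode(Record.self, from: data)
// decoded == original; a binary format could key on the stable intValues 1 and 2
```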

So I would suggest the following changes:

* The coding key always converts to a string. That means we can eliminate the `CodingKey` protocol and instead use `RawRepresentable where RawValue == String`, leveraging existing infrastructure. That also means we can call the `CodingKeys` associated type `CodingKey` instead, which is the correct name for it—we're not talking about an `OptionSet` here.

* If, to save space on disk, you also want to allow people to use integers as the serialized representation of a key, we might introduce a parallel `IntegerCodingKey` protocol for that, but every `CodingKey` type should map to `String` first and foremost. Using a protocol here ensures that it can be statically determined at compile time whether a type can be encoded with integer keys, so the compiler can select an overload of `container(keyedBy:)`.

* Intrinsically ordered data is encoded as a single value containers of type `Array<Codable>`. (I considered having an `orderedContainer()` method and type, but as I thought about it, I couldn't think of an advantage it would have over `Array`.)

This is possible, but I don’t see this as necessarily advantageous over what we currently have. In 99.9% of cases, `CodingKey` types will have string values anyway — in many cases you won’t have to write the `enum` yourself to begin with, but even when you do, derived `CodingKey` conformance will generate string values on your behalf.
The only time a key will not have a string value is if the `CodingKey` protocol is implemented manually and a value is either deliberately left out, or there was a mistake in the implementation; in either case, there wouldn’t have been a valid string value anyway.

    /// Returns an encoding container appropriate for holding a single primitive value.
    ///
    /// - returns: A new empty single value container.
    /// - precondition: May not be called after a prior `self.container(keyedBy:)` call.
    /// - precondition: May not be called after a value has been encoded through a previous `self.singleValueContainer()` call.
    func singleValueContainer() -> SingleValueEncodingContainer

Speaking of which, I'm not sure about single value containers. My first instinct is to say that methods should be moved from them to the `Encoder` directly, but that would probably cause code duplication. But...isn't there already duplication between the `SingleValue*Container` and the `Keyed*Container`? Why, yes, yes there is. So let's talk about that.

In the Alternatives Considered section of the proposal, we detail having done just this. Originally, the requirements now on `SingleValueContainer` sat on `Encoder` and `Decoder`.
Unfortunately, this made it too easy to do the wrong thing, and required extra work (in comparison) to do the right thing.

When `Encoder` has `encode(_ value: Bool?)`, `encode(_ value: Int?)`, etc. on it, it’s very intuitive to try to encode values that way:

func encode(to encoder: Encoder) throws {
     // The very first thing I try to type is encoder.enc… and guess what pops up in autocomplete:
     try encoder.encode(myName)
     try encoder.encode(myEmail)
     try encoder.encode(myAddress)
}

This might look right to someone expecting to be able to encode in an ordered fashion, which is _not_ what these methods do.
In addition, for someone expecting keyed encoding methods, this is very confusing. Where are those methods? Why don’t these "default" methods have keys?

The very first time that code block ran, it would `preconditionFailure()` or throw an error, since those methods intend to encode only one single value.

    open func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws
    open func encode(_ value: Bool?, forKey key: Key) throws
    open func encode(_ value: Int?, forKey key: Key) throws
    open func encode(_ value: Int8?, forKey key: Key) throws
    open func encode(_ value: Int16?, forKey key: Key) throws
    open func encode(_ value: Int32?, forKey key: Key) throws
    open func encode(_ value: Int64?, forKey key: Key) throws
    open func encode(_ value: UInt?, forKey key: Key) throws
    open func encode(_ value: UInt8?, forKey key: Key) throws
    open func encode(_ value: UInt16?, forKey key: Key) throws
    open func encode(_ value: UInt32?, forKey key: Key) throws
    open func encode(_ value: UInt64?, forKey key: Key) throws
    open func encode(_ value: Float?, forKey key: Key) throws
    open func encode(_ value: Double?, forKey key: Key) throws
    open func encode(_ value: String?, forKey key: Key) throws
    open func encode(_ value: Data?, forKey key: Key) throws

Wait, first, a digression for another issue: I'm concerned that, if you look at the `decode` calls, there are plain `decode(…)` calls which throw if a `nil` was originally encoded and `decodeIfPresent` calls which return optional. The result is, essentially, that the encoding system eats a level of optionality for its own purposes—seemingly good, straightforward-looking code like this:

  struct MyRecord: Codable {
    var id: Int?
    …
    
    func encode(to encoder: Encoder) throws {
      let container = encoder.container(keyedBy: CodingKey.self)
      try container.encode(id, forKey: .id)
      …
    }
    
    init(from decoder: Decoder) throws {
      let container = decoder.container(keyedBy: CodingKey.self)
      id = try container.decode(Int.self, forKey: .id)
      …
    }
  }

Will crash. (At least, I assume that's what will happen.)

The return type of `decode(Int.self, forKey: .id)` is `Int`. I’m not convinced that it’s possible to misconstrue that as the correct thing to do here. How would that return a `nil` value if the value was `nil` to begin with?
The only other method that would be appropriate is `decodeIfPresent(Int.self, forKey: .id)`, which is exactly what you want.
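The repair for the scenario Brent describes, written against the API as it later shipped: an `Optional` stored property pairs with `decodeIfPresent`, not `decode`.

```swift
import Foundation

struct MyRecord: Codable {
    var id: Int?

    private enum CodingKeys: String, CodingKey { case id }

    init(id: Int?) { self.id = id }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // nil if the key is absent or holds null; decode(_:forKey:) would throw instead.
        id = try container.decodeIfPresent(Int.self, forKey: .id)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encodeIfPresent(id, forKey: .id)
    }
}

let empty = try JSONDecoder().decode(MyRecord.self, from: "{}".data(using: .utf8)!)
// empty.id == nil, and no error is thrown for the missing key
```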

I think we'd be better off having `encode(_:forKey:)` not take an optional; instead, we should have `Optional` conform to `Codable` and behave in some appropriate way. Exactly how to implement it might be a little tricky because of nested optionals; I suppose a `none` would have to measure how many levels of optionality there are between it and a concrete value, and then encode that information into the data. I think our `NSNull` bridging is doing something broadly similar right now.

`Optional` cannot encode to `Codable` for the reasons given above. It is a primitive type much like `Int` and `String`, and it’s up to the encoder and the format to represent it.
How would `Optional` encode `nil`?

I know that this is not the design you would use in Objective-C, but Swift uses `Optional` differently from how Objective-C uses `nil`. Swift APIs consider `nil` and absent to be different things; where they can both occur, good Swift APIs use doubled-up Optionals to be precise about the situation. I think the design needs to be a little different to accommodate that.

Now, back to the `SingleValue*Container`/`Keyed*Container` issue. The list above is, frankly, gigantic. You specify a *lot* of primitives in `Keyed*Container`; there's a lot to implement here. And then you have to implement it all *again* in `SingleValue*Container`:

    func encode(_ value: Bool) throws
    func encode(_ value: Int) throws
    func encode(_ value: Int8) throws
    func encode(_ value: Int16) throws
    func encode(_ value: Int32) throws
    func encode(_ value: Int64) throws
    func encode(_ value: UInt) throws
    func encode(_ value: UInt8) throws
    func encode(_ value: UInt16) throws
    func encode(_ value: UInt32) throws
    func encode(_ value: UInt64) throws
    func encode(_ value: Float) throws
    func encode(_ value: Double) throws
    func encode(_ value: String) throws
    func encode(_ value: Data) throws

This is madness.

Look, here's what we do. You have two types: `Keyed*Container` and `Value*Container`. `Keyed*Container` looks something like this:

  final public class KeyedEncodingContainer<EncoderType: Encoder, Key: RawRepresentable> where Key.RawValue == String {
      public let encoder: EncoderType
  
      public let codingKeyContext: [RawRepresentable where RawValue == String]
      // Hmm, we might need a CodingKey protocol after all.
      // Still, it could just be `protocol CodingKey: RawRepresentable where RawValue == String {}`
  
      subscript (key: Key) -> ValueEncodingContainer {
          return encoder.makeValueEncodingContainer(forKey: key)
      }
  }

It's so simple, it doesn't even need to be specialized. You might even be able to get away with combining the encoding and decoding variants if the subscript comes from a conditional extension. `Value*Container` *does* need to be specialized; it looks like this (modulo the `Optional` issue I mentioned above):

Sure, let’s go with this for a moment. Presumably, then, `Encoder` would be able to vend out both `KeyedEncodingContainer`s and `ValueEncodingContainer`s, correct?

  public protocol ValueEncodingContainer {
      func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws

I’m assuming that the key here is a typo, correct?
Keep in mind that combining these concepts changes the semantics of how single-value encoding works. Right now `SingleValueEncodingContainer` only allows values of primitive types; this would allow you to encode a value in terms of a different arbitrarily-codable value.

      func encode(_ value: Bool?) throws
      func encode(_ value: Int?) throws
      func encode(_ value: Int8?) throws
      func encode(_ value: Int16?) throws
      func encode(_ value: Int32?) throws
      func encode(_ value: Int64?) throws
      func encode(_ value: UInt?) throws
      func encode(_ value: UInt8?) throws
      func encode(_ value: UInt16?) throws
      func encode(_ value: UInt32?) throws
      func encode(_ value: UInt64?) throws
      func encode(_ value: Float?) throws
      func encode(_ value: Double?) throws
      func encode(_ value: String?) throws
      func encode(_ value: Data?) throws
  
      func encodeWeak<Object : AnyObject & Codable>(_ object: Object?) throws

Same comment here.
    

      var codingKeyContext: [CodingKey]
  }

And use sites would look like:

  func encode(to encoder: Encoder) throws {
    let container = encoder.container(keyedBy: CodingKey.self)
    try container[.id].encode(id)
    try container[.name].encode(name)
    try container[.birthDate].encode(birthDate)
  }

For consumers, this doesn’t seem to make much of a difference. We’ve turned `try container.encode(id, forKey: .id)` into `try container[.id].encode(id)`.

Decoding is slightly trickier. You could either make the subscript `Optional`, which would be more like `Dictionary` but would be inconsistent with `Encoder` and would give the "never force-unwrap anything" crowd conniptions, or you could add a `contains()` method to `ValueDecodingContainer` and make `decode(_:)` throw. Either one works.

Also, another issue with the many primitives: swiftc doesn't really like large overload sets very much. Could this set be reduced? I'm not sure what the logic was in choosing these particular types, but many of them share protocols in Swift—you might get away with just this:

  public protocol ValueEncodingContainer {
      func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws
      func encode(_ value: Bool?, forKey key: Key) throws
      func encode<Integer: SignedInteger>(_ value: Integer?, forKey key: Key) throws
      func encode<UInteger: UnsignedInteger>(_ value: UInteger?, forKey key: Key) throws
      func encode<Floating: FloatingPoint>(_ value: Floating?, forKey key: Key) throws
      func encode(_ value: String?, forKey key: Key) throws
      func encode(_ value: Data?, forKey key: Key) throws
  
      func encodeWeak<Object : AnyObject & Codable>(_ object: Object?, forKey key: Key) throws
  
      var codingKeyContext: [CodingKey]
  }

These types were chosen because we want the API to make static guarantees about concrete types which all `Encoder`s and `Decoder`s should support. This is somewhat less relevant for JSON, but more relevant for binary formats where the difference between `Int16` and `Int64` is critical.
This turns the concrete type check into a runtime check that `Encoder` authors need to keep in mind. Moreover, any type can conform to `SignedInteger` or `UnsignedInteger` as long as it fulfills the protocol requirements. I can write an `Int37` type, but no encoder could make sense of that type, and that failure is a runtime failure. If you want a concrete example, `Float80` conforms to `FloatingPoint`, but no popular binary format I’ve seen supports 80-bit floats — we cannot prevent that call statically…

Instead, we want to offer a static, concrete list of types that `Encoder`s and `Decoder`s must be aware of, and that consumers have guarantees about support for.

To accommodate my previous suggestion of using arrays to represent ordered encoded data, I would add one more primitive:

      func encode(_ values: [Codable]) throws

Collection types are purposefully not primitives here:

* If `Array` is a primitive, but does not conform to `Codable`, then you cannot encode `Array<Array<Codable>>`.
* If `Array` is a primitive, and conforms to `Codable`, then there may be ambiguity between `encode(_ values: [Codable])` and `encode(_ value: Codable)`.
   * Even in cases where there are not, inside of `encode(_ values: [Codable])`, if I call `encode([[1,2],[3,4]])`, you’ve lost type information about what’s contained in the array — all you see is `Codable`.
   * If you change it to `encode<Value : Codable>(_ values: [Value])` to compensate for that, you still cannot infinitely recurse on what type `Value` is. Try it with `encode([[[[1]]]])` and you’ll see what I mean; at some point the inner types are no longer preserved.

(Also, is there any sense in adding `Date` to this set, since it needs special treatment in many of our formats?)

We’ve considered adding `Date` to this list. However, this means that any format that is a part of this system needs to be able to make a decision about how to format dates. Many binary formats have no native representations of dates, so this is not necessarily a guarantee that all formats can make.

Looking for additional opinions on this one.

Encoding Container Types

For some types, the container into which they encode has meaning. Especially when coding for a specific output format (e.g. when communicating with a JSON API), a type may wish to explicitly encode as an array or a dictionary:

// Continuing from before
public protocol Encoder {
    func container<Key : CodingKey>(keyedBy keyType: Key.Type, type containerType: EncodingContainerType) -> KeyedEncodingContainer<Key>
}

/// An `EncodingContainerType` specifies the type of container an `Encoder` should use to store values.
public enum EncodingContainerType {
    /// The `Encoder`'s preferred container type; equivalent to either `.array` or `.dictionary` as appropriate for the encoder.
    case `default`

    /// Explicitly requests the use of an array to store encoded values.
    case array

    /// Explicitly requests the use of a dictionary to store encoded values.
    case dictionary
}

I see what you're getting at here, but I don't think this is fit for purpose, because arrays are not simply dictionaries with integer keys—their elements are adjacent and ordered. See my discussion earlier about treating inherently ordered containers as simply single-value `Array`s.

You’re right in that arrays are not simply dictionaries with integer keys, but I don’t see where we make that assertion here.
If an `Encoder` is asked for an array and is provided with integer keys, it can use those keys as indices. If the keys are non-contiguous, the intervening spaces can be filled with null values (if appropriate for the format; if not, the operation can error out).

The way these containers are handled is completely up to the `Encoder`. An `Encoder` producing an array may choose to ignore keys altogether and simply produce an array from the values given to it sequentially. (This is not recommended, but possible.)

Nesting

In practice, some types may also need to control how data is nested within their container, or potentially nest other containers within their container. Keyed containers allow this by returning nested containers of differing key types:

[snip]

This can be common when coding against specific external data representations:

// User type for interfacing with a specific JSON API. JSON API expects encoding as {"id": ..., "properties": {"name": ..., "timestamp": ...}}. Swift type differs from encoded type, and encoding needs to match a spec:

This comes very close to—but doesn't quite—address something else I'm concerned about. What's the preferred way to handle differences in serialization to different formats?

Here's what I mean: Suppose I have a BlogPost model, and I can both fetch and post BlogPosts to a cross-platform web service, and store them locally. But when I fetch and post remotely, I need to conform to the web service's formats; when I store an instance locally, I have a freer hand in designing my storage, and perhaps need to store some extra metadata. How do you imagine handling that sort of situation? Is the answer simply that I should use two different types?

This is a valid concern, and one that should likely be addressed.

Perhaps the solution is to offer a `userInfo : [UserInfoKey : Any]` (`UserInfoKey` being a `String`-`RawRepresentable` struct or similar) on `Encoder` and `Decoder` set at the top-level to allow passing this type of contextual information from the top level down.
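A sketch of how that could look, spelled with the names this idea eventually shipped under (`CodingUserInfoKey` and `JSONEncoder.userInfo`); the "local" audience value is invented purely for illustration.

```swift
import Foundation

// One type, two representations, selected via contextual userInfo.
struct BlogPost: Encodable {
    var title: String
    var localMetadata: String

    // Hypothetical context key distinguishing local storage from the web API.
    static let audience = CodingUserInfoKey(rawValue: "audience")!

    private enum CodingKeys: String, CodingKey { case title, localMetadata }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
        // Extra fields only when encoding for local storage.
        if encoder.userInfo[BlogPost.audience] as? String == "local" {
            try container.encode(localMetadata, forKey: .localMetadata)
        }
    }
}

let post = BlogPost(title: "Hi", localMetadata: "draft")

let local = JSONEncoder()
local.userInfo[BlogPost.audience] = "local"
let localJSON = String(data: try local.encode(post), encoding: .utf8)!

let remote = JSONEncoder()
let remoteJSON = String(data: try remote.encode(post), encoding: .utf8)!
// localJSON contains "localMetadata"; remoteJSON does not
```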

To remedy both of these points, we adopt a new convention for inheritance-based coding — encoding super as a sub-object of self:

[snip]

        try super.encode(to: container.superEncoder())

This seems like a good idea to me. However, it brings up another point: What happens if you specify a superclass of the originally encoded class? In other words:

  let joe = Employee(…)
  let payload = try SomeEncoder().encode(joe)
  …
  let someone = try SomeDecoder().decode(Person.self, from: payload)
  print(type(of: someone)) // Person, Employee, or does `decode(_:from:)` fail?

We don’t support this type of polymorphic decoding. Because no type information is written into the payload (there’s no safe way to do this that is not currently brittle), there’s no way to tell what’s in there prior to decoding it (and there wouldn’t be a reasonable way to trust what’s in the payload to begin with).
We’ve thought through this a lot, but in the end we’re willing to make this tradeoff for security primarily, and simplicity secondarily.

The encoding container types offer overloads for working with and processing the API's primitive types (String, Int, Double, etc.). However, for ease of implementation (both in this API and others), it can be helpful for these types to conform to Codable themselves. Thus, along with these overloads, we will offer Codable conformance on these types:

[snip]

Since Swift's function overload rules prefer more specific functions over generic functions, the specific overloads are chosen where possible (e.g. encode("Hello, world!", forKey: .greeting) will choose encode(_: String, forKey: Key) over encode<T : Codable>(_: T, forKey: Key)). This maintains performance over dispatching through the Codable existential, while allowing for the flexibility of fewer overloads where applicable.

How important is this performance? If the answer is "eh, not really that much", I could imagine a setup where every "primitive" type eventually represents itself as `String` or `Data`, and each `Encoder`/`Decoder` can use dynamic type checks in `encode(_:)`/`decode(_:)` to define whatever "primitives" it wants for its own format.

Does this imply that `Int32` should decide how it’s represented as `Data`? What if an encoder forgets to implement that?
Again, we want to provide a static list of types that `Encoder`s know they _must_ handle, and thus, consumers have _guarantees_ that those types are supported.

* * *

One more thing. In Alternatives Considered, you present two designs—#2 and #3—where you generate a separate instance which represents the type in a fairly standardized way for the encoder to examine.

This design struck me as remarkably similar to the reflection system and its `Mirror` type, which is also a separate type describing an original instance. My question was: Did you look at the reflection system when you were building this design? Do you think there might be anything that can be usefully shared between them?

We did, quite a bit, and spent a lot of time considering reflection and its place in our design. Ultimately, the reflection system does not currently have the features we would need, and although the Swift team has expressed desire to improve the system considerably, it’s not currently a top priority, AFAIK.

Thank you for your attention. I hope this was helpful!

Thanks for all of these comments! Looking to respond to your other email soon.

···

On 15 Mar 2017, at 21:19, Brent Royal-Gordon wrote:

On Mar 15, 2017, at 3:40 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:


--
Brent Royal-Gordon
Architechies

Thanks for the comments, David.
I responded to #2 in a separate email, but wanted to get back to responding to #1.

In implementing this, I have had the same thoughts. Ideally, one day, we would be able to migrate the implementation of this away from the compiler to public API (through reflection, property behaviors, or similar). If the compiler offers external features that would allow us to do everything that we want, I would be more than happy to move the implementation from inside the compiler to outside of it.

···

On 16 Mar 2017, at 0:09, David Hart wrote:

First of all, great proposal :D

Brent, earlier in the thread makes a lot of good points. But I’d still like to discuss two subjects:

1) What makes the proposal really stand on its feet compared to third-party libraries is the compiler generation magic. I feel divided about it. On one hand, this is the only solution today to have this level of type and key safety. But on the other hand, I have the impression that future versions of Swift (with more reflection, property behaviours, lenses, etc…) would dramatically affect how this subject is treated and implemented. Are you worried that we are asking the compiler to do work which might be unnecessary in the future? That this topic would be better expressed with more powerful language features? Any plans for this API to smoothly migrate to those features in the future?

2) Libraries like Marshal (https://github.com/utahiosmac/Marshal) and Unbox (https://github.com/JohnSundell/Unbox) don’t require the decoding functions to provide the type: those functions are generic on the return type and it’s automatically inferred:

func decode<T>(key: Key) -> T

self.stringProperty = decode(key: .stringProperty) // correct specialisation of the generic function chosen by the compiler

Is there a reason the proposal did not choose this solution? It’s quite sweet.

Swift Archival & Serialization
Proposal: SE-NNNN <https://github.com/apple/swift-evolution/pull/639>
Author(s): Itai Ferber <https://github.com/itaiferber>, Michael LeHew <https://github.com/mlehew>, Tony Parker <https://github.com/parkera>
Review Manager: TBD
Status: Awaiting review
Associated PRs:
#8124 <https://github.com/apple/swift/pull/8124>
#8125 <https://github.com/apple/swift/pull/8125>

Hi Slava,

Hi Itai,

I’m wondering what the motivation is for keeping this as part of Foundation and not the standard library. It seems like you’re landing an implementation of this in the Foundation overlay on master, and another copy of all the code will have to go into swift-corelibs-foundation. This seems suboptimal. Or are there future plans to unify the Foundation overlay with corelibs-foundation somehow?

I would like some unification in the future, but they are currently two separate implementations for a bunch of reasons (lack of bridging on Linux being a huge one, along with the inability of the standard library and runtime to distinguish between presence of Objective-C and the presence of Foundation).

Also the implementation uses some Foundation-isms (NSMutableArray, NSNumber) and it would be nice to stick with idiomatic Swift as much as possible instead.

The implementation you’re looking at is for JSONArchiver, which is based on NSJSONSerialization, which is part of Foundation and not the standard library. That’s a primary reason to use Foundation. NSJSONSerialization also deals with types like ‘Date’ which are in Foundation. Finally, the primitive Coding API uses Data, which is a Foundation type.

So in summary, I’m fine with this API being part of Foundation. Foundation is available the same places the standard library is, so it is more than acceptable to use Foundation API and types here.

- Tony

···

On Mar 16, 2017, at 1:50 PM, Slava Pestov via swift-evolution <swift-evolution@swift.org> wrote:

Finally you should take a look at the integer protocol work (https://github.com/apple/swift-evolution/blob/master/proposals/0104-improved-integers.md) to replace the repetitive code surrounding primitive types, however I don’t know if this has landed in master yet.

Slava
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Hi Slava,

Thanks for your comments!

Hi Itai,

I’m wondering what the motivation is for keeping this as part of Foundation and not the standard library. It seems like you’re landing an implementation of this in the Foundation overlay on master, and another copy of all the code will have to go into swift-corelibs-foundation. This seems suboptimal. Or are there future plans to unify the Foundation overlay with corelibs-foundation somehow?

This has to be part of Foundation because `Data`, a Foundation type, is one of the primitive types of serialization. This will be doubly true if we decide to add `Date` as another primitive type.

I agree that this is suboptimal at the moment, but we will work to find a way to keep the work in sync in a reasonable manner.

Also the implementation uses some Foundation-isms (NSMutableArray, NSNumber) and it would be nice to stick with idiomatic Swift as much as possible instead.

Using the Foundation framework is just as idiomatic in Swift… ;)
In this specific case, we need collections with reference semantics (`NSMutableArray` and `NSMutableDictionary`) and a concrete type-erased number box (`NSNumber`); there’s no reason to reinvent the wheel if we already have exactly the tools we need.

The reference implementation at the moment goes through `JSONSerialization`, which affects the specifics of its implementation. This may change in the future.

Finally you should take a look at the integer protocol work (https://github.com/apple/swift-evolution/blob/master/proposals/0104-improved-integers.md) to replace the repetitive code surrounding primitive types, however I don’t know if this has landed in master yet.

As mentioned in other emails, the list of primitive types was carefully chosen because we need to have a concrete list of types which consumers can rely on being supported, and that `Encoder`s and `Decoder`s know they _must_ support.

Specifically:

1. For binary formats, the difference between an `Int16` and an `Int64` is significant. The `Encoder` would need to know that it’s received one type or another, not just a `FixedWidthInteger`; this would involve a runtime check of the concrete type of the argument
2. Any type can conform to these protocols — nothing is preventing me from writing an `Int37` type conforming to `FixedWidthInteger` and passing it in. Most encoders would really not know what to do with this type (especially ones with binary formats), but the failure would be a runtime one instead of a static one
   * A concrete example of this is the `FloatingPoint` protocol. `Float80` conforms to the protocol, but no common binary format I’ve seen supports 80-bit floating-point values. We’d prefer to prevent that statically by accepting only `Float` and `Double`
3. Consumers of the API then have no guarantees that a specific `Encoder` supports the type that they need. Did the encoder remember to support `UInt64` values? Similarly, `Encoder`s and `Decoder`s don’t know what types they need to be considering. Am I supposed to handle `UInt8` differently from `Int16`? With a list of concrete types, this becomes immediately clear — both consumers and writers of `Encoder`s have a clear contract.
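The runtime-failure problem in point 2 can be sketched with a hypothetical binary encoder (names and error type are made up for illustration):

```swift
// Sketch (hypothetical encoder): accepting any FixedWidthInteger forces a
// runtime check of the concrete type, so an unsupported width fails
// dynamically rather than statically -- exactly the situation described above.
enum BinaryEncodingError: Error { case unsupportedWidth }

func encodeBinary<T: FixedWidthInteger>(_ value: T) throws -> [UInt8] {
    switch value {
    case let v as Int32:
        return withUnsafeBytes(of: v.littleEndian) { Array($0) }
    case let v as Int64:
        return withUnsafeBytes(of: v.littleEndian) { Array($0) }
    default:
        // A custom Int37 (or any width the encoder forgot) lands here at runtime.
        throw BinaryEncodingError.unsupportedWidth
    }
}
```

With a concrete overload list instead, the `Int37` call simply would not compile.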

···

On 16 Mar 2017, at 13:50, Slava Pestov wrote:

Slava

I agree that the protocols should be part of the standard library rather than Foundation. As far as I can make out, the only part of this proposal that actually requires Foundation is the use of the “Data” type (which itself is a strange and often frustrating omission from the standard library). The actual concrete encoders can live in Foundation.

Generally my opinion is that the proposed feature is nice. Everybody hates NSCoder and having to write those required initialisers on your UIViews and whatnot. At its core, it’s not really very different from any other Swift archiving library which exists today, except that it’s backed with layer upon layer of compiler-generated magic to make it less verbose. The things I don’t like:

1) While making things less verbose is commendable, automatically generating the encoding functions could be an anti-feature. “Codable” is for properties with persistable values only, which is a level of semantics which goes above the type-system. We don’t generate Equatable conformance for structs whose elements are all Equatable; it’s a big leap to go from “this data type is persistable” to “the value of this variable should be persisted” - for one thing, the value may not have meaning to others (e.g. a file-handle as an Int32) or it may contain sensitive user-data in a String. The encoding function isn’t just boilerplate - you *should* think about it; otherwise who knows what kind of data you’re leaking?

=> One possible solution would be to have “Codable" as a regular protocol, and refine it with “AutoCodable" which contains the magic. Just so there is a little extra step where you think “do I want this to be generated?”.

2) More generally, though, I don’t support the idea of the Foundation module introducing extensions to the Swift language that no other developer of any other module has the chance to do, with an aside that some reflection API which doesn’t exist yet might possibly make it less magic in the future. My jaw hit the floor when I saw this was part of the proposal, and that it wasn’t for the standard library. Could you imagine, if somebody proposed their own magic protocols for AlamoFire or Kitura or any other Swift library? It would be a non-starter. It *should* be a non-starter.

=> Do we have a policy about module-specific compiler magic such as this?
=> Can we move any of the magic (e.g. CodableKeys generation) to the standard library?

I develop a lot for platforms, or for scenarios, where Foundation is not supported nor desirable. Considering that people are taught to prefer library code to rolling their own, and that humans are generally quite receptive to shortcuts at the expense of correctness, if this machinery exists at all we can expect it to spread. It would totally kill libraries such as SwiftJSON or whatever else you currently use. The fact that such a fundamental and widespread feature would now live in a monolithic module I can’t access would significantly spoil the language for me.

- Karl

···

On 16 Mar 2017, at 21:48, Slava Pestov via swift-evolution <swift-evolution@swift.org> wrote:

Hi Itai,

I’m wondering what the motivation is for keeping this as part of Foundation and not the standard library. It seems like you’re landing an implementation of this in the Foundation overlay on master, and another copy of all the code will have to go into swift-corelibs-foundation. This seems suboptimal. Or are there future plans to unify the Foundation overlay with corelibs-foundation somehow?

Also the implementation uses some Foundation-isms (NSMutableArray, NSNumber) and it would be nice to stick with idiomatic Swift as much as possible instead.

Finally you should take a look at the integer protocol work (https://github.com/apple/swift-evolution/blob/master/proposals/0104-improved-integers.md) to replace the repetitive code surrounding primitive types, however I don’t know if this has landed in master yet.

Slava

Another issue of scale - I had to switch to a native mail client as replying inline severely broke my webmail client. ;-)

Again, lots of love here. Responses inline.

Proposed solution
We will be introducing the following new types:

protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.

FWIW I think this is an acceptable compromise. If the happy path is derived conformances, only-decodable or only-encodable types feel like a lazy way out on the part of a user of the API, and build a barrier to proper testing.
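The derived-conformance happy path looks like this in practice (a sketch using the JSON coder names from elsewhere in this thread; `Pet` is a made-up type):

```swift
// Sketch: round-tripping a type whose Codable conformance is fully derived.
// No hand-written encode(to:) or init(from:) is needed.
import Foundation

struct Pet: Codable, Equatable {
    var name: String
    var age: Int
}

let pet = Pet(name: "Rex", age: 3)
let data = try! JSONEncoder().encode(pet)
let decoded = try! JSONDecoder().decode(Pet.self, from: data)
assert(decoded == pet)
```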

[snip]

Structured types (i.e. types which encode as a collection of properties) encode and decode their properties in a keyed manner. Keys may be String-convertible or Int-convertible (or both), and user types which have properties should declare semantic key enums which map keys to their properties. Keys must conform to the CodingKey protocol:
public protocol CodingKey { <##snip##> }

A few things here:

The protocol leaves open the possibility of having both a String or Int representation, or neither. What should a coder do in either case? Are the representations intended to be mutually exclusive, or not? The protocol design doesn’t seem particularly matching with the flavor of Swift; I’d expect something along the lines of a CodingKey enum and the protocol CodingKeyRepresentable. It’s also possible that the concerns of the two are orthogonal enough that they deserve separate container(keyedBy:) requirements.

The general answer to "what should a coder do" is "what is appropriate for its format". For a format that uses exclusively string keys (like JSON), the string representation (if present on a key) will always be used. If the key has no string representation but does have an integer representation, the encoder may choose to stringify the integer. If the key has neither, it is appropriate for the `Encoder` to fail in some way.

On the flip side, for totally flat formats, an `Encoder` may choose to ignore keys altogether, in which case it doesn’t really matter. The choice is up to the `Encoder` and its format.

The string and integer representations are not meant to be mutually exclusive at all, and in fact, where relevant, we encourage providing both types of representations for flexibility.

As for the possibility of having neither representation, this question comes up often. I’d like to summarize the thought process here by quoting some earlier review (apologies for the poor formatting from my mail client):

If there are two options, each of which is itself optional, we have 4 possible combinations. But! At the same time we prohibit one combination by what? Runtime error? Why not use a 3-case enum for it? Even further down the rabbit hole there might be a CodingKey<> specialized for a concrete combination, like CodingKey<StringAndIntKey> or just CodingKey<StringKey>, but I’m not sure whether our type system will make it useful or possible…

public enum CodingKeyValue {
  case integer(value: Int)
  case string(value: String)
  case both(intValue: Int, stringValue: String)
}
public protocol CodingKey {
  init?(value: CodingKeyValue)
  var value: CodingKeyValue { get }
}

I agree that this certainly feels suboptimal. We’ve certainly explored other possibilities before sticking to this one, so let me try to summarize here:

* Having a concrete 3-case CodingKey enum would preclude the possibility of having neither a stringValue nor an intValue. However, there is a lot of value in having the key types belong to the type being encoded (more safety, impossible to accidentally mix key types, private keys, etc.); if the CodingKey type itself is an enum (which cannot be inherited from), then this prevents differing key types.
* Your solution as presented is better: CodingKey itself is still a protocol, and the value itself is the 3-case enum. However, since CodingKeyValue is not literal-representable, user keys cannot be enums RawRepresentable by CodingKeyValue. That means that the values must either be dynamically returned, or (for attaining the benefits that we want to give users — easy representation, autocompletion, etc.) the type has to be a struct with static lets on it giving the CodingKeyValues. This certainly works, but is likely not what a developer would have in mind when working with the API; the power of enums in Swift makes them very easy to reach for, and I’m thinking most users would expect their keys to be enums. We’d like to leverage that where we can, especially since RawRepresentable enums are appropriate in the vast majority of use cases.
* Three separate CodingKey protocols (one for Strings, one for Ints, and one for both). You could argue that this is the most correct version, since it most clearly represents what we’re looking for. However, this means that every method now accepting a CodingKey must be converted into 3 overloads each accepting different types. This explodes the API surface, is confusing for users, and also makes it impossible to use CodingKey as an existential (unless it’s an empty 4th protocol which makes no static guarantees and which the others inherit from).
* [The current] approach. On the one hand, this allows for the accidental representation of a key with neither a stringValue nor an intValue. On the other, we want to make it really easy to use autogenerated keys, or autogenerated key implementations if you provide the cases and values yourself. The nil value possibility is only a concern when writing stringValue and intValue yourself, which the vast majority of users should not have to do.
  * Additionally, a key word in that sentence bolded above is “generally”. As part of making this API more generalized, we push a lot of decisions to Encoders and Decoders. For many formats, it’s true that having a key with no value is an error, but this is not necessarily true for all formats; for a linear, non-keyed format, it is entirely reasonable to ignore the keys in the first place, or replace them with fixed-format values. The decision of how to handle this case is left up to Encoders and Decoders; for most formats (and for our implementations), this is certainly an error, and we would likely document this and either throw or preconditionFailure. But this is not always the case.
* In terms of syntax, there’s another approach that would be really nice (but is currently not feasible) — if enums were RawRepresentable in terms of tuples, it would be possible to give implementations for String, Int, (Int, String), (String, Int), etc., making this condition harder to represent by default unless you really mean to.

Hope that gives some helpful background on this decision. FWIW, the only way to end up with a key having no `intValue` or `stringValue` is manually implementing the `CodingKey` protocol (which should be _exceedingly_ rare) and implementing the methods by not switching on `self`, or some other method that would allow you to forget to give a key neither value.

Speaking of the mutually exclusive representations - what about serializations that don’t code as one of those two things? YAML can have anything be a “key”, and despite that being not particularly sane, it is a use case.

We’ve explored this, but at the end of the day, it’s not possible to generalize this to the point where we could represent all possible options on all possible formats because you cannot make any promises as to what’s possible and what’s not statically.

We’d like to strike a balance here between strong static guarantees on one end (the extreme end of which introduces a new API for every single format, since you can almost perfectly statically express what’s possible and what isn’t) and generalization on the other (the extreme end of which is an empty protocol because there really are encoding formats which are mutually exclusive). So in this case, this API would support producing and consuming YAML with string or integer keys, but not arbitrary YAML.

For most types, String-convertible keys are a reasonable default; for performance, however, Int-convertible keys are preferred, and Encoders may choose to make use of Ints over Strings. Framework types should provide keys which have both for flexibility and performance across different types of Encoders. It is generally an error to provide a key which has neither a stringValue nor an intValue.

Could you speak a little more to using Int-convertible keys for performance? I get the feeling int-based keys parallel the legacy of NSCoder’s older design, and I don’t really see anyone these days supporting non-keyed archivers. They strike me as fragile. What other use cases are envisioned for ordered archiving than that?

We agree that integer keys are fragile, and from years (decades) of experience with `NSArchiver`, we are aware of the limitations that such encoding offers. For this reason, we will never synthesize integer keys on your behalf. This is something you must put thought into, if using an integer key for archival.

However, there are use-cases (both in archival and in serialization, but especially so in serialization) where integer keys are useful. Ordered encoding is one such possibility (when the format supports it, integer keys are sequential, etc.), and is helpful for, say, marshaling objects in an XPC context (where both sides are aware of the format, are running the same version of the same code, on the same device) — keys waste time and bandwidth unnecessarily in some cases.

Integer keys don’t necessarily imply ordered encoding, however. There are binary encoding formats which support integer-keyed dictionaries (read: serialized hash maps) which are more efficient to encode and decode than similar string-keyed ones. In that case, as long as integer keys are chosen with care, the end result is more performant.

But again, this depends on the application and use case. Defining integer keys requires manual effort because we want thought put into defining them; they are indeed fragile when used carelessly.
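A key type that provides both representations, as recommended above for framework types, might be written out fully like this (the enum name, cases, and numbering are illustrative):

```swift
// Sketch: a hand-written CodingKey conformance providing both a string and
// an integer representation, with the integer values chosen deliberately.
enum PersonKeys: CodingKey {
    case name
    case age

    var stringValue: String {
        switch self {
        case .name: return "name"
        case .age:  return "age"
        }
    }
    init?(stringValue: String) {
        switch stringValue {
        case "name": self = .name
        case "age":  self = .age
        default: return nil
        }
    }
    var intValue: Int? {
        switch self {
        case .name: return 0
        case .age:  return 1
        }
    }
    init?(intValue: Int) {
        switch intValue {
        case 0: self = .name
        case 1: self = .age
        default: return nil
        }
    }
}
```

A string-keyed encoder would use `"name"`/`"age"`, while a compact binary encoder could use `0`/`1`.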

[snip]

Keyed Encoding Containers

Keyed encoding containers are the primary interface that most Codable types interact with for encoding and decoding. Through these, Codable types have strongly-keyed access to encoded data by using keys that are semantically correct for the operations they want to express.

Since semantically incompatible keys will rarely (if ever) share the same key type, it is impossible to mix up key types within the same container (as is possible with String keys), and since the type is known statically, keys get autocompletion by the compiler.

open class KeyedEncodingContainer<Key : CodingKey> {

Like others, I’m a little bummed about this part of the design. Your reasoning up-thread is sound, but I chafe a bit on having to reabstract and a little more on having to be a reference type. Particularly knowing that it’s got a bit more overhead involved… I /like/ that NSKeyedArchiver can simply push some state and pass itself as the next encoding container down the stack.

There’s not much more to be said about why this is a `class` that I haven’t covered; if it were possible to do otherwise at the moment, then we would.

As for _why_ we do this — this is the crux of the whole API. We not only want to make it easy to use a custom key type that is semantically correct for your type, we want to make it difficult to do the easy but incorrect thing. From experience with `NSKeyedArchiver`, we’d like to move away from unadorned string (and integer) keys, where typos and accidentally reused keys are common, and impossible to catch statically.
`encode<T : Codable>(_: T?, forKey: String)` unfortunately not only encourages code like `encode(foo, forKey: "foi") // whoops, typo`, it is _more difficult_ to use a semantic key type: `encode(foo, forKey: CodingKeys.foo.stringValue)`. The additional typing and lack of autocompletion make it an active disincentive. `encode<T : Codable>(_: T?, forKey: Key)` reverses both of these — it makes it impossible to use unadorned strings or accidentally use keys from another type, and nets shorter code with autocompletion: `encode(foo, forKey: .foo)`.

The side effect of this being the fact that keyed containers are classes is suboptimal, I agree, but necessary.
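The intended usage pattern can be sketched as follows (`Point` is a made-up type; the container syntax here follows the shape the API eventually shipped with, where containers became value types):

```swift
// Sketch: a hand-written encode(to:) using a semantic key type. The key
// enum makes typos a compile error and keeps call sites short.
struct Point: Codable {
    var x: Int
    var y: Int

    private enum CodingKeys: String, CodingKey { case x, y }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(x, forKey: .x) // `.z` here would not compile
        try container.encode(y, forKey: .y)
    }
}
```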

    open func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws

Does this win anything over taking a Codable?

Taking the concrete type over an existential allows for static dispatch on the type within the implementation, and is a performance win in some cases.

    open func encode(_ value: Bool?, forKey key: Key) throws
    open func encode(_ value: Int?, forKey key: Key) throws
    open func encode(_ value: Int8?, forKey key: Key) throws
    open func encode(_ value: Int16?, forKey key: Key) throws
    open func encode(_ value: Int32?, forKey key: Key) throws
    open func encode(_ value: Int64?, forKey key: Key) throws
    open func encode(_ value: UInt?, forKey key: Key) throws
    open func encode(_ value: UInt8?, forKey key: Key) throws
    open func encode(_ value: UInt16?, forKey key: Key) throws
    open func encode(_ value: UInt32?, forKey key: Key) throws
    open func encode(_ value: UInt64?, forKey key: Key) throws
    open func encode(_ value: Float?, forKey key: Key) throws
    open func encode(_ value: Double?, forKey key: Key) throws
    open func encode(_ value: String?, forKey key: Key) throws
    open func encode(_ value: Data?, forKey key: Key) throws

What is the motivation behind abandoning the idea of “primitives” from the Alternatives Considered? Performance? Being unable to close the protocol?

Being unable to close the protocol is the primary reason. Not being able to tell at a glance what the concrete types belonging to this set are is related, and also a top reason.

What ways is encoding a value envisioned to fail? I understand wanting to allow maximum flexibility, and being symmetric to `decode` throwing, but there are plenty of “conversion” patterns the are asymmetric in the ways they can fail (Date formatters, RawRepresentable, LosslessStringConvertible, etc.).

Different formats support different concrete values, even of primitive types. For instance, you cannot natively encode `Double.nan` in JSON, but you can in plist. Without additional options on `JSONEncoder`, `encode(Double.nan, forKey: …)` will throw.
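A sketch of the behavior described above, using the `JSONEncoder` name from this thread (`Reading` is a made-up type):

```swift
// Sketch: JSON has no literal for NaN or infinity, so encoding one throws
// by default; a plist encoder would accept the same value.
import Foundation

struct Reading: Codable {
    var value: Double
}

do {
    _ = try JSONEncoder().encode(Reading(value: .nan))
    print("encoded")
} catch {
    print("threw: JSON cannot represent NaN")
}
```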

    /// For `Encoder`s that implement this functionality, this will only encode the given object and associate it with the given key if it encoded unconditionally elsewhere in the archive (either previously or in the future).
    open func encodeWeak<Object : AnyObject & Codable>(_ object: Object?, forKey key: Key) throws

Is it correct that if I send a Cocoa-style object graph (with weak backrefs), an encoder could infinitely recurse? Or is a coder supposed to detect that?

`encodeWeak` has a default implementation that calls the regular `encode<T : Codable>(_: T, forKey: Key)`; only formats which actually support weak backreferencing should override this implementation, so it should always be safe to call (it will simply unconditionally encode the object by default).
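The forwarding default described above can be sketched with hypothetical stand-in types (none of these names are part of the proposed API):

```swift
// Sketch: the default encodeWeak simply forwards to unconditional encoding,
// so it is always safe to call on formats without back-reference support.
class MiniKeyedEncodingContainer<Key: CodingKey> {
    private(set) var encodedKeys: [String] = []

    func encode<T: Codable>(_ value: T?, forKey key: Key) throws {
        encodedKeys.append(key.stringValue) // format-specific work elided
    }

    // Default: no weak handling; encode the object unconditionally.
    // A graph-aware encoder's container would override this.
    func encodeWeak<Object: AnyObject & Codable>(_ object: Object?, forKey key: Key) throws {
        try encode(object, forKey: key)
    }
}
```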

    open var codingKeyContext: [CodingKey]
}
[snippity snip]

Alright, those are just my first thoughts. I want to spend a little time marinating in the code from PR #8124 before I comment further. Cheers! I owe you, Michael, and Tony a few drinks for sure.

Hehe, thanks :)

···

On 15 Mar 2017, at 22:58, Zach Waldowski wrote:

On Mar 15, 2017, at 6:40 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

Zach Waldowski
zach@waldowski.me

This is a fantastic proposal! I am very much looking forward to robust Swift-native encoding and decoding in Foundation. The compiler synthesized conformances is especially great! I want to thank everyone who worked on it. It is clear that a lot of work went into the proposal.

The proposal covers a lot of ground, so I’m breaking my comments up by topic in the order they occur in the proposal.

Thanks for the feedback, Matthew! Responses inline.

Encode / Decode only types:

Brent raised the question of decode only types. Encode only types are also not uncommon when an API accepts an argument payload that gets serialized into the body of a request. The compiler synthesis feature in the proposal makes providing both encoding and decoding easy in common cases but this won’t always work as needed.

The obvious alternative is to have Decodable and Encodable protocols which Codable refines. This would allow us to omit a conformance we don’t need when it can’t be synthesized.

If conformances are still synthesized individually (i.e. for just `Decodable` or just `Encodable`), it would be way too easy to accidentally conform to one or the other and not realize that you’re not conforming to `Codable`, since the synthesis is invisible. You’d just be missing half of the protocol.

If the way out of this is to only synthesize conformance to `Codable`, then it’s much harder to justify the inclusion of `Encodable` or `Decodable` since those would require a manual implementation and would much more rarely be used.

Your reply to Brent mentions using `fatalError` to avoid implementing the direction that isn't needed. I think it would be better if the conformance can reflect what is actually supported by the type. Requiring us to write `fatalError` as a stub for functionality we don’t need is a design problem IMO. I don’t think the extra protocols are really that big a burden. They don’t add any new functionality and are very easy to understand, especially considering the symmetry they would have with the other types you are introducing.

Coding Keys:

As others have mentioned, the design of this protocol does not require a value of a conforming type to actually be a valid key (it can return nil for both `intValue` and `stringValue`). This seems problematic to me.

In the reply to Brent again you mention throwing and `preconditionFailure` as a way to handle incompatible keys. This also seems problematic to me and feels like a design problem. If we really need to support more than one underlying key type and some encoders will reject some key types this information should be captured in the type system. An encoder would only vend a keyed container for keys it actually supports. Ideally the conformance of a type’s CodingKeys could be leveraged to produce a compiler error if an attempt was made to encode this type into an encoder that can’t support its keys. In general, the idea is to produce static errors as close to the origin of the programming mistake as possible.

I would very much prefer that we don’t defer to runtime assertions or thrown errors, etc for conditions that could be caught statically at compile time given the right design. Other comments have indicated that static guarantees are important to the design (encoders *must* guarantee support of primitives specified by the protocols, etc). Why is a static guarantee of compatible coding keys considered less important?

I agree that it would be nice to support this in a static way, but while not impossible to represent in the type system, it absolutely explodes the API into a ton of different types and protocols which are not dissimilar. We’ve considered this in the past (see approach #4 in the [Alternatives Considered](https://github.com/itaiferber/swift-evolution/blob/637532e2abcbdb9861e424359bb6dac99dc6b638/proposals/XXXX-swift-archival-serialization.md#alternatives-considered) section) and moved away from it for a reason.

To summarize:

* To statically represent the difference between an encoder which supports string keys and one which supports integer keys, we would have to introduce two different protocol types (say, `StringKeyEncoder` and `IntKeyEncoder`)
* Now that there are two different encoder types, the `Codable` protocol also needs to be split up into two — one version which encodes using a `StringKeyEncoder` and one version which encodes using an `IntKeyEncoder`. If you want to support encoding to an encoder which supports both types of keys, we’d need a _third_ Codable protocol which takes something that’s `StringKeyEncoder & IntKeyEncoder` (because you cannot just conform to both `StringCodable` and `IntCodable` — it’s ambiguous when given something that’s `StringKeyEncoder & IntKeyEncoder`)
* On encoders which support both string and integer keys, you need overloads for `encode<T : StringCodable>(…)`, `encode<T : IntCodable>(…)`, and `encode<T : StringCodable & IntCodable>(…)` or else the call is ambiguous
   * Repeat for both `encode<T : …>(_ t: T?, forKey: String)` and `encode<T : …>(_ t: T?, forKey: Int)`
* Repeat for decoders, with all of their overloads as well

This is not to mention the issue of things which are single-value encodable, which adds additional complexity. Overall, the complexity of this makes it unapproachable and confusing as API, and is a hassle for both consumers and for `Encoder`/`Decoder` writers.

We are willing to make the runtime failure tradeoff for keeping the rest of the API consumable with the understanding that we expect that the vast majority of `CodingKey` conformances will be automatically generated, and that type authors will generally provide key types which are appropriate for the formats they expect to encode their own types in.
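To make that combinatorial growth concrete, here is a rough sketch of the split-protocol design, with every name hypothetical (this mirrors the rejected alternative #4, not any proposed API):

```swift
// Hypothetical sketch only: what statically-typed key support would require.
// One encoder protocol per supported key kind:
protocol StringKeyEncoder {
    func encode<T : StringCodable>(_ value: T?, forKey key: String) throws
}
protocol IntKeyEncoder {
    func encode<T : IntCodable>(_ value: T?, forKey key: Int) throws
}

// Codable splits to match, plus a third protocol for types that need an
// encoder supporting both key kinds:
protocol StringCodable {
    func encode(to encoder: StringKeyEncoder) throws
}
protocol IntCodable {
    func encode(to encoder: IntKeyEncoder) throws
}
protocol DualCodable {
    func encode(to encoder: StringKeyEncoder & IntKeyEncoder) throws
}

// Every conforming type must commit to one hierarchy up front:
struct Version : StringCodable {
    let value: Int
    func encode(to encoder: StringKeyEncoder) throws {
        // Body elided; the same fan-out repeats on the decoding side.
    }
}
```

Multiply this by the overloads per key type and the decoder-side mirror images, and the protocol surface quickly becomes unmanageable, which is the tradeoff being described here.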

Keyed Containers:

Joe raised the topic of the alternative of using manual type erasure for the keyed containers rather than abstract classes. Did you explore this direction at all? It feels like a more natural approach for Swift and, as Joe noted, it can be designed in such a way that eases migration to existentials in the future if that is the “ideal” design (which you mentioned in your response).

Joe mentions the type-erased types as an example of where we’ve had to use those because we’re lacking other features — I don’t see how type erasure would be the solution. We’re doing the opposite of type erasure: we’re trying to offer an abstract type that is generic and _specialized_ on a type you give it. The base `KeyedEncodingContainer` is effectively a type-erased base type, but it’s the generics that we really need.

Decoding Containers:

returns: A value of the requested type, if present for the given key and convertible to the requested type.

Can you elaborate on what “convertible to the requested type” means? I think this is an important detail for the proposal.

For example, I would expect numeric values to attempt conversion using the SE-0080 failable numeric conversion initializers (decoding JSON was a primary motivating use case for that proposal). If the requested type conforms to RawRepresentable and the encoded value can be converted to RawValue (perhaps using a failable numeric initializer) I would expect the raw value initializer to be used to attempt conversion. If Swift ever gained a standard mechanism for generalized value conversion I would also expect that to be used if a conversion is available from the encoded type to the requested type.

If either of those conversions fail I would expect something like an “invalid value” error or a “conversion failure” error rather than a “type mismatch” error. The types don’t exactly mismatch, we just have a failable conversion process that did not succeed.
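For reference, the SE-0080 failable initializers mentioned here are already in the standard library and return `nil` exactly when the conversion would lose information:

```swift
// SE-0080 failable numeric conversions (standard library API).
let exact = Int(exactly: 1.0)        // succeeds: 1.0 is exactly representable
let lossy = Int(exactly: 3.14)       // nil: would lose the fractional part
let overflow = UInt8(exactly: 300)   // nil: 300 overflows UInt8
```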

Keep in mind that no type information is written into the payload, so the interpretation of this is up to the `Encoder` and its format.
JSON, for instance, has no delineation between number types. For `{"value": 1}`, you should be able to `decode(…, forKey: .value)` the value through any one of the numeric types, since 1 is representable by any of them. However, requesting it as a `String` should throw a `.coderTypeMismatch`.

If you try to ask for `3.14` as an `Int`, I think it’s valid to get a `.coderTypeMismatch` — you asked for something of the wrong type altogether. I don’t see much value in providing a different error type to represent the same thing.
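A minimal illustration of that behavior, using `JSONDecoder` as it eventually shipped (where the error is spelled `DecodingError.typeMismatch` rather than the proposal's `.coderTypeMismatch`):

```swift
import Foundation

let payload = Data("{\"value\": 1}".utf8)

// 1 carries no number-type information, so any numeric request succeeds:
let asInt = try! JSONDecoder().decode([String: Int].self, from: payload)["value"]!
let asDouble = try! JSONDecoder().decode([String: Double].self, from: payload)["value"]!

// Requesting the same value as a String is a type mismatch:
do {
    _ = try JSONDecoder().decode([String: String].self, from: payload)
} catch DecodingError.typeMismatch {
    // Thrown: the payload holds a number, not a string.
} catch {
    // No other error is expected here.
}
```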

Context:

I’m glad Brent raised the topic of supporting multiple formats. In his example the difference was remote and local formats. I’ve also seen cases where the same API requires the same model to be encoded differently in different endpoints. This is awful, but it also happens sometimes in the wild. Supporting an application-specified encoding context would be very useful for handling these situations (the `codingKeyContext` would not always be sufficient and would usually not be a good way to determine the required encoding or decoding format).

A `[UserInfoKey: Any]` context was mentioned as a possibility. This would be better than nothing, but why not use the type system to preserve information about the type of the context? I have a slightly different approach in mind. Why not just have a protocol that refines Codable with context awareness?

public protocol ContextAwareCodable: Codable {
    associatedtype Context
    init(from decoder: Decoder, with context: Context?) throws
    func encode(to encoder: Encoder, with context: Context?) throws
}
extension ContextAwareCodable {
    init(from decoder: Decoder) throws {
        try self.init(from: decoder, with: nil)
    }
    func encode(to encoder: Encoder) throws {
        try self.encode(to: encoder, with: nil)
    }
}

There are a few issues with this:

1. For an `Encoder` to be able to call `encode(to:with:)` with the `Context` type, it would have to know the `Context` type statically. That means it would have to be generic on the `Context` type (we’ve looked at this in the past with regards to the encoder declaring the type of key it’s willing to accept)
2. It makes more sense for the `Encoder` to define what context type it vends, rather than have `Codable` things discriminate on what they can accept. If you have two types in a payload which require mutually-exclusive context types, you cannot encode the payload at all
3. `associatedtype` requirements cannot be overridden by subclasses. That means if you subclass a `ContextAwareCodable` type, you cannot require a different context type than your parent, which may be a no-go. This, by the way, is why we don’t have an official `associatedtype CodingKeys : CodingKey` requirement on `Codable`

Encoders and Decoders would be encouraged to support a top level encode / decode method which is generic and takes an application supplied context. When the context is provided it would be given to all `ContextAwareCodable` types that are aware of this context type during encoding or decoding. The coding container protocols would include an overload for `ContextAwareCodable` allowing the container to know whether the Value understands the context given to the top level encoder / decoder:

open func encode<Value : ContextAwareCodable>(_ value: Value?, forKey key: Key) throws

A common top-level signature for encoders and decoders would look something like this:

open func encode<Value : Codable, Context>(_ value: Value, with context: Context) throws -> Data

This approach would preserve type information about the encoding / decoding context. It falls back to the basic Codable implementation when a Value doesn’t know about the current context. The default implementation simply translates this to a nil context allowing ContextAwareCodable types to have a single implementation of the initializer and the encoding method that is used whether they are able to understand the current context or not.

I amend the above — this means that if you have two types which require different contexts in the same payload, only one of them will get the context and the other silently will not. I’m not sure this is better.

A slightly more type-erased context type would allow all members to look at the context if desired without having to split the protocol, require multiple different types and implementations, etc.
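A minimal sketch of that type-erased approach, using the spelling Foundation eventually shipped (`userInfo: [CodingUserInfoKey : Any]` on encoders and decoders); the key name and the `Temperature` type here are hypothetical:

```swift
import Foundation

// Hypothetical context key; members that don't recognize it never look it up.
let formatKey = CodingUserInfoKey(rawValue: "endpointFormat")!

struct Temperature : Encodable {
    var celsius: Double

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        // Any member may inspect the type-erased context if desired:
        if encoder.userInfo[formatKey] as? String == "fahrenheit" {
            try container.encode(celsius * 9 / 5 + 32)
        } else {
            try container.encode(celsius)
        }
    }
}

let encoder = JSONEncoder()
encoder.userInfo = [formatKey: "fahrenheit"]
let data = try! encoder.encode([Temperature(celsius: 100)])
```

Types that don't care about the key simply ignore it, which sidesteps the mutually-exclusive-context problem without splitting the protocol.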

···

On 16 Mar 2017, at 14:29, Matthew Johnson wrote:

- Matthew

Hi Kenny,

Hi All.

Forgive me if I missed it - I haven’t read the proposal in full detail - but it seems to make no mention of archiving graphs with circular references. Is this implicitly supported, or explicitly unsupported?

We expect this to be left up to the encoders and decoders. In Foundation itself, we actually have subclasses of NSCoder that support this and subclasses that do not.

While we’re at it, my only real exposure to archiving is through Foundation, so I’d like to know how everybody else understands these terms:

serialization - the process of “flattening” out an object graph into a serial stream of objects

encoding - the process of converting internal object data into an external format

archiving - the whole enchilada of serialization + encoding

Thanks!

-Kenny

Good question about terminology. Here is how we’ve tried to define these:

Serialization: conversion of a small fixed set of types to a data format and back
  ex. NSJSONSerialization, NSPropertyListSerialization

Encoding: conversion of an arbitrary type to a smaller set of serialized types and back
  ex: Encoder, Decoder, JSONEncoder, PropertyListEncoder

Archiver: In the ObjC Foundation, the objects which do the encoding. In this proposal we have chosen not to re-use this term to avoid confusion. Instead we tried to simplify the terminology by calling the top level concrete object an encoder.

- Tony

···

On Mar 19, 2017, at 9:53 PM, Kenny Leung via swift-evolution <swift-evolution@swift.org> wrote:

Hi Colin,

Thanks for your comments! Are you talking about `Codable` synthesis, or encoding in general?

Hi Itai,

Glad to see this proposal! I'm curious, have you or the other Swift folks
thought about how *users* of these new Codable protocols will interact with
resilience domains?

What I mean is that what appear to be private or internal identifiers, and
thus changeable at will, may actually be fragile in that changing them will
break the ability to decode archives encoded by previous versions.

Making this safer could mean:
- Encoding only public properties

Unfortunately, property accessibility in code does not always map 1-to-1 with accessibility for archival (nor do I think they should be tied to one another).
There are certainly cases where you’d want to include private information in an archive, but that is not useful to expose to external clients, e.g., a struct/class version:

public struct MyFoo {
     // Should be encoded.
     public var title: String
     public var identifier: Int

     // This should be encoded too — in case the struct changes in the
     // future, want to be able to refer to the payload version.
     private let version = 1.0
}

Of course, there can also be public properties that you don’t find useful to encode. At the moment, I’m not sure there’s a much better answer than "the author of the code will have to think about the representation of their data"; even if there were an easier way to annotate "I definitely want this to be archived"/"I definitely don’t want this to be archived", the annotation would still need to be manual.

(The above applies primarily in the case of `Codable` synthesis; when implementing `Codable` manually I don’t think the compiler should ever prevent you from doing what you need.)

- Adding some form of indirection (a la ObjC non-fragile ivars?)

What do you mean by this?

- Compiler warning (or disallowing) changes to properties in certain
situations.

We’ve thought about this with regards to identifying classes uniquely across renaming, moving modules, etc.; this is a resilience problem in general.
In order for the compiler to know about changes to your code it’d need to keep state across compilations. While possible, this feels pretty fragile (and potentially not very portable).

* Compiler warns about changing a property? Blow away the cache directory!
* Cloning the code to a new machine for the first time? Hmm, all the warnings went away…

This would be nice to have, but yes:

I imagine the specifics would need to follow the rest of the plans for
resilience.

specifics on this would likely be in line with the rest of resilience plans for Swift in general.

···

On 21 Mar 2017, at 8:44, Colin Barrett wrote:

It's likely that this could be addressed by a future proposal, as for the
time being developers can simply "not hold it wrong" ;)

Thanks,
-Colin

On Wed, Mar 15, 2017 at 6:52 PM Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

Hi everyone,

The following introduces a new Swift-focused archival and serialization
API as part of the Foundation framework. We’re interested in improving the
experience and safety of performing archival and serialization, and are
happy to receive community feedback on this work.
Because of the length of this proposal, the *Appendix* and *Alternatives
Considered* sections have been omitted here, but are available in the full
proposal <https://github.com/apple/swift-evolution/pull/639> on the
swift-evolution repo. The full proposal also includes an *Unabridged API* for
further consideration.

Without further ado, inlined below.

— Itai

Swift Archival & Serialization

   - Proposal: SE-NNNN <https://github.com/apple/swift-evolution/pull/639>
   - Author(s): Itai Ferber <https://github.com/itaiferber>, Michael LeHew
   <https://github.com/mlehew>, Tony Parker <https://github.com/parkera>
   - Review Manager: TBD
   - Status: *Awaiting review*
   - Associated PRs:
      - #8124 <https://github.com/apple/swift/pull/8124>
      - #8125 <https://github.com/apple/swift/pull/8125>

Introduction

Foundation's current archival and serialization APIs (NSCoding,
NSJSONSerialization, NSPropertyListSerialization, etc.), while fitting
for the dynamism of Objective-C, do not always map optimally into Swift.
This document lays out the design of an updated API that improves the
developer experience of performing archival and serialization in Swift.

Specifically:

   - It aims to provide a solution for the archival of Swift struct and
   enum types
   - It aims to provide a more type-safe solution for serializing to
   external formats, such as JSON and plist

Motivation

The primary motivation for this proposal is the inclusion of native Swift
enum and struct types in archival and serialization. Currently,
developers targeting Swift cannot participate in NSCoding without being
willing to abandon enum and struct types — NSCoding is an @objc protocol,
conformance to which excludes non-class types. This can be limiting in
Swift because small enums and structs can be an idiomatic approach to
model representation; developers who wish to perform archival have to
either forgo the Swift niceties that constructs like enums provide, or
provide an additional compatibility layer between their "real" types and
their archivable types.

Secondarily, we would like to refine Foundation's existing serialization
APIs (NSJSONSerialization and NSPropertyListSerialization) to better
match Swift's strong type safety. From experience, we find that the
conversion from the unstructured, untyped data of these formats into
strongly-typed data structures is a good fit for archival mechanisms,
rather than taking the less safe approach that 3rd-party JSON conversion
approaches have taken (described further in an appendix below).

We would like to offer a solution to these problems without sacrificing
ease of use or type safety.
Agenda

This proposal is the first stage of three that introduce different facets
of a whole Swift archival and serialization API:

   1. This proposal describes the basis for this API, focusing on the
   protocols that users adopt and interface with
   2. The next stage will propose specific API for new encoders
   3. The final stage will discuss how this new API will interop with
   NSCoding as it is today

SE-NNNN provides stages 2 and 3.
Proposed solution

We will be introducing the following new types:

   - protocol Codable: Adopted by types to opt into archival. Conformance
   may be automatically derived in cases where all properties are also
   Codable.
   - protocol CodingKey: Adopted by types used as keys for keyed
   containers, replacing String keys with semantic types. Conformance may
   be automatically derived in most cases.
   - protocol Encoder: Adopted by types which can take Codable values and
   encode them into a native format.
      - class KeyedEncodingContainer<Key : CodingKey>: Subclasses of this
      type provide a concrete way to store encoded values by CodingKey.
      Types adopting Encoder should provide subclasses of
      KeyedEncodingContainer to vend.
      - protocol SingleValueEncodingContainer: Adopted by types which
      provide a concrete way to store a single encoded value. Types adopting
      Encoder should provide types conforming to
      SingleValueEncodingContainer to vend (but in many cases will be
      able to conform to it themselves).
   - protocol Decoder: Adopted by types which can take payloads in a
   native format and decode Codable values out of them.
      - class KeyedDecodingContainer<Key : CodingKey>: Subclasses of this
      type provide a concrete way to retrieve encoded values from storage by
      CodingKey. Types adopting Decoder should provide subclasses of
      KeyedDecodingContainer to vend.
      - protocol SingleValueDecodingContainer: Adopted by types which
      provide a concrete way to retrieve a single encoded value from storage.
      Types adopting Decoder should provide types conforming to
      SingleValueDecodingContainer to vend (but in many cases will be
      able to conform to it themselves).

For end users of this API, adoption will primarily involve the Codable
and CodingKey protocols. In order to participate in this new archival
system, developers must add Codable conformance to their types:

// If all properties are Codable, implementation is automatically derived:
public struct Location : Codable {
    public let latitude: Double
    public let longitude: Double
}

public enum Animal : Int, Codable {
    case chicken = 1
    case dog
    case turkey
    case cow
}

public struct Farm : Codable {
    public let name: String
    public let location: Location
    public let animals: [Animal]
}

With developer participation, we will offer encoders and decoders
(described in SE-NNNN, not here) that take advantage of this conformance to
offer type-safe serialization of user models:

let farm = Farm(name: "Old MacDonald's Farm",
                location: Location(latitude: 51.621648, longitude: 0.269273),
                animals: [.chicken, .dog, .cow, .turkey, .dog, .chicken, .cow, .turkey, .dog])
let payload: Data = try JSONEncoder().encode(farm)

do {
    let farm = try JSONDecoder().decode(Farm.self, from: payload)

    // Extracted as user types:
    let coordinates = "\(farm.location.latitude, farm.location.longitude)"
} catch {
    // Encountered error during deserialization
}

This gives developers access to their data in a type-safe manner and a
recognizable interface.
Detailed design

To support user types, we expose the Codable protocol:

/// Conformance to `Codable` indicates that a type can marshal itself into and out of an external representation.
public protocol Codable {
    /// Initializes `self` by decoding from `decoder`.
    ///
    /// - parameter decoder: The decoder to read data from.
    /// - throws: An error if reading from the decoder fails, or if read data is corrupted or otherwise invalid.
    init(from decoder: Decoder) throws

    /// Encodes `self` into the given encoder.
    ///
    /// If `self` fails to encode anything, `encoder` will encode an empty `.default` container in its place.
    ///
    /// - parameter encoder: The encoder to write data to.
    /// - throws: An error if any values are invalid for `encoder`'s format.
    func encode(to encoder: Encoder) throws
}

By adopting Codable, user types opt in to this archival system.

Structured types (i.e. types which encode as a collection of properties)
encode and decode their properties in a keyed manner. Keys may be String-convertible
or Int-convertible (or both), and user types which have properties should
declare semantic key enums which map keys to their properties. Keys must
conform to the CodingKey protocol:

/// Conformance to `CodingKey` indicates that a type can be used as a key for encoding and decoding.
public protocol CodingKey {
    /// The string to use in a named collection (e.g. a string-keyed dictionary).
    var stringValue: String? { get }

    /// Initializes `self` from a string.
    ///
    /// - parameter stringValue: The string value of the desired key.
    /// - returns: An instance of `Self` from the given string, or `nil` if the given string does not correspond to any instance of `Self`.
    init?(stringValue: String)

    /// The int to use in an indexed collection (e.g. an int-keyed dictionary).
    var intValue: Int? { get }

    /// Initializes `self` from an integer.
    ///
    /// - parameter intValue: The integer value of the desired key.
    /// - returns: An instance of `Self` from the given integer, or `nil` if the given integer does not correspond to any instance of `Self`.
    init?(intValue: Int)
}

For most types, String-convertible keys are a reasonable default; for
performance, however, Int-convertible keys are preferred, and Encoders may
choose to make use of Ints over Strings. Framework types should provide
keys which have both for flexibility and performance across different types
of Encoders. It is generally an error to provide a key which has neither
a stringValue nor an intValue.

By default, CodingKey conformance can be derived for enums which have
either String or Int backing:

enum Keys1 : CodingKey {
    case a // (stringValue: "a", intValue: nil)
    case b // (stringValue: "b", intValue: nil)
}

enum Keys2 : String, CodingKey {
    case c = "foo" // (stringValue: "foo", intValue: nil)
    case d // (stringValue: "d", intValue: nil)
}

enum Keys3 : Int, CodingKey {
    case e = 4 // (stringValue: "e", intValue: 4)
    case f // (stringValue: "f", intValue: 5)
    case g = 9 // (stringValue: "g", intValue: 9)
}

Coding keys which are not enums, have associated values, or have other
raw representations must implement these methods manually.
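For illustration, a hand-written key type (hypothetical `AnyKey`) that supplies both representations; note this is written against `CodingKey` as it eventually shipped, where `stringValue` ended up non-optional:

```swift
// A manually implemented key type carrying both string and integer
// representations (hypothetical example, not proposed API).
struct AnyKey : CodingKey {
    var stringValue: String
    var intValue: Int?

    init?(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }

    init?(intValue: Int) {
        self.intValue = intValue
        self.stringValue = String(intValue)
    }
}
```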

In addition to automatic CodingKey conformance derivation for enums,
Codable conformance can be automatically derived for certain types as well:

   1. Types whose properties are all either Codable or primitive get an
   automatically derived String-backed CodingKeys enum mapping properties
   to case names
   2. Types falling into (1) and types which provide a CodingKeys enum (directly
   or via a typealias) whose case names map to properties which are all
   Codable get automatic derivation of init(from:) and encode(to:) using
   those properties and keys. Types may choose to provide a custom
   init(from:) or encode(to:) (or both); whichever they do not provide
   will be automatically derived
   3. Types which fall into neither (1) nor (2) will have to provide a
   custom key type and provide their own init(from:) and encode(to:)

Many types will either allow for automatic derivation of all codability
(1), or provide a custom key subset and take advantage of automatic method
derivation (2).
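A sketch of case (2), using derivation as it eventually shipped in Swift: a hypothetical type supplies its own key subset, properties omitted from the keys need default values, and `init(from:)`/`encode(to:)` are derived from the listed keys:

```swift
import Foundation

public struct Person : Codable {
    public var name: String
    public var age: Int
    // Omitted from CodingKeys below, so never encoded; the default value
    // lets the derived init(from:) construct it.
    public var cachedDisplayName: String = ""

    private enum CodingKeys : String, CodingKey {
        case name
        case age
    }
}

// The encoded payload contains only the listed keys.
let data = try! JSONEncoder().encode(Person(name: "Ada", age: 36))
```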
Encoding and Decoding

Types which are encodable encode their data into a container provided by
their Encoder:

/// An `Encoder` is a type which can encode values into a native format for external representation.
public protocol Encoder {
    /// Populates `self` with an encoding container (of `.default` type) and returns it, keyed by the given key type.
    ///
    /// - parameter type: The key type to use for the container.
    /// - returns: A new keyed encoding container.
    /// - precondition: May not be called after a previous `self.container(keyedBy:)` call of a different `EncodingContainerType`.
    /// - precondition: May not be called after a value has been encoded through a prior `self.singleValueContainer()` call.
    func container<Key : CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key>

    /// Returns an encoding container appropriate for holding a single primitive value.
    ///
    /// - returns: A new empty single value container.
    /// - precondition: May not be called after a prior `self.container(keyedBy:)` call.
    /// - precondition: May not be called after a value has been encoded through a previous `self.singleValueContainer()` call.
    func singleValueContainer() -> SingleValueEncodingContainer

    /// The path of coding keys taken to get to this point in encoding.
    var codingKeyContext: [CodingKey] { get }
}
// Continuing examples from before; below is automatically generated by the compiler if no customization is needed.
public struct Location : Codable {
    private enum CodingKeys : CodingKey {
        case latitude
        case longitude
    }

    public func encode(to encoder: Encoder) throws {
        // Generic keyed encoder gives type-safe key access: cannot encode with keys of the wrong type.
        let container = encoder.container(keyedBy: CodingKeys.self)

        // The encoder is generic on the key -- free key autocompletion here.
        try container.encode(latitude, forKey: .latitude)
        try container.encode(longitude, forKey: .longitude)
    }
}

public struct Farm : Codable {
    private enum CodingKeys : CodingKey {
        case name
        case location
        case animals
    }

    public func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name)
        try container.encode(location, forKey: .location)
        try container.encode(animals, forKey: .animals)
    }
}

Hi Itai,

···

On Tue, Mar 21, 2017 at 1:03 PM Itai Ferber <iferber@apple.com> wrote:

Hi Colin,

Thanks for your comments! Are you talking about Codable synthesis, or
encoding in general?

Yeah, I meant specifically in the case where things are synthesized
automatically. As you point out below, if someone implements a custom
Codable instance, all bets are off.

On 21 Mar 2017, at 8:44, Colin Barrett wrote:

Hi Itai,

Glad to see this proposal! I'm curious, have you or the other Swift folks
thought about how *users* of these new Codable protocols will interact with
resilience domains?

What I mean is that what appear to be private or internal identifiers, and
thus changeable at will, may actually be fragile in that changing them will
break the ability to decode archives encoded by previous versions.

Making this safer could mean:
- Encoding only public properties

Unfortunately, property accessibility in code does not always map 1-to-1
with accessibility for archival (nor do I think they should be tied to one
another).
There are certainly cases where you’d want to include private information
in an archive, but that is not useful to expose to external clients, e.g.,
a struct/class version:

public struct MyFoo {
    // Should be encoded.
    public var title: String
    public var identifier: Int

    // This should be encoded too — in case the struct changes in the
    // future, want to be able to refer to the payload version.
    private let version = 1.0
}

Of course, there can also be public properties that you don’t find useful
to encode. At the moment, I’m not sure there’s a much better answer than
"the author of the code will have to think about the representation of
their data"; even if there were an easier way to annotate "I definitely
want this to be archived"/"I definitely don’t want this to be archived",
the annotation would still need to be manual.

(The above applies primarily in the case of Codable synthesis; when
implementing Codable manually I don’t think the compiler should ever
prevent you from doing what you need.)

- Adding some form of indirection (a la ObjC non-fragile ivars?)

What do you mean by this?

I'm not sure exactly how or if it would work in-detail, unfortunately, but
I know that the ObjC runtime emits symbols which are used to lookup the
offset in the object struct for non-fragile ivars. Maybe some similar form
of indirection would be useful for encoding non-public ivars. Like I said,
don't know exactly how/if that would work, just sharing :)

- Compiler warning (or disallowing) changes to properties in certain
situations.

We’ve thought about this with regards to identifying classes uniquely
across renaming, moving modules, etc.; this is a resilience problem in
general.
In order for the compiler to know about changes to your code it’d need to
keep state across compilations. While possible, this feels pretty fragile
(and potentially not very portable).

   - Compiler warns about changing a property? Blow away the cache
   directory!
   - Cloning the code to a new machine for the first time? Hmm, all the
   warnings went away…

This would be nice to have, but yes:

I imagine the specifics would need to follow the rest of the plans for
resilience.

specifics on this would likely be in line with the rest of resilience plans
for Swift in general.

Right. Thus my concern about allowing non-public fields to be automatically
serialized. The most conservative option would be to only automatically
synthesize a Codable instance for the public members of public types.
Seems overly restrictive, so maybe anything goes for internal types, or
there's some sort of warning (overridable via an attribute?)

I want to emphasize btw that I'm enthusiastic about this proposal in
general. The support for integer keys is welcome and, as it's one of my pet
projects, eases support for a Cap'n Proto-style serialization format.[1]

-Colin

[1]: https://capnproto.org

It's likely that this could be addressed by a future proposal, as for the
time being developers can simply "not hold it wrong" ;)

Thanks,
-Colin

On Wed, Mar 15, 2017 at 6:52 PM Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

Hi everyone,

The following introduces a new Swift-focused archival and serialization
API as part of the Foundation framework. We’re interested in improving the
experience and safety of performing archival and serialization, and are
happy to receive community feedback on this work.

Because of the length of this proposal, the *Appendix* and *Alternatives
Considered* sections have been omitted here, but are available in the full
proposal <https://github.com/apple/swift-evolution/pull/639&gt; on the
swift-evolution repo. The full proposal also includes an *Unabridged API*
for

further consideration.

Without further ado, inlined below.

— Itai

Swift Archival & Serialization

- Proposal: SE-NNNN <https://github.com/apple/swift-evolution/pull/639&gt;
- Author(s): Itai Ferber <https://github.com/itaiferber&gt;, Michael LeHew
<https://github.com/mlehew&gt;, Tony Parker <https://github.com/parkera&gt;
- Review Manager: TBD
- Status: *Awaiting review*
- Associated PRs:
- #8124 <https://github.com/apple/swift/pull/8124&gt;
- #8125 <https://github.com/apple/swift/pull/8125&gt;

Introduction

Foundation's current archival and serialization APIs (NSCoding,
NSJSONSerialization, NSPropertyListSerialization, etc.), while fitting
for the dynamism of Objective-C, do not always map optimally into Swift.
This document lays out the design of an updated API that improves the
developer experience of performing archival and serialization in Swift.

Specifically:

- It aims to provide a solution for the archival of Swift struct and
enum types
- It aims to provide a more type-safe solution for serializing to

external formats, such as JSON and plist

Motivation

The primary motivation for this proposal is the inclusion of native Swift
enum and struct types in archival and serialization. Currently,
developers targeting Swift cannot participate in NSCoding without being
willing to abandon enum and structtypes — NSCoding is an @objc protocol,
conformance to which excludes non-class types. This is can be limiting in
Swift because small enums and structs can be an idiomatic approach to
model representation; developers who wish to perform archival have to
either forgo the Swift niceties that constructs like enumsprovide, or
provide an additional compatibility layer between their "real" types and
their archivable types.

Secondarily, we would like to refine Foundation's existing serialization
APIs (NSJSONSerialization and NSPropertyListSerialization) to better
match Swift's strong type safety. From experience, we find that the
conversion from the unstructured, untyped data of these formats into
strongly-typed data structures is a good fit for archival mechanisms,
rather than the less safe approaches that 3rd-party JSON conversion
libraries have taken (described further in an appendix below).

We would like to offer a solution to these problems without sacrificing
ease of use or type safety.
Agenda

This proposal is the first stage of three that introduce different facets
of a whole Swift archival and serialization API:

1. This proposal describes the basis for this API, focusing on the protocols that users adopt and interface with
2. The next stage will propose specific API for new encoders
3. The final stage will discuss how this new API will interop with NSCoding as it is today

SE-NNNN provides stages 2 and 3.
Proposed solution

We will be introducing the following new types:

- protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.
- protocol CodingKey: Adopted by types used as keys for keyed containers, replacing String keys with semantic types. Conformance may be automatically derived in most cases.
- protocol Encoder: Adopted by types which can take Codable values and encode them into a native format.
- class KeyedEncodingContainer<Key : CodingKey>: Subclasses of this type provide a concrete way to store encoded values by CodingKey. Types adopting Encoder should provide subclasses of KeyedEncodingContainer to vend.
- protocol SingleValueEncodingContainer: Adopted by types which provide a concrete way to store a single encoded value. Types adopting Encoder should provide types conforming to SingleValueEncodingContainer to vend (but in many cases will be able to conform to it themselves).
- protocol Decoder: Adopted by types which can take payloads in a native format and decode Codable values out of them.
- class KeyedDecodingContainer<Key : CodingKey>: Subclasses of this type provide a concrete way to retrieve encoded values from storage by CodingKey. Types adopting Decoder should provide subclasses of KeyedDecodingContainer to vend.
- protocol SingleValueDecodingContainer: Adopted by types which provide a concrete way to retrieve a single encoded value from storage. Types adopting Decoder should provide types conforming to SingleValueDecodingContainer to vend (but in many cases will be able to conform to it themselves).

For end users of this API, adoption will primarily involve the Codable
and CodingKey protocols. In order to participate in this new archival
system, developers must add Codable conformance to their types:

// If all properties are Codable, implementation is automatically derived:
public struct Location : Codable {
    public let latitude: Double
    public let longitude: Double
}

public enum Animal : Int, Codable {
    case chicken = 1
    case dog
    case turkey
    case cow
}

public struct Farm : Codable {
    public let name: String
    public let location: Location
    public let animals: [Animal]
}

With developer participation, we will offer encoders and decoders
(described in SE-NNNN, not here) that take advantage of this conformance to
offer type-safe serialization of user models:

let farm = Farm(name: "Old MacDonald's Farm",
                location: Location(latitude: 51.621648, longitude: 0.269273),
                animals: [.chicken, .dog, .cow, .turkey, .dog, .chicken, .cow, .turkey, .dog])
let payload: Data = try JSONEncoder().encode(farm)

do {
    let farm = try JSONDecoder().decode(Farm.self, from: payload)

    // Extracted as user types:
    let coordinates = "\(farm.location.latitude), \(farm.location.longitude)"
} catch {
    // Encountered error during deserialization
}

This gives developers access to their data in a type-safe manner and a
recognizable interface.
Detailed design

To support user types, we expose the Codable protocol:

/// Conformance to `Codable` indicates that a type can marshal itself into and out of an external representation.
public protocol Codable {
    /// Initializes `self` by decoding from `decoder`.
    ///
    /// - parameter decoder: The decoder to read data from.
    /// - throws: An error if reading from the decoder fails, or if read data is corrupted or otherwise invalid.
    init(from decoder: Decoder) throws

    /// Encodes `self` into the given encoder.
    ///
    /// If `self` fails to encode anything, `encoder` will encode an empty `.default` container in its place.
    ///
    /// - parameter encoder: The encoder to write data to.
    /// - throws: An error if any values are invalid for `encoder`'s format.
    func encode(to encoder: Encoder) throws
}

By adopting Codable, user types opt in to this archival system.

Structured types (i.e. types which encode as a collection of properties)
encode and decode their properties in a keyed manner. Keys may be
String-convertible
or Int-convertible (or both), and user types which have properties should
declare semantic key enums which map keys to their properties. Keys must
conform to the CodingKey protocol:

/// Conformance to `CodingKey` indicates that a type can be used as a key for encoding and decoding.
public protocol CodingKey {
    /// The string to use in a named collection (e.g. a string-keyed dictionary).
    var stringValue: String? { get }

    /// Initializes `self` from a string.
    ///
    /// - parameter stringValue: The string value of the desired key.
    /// - returns: An instance of `Self` from the given string, or `nil` if the given string does not correspond to any instance of `Self`.
    init?(stringValue: String)

    /// The int to use in an indexed collection (e.g. an int-keyed dictionary).
    var intValue: Int? { get }

    /// Initializes `self` from an integer.
    ///
    /// - parameter intValue: The integer value of the desired key.
    /// - returns: An instance of `Self` from the given integer, or `nil` if the given integer does not correspond to any instance of `Self`.
    init?(intValue: Int)
}

For most types, String-convertible keys are a reasonable default; for
performance, however, Int-convertible keys are preferred, and Encoders may
choose to make use of Ints over Strings. Framework types should provide
keys which have both for flexibility and performance across different types
of Encoders. It is generally an error to provide a key which has neither
a stringValue nor an intValue.

By default, CodingKey conformance can be derived for enums which have
either String or Int backing:

enum Keys1 : CodingKey {
    case a // (stringValue: "a", intValue: nil)
    case b // (stringValue: "b", intValue: nil)
}

enum Keys2 : String, CodingKey {
    case c = "foo" // (stringValue: "foo", intValue: nil)
    case d         // (stringValue: "d", intValue: nil)
}

enum Keys3 : Int, CodingKey {
    case e = 4 // (stringValue: "e", intValue: 4)
    case f     // (stringValue: "f", intValue: 5)
    case g = 9 // (stringValue: "g", intValue: 9)
}

Coding keys which are not enums, have associated values, or have other
raw representations must implement these methods manually.
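For illustration, a manual implementation might look like the following sketch. The `ProposedCodingKey` protocol reproduces the requirements quoted above so the example is self-contained, and the key names and integer values are hypothetical:

```swift
// The CodingKey requirements as proposed above, reproduced locally so
// this sketch compiles on its own (note: stringValue is optional here,
// matching this draft of the proposal):
protocol ProposedCodingKey {
    var stringValue: String? { get }
    init?(stringValue: String)
    var intValue: Int? { get }
    init?(intValue: Int)
}

// A key type providing both String and Int values for flexibility
// across different kinds of encoders, implemented by hand:
enum PersonKeys : ProposedCodingKey {
    case name
    case age

    var stringValue: String? {
        switch self {
        case .name: return "name"
        case .age:  return "age"
        }
    }

    init?(stringValue: String) {
        switch stringValue {
        case "name": self = .name
        case "age":  self = .age
        default:     return nil
        }
    }

    var intValue: Int? {
        switch self {
        case .name: return 0
        case .age:  return 1
        }
    }

    init?(intValue: Int) {
        switch intValue {
        case 0:  self = .name
        case 1:  self = .age
        default: return nil
        }
    }
}
```

Both initializers are failable so that unknown keys surface as `nil` rather than trapping, which matches the protocol's documented contract.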

In addition to automatic CodingKey conformance derivation for enums,
Codable conformance can be automatically derived for certain types as well:

1. Types whose properties are all either Codable or primitive get an automatically derived String-backed CodingKeys enum mapping properties to case names
2. Types falling into (1), and types which provide a CodingKeys enum (directly or via a typealias) whose case names map to properties which are all Codable, get automatic derivation of init(from:) and encode(to:) using those properties and keys. Types may choose to provide a custom init(from:) or encode(to:) (or both); whichever they do not provide will be automatically derived
3. Types which fall into neither (1) nor (2) will have to provide a custom key type and provide their own init(from:) and encode(to:)

Many types will either allow for automatic derivation of all codability
(1), or provide a custom key subset and take advantage of automatic method
derivation (2).
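As a sketch of case (2) — assuming the derived-conformance behavior described above, and borrowing the `Post` example that appears later in this thread — supplying only a custom key enum leaves init(from:) and encode(to:) to the compiler:

```swift
import Foundation

// Only the key enum is written by hand; init(from:) and encode(to:)
// are derived from it, as described in case (2):
struct Post : Codable {
    var authorID: Int
    var bodyText: String

    private enum CodingKeys : String, CodingKey {
        case authorID = "author_id"
        case bodyText = "body_text"
    }
}
```

Decoding a payload such as `{"author_id": 1, "body_text": "hi"}` then populates both properties through the renamed keys.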
Encoding and Decoding

Types which are encodable encode their data into a container provided by
their Encoder:

/// An `Encoder` is a type which can encode values into a native format for external representation.
public protocol Encoder {
    /// Populates `self` with an encoding container (of `.default` type) and returns it, keyed by the given key type.
    ///
    /// - parameter type: The key type to use for the container.
    /// - returns: A new keyed encoding container.
    /// - precondition: May not be called after a previous `self.container(keyedBy:)` call of a different `EncodingContainerType`.
    /// - precondition: May not be called after a value has been encoded through a prior `self.singleValueContainer()` call.
    func container<Key : CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key>

    /// Returns an encoding container appropriate for holding a single primitive value.
    ///
    /// - returns: A new empty single value container.
    /// - precondition: May not be called after a prior `self.container(keyedBy:)` call.
    /// - precondition: May not be called after a value has been encoded through a previous `self.singleValueContainer()` call.
    func singleValueContainer() -> SingleValueEncodingContainer

    /// The path of coding keys taken to get to this point in encoding.
    var codingKeyContext: [CodingKey] { get }
}

// Continuing examples from before; below is automatically generated by the compiler if no customization is needed.
public struct Location : Codable {
    private enum CodingKeys : CodingKey {
        case latitude
        case longitude
    }

    public func encode(to encoder: Encoder) throws {
        // Generic keyed encoder gives type-safe key access: cannot encode with keys of the wrong type.
        let container = encoder.container(keyedBy: CodingKeys.self)

        // The encoder is generic on the key -- free key autocompletion here.
        try container.encode(latitude, forKey: .latitude)
        try container.encode(longitude, forKey: .longitude)
    }
}

public struct Farm : Codable {
    private enum CodingKeys : CodingKey {
        case name
        case location
        case animals
    }

    public func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name)
        try container.encode(location, forKey: .location)
        try container.encode(animals, forKey: .animals)
    }
}
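A decoding counterpart for Location — not part of the quoted proposal text, but a sketch using the decode(_:forKey:) spelling and the throwing container(keyedBy:) as the API later shipped in Swift 4:

```swift
import Foundation

public struct Location : Codable {
    public let latitude: Double
    public let longitude: Double

    private enum CodingKeys : String, CodingKey {
        case latitude
        case longitude
    }

    // Hand-written decoding: pull each property out of the keyed
    // container by its semantic key.
    public init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        latitude  = try container.decode(Double.self, forKey: .latitude)
        longitude = try container.decode(Double.self, forKey: .longitude)
    }
}
```

Because only init(from:) is customized, encode(to:) is still derived automatically.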

Hi Ben,

What’s the use case that you were thinking of? `KeyPath`s could be useful in the case where you don’t need to customize your key names, but cannot represent a custom case like

public struct Post {
     var authorID: Int
     var bodyText: String

     private enum CodingKeys : String, CodingKey {
         case authorID = "author_id"
         case bodyText = "body_text"
     }
}

Or am I misunderstanding?

— Itai

···

On 22 Mar 2017, at 5:39, Ben Rimmington wrote:

On 15 Mar 2017, at 22:40, Itai Ferber wrote:

The following introduces a new Swift-focused archival and serialization API as part of the Foundation framework. We’re interested in improving the experience and safety of performing archival and serialization, and are happy to receive community feedback on this work.

Instead of a CodingKeys enum, could the KeyPath proposal <https://github.com/apple/swift-evolution/pull/644> be utilized somehow?

-- Ben

Hi Oliver,

···

On Mar 23, 2017, at 7:55 AM, Oliver Jones via swift-evolution <swift-evolution@swift.org> wrote:

Like everyone I’m excited by this new proposal. But…

> protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.

… can I make one suggestion. Please do not repeat the mistakes of NSCoding in combining the encoding and decoding into a single protocol. Just as there are Encoder and Decoder classes, there should be Encodable and Decodable protocols (maybe have an aggregate Codable protocol for convenience, but do not force it).

My reasoning:

Sometimes you only want to decode or encode an object and not vice versa. This is often the case with Web APIs and JSON serialisation.

Eg:

Often an app only consumes (decodes) JSON encoded objects and never writes them out (a read only app for example). So the encode(to:) methods are completely redundant and someone adopting Codable should not be forced to write them.

If only I had a dollar for all the times I’ve seen this sort of code in projects:

class MyClass : NSCoding {
    init?(coder: NSCoder) {
        // ... some decoding code
    }

    func encode(with aCoder: NSCoder) {
        preconditionFailure("Not implemented")
    }
}

Another example:

Web APIs often take data in a different structure as input (i.e. “Request” objects) than they output. These request objects are only ever encoded and never decoded by an application so implementing init(from:) is completely redundant.

Personally I think the approach taken by libraries like Wrap (https://github.com/johnsundell/wrap) and Unbox (https://github.com/JohnSundell/Unbox) is a much better design. Encoding and decoding should not be the same protocol.

Yes I understand that Codable could provide no-op (or preconditionFailure) protocol extension based default implementations of init(from:) and encode(to:) (or try to magic up implementations based on the Codable nature of public properties as suggested in the proposal) but to me that seems like a hack that is papering over bad design. I think this joint Codable design probably fails the Liskov substitution principle too.

So I again implore you to consider splitting Codable into two protocols, one for encoding and another for decoding.

Sorry if I’m repeating what other people have already said. I’ve not read every response to this proposal on the list.

Regards

Thanks for your feedback. We are indeed considering splitting this up into 3 protocols instead of 1 (“Encodable", “Decodable", "Codable : Encodable, Decodable”).

The main counterpoint is the additional complexity inherent in this approach. We are considering if the tradeoff is worth it.

- Tony
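Under such a split, a read-only client could adopt only the decoding half. A sketch — the type and JSON payload are hypothetical, using the Decodable and JSONDecoder names as they later shipped:

```swift
import Foundation

// Decode-only model: no encode(to:) needs to be written or stubbed out.
struct SearchResponse : Decodable {
    let totalCount: Int
    let items: [String]
}

let json = "{\"totalCount\": 2, \"items\": [\"a\", \"b\"]}".data(using: .utf8)!
let response = try! JSONDecoder().decode(SearchResponse.self, from: json)
```

This is exactly the scenario Oliver raises: the app never produces this payload, so forcing an encode(to:) would only invite preconditionFailure stubs.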

Let me vote +1 for splitting it. The added conceptual complexity should be minimal, since it is progressively disclosed. You only need to know about Codable, unless you run into the issue where you only want one, at which point Stack Overflow, etc. will point you to Encodable/Decodable and you will be glad they exist.

···

On Mar 23, 2017, at 11:44 AM, Tony Parker via swift-evolution <swift-evolution@swift.org> wrote:

Hi Oliver,


Thanks for your feedback. We are indeed considering splitting this up into 3 protocols instead of 1 (“Encodable", “Decodable", "Codable : Encodable, Decodable”).

The main counterpoint is the additional complexity inherent in this approach. We are considering if the tradeoff is worth it.

- Tony

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

For custom names, the `CodingKeys` enum does seem like the best design, unless an attribute can be used.

  public struct Post : Codable {
      @codable(name: "author_id") var authorID: Int
      @codable(name: "body_text") var bodyText: String
  }

If each `KeyPath` encapsulates the type information, the `decode` methods won't need a `type` parameter.

  /// Primitive decoding methods (for single-value and keyed containers).
  open class DecodingContainer<Root : Codable> {
      open func decode(for keyPath: KeyPath<Root, Bool>) throws -> Bool
      open func decode(for keyPath: KeyPath<Root, Int>) throws -> Int
      open func decode(for keyPath: KeyPath<Root, UInt>) throws -> UInt
      open func decode(for keyPath: KeyPath<Root, Float>) throws -> Float
      open func decode(for keyPath: KeyPath<Root, Double>) throws -> Double
      open func decode(for keyPath: KeyPath<Root, String>) throws -> String
      open func decode(for keyPath: KeyPath<Root, Data>) throws -> Data
  }

  /// Keyed containers inherit the primitive decoding methods.
  open class KeyedDecodingContainer<Root : Codable> : DecodingContainer<Root> {
      open func decode<Value : Codable>(for keyPath: KeyPath<Root, Value>) throws -> Value
  }

-- Ben

···

On 22 Mar 2017, at 17:41, Itai Ferber wrote:

What’s the use case that you were thinking of? KeyPaths could be useful in the case where you don’t need to customize your key names, but cannot represent a custom case like

public struct Post {
    var authorID: Int
    var bodyText: String

    private enum CodingKeys : String, CodingKey {
        case authorID = "author_id"
        case bodyText = "body_text"
    }
}
Or am I misunderstanding?

Fantastic. Great to hear. I look forward to reading the revised proposal!

Regards

···

On 24 Mar 2017, at 3:34 am, Itai Ferber <iferber@apple.com> wrote:

Hi Oliver,

Thanks for your comments! We thought about this and we agree overall — we will incorporate this suggestion along with others in the next batch update as long as nothing prohibitive comes up.

— Itai

On 23 Mar 2017, at 7:49, Oliver Jones wrote:

Like everyone I’m excited by this new proposal. But…

> protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.

… can I make one suggestion. Please do not repeat the mistakes of NSCoding in combining the encoding and decoding into a single protocol. Just as there are Encoder and Decoder classes, there should be Encodable and Decodable protocols (maybe have an aggregate Codable protocol for convenience, but do not force it).

Itai and co:

This is a solid improvement.

I think it's appropriate to diminish the importance of non-keyed
containers. "Nonkeyed" as a name is pretty iffy to me, though, even
though I admit it makes the use case pretty clear. "Ordered" or
"Sequential" both sound fine, even for an encoder that's slot-based
instead of an NSArchiver-like model. An array is ordered, but you don't
have to traverse it in order.

Best,

  Zachary Waldowski

  zach@waldowski.me

···

On Mon, Apr 3, 2017, at 04:31 PM, Itai Ferber via swift-evolution wrote:

Hi everyone,

With feedback from swift-evolution and additional internal review,
we've pushed updates to this proposal, and to the Swift Encoders[1]
proposal. In the interest of not blowing up mail clients with the full
HTML again, I'll simply be linking to the swift-evolution PR here[2],
as well as the specific diff[3] of what's changed.
At a high level:

* The Codable protocol has been split up into Encodable and Decodable
* String keys on CodingKey are no longer optional
* KeyedEncodingContainer has become KeyedEncodingContainerProtocol, with a concrete type-erased KeyedEncodingContainer struct to hold it
* Array responsibilities have been removed from KeyedEncodingContainer, and have been added to a new UnkeyedEncodingContainer type
* codingKeyContext has been renamed codingPath
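To illustrate the new unkeyed container, here is a sketch of a hand-written encode(to:) using the revised API names as they appear in the updated proposal; the Path type is hypothetical:

```swift
import Foundation

struct Path : Encodable {
    var points: [Double]

    func encode(to encoder: Encoder) throws {
        // Values are appended in order, with no keys:
        var container = encoder.unkeyedContainer()
        for point in points {
            try container.encode(point)
        }
    }
}
```

With a JSON encoder, Path(points: [1.5, 2.5]) would round-trip as a plain array.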
There are some specific changes inline — I know it might be a bit of a
pain, but let's keep discussion here on the mailing list instead of on
GitHub. We'll be looking to start the official review process very
soon, so we're interested in any additional feedback.
Thanks!

— Itai


Links:

  1. Proposal for Foundation Swift Encoders by itaiferber · Pull Request #640 · apple/swift-evolution · GitHub
  2. Proposal for Foundation Swift Archival & Serialization API by itaiferber · Pull Request #639 · apple/swift-evolution · GitHub
  3. Proposal for Foundation Swift Archival & Serialization API by itaiferber · Pull Request #639 · apple/swift-evolution · GitHub

I see. Protocols with associated types serve the same purpose as generic interfaces in other languages, but we don't have first-class support for protocol types with associated type constraints (a value of type `Container where Key == K`). That's something we'd like to eventually support. In other places in the standard library, we write the type-erased container by hand, which is why we have `AnySequence`, `AnyCollection`, and `AnyHashable`. You could probably do something similar here; that would be a bit awkward for implementers, but might be easier to migrate forward to where we eventually want to be with the language.

-Joe
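The hand-written type erasure Joe mentions follows the same closure-capturing pattern as AnySequence. A minimal sketch, with hypothetical names and a single decode requirement:

```swift
// A protocol with an associated type cannot be used directly as the
// type of a value, so we erase it by capturing the witness in closures:
protocol ContainerProtocol {
    associatedtype Key
    func decodeString(forKey key: Key) throws -> String
}

struct AnyContainer<Key> : ContainerProtocol {
    private let _decodeString: (Key) throws -> String

    // Wrap any concrete container whose Key matches ours:
    init<C : ContainerProtocol>(_ base: C) where C.Key == Key {
        self._decodeString = base.decodeString(forKey:)
    }

    func decodeString(forKey key: Key) throws -> String {
        return try _decodeString(key)
    }
}
```

The real API would need one captured closure per requirement, which is the "awkward for implementers" part.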

···

On Mar 15, 2017, at 6:46 PM, Itai Ferber <iferber@apple.com> wrote:

Thanks Joe, and thanks for passing this along!

To those who are curious, we use abstract base classes for a cascading list of reasons:

  • We need to be able to represent keyed encoding and decoding containers as abstract types which are generic on a key type
  • There are two ways to support abstraction in this way: protocol & type constraints, and generic types
    • Since Swift protocols are not generic, we unfortunately cannot write protocol KeyedEncodingContainer<Key : CodingKey> { ... }, which is the "ideal" version of what we're trying to represent
  • Let's try this with a protocol first (simplified here):

protocol Container {
    associatedtype Key : CodingKey
}

func container<Key : CodingKey, Cont : Container>(_ type: Key.Type) -> Cont where Cont.Key == Key {
    // return something
}

This looks promising so far — let's try to make it concrete:

struct ConcreteContainer<K : CodingKey> : Container {
    typealias Key = K
}

func container<Key : CodingKey, Cont : Container>(_ type: Key.Type) -> Cont where Cont.Key == Key {
    return ConcreteContainer<Key>() // error: Cannot convert return expression of type 'ConcreteContainer<Key>' to return type 'Cont'
}

Joe or anyone from the Swift team can describe this better, but this is my poor-man's explanation of why this happens. Swift's type constraints are "directional" in a sense. You can constrain a type going into a function, but not out of a function. There is no type I could return from inside of container() which would satisfy this constraint, because the constraint can only be satisfied by turning Cont into a concrete type from the outside.

Okay, well let's try this:

func container... {
    return ConcreteContainer<Key>() as! Cont
}

This compiles fine! Hmm, let's try to use it:

container(Int.self) // error: Generic parameter 'Cont' could not be inferred

The type constraint can only be fulfilled from the outside, not the inside. The function call itself has no context for the concrete type that this would return, so this is a no-go.

  • If we can't do it with type constraints in this way, is it possible with generic types? Yep! Generic types satisfy this without a problem. However, since we don't have generic protocols, we have to use a generic abstract base class to represent the same concept — an abstract container generic on the type of key which dynamically dispatches to the "real" subclassed type

Hopes that gives some simplified insight into the nature of this decision.

2) Libraries like Marshal (https://github.com/utahiosmac/Marshal) and Unbox (https://github.com/JohnSundell/Unbox) don’t require the decoding functions to provide the type: those functions are generic on the return type and it’s automatically inferred:

func decode<T>(key: Key) -> T

self.stringProperty = decode(key: .stringProperty) // correct specialisation of the generic function chosen by the compiler

Is there a reason the proposal did not choose this solution? Its quite sweet.

IMHO those are only “sweet” until you need to decode a value out to something other than a typed value, then it’s ambiguity city.

Other than a typed value? Can you give an example?

···

On 16 Mar 2017, at 16:53, Zach Waldowski <zach@waldowski.me> wrote:

On Mar 16, 2017, at 3:09 AM, David Hart via swift-evolution <swift-evolution@swift.org> wrote:

There are many ways to solve that, but none of them are conducive to beginners. Using the metatype to seed the generic resolution is the only thing I’d get behind, personally.

Zach Waldowski
zach@waldowski.me
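A sketch of the ambiguity being discussed, using a hypothetical mini-API: return-type inference works only when the call site supplies a concrete type from context.

```swift
// Hypothetical decoder whose decode is generic on the return type
// (simplified: backed by a dictionary, force-cast for brevity):
struct MiniDecoder {
    let storage: [String: Any]

    func decode<T>(key: String) -> T {
        return storage[key] as! T
    }
}

let decoder = MiniDecoder(storage: ["name": "Old MacDonald"])

// Fine: T is inferred as String from the annotation.
let name: String = decoder.decode(key: "name")

// Without context, the compiler cannot pick T:
// let x = decoder.decode(key: "name") // error: generic parameter 'T' could not be inferred
```

Seeding resolution with a metatype parameter — decode(String.self, key:) — removes the ambiguity at the cost of a little verbosity, which is the trade-off the proposal makes.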

Hi Karl,

Hi Itai,

I’m wondering what the motivation is for keeping this as part of Foundation and not the standard library. It seems like you’re landing an implementation of this in the Foundation overlay on master, and another copy of all the code will have to go into swift-corelibs-foundation. This seems suboptimal. Or are there future plans to unify the Foundation overlay with corelibs-foundation somehow?

Also the implementation uses some Foundation-isms (NSMutableArray, NSNumber) and it would be nice to stick with idiomatic Swift as much as possible instead.

Finally you should take a look at the integer protocol work (https://github.com/apple/swift-evolution/blob/master/proposals/0104-improved-integers.md) to replace the repetitive code surrounding primitive types, however I don’t know if this has landed in master yet.

Slava

I agree that the protocols should be part of the standard library rather than Foundation. As far as I can make out, the only part of this proposal that actually requires Foundation is the use of the “Data” type (which itself is a strange and often frustrating omission from the standard library). The actual concrete encoders can live in Foundation.

Generally my opinion is that the proposed feature is nice. Everybody hates NSCoder and having to write those required initialisers on your UIViews and whatnot. At its core, it’s not really very different from any other Swift archiving library which exists today, except that it’s backed with layer upon layer of compiler-generated magic to make it less verbose. The things I don’t like:

1) While making things less verbose is commendable, automatically generating the encoding functions could be an anti-feature. “Codable” is for properties with persistable values only, which is a level of semantics which goes above the type-system. We don’t generate Equatable conformance for structs whose elements are all Equatable; it’s a big leap to go from “this data type is persistable” to “the value of this variable should be persisted” - for one thing, the value may not have meaning to others (e.g. a file-handle as an Int32) or it may contain sensitive user-data in a String. The encoding function isn’t just boilerplate - you *should* think about it; otherwise who knows what kind of data you’re leaking?

=> One possible solution would be to have “Codable" as a regular protocol, and refine it with “AutoCodable" which contains the magic. Just so there is a little extra step where you think “do I want this to be generated?”.

The number one complaint we have about NSCoding (and this complaint predates Swift by a long shot) is that too much boilerplate is required for really simple data structures. Resolving this issue is one of our primary goals for this API.

There are a lot of benefits to keeping one protocol in place instead of making another one for “auto” codable: API which accepts Codable things does not need to have two entry points; there is one concept of Codable instead of two; you can “buy in” to more complex behavior by implementing parts of Codable (e.g., just the keys if you simply want to change the names of the JSON keys).

2) More generally, though, I don’t support the idea of the Foundation module introducing extensions to the Swift language that no other developer of any other module has the chance to do, with an aside that some reflection API which doesn’t exist yet might possibly make it less magic in the future. My jaw hit the floor when I saw this was part of the proposal, and that it wasn’t for the standard library. Could you imagine, if somebody proposed their own magic protocols for AlamoFire or Kitura or any other Swift library? It would be a non-starter. It *should* be a non-starter.

=> Do we have a policy about module-specific compiler magic such as this?
=> Can we move any of the magic (e.g. CodableKeys generation) to the standard library?

I develop a lot for platforms, or for scenarios, where Foundation is not supported nor desirable. Considering that people are taught to prefer library code to rolling their own, and that humans are generally quite receptive to shortcuts at the expense of correctness, if this machinery exists at all we can expect it to spread. It would totally kill libraries such as SwiftJSON or whatever else you currently use. The fact that such a fundamental and widespread feature would now live in a monolithic module I can’t access would significantly spoil the language for me.

We’ve been very accepting of patches to port swift-corelibs-foundation to other architectures and platforms, including s390x (https://github.com/apple/swift-corelibs-foundation/pull/386), Android (https://github.com/apple/swift-corelibs-foundation/pull/622), and Cygwin (https://github.com/apple/swift-corelibs-foundation/pull/381). Foundation is extremely portable, so it should be available anywhere that Swift itself is possible to compile.

Foundation and the Swift standard library also have a more intimate relationship than most other libraries. This is especially relevant on Darwin, where we have special cases for bridging, implementations of various API on String, and more. It’s not a stretch to take advantage of that for this API as well.

Now, if we had all the language features we needed to avoid any compiler magic then we absolutely would have implemented this with those instead. However, those things aren’t here yet and for some of them (e.g. property behaviors or generalized existentials), there is no plan in place to add them in the near future. Working with JSON is such a common task for so many of our developers that we felt it was important to tackle the problem now with the tools we have at our disposal.

Of course, we will not be afraid to improve the API in the future as new features are added to the language.

- Tony

···

On Mar 17, 2017, at 12:20 AM, Karl Wagner via swift-evolution <swift-evolution@swift.org> wrote:

On 16 Mar 2017, at 21:48, Slava Pestov via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

- Karl