[Proposal] Foundation Swift Archival & Serialization

Although it isn't sufficient in its current form, I think the key paths proposal that recently went out will eventually be able to help a bit here:

https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20170313/033998.html

If keypaths eventually had the ability to be looked up by name or index, like your coding keys are, we could potentially use them as the default coding key type instead of synthesizing a new type. That would cut down the number of compiler-known things. (I think it's reasonable for coding to do its own thing in the meantime, to be clear.)

-Joe

···

On Mar 16, 2017, at 12:33 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

This design struck me as remarkably similar to the reflection system and its `Mirror` type, which is also a separate type describing an original instance. My question was: Did you look at the reflection system when you were building this design? Do you think there might be anything that can be usefully shared between them?

We did, quite a bit, and spent a lot of time considering reflection and its place in our design. Ultimately, the reflection system does not currently have the features we would need, and although the Swift team has expressed desire to improve the system considerably, it’s not currently a top priority, AFAIK.

Optional values are accepted and vended directly through the API. The encode(_:forKey:) methods take optional values directly, and the decodeIfPresent(_:forKey:) methods vend optional values.

Optional is special in this way — it’s a primitive part of the system. It’s actually not possible to write an encode(to:) method for Optional, since the representation of null values is up to the encoder and the format it’s working in; JSONEncoder, for instance, decides on the representation of nil (JSON null).

Yes—I noticed that later but then forgot to revise the beginning. Sorry about that.

It wouldn’t be possible to ask nil to encode itself in a reasonable way.

I really think it could be done, at least for most coders. I talked about this in another email, but in summary:

* NSNull would become a primitive type; depending on the format, it would be encoded either as a null value or the absence of a value.
* Optional.some(x) would be encoded the same as x.
* Optional.none would be encoded in the following fashion:
    * If the Wrapped associated type was itself an optional type, it would be encoded as a keyed container containing a single entry. That entry's key would be some likely-unique value like "_swiftOptionalDepth"; its value would be the number of levels of optionality before reaching a non-optional type.
    * If the Wrapped associated type was non-optional, it would be encoded as an NSNull.

That sounds complicated, but the runtime already has machinery to coerce Optionals to Objective-C id: Optional.some gets bridged as the Wrapped value, while Optional.none gets bridged as either NSNull or _SwiftNull, which contains a depth. We would simply need to make _SwiftNull conform to Codable, and give it a decoding implementation which was clever enough to realize when it was being asked to decode a different type.
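To sketch what I mean by "depth", here's one way to compute the number that hypothetical "_swiftOptionalDepth" entry would record. The protocol and property names here are mine, purely for illustration; this is not the runtime's actual `_SwiftNull` machinery:

```swift
// Sketch only: measuring how many Optional layers wrap a type, i.e. the
// number a hypothetical "_swiftOptionalDepth" entry would record.
// AnyOptional and optionalDepth are illustrative names, not real API.
protocol AnyOptional {
    static var optionalDepth: Int { get }
}

extension Optional: AnyOptional {
    static var optionalDepth: Int {
        // One level for this Optional, plus however many levels Wrapped adds.
        return 1 + ((Wrapped.self as? AnyOptional.Type)?.optionalDepth ?? 0)
    }
}

print(Optional<Int>.optionalDepth)               // 1
print(Optional<Optional<String>>.optionalDepth)  // 2
```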

What about a more complex enum, like the standard library's `UnicodeDecodingResult`:

enum UnicodeDecodingResult {
case emptyInput
case error
case scalarValue(UnicodeScalar)
}

Or, say, an `Error`-conforming type from one of my projects:

public enum SQLError: Error {
case connectionFailed(underlying: Error)
case executionFailed(underlying: Error, statement: SQLStatement)
case noRecordsFound(statement: SQLStatement)
case extraRecordsFound(statement: SQLStatement)
case columnInvalid(underlying: Error, key: ColumnSpecifier, statement: SQLStatement)
case valueInvalid(underlying: Error, key: AnySQLColumnKey, statement: SQLStatement)
}

(You can assume that all the types in the associated values are `Codable`.)

Sure — these cases specifically do not derive Codable conformance because the specific representation to choose is up to you. Two possible ways to write this, though there are many others (I’m simplifying these cases here a bit, but you can extrapolate this):

Okay, so tl;dr is "There's nothing special to help with this; just encode some indication of the case in one key, and the associated values in separate keys". I suppose that works.
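Spelled out, that pattern for a `UnicodeDecodingResult`-style enum might look like the following sketch. The key names and the string discriminator are my own choices, not anything prescribed by the proposal (and I've substituted UInt32 for UnicodeScalar to keep the example self-contained):

```swift
import Foundation

// A sketch of the "case in one key, associated values in other keys"
// pattern for an enum like UnicodeDecodingResult.
enum DecodingResult: Codable, Equatable {
    case emptyInput
    case error
    case scalarValue(UInt32)

    private enum CodingKeys: String, CodingKey { case kind, scalar }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .emptyInput:
            try container.encode("emptyInput", forKey: .kind)
        case .error:
            try container.encode("error", forKey: .kind)
        case .scalarValue(let scalar):
            try container.encode("scalarValue", forKey: .kind)
            try container.encode(scalar, forKey: .scalar)
        }
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        switch try container.decode(String.self, forKey: .kind) {
        case "emptyInput": self = .emptyInput
        case "error":      self = .error
        case "scalarValue":
            self = .scalarValue(try container.decode(UInt32.self, forKey: .scalar))
        default:
            throw DecodingError.dataCorruptedError(forKey: .kind, in: container,
                                                   debugDescription: "unknown case")
        }
    }
}

let payload = try! JSONEncoder().encode(DecodingResult.scalarValue(65))
let roundTripped = try! JSONDecoder().decode(DecodingResult.self, from: payload)
```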

Have you given any consideration to supporting types which only need to decode? That seems likely to be common when interacting with web services.

We have. Ultimately, we decided that the introduction of several protocols to cover encodability, decodability, and both was too much of a cognitive overhead, considering the number of other types we’re also introducing. You can always implement encode(to:) as fatalError().

I understand that impulse.
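For reference, a decode-only web-service model under the single-protocol design would look something like this sketch (the type and its fields are invented):

```swift
import Foundation

// Sketch: a decode-only model. init(from:) is implemented normally;
// encode(to:) is deliberately stubbed out, as suggested above.
struct SearchResult: Codable {
    let title: String
    let score: Double

    private enum CodingKeys: String, CodingKey { case title, score }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        score = try container.decode(Double.self, forKey: .score)
    }

    func encode(to encoder: Encoder) throws {
        fatalError("SearchResult is decode-only")
    }
}

let json = "{\"title\":\"Codable\",\"score\":0.9}".data(using: .utf8)!
let result = try! JSONDecoder().decode(SearchResult.self, from: json)
print(result.title)
```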

Structured types (i.e. types which encode as a collection of properties) encode and decode their properties in a keyed manner. Keys may be String-convertible or Int-convertible (or both),

What does "may" mean here? That, at runtime, the encoder will test for the preferred key type and fall back to the other one? That seems a little bit problematic.

Yes, this is the case. A lot is left up to the Encoder because it can choose to do something for its format that your implementation of encode(to:) may not have considered.
If you try to encode something with an Int key in a string-keyed dictionary, the encoder may choose to stringify the integer if appropriate for the format. If not, it can reject your key, ignore the call altogether, preconditionFailure(), etc. It is also perfectly legitimate to write an Encoder which supports a flat encoding format — in that case, keys are likely ignored altogether, in which case there is no error to be had. We’d like to not arbitrarily constrain an implementation unless necessary.

Wait, what? If it's ignoring the keys altogether, how does it know what to decode with each call? Do you have to decode in the same order you encoded?

(Or are you saying that the encoder might use the keys to match fields to, say, predefined fields in a schema provided to the encoder, but not actually write anything about the keys to disk? That would make sense. But ignoring the keys altogether doesn't.)

In general, my biggest concern with this design is that, in a hundred different places, it is very loosely specified. We have keyed containers, but the keys can convert to either, or both, or neither of two different types. We have encode and decode halves, but you only have to support one or the other. Nils are supported, but they're interpreted as equivalent to the absence of a value. If something encounters a problem or incompatibility, it should throw an error or trip a precondition.

I worry that this is so loosely specified that you can't really trust an arbitrary type to work with an arbitrary encoder; you'll just have to hope that your testing touches every variation on every part of the object graph.

This kind of design is commonplace in Objective-C, but Swift developers often go to great lengths to expose these kinds of requirements to the type system so the compiler can verify them. For instance, I would expect a similar Swift framework to explicitly model the raw values of keys as part of the type system; if you tried to use a type providing string keys with an encoder that required integer keys, the compiler would reject your code. Even when something can't be explicitly modeled by the type system, Swift developers usually try to document guarantees about how to use APIs safely; for instance, Swift.RangeReplaceableCollection explicitly states that its calls may make indices retrieved before the call invalid, and individual conforming types document specific rules about which indices will keep working and which won't.

But your Encoder and Decoder designs seem to document semantics very loosely; they don't formally model very important properties, like "Does this coder preserve object identities*?" and "What sorts of keys does this coder use?", even when it's easy to do so, and now it seems like they also don't specify important semantics, like whether or not the encoder is required to inspect the key to determine the value you're looking for, either. I'm very concerned by that.

The design you propose takes advantage of several Swift niceties—Optional value types, enums for keys, etc.—and I really appreciate those things. But in its relatively casual attitude towards typing, it still feels like an Objective-C design being ported to Swift. I want to encourage you to go beyond that.

* That is, if you encode a reference to the same object twice and then decode the result, do you get one instance with two references, or two instances with one reference each? JSONEncoder can't provide that behavior, but NSKeyedArchiver can. There's no way for a type which won't encode properly without this property to reject encoders which cannot guarantee it.

For these exact reasons, integer keys are not produced by code synthesis, only string keys. If you want integer keys, you’ll have to write them yourself. :)

That's another thing I realized on a later reading and forgot to correct. Sorry about that.

(On the other hand, that reminds me of another minor concern: Your statement that superContainer() instances use a key with the integer value 0. I'd suggest you document that fact in boldface in the documentation for integer keys, because I expect that every developer who uses integer keys will want to start at key 0.)

So I would suggest the following changes:

* The coding key always converts to a string. That means we can eliminate the `CodingKey` protocol and instead use `RawRepresentable where RawValue == String`, leveraging existing infrastructure. That also means we can call the `CodingKeys` associated type `CodingKey` instead, which is the correct name for it—we're not talking about an `OptionSet` here.

* If, to save space on disk, you also want to allow people to use integers as the serialized representation of a key, we might introduce a parallel `IntegerCodingKey` protocol for that, but every `CodingKey` type should map to `String` first and foremost. Using a protocol here ensures that it can be statically determined at compile time whether a type can be encoded with integer keys, so the compiler can select an overload of `container(keyedBy:)`.

* Intrinsically ordered data is encoded as a single-value container of type `Array<Codable>`. (I considered having an `orderedContainer()` method and type, but as I thought about it, I couldn't think of an advantage it would have over `Array`.)

This is possible, but I don’t see this as necessarily advantageous over what we currently have. In 99.9% of cases, CodingKey types will have string values anyway — in many cases you won’t have to write the enum yourself to begin with, but even when you do, derived CodingKey conformance will generate string values on your behalf.
The only time a key will not have a string value is if the CodingKey protocol is implemented manually and a value is either deliberately left out, or there was a mistake in the implementation; in either case, there wouldn’t have been a valid string value anyway.

Again, I think this might come down to an Objective-C vs. Swift mindset difference. The Objective-C mindset is often "very few people will do X, so we might as well allow it". The Swift mindset is more "very few people will do X, so we might as well forbid it". :^)

In this case: Very few people will be inconvenienced by a requirement that they provide strings in their CodingKeys, so why not require it? Doing so ensures that encoders can always rely on a string key being available, and with all the magic we're providing to ensure the compiler fills in the actual strings for you, users will not find the requirement burdensome.
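For illustration, the "compiler fills in the actual strings" path looks roughly like this (a sketch assuming the derived conformance described above; the key names are invented):

```swift
// With a String raw value, the compiler fills in every key's string,
// and individual keys can still be overridden explicitly.
enum PersonKeys: String, CodingKey {
    case id
    case name
    case birthDate = "birth_date"  // explicit override

}

print(PersonKeys.id.stringValue)         // id
print(PersonKeys.birthDate.stringValue)  // birth_date
print(PersonKeys.id.intValue as Any)     // nil (no integer representation)
```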

/// Returns an encoding container appropriate for holding a single primitive value.
///
/// - returns: A new empty single value container.
/// - precondition: May not be called after a prior `self.container(keyedBy:)` call.
/// - precondition: May not be called after a value has been encoded through a previous `self.singleValueContainer()` call.
func singleValueContainer() -> SingleValueEncodingContainer

Speaking of which, I'm not sure about single value containers. My first instinct is to say that methods should be moved from them to the `Encoder` directly, but that would probably cause code duplication. But...isn't there already duplication between the `SingleValue*Container` and the `Keyed*Container`? Why, yes, yes there is. So let's talk about that.

In the Alternatives Considered section of the proposal, we detail having done just this. Originally, the requirements now on SingleValueContainer sat on Encoder and Decoder.
Unfortunately, this made it too easy to do the wrong thing, and required extra work (in comparison) to do the right thing.

When Encoder has encode(_ value: Bool?), encode(_ value: Int?), etc. on it, it’s very intuitive to try to encode values that way:

func encode(to encoder: Encoder) throws {
    // The very first thing I try to type is encoder.enc… and guess what pops up in autocomplete:
    try encoder.encode(myName)
    try encoder.encode(myEmail)
    try encoder.encode(myAddress)
}
This might look right to someone expecting to be able to encode in an ordered fashion, which is not what these methods do.
In addition, for someone expecting keyed encoding methods, this is very confusing. Where are those methods? Why don’t these "default" methods have keys?

The very first time that code block ran, it would preconditionFailure() or throw an error, since those methods intend to encode only one single value.

That's true. But this is mitigated by the fact that the mistake is self-correcting—it will definitely cause a precondition to fail the first time you make it.

However, I do agree that it's not really a good idea. I'm more interested in the second suggestion I had, having the Keyed*Container return a SingleValue*Container.

The return type of decode(Int.self, forKey: .id) is Int. I’m not convinced that it’s possible to misconstrue that as the correct thing to do here. How would that return a nil value if the value was nil to begin with?

I think people will generally assume that they're going to get out the value they put in, and will be surprised that something encode(_:) accepts will cause decode(_:) to error out. I do agree that the type passed to `decode(_:forKey:)` will make it relatively obvious what happened, but I think it'd be even better to just preserve the user's types.

I think we'd be better off having `encode(_:forKey:)` not take an optional; instead, we should have `Optional` conform to `Codable` and behave in some appropriate way. Exactly how to implement it might be a little tricky because of nested optionals; I suppose a `none` would have to measure how many levels of optionality there are between it and a concrete value, and then encode that information into the data. I think our `NSNull` bridging is doing something broadly similar right now.

Optional cannot encode to Codable for the reasons given above. It is a primitive type much like Int and String, and it’s up to the encoder and the format to represent it.
How would Optional encode nil?

I discussed this above: Treat null-ness as a primitive value with its own encode() call and do something clever for nested Optionals.

It's so simple, it doesn't even need to be specialized. You might even be able to get away with combining the encoding and decoding variants if the subscript comes from a conditional extension. `Value*Container` *does* need to be specialized; it looks like this (modulo the `Optional` issue I mentioned above):

Sure, let’s go with this for a moment. Presumably, then, Encoder would be able to vend out both KeyedEncodingContainers and ValueEncodingContainers, correct?

Yes.

public protocol ValueEncodingContainer {
func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws

I’m assuming that the key here is a typo, correct?

Yes, sorry. I removed the forKey: label from the other calls, but not this one. (I almost left it on all of the calls, which would have been really confusing!)

Keep in mind that combining these concepts changes the semantics of how single-value encoding works. Right now SingleValueEncodingContainer only allows values of primitive types; this would allow you to encode a value in terms of a different arbitrarily-codable value.

Yes. I don't really see that as a problem; if you ask `Foo` to encode itself, and it only wants to encode a `Bar`, is anything really gained by insisting that it add a level of nesting first? More concretely: If you're encoding an enum with a `rawValue`, why not just encode the `rawValue`?

var codingKeyContext: [CodingKey]
}

And use sites would look like:

func encode(to encoder: Encoder) throws {
let container = encoder.container(keyedBy: CodingKey.self)
try container[.id].encode(id)
try container[.name].encode(name)
try container[.birthDate].encode(birthDate)
}

For consumers, this doesn’t seem to make much of a difference. We’ve turned try container.encode(id, forKey: .id) into try container[.id].encode(id).

It isn't terribly different for consumers, although the subscript is slightly less wordy. But it means that encoders/decoders only provide one set—not two—of encoding/decoding calls, and it allows some small bits of cleverness, like passing a SingleValue*Container off to a piece of code that's supposed to handle it.
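To make the rawValue point above concrete, here's a sketch of an enum that encodes as nothing but its raw value through a single value container, with no added nesting (the type itself is invented):

```swift
import Foundation

// Sketch: a RawRepresentable enum that encodes as just its raw value,
// via a single value container, with no extra level of nesting.
enum Planet: Int, Codable {
    case mercury = 1, venus, earth

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(rawValue)
    }

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(Int.self)
        guard let planet = Planet(rawValue: raw) else {
            throw DecodingError.dataCorrupted(.init(codingPath: decoder.codingPath,
                                                    debugDescription: "bad raw value \(raw)"))
        }
        self = planet
    }
}

// The enum serializes as a bare number, not a nested container.
let data = try! JSONEncoder().encode([Planet.earth])
print(String(data: data, encoding: .utf8)!)  // [3]
```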

These types were chosen because we want the API to make static guarantees about concrete types which all Encoders and Decoders should support. This is somewhat less relevant for JSON, but more relevant for binary formats where the difference between Int16 and Int64 is critical.

This turns the concrete type check into a runtime check that Encoder authors need to keep in mind. More so, however, any type can conform to SignedInteger or UnsignedInteger as long as it fulfills the protocol requirements. I can write an Int37 type, but no encoder could make sense of that type, and that failure is a runtime failure. If you want a concrete example, Float80 conforms to FloatingPoint; no popular binary format I’ve seen supports 80-bit floats, though — we cannot prevent that call statically…

Instead, we want to offer a static, concrete list of types that Encoders and Decoders must be aware of, and that consumers have guarantees about support for.

But this way instead forces encoders to treat a whole bunch of types as "primitive" which, to those encoders, aren't primitive at all.

Maybe it's just that we have different priorities here, but in general, I want an archiving system that (within reason) handles whatever types I throw at it, if necessary by augmenting the underlying encoder format with default Foundation-provided behavior. If a format only supports 64-bit ints and I throw a 128-bit int at it, I don't want it to truncate it or throw up its hands; I want it to read the two's-complement contents of the `BinaryInteger.words` property, convert it to a `Data` in some standard endianness, and write that out. Or convert to a human-readable `String` and use that. It doesn't matter a whole lot, as long as it does something it can undo later.
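As a sketch of that fallback, the words-to-Data step might look like this (my function, not proposed API):

```swift
import Foundation

// Sketch of the suggested fallback: serialize any BinaryInteger's
// two's-complement words as little-endian Data, which a decoder can undo.
func fallbackData<T: BinaryInteger>(for value: T) -> Data {
    var data = Data()
    for word in value.words {  // least-significant word first
        var littleEndianWord = word.littleEndian
        withUnsafeBytes(of: &littleEndianWord) { data.append(contentsOf: $0) }
    }
    return data
}

let bytes = fallbackData(for: 258)  // 258 = 0x0102
print(Array(bytes.prefix(2)))       // [2, 1]
```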

I also like that a system with very few primitives essentially makes no assumptions about what a format will need to customize. A low-level binary format cares a lot about different integer sizes, but a higher-level one probably cares more about dates, URLs, and dictionaries. For instance, I suspect (hope?) that the JSONEncoder is going to hook Array and Dictionary to make them form JSON arrays and objects, not the sort of key-based representation NSKeyedArchiver uses (if I recall correctly). If we just provide, for instance, these:

  func encode(_ value: String) throws
  func encode(_ value: NSNull) throws
  func encode(_ value: Codable) throws

Then there's exactly one path to customization—test for types in `encode(_: Codable)`—and everyone will use it. If you have some gigantic set of primitives, many coders will end up being filled with boilerplate to funnel ten integer types into one or two implementations, and nobody will be really happy with the available set.

In reality, you'll probably need a few more than just these three, particularly since BinaryInteger and FloatingPoint both have associated types, so several very important features (like their `bitWidth` and `isSigned` properties) can only be accessed through a separate primitive. But the need for a few doesn't imply that we need a big mess of them, particularly when the difference is only relevant to one particular class of encoders.

To accommodate my previous suggestion of using arrays to represent ordered encoded data, I would add one more primitive:

func encode(_ values: [Codable]) throws

Collection types are purposefully not primitives here:

* If Array is a primitive, but does not conform to Codable, then you cannot encode Array<Array<Codable>>.
* If Array is a primitive, and conforms to Codable, then there may be ambiguity between encode(_ values: [Codable]) and encode(_ value: Codable).
    * Even in cases where there are not, inside of encode(_ values: [Codable]), if I call encode([[1,2],[3,4]]), you’ve lost type information about what’s contained in the array — all you see is Codable.
    * If you change it to encode<Value : Codable>(_ values: [Value]) to compensate for that, you still cannot infinitely recurse on what type Value is. Try it with encode([[[[1]]]]) and you’ll see what I mean; at some point the inner types are no longer preserved.

Hmm, I suppose you're right.

Alternative design: In addition to KeyedContainers, you also have OrderedContainers. Like my proposed behavior for KeyedContainers, these merely vend SingleValue*Containers—in this case as an Array-like Collection.

  extension MyList: Codable {
    func encode(to encoder: Encoder) throws {
      let container = encoder.orderedContainer(self.count)
      
      for (valueContainer, elem) in zip(container, self) {
        try valueContainer.encode(elem)
      }
    }
    
    init(from decoder: Decoder) throws {
      let container = decoder.orderedContainer()
      
      self.init(try container.map { try $0.decode(Element.self) })
    }
  }

This helps us draw an important distinction between keyed and ordered containers. KeyedContainers locate a value based on the key. Perhaps the way in which it's based on the key is that it extracts an integer from the key and then finds the matching location in a list of values, but then that's just how keys are matched to values in that format. OrderedContainers, on the other hand, are contiguous, variable-length, and have an intrinsic order to them. If you're handed an OrderedContainer, you are meant to be able to enumerate its contents; a KeyedContainer is more opaque than that.

(Also, is there any sense in adding `Date` to this set, since it needs special treatment in many of our formats?)

We’ve considered adding Date to this list. However, this means that any format that is a part of this system needs to be able to make a decision about how to format dates. Many binary formats have no native representations of dates, so this is not necessarily a guarantee that all formats can make.

Looking for additional opinions on this one.

I think that, if you're taking the view that you want to provide a set of pre-specified primitive methods as a list of things you want encoders to make a policy decision about, Date is a good candidate. But as I said earlier, I'd prefer to radically reduce the set of primitives, not add to it.

IIUC, two of your three proposed Foundation-provided coders need to do something special with dates; perhaps one of the three needs to do something special with different integer sizes and types. Think of that as a message about your problem domain.

I see what you're getting at here, but I don't think this is fit for purpose, because arrays are not simply dictionaries with integer keys—their elements are adjacent and ordered. See my discussion earlier about treating inherently ordered containers as simply single-value `Array`s.

You’re right in that arrays are not simply dictionaries with integer keys, but I don’t see where we make that assertion here.

Well, because you're doing all this with a keyed container. That sort of implies that the elements are stored and looked up by key.

Suppose you want to write n elements into a KeyedEncodingContainer. You need a different key for each element, but you don't know ahead of time how many elements there are. So I guess you'll need to introduce a custom key type for no particular reason:

  struct /* wat */ IndexCodingKeys: CodingKey {
    var index: Int
    
    init(stringValue: String?, intValue: Int?) throws {
      guard let i = intValue ?? stringValue.flatMap({ Int($0) }) else {
        throw …
      }
      index = i
    }
    
    var stringValue: String? {
      return String(index)
    }
    var intValue: Int? {
      return index
    }
  }

And then you write them all into keyed slots? And on the way back in, you inspect `allKeys` (assuming it's filled in at all, since you keep saying that coders don't necessarily have to use the keys), and use that to figure out the available elements, and decode them?

I'm just not sure I understand how this is supposed to work reliably when you combine arbitrary coders and arbitrary types.

The way these containers are handled is completely up to the Encoder. An Encoder producing an array may choose to ignore keys altogether and simply produce an array from the values given to it sequentially. (This is not recommended, but possible.)

Again, as I said earlier, this idea that a keyed encoder could just ignore the keys entirely is very strange and worrying to me. It sounds like a keyed container has no dependable semantics at all.

There's preserving implementation flexibility, and then there's being so vague about behavior that nothing has any meaning and you can't reliably use anything. I'm very worried that, in some places, this design leans towards the latter. A keyed container might not write the keys anywhere in the file, but it certainly ought to use them to determine which field you're looking for. If it doesn't—if the key is just a suggestion—then all this API provides is a naming convention for methods that do vaguely similar things, potentially in totally incompatible ways.

This comes very close to—but doesn't quite—address something else I'm concerned about. What's the preferred way to handle differences in serialization to different formats?

Here's what I mean: Suppose I have a BlogPost model, and I can both fetch and post BlogPosts to a cross-platform web service, and store them locally. But when I fetch and post remotely, I need to conform to the web service's formats; when I store an instance locally, I have a freer hand in designing my storage, and perhaps need to store some extra metadata. How do you imagine handling that sort of situation? Is the answer simply that I should use two different types?

This is a valid concern, and one that should likely be addressed.

Perhaps the solution is to offer a userInfo : [UserInfoKey : Any] (UserInfoKey being a String-RawRepresentable struct or similar) on Encoder and Decoder set at the top-level to allow passing this type of contextual information from the top level down.
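To sketch the dictionary flavor of that idea, assuming a CodingUserInfoKey-style wrapper: the top level supplies context, and encode(to:) branches on it. The key name and the "localStore" convention here are invented for illustration:

```swift
import Foundation

// Sketch of the userInfo idea: the top level supplies context, and
// encode(to:) branches on it. The key and values are illustrative only.
struct BlogPost: Codable {
    var title: String
    var localDraftPath: String?

    private enum CodingKeys: String, CodingKey { case title, localDraftPath }
    static let targetKey = CodingUserInfoKey(rawValue: "target")!

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
        // Only the local store gets the extra metadata.
        if encoder.userInfo[BlogPost.targetKey] as? String == "localStore" {
            try container.encodeIfPresent(localDraftPath, forKey: .localDraftPath)
        }
    }
}

let localEncoder = JSONEncoder()
localEncoder.userInfo[BlogPost.targetKey] = "localStore"
let localData = try! localEncoder.encode(BlogPost(title: "Hi", localDraftPath: "/drafts/1"))
```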

At a broad level, that's a good idea. But why not provide something more precise than a bag of `Any`s here? You're in pure Swift; you have that flexibility.

  protocol Codable {
    associatedtype CodingContext = ()
    
    init<Coder: Decoder>(from decoder: Coder, with context: CodingContext) throws
    func encode<Coder: Encoder>(to encoder: Coder, with context: CodingContext) throws
  }
  protocol Encoder {
    associatedtype CodingContext = ()
    
    func container<Key : CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key, CodingContext>
    …
  }
  class KeyedEncodingContainer<Key: CodingKey, CodingContext> {
    func encode<Value: Codable>(_ value: Value?, forKey key: Key, with context: Value.CodingContext) throws { … }
    
    // Shorthand when contexts are the same:
    func encode<Value: Codable>(_ value: Value?, forKey key: Key) throws
      where Value.CodingContext == CodingContext
    { … }
    
    …
  }

We don’t support this type of polymorphic decoding. Because no type information is written into the payload (there’s no safe way to do this that is not currently brittle), there’s no way to tell what’s in there prior to decoding it (and there wouldn’t be a reasonable way to trust what’s in the payload to begin with).
We’ve thought through this a lot, but in the end we’re willing to make this tradeoff for security primarily, and simplicity secondarily.

Well, `String(reflecting: typeInstance)` will give you the fully-qualified type name, so you can certainly write it. (If you're worried about `debugDescription` on types changing, I'm sure we can provide something, either public or as SPI, that won't.) You can't read it and convert it back to a type instance, but you can read it and match it against the type provided, including by walking into superContainer()s and finding the one corresponding to the type instance the user passed. Or you could call a type method on the provided type and ask it for a subtype instance to use for initialization, forming a sort of class cluster. Or, as a safety measure, you can throw if there's a class name mismatch.

(Maybe it'd be better to write out and check the key type, rather than the instance type. Hmm.)

Obviously not every encoder will want to write out types—I wouldn't expect JSONEncoder to do it, except perhaps with some sort of off-by-default option—but I think it could be very useful if added.

How important is this performance? If the answer is "eh, not really that much", I could imagine a setup where every "primitive" type eventually represents itself as `String` or `Data`, and each `Encoder`/`Decoder` can use dynamic type checks in `encode(_:)`/`decode(_:)` to define whatever "primitives" it wants for its own format.

Does this imply that Int32 should decide how it’s represented as Data? What if an encoder forgets to implement that?

Yes, Int32 decides how—if the encoder doesn't do anything special to represent integers—it should be represented in terms of a more immediately serializable type like Data. If an encoder forgets to provide a special representation for Int32, then it falls back to a sensible, Foundation-provided default. If the encoder author later realizes their mistake and wants to correct the encoder, they'd probably better build backwards compatibility into the decoder.

Again, we want to provide a static list of types that Encoders know they must handle, and thus, consumers have guarantees that those types are supported.

I do think that consumers are guaranteed these types are supported: Even if the encoder doesn't do anything special, Foundation will write them out as simpler and simpler types until, sooner or later, you get to something that is supported, like Data or String. This is arguably a stronger level of guarantee than we have when there are a bunch of primitive types, because if an encoder author feels like nobody's going to actually use UInt8 when it's a primitive, the natural thing to do is to throw or trap. If the author feels the same way about UInt8 when it's not a primitive, then the natural thing to do is to let Foundation do what it does, which is write out UInt8 in terms of some other type.
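The dynamic-check funnel I have in mind is roughly this shape (entirely illustrative, not proposed API; the function and the string tags are made up):

```swift
import Foundation

// Sketch: an encoder's single customization point checks for the handful
// of types its format cares about and funnels everything else to a
// generic fallback. All names here are illustrative.
func encodePrimitive(_ value: Any) -> String {
    switch value {
    case let string as String: return "string:\(string)"
    case let data as Data:     return "data:\(data.count) bytes"
    case let int as Int32:     return "int32:\(int)"  // a format-specific special case
    default:                   return "fallback:\(type(of: value))"
    }
}

print(encodePrimitive("hi"))      // string:hi
print(encodePrimitive(Int32(7)))  // int32:7
print(encodePrimitive(3.5))       // fallback:Double
```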

···

On Mar 16, 2017, at 12:33 PM, Itai Ferber <iferber@apple.com> wrote:

--
Brent Royal-Gordon
Architechies

Another issue of scale - I had to switch to a native mail client as replying inline severely broke my webmail client. ;-)

Again, lots of love here. Responses inline.
Proposed solution
We will be introducing the following new types:

protocol Codable: Adopted by types to opt into archival. Conformance may be automatically derived in cases where all properties are also Codable.

FWIW I think this is an acceptable compromise. If the happy path is derived conformances, only-decodable or only-encodable types feel like a lazy way out on the part of a user of the API, and build a barrier to proper testing.

[snip]

Structured types (i.e. types which encode as a collection of properties) encode and decode their properties in a keyed manner. Keys may be String-convertible or Int-convertible (or both), and user types which have properties should declare semantic key enums which map keys to their properties. Keys must conform to the CodingKey protocol:
public protocol CodingKey { <##snip##> }

A few things here:

The protocol leaves open the possibility of having both a String and an Int representation, or neither. What should a coder do in either case? Are the representations intended to be mutually exclusive, or not? The protocol design doesn’t seem particularly in keeping with the flavor of Swift; I’d expect something along the lines of a CodingKey enum and the protocol CodingKeyRepresentable. It’s also possible that the concerns of the two are orthogonal enough that they deserve separate container(keyedBy:) requirements.

The general answer to "what should a coder do" is "what is appropriate for its format". For a format that uses exclusively string keys (like JSON), the string representation (if present on a key) will always be used. If the key has no string representation but does have an integer representation, the encoder may choose to stringify the integer. If the key has neither, it is appropriate for the Encoder to fail in some way.

On the flip side, for totally flat formats, an Encoder may choose to ignore keys altogether, in which case it doesn’t really matter. The choice is up to the Encoder and its format.

The string and integer representations are not meant to be mutually exclusive at all, and in fact, where relevant, we encourage providing both types of representations for flexibility.
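As a sketch of what "providing both representations" looks like in practice, here is a hand-written key type (the key names are illustrative only) conforming to the CodingKey protocol:

```swift
// A hand-written key type providing both a string and an integer
// representation, as encouraged above. Normally this would be
// compiler-synthesized; it is written out here for illustration.
enum PersonKeys: Int, CodingKey {
    case name = 1
    case age = 2

    var intValue: Int? { return rawValue }
    init?(intValue: Int) { self.init(rawValue: intValue) }

    var stringValue: String {
        switch self {
        case .name: return "name"
        case .age:  return "age"
        }
    }
    init?(stringValue: String) {
        switch stringValue {
        case "name": self = .name
        case "age":  self = .age
        default:     return nil
        }
    }
}
```

A string-keyed format (JSON) would use `.name` → "name", while a binary or ordered format could use `.name` → 1, all from the same key type.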

As for the possibility of having neither representation, this question comes up often. I’d like to summarize the thought process here by quoting some earlier review (apologies for the poor formatting from my mail client):

If there are two options, each of which is itself optional, we have 4 possible combinations. But! At the same time we prohibit one combination by what? Runtime error? Why not use a 3-case enum for it? Even further down the rabbit hole there might be a CodingKey<> specialized for a concrete combination, like CodingKey<StringAndIntKey> or just CodingKey<StringKey>, but I’m not sure whether our type system will make it useful or possible…

public enum CodingKeyValue {
    case integer(value: Int)
    case string(value: String)
    case both(intValue: Int, stringValue: String)
}

public protocol CodingKey {
    init?(value: CodingKeyValue)
    var value: CodingKeyValue { get }
}

I agree that this certainly feels suboptimal. We’ve certainly explored other possibilities before sticking to this one, so let me try to summarize here:

* Having a concrete 3-case CodingKey enum would preclude the possibility of having neither a stringValue nor an intValue. However, there is a lot of value in having the key types belong to the type being encoded (more safety, impossible to accidentally mix key types, private keys, etc.); if the CodingKey type itself is an enum (which cannot be inherited from), then this prevents differing key types.
* Your solution as presented is better: CodingKey itself is still a protocol, and the value itself is the 3-case enum. However, since CodingKeyValue is not literal-representable, user keys cannot be enums RawRepresentable by CodingKeyValue. That means that the values must either be dynamically returned, or (for attaining the benefits that we want to give users — easy representation, autocompletion, etc.) the type has to be a struct with static lets on it giving the CodingKeyValues. This certainly works, but is likely not what a developer would have in mind when working with the API; the power of enums in Swift makes them very easy to reach for, and I’m thinking most users would expect their keys to be enums. We’d like to leverage that where we can, especially since RawRepresentable enums are appropriate in the vast majority of use cases.
* Three separate CodingKey protocols (one for Strings, one for Ints, and one for both). You could argue that this is the most correct version, since it most clearly represents what we’re looking for. However, this means that every method now accepting a CodingKey must be converted into 3 overloads each accepting different types. This explodes the API surface, is confusing for users, and also makes it impossible to use CodingKey as an existential (unless it’s an empty 4th protocol which makes no static guarantees and the others inherit from).
* [The current] approach. On the one hand, this allows for the accidental representation of a key with neither a stringValue nor an intValue. On the other, we want to make it really easy to use autogenerated keys, or autogenerated key implementations if you provide the cases and values yourself. The nil value possibility is only a concern when writing stringValue and intValue yourself, which the vast majority of users should not have to do.
* Additionally, a key word in that sentence bolded above is “generally”. As part of making this API more generalized, we push a lot of decisions to Encoders and Decoders. For many formats, it’s true that having a key with no value is an error, but this is not necessarily true for all formats; for a linear, non-keyed format, it is entirely reasonable to ignore the keys in the first place, or replace them with fixed-format values. The decision of how to handle this case is left up to Encoders and Decoders; for most formats (and for our implementations), this is certainly an error, and we would likely document this and either throw or preconditionFailure. But this is not the case always.
* In terms of syntax, there’s another approach that would be really nice (but is currently not feasible) — if enums were RawRepresentable in terms of tuples, it would be possible to give implementations for String, Int, (Int, String), (String, Int), etc., making this condition harder to represent by default unless you really mean to.

Hope that gives some helpful background on this decision. FWIW, the only way to end up with a key having no intValue or stringValue is to implement the CodingKey protocol manually (which should be exceedingly rare) and to write the implementations without switching on self, or in some other way that lets a key accidentally end up with neither value.

Speaking of the mutually exclusive representations - what about serializations whose keys don’t code as one of those two things? YAML can have anything be a “key”, and despite that being not particularly sane, it is a use case.

We’ve explored this, but at the end of the day, it’s not possible to generalize this to the point where we could represent all possible options on all possible formats because you cannot make any promises as to what’s possible and what’s not statically.

We’d like to strike a balance here between strong static guarantees on one end (the extreme end of which introduces a new API for every single format, since you can almost perfectly statically express what’s possible and what isn’t) and generalization on the other (the extreme end of which is an empty protocol, because there really are encoding formats which are mutually exclusive). So in this case, this API would support producing and consuming YAML with string or integer keys, but not arbitrary YAML.

For most types, String-convertible keys are a reasonable default; for performance, however, Int-convertible keys are preferred, and Encoders may choose to make use of Ints over Strings. Framework types should provide keys which have both for flexibility and performance across different types of Encoders. It is generally an error to provide a key which has neither a stringValue nor an intValue.
Could you speak a little more to using Int-convertible keys for performance? I get the feeling int-based keys parallel the legacy of NSCoder’s older design, and I don’t really see anyone these days supporting non-keyed archivers. They strike me as fragile. What other use cases are envisioned for ordered archiving than that?

We agree that integer keys are fragile, and from years (decades) of experience with NSArchiver, we are aware of the limitations that such encoding offers. For this reason, we will never synthesize integer keys on your behalf. This is something you must put thought into, if using an integer key for archival.

However, there are use-cases (both in archival and in serialization, but especially so in serialization) where integer keys are useful. Ordered encoding is one such possibility (when the format supports it, integer keys are sequential, etc.), and is helpful for, say, marshaling objects in an XPC context (where both sides are aware of the format, are running the same version of the same code, on the same device) — keys waste time and bandwidth unnecessarily in some cases.

Integer keys don’t necessarily imply ordered encoding, however. There are binary encoding formats which support integer-keyed dictionaries (read: serialized hash maps) which are more efficient to encode and decode than similar string-keyed ones. In that case, as long as integer keys are chosen with care, the end result is more performant.

But again, this depends on the application and use case. Defining integer keys requires manual effort because we want thought put into defining them; they are indeed fragile when used carelessly.

[snip]

Keyed Encoding Containers

Keyed encoding containers are the primary interface that most Codable types interact with for encoding and decoding. Through these, Codable types have strongly-keyed access to encoded data by using keys that are semantically correct for the operations they want to express.

Since semantically incompatible keys will rarely (if ever) share the same key type, it is impossible to mix up key types within the same container (as is possible with String keys), and since the type is known statically, keys get autocompletion by the compiler.

open class KeyedEncodingContainer<Key : CodingKey> {

Like others, I’m a little bummed about this part of the design. Your reasoning up-thread is sound, but I chafe a bit on having to reabstract and a little more on having to be a reference type. Particularly knowing that it’s got a bit more overhead involved… I /like/ that NSKeyedArchiver can simply push some state and pass itself as the next encoding container down the stack.

There’s not much more to be said about why this is a class that I haven’t covered; if it were possible to do otherwise at the moment, then we would.

It is possible using a manually written type-erased wrapper along the lines of AnySequence and AnyCollection. I don’t recall seeing a rationale for why you don’t want to go this route. I would still like to hear more on this topic.

As for why we do this — this is the crux of the whole API. We not only want to make it easy to use a custom key type that is semantically correct for your type, we want to make it difficult to do the easy but incorrect thing. From experience with NSKeyedArchiver, we’d like to move away from unadorned string (and integer) keys, where typos and accidentally reused keys are common, and impossible to catch statically.
encode<T : Codable>(_: T?, forKey: String) unfortunately not only encourages code like encode(foo, forKey: "foi") // whoops, typo; it also makes it more difficult to use a semantic key type: encode(foo, forKey: CodingKeys.foo.stringValue). The additional typing and lack of autocompletion make it an active disincentive. encode<T : Codable>(_: T?, forKey: Key) reverses both of these — it makes it impossible to use unadorned strings or to accidentally use keys from another type, and nets shorter code with autocompletion: encode(foo, forKey: .foo)

The side effect of this being the fact that keyed containers are classes is suboptimal, I agree, but necessary.
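The contrast described above can be illustrated with a hypothetical mini-API (StringlyEncoder, TypedEncoder, and Keys are invented for this example): stringly-typed keys accept typos silently, while a semantic key enum restricts callers to valid keys and shortens call sites.

```swift
// With String keys, a typo compiles silently:
struct StringlyEncoder {
    private(set) var storage: [String: String] = [:]
    mutating func encode(_ value: String, forKey key: String) {
        storage[key] = value
    }
}

// With a semantic key type, only declared keys are accepted:
enum Keys: String { case foo }

struct TypedEncoder {
    private(set) var storage: [String: String] = [:]
    mutating func encode(_ value: String, forKey key: Keys) {
        storage[key.rawValue] = value
    }
}

var bad = StringlyEncoder()
bad.encode("value", forKey: "foi")   // whoops, typo; compiles fine

var good = TypedEncoder()
good.encode("value", forKey: .foo)   // forKey: .foi would not compile
```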

open func encode<Value : Codable>(_ value: Value?, forKey key: Key) throws

Does this win anything over taking a Codable?

Taking the concrete type over an existential allows for static dispatch on the type within the implementation, and is a performance win in some cases.

open func encode(_ value: Bool?, forKey key: Key) throws
open func encode(_ value: Int?, forKey key: Key) throws
open func encode(_ value: Int8?, forKey key: Key) throws
open func encode(_ value: Int16?, forKey key: Key) throws
open func encode(_ value: Int32?, forKey key: Key) throws
open func encode(_ value: Int64?, forKey key: Key) throws
open func encode(_ value: UInt?, forKey key: Key) throws
open func encode(_ value: UInt8?, forKey key: Key) throws
open func encode(_ value: UInt16?, forKey key: Key) throws
open func encode(_ value: UInt32?, forKey key: Key) throws
open func encode(_ value: UInt64?, forKey key: Key) throws
open func encode(_ value: Float?, forKey key: Key) throws
open func encode(_ value: Double?, forKey key: Key) throws
open func encode(_ value: String?, forKey key: Key) throws
open func encode(_ value: Data?, forKey key: Key) throws

What is the motivation behind abandoning the idea of “primitives” from the Alternatives Considered? Performance? Being unable to close the protocol?

Being unable to close the protocol is the primary reason. Not being able to tell at a glance what the concrete types belonging to this set are is related, and also a top reason.

Looks like we have another strong motivating use case for closed protocols. I hope that will be in scope for Swift 5.

It would be great for the auto-generated documentation and “headers” to provide a list of all public or open types inheriting from a closed class or conforming to a closed protocol (when we get them). This would go a long way towards addressing your second reason.

···

On Mar 17, 2017, at 1:15 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:
On 15 Mar 2017, at 22:58, Zach Waldowski wrote:
On Mar 15, 2017, at 6:40 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

What ways is encoding a value envisioned to fail? I understand wanting to allow maximum flexibility, and being symmetric to `decode` throwing, but there are plenty of “conversion” patterns that are asymmetric in the ways they can fail (Date formatters, RawRepresentable, LosslessStringConvertible, etc.).

Different formats support different concrete values, even of primitive types. For instance, you cannot natively encode Double.nan in JSON, but you can in plist. Without additional options on JSONEncoder, encode(Double.nan, forKey: …) will throw.
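The underlying format restriction can be checked directly: JSON has no literal for NaN or infinity, which is why a JSON encoder must fail (absent special options) when handed Double.nan, while a plist encoder need not. JSONSerialization's validity check demonstrates this.

```swift
import Foundation

// JSON cannot represent NaN or infinity, so a dictionary containing
// Double.nan is not a valid JSON object, while an ordinary finite
// value is fine.
let finite: [String: Any] = ["value": 1.5]
let nonFinite: [String: Any] = ["value": Double.nan]

let finiteIsValid = JSONSerialization.isValidJSONObject(finite)    // true
let nanIsValid = JSONSerialization.isValidJSONObject(nonFinite)    // false
```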

/// For `Encoder`s that implement this functionality, this will only encode the given object and associate it with the given key if it encoded unconditionally elsewhere in the archive (either previously or in the future).
open func encodeWeak<Object : AnyObject & Codable>(_ object: Object?, forKey key: Key) throws

Is this correct that if I send a Cocoa-style object graph (with weak backrefs), an encoder could infinitely recurse? Or is a coder supposed to detect that?

encodeWeak has a default implementation that calls the regular encode<T : Codable>(_: T, forKey: Key); only formats which actually support weak backreferencing should override this implementation, so it should always be safe to call (it will simply unconditionally encode the object by default).
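A minimal sketch of that default (MiniKeyedContainer and Node are hypothetical stand-ins, not the proposal's types): encodeWeak simply forwards to the unconditional encode unless a format that supports weak backreferences overrides it.

```swift
// Simplified sketch of the encodeWeak default described above.
class MiniKeyedContainer {
    private(set) var storage: [String: AnyObject] = [:]

    func encode(_ object: AnyObject?, forKey key: String) {
        storage[key] = object
    }

    // Default implementation: unconditionally encode. Only formats that
    // actually support weak backreferencing would override this.
    func encodeWeak(_ object: AnyObject?, forKey key: String) {
        encode(object, forKey: key)
    }
}

final class Node {
    let name: String
    init(name: String) { self.name = name }
}

let container = MiniKeyedContainer()
container.encodeWeak(Node(name: "parent"), forKey: "parent")
```

Because the default is a plain forward, calling encodeWeak is always safe; it just loses the backreference optimization on formats that don't support it.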

open var codingKeyContext: [CodingKey]
}
[snippity snip]
Alright, those are just my first thoughts. I want to spend a little time marinating in the code from PR #8124 before I comment further. Cheers! I owe you, Michael, and Tony a few drinks for sure.

Hehe, thanks :)

Zach Waldowski
zach@waldowski.me

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

This is a fantastic proposal! I am very much looking forward to robust Swift-native encoding and decoding in Foundation. The compiler synthesized conformances is especially great! I want to thank everyone who worked on it. It is clear that a lot of work went into the proposal.

The proposal covers a lot of ground so I’m breaking my comments up by topic in the order they occur in the proposal.

Thanks for the feedback, Matthew! Responses inline.

And thank you for the responses!

Encode / Decode only types:

Brent raised the question of decode only types. Encode only types are also not uncommon when an API accepts an argument payload that gets serialized into the body of a request. The compiler synthesis feature in the proposal makes providing both encoding and decoding easy in common cases but this won’t always work as needed.

The obvious alternative is to have Decodable and Encodable protocols which Codable refines. This would allow us to omit a conformance we don’t need when it can’t be synthesized.

If conformances are still synthesized individually (i.e. for just Decodable or just Encodable), it would be way too easy to accidentally conform to one or the other and not realize that you’re not conforming to Codable, since the synthesis is invisible. You’d just be missing half of the protocol.

This is the kind of mistake people don’t tend to make often and Swift’s typing will alert someone pretty quickly if they make this mistake. A fixit could even be offered if the type is in the same module as it is used incorrectly. I really don’t think it’s that big a deal to expect people to understand the differences. They already need to understand encoders and decoders to make use of these protocols and this is just the other side of that distinction.

If the way out of this is to only synthesize conformance to Codable, then it’s much harder to justify the inclusion of Encodable or Decodable since those would require a manual implementation and would much more rarely be used.

I wouldn’t limit synthesis in that way.

This isn’t that big a deal given that synthesis will do the work for us most of the time but I think it’s unfortunate to see these coupled. There will be times when we have to choose between fatalError and maintaining code we don’t need. That’s a bad choice to have to make. I don’t like designs that impose it on me.

Your reply to Brent mentions using `fatalError` to avoid implementing the direction that isn't needed. I think it would be better if the conformance can reflect what is actually supported by the type. Requiring us to write `fatalError` as a stub for functionality we don’t need is a design problem IMO. I don’t think the extra protocols are really that big a burden. They don’t add any new functionality and are very easy to understand, especially considering the symmetry they would have with the other types you are introducing.

Coding Keys:

As others have mentioned, the design of this protocol does not require a value of a conforming type to actually be a valid key (it can return nil for both `intValue` and `stringValue`). This seems problematic to me.

In the reply to Brent again you mention throwing and `preconditionFailure` as a way to handle incompatible keys. This also seems problematic to me and feels like a design problem. If we really need to support more than one underlying key type and some encoders will reject some key types this information should be captured in the type system. An encoder would only vend a keyed container for keys it actually supports. Ideally the conformance of a type’s CodingKeys could be leveraged to produce a compiler error if an attempt was made to encode this type into an encoder that can’t support its keys. In general, the idea is to produce static errors as close to the origin of the programming mistake as possible.

I would very much prefer that we don’t defer to runtime assertions or thrown errors, etc for conditions that could be caught statically at compile time given the right design. Other comments have indicated that static guarantees are important to the design (encoders *must* guarantee support of primitives specified by the protocols, etc). Why is a static guarantee of compatible coding keys considered less important?

I agree that it would be nice to support this in a static way, but while not impossible to represent in the type system, it absolutely explodes the API into a ton of different types and protocols which are not dissimilar. We’ve considered this in the past (see approach #4 in the Alternatives Considered <https://github.com/itaiferber/swift-evolution/blob/637532e2abcbdb9861e424359bb6dac99dc6b638/proposals/XXXX-swift-archival-serialization.md#alternatives-considered> section) and moved away from it for a reason.

To summarize:

* To statically represent the difference between an encoder which supports string keys and one which supports integer keys, we would have to introduce two different protocol types (say, StringKeyEncoder and IntKeyEncoder).
* Now that there are two different encoder types, the Codable protocol also needs to be split up into two — one version which encodes using a StringKeyEncoder and one version which encodes using an IntKeyEncoder. If you want to support encoding to an encoder which supports both types of keys, we’d need a third Codable protocol which takes something that’s StringKeyEncoder & IntKeyEncoder (because you cannot just conform to both StringCodable and IntCodable — it’s ambiguous when given something that’s StringKeyEncoder & IntKeyEncoder).
* On encoders which support both string and integer keys, you need overloads for encode<T : StringCodable>(…), encode<T : IntCodable>(…), and encode<T : StringCodable & IntCodable>(…) or else the call is ambiguous.
* Repeat for both encode<T : …>(_ t: T?, forKey: String) and encode<T : …>(_ t: T?, forKey: Int).
* Repeat for decoders, with all of their overloads as well.
This is not to mention the issue of things which are single-value encodable, which adds additional complexity. Overall, the complexity of this makes it unapproachable and confusing as API, and is a hassle for both consumers and for Encoder/Decoder writers.

We are willing to make the runtime failure tradeoff for keeping the rest of the API consumable, with the understanding that we expect that the vast majority of CodingKey conformances will be automatically generated, and that type authors will generally provide key types which are appropriate for the formats they expect to encode their own types in.

What if we go in a different direction here? Instead of distinguishing both, why not at least require all keys to support strings? This is the common use case. As you have already noted elsewhere, people using Int keys are already kind of in “expert” territory. I would feel a lot better if at least the common use case were statically safe. Is it really that important to support Int-only keys? A type that is primarily intended to be used with Int keys could just stringify its Int values to provide strings. You could even add a new protocol providing default implementations for coding key types which want to be implemented in terms of Int:

public protocol CodingKey {
    var stringValue: String { get }
    init?(stringValue: String)
    var intValue: Int? { get }
    init?(intValue: Int)
}
protocol IntCodingKey: CodingKey {
    var guaranteedIntValue: Int { get }
}
extension IntCodingKey {
    var intValue: Int? { return guaranteedIntValue }
    var stringValue: String { return "\(guaranteedIntValue)" }
    init?(stringValue: String) {
        if let int = Int(stringValue) {
            self.init(intValue: int)
        } else {
            return nil
        }
    }
}

IntCodingKey would not need to appear anywhere in the APIs, it could exist solely to provide the default implementations of the string members. Libraries that wish to discover that *all* instances of this CodingKey type could cast to IntCodingKey to determine that.
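A self-contained check of the IntCodingKey sketch above (the protocol and extension are repeated verbatim here so the example compiles on its own; FieldKey is an invented example key type):

```swift
// Re-declaration of the sketched protocols (shadowing any standard-library
// CodingKey for the purposes of this example).
protocol CodingKey {
    var stringValue: String { get }
    init?(stringValue: String)
    var intValue: Int? { get }
    init?(intValue: Int)
}

protocol IntCodingKey: CodingKey {
    var guaranteedIntValue: Int { get }
}

extension IntCodingKey {
    var intValue: Int? { return guaranteedIntValue }
    var stringValue: String { return "\(guaranteedIntValue)" }
    init?(stringValue: String) {
        if let int = Int(stringValue) {
            self.init(intValue: int)
        } else {
            return nil
        }
    }
}

// A key type implemented purely in terms of Int; the string members come
// for free from the extension above.
enum FieldKey: Int, IntCodingKey {
    case id = 1, name = 2
    var guaranteedIntValue: Int { return rawValue }
    init?(intValue: Int) { self.init(rawValue: intValue) }
}
```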

Keyed Containers:

Joe raised the topic of the alternative of using manual type erasure for the keyed containers rather than abstract classes. Did you explore this direction at all? It feels like a more natural approach for Swift, and as Joe noted, it can be designed in such a way that eases migration to existentials in the future if that is the “ideal” design (which you mentioned in your response).

Joe mentions the type-erased types as an example of where we’ve had to use those because we’re lacking other features — I don’t see how type erasure would be the solution. We’re doing the opposite of type-erasure: we’re trying to offer an abstract type that is generic and specified on a type you give it. The base KeyedEncodingContainer is effectively a type-erased base type, but it’s the generics that we really need.

You’re trying to erase some type information while preserving the key type. There are several ways to accomplish this. I don’t understand the specific problem you’re running into in writing a non-class type that does this. If there is a technical limitation around this I am interested in learning what it is. But maybe it’s possible to use a different approach to these types. I would appreciate it if you could elaborate on the specific technical details behind the belief that a struct like AnyCollection / AnySequence is not viable.

Decoding Containers:

returns: A value of the requested type, if present for the given key and convertible to the requested type.

Can you elaborate on what “convertible to the requested type” means? I think this is an important detail for the proposal.

For example, I would expect numeric values to attempt conversion using the SE-0080 failable numeric conversion initializers (decoding JSON was a primary motivating use case for that proposal). If the requested type conforms to RawRepresentable and the encoded value can be converted to RawValue (perhaps using a failable numeric initializer) I would expect the raw value initializer to be used to attempt conversion. If Swift ever gained a standard mechanism for generalized value conversion I would also expect that to be used if a conversion is available from the encoded type to the requested type.

If either of those conversions fail I would expect something like an “invalid value” error or a “conversion failure” error rather than a “type mismatch” error. The types don’t exactly mismatch, we just have a failable conversion process that did not succeed.

Keep in mind that no type information is written into the payload, so the interpretation of this is up to the Encoder and its format.
JSON, for instance, has no delineation between number types. For {"value": 1}, you should be able to decode(…, forKey: .value) the value through any one of the numeric types, since 1 is representable by any of them. However, requesting it as a String should throw a .coderTypeMismatch.
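The "no type information in the payload" point can be seen with JSONSerialization today: the payload {"value": 1} can be read back through several Swift numeric types, while a String request genuinely mismatches.

```swift
import Foundation

// {"value": 1} carries no numeric type information, so the same encoded
// value can be viewed through multiple numeric types.
let data = "{\"value\": 1}".data(using: .utf8)!
let object = try! JSONSerialization.jsonObject(with: data) as! [String: Any]

let asInt = object["value"] as? Int        // 1
let asDouble = object["value"] as? Double  // 1.0 (exactly representable)
let asString = object["value"] as? String  // nil: a real type mismatch
```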

It sounds like the behavior we’ll get for JSON numbers is what I would expect. But what about encoders that do keep track of the type of a numeric value during serialization? What behavior is expected by them?

Protocols are about semantics, not just syntax. "A value of the requested type, if present for the given key and convertible to the requested type” is a very vague semantic statement. Should we define the semantics of what kind of conversions are valid and which are not more precisely? Or do you think it is important to leave this up to individual decoders to decide? Mandating failable numeric conversions would make it easier to change a numeric property types without necessarily breaking archived data (especially when promoting to larger numeric types).

You didn’t answer my question about decoding RawRepresentable types where the payload is RawValue. It looks like that won’t happen and instead the RawRepresentable type would need to conform to Codable, which would usually be synthesized to store a single value. That seems reasonable.

If you try to ask for 3.14 as an Int, I think it’s valid to get a .coderTypeMismatch — you asked for something of the wrong type altogether. I don’t see much value in providing a different error type to represent the same thing.

Context:

I’m glad Brent raised the topic of supporting multiple formats. In his example the difference was remote and local formats. I’ve also seen cases where the same API requires the same model to be encoded differently in different endpoints. This is awful, but it also happens sometimes in the wild. Supporting an application specified encoding context would be very useful for handling these situations (the `codingKeyContext` would not always be sufficient and would usually not be a good way to determine the required encoding or decoding format).

A `[UserInfoKey: Any]` context was mentioned as a possibility. This would be better than nothing, but why not use the type system to preserve information about the type of the context? I have a slightly different approach in mind. Why not just have a protocol that refines Codable with context awareness?

public protocol ContextAwareCodable: Codable {
    associatedtype Context
    init(from decoder: Decoder, with context: Context?) throws
    func encode(to encoder: Encoder, with context: Context?) throws
}

extension ContextAwareCodable {
    init(from decoder: Decoder) throws {
        try self.init(from: decoder, with: nil)
    }

    func encode(to encoder: Encoder) throws {
        try self.encode(to: encoder, with: nil)
    }
}

There are a few issues with this:

For an Encoder to be able to call encode(to:with:) with the Context type, it would have to know the Context type statically. That means it would have to be generic on the Context type (we’ve looked at this in the past with regards to the encoder declaring the type of key it’s willing to accept)

This is not true. It could expose a generic top-level method and capture the type of Context there in the internal type used to perform the actual encoding.

It makes more sense for the Encoder to define what context type it vends, rather than have Codable things discriminate on what they can accept. If you have two types in a payload which require mutually-exclusive context types, you cannot encode the payload at all
associatedtype requirements cannot be overridden by subclasses. That means if you subclass a ContextAwareCodable type, you cannot require a different context type than your parent, which may be a no-go. This, by the way, is why we don’t have an official associatedtype CodingKeys : CodingKey requirement on Codable

I based my suggestion on real world use cases I have encountered. In these use cases the context would usually be an enum that all types involved in any given (de)serialization share.

There is a tradeoff here - you get increased type safety, but as you point out, you can’t set up a context where some types involved in the (de)serialization know about one part of the context and others know about a different part of the context. In my experience the type safety would provide a lot more benefit than the additional flexibility. IMO we need to stop using Any payload dictionaries when we can avoid them. Does anyone actually have specific real world use cases where the statically typed approach wouldn’t work?

Encoders and Decoders would be encouraged to support a top level encode / decode method which is generic and takes an application supplied context. When the context is provided it would be given to all `ContextAwareCodable` types that are aware of this context type during encoding or decoding. The coding container protocols would include an overload for `ContextAwareCodable` allowing the container to know whether the Value understands the context given to the top level encoder / decoder:

open func encode<Value : ContextAwareCodable>(_ value: Value?, forKey key: Key) throws

A common top level signature for coders and decoders would look something like this:

open func encode<Value : Codable, Context>(_ value: Value, with context: Context) throws -> Data

This approach would preserve type information about the encoding / decoding context. It falls back to the basic Codable implementation when a Value doesn’t know about the current context. The default implementation simply translates this to a nil context allowing ContextAwareCodable types to have a single implementation of the initializer and the encoding method that is used whether they are able to understand the current context or not.
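A minimal sketch of the shape being described (the protocol name matches the thread, but the exact context-taking requirements and the nil fallback are my illustration, not the proposal's text):

```swift
// Illustrative sketch only: a context-aware protocol whose plain Codable
// conformance falls back to "no context" so a single implementation serves
// both the context-aware and context-unaware paths.
protocol ContextAwareCodable: Codable {
    associatedtype Context
    init(from decoder: Decoder, context: Context?) throws
    func encode(to encoder: Encoder, context: Context?) throws
}

extension ContextAwareCodable {
    // Default Codable conformance: behave as though no context was supplied.
    init(from decoder: Decoder) throws {
        try self.init(from: decoder, context: nil)
    }

    func encode(to encoder: Encoder) throws {
        try self.encode(to: encoder, context: nil)
    }
}
```

A coder that recognizes a value's Context type would call the context-taking overloads; any other value falls through to the ordinary Codable requirements.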

I amend the above — this means that if you have two types which require different contexts in the same payload, only one of them will get the context and the other silently will not. I’m not sure this is better.

Like I said, there is a tradeoff involved here. A type can only choose one kind of context to care about, but in exchange we get static type safety. If that context isn’t available, the type does not need to know about it. Careful design using existentials for context types would allow users to gradually reduce type safety if necessary. If a type really wants to know about all contexts, it would simply use Any as its context type. This gives us more type safety when we want it without losing the ability to access arbitrary context data when we need it.

A slightly more type-erased context type would allow all members to look at the context if desired without having to split the protocol, require multiple different types and implementations, etc.

I don’t quite follow you here. Can you elaborate?

···

On Mar 17, 2017, at 2:42 PM, Itai Ferber <iferber@apple.com> wrote:
On 16 Mar 2017, at 14:29, Matthew Johnson wrote:

- Matthew

I assumed that in those situations, one would create a wrapper struct,

struct WebBlogModel {
    let wrapped: BlogModel
}

probably for the encoding impl that requires more custom work. The
implementation of Codable for this struct would then serialize
(deserialize) from (to) its wrapped value's properties directly.

Types already provide a means for performing context sensitive
implementation selection, I don't think it's necessary to provide another
way to do that in Swift. Of course I could very well be wrong :)
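Fleshing out the wrapper approach for concreteness (a sketch; the web service's snake_case key names here are invented for illustration):

```swift
import Foundation

// The plain model, used for local storage with its default representation.
struct BlogModel: Codable {
    var title: String
    var body: String
}

// A wrapper that encodes/decodes the wrapped value's properties directly,
// using the (hypothetical) web service's key names.
struct WebBlogModel: Codable {
    let wrapped: BlogModel

    private enum CodingKeys: String, CodingKey {
        case title = "post_title"
        case body = "post_body"
    }

    init(wrapped: BlogModel) { self.wrapped = wrapped }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        wrapped = BlogModel(title: try container.decode(String.self, forKey: .title),
                            body: try container.decode(String.self, forKey: .body))
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(wrapped.title, forKey: .title)
        try container.encode(wrapped.body, forKey: .body)
    }
}
```

The choice of representation then lives in the type system: encode a `BlogModel` for local storage, or a `WebBlogModel` for the service.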

-Colin

···

On Thu, Mar 16, 2017 at 3:33 PM Itai Ferber via swift-evolution < swift-evolution@swift.org> wrote:

Here's what I mean: Suppose I have a BlogPost model, and I can both fetch
and post BlogPosts to a cross-platform web service, and store them locally.
But when I fetch and post remotely, I need to conform to the web service's
formats; when I store an instance locally, I have a freer hand in designing
my storage, and perhaps need to store some extra metadata. How do you
imagine handling that sort of situation? Is the answer simply that I should
use two different types?

This is a valid concern, and one that should likely be addressed.

Perhaps the solution is to offer a userInfo : [UserInfoKey : Any] (UserInfoKey being a String-RawRepresentable struct or similar) on Encoder and Decoder, set at the top level, to allow passing this type of contextual information down.
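For illustration, here is how a type could consume such a context; the "destination" key and the gating logic are invented, and I'm using the CodingUserInfoKey spelling that Foundation eventually shipped for this idea:

```swift
import Foundation

struct BlogPost: Codable {
    // Hypothetical context key: which representation are we producing?
    static let destination = CodingUserInfoKey(rawValue: "destination")!

    var title: String
    var localMetadata: String?

    enum CodingKeys: String, CodingKey { case title, localMetadata }

    init(title: String, localMetadata: String? = nil) {
        self.title = title
        self.localMetadata = localMetadata
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        localMetadata = try container.decodeIfPresent(String.self, forKey: .localMetadata)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
        // Only the local-storage representation carries the extra metadata.
        if encoder.userInfo[BlogPost.destination] as? String == "local" {
            try container.encodeIfPresent(localMetadata, forKey: .localMetadata)
        }
    }
}

// The context is supplied once, at the top level.
let localEncoder = JSONEncoder()
localEncoder.userInfo[BlogPost.destination] = "local"
```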

On second thought, "property behaviors" <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20170313/034042.html> could eventually allow the custom name to be stored in a `KeyPath` subclass.

Key paths would also need `LosslessStringConvertible` conformance. The current design only has `CustomDebugStringConvertible` conformance.

-- Ben

···

On 23 Mar 2017, at 19:37, Ben Rimmington wrote:

On 22 Mar 2017, at 17:41, Itai Ferber wrote:

What’s the use case that you were thinking of? KeyPaths could be useful in the case where you don’t need to customize your key names, but cannot represent a custom case like

public struct Post {
    var authorID: Int
    var bodyText: String

    private enum CodingKeys : String, CodingKey {
        case authorID = "author_id"
        case bodyText = "body_text"
    }
}
Or am I misunderstanding?

For custom names, the `CodingKeys` enum does seem like the best design, unless an attribute can be used.

  public struct Post : Codable {
      @codable(name: "author_id") var authorID: Int
      @codable(name: "body_text") var bodyText: String
  }

If each `KeyPath` encapsulates the type information, the `decode` methods won't need a `type` parameter.

  /// Primitive decoding methods (for single-value and keyed containers).
  open class DecodingContainer<Root : Codable> {
      open func decode(for keyPath: KeyPath<Root, Bool>) throws -> Bool
      open func decode(for keyPath: KeyPath<Root, Int>) throws -> Int
      open func decode(for keyPath: KeyPath<Root, UInt>) throws -> UInt
      open func decode(for keyPath: KeyPath<Root, Float>) throws -> Float
      open func decode(for keyPath: KeyPath<Root, Double>) throws -> Double
      open func decode(for keyPath: KeyPath<Root, String>) throws -> String
      open func decode(for keyPath: KeyPath<Root, Data>) throws -> Data
  }

  /// Keyed containers inherit the primitive decoding methods.
  open class KeyedDecodingContainer<Root : Codable> : DecodingContainer<Root> {
      open func decode<Value : Codable>(for keyPath: KeyPath<Root, Value>) throws -> Value
  }

Hi Zach,

Thanks for your comments!
The type is called "unkeyed", but I assume "nonkeyed" was a typo and that's what you meant. As far as the phrasing of "ordered" and "sequential", both sound good, but:

1. The symmetry between "keyed" and "unkeyed" is helpful in creating opposition between types of encoding (and especially so in comparison to "single value", which is the odd man out — and you'd extremely rarely need to interact with it)
2. Given something that's "x" or "not x", you'd generally gravitate toward the thing with the more positive phrasing. As you mention, we really want to encourage keyed containers and diminish the use of unkeyed containers unless truly necessary, because they're fragile. The problem is that unkeyed containers are much easier to use (especially accidentally, as a novice, since their API is much simpler), and I think "ordered" and "sequential" don't go far enough to detract from their usage.

They sound good; in fact, too good, and we find that the more negative phrasing is helpful.

— Itai

···

On 3 Apr 2017, at 16:01, Zach Waldowski via swift-evolution wrote:

Itai and co:

This is a solid improvement.

I think it's appropriate to diminish the importance of non-keyed
containers. "Nonkeyed" as the name is pretty iffy to me, though, even
though I admit it makes the use case pretty clear. "Ordered" or
"Sequential" both sound fine, even for an encoder that's slot-based
instead of NSArchiver-like model. An array is ordered but you don't
have to traverse it in order.

Best,

  Zachary Waldowski

  zach@waldowski.me

On Mon, Apr 3, 2017, at 04:31 PM, Itai Ferber via swift-evolution wrote:

Hi everyone,

With feedback from swift-evolution and additional internal review,
we've pushed updates to this proposal, and to the Swift Encoders[1]
proposal. In the interest of not blowing up mail clients with the full
HTML again, I'll simply be linking to the swift-evolution PR here[2],
as well as the specific diff[3] of what's changed.
At a high level:

* The Codable protocol has been split up into Encodable and Decodable
* String keys on CodingKey are no longer optional
* KeyedEncodingContainer has become
   KeyedEncodingContainerProtocol, with a concrete type-erased
   KeyedEncodingContainer struct to hold it
* Array responsibilities have been removed from
   KeyedEncodingContainer, and have been added to a new
   UnkeyedEncodingContainer type
* codingKeyContext has been renamed codingPath
There are some specific changes inline — I know it might be a bit of a
pain, but let's keep discussion here on the mailing list instead of on
GitHub. We'll be looking to start the official review process very
soon, so we're interested in any additional feedback.
Thanks!

— Itai

_________________________________________________

swift-evolution mailing list

swift-evolution@swift.org

https://lists.swift.org/mailman/listinfo/swift-evolution

Links:

  1. Proposal for Foundation Swift Encoders by itaiferber · Pull Request #640 · apple/swift-evolution · GitHub
  2. Proposal for Foundation Swift Archival & Serialization API by itaiferber · Pull Request #639 · apple/swift-evolution · GitHub
  3. Proposal for Foundation Swift Archival & Serialization API by itaiferber · Pull Request #639 · apple/swift-evolution · GitHub


Yep, that’s a good way to describe it.
We could potentially do that as well, but adding another type like `AnyHashable` or `AnyCollection` felt like a much more sweeping change, considering that those require some special compiler magic themselves (and we’d like to do as little of that as we can).

···

On 15 Mar 2017, at 19:12, Joe Groff wrote:

On Mar 15, 2017, at 6:46 PM, Itai Ferber <iferber@apple.com> wrote:

Thanks Joe, and thanks for passing this along!

To those who are curious, we use abstract base classes for a cascading list of reasons:

  • We need to be able to represent keyed encoding and decoding containers as abstract types which are generic on a key type
  • There are two ways to support abstraction in this way: protocol & type constraints, and generic types
    • Since Swift protocols are not generic, we unfortunately cannot write protocol KeyedEncodingContainer<Key : CodingKey> { ... }, which is the "ideal" version of what we're trying to represent
  • Let's try this with a protocol first (simplified here):

protocol Container {
    associatedtype Key : CodingKey
}

func container<Key : CodingKey, Cont : Container>(_ type: Key.Type) -> Cont where Cont.Key == Key {
    // return something
}

This looks promising so far — let's try to make it concrete:

struct ConcreteContainer<K : CodingKey> : Container {
    typealias Key = K
}

func container<Key : CodingKey, Cont : Container>(_ type: Key.Type) -> Cont where Cont.Key == Key {
    return ConcreteContainer<Key>() // error: Cannot convert return expression of type 'ConcreteContainer<Key>' to return type 'Cont'
}

Joe or anyone from the Swift team can describe this better, but this is my poor-man's explanation of why this happens. Swift's type constraints are "directional" in a sense. You can constrain a type going into a function, but not out of a function. There is no type I could return from inside of container() which would satisfy this constraint, because the constraint can only be satisfied by turning Cont into a concrete type from the outside.

Okay, well let's try this:

func container... {
    return ConcreteContainer<Key>() as! Cont
}

This compiles fine! Hmm, let's try to use it:

container(SomeKey.self) // error: Generic parameter 'Cont' could not be inferred (for any SomeKey conforming to CodingKey)

The type constraint can only be fulfilled from the outside, not the inside. The function call itself has no context for the concrete type that this would return, so this is a no-go.

  • If we can't do it with type constraints in this way, is it possible with generic types? Yep! Generic types satisfy this without a problem. However, since we don't have generic protocols, we have to use a generic abstract base class to represent the same concept — an abstract container generic on the type of key which dynamically dispatches to the "real" subclassed type

Hopes that gives some simplified insight into the nature of this decision.

I see. Protocols with associated types serve the same purpose as generic interfaces in other languages, but we don't have first-class support for protocol types with associated type constraints (a value of type `Container where Key == K`). That's something we'd like to eventually support. In other places in the standard library, we write the type-erased container by hand, which is why we have `AnySequence`, `AnyCollection`, and `AnyHashable`. You could probably do something similar here; that would be a bit awkward for implementers, but might be easier to migrate forward to where we eventually want to be with the language.

-Joe
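The hand-written AnySequence-style erasure Joe mentions can be sketched against a simplified Container protocol like the one earlier in the thread (names and the box pattern here are illustrative):

```swift
// A simplified stand-in for the keyed-container protocol.
protocol Container {
    associatedtype Key
    func contains(_ key: Key) -> Bool
}

// Abstract box: generic only on Key, so it can be stored type-erased.
private class _AnyContainerBox<Key> {
    func contains(_ key: Key) -> Bool { fatalError("abstract") }
}

// Concrete box: forwards to a specific Container implementation.
private final class _ContainerBox<Base: Container>: _AnyContainerBox<Base.Key> {
    let base: Base
    init(_ base: Base) { self.base = base }
    override func contains(_ key: Base.Key) -> Bool { base.contains(key) }
}

// The public wrapper: "a Container whose Key == K", written by hand.
struct AnyContainer<Key>: Container {
    private let box: _AnyContainerBox<Key>
    init<C: Container>(_ container: C) where C.Key == Key {
        box = _ContainerBox(container)
    }
    func contains(_ key: Key) -> Bool { box.contains(key) }
}
```

This is exactly the boilerplate the generic-abstract-base-class design avoids pushing onto encoder implementers.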

I’m going to reply to this thread as a whole — apologies if there’s someone’s comment that I’ve missed.

This is something that has come up in internal review, and we’ve certainly given it thought. As Zach has already mentioned, the primary concern with overloading based on return type is ambiguity.
There are many cases in which Swift’s type system currently does not handle ambiguity in the way that you would expect, and it can be very surprising. For instance,

func foo() -> Int { return 42 }
func foo() -> Double { return .pi }
func consumesInt(_ x : Int) { print(x) }

let x = foo() // Ambiguous use of foo()
consumesInt(x) // Even though x is going to be used as an Int
let y: Int = x // Same here

`let x = foo() as Int` works now, but it actually didn’t always — until a somewhat recent version of Swift AFAICT, the only way to resolve the ambiguity was through `let x: Int = foo()`. This has since been fixed, but it was very confusing to try to figure out the unambiguous way to call it.

Keep in mind that this isn’t an unreasonable thing to want to do:

struct Foo {
    var x: Int
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)

        // Want to process an element before it’s assigned.
        let x = try container.decode(forKey: .x) // Ambiguous call

        // Or whatever.
        if x < 0 {
            self.x = x + 100
        } else {
            self.x = x * 200
        }
    }
}

You can write `let x: Int = container.decode(…)` or `let x = container.decode(…) as Int`, but this isn’t always intuitive.
Consider also that the metatype would also be necessary for `decode<Value : Codable>(_ type: Value.Type, forKey: Key) -> Value` because the return value of that certainly could be ambiguous in many cases.

Finally, the metatype arg allows you to express the following succinctly: `let v: SuperClass = container.decode(SubClass.self, forKey: .v)`.

In the general case (`decode<Value : Codable>`) we would need the metatype to avoid ambiguity. It’s not strictly necessary for primitive types, but helps in the case of ambiguity, and solves the conceptual overhead of "Why do I specify the type sometimes but not others? Why are some of these types special? Should I always provide the type? Why wouldn’t I?"

Matthew offered `func decode<T>(_ key: Key, as type: T.Type = T.self) throws -> T` which looks appealing, but:

1. Doesn’t help resolve the ambiguity either
2. Allows for 3 ways of expressing the same thing (`let x: Int = decode(key)`, `let x = decode(key) as Int`, and `let x = decode(key, as: Int.self)`)

The cognitive overhead of figuring out all of the ambiguity goes away when we’re consistent everywhere.
FWIW, too, I am not convinced that Foundation should add API just because 3rd parties will add it. The ambiguity in the general case cannot be solved by wrappers, and I would prefer to provide one simple, consistent solution; if 3rd parties would like to add wrappers for their own sake, then I certainly encourage that.

···

On 16 Mar 2017, at 11:46, Matthew Johnson via swift-evolution wrote:

On Mar 16, 2017, at 1:34 PM, Zach Waldowski via swift-evolution <swift-evolution@swift.org> wrote:

On Thu, Mar 16, 2017, at 02:23 PM, Matthew Johnson via swift-evolution wrote:

I don’t have an example but I don’t see a problem either. There are two options for specifying the return type manually. We can use the signature you used above and use `as` to specify the expected type:

let i = decode(.myKey) as Int

The awkwardness of this syntax is exactly what I'm referring to. Would a beginner know to use "as Int" or ": Int"? Why would they? The "prettiness" of the simple case doesn't make up for how difficult it is to understand and fix its failure cases.

Any official Swift or Foundation API shouldn't, or shouldn't need to, make use of "tricky" syntax.

I don’t think this is especially tricky. Nevertheless, we can avoid requiring this syntax by moving the type argument to the end and providing a default. But I think return type inference is worth supporting. It has become widely adopted by the community already in this use case.

If we don’t support this in Foundation we will continue to see 3rd party libraries that do this.

The proposal's been out for less than 24 hours, is it really productive to already be taking our ball and go home over such a minor thing?

I don’t think that’s what I’m doing at all. This is a fantastic proposal. I’m still working through it and writing up my more detailed thoughts.

That said, as with many (most?) first drafts, there is room for improvement. I think it’s worth pointing out the syntax that many of us would like to use for decoding and at least considering including it in the proposal. If the answer is that it’s trivial for those who want to use subscripts to write the wrappers for return type inference and / or subscripts themselves that’s ok. But it’s a fair topic for discussion and should at least be addressed as an alternative that was rejected for a specific reason.

Zach Waldowski
zach@waldowski.me


Subscripts, by the way, would not help here, since they cannot throw. `decode` must be able to throw.
SR-238 (Support throwing subscripts · Issue #42860 · apple/swift · GitHub); for Apple folks, 28775436.

···


2) Libraries like Marshal (https://github.com/utahiosmac/Marshal) and Unbox (https://github.com/JohnSundell/Unbox) don’t require the decoding functions to provide the type: those functions are generic on the return type, and it’s automatically inferred:

func decode<T>(key: Key) -> T

self.stringProperty = decode(key: .stringProperty) // correct specialisation of the generic function chosen by the compiler

Is there a reason the proposal did not choose this solution? It’s quite sweet.

IMHO those are only “sweet” until you need to decode a value out to something other than a typed value, then it’s ambiguity city.

Other than a typed value? Can you give an example?

I don’t have an example but I don’t see a problem either. There are two options for specifying the return type manually. We can use the signature you used above and use `as` to specify the expected type:

let i = decode(.myKey) as Int

We can also use the type argument but provide a default value:

func decode<T>(_ key: Key, as type: T.Type = T.self) throws -> T

let i = decode(key: .myKey, as: Int.self)

I think the Foundation team should strongly consider one of these signatures and allow us to rely on return type inference when desired. If this isn’t provided by Foundation we’ll see a bunch of wrappers that do this. Why not include it in the standard interface?

The same argument can be made for providing a subscript instead of or in addition to `decode`. Of course exposing a proper subscript interface isn’t possible until we have throwing subscripts. Decoding is one of the major motivating use cases for throwing subscripts. With Foundation tackling this topic in Swift 4 maybe it would be good to consider bringing throwing subscripts in scope and using them in the interface to KeyedDecodingContainer.

Subscript with return type inference is the natural interface for a keyed decoder (obviously IMO). If we don’t support this in Foundation we will continue to see 3rd party libraries that do this. I think it would be better to provide the same interface and implementation to everyone directly in Foundation itself.
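Throwing subscripts did not exist in Swift at the time; purely for illustration, using the `get throws` accessor syntax the language later gained (SE-0310), the interface being advocated might look like this toy container (the type, keys, and error cases are all invented):

```swift
// Hypothetical sketch of a keyed decode subscript with return type inference.
struct ToyKeyedContainer<Key: Hashable> {
    enum DecodeError: Error { case missingKey, typeMismatch }

    private let storage: [Key: Any]
    init(_ storage: [Key: Any]) { self.storage = storage }

    // Return-type inference selects T; failures surface as thrown errors.
    subscript<T>(key: Key) -> T {
        get throws {
            guard let raw = storage[key] else { throw DecodeError.missingKey }
            guard let value = raw as? T else { throw DecodeError.typeMismatch }
            return value
        }
    }
}
```

With this shape, `let title: String = try container["title"]` reads naturally, and `try container["count"] as Int` covers the explicit-type case.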

···

On Mar 16, 2017, at 1:06 PM, David Hart via swift-evolution <swift-evolution@swift.org> wrote:
On 16 Mar 2017, at 16:53, Zach Waldowski <zach@waldowski.me> wrote:

On Mar 16, 2017, at 3:09 AM, David Hart via swift-evolution <swift-evolution@swift.org> wrote:

There are many ways to solve that, but none of them are conducive to beginners. Using the metatype to seed the generic resolution is the only thing I’d get behind, personally.

Zach Waldowski
zach@waldowski.me


I don’t have an example but I don’t see a problem either. There are two options for specifying the return type manually. We can use the signature you used above and use `as` to specify the expected type:

let i = decode(.myKey) as Int

The awkwardness of this syntax is exactly what I'm referring to. Would a beginner know to use "as Int" or ": Int"? Why would they? The "prettiness" of the simple case doesn't make up for how difficult it is to understand and fix its failure cases.

Any official Swift or Foundation API shouldn't, or shouldn't need to, make use of "tricky" syntax.

I don’t think this is especially tricky. Nevertheless, we can avoid requiring this syntax by moving the type argument to the end and providing a default. But I think return type inference is worth supporting. It has become widely adopted by the community already in this use case.

If we don’t support this in Foundation we will continue to see 3rd party libraries that do this.

The proposal's been out for less than 24 hours, is it really productive to already be taking our ball and go home over such a minor thing?

I don’t think that’s what I’m doing at all. This is a fantastic proposal. I’m still working through it and writing up my more detailed thoughts.

That said, as with many (most?) first drafts, there is room for improvement. I think it’s worth pointing out the syntax that many of us would like to use for decoding and at least considering including it in the proposal. If the answer is that it’s trivial for those who want to use subscripts to write the wrappers for return type inference and / or subscripts themselves that’s ok. But it’s a fair topic for discussion and should at least be addressed as an alternative that was rejected for a specific reason.

···

On Mar 16, 2017, at 1:34 PM, Zach Waldowski via swift-evolution <swift-evolution@swift.org> wrote:
On Thu, Mar 16, 2017, at 02:23 PM, Matthew Johnson via swift-evolution wrote:

Zach Waldowski
zach@waldowski.me


If throwing subscripts made it in the Swift 4 timeframe, then we would certainly consider it.

···

On 16 Mar 2017, at 13:19, Matthew Johnson wrote:

On Mar 16, 2017, at 3:01 PM, Itai Ferber <iferber@apple.com> wrote:

Subscripts, by the way, would not help here, since they cannot throw. decode must be able to throw.
SR-238 (Support throwing subscripts · Issue #42860 · apple/swift · GitHub); for Apple folks, 28775436.

They don’t “help” but they do provide a more natural interface. If the Foundation team feels a more wordy interface is necessary that is ok.

I specifically mentioned that they can’t throw yet. Throwing subscripts would make a good companion proposal if they could fit into the Swift 4 timeframe. If not, then yes we need a method rather than a subscript. But if we can get throwing subscripts into Swift 4, why not use Swift’s first class syntactic support for keyed access to keyed containers?


One point of concern with making the implementations rely on that: it would require any adopter of Codable to be built in Swift 4 mode, no? It might be valuable to keep the protocol from requiring Swift 4, to aid incremental migration.

···

On Mar 16, 2017, at 2:14 PM, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:

On Mar 16, 2017, at 4:12 PM, Itai Ferber <iferber@apple.com> wrote:

If throwing subscripts made it in the Swift 4 timeframe, then we would certainly consider it.

Cool. Any comment from the core team on whether this is a possibility? If it is and nobody else wants to write a proposal I would be willing to do it.

On 16 Mar 2017, at 13:19, Matthew Johnson wrote:

On Mar 16, 2017, at 3:01 PM, Itai Ferber <iferber@apple.com <mailto:iferber@apple.com>> wrote:

Subscripts, by the way, would not help here, since they cannot throw. decode must be able to throw.
SR-238 <[SR-238] Support throwing subscripts · Issue #42860 · apple/swift · GitHub; for Apple folks, 28775436.

They don’t “help” but they do provide a more natural interface. If the Foundation team feels a more wordy interface is necessary that is ok.

I specifically mentioned that they can’t throw yet. Throwing subscripts would make a good companion proposal if they could fit into the Swift 4 timeframe. If not, then yes we need a method rather than a subscript. But if we can get throwing subscripts into Swift 4, why not use Swift’s first class syntactic support for keyed access to keyed containers?

On 16 Mar 2017, at 11:46, Matthew Johnson via swift-evolution wrote:

On Mar 16, 2017, at 1:34 PM, Zach Waldowski via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

On Thu, Mar 16, 2017, at 02:23 PM, Matthew Johnson via swift-evolution wrote:

I don’t have an example but I don’t see a problem either. There are two options for specifying the return type manually. We can use the signature you used above and use `as` to specify the expected type:

let i = decode(.myKey) as Int

The awkwardness of this syntax is exactly what I'm referring to. Would a beginner know to use "as Int" or ": Int"? Why would they? The "prettiness" of the simple case doesn't make up for how difficult it is to understand and fix its failure cases.

Any official Swift or Foundation API shouldn't, or shouldn't need to, make use of "tricky" syntax.

I don’t think this is especially tricky. Nevertheless, we can avoid requiring this syntax by moving the type argument to the end and providing a default. But I think return type inference is worth supporting. It has become widely adopted by the community already in this use case.

If we don’t support this in Foundation we will continue to see 3rd party libraries that do this.

The proposal's been out for less than 24 hours, is it really productive to already be taking our ball and go home over such a minor thing?

I don’t think that’s what I’m doing at all. This is a fantastic proposal. I’m still working through it and writing up my more detailed thoughts.

That said, as with many (most?) first drafts, there is room for improvement. I think it’s worth pointing out the syntax that many of us would like to use for decoding and at least considering including it in the proposal. If the answer is that it’s trivial for those who want to use subscripts to write the wrappers for return type inference and / or subscripts themselves that’s ok. But it’s a fair topic for discussion and should at least be addressed as an alternative that was rejected for a specific reason.

Zach Waldowski
zach@waldowski.me <mailto:zach@waldowski.me>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


···

On Thu, Mar 16, 2017, at 02:23 PM, Matthew Johnson via swift-evolution wrote:

Subscripts, by the way, would not help here, since they cannot throw. decode must be able to throw.
SR-238 (Support throwing subscripts; tracked on GitHub as apple/swift issue #42860); for Apple folks, 28775436.

They don’t “help” but they do provide a more natural interface. If the Foundation team feels a wordier interface is necessary, that is ok.

I specifically mentioned that they can’t throw yet. Throwing subscripts would make a good companion proposal if they could fit into the Swift 4 timeframe. If not, then yes we need a method rather than a subscript. But if we can get throwing subscripts into Swift 4, why not use Swift’s first class syntactic support for keyed access to keyed containers?
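To make the comparison concrete, here is a rough sketch of what a throwing keyed-container subscript might look like if SR-238 landed. The protocol name and the `get throws` accessor syntax are purely illustrative; neither is part of the proposal or of Swift at the time of writing:

```swift
// Hypothetical sketch only: assumes throwing subscript accessors exist.
protocol ThrowingKeyedContainer {
    associatedtype Key: CodingKey

    // A throwing subscript standing in for decode(_:forKey:).
    subscript<T: Decodable>(key: Key) -> T { get throws }
}
```

Call sites would then read like ordinary keyed access:

```swift
// let name: String = try container[.name]
// let age: Int = try container[.age]
```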

···

On Mar 16, 2017, at 3:01 PM, Itai Ferber <iferber@apple.com> wrote:


If throwing subscripts made it in the Swift 4 timeframe, then we would certainly consider it.

Cool. Any comment from the core team on whether this is a possibility? If it is and nobody else wants to write a proposal I would be willing to do it.

···

On Mar 16, 2017, at 4:12 PM, Itai Ferber <iferber@apple.com> wrote:
On 16 Mar 2017, at 13:19, Matthew Johnson wrote:


One point of concern with making the implementations rely on that: it would require any adopter of Codable to be built in Swift 4 mode, no? It might be valuable to keep the protocol from requiring Swift 4, to aid incremental migration.

Yes, probably so. I would be disappointed if we allowed the design of Swift 4 features to be influenced by Swift 3.1 compatibility.

···

On Mar 16, 2017, at 4:18 PM, Philippe Hausler <phausler@apple.com> wrote:



The awkwardness of this syntax is exactly what I'm referring to. Would a beginner know to use "as Int" or ": Int"? Why would they? The "prettiness" of the simple case doesn't make up for how difficult it is to understand and fix its failure cases.

Any official Swift or Foundation API shouldn't, or shouldn't need to, make use of "tricky" syntax.

Two arguments:

1) Most of the time, you will be setting the return value of decode into a typed property and will not need ‘as’.
2) Even when you do need it, it's not tricky syntax: it's the official way to direct the type inference engine in Swift.
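To illustrate argument (1), here is a sketch assuming a hypothetical inference-based `decode(forKey:)` overload (note: this overload is the point under debate, not the proposal's actual API; `Point` and its keys are illustrative):

```swift
struct Point: Decodable {
    var x: Int
    var y: Int

    private enum CodingKeys: String, CodingKey { case x, y }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Assigning straight into typed properties: inference resolves the
        // return type from the property, so no `as` annotation is needed.
        x = try container.decode(forKey: .x)
        y = try container.decode(forKey: .y)
    }
}
```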

David.

···

On 16 Mar 2017, at 19:34, Zach Waldowski via swift-evolution <swift-evolution@swift.org> wrote:

I’m going to reply to this thread as a whole — apologies if there’s someone’s comment that I’ve missed.

This is something that has come up in internal review, and we’ve certainly given it thought. As Zach has already mentioned, the primary concern with overloading based on return type is ambiguity.
There are many cases in which Swift’s type system currently does not handle ambiguity in the way that you would expect, and it can be very surprising. For instance,

func foo() -> Int { return 42 }
func foo() -> Double { return .pi }
func consumesInt(_ x : Int) { print(x) }

let x = foo() // Ambiguous use of foo()
consumesInt(x) // Even though x is going to be used as an Int
let y: Int = x // Same here
`let x = foo() as Int` works now, but it didn't always: until a somewhat recent version of Swift, AFAICT, the only way to resolve the ambiguity was `let x: Int = foo()`. This has since been fixed, but it was very confusing to try to figure out the unambiguous way to call it.
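For contrast, a minimal sketch (my own toy example, not from the proposal) of how an explicit metatype parameter sidesteps the inference problem entirely:

```swift
// With the metatype spelled out, there is nothing for inference to resolve:
func make<T>(_ type: T.Type) -> T? { return nil }

let a = make(Int.self)    // a is Int?, no ambiguity
let b = make(Double.self) // b is Double?, no ambiguity
```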

Keep in mind that this isn’t an unreasonable thing to want to do:

struct Foo {
    var x: Int
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)

        // Want to process an element before it’s assigned.
        let x = try container.decode(forKey: .x) // Ambiguous call

        // Or whatever.
        if x < 0 {
            self.x = x + 100
        } else {
            self.x = x * 200
        }
    }
}
You can write let x: Int = container.decode(…) or let x = container.decode(…) as Int, but this isn’t always intuitive.

That’s where I disagree. Let me try to prove my point:

You bring up the example of having to store the decoded value in a variable before setting it on a typed property. But it's also not unreasonable to want to do the same thing when encoding a value, possibly storing it as a different type. If we follow that argument, it's also not very intuitive to have to do

container.encode(x as Double, forKey: .x).

Wouldn’t that be an argument to have an API like this:

func encode<T>(_ value: T?, forKey key: Key, as type: T.Type) throws

I would argue that type inference is a core feature in Swift and that we should embrace it. I believe that in most cases the return value of encode will be stored into a typed property and type inference will do the right thing. In the few cases where the type has to be enforced, the patterns you mention above are not weird syntax; they are used and useful all over Swift:

let cgFloat: CGFloat = 42
let pi = 3.14159265359 as Float
let person = factory.get<Person>() // potential feature in Generics Manifesto

The way I think about it is that the type argument is already there as a generic parameter. Adding an extra argument that needs to be explicitly given on every single call feels like unneeded verbosity to me.

Consider also that the metatype would also be necessary for decode<Value : Codable>(_ type: Value.Type, forKey: Key) throws -> Value because the return value of that certainly could be ambiguous in many cases.

Finally, the metatype arg allows you to express the following succinctly: let v: SuperClass = try container.decode(SubClass.self, forKey: .v).

In the general case (decode<Value : Codable>) we would need the metatype to avoid ambiguity. It’s not strictly necessary for primitive types, but helps in the case of ambiguity, and solves the conceptual overhead of "Why do I specify the type sometimes but not others? Why are some of these types special? Should I always provide the type? Why wouldn’t I?"
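Putting those points together, call sites under the metatype-first design would look like this. This is a sketch against the proposal's keyed-container API as drafted; the `Pet` type and its keys are illustrative:

```swift
struct Pet: Codable {
    var name: String
    var age: Int

    private enum CodingKeys: String, CodingKey { case name, age }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // The type argument is always present, for primitives and
        // arbitrary Codable values alike; one consistent spelling.
        name = try container.decode(String.self, forKey: .name)
        age = try container.decode(Int.self, forKey: .age)
    }
}
```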

Matthew offered func decode<T>(_ key: Key, as type: T.Type = T.self) throws -> T, which looks appealing, but:

- It doesn't help resolve the ambiguity either.
- It allows for 3 ways of expressing the same thing (let x: Int = decode(key), let x = decode(key) as Int, and let x = decode(key, as: Int.self)).

The cognitive overhead of figuring out all of the ambiguity goes away when we're consistent everywhere.
FWIW, too, I am not convinced that Foundation should add API just because 3rd parties will add it.

Agreed. Foundation should not add API just because 3rd parties do it. But 3rd parties should not be dismissed entirely nonetheless. They are a good breeding ground for ideas to spawn and shape Swift in interesting ways.

···

On 16 Mar 2017, at 20:55, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:
The ambiguity in the general case cannot be solved by wrappers, and I would prefer to provide one simple, consistent solution; if 3rd parties would like to add wrappers for their own sake, then I certainly encourage that.
