Why you can't make someone else's class Decodable: a long-winded explanation of 'required' initializers

David Hart recently asked on Twitter <https://twitter.com/dhartbit/status/891766239340748800> if there was a good way to add Decodable support to somebody else's class. The short answer is "no, because you don't control all the subclasses", but David already understood that and wanted to know if there was anything working to mitigate the problem. So I decided to write up a long email about it instead. (Well, actually I decided to write a short email and then failed at doing so.)

The Problem

You can add Decodable to someone else's struct today with no problems:

extension Point: Decodable {
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.init(x: x, y: y)
  }
}

But if Point is a (non-final) class, then this gives you a pile of errors:

- init(from:) needs to be 'required' to satisfy a protocol requirement. 'required' means the initializer can be invoked dynamically on subclasses. Why is this important? Because someone might write code like this:

func decodeMe<Result: Decodable>() throws -> Result {
  let decoder = getDecoderFromSomewhere()
  // init(from:) throws, so the call needs 'try' and the function must rethrow.
  return try Result(from: decoder)
}
let specialPoint: VerySpecialSubclassOfPoint = try decodeMe()

…and the compiler can't stop them, because VerySpecialSubclassOfPoint is a Point, and Point is Decodable, and therefore VerySpecialSubclassOfPoint is Decodable. A bit more on this later, but for now let's say that's a sensible requirement.

- init(from:) also has to be a 'convenience' initializer. That one makes sense too—if you're outside the module, you can't necessarily see private properties, and so of course you'll have to call another initializer that can.

But once it's marked 'convenience' and 'required' we get "'required' initializer must be declared directly in class 'Point' (not in an extension)", and that defeats the whole purpose. Why this restriction?
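For contrast, here is a minimal sketch of the case that does work: a class you own, where init(from:) is declared in the class body rather than in an extension. The Point and LabeledPoint classes and the `data` parameter on decodeMe are stand-ins invented for this example; only JSONDecoder and the Decodable API are real.

```swift
import Foundation

// Hypothetical class that we own, so init(from:) can live in the class body.
class Point: Decodable {
  let x: Double
  let y: Double

  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }

  enum CodingKeys: String, CodingKey {
    case x
    case y
  }

  // 'required' + 'convenience' is accepted here because the initializer is
  // declared directly in the class, not in an extension.
  required convenience init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.init(x: x, y: y)
  }
}

// Adds no designated initializers of its own, so it inherits init(x:y:)
// and, with it, the required convenience init(from:).
class LabeledPoint: Point {
  var label = "unlabeled"
}

// Variant of decodeMe that takes its input explicitly (an assumption made
// so the example is self-contained).
func decodeMe<Result: Decodable>(from data: Data) throws -> Result {
  return try JSONDecoder().decode(Result.self, from: data)
}

let json = Data("{\"x\": 1.5, \"y\": -2}".utf8)
let labeled: LabeledPoint = try decodeMe(from: json)
```

Because the initializer is 'required', the call is dispatched on the subclass, and `labeled` really is a LabeledPoint rather than a plain Point.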

The Semantic Reason

The initializer is 'required', right? So all subclasses need to have access to it. But the implementation we provided here might not make sense for all subclasses—what if VerySpecialSubclassOfPoint doesn't have an 'init(x:y:)' initializer? Normally, the compiler checks for this situation and makes the subclass reimplement the 'required' initializer…but that only works if the 'required' initializers are all known up front. So it can't allow this new 'required' initializer to go by, because someone might try to call it dynamically on a subclass. Here's a dynamic version of the code from above:

func decodeDynamic(_ pointType: Point.Type) throws -> Point {
  let decoder = getDecoderFromSomewhere()
  // Dynamic call through the metatype; only legal for 'required' initializers.
  return try pointType.init(from: decoder)
}
let specialPoint = try decodeDynamic(VerySpecialSubclassOfPoint.self)
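The "known up front" check can be seen with an ordinary 'required' initializer. This is a small sketch using made-up classes (Point, NamedPoint) and a hypothetical makePoint helper; it is not taken from any real library.

```swift
// Hypothetical base class whose author marked init(x:y:) as 'required'.
class Point {
  let x: Double
  let y: Double

  required init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

// A subclass that adds its own designated initializer no longer inherits
// Point's initializers automatically.
class NamedPoint: Point {
  let name: String

  init(name: String, x: Double, y: Double) {
    self.name = name
    super.init(x: x, y: y)
  }

  // Deleting this initializer is a compile-time error: the compiler knows
  // init(x:y:) is 'required' and insists that the subclass provide it.
  required init(x: Double, y: Double) {
    self.name = "unnamed"
    super.init(x: x, y: y)
  }
}

// Calling an initializer through a metatype is only allowed because the
// initializer is 'required'.
func makePoint(_ pointType: Point.Type, x: Double, y: Double) -> Point {
  return pointType.init(x: x, y: y)
}

let special = makePoint(NamedPoint.self, x: 3, y: 4)
```

A 'required' initializer added later from an extension would skip this check for subclasses that were already compiled, which is the situation the restriction rules out.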

The Implementation Reason

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.
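Here is a sketch of how the wrapper might be used with JSONDecoder. The Point class is a stand-in for the type from someone else's module, and Segment is a hypothetical type of your own; the JSON layout is assumed for illustration.

```swift
import Foundation

// Stand-in for a class that lives in someone else's module.
class Point {
  let x: Double
  let y: Double
  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

// The wrapper from above.
struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

// A type of your own that stores the wrapper (not Point itself) still gets
// a synthesized Decodable implementation.
struct Segment: Decodable {
  var start: DecodedPoint
  var end: DecodedPoint
}

let json = Data("{\"start\": {\"x\": 0, \"y\": 0}, \"end\": {\"x\": 3, \"y\": 4}}".utf8)
let segment = try JSONDecoder().decode(Segment.self, from: json)

// Unwrap at the point of use.
let end: Point = segment.end.value
```

The cost is the extra `.value` hop everywhere a Point is needed, which is the inconvenience described above.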

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.
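The second option, decoding 'x' and 'y' by hand inside your own containing type, could look roughly like this. Marker, its keys, and the JSON shape are assumptions made up for the example; Point is again a stand-in for the foreign class.

```swift
import Foundation

// Stand-in for a class that lives in someone else's module.
class Point {
  let x: Double
  let y: Double
  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

// Hypothetical type of your own that contains a Point.
struct Marker: Decodable {
  var name: String
  var location: Point

  enum CodingKeys: String, CodingKey {
    case name
    case location
  }

  // Keys for the nested point payload.
  enum PointKeys: String, CodingKey {
    case x
    case y
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    name = try container.decode(String.self, forKey: .name)

    // Decode the point's fields directly instead of asking Point to decode itself.
    let nested = try container.nestedContainer(keyedBy: PointKeys.self, forKey: .location)
    let x = try nested.decode(Double.self, forKey: .x)
    let y = try nested.decode(Double.self, forKey: .y)
    location = Point(x: x, y: y)
  }
}

let markerJSON = Data("{\"name\": \"home\", \"location\": {\"x\": 1, \"y\": 2}}".utf8)
let marker = try JSONDecoder().decode(Marker.self, from: markerJSON)
```

This avoids the wrapper type, at the price of repeating the point-decoding code in every type that stores a Point.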

Future Direction: 'required' + 'final'

One language feature we could add to make this work is a 'required' initializer that is also 'final'. Because it's 'final', it wouldn't have to go into the dynamic dispatch table. But because it's 'final', we have to make sure its implementation works on all subclasses. For that to work, it would only be allowed to call other 'required' initializers…which means you're still stuck if the original author didn't mark anything 'required'. Still, it's a safe, reasonable, and contained extension to our initializer model.

Future Direction: runtime-checked convenience initializers

In most cases you don't care about hypothetical subclasses or invoking init(from:) on some dynamic Point type. If there was a way to mark init(from:) as something that was always available on subclasses, but dynamically checked to see if it was okay, we'd be good. That could take one of two forms:

- If 'self' is not Point itself, trap.
- If 'self' did not inherit or override all of Point's designated initializers, trap.

The former is pretty easy to implement but not very extensible. The latter seems more expensive: it's information we already check in the compiler, but we don't put it into the runtime metadata for a class, and checking it at run time requires walking up the class hierarchy until we get to the class we want. This is all predicated on the idea that this is rare, though.
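The first form can be roughly approximated in today's Swift with a static factory method and a metatype comparison. This is only a sketch of the idea, not the proposed language feature: makeExactly and UnsupportedSubclass are hypothetical names, and the sketch throws instead of trapping so that the rejected case can be observed.

```swift
// Stand-in for a class that lives in someone else's module.
class Point {
  let x: Double
  let y: Double
  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

class SpecialPoint: Point {}

// Hypothetical error type; the idea described above would trap instead.
struct UnsupportedSubclass: Error {
  let typeName: String
}

extension Point {
  // Static methods can be added in an extension. 'self' here is the
  // metatype the method was called on, so it can be checked at run time.
  static func makeExactly(x: Double, y: Double) throws -> Point {
    guard self == Point.self else {
      throw UnsupportedSubclass(typeName: String(describing: self))
    }
    return Point(x: x, y: y)
  }
}

let plain = try Point.makeExactly(x: 1, y: 2)
let rejected = try? SpecialPoint.makeExactly(x: 1, y: 2)  // nil: the check failed
```

The second form has no equivalent here, since the information about inherited designated initializers is not available at run time, as noted above.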

This is a much more intrusive change to the initializer model, and it's turning a compile-time check into a run-time check, so I think we're less likely to want to take this any time soon.

Future Direction: Non-inherited conformances

All of this is only a problem because people might try to call init(from:) on a subclass of Point. If we said that subclasses of Point weren't automatically Decodable themselves, we'd avoid this problem. This sounds like a terrible idea but it actually doesn't change very much in practice. Unfortunately, it's also a very complicated and intrusive change to the Swift protocol system, and so I don't want to spend more time on it here.

The Dangers of Retroactive Modeling

Even if we magically make this all work, however, there's still one last problem: what if two frameworks do this? Point can't conform to Decodable in two different ways, but neither can it just pick one. (Maybe one of the encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded with polar coordinates.) There aren't great answers to this, and it calls into question whether the struct "solution" at the start of this message is even sensible.

I'm going to bring this up on swift-evolution soon as part of the Library Evolution discussions (there's a very similar problem if the library that owns Point decides to make it Decodable too), but it's worth noting that the wrapper struct solution doesn't have this problem.
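To illustrate why wrappers sidestep the conflict, here is a sketch with two wrapper structs that decode the same Point from different key names. XYPoint, DeltaPoint, and both JSON layouts are invented for the example.

```swift
import Foundation

// Stand-in for a class that lives in someone else's module.
class Point {
  let x: Double
  let y: Double
  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

// Hypothetical wrapper used by one framework: keys "x" and "y".
struct XYPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    value = Point(x: try container.decode(Double.self, forKey: .x),
                  y: try container.decode(Double.self, forKey: .y))
  }
}

// Hypothetical wrapper used by another framework: keys "dx" and "dy".
struct DeltaPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case dx
    case dy
  }
  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    value = Point(x: try container.decode(Double.self, forKey: .dx),
                  y: try container.decode(Double.self, forKey: .dy))
  }
}

let decoder = JSONDecoder()
let a = try decoder.decode(XYPoint.self, from: Data("{\"x\": 1, \"y\": 2}".utf8))
let b = try decoder.decode(DeltaPoint.self, from: Data("{\"dx\": 1, \"dy\": 2}".utf8))
```

Each format belongs to its own type, so neither has to claim Point's single Decodable conformance.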

Whew! So, that's why you can't do it. It's not a very satisfying answer, but it's one that falls out of our compile-time safety rules for initializers. For more information on this I suggest checking out my write-up of some of our initialization model problems <https://github.com/apple/swift/blob/master/docs/InitializerProblems.rst>. And I plan to write another email like this to discuss some solutions that are actually doable.

Jordan

P.S. There's a reason why Decodable uses an initializer instead of a factory-like method on the type but I can't remember what it is right now. I think it's something to do with having the right result type, which would have to be either 'Any' or an associated type if it wasn't just 'Self'. (And if it is 'Self' then it has all the same problems as an initializer and would require extra syntax.) Itai would know for sure.

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

If I missed replying to that originally, I also missed the chance to say that it would be a lovely idea; dynamic dispatch in some cases is just what the doctor ordered (runtime-editable method tables).
This is especially important with extensions for classes and default methods (and the current rules for overriding methods in the implementing class). Please resubmit the proposal :).

On 3 Aug 2017, at 01:09, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

Thanks for the detailed explanation Jordan! Comment inline:

The Implementation Reason

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

I may have missed the proposal you sent, because I’d be quite interested in this. I hit this restriction once in a while and I really wish we could override methods in Swift extensions.

Future Direction: 'required' + 'final'

One language feature we could add to make this work is a 'required' initializer that is also 'final'. Because it's 'final', it wouldn't have to go into the dynamic dispatch table. But because it's 'final', we have to make sure its implementation works on all subclasses. For that to work, it would only be allowed to call other 'required' initializers…which means you're still stuck if the original author didn't mark anything 'required'. Still, it's a safe, reasonable, and contained extension to our initializer model.

I like this solution. One drawback: it does force users to add an extra modifier to those initialisers, increasing the complexity of an initializer model which is already quite challenging for newcomers.

Future Direction: runtime-checked convenience initializers

This is a much more intrusive change to the initializer model, and it's turning a compile-time check into a run-time check, so I think we're less likely to want to take this any time soon.

Indeed, turning a compile-time check into a run-time check doesn’t sound very Swifty.

The Dangers of Retroactive Modeling

Even if we magically make this all work, however, there's still one last problem: what if two frameworks do this? Point can't conform to Decodable in two different ways, but neither can it just pick one. (Maybe one of the encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded with polar coordinates.) There aren't great answers to this, and it calls into question whether the struct "solution" at the start of this message is even sensible.

Somewhat related: I have a similar problem in a project where I need two different Codable conformances for a type: one for coding/decoding from/to JSON, and another one for coding/decoding from/to a database row. The keys and formatting are not identical. The only solution around that for now is separate types, which can be sub-optimal from a performance point of view.

Thanks for putting these thoughts together, Jordan! Some additional comments inline.

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.

I would actually take this a step further and recommend that any time you intend to extend someone else’s type with Encodable or Decodable, you should almost certainly write a wrapper struct for it instead, unless you have reasonable guarantees that the type will never attempt to conform to these protocols on its own.

This might sound extreme (and inconvenient), but Jordan mentions the issue here below in The Dangers of Retroactive Modeling. Any time you conform a type which does not belong to you to a protocol, you make a decision about its behavior where you might not necessarily have the "right" to — if the type later adds conformance to the protocol itself (e.g. in a library update), your code will no longer compile, and you’ll have to remove your own conformance. In most cases, that’s fine, e.g., there’s not much harm done in dropping your custom Equatable conformance on some type if it starts adopting it on its own. The real risk with Encodable and Decodable is that unless you don’t care about backwards/forwards compatibility, the implementations of these conformances are forever.

Using Point here as an example, it’s not unreasonable for Point to eventually get updated to conform to Codable. It’s also not unreasonable for the implementation of Point to adopt the default conformance, i.e., get encoded as {"x": …, "y": …}. This form might not be the most compact, but it leaves room for expansion (e.g. if Point adds a z field, which might also be reasonable, considering the type doesn’t belong to you). If you update your library dependency with the new Point class and have to drop the conformance you added to it directly, you’ve introduced a backwards and forwards compatibility concern: all new versions of your app now encode and decode a new archive format, which now requires migration. Unless you don’t care about other versions of your app, you’ll have to deal with this:
- Old versions of your app which users may have on their devices cannot read archives with this new format.
- New versions of your app cannot read archives with the old format.

Unless you don’t care for some reason, you will now have to write the wrapper struct, to either
- Have new versions of your app attempt to read old archive versions and migrate them forward (leaving old app versions in the dust), or
- Write all new archives with the old format so old app versions can still read archives written with newer app versions, and vice versa.

Either way, you’ll need to write some wrapper to handle this; it’s significantly safer to do that work up front on a type which you do control (and safely allow Point to change out underneath you transparently), rather than potentially end up between a rock and a hard place later on because a type you don’t own changes out from under you.
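As a sketch of the "migrate forward" option, a wrapper you control can accept both an old and a new archive layout. Everything specific here is an assumption for illustration: ArchivedPoint is a hypothetical name, the new layout is taken to be {"x": …, "y": …}, and the old layout is imagined as a two-element array [x, y].

```swift
import Foundation

// Stand-in for a class that lives in someone else's module.
class Point {
  let x: Double
  let y: Double
  init(x: Double, y: Double) {
    self.x = x
    self.y = y
  }
}

// Hypothetical wrapper that reads either archive layout.
struct ArchivedPoint: Decodable {
  var value: Point

  enum CodingKeys: String, CodingKey {
    case x
    case y
  }

  init(from decoder: Decoder) throws {
    if let keyed = try? decoder.container(keyedBy: CodingKeys.self) {
      // Assumed new layout: {"x": ..., "y": ...}
      let x = try keyed.decode(Double.self, forKey: .x)
      let y = try keyed.decode(Double.self, forKey: .y)
      value = Point(x: x, y: y)
    } else {
      // Assumed old layout: [x, y]
      var legacy = try decoder.unkeyedContainer()
      let x = try legacy.decode(Double.self)
      let y = try legacy.decode(Double.self)
      value = Point(x: x, y: y)
    }
  }
}

let archiveDecoder = JSONDecoder()
let fromNew = try archiveDecoder.decode(ArchivedPoint.self, from: Data("{\"x\": 1, \"y\": 2}".utf8))
let fromOld = try archiveDecoder.decode(ArchivedPoint.self, from: Data("[1, 2]".utf8))
```

Because the format decisions live in a type you own, Point can gain or change its own conformance without affecting what your archives look like.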

P.S. There's a reason why Decodable uses an initializer instead of a factory-like method on the type but I can't remember what it is right now. I think it's something to do with having the right result type, which would have to be either 'Any' or an associated type if it wasn't just 'Self'. (And if it is 'Self' then it has all the same problems as an initializer and would require extra syntax.) Itai would know for sure.

To give background on this — the protocols originally had factory initializers in mind for this (to allow for object replacement and avoid some of these issues), but without a "real" factory initializer pattern like we’re discussing here, the problems with this approach were intractable (all due to subclassing issues).

An initializer pattern like static func decode(from: Decoder) throws -> ??? has a few problems:

- The return type is one consideration. If we allow for an associated type representing the return type, subclasses cannot override the associated type to return something different. This makes object replacement impossible in situations which use subclassing. The only reasonable thing is to return Self (which would allow for returning instances of self, or of subclasses). (We could return Any, but that defeats the entire purpose of having a type-safe API to begin with; we want to avoid the dynamic casting altogether.)
- Even if we return Self, this method cannot be overridden by subclasses:
  - If implemented as static func decode(from: Decoder) throws -> Self, the method clearly cannot be overridden in a subclass, as it is a static method.
  - The method cannot be implemented as class func decode(from: Decoder) throws -> Self on a non-final class:

protocol Foo {
    static func create() -> Self
}

class Bar : Foo {
    class func create() -> Bar { // method 'create()' in non-final class 'Bar' must return 'Self' to conform to protocol 'Foo'
        return Bar()
    }
}

protocol Foo {
    static func create() -> Self
}

class Bar : Foo {
    class func create() -> Self {
        return Bar() // cannot convert return expression of type 'Bar' to return type 'Self'
    }
}

protocol Foo {
    static func create() -> Self
}

class Bar : Foo {
    class func create() -> Self {
        return Bar() as! Self // error: 'Self' is only available in a protocol or as the result of a method in a class; did you mean 'Bar'?; warning: forced cast of 'Bar' to same type has no effect; error: cannot convert return of expression type 'Bar' to return type 'Self'
    }
}

final class Bar : Foo {
    class func create() -> Bar { // no problems
        return Bar()
    }
}
This means that we either allow adoption of these protocols on final classes only (which, again, defeats the whole purpose!), or, that every class which implements these protocols has to have knowledge about all of its potential subclasses and their implementations of these protocols. This is prohibitive as well.
Even if it were possible to subclass these types of methods, they don’t follow the regular initializer pattern. In order to construct an instance of a subclass, you need to be able to call a superclass initializer. But these methods are not initializers; even if you call super’s factory initializer, there’s nothing you can do with the returned instance of the superclass. Unlike in ObjC, there’s no super- or self-reassignment (in general), so classes would have to follow a completely different (and awkward) pattern of creating an instance of the superclass, initializing from that instance in a separate initializer (e.g. self.init(superInstance)), and also setting decoded properties.

Overall, the lack of a true factory initializer pattern prevented us from doing something like this, and we took the regular initializer approach.
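
For reference, here is a minimal sketch of the one way a class method on a non-final class can return Self today: by forwarding to a 'required' initializer. Bar and Baz are hypothetical names extending the Foo example above, and this is only an illustration, not part of the Codable design. It suggests why a Self-returning factory method ends up leaning on the same 'required' machinery as a plain initializer.

```swift
protocol Foo {
    static func create() -> Self
}

class Bar: Foo {
    // 'required' guarantees that every subclass provides this initializer,
    // which is what allows it to be called on the dynamic type below.
    required init() {}

    class func create() -> Self {
        // 'self' is the dynamic metatype here, so this builds a Baz
        // when the method is called on Baz.
        return self.init()
    }
}

// Hypothetical subclass. It inherits the required initializer because it
// declares no designated initializers of its own.
class Baz: Bar {}

let made = Baz.create()
print(type(of: made))
```

Without the 'required' initializer, self.init() is rejected by the compiler, so the factory method does not escape the subclassing constraints discussed above.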

···

On Aug 2, 2017, at 5:08 PM, Jordan Rose <jordan_rose@apple.com> wrote:

For anyone interested, factory methods *can* retroactively be added to existing classes. This is how the SQLite library GRDB.swift is able to decode class hierarchies like NSString, NSNumber, NSDecimalNumber, etc. from SQLite values:

The protocol for types that can instantiate from SQLite values has a factory method:

    public protocol DatabaseValueConvertible {
        /// Returns a value initialized from *dbValue*, if possible.
        static func fromDatabaseValue(_ dbValue: DatabaseValue) -> Self?
    }

Having Foundation classes implement it uses various techniques:

1. "Casting" (Data to NSData, or NSDate to Date, depending on which type provides the root conformance)

    // Workaround Swift inconvenience around factory methods of non-final classes
    func cast<T, U>(_ value: T) -> U? {
        return value as? U
    }
    
    extension NSData : DatabaseValueConvertible {
        public static func fromDatabaseValue(_ dbValue: DatabaseValue) -> Self? {
            // Use Data conformance
            guard let data = Data.fromDatabaseValue(dbValue) else {
                return nil
            }
            return cast(data)
        }
    }

    // Derives Date conformance from NSDate, for example
    extension ReferenceConvertible where Self: DatabaseValueConvertible, Self.ReferenceType: DatabaseValueConvertible {
        public static func fromDatabaseValue(_ dbValue: DatabaseValue) -> Self? {
            return ReferenceType.fromDatabaseValue(dbValue).flatMap { cast($0) }
        }
    }

2. Using magic Foundation initializers (magic because the code below compiles even if those are not *required* initializers). Works for NSNumber, NSDecimalNumber, NSString:

    extension NSNumber : DatabaseValueConvertible {
        public static func fromDatabaseValue(_ dbValue: DatabaseValue) -> Self? {
            switch dbValue.storage {
            case .int64(let int64):
                return self.init(value: int64)
            case .double(let double):
                return self.init(value: double)
            default:
                return nil
            }
        }
    }

    extension NSString : DatabaseValueConvertible {
        public static func fromDatabaseValue(_ dbValue: DatabaseValue) -> Self? {
            // Use String conformance
            guard let string = String.fromDatabaseValue(dbValue) else {
                return nil
            }
            return self.init(string: string)
        }
    }

The magic of the Foundation initializers above makes me doubt that this technique is general enough for Decodable to profit from it, though. And yes, it runs on Linux, so I'm not even sure whether the ObjC runtime is required or not. I'm clueless.

Gwendal Roué

···

On 3 Aug 2017, at 02:09, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

P.S. There's a reason why Decodable uses an initializer instead of a factory-like method on the type but I can't remember what it is right now. I think it's something to do with having the right result type, which would have to be either 'Any' or an associated type if it wasn't just 'Self'. (And if it is 'Self' then it has all the same problems as an initializer and would require extra syntax.) Itai would know for sure.

The Semantic Reason

The initializer is 'required', right? So all subclasses need to have access to it. But the implementation we provided here might not make sense for all subclasses—what if VerySpecialSubclassOfPoint doesn't have an 'init(x:y:)' initializer? Normally, the compiler checks for this situation and makes the subclass reimplement the 'required' initializer…but that only works if the 'required' initializers are all known up front. So it can't allow this new 'required' initializer to go by, because someone might try to call it dynamically on a subclass. Here's a dynamic version of the code from above:

func decodeDynamic(_ pointType: Point.Type) -> Point {
  let decoder = getDecoderFromSomewhere()
  return pointType.init(from: decoder)
}
let specialPoint = decodeDynamic(VerySpecialSubclassOfPoint.self)
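
To make the compiler check mentioned above concrete, here is a small sketch under assumed definitions. This Point and LabeledPoint are hypothetical stand-ins, and Point's init(x:y:) is marked 'required' in the class body itself, which is exactly the thing an extension cannot do.

```swift
class Point {
    let x: Double
    let y: Double

    required init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

class LabeledPoint: Point {
    let label: String

    init(x: Double, y: Double, label: String) {
        self.label = label
        super.init(x: x, y: y)
    }

    // LabeledPoint declares its own designated initializer, so it no longer
    // inherits init(x:y:). Deleting this override is a compile-time error,
    // because the compiler knows about the 'required' initializer up front.
    required init(x: Double, y: Double) {
        self.label = "unnamed"
        super.init(x: x, y: y)
    }
}

// Dynamic construction through the metatype is safe only because every
// subclass is guaranteed to implement the required initializer.
func makePoint(_ pointType: Point.Type) -> Point {
    return pointType.init(x: 1, y: 2)
}

let special = makePoint(LabeledPoint.self)
print(type(of: special))
```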

The Implementation Reason

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.
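
As a usage sketch for the wrapper, the following decodes a JSON payload into DecodedPoint and then unwraps the Point. The Point class here is a hypothetical stand-in for the class you don't own, and the JSON payload is made up for illustration.

```swift
import Foundation

// Hypothetical stand-in for the class from someone else's module.
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

struct DecodedPoint: Decodable {
    var value: Point

    enum CodingKeys: String, CodingKey {
        case x
        case y
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(Double.self, forKey: .x)
        let y = try container.decode(Double.self, forKey: .y)
        self.value = Point(x: x, y: y)
    }
}

// Decode the wrapper, then pull the Point out of it.
let json = Data("{\"x\": 1.5, \"y\": -2.0}".utf8)
let point = try JSONDecoder().decode(DecodedPoint.self, from: json).value
print(point.x, point.y)
```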

Future Direction: 'required' + 'final'

One language feature we could add to make this work is a 'required' initializer that is also 'final'. Because it's 'final', it wouldn't have to go into the dynamic dispatch table. But because it's 'final', we have to make sure its implementation works on all subclasses. For that to work, it would only be allowed to call other 'required' initializers…which means you're still stuck if the original author didn't mark anything 'required'. Still, it's a safe, reasonable, and contained extension to our initializer model.

Future Direction: runtime-checked convenience initializers

In most cases you don't care about hypothetical subclasses or invoking init(from:) on some dynamic Point type. If there was a way to mark init(from:) as something that was always available on subclasses, but dynamically checked to see if it was okay, we'd be good. That could take one of two forms:

- If 'self' is not Point itself, trap.
- If 'self' did not inherit or override all of Point's designated initializers, trap.

The former is pretty easy to implement but not very extensible. The latter seems more expensive: it's information we already check in the compiler, but we don't put it into the runtime metadata for a class, and checking it at run time requires walking up the class hierarchy until we get to the class we want. This is all predicated on the idea that this is rare, though.

This is a much more intrusive change to the initializer model, and it's turning a compile-time check into a run-time check, so I think we're less likely to want to take this any time soon.
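
The first form can be roughly approximated today with a static factory method instead of an initializer. The sketch below is only an emulation under assumed names (Point, SpecialPoint, and makeChecked are hypothetical). It is not the language feature described above, and it does not satisfy Decodable's init(from:) requirement.

```swift
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// Hypothetical subclass, used only to show where the trap would occur.
class SpecialPoint: Point {}

extension Point {
    static func makeChecked(x: Double, y: Double) -> Point {
        // "If 'self' is not Point itself, trap."
        precondition(self == Point.self, "makeChecked is only supported on Point itself")
        return Point(x: x, y: y)
    }
}

let checked = Point.makeChecked(x: 1, y: 2)
print(checked.x, checked.y)
// SpecialPoint.makeChecked(x: 1, y: 2) compiles, but traps at run time.
```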

Future Direction: Non-inherited conformances

All of this is only a problem because people might try to call init(from:) on a subclass of Point. If we said that subclasses of Point weren't automatically Decodable themselves, we'd avoid this problem. This sounds like a terrible idea but it actually doesn't change very much in practice. Unfortunately, it's also a very complicated and intrusive change to the Swift protocol system, and so I don't want to spend more time on it here.

The Dangers of Retroactive Modeling

Even if we magically make this all work, however, there's still one last problem: what if two frameworks do this? Point can't conform to Decodable in two different ways, but neither can it just pick one. (Maybe one of the encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded with polar coordinates.) There aren't great answers to this, and it calls into question whether the struct "solution" at the start of this message is even sensible.

I'm going to bring this up on swift-evolution soon as part of the Library Evolution discussions (there's a very similar problem if the library that owns Point decides to make it Decodable too), but it's worth noting that the wrapper struct solution doesn't have this problem.
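
A sketch of why wrappers sidestep the conflict: two wrapper structs can each decode Point from a different format without either one claiming Point's single Decodable conformance. All names here (Point, CartesianPoint, DeltaPoint) and both JSON payloads are hypothetical.

```swift
import Foundation

// Hypothetical stand-in for the class neither framework owns.
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// One framework's wrapper: expects the keys "x" and "y".
struct CartesianPoint: Decodable {
    var value: Point
    enum CodingKeys: String, CodingKey { case x, y }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        value = Point(x: try container.decode(Double.self, forKey: .x),
                      y: try container.decode(Double.self, forKey: .y))
    }
}

// Another framework's wrapper: expects the keys "dx" and "dy".
struct DeltaPoint: Decodable {
    var value: Point
    enum CodingKeys: String, CodingKey { case dx, dy }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        value = Point(x: try container.decode(Double.self, forKey: .dx),
                      y: try container.decode(Double.self, forKey: .dy))
    }
}

let decoder = JSONDecoder()
let first = try decoder.decode(CartesianPoint.self, from: Data("{\"x\": 1, \"y\": 2}".utf8)).value
let second = try decoder.decode(DeltaPoint.self, from: Data("{\"dx\": 1, \"dy\": 2}".utf8)).value
print(first.x == second.x, first.y == second.y)
```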

Whew! So, that's why you can't do it. It's not a very satisfying answer, but it's one that falls out of our compile-time safety rules for initializers. For more information on this I suggest checking out my write-up of some of our initialization model problems <https://github.com/apple/swift/blob/master/docs/InitializerProblems.rst>. And I plan to write another email like this to discuss some solutions that are actually doable.

Jordan

···

On 03.08.2017, at 02:09, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

I just bumped into the required initializer problem when I tried to make my ObjC classes Codable. So there really is no way to make an ObjC class Codable without subclassing it in Swift or writing wrappers? Unfortunately, neither is economical for a large number of classes. Isn't there even a hacky workaround?

Initially I had hoped that I could write an extension on the common base class of my models to make it conform to Codable. I currently use Mantle, and I thought that in my Decodable init I could wrap Mantle's decoding methods.

Best,
Fabian

Actually if the wrapper types are structs with a single field, their use should not introduce any additional overhead at runtime.

Slava
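
One way to sanity-check the layout part of this claim is to compare MemoryLayout values for a class reference and a single-field struct wrapping it. Point and PointWrapper below are hypothetical names, and the check only covers size and stride in memory; it says nothing about what the optimizer does with the wrapper.

```swift
// Hypothetical class standing in for the wrapped type.
final class Point {
    var x = 0.0
    var y = 0.0
}

// A wrapper struct with exactly one stored property.
struct PointWrapper {
    var value: Point
}

// For a class type, MemoryLayout reports the size of the reference,
// and the single-field struct has the same layout as that reference.
print(MemoryLayout<PointWrapper>.size == MemoryLayout<Point>.size)
print(MemoryLayout<PointWrapper>.stride == MemoryLayout<Point>.stride)
```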

···

On Aug 2, 2017, at 10:48 PM, David Hart <david@hartbit.com> wrote:

Somewhat related: I have a similar problem in a project where I need two different Codable conformances for a type: one for coding/decoding from/to JSON, and another one for coding/decoding from/to a database row. The keys and formatting are not identical. The only solution around that for now is separate types, which can be sub-optimal from a performance point of view.

I ❤️ Swift

···

On 3 Aug 2017, at 08:00, Slava Pestov <spestov@apple.com> wrote:

On Aug 2, 2017, at 10:48 PM, David Hart <david@hartbit.com> wrote:

Slava

I just mentioned this in my other email, but to point out here: the reason this works in your case is because you adopt these methods as static funcs and can reasonably rely on subclasses of NSData, NSNumber, NSString, etc. to do the right thing because of work done behind the scenes in the ObjC implementations of these classes (and because we’ve got established subclassing requirements on these methods — all subclasses of these classes are going to look approximately the same without doing anything crazy).

This would not work for Codable in the general case, however, where subclasses likely need to add additional storage, properties, encoded representations, etc., without equivalent requirements, either via additional protocols or conventions.

Thanks for your explanation of why a static method in a protocol is able to instantiate non-final classes like NSData, NSDate, NSNumber, NSDecimalNumber, NSString, etc.

Is this "privilege" stable? Can I rely on it to be maintained over time? Or would it be a better idea to drop support for those low-level Foundation classes, because they'll eventually become regular classes without any specific support? This would not harm that much: Data, Date, String are there for a reason. NSDecimalNumber is the only one of its kind, though.

Gwendal

I should point out that there’s a third case: the case where you want to add conformance to a type from another framework, but you own both frameworks.

Plenty of examples of this can be found in the Cocoa frameworks, actually. For example, as NSString is, of course, declared in Foundation, its original declaration cannot conform to the NSPasteboardReading protocol, which is declared in AppKit. As a result, Apple declares NSString’s NSPasteboardReading support in a category in AppKit. There are reasons one might want to do the same thing in their own code—make one library and/or framework for use with Foundation-only programs, and extend a type from that library/framework with NSPasteboardReading support in a separate framework. It can’t currently be done with Swift, though.

Charles

···

On Aug 3, 2017, at 12:05 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

Thanks for putting these thoughts together, Jordan! Some additional comments inline.

On Aug 2, 2017, at 5:08 PM, Jordan Rose <jordan_rose@apple.com> wrote:

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.

I would actually take this a step further and recommend that any time you intend to extend someone else’s type with Encodable or Decodable, you should almost certainly write a wrapper struct for it instead, unless you have reasonable guarantees that the type will never attempt to conform to these protocols on its own.

This might sound extreme (and inconvenient), but Jordan mentions the issue here below in The Dangers of Retroactive Modeling. Any time you conform a type which does not belong to you to a protocol, you make a decision about its behavior where you might not necessarily have the "right" to — if the type later adds conformance to the protocol itself (e.g. in a library update), your code will no longer compile, and you’ll have to remove your own conformance. In most cases, that’s fine, e.g., there’s not much harm done in dropping your custom Equatable conformance on some type if it starts adopting it on its own. The real risk with Encodable and Decodable is that unless you don’t care about backwards/forwards compatibility, the implementations of these conformances are forever.

Using Point here as an example, it’s not unreasonable for Point to eventually get updated to conform to Codable. It’s also not unreasonable for the implementation of Point to adopt the default conformance, i.e., get encoded as {"x": …, "y": …}. This form might not be the most compact, but it leaves room for expansion (e.g. if Point adds a z field, which might also be reasonable, considering the type doesn’t belong to you). If you update your library dependency with the new Point class and have to drop the conformance you added to it directly, you’ve introduced a backwards and forwards compatibility concern: all new versions of your app now encode and decode a new archive format, which now requires migration. Unless you don’t care about other versions of your app, you’ll have to deal with this:
- Old versions of your app which users may have on their devices cannot read archives with this new format
- New versions of your app cannot read archives with the old format

Unless you don’t care for some reason, you will now have to write the wrapper struct, to either
- Have new versions of your app attempt to read old archive versions and migrate them forward (leaving old app versions in the dust), or
- Write all new archives with the old format so old app versions can still read archives written with newer app versions, and vice versa

Either way, you’ll need to write some wrapper to handle this; it’s significantly safer to do that work up front on a type which you do control (and safely allow Point to change out underneath you transparently), rather than potentially end up between a rock and a hard place later on because a type you don’t own changes out from under you.

Thanks for the vote of confidence. :-) Here’s the old proposal for now, likely to be revised soon. https://github.com/jrose-apple/swift-evolution/blob/overridable-members-in-extensions/proposals/nnnn-overridable-members-in-extensions.md

Jordan

···

On Aug 2, 2017, at 23:21, Goffredo Marocchi <panajev@gmail.com> wrote:

On 3 Aug 2017, at 01:09, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

If I missed replying to that originally, I also missed the chance to say that it would be a lovely idea; dynamic dispatch in some cases is just what the doctor ordered (runtime-editable method tables).
This is especially important with extensions for classes and default methods (and the current rules for overriding methods in the implementing class). Please resubmit the proposal :).

Aha, there is actually a hacky way to handle the required initializer problem: rely on protocols. Swift protocol extension methods are a lot like the 'required' + 'final' solution I described above, so you can do something like this:

protocol SwiftFooConstructible {
  init(swift: Foo)
}
extension SwiftFooConstructible where Self: ObjCFooConstructible {
  init(swift foo: Foo) {
    self.init(objc: foo)
  }
}

Whether or not you'll be able to use that to match Decodable up with Mantle, I'm not sure. But it is an option for now – and I'll stress that "for now", because we want to treat members that satisfy protocol requirements in classes as overridable by default, and that would break this.

Jordan

···

On Sep 2, 2017, at 00:30, Fabian Ehrentraud <Fabian.Ehrentraud@willhaben.at> wrote:

On 03.08.2017, at 02:09, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

David Hart recently asked on Twitter <https://twitter.com/dhartbit/status/891766239340748800> if there was a good way to add Decodable support to somebody else's class. The short answer is "no, because you don't control all the subclasses", but David already understood that and wanted to know if there was anything working to mitigate the problem. So I decided to write up a long email about it instead. (Well, actually I decided to write a short email and then failed at doing so.)

The Problem

You can add Decodable to someone else's struct today with no problems:

extension Point: Decodable {
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.init(x: x, y: y)
  }
}

But if Point is a (non-final) class, then this gives you a pile of errors:

- init(from:) needs to be 'required' to satisfy a protocol requirement. 'required' means the initializer can be invoked dynamically on subclasses. Why is this important? Because someone might write code like this:

func decodeMe<Result: Decodable>() -> Result {
  let decoder = getDecoderFromSomewhere()
  return Result(from: decoder)
}
let specialPoint: VerySpecialSubclassOfPoint = decodeMe()

…and the compiler can't stop them, because VerySpecialSubclassOfPoint is a Point, and Point is Decodable, and therefore VerySpecialSubclassOfPoint is Decodable. A bit more on this later, but for now let's say that's a sensible requirement.

- init(from:) also has to be a 'convenience' initializer. That one makes sense too—if you're outside the module, you can't necessarily see private properties, and so of course you'll have to call another initializer that can.

But once it's marked 'convenience' and 'required' we get "'required' initializer must be declared directly in class 'Point' (not in an extension)", and that defeats the whole purpose. Why this restriction?

The Semantic Reason

The initializer is 'required', right? So all subclasses need to have access to it. But the implementation we provided here might not make sense for all subclasses—what if VerySpecialSubclassOfPoint doesn't have an 'init(x:y:)' initializer? Normally, the compiler checks for this situation and makes the subclass reimplement the 'required' initializer…but that only works if the 'required' initializers are all known up front. So it can't allow this new 'required' initializer to go by, because someone might try to call it dynamically on a subclass. Here's a dynamic version of the code from above:

func decodeDynamic(_ pointType: Point.Type) -> Point {
  let decoder = getDecoderFromSomewhere()
  return pointType.init(from: decoder)
}
let specialPoint = decodeDynamic(VerySpecialSubclassOfPoint.self)

The Implementation Reason

'required' initializers are like methods: they may require dynamic dispatch. That means that they get an entry in the class's dynamic dispatch table, commonly known as its vtable. Unlike Objective-C method tables, vtables aren't set up to have entries arbitrarily added at run time.

(Aside: This is one of the reasons why non-@objc methods in Swift extensions can't be overridden; if we ever lift that restriction, it'll be by using a separate table and a form of dispatch similar to objc_msgSend. I sent a proposal to swift-evolution about this last year but there wasn't much interest.)

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.

Future Direction: 'required' + 'final'

One language feature we could add to make this work is a 'required' initializer that is also 'final'. Because it's 'final', it wouldn't have to go into the dynamic dispatch table. But because it's 'final', we have to make sure its implementation works on all subclasses. For that to work, it would only be allowed to call other 'required' initializers…which means you're still stuck if the original author didn't mark anything 'required'. Still, it's a safe, reasonable, and contained extension to our initializer model.

Future Direction: runtime-checked convenience initializers

In most cases you don't care about hypothetical subclasses or invoking init(from:) on some dynamic Point type. If there was a way to mark init(from:) as something that was always available on subclasses, but dynamically checked to see if it was okay, we'd be good. That could take one of two forms:

- If 'self' is not Point itself, trap.
- If 'self' did not inherit or override all of Point's designated initializers, trap.

The former is pretty easy to implement but not very extensible. The latter seems more expensive: it's information we already check in the compiler, but we don't put it into the runtime metadata for a class, and checking it at run time requires walking up the class hierarchy until we get to the class we want. This is all predicated on the idea that this is rare, though.

This is a much more intrusive change to the initializer model, and it's turning a compile-time check into a run-time check, so I think we're less likely to want to take this any time soon.

Future Direction: Non-inherited conformances

All of this is only a problem because people might try to call init(from:) on a subclass of Point. If we said that subclasses of Point weren't automatically Decodable themselves, we'd avoid this problem. This sounds like a terrible idea but it actually doesn't change very much in practice. Unfortunately, it's also a very complicated and intrusive change to the Swift protocol system, and so I don't want to spend more time on it here.

The Dangers of Retroactive Modeling

Even if we magically make this all work, however, there's still one last problem: what if two frameworks do this? Point can't conform to Decodable in two different ways, but neither can it just pick one. (Maybe one of the encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded with polar coordinates.) There aren't great answers to this, and it calls into question whether the struct "solution" at the start of this message is even sensible.

I'm going to bring this up on swift-evolution soon as part of the Library Evolution discussions (there's a very similar problem if the library that owns Point decides to make it Decodable too), but it's worth noting that the wrapper struct solution doesn't have this problem.

Whew! So, that's why you can't do it. It's not a very satisfying answer, but it's one that falls out of our compile-time safety rules for initializers. For more information on this I suggest checking out my write-up of some of our initialization model problems <https://github.com/apple/swift/blob/master/docs/InitializerProblems.rst&gt;\. And I plan to write another email like this to discuss some solutions that are actually doable.

Jordan

P.S. There's a reason why Decodable uses an initializer instead of a factory-like method on the type but I can't remember what it is right now. I think it's something to do with having the right result type, which would have to be either 'Any' or an associated type if it wasn't just 'Self'. (And if it is 'Self' then it has all the same problems as an initializer and would require extra syntax.) Itai would know for sure.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

I just bumped into the required initializer problem when I tried to make my ObjC classes Codable. So there really is no way to make an ObjC class Codable without subclassing it in Swift or writing wrappers? Unfortunately, neither is economical for a large number of classes. Isn't there even a hacky workaround?

Initially I had hopes that I could write an extension to the common base class of my models, to conform to Codable. I currently use Mantle and I thought in my Decodable init I could wrap Mantle's decoding methods.

Wrapper structs FTW. I think it's a lovely pattern that's super Swifty and really should be advertised more for solving these kinds of problems. Language-level features could also be useful for making that more usable, and the discussion about "strong" typealiases seems oriented in that direction.

Elviro

···

On 3 Aug 2017, at 08:35, David Hart via swift-evolution <swift-evolution@swift.org> wrote:

On 3 Aug 2017, at 08:00, Slava Pestov <spestov@apple.com> wrote:

On Aug 2, 2017, at 10:48 PM, David Hart <david@hartbit.com> wrote:

Somewhat related: I have a similar problem in a project where I need two different Codable conformances for a type: one for coding/decoding from/to JSON, and another one for coding/decoding from/to a database row. The keys and formatting are not identical. The only solution around that for now is separate types, which can be sub-optimal from a performance point of view.

Actually if the wrapper types are structs with a single field, their use should not introduce any additional overhead at runtime.
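
A rough sketch of how the two points above fit together: one model type, two single-field wrapper structs that each own one encoded representation. `User`, `JSONUser`, `RowUser`, and all key names are made up for illustration, and `JSONEncoder` merely stands in for whatever encoder a database layer would provide.

```swift
import Foundation

// Hypothetical model type; it stays free of any Codable conformance.
struct User {
    var id: Int
    var fullName: String
}

// Wrapper that owns the JSON representation.
struct JSONUser: Codable {
    var value: User
    private enum CodingKeys: String, CodingKey {
        case id
        case fullName = "full_name"
    }
    init(_ value: User) { self.value = value }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        value = try User(id: container.decode(Int.self, forKey: .id),
                         fullName: container.decode(String.self, forKey: .fullName))
    }
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(value.id, forKey: .id)
        try container.encode(value.fullName, forKey: .fullName)
    }
}

// Wrapper that owns the database-row representation (assumed column names).
struct RowUser: Codable {
    var value: User
    private enum CodingKeys: String, CodingKey {
        case id = "user_id"
        case fullName = "name"
    }
    init(_ value: User) { self.value = value }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        value = try User(id: container.decode(Int.self, forKey: .id),
                         fullName: container.decode(String.self, forKey: .fullName))
    }
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(value.id, forKey: .id)
        try container.encode(value.fullName, forKey: .fullName)
    }
}

let user = User(id: 7, fullName: "Ada Lovelace")
let encoder = JSONEncoder()
let jsonData = try encoder.encode(JSONUser(user))
let rowData = try encoder.encode(RowUser(user)) // JSONEncoder as a stand-in row encoder
print(String(decoding: jsonData, as: UTF8.self))
print(String(decoding: rowData, as: UTF8.self))

// A single-field struct has the same in-memory layout as the value it wraps.
print(MemoryLayout<JSONUser>.size == MemoryLayout<User>.size)
```

The `MemoryLayout` comparison only checks storage size; whether the wrapper is free in a particular hot path still depends on the optimizer.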

I ❤️ Swift

Slava

Why not Decimal?

···

On Thu, Aug 3, 2017 at 11:03 PM, Gwendal Roué via swift-evolution <swift-evolution@swift.org> wrote:

> On 3 Aug 2017, at 19:10, Itai Ferber <iferber@apple.com> wrote:
>
> I just mentioned this in my other email, but to point out here: the reason this works in your case is because you adopt these methods as static funcs and can reasonably rely on subclasses of NSData, NSNumber, NSString, etc. to do the right thing because of work done behind the scenes in the ObjC implementations of these classes (and because we’ve got established subclassing requirements on these methods: all subclasses of these classes are going to look approximately the same without doing anything crazy).
>
> This would not work for Codable in the general case, however, where subclasses likely need to add additional storage, properties, encoded representations, etc., without equivalent requirements, either via additional protocols or conventions.

Thanks for your explanation of why a static method in a protocol is able to instantiate non-final classes like NSData, NSDate, NSNumber, NSDecimalNumber, NSString, etc.

Is this "privilege" stable? Can I rely on it to be maintained over time? Or would it be a better idea to drop support for those low-level Foundation classes, because they'll eventually become regular classes without any specific support? This would not harm that much: Data, Date, String are there for a reason. NSDecimalNumber is the only one of its kind, though.

Because I missed it entirely when I brought my ObjC Foundation luggage with me! Thanks for the hint!

Gwendal

···

On 4 Aug 2017, at 06:33, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

Why not Decimal?

To clarify a bit here: this isn’t a "privilege" so much as a property of the design of these classes.
`NSData`, `NSString`, `NSArray`, and some others are all known as _class clusters_; the classes you know and use are essentially abstract base classes whose implementation is given in private concrete subclasses that specialize based on usage. These classes are essentially an abstract interface for subclasses to follow. You can take a look at the subclassing notes for `NSArray` in the Apple Developer Documentation, for instance, to see the guidelines offered for subclassing such a base class.

The reason you can relatively safely offer `static` extensions on these types is that it’s reasonably rare to need to subclass them, and at that, even rarer to offer any interface _besides_ what’s given by the base class. You can rely on the, say, `NSString` interface to access all functionality needed to represent a string. If I were to subclass `NSString` with totally different properties, though, your `static` extension might not take that into account.

Not all types you list here are class clusters, BTW, but they largely fall into the same category of "never really subclassed". There’s no real need for anyone to subclass `NSDate` or `NSDecimalNumber` (since they’re pretty low-level structural types), so this should apply to those as well.

In general, this property applies to all types like this which are rarely subclassed. In Swift, types like this might fall under a `final class` designation, though in Objective-C it’s more by convention/lack of need than by strict enforcement. There’s a reason we offer some of these as `struct`s in Swift (e.g. `Date`, `Decimal`, `Data`, etc.).
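
The caveat about subclasses with "totally different properties" can be sketched in pure Swift. This is only an analogue: `TextConvertible`, `Label`, and `BadgeLabel` are hypothetical names, and plain Swift classes stand in for the Foundation class clusters discussed above.

```swift
// Hypothetical protocol with a static factory, in the spirit of the
// "static funcs" mentioned above.
protocol TextConvertible {
    static func fromText(_ text: String) -> Self
}

// A non-final base class whose whole state is covered by one 'required' initializer.
class Label {
    let text: String
    required init(text: String) {
        self.text = text
    }
}

// The static factory works for the base class and, dynamically, for subclasses.
extension Label: TextConvertible {
    static func fromText(_ text: String) -> Self {
        return self.init(text: text)
    }
}

// A subclass that adds state the factory knows nothing about.
class BadgeLabel: Label {
    var badgeCount = 0
}

let badge = BadgeLabel.fromText("Inbox")
// The factory produced the right dynamic type, but it could only give the
// extra property its default value.
print(type(of: badge), badge.text, badge.badgeCount)
```

As long as subclasses add nothing beyond the base interface, the factory is safe; once they add their own state, the factory silently ignores it.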

···

On 3 Aug 2017, at 21:03, Gwendal Roué wrote:

Thanks for your explanation of why a static method in a protocol is able to instantiate non-final classes like NSData, NSDate, NSNumber, NSDecimalNumber, NSString, etc.

Is this "privilege" stable? Can I rely on it to be maintained over time? Or would it be a better idea to drop support for those low-level Foundation classes, because they'll eventually become regular classes without any specific support? This would not harm that much: Data, Date, String are there for a reason. NSDecimalNumber is the only one of its kind, though.

Gwendal

Yes, you’re right; this is something we need to do in some cases. For Codable specifically, I don’t think this design pattern would affect much, since:

- Encoded representations of values almost overwhelmingly only need to encode stored properties; computed properties are very rarely part of encoded state (what do you do with a computed property on decode?)
- Extensions in Swift cannot add storage, so you can’t extend your own types elsewhere with new properties that would need to be encoded

I think I’d be hard-pressed to find a case where it’s not possible to adopt Codable on a type inside the framework it’s defined within due to a needed extension provided in a different framework (especially so considering Codable comes from the stdlib, so it’s not like there are compatibility/platform concerns there).

Regardless, if you own the type completely, then of course it’s safe to extend and control your own Codable implementation, and a struct wrapper is unnecessary.

···

On Aug 6, 2017, at 12:58 PM, Charles Srstka <cocoadev@charlessoft.com> wrote:

On Aug 3, 2017, at 12:05 PM, Itai Ferber via swift-evolution <swift-evolution@swift.org> wrote:

Thanks for putting these thoughts together, Jordan! Some additional comments inline.

On Aug 2, 2017, at 5:08 PM, Jordan Rose <jordan_rose@apple.com> wrote:

The Workaround

Today's answer isn't wonderful, but it does work: write a wrapper struct that conforms to Decodable instead:

struct DecodedPoint: Decodable {
  var value: Point
  enum CodingKeys: String, CodingKey {
    case x
    case y
  }
  public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    self.value = Point(x: x, y: y)
  }
}

This doesn't have any of the problems with inheritance, because it only handles the base class, Point. But it makes everywhere else a little less convenient—instead of directly encoding or decoding Point, you have to use the wrapper, and that means no implicitly-generated Codable implementations either.

I'm not going to spend more time talking about this, but it is the officially recommended answer at the moment. You can also just have all your own types that contain points manually decode the 'x' and 'y' values and then construct a Point from that.
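
As an illustration of what "use the wrapper" means for a containing type, here is a small runnable sketch. The `Point` class below is a hypothetical stand-in for the third-party class, and `Shape` and the JSON payload are invented for the example.

```swift
import Foundation

// Hypothetical stand-in for the third-party class; only its public
// initializer and properties are used.
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// The wrapper from the message above.
struct DecodedPoint: Decodable {
    var value: Point
    enum CodingKeys: String, CodingKey {
        case x
        case y
    }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(Double.self, forKey: .x)
        let y = try container.decode(Double.self, forKey: .y)
        self.value = Point(x: x, y: y)
    }
}

// A type you own that stores a Point cannot use the synthesized conformance;
// it decodes the wrapper by hand and unwraps it.
struct Shape: Decodable {
    var name: String
    var origin: Point
    enum CodingKeys: String, CodingKey {
        case name
        case origin
    }
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        origin = try container.decode(DecodedPoint.self, forKey: .origin).value
    }
}

let json = Data("{\"name\": \"marker\", \"origin\": {\"x\": 1.5, \"y\": -2.0}}".utf8)
let shape = try JSONDecoder().decode(Shape.self, from: json)
print(shape.name, shape.origin.x, shape.origin.y)
```

The alternative mentioned above, decoding 'x' and 'y' directly inside `Shape`, avoids the wrapper type but repeats the key handling in every containing type.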

I would actually take this a step further and recommend that any time you intend to extend someone else’s type with Encodable or Decodable, you should almost certainly write a wrapper struct for it instead, unless you have reasonable guarantees that the type will never attempt to conform to these protocols on its own.

This might sound extreme (and inconvenient), but Jordan mentions the issue here below in The Dangers of Retroactive Modeling. Any time you conform a type which does not belong to you to a protocol, you make a decision about its behavior where you might not necessarily have the "right" to — if the type later adds conformance to the protocol itself (e.g. in a library update), your code will no longer compile, and you’ll have to remove your own conformance. In most cases, that’s fine, e.g., there’s not much harm done in dropping your custom Equatable conformance on some type if it starts adopting it on its own. The real risk with Encodable and Decodable is that unless you don’t care about backwards/forwards compatibility, the implementations of these conformances are forever.

Using Point here as an example, it’s not unreasonable for Point to eventually get updated to conform to Codable. It’s also not unreasonable for the implementation of Point to adopt the default conformance, i.e., get encoded as {"x": …, "y": …}. This form might not be the most compact, but it leaves room for expansion (e.g. if Point adds a z field, which might also be reasonable, considering the type doesn’t belong to you). If you update your library dependency with the new Point class and have to drop the conformance you added to it directly, you’ve introduced a backwards and forwards compatibility concern: all new versions of your app now encode and decode a new archive format, which now requires migration. Unless you don’t care about other versions of your app, you’ll have to deal with this:

- Old versions of your app which users may have on their devices cannot read archives with this new format
- New versions of your app cannot read archives with the old format

Unless you don’t care for some reason, you will now have to write the wrapper struct, to either:

- Have new versions of your app attempt to read old archive versions and migrate them forward (leaving old app versions in the dust), or
- Write all new archives with the old format so old app versions can still read archives written with newer app versions, and vice versa

Either way, you’ll need to write some wrapper to handle this; it’s significantly safer to do that work up front on a type which you do control (and safely allow Point to change out underneath you transparently), rather than potentially end up between a rock and a hard place later on because a type you don’t own changes out from under you.
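
One way to picture doing that work up front is a wrapper that pins the archive layout and also accepts an older one. Everything here is an assumption made for illustration: the `Point` stand-in, the `ArchivedPoint` name, and in particular the legacy `[x, y]` array layout, which does not come from this thread.

```swift
import Foundation

// Hypothetical stand-in for the third-party class.
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// Wrapper you own: the archive layout is decided here and does not change
// if Point later gains its own Codable conformance.
struct ArchivedPoint: Codable {
    var value: Point

    private enum CodingKeys: String, CodingKey {
        case x
        case y
    }

    init(_ value: Point) {
        self.value = value
    }

    init(from decoder: Decoder) throws {
        if let keyed = try? decoder.container(keyedBy: CodingKeys.self) {
            // Current layout: {"x": ..., "y": ...}
            let x = try keyed.decode(Double.self, forKey: .x)
            let y = try keyed.decode(Double.self, forKey: .y)
            value = Point(x: x, y: y)
        } else {
            // Assumed legacy layout: [x, y]
            var legacy = try decoder.unkeyedContainer()
            let x = try legacy.decode(Double.self)
            let y = try legacy.decode(Double.self)
            value = Point(x: x, y: y)
        }
    }

    // New archives are always written in the current keyed layout.
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(value.x, forKey: .x)
        try container.encode(value.y, forKey: .y)
    }
}

let decoder = JSONDecoder()
let current = try decoder.decode(ArchivedPoint.self, from: Data("{\"x\": 3, \"y\": 4}".utf8))
let migrated = try decoder.decode(ArchivedPoint.self, from: Data("[3, 4]".utf8))

// Re-encoding a migrated value moves it forward to the current layout.
let rewritten = try JSONEncoder().encode(migrated)
let reread = try decoder.decode(ArchivedPoint.self, from: rewritten)
print(current.value.x, migrated.value.y, reread.value.x)
```

This sketch takes the "migrate forward" branch; writing the old layout from `encode(to:)` instead would correspond to the other option in the list above.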

I should point out that there’s a third case: the case where you want to add conformance to a type from another framework, but you own both frameworks.

Plenty of examples of this can be found in the Cocoa frameworks, actually. For example, as NSString is, of course, declared in Foundation, its original declaration cannot conform to the NSPasteboardReading protocol, which is declared in AppKit. As a result, Apple declares NSString’s NSPasteboardReading support in a category in AppKit. There are reasons one might want to do the same thing in their own code—make one library and/or framework for use with Foundation-only programs, and extend a type from that library/framework with NSPasteboardReading support in a separate framework. It can’t currently be done with Swift, though.

Charles