Using subtypes in the implementation of protocol properties and function return values

lhunath · August 23, 2021, 6:08pm

Given the following interface:

protocol Animal {
    var head : AnimalHead { get }
}

protocol AnimalHead {}

A Horse module might implement it as such:

struct Horse : Animal {
//    let head = HorseHead() // [1]
    let _head = HorseHead()
    var head: AnimalHead { _head } // [2]
}

struct HorseHead : AnimalHead {}

Unfortunately, it appears the language does not permit [1] and forces me to use a workaround [2] to implement the head property requirement.

Can anyone explain why this limitation exists? I would understand that it is impossible for the compiler to accept [1] if the property was {get set}, seeing as it's not possible to honour the guarantee of the protocol in that case, but seeing as the property is read-only, I do not see why this restriction exists and the example demonstrates that it is in fact trivial to implement correctly.

What's going on here? Should the compiler consider allowing [1] in the case of read-only properties?

The same is of course true for function return values. In the following code, Horse is theoretically fully compliant with the contract requirements of Animal, but the compiler refuses to permit it:

protocol Animal {
    func copy() -> Animal
}

struct Horse : Animal {
    func copy() -> Horse { // Compiler error
        Horse()
    }
}

On the other hand, the following is perfectly permissible:


struct Horse : Animal {
    func copyHorse() -> Horse {
        Horse()
    }
    func copy() -> Animal {
        copyHorse()
    }
}

There does not appear to be any legitimate reason for why the latter should be allowed but the former illegal.

That is – per my understanding, a protocol definition such as:

func copy() -> Animal

is a contract requiring that any implementation of copy yields an instance that supports, at a bare minimum, the contract requirements of the Animal type. As currently implemented, the compiler appears to require that the implementation of copy return exactly Animal, supporting the contract requirements of Animal and nothing more, and I can't fathom a reason why this should be enforced.

Saklad5 · August 23, 2021, 8:28pm

The issue is relatively simple: AnimalHead is not the same as HorseHead. To comply with Animal, you have to use the right type:

struct Horse : Animal {
  let head: AnimalHead = HorseHead()
}

jrose · August 23, 2021, 8:28pm

This is SR-522, and it would be source-breaking to change. However, the change could be tied to a language mode, so somebody motivated could suggest it and drive it for Swift 6.

lhunath · August 23, 2021, 8:34pm

Thanks for the feedback!

Is there a rationale for why the current behaviour is as-is? I'm also not entirely sure I can see where it's source-breaking.

Saklad5 · August 23, 2021, 8:38pm

I thought this was a type inference/erasure issue?

For instance, Horse can’t satisfy a requirement for Animal unless it is explicitly used as one. If you use as Animal or return it in a function that returns Animal, it works.

Am I missing something here?

lhunath · August 23, 2021, 8:40pm

That's an excellent suggestion. Unfortunately it's not universally useful. It's in fact very sensible for Horse to define its own head as a HorseHead specifically, and allowing Horse to have any type of AnimalHead may not be the right thing to do at all. Furthermore, it's unclear why the Animal protocol should impose this type equivalence requirement, rather than simply requiring that the return value conforms to the type declared by the protocol requirement.

As a quick example, Horse may want to be Hashable, but AnimalHead is not Hashable while HorseHead can be.
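
A minimal sketch of that Hashable case, using workaround [2] from the first post. The maneLength field and the stable set are hypothetical and only there to give the hash something to distinguish:

```swift
protocol AnimalHead {}
protocol Animal {
    var head: AnimalHead { get }
}

// HorseHead can be Hashable even though AnimalHead is not.
struct HorseHead: AnimalHead, Hashable {
    var maneLength = 0   // hypothetical field, for illustration only
}

struct Horse: Animal, Hashable {
    // Stored with its concrete type, so Hashable can be synthesized
    // from this property alone.
    let _head: HorseHead
    // Workaround [2]: expose the concrete head through the abstract requirement.
    var head: AnimalHead { _head }
}

// Two of these three horses have equal heads, so they collapse into one entry.
let stable: Set<Horse> = [
    Horse(_head: HorseHead(maneLength: 1)),
    Horse(_head: HorseHead(maneLength: 1)),
    Horse(_head: HorseHead(maneLength: 2)),
]
print(stable.count)
```

Had the stored property been declared as an AnimalHead, the compiler could not synthesize Hashable for Horse, since AnimalHead itself is not Hashable.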

Saklad5 · August 23, 2021, 8:42pm

What you want is an associated type.

protocol Animal {
  associatedtype Head: AnimalHead
  var head: Head { get }
}

That requires a specific type that conforms to AnimalHead. When you instead use the protocol itself as a type, you are requiring the ability to have any implementation of it.

lhunath · August 23, 2021, 8:47pm

This certainly solves the issue, but the downside is that it makes the protocol far heavier than it needs to be. Note that if I want to use an animal as an Animal, simply to use its head as an AnimalHead, I don't need any concrete information about the specific type of animal or animal head I'm working with. Your suggestion, however, forces the types to resolve to concrete types: the user of the protocol must now either have full compile-time type information for the concrete types involved, or type-erase them. All of this simply to access an abstract animal head, which depends on absolutely no concrete type information.
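
To make that cost concrete, here is a rough sketch assuming the associated-type version of Animal from the previous post. The helper name abstractHead is hypothetical:

```swift
protocol AnimalHead {}
protocol Animal {
    associatedtype Head: AnimalHead
    var head: Head { get }
}

struct HorseHead: AnimalHead {}
struct Horse: Animal {
    let head = HorseHead()   // Head is inferred as HorseHead
}

// The caller only wants an abstract AnimalHead, yet the function is written
// as a generic so that the concrete Animal (and its Head) is known at
// compile time.
func abstractHead<A: Animal>(of animal: A) -> AnimalHead {
    animal.head
}

let head: AnimalHead = abstractHead(of: Horse())
print(type(of: head))
```

Every consumer written this way carries the generic parameter along, even though nothing in its body depends on the concrete head type.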

Saklad5 · August 23, 2021, 8:51pm

I believe that is what @jrose was referring to earlier. The compiler could in theory cast HorseHead to AnimalHead implicitly and conform to the protocol that way. If you are always returning a HorseHead, that does meet the requirement of returning anything that conforms to AnimalHead.

jrose · August 23, 2021, 8:51pm

There are some examples in the bug. The current behavior comes from way back in Swift 1 when the compiler wasn’t good at synthesizing wrappers to convert between the concrete return type and the abstract one, i.e. it was more an implementation restriction than a design one. You can see this in method overrides for classes, where subtyping like this is permitted.
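
A small sketch of the class case mentioned above, where an override is allowed to narrow the return type. The class names mirror the thread's examples, but the code is illustrative only:

```swift
class Animal {
    func copy() -> Animal { Animal() }
}

class Horse: Animal {
    // The override narrows the return type from Animal to Horse.
    override func copy() -> Horse { Horse() }
}

let animal: Animal = Horse()
let copied = animal.copy()             // static type Animal, dispatched to Horse.copy()
let horseCopy: Horse = Horse().copy()  // no cast needed on a Horse
print(type(of: copied))
```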

lhunath · August 23, 2021, 8:55pm

Very helpful insight. Being new to the evolution process, what are the steps that would be required to proceed on this front, or where can I learn about what I can do to contribute? Does Sam Ballantyne own the evolution on this?

lhunath · August 24, 2021, 2:18pm

Note that attempting to use an extension as a work-around yields code that compiles, but (confusingly) Horse's fork implementation does not replace the extension's fork when the value is used as an Animal:

protocol Animal {
    func fork() -> Animal
}
extension Animal {
    func fork() -> Animal {
        fatalError("https://bugs.swift.org/browse/SR-522")
    }
}
struct Horse : Animal {
    func fork() -> Horse {
        Horse()
    }
}


let horse = Horse()
let animal : Animal = horse

horse.fork() // fine.
animal.fork() // fatalError.

jrose · September 9, 2021, 7:33pm

Very belated answer: the full process is documented in process.md in the apple/swift-evolution repository on GitHub, but the most important thing would be to start a discussion in the Evolution > Pitches section of this forum. It would probably help a lot to at least sketch out the proposal template (also linked from the process page) with what would need to change, some motivating examples, and things like that.

lhunath · November 12, 2021, 5:29pm

Note that this is especially problematic when adopting libraries.

class CatHead {}
class CatAnimal {
    var head = CatHead()
}

When Cat is a library and you're extending it to conform to protocols (for example, so that you can erase the library's specific implementation from your code and use abstract protocol types instead), this is unsolvable since you have no access to the original CatAnimal implementation.

extension CatHead : AnimalHead {}
extension CatAnimal : Animal {}

This should be sufficient, and I see no reason as to why it shouldn't be (granted, excepting the fact that it just wasn't implemented this way).
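
For comparison, the closest thing that does compile is a wrapper around the library type. This is only a sketch and not something proposed in the thread: CatAdapter is a hypothetical name, and the classes below stand in for the library's types. It also means callers handle the adapter rather than CatAnimal itself.

```swift
// Stand-ins for the library types, which cannot be modified.
class CatHead {}
class CatAnimal {
    var head = CatHead()
}

protocol AnimalHead {}
protocol Animal {
    var head: AnimalHead { get }
}

// The head type itself can be conformed directly.
extension CatHead: AnimalHead {}

// Hypothetical adapter: wraps the library type instead of extending it,
// and re-exposes its head under the abstract type.
struct CatAdapter: Animal {
    let cat: CatAnimal
    var head: AnimalHead { cat.head }
}

let animal: Animal = CatAdapter(cat: CatAnimal())
print(type(of: animal.head))
```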

I should add that I regrettably have neither the experience nor the time to initiate a Swift evolution proposal, so at this point I expect this issue is dead in the water until someone else can pick up the mantle on it.

lhunath · February 21, 2022, 5:31pm

The solution for this should also unlock extending protocol conformances:

protocol AnimalHead {}
protocol Animal {
    var head: AnimalHead { get }
}

protocol HorseHead: AnimalHead {}
protocol Horse: Animal {
    /* override */ var head: HorseHead { get }
}

Slava_Pestov · February 21, 2022, 9:29pm

This form of subtyping would make associated type inference a much more difficult problem. It would not be trivial to implement at all (unless we removed associated type inference).

In the case of your Animal protocol, perhaps Head should be an associated type instead:

protocol Animal {
    associatedtype Head : AnimalHead
    var head : Head { get }
}

protocol AnimalHead {}

struct Horse : Animal {
    let head = HorseHead()
}

struct HorseHead : AnimalHead {}

lhunath · February 21, 2022, 10:16pm

Would you mind clarifying this for me, perhaps with an example case? AFAIK associated types are inferred from witnesses in concrete types. Since concrete types have all members actualized, there does not appear to be any additional ambiguity introduced by supertype compatibility in member signatures.

Certainly there exist ways of re-architecting these simple examples so as to work around the issue. However, that's hardly the point of this thread – introducing associated types into a protocol fundamentally changes the requirements of this type as well as its member signatures.

protocol Animal { var head: AnimalHead { get } }

This Animal describes a promise that the concrete type should yield any kind of head, with the only requirement that it conform to AnimalHead.

protocol Animal { associatedtype Head: AnimalHead; var head: Head { get } }

This Animal describes a promise that the concrete type must expose a new concrete type of head whose minimum requirements are that it conforms to AnimalHead but additionally that it fully expose all concrete details regarding the particular head the subtype implements.

Ignoring the fact that the latter is clearly a different type of contract, far more expensive, demanding and poorly encapsulated, we also ought to acknowledge that the current implementation is counter-intuitive and surprising when we take a moment to think about the type promises involved:

protocol AnimalHead {}
protocol Animal {
    var head: AnimalHead { get }
}

These protocol requirements simply say, "any Animal must have a head that conforms to AnimalHead".
The reality is that in the following example (which fails to compile):

struct HorseHead : AnimalHead {}
struct Horse: Animal {
    let head = HorseHead()
}

These protocol requirements are met. The Horse type is compatible with the Animal protocol's contract.

The failure case is that the compiler is enforcing an additional requirement on subtypes: in addition to "any Animal must have a head that conforms to AnimalHead", it also enforces that "Animals may not specialize their head", and it is unclear whether this additional requirement should be part of how users expect protocol types to behave.

Notably, for instance, this additional requirement is absent in classes:

class AnimalHead {}
class Animal {
    var head: AnimalHead { AnimalHead() }
}

class HorseHead: AnimalHead {}
class Horse: Animal {
    override var head: HorseHead { HorseHead() }
}

Slava_Pestov · February 21, 2022, 10:31pm

Sure, consider this:

protocol Q {}

protocol P {
  associatedtype T
  var foo1: T { get }
  var foo2: T { get }
}

extension P {
  var foo1: Q { ... }
  var foo2: Q { ... }
}

struct Q1 : Q {}
struct Q2 : Q {}

struct P1 : P {
  var foo1: Q1
  var foo2: Q2
}

Should we pick Q1, Q2 or Q as the witness for P.T in P1 : P? Any of these assignments would work, with different choices for whether to use the default implementations of foo1 and foo2 defined in the extension of P, or the concrete implementations of foo1 and foo2 in P1. What if the concrete implementations themselves are in extensions of P1?

Presumably you also want the other subtyping rules from the expression type checker to carry over to witness matching, such as optional injection (T < T?), subclassing (C < D where class C : D) and function conversions (T1 -> U1 < T2 -> U2 where T2 < T1 and U1 < U2). This makes the problem of choosing the type witness in the presence of multiple requirements that could potentially match even more difficult. Beyond the disambiguation heuristics required, it also makes associated type inference into a more 'global' problem; today we can resolve type witnesses by looking at a single candidate requirement that mentions the associated type in its signature, but allowing subtyping would require finding all requirements and all corresponding witnesses and then computing intersections of various combinations, and trying to discard non-matching candidates where appropriate.
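
For reference, this is roughly how those three conversions look in ordinary expressions today. The names are illustrative, with C and D standing for a subclass and its superclass as in the notation above:

```swift
class D {}
class C: D {}

let c = C()

// Optional injection: T < T?
let maybeC: C? = c

// Subclassing: C < D where class C : D
let d: D = c

// Function conversion: a function taking the superclass and returning the
// subclass can be used where one taking the subclass and returning the
// superclass is expected.
let f: (D) -> C = { _ in C() }
let g: (C) -> D = f

print(type(of: g(c)))
```

Each of these is resolved against a known target type in an expression. Witness matching would have to consider them while the associated type is still unknown.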

Class members are allowed to override members of the superclass with a subtype in certain narrow cases because there is no equivalent of associated type inference and the related issues there. However, note that this feature isn't perfect either; you can override a property with a type that is a subclass of the type of the base class's property, but not with a concrete conforming type if the base class's property has an existential type. That's a restriction that could probably be lifted more easily, though.