Why is covariant Self more flexible on protocols than on classes?

Let's say I want to define a strongly typed UIView wrapper that adds padding to another view:

import UIKit

public final class PaddedView<Content: UIView>: UIView {
  public init(
    content: Content,
    padding: UIEdgeInsets
  ) {
    self.content = content
    self.padding = padding

    super.init(frame: .zero)

    addSubview(content)

    // Set up constraints
  }

  @available(*, unavailable)
  public required init?(coder: NSCoder) { fatalError("init(coder:) is not supported") }

  public let content: Content
  public let padding: UIEdgeInsets
}

Then I want to make a convenient extension to pad a view:

public extension UIView {
  func padded(with padding: UIEdgeInsets) -> PaddedView<Self> {
    .init(content: self, padding: padding)
  }
}

This does not compile, and produces the following error:

Covariant 'Self' or 'Self?' can only appear at the top level of method result type

I recently learned of a trick to work around this (by investigating how the publisher<T>(for: KeyPath<Self, T>) extension on NSObject is implemented). We introduce a protocol representation of the class:

public protocol UIViewProtocol: UIView {}

Then extend the class to conform to the protocol:

extension UIView: UIViewProtocol {}

Then change the generic parameter and the extension to use the protocol:

public final class PaddedView<Content: UIViewProtocol>: UIView {
  ...
}

public extension UIViewProtocol {
  func padded(with padding: UIEdgeInsets) -> PaddedView<Self> {
    .init(content: self, padding: padding)
  }
}

Covariant Self is allowed to appear anywhere in a function signature on a protocol, so this now compiles, and it achieves the intended result: if I call let result = something.padded(with: ...), I can query result.content and get back the correctly typed something.
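
Concretely (with a hypothetical UILabel; the inset values are arbitrary):

let label = UILabel()
let padded = label.padded(with: UIEdgeInsets(top: 8, left: 8, bottom: 8, right: 8))

// Self was bound to UILabel, so `content` comes back fully typed, no cast needed:
let content: UILabel = padded.content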

That made me curious: why does this restriction exist on classes but not on protocols? It seems (although I haven't thought deeply about it) that any time you'd get stuck on this limitation with classes, you can introduce this protocol-equivalent trick and surmount it, which suggests the compiler has already solved the problem it would need to solve to allow the same thing on the class directly.

Maybe it has to do with existentials? What would happen if I had an any UIViewProtocol and tried to call .padded on it? Well, I can't. Generic parameters can't be covariant, so this appearance of Self is in a non-covariant position, which means this extension is unavailable on the existential. Existentials of protocols that use covariant Self didn't exist at all until Swift 5.7, and extensions that introduced such a use were simply excluded from existentials. Perhaps forbidding anything but the only possible covariant use of Self (at the "top level" of the return value) in a class extension was an older approach to avoiding this problem, on par with forbidding Comparable-style protocol existentials entirely?
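
For example, something like this (a sketch; the exact diagnostic wording may differ):

let v: any UIViewProtocol = UILabel()
// v.padded(with: .zero)
// ^ does not compile: the result type PaddedView<Self> uses Self in a
//   non-covariant position, so `padded` is unavailable on the existential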

But what's interesting is that you can still "erase" a subclass up to its base class: let erasedView: UIView = UILabel(). The extension is available, and you get an instance of PaddedView<UIView> back (that's not just the static type of the return value but its actual dynamic type too). So apparently the "existential" that gets opened when calling into this extension is the static type if that's a class, but the dynamic type if it's a protocol existential. The extension isn't available on a protocol existential here, but it would be if I changed the return type to UIView. Then, if I declared erasedView as an any UIViewProtocol, the result of erasedView.padded would have a static type of UIView but a dynamic type of PaddedView<UILabel>. This does not happen if I make the return type UIView and extend UIView instead of UIViewProtocol (the static type would be UIView and the dynamic type would be PaddedView<UIView>). So introducing the protocol lets me at least end up with the "correct" dynamic type (bound to the actual dynamic type of the receiver), which isn't possible if I extend the class.

Why couldn't it (or shouldn't it) work that way with the class directly, where it would simply bind the generic parameter of the return type to the known static type of whatever the extension is called on? That is, why can't I put this extension on UIView, and then when I call something.padded, it just binds Self to the static type of something?

Does it have something to do with the way dispatch is implemented? I've been caught off guard before by how dispatch works in conditional extensions on generic types, and concluded that generic concrete types must have a singular representation under the hood: one method table (for both statically and dynamically dispatched member functions) for the type itself. That's very different from, say, C++, where each specialization of a template class is its own full-blown class with its own compiled methods (which makes constrained method overloading possible). I may be remembering this incorrectly, but I recall being unable to make conditional overloading work on a generic struct; when I replaced the generic struct with a protocol with an associatedtype and hung the extensions off of that, I was able to make it work.
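
The behavior I'm (possibly mis)remembering looks something like this sketch, with a hypothetical Bag type:

struct Bag<Element> {}

extension Bag {
  func describe() -> String { "general" }
}

extension Bag where Element: Comparable {
  func describe() -> String { "comparable" }
}

// On a concrete type, overload resolution statically picks the more specific overload:
print(Bag<Int>().describe()) // "comparable"

// But inside a generic context, only the unconstrained overload is visible,
// even when the runtime type would satisfy the constraint:
func describeAny<T>(_ bag: Bag<T>) -> String { bag.describe() }
print(describeAny(Bag<Int>())) // "general"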

3 Likes

Existentials are a generalization of class variables to non-reference types. The reason the protocol extension method isn't callable on an existential is the same reason you can't define the method on UIView: the underlying type of Self is not fixed. Class variables are references to their data, and those references can be replaced with references to subclasses. They are, effectively, always existentials.

No, in the protocol code you wrote there is no existential at all. Self is an associated type; it's resolved statically. An existential is like a class variable, in that the dynamic type can be replaced with any subtype, but generalized so that it can work with value types. Protocols are like generics: they work with the statically known concrete type.

1 Like

The keyword Self inside a class is not a generic parameter; it is its own special thing, primarily meant for Objective-C interoperability (it's spelled instancetype there). It is always the dynamic type of the instance, not the static type at the call site. For example:

protocol P {}

class Base: P {
  func f() { print(Self.self) }
}

extension P {
  func g() { print(Self.self) }
}

class Derived: Base {}

(Derived() as Base).f() // prints "Derived": Self is the dynamic type
(Derived() as Base).g() // prints "Base": Self in a protocol extension is the static type at the call site

Given this behavior, returning a Foo<Self> from a class method is unsound: we might get Foo<Base> or Foo<Derived> dynamically, but the two types are distinct and we don't know which we would get.

9 Likes

It is worth emphasizing, IMO, that this is because in the general case Foo<T> and Foo<U> are always completely distinct types, regardless of any relationship that may exist between T and U. This may not be obvious because Swift allows for a few narrow special cases such as Array, Optional, Set, and Dictionary which are 'covariant' in their generic parameters, that is, an Array<T> can be treated as a subtype of Array<U> when T is a subtype of U. But the compiler is able to do this based on special knowledge about how these types behave, and those properties don't hold for generic types in their full generality.
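
A sketch of the distinction, with Box standing in for an arbitrary user-defined generic type:

class Animal {}
class Dog: Animal {}

struct Box<T> { let value: T }

let dogs: [Dog] = [Dog()]
let animals: [Animal] = dogs // OK: Array gets special covariance treatment

let dogBox = Box(value: Dog())
// let animalBox: Box<Animal> = dogBox // error: Box<Dog> and Box<Animal> are unrelated types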

6 Likes

This is true even after introducing the protocol in order to write this extension. The compiler deals with this by binding Self to the static type instead of the dynamic type:

let erasedView: UIView = UILabel()

let result = erasedView.padded(with: …) // Both the static and dynamic type of `result` are PaddedView<UIView>; the dynamic type is *not* PaddedView<UILabel>

If it works this way when writing the extension on UIViewProtocol, why can’t it work the same way when writing it directly on UIView?

What I was referring to here is this:

extension UIViewProtocol {
  func padded(with padding: UIEdgeInsets) -> UIView {
    PaddedView(content: self, padding: padding)
  }
}

let erasedView: any UIViewProtocol = UILabel()
let result = erasedView.padded(with: …) // The static type is `UIView`, the dynamic type is `PaddedView<UILabel>`

The existential here is erasedView, and “opening the existential” means that when calling into this extension from an existential, Self gets bound to the dynamic type of whatever is inside the existential (i.e. the existential is “opened” and the type is bound to what is held inside).
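
In other words, calling through the existential behaves roughly as if the compiler routed it through a generic function like this (openedPadded is a hypothetical name, assuming the UIView-returning extension above):

func openedPadded<V: UIViewProtocol>(_ view: V, _ insets: UIEdgeInsets) -> UIView {
  // V is bound to the dynamic type inside the existential (UILabel here),
  // so this builds a PaddedView<UILabel> and returns it upcast to UIView.
  view.padded(with: insets)
}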

So in that case, no, protocol existentials do not work with the statically known concrete type. They work with the dynamic type, and this is different from how classes work: with the exact same code (the extension still written on UIViewProtocol, but with the type of erasedView changed to UIView), the dynamic type of result changes to PaddedView<UIView>.

That’s just me speculating on what differences exist that might make it matter that the extension is on a protocol instead of a class. But I still can’t see what stops the compiler from letting me write this:

extension UIView {
  func padded(with padding: UIEdgeInsets) -> PaddedView<Self> {
    .init(content: self, padding: padding)
  }
}

And then having it bind Self statically the same way it does if you write the extension on the protocol but call it on a variable whose static type is a class.

This problem doesn't go away when I use the protocol trick to get around that limitation. The way the compiler deals with it there is by always returning Foo<BaseType>, even if the receiver is dynamically a SubType whose statically declared type is BaseType. This is type-valid: even if Foo<T> has a let member: T, then in Foo<BaseType>, member can hold a SubType. It might be surprising, or not what you actually want, but it isn't type-invalid.
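
That is, roughly:

class BaseType {}
class SubType: BaseType {}

struct Foo<T> { let member: T }

// Perfectly legal: a SubType is a BaseType, so it can be stored as the member.
let foo = Foo<BaseType>(member: SubType())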

What you mentioned about ObjC interop might be the real answer to my question (Self in a class is really a quasi-legacy interop feature that doesn't behave the way it would if it were introduced into the language without that historical connection). This would imply that introducing the protocol really is just a mechanical workaround to defeat this otherwise-irrelevant historical context.

I do have a couple of extensions defined as:

extension NSObjectProtocol where Self: UIView {

  public func connecting(to outlet: inout Self) -> Self {
    outlet = self
    return self
  }

}

For some reason this won't work at all when extending UIView directly. I found this a bit surprising, but I moved on to using the pattern above, and it works fine.
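
Usage looks roughly like this (variable names hypothetical):

var storedLabel = UILabel()

// Self is bound to UILabel at the call site, so the outlet's type must match exactly:
let label: UILabel = UILabel().connecting(to: &storedLabel)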

2 Likes

Yes, keeping that in mind is critical to understanding what's going on here, and in several other places (I've found it's usually the culprit behind existentials losing functions they "should" support, like map on Publisher).

If one day Swift allows us to declare generic parameters (and associated types) as covariant, such as with an out keyword as in C# and Kotlin (i.e. struct Foo<out T>), then this:

extension SomeProtocol {
  var wrapMeInFoo: Foo<Self> { .init(wrapping: self) }
}

would become available on the existential:

let anExistential: any SomeProtocol = SomeStruct()
let wrapped = anExistential.wrapMeInFoo // static type is Foo<any SomeProtocol>, dynamic type is `Foo<SomeStruct>`, this is okay because the latter is a subtype of the former

However, would that mean it would also become possible to write this?

extension SomeClass {
  var wrapMeInFoo: Foo<Self> { .init(wrapping: self) }
}

Or does the language treat covariant Self on protocols and the Self keyword in classes so differently that we still wouldn't be allowed to do this, possibly for some arcane historical reason and not because it's fundamentally type-unsafe? (They seem equivalent from a type theory perspective to me, but maybe I'm missing something.)

When you declare a conformance of a type to a protocol, the mapping from protocol requirements to implementations (witnesses) is evaluated at the point where the conformance is declared, so we must make a choice among multiple overloads right there.
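
A minimal sketch of that consequence (hypothetical names):

protocol P { func f() -> String }

struct S<T>: P {
  func f() -> String { "unconstrained" }
}

extension S where T == Int {
  func f() -> String { "constrained to Int" }
}

let s = S<Int>()
print(s.f())            // "constrained to Int": overload resolution at the call site
print((s as any P).f()) // "unconstrained": the witness was fixed when the conformance was declared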

In C++, templates work completely differently. The compiler basically copies and pastes the definition, substitutes the template parameter with the type, and only then resolves names (yes, I’m aware this is an oversimplification, but it’s good enough).

4 Likes

Yeah, if you were designing a new language from scratch and the goal was conceptual elegance above all else, you probably would not ever want to add dynamic Self (or reimagine it as a much more general value-dependent type or something).

4 Likes

Tangent, but I consider this one of the biggest mistakes in the language, because it implies that arrayOfInts as [Int?] or stringToStringDict as [AnyHashable: Any] are almost-free constant-time operations, when they are actually expensive O(n) operations. What's worse, you don't even need the as; type context (i.e. passing a value of one type into a function that takes the other) is enough, so the conversion is completely hidden.
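
For example, this sketch compiles with no visible conversion at all:

func takesOptionals(_ values: [Int?]) {}

let ints = Array(1...1_000_000)
takesOptionals(ints) // hidden O(n) conversion: a new array is built, rewrapping every Int as Int?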

There is perhaps a case to be made for conversions of arrays of classes to behave like this, if there were guarantees around those conversions actually being free (i.e. no reallocation, individual element coercion, or, in a dictionary's case, rehashing/bucketing).

But even then, the use of as is a bad way to provide this feature, because it causes much confusion for people coming from other languages, who misunderstand what it means (i.e. it is not a "cast", but people call it that and think of it as one). It would be much better to expose these conversions via functions, IMO.

So whenever I see a thread about variance limitations, my mind goes immediately to "yes please, how can we get more", not "how do we lift them". I would love some future Swift version to eliminate these flaws.

10 Likes

That's actually an interesting question: is container covariance implemented the way you're thinking, by copying the contents? I assumed for a while that it had to be implemented that way, but as I became more familiar with how I might write my own copy-on-write container in Swift, now I'm not so sure.

I could see it being written in a way where the reference-counted class inside, which manages the allocation, knows the exact type of its contents and the nominal type of the array it's representing, and provides element-read functions that perform runtime casts of whatever is actually stored in the allocation. It may even be possible to do this in a way that avoids any per-read runtime checks when the contents' type matches the array's element type (maybe the class managing the allocation has two flavors, one for mismatched types and one for matching types).

Of course, when writing an element or otherwise modifying the array, the contents have to be copied anyway (copy-on-write), so you'd use that opportunity to switch to a "matching" element type in the allocation.

If user-defined covariant types get added to Swift, I think you'd have to find a way to implement covariant containers this way, because you don't get to write your own copy constructors that would do something like make a new allocation.

But even if there's a single quick CPU instruction per conversion, the overall time complexity is still O(n), is it not? Whether the actual conversion happens eagerly, at the point of the cast:

// conversion happens up front
let v = something as [SomethingElse] // 1 extra CPU instruction * n = O(n)
// O(n) overall

or at the point the converted value is used:

// conversion happens lazily
let v = something as [SomethingElse] // let's suppose this is O(1)
for item in v { // n items
  _ = item // let's suppose using it is O(1) – that's where the actual conversion happens
}
// O(n) overall

The idea (if it is practical; there may be implementation complications I'm not considering) is that there would be no per-element operation at all. Bear in mind that in declaring let superclass: Superclass = subclass, no conversion actually happens at runtime. All that happens is the compiler saying "OK, this value is now of that type":

class C { }
class D: C { }

let d = D()
let c: C = d

// these print the same values
print(ObjectIdentifier(c), ObjectIdentifier(d))

So similarly, an array would just say "OK, I'm an Array<C>" without having to amend each individual element of type D in its buffer to be of type C. But this only works for classes. Existentials, optionals of structs, etc. are not like this, and involve a representation change to every element.
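
The representation change is visible in the element layouts (sizes as on a 64-bit platform, reusing C and D from above):

// Class references are pointers either way: same 8-byte representation.
print(MemoryLayout<D>.size, MemoryLayout<C>.size) // 8 8

// Wrapping an Int in an Optional adds a tag byte, changing the layout,
// so converting [Int] to [Int?] must rewrite every element.
print(MemoryLayout<Int>.size, MemoryLayout<Int?>.size) // 8 9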

This is a big win[1] for everything-is-a-pointer languages like Java or Objective-C: of course an NSArray<D> isa NSArray<C>, because the generics are lightweight. Under the hood, all NSArray holds is object pointers, so "casting" that array is just the compiler saying "OK, that's what they are now".

For this reason, not having this implicit conversion in Swift would be very upsetting for folks coming from an OOP-and-pointers language. And so it is comforting to those folks that array-of-concrete can be "cast" to array-of-existentials. But it is absolutely not the same thing. It's actually more akin to concreteArray.map { $0 as any P }, and my dream is that a change to provide a warning+fixit would make this clearer to people.

Unfortunately such a change would generate a lot of code churn at this point, so whether it's still viable is unclear.


  1. vastly outweighed by their various downsides ↩︎

4 Likes

This can be made to work, but I would discourage ever actually doing this, and do not believe this is a good direction for the language to pursue. These reality-hiding abstractions tend to bite you hard from a performance POV when used in anger. In this area, I think transparency from the language about what is actually going on at the representation level is important.

My view, absent an extremely strong design that demonstrates squaring away all of the many challenges it would bring (language complexity, performance foot guns), is that this is not a good direction for Swift to ever consider.

IMO Scala should be considered a cautionary tale :)

4 Likes

Well, I just did a test and proved that assigning a covariantly upcast array makes an actual copy. The alternative would be to put all array-typed variables inside Any-style boxes that thunk to witness tables for both value operations (copy/move/destroy) and calls to the Array interface.

This is probably the better choice, because such assignments are much rarer than accesses, and you wouldn't want to make everyone pay with thunks on every array access just to support covariant assignment.

Isn't the definition of an abstraction "hiding the reality" under the hood? Abstractions tend, as a rule, to trade performance for ease of maintenance. The reason "premature optimization" is so nefarious is that it practically demands low-level coding that is much more difficult to get correct, and to modify correctly, than the abstracted version (abstractions enforce invariants that low-level code can't). And most of the time it isn't important. Have you ever actually had to remove a covariant array upcast because it was too expensive?

It wasn't that long ago that people were arguing message passing was unacceptably slow on Macintoshes. Ten years later, people were writing message-passing (ObjC) apps on iPhone 3s and almost never needed to drop down to C-based Foundation for performance reasons.

Arguably, Swift has already violated this with copy-on-write, which can be badly suboptimal if you need to modify a single element of a large, previously shared array at a critical point where time is scarce (even worse, the problem appears out of nowhere if you weren't sharing the array before but then innocently pass a "copy" of it to someone else). You can take precise control of that by dropping down to UnsafePointers. That's much harder to get right, and you should only do it after you've proven that the performance characteristics of the abstraction are the real bottleneck.
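
A sketch of how the cost appears out of nowhere:

var a = Array(repeating: 0, count: 1_000_000)
let sharedCopy = a // a "copy" handed out elsewhere; the storage is now shared
a[0] = 1           // copy-on-write triggers here: a full O(n) buffer copy to preserve sharedCopy
_ = sharedCopy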

This can go both ways.

If you want a cautionary tale of "keep a language simple", look at Go. I once helped people re-implement exceptions in it LOL (because "exceptions are bad"). It's like a carpenter bringing only a hammer to the job site because the belt-full of tools is "too complicated".

It doesn't seem Swift's goal is to have entirely predictable and controllable performance by default. It favors safety even when it is costly (e.g. preventing out-of-bounds access, which requires bounds checking on every array access everywhere). Instead, it provides the option to drop down to expert-only levels to do that kind of thing, and plasters "USE AT YOUR OWN RISK" stickers all over it. Swift isn't C++.

Covariant concrete value types could get messy because, as I mentioned, they would force generic structs into polymorphic boxes (which arguably runs afoul of the point of structs, part of which is not being polymorphic). On the other hand, so what? If it's opt-in (and it should be), then you're paying for it only after deliberately deciding you need it.

Covariance already exists for abstract and reference types in C# and Kotlin and has proven very useful. The more powerful the type system, the less type-unsafe code (giving up and regressing to Any or weakly typed existentials) you have to write, and the more bugs get caught by the compiler at build time instead of by your testers (or, worse, your customers) at runtime.

Even if there's a hidden performance cost, you can simply not opt in (don't add the out keyword to a type parameter if you need to avoid that cost). But if the feature isn't in the language, I can't opt out of the type-unsafe code I have to write to work around its absence.

Sorry, I didn’t mean to imply it doesn’t copy today. I am suggesting it could in future, for classes specifically, and that IMO that should be the only implicit conversion worth supporting. Even then, I suspect it would still be a bad idea to do it implicitly, due to the confusion over what “casting” does in Swift.

Is it reasonable to add some guidance in a future Swift version? I mean a warning about as casting in cases where it has O(n) cost, like:

let array: Array<Int> = ...

let casted = array as Array<any BinaryInteger> // Warning: this is an O(n) operation for value types; replace it with an explicit map
concreteArray.map { $0 as any BinaryInteger } // OK

// But what about opaque types?
let casted = array as Array<some BinaryInteger> // is it also O(n)?

Getting this guidance right is important. It should be clear that as is cheap for classes and (possibly?) for opaque types.

Is this the same or different?

let casted = [any BinaryInteger](array)

This sort of illustrates my concern about people thinking of as as "a cast". A lot of the time, all as does is provide type context.

let casted = array as Array<some BinaryInteger> 

isn't valid Swift, because you can't use some like this. But if you could, this would not be a "cast" as I think people understand the term. For example, no "casting" is happening with

func f() -> Array<some BinaryInteger> {
  let a: [Int] = [1,2,3]
  return a
}

Rather, what is happening is that the compiler is hiding (keeping opaque) the actual type contained in the array. But that doesn't mean it has been "cast to some BinaryInteger". That array is the same one created as a, returned untouched to the caller. The compiler is just keeping its type a secret from you.

This is very different from

func g() -> Array<any BinaryInteger> {
  let a: [Int] = [1,2,3]
  return a
}

Again, I wouldn't describe this as "a cast", but it is an implicit conversion. It compiles, but a is not returned. Instead, a brand-new array, with every element converted from an 8-byte Int to a 40-byte any BinaryInteger, is created and returned, and a is discarded.

4 Likes