Why is covariant Self more flexible on protocols than classes?

I was the one who pondered whether it actually copies. I was curious because that's how I always assumed it was implemented, but it's inconsistent with copy-on-write, and it is possible to implement it without copying: either as a special container that is aware of both its storage element type and its nominal element type, or as a polymorphic box that thunks calls to the container interface through a vtable. Since it's easy to confirm which one it is (the performance of assignment would be dramatically different between the two), I decided to answer my own question, and confirmed that it's doing an explicit conversion via copy. I agree with you that this is the better approach.
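For reference, a minimal sketch of the kind of timing test I mean (the protocol and type names here are just for illustration):

```swift
import Foundation

protocol Element {}
struct Concrete: Element {
    var value: Int
}

let concretes = (0..<1_000_000).map { Concrete(value: $0) }

// If the conversion were a free reinterpretation (or a lazy view), this
// assignment would take near-zero time regardless of the element count.
// Because it converts by copying, boxing every element into an
// existential, the time scales with the array's length.
let start = Date()
let erased: [any Element] = concretes
print("Converted \(erased.count) elements in \(Date().timeIntervalSince(start))s")
```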

I find it interesting that you treat "casting" and "conversion" as such different and separate concepts. This is surely due to my time working with C++, but I treat the two not just as overlapping concepts; I treat "casting" as a subset of "conversion" (specifically, a lossless conversion). Maybe it makes sense to further restrict "casting" to a conversion of a reference (so you're still accessing the same identical object), but that's dubious, because the whole point of "values" is that they don't have surrogate identities, so you could never tell the difference anyway.

This is because "casting" in C++ is literally nothing more than implicit conversion functions (either a single-parameter constructor or an operator T() member), or explicit ones through ordinary functions (free-floating or member, depending on the syntax you want). This is exactly how the smart-pointer flavors of the cast operations are written in the standard library.

If you static_cast something, the compiler looks for conversion functions. If you define Rectangle and Square with no inheritance relation, and give Rectangle a converting constructor Rectangle(const Square& square), then static_cast<Rectangle>(Square()) calls that constructor. What could "casting" mean if not "conversion"? It only restricts the concept to a conversion that in some sense "preserves" the value.
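A rough Swift analogue of that converting-constructor pattern (the types here are hypothetical):

```swift
struct Square {
    var side: Double
}

struct Rectangle {
    var width: Double
    var height: Double

    // The Swift counterpart of a C++ converting constructor: an init
    // taking the source type. "Casting" a Square to a Rectangle is just
    // calling this conversion, and it preserves the value's meaning.
    init(_ square: Square) {
        self.width = square.side
        self.height = square.side
    }
}

let rect = Rectangle(Square(side: 2)) // plays the role of static_cast
```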

This has caused me to imagine a language that permits covariance with value semantics. If a Base defines a function func giveMeAnInt() -> Int64, then a subclass Sub: Base should be able to override that function as func giveMeAnInt() -> Int32, as long as Int32 is implicitly convertible to Int64. This expands the usual notion of covariance from pointer/reference types to all types, treating the pointer case as merely exploiting the fact that a Sub* is implicitly convertible to a Base*. You could do the same with contravariance.
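Swift already allows the reference flavor of this today; a sketch of the contrast (the class names are hypothetical):

```swift
class Animal {}
class Dog: Animal {}

class Shelter {
    func resident() -> Animal { Animal() }
}

class DogShelter: Shelter {
    // Allowed today: reference covariance. A Dog reference is
    // implicitly convertible to an Animal reference.
    override func resident() -> Dog { Dog() }
}

// The hypothetical generalization: if Int32 were implicitly convertible
// to Int64, this would become legal too (it's an error in Swift today):
//
// class Sub: Base {
//     override func giveMeAnInt() -> Int32 { 0 }
// }
```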

Casting is never guaranteed to be cheap. You always have to convert something, which can be arbitrarily expensive depending on the two types involved (also, does it cascade? Is it a direct conversion, or does it require multiple conversion steps?), and casting pointers can be even worse. Dynamic casting can be very expensive because it has to explore the inheritance/conformance tree, which is many-to-many. If you think of casting as cheap, you need to excise that idea entirely.

Even as (without ? or !) is not always a runtime no-op. If it's just being used for type inference it is, but it can also indicate toll-free bridging. The two have identical syntax because they are semantically equivalent.

If you're worried about the performance of toll-free bridging, should you also be worried about the performance of let upcast = Sub() as Base? That's not free either. "Upcasting" from a concrete type to an existential box is expensive, both at the time of casting and later at access (the indirection can destroy performance in a loop through cache thrashing). The test I ran to prove that "upcasting" arrays copies showed that creating the upcasted array took much longer, over 10x I think, than creating the original array. That's because of the expense of boxing every element.

If you want "conversion" to be more explicit, arguably you should want all casting to be explicit, which I'm not necessarily against (though a language that allows no implicit conversion is less capable than one that supports it; the more powerful language lets the user specify where implicit conversion is possible).
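A quick illustration of those flavors of as (assuming Foundation's bridged NSString on an Apple platform):

```swift
import Foundation

// Pure type inference: resolved at compile time, a genuine no-op at runtime.
let double = 1 as Double

// Toll-free bridging: identical syntax, but may do work at runtime.
let bridged = "hello" as NSString

// "Upcasting" to an existential: also not free; the value gets boxed.
protocol Shape {}
struct Circle: Shape {}
let shape = Circle() as any Shape
```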

In fact, after thinking about this, conversion functions are how Swift should implement covariant generic value types if it ever does, rather than making generic value variables polymorphic and thunking through vtables. So if you declare a generic value type to be covariant, MyStruct<out T>, the compiler both enforces that T only appears in covariant position and requires you to define a conversion function. You'd either need a special keyword to identify this function, or you'd need generalized supertype constraints to write the required conversion as a regular init. The compiler would then insert the appropriate conversions where necessary.
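To illustrate why generalized supertype constraints come into it (Box and the class names are hypothetical): today you can hand-write such a conversion, but only for one concrete supertype at a time, because Swift can't express "U is a subtype of T" as a generic constraint:

```swift
class Animal {}
class Dog: Animal {}

struct Box<T> {
    var value: T
}

extension Box where T == Animal {
    // A hand-written covariant conversion, pinned to this one concrete
    // supertype. A generalized supertype constraint would let it be
    // written once, generically, for all T.
    init<U: Animal>(converting other: Box<U>) {
        self.init(value: other.value)
    }
}

let dogBox = Box(value: Dog())
let animalBox = Box<Animal>(converting: dogBox)
```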

This enables things you can't achieve by merely inserting explicit conversions. For example, today you can't map a Publisher existential, even with constrained associated types:

```swift
import Combine

let publisher: any Publisher<String, Never> = Just("")
let mapped = publisher.map { $0.capitalized } // error: not possible today
```

Why isn't this possible? It should even be possible without associated type constraints ($0 would just come in as Any), because the associated types are covariant here.

Because the return type of map is Publishers.Map<Self, Result>. There is a covariant Self in a non-covariant position: a generic parameter of a generic struct. But Self is used strictly covariantly in Publishers.Map. You should be able to "upcast" Publishers.Map<Self, String> to Publishers.Map<any Publisher<String, Never>, String>, with the latter being the static type of mapped. If we could declare it as Publishers.Map<out Self: Publisher, Result>, then we could express that this is possible, and the compiler would enforce it. We'd also need to deal with the fact that the existential doesn't conform to Publisher.
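For reference, the declaration in Combine looks roughly like this, which is where the covariant Self ends up buried in a generic parameter:

```swift
extension Publisher {
    public func map<T>(
        _ transform: @escaping (Self.Output) -> T
    ) -> Publishers.Map<Self, T>
}
```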

Those two problems, the lack of covariance in generic parameters and the inability to express that a generic parameter is restricted to protocol existentials (and thus forbids access to metatype and non-covariant requirements), are the two major reasons we still have to hand-write type erasers in Swift. Even if you could write a supertype-constrained explicit conversion, that wouldn't solve this problem. You have to be able to express the covariance.

I personally believe Swift's primary goal right now should be to eliminate the need for hand-written type erasers. They're not only tedious to write and maintain (and most people, me included for a while, get them wrong: for example, how do you correctly write a type eraser for a protocol with mutating requirements that correctly preserves value semantics?), they're loaded with footguns (they simply can't work properly in casting), and when I have to explain them to junior devs, or even senior ones coming from other languages, they invariably get confused, not just over why they're necessary but over how to use them properly.
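To make the mutating-requirements footgun concrete, here's a minimal sketch of the naive hand-written pattern (the protocol and names are hypothetical):

```swift
protocol Counter {
    var count: Int { get }
    mutating func increment()
}

// The version most people write first: box the value in a class so the
// eraser can forward mutating requirements. It silently breaks value
// semantics, because copies of AnyCounter share the same box.
struct AnyCounter: Counter {
    private final class Box {
        var base: any Counter
        init(_ base: any Counter) { self.base = base }
    }
    private var box: Box

    init(_ base: any Counter) { self.box = Box(base) }

    var count: Int { box.base.count }

    mutating func increment() {
        // Missing copy-on-write: without isKnownUniquelyReferenced and a
        // clone of the box, a mutation through one copy is visible
        // through every other copy.
        box.base.increment()
    }
}
```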
