An inheritance problem

I remember that in earlier versions of Swift the problem of member name collision was unhandled; apparently it still is. The main objections I got are method dispatch and name collision. The former is easy: concrete types always suppress the default implementation of the protocol, and the most recent refinement of a protocol for which a default implementation is given should suppress the older ones (I believe that is the current behaviour). The latter is unlikely to happen (well, it doesn't matter, because naming collisions are still there and aren't going to go away until conformances become named).
There is also the orphanage problem, which I got the impression you do care about, but I don't see how internals and family would ever manifest it: if there is an internal conformance in one module, Swift shouldn't be so dumb as to be unable to resolve the dispatch in a different module.

The main problem here is that extension methods are called using static dispatch, but when you declare them in the protocol a form of dynamic dispatch is used. This is something that needs to be better explained in Swift, as it isn't intuitive for me either. The error should have a better explanation. See: Thuyen's corner

Well, nice :neutral_face:. Any attempts to solve this, @core-team ?

There's a few things here that aren't quite right:

This isn't necessarily a problem, per se, just a place where Swift and Objective-C differ. Non-public methods in one module can never get invoked from another module and are never chosen to satisfy requirements for a conformance in another module.

This is a difference between methods in a protocol declaration and methods in a protocol extension, but I don't think it's relevant here: in this case the Mixin protocol does declare fin() as a requirement and thus a dynamic dispatch entry point.
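
To make the distinction concrete, here is a minimal sketch (the extension-only helper() and the viaGeneric function are assumptions for illustration, not code from the original post). A declared requirement is dispatched through the conformance, while a method that exists only in a protocol extension is bound statically inside generic code:

```swift
// Mixin declares fin() as a requirement, as in the thread.
protocol Mixin {
    func fin() -> Int
}

extension Mixin {
    // Default implementation of the requirement.
    func fin() -> Int { return 42 }
    // helper() is hypothetical: it lives only in the extension, so it is not a requirement.
    func helper() -> Int { return 42 }
}

struct Concrete: Mixin {
    func fin() -> Int { return 20 }     // becomes the witness for the requirement
    func helper() -> Int { return 20 }  // shadows helper() only on the concrete type
}

func viaGeneric<T: Mixin>(_ value: T) -> (fin: Int, helper: Int) {
    // fin() goes through the conformance; helper() resolves to the extension.
    return (value.fin(), value.helper())
}

let result = viaGeneric(Concrete())
print(result.fin, result.helper)  // fin is 20, helper is 42
```

Calling Concrete().helper() directly still returns 20, because the concrete type is known statically at that call site.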

I guess both of these points would be relevant if Swift did use name-based dispatch (like Objective-C does), but it doesn't (when a method's not exposed to Objective-C) because of the problems that @itaiferber brought up. If we're all on the same page about that…


…let's back up. How does protocol inheritance work today? It's essentially a shorthand for an additional requirement on the conforming type. That is, these two are equivalent:

protocol Sub: Base {}
protocol Sub where Self: Base {}

struct Impl: Sub {}

In both cases, anything that's generic over Sub can make use of the value also being a Base, and so there must be a way to take a Sub-constrained value and use it as a Base-constrained value. The implementation of this is for the run-time representation of the Impl: Sub conformance to include a reference to the Impl: Base conformance.
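
A small sketch of what that means in practice (baseName(), describeBase and describeSub are hypothetical names added for illustration): a function that only knows T: Sub can still pass its value to a function that requires T: Base.

```swift
protocol Base {
    func baseName() -> String   // hypothetical requirement, for illustration only
}

protocol Sub: Base {}

struct Impl: Sub {
    func baseName() -> String { return "Impl as Base" }
}

// Only knows that T conforms to Base.
func describeBase<T: Base>(_ value: T) -> String {
    return value.baseName()
}

// Only knows that T conforms to Sub, yet it can forward the value as a Base,
// because the Sub conformance implies (and refers to) the Base conformance.
func describeSub<T: Sub>(_ value: T) -> String {
    return describeBase(value)
}

let description = describeSub(Impl())
print(description)  // "Impl as Base"
```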

Okay, so how would we extend this to "mixin conformances"? Well, it gets tricky. As you note, we could have a rule that whenever you have an Opaque conformance, you can use that to build a Mixin conformance using the default implementations provided. That's totally implementable! However, it gets weird in cases like this:

struct Concrete: Opaque, Mixin {
  typealias T = String
  func fin() -> Int { return 20 }
}

func useOpaque<T: Opaque>(_ value: T) {
  value.fin()
}
useOpaque(Concrete())

In a language like C++, we'd get a different version of useOpaque for each concrete type that gets used, but Swift doesn't work like that. All useOpaque knows is:

  • the run-time concrete type of T, which is Concrete
  • how that type conforms to Opaque

So if it uses the default implementation of Opaque: Mixin (where fin() returns 42), it'll have different behavior from Concrete: Mixin (where fin() returns 20). Hm. But useOpaque doesn't have access to Concrete: Mixin—at least, not without doing dynamic lookup. And dynamic lookup has its own problems (mostly the "what if two modules independently implement the same protocol" problem).
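
For reference, the closest thing to "dynamic lookup" in today's Swift is a conditional cast, which asks the runtime whether the concrete type has a Mixin conformance. A minimal sketch, assuming simplified Opaque and Mixin protocols and a hypothetical finIfAvailable helper (the `any` spelling needs Swift 5.6 or later):

```swift
protocol Opaque {
    associatedtype T
}

protocol Mixin {
    func fin() -> Int
}

struct Concrete: Opaque, Mixin {
    typealias T = String
    func fin() -> Int { return 20 }
}

// Plain is a hypothetical type that is Opaque but has no Mixin conformance.
struct Plain: Opaque {
    typealias T = Int
}

func finIfAvailable<Value: Opaque>(_ value: Value) -> Int? {
    // Statically, only the Opaque conformance is known here.
    // The cast looks for a Mixin conformance of the concrete type at run time.
    return (value as? any Mixin)?.fin()
}

print(finIfAvailable(Concrete()) as Any)  // Optional(20)
print(finIfAvailable(Plain()) as Any)     // nil
```

This only finds conformances that actually exist; it does not synthesize a Mixin conformance from an Opaque one, which is the feature under discussion.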

But let's say we go with dynamic lookup. That still doesn't solve all the problems: what happens in this case?

// Module Base
protocol A {}
protocol B {}

struct Concrete: A, B {}

// Module Mixin
extension A: Mixin {
  func fin() -> Int { return 1 }
}

extension B: Mixin {
  func fin() -> Int { return 2 }
}

Now we have the same problem: which implementation should we use when trying to form Concrete: Mixin? Neither one is obviously better than the other. (This is the same reason why you're not allowed to have a type conform to a protocol with two different sets of conditions—there may not be one that's better than the other.)
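
A rough analogue that compiles today, as a sketch only: `extension A: Mixin` is not valid Swift, so here A and B merely supply candidate implementations of fin() in ordinary protocol extensions, and Concrete declares the Mixin conformance itself. As far as I understand, leaving out Concrete's own fin() would make the conformance ambiguous, because neither candidate is better than the other.

```swift
protocol Mixin {
    func fin() -> Int
}

protocol A {}
protocol B {}

// Two unrelated candidates for fin(); neither is a requirement of A or B.
extension A { func fin() -> Int { return 1 } }
extension B { func fin() -> Int { return 2 } }

struct Concrete: A, B, Mixin {
    // Explicit choice: the type's own member wins over both extension candidates.
    func fin() -> Int {
        // Viewing self as "any A" selects A's extension method statically.
        let asA: any A = self
        return asA.fin()
    }
}

func useMixin<Value: Mixin>(_ value: Value) -> Int {
    return value.fin()
}

let chosen = useMixin(Concrete())
print(chosen)  // 1
```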


None of these problems are insurmountable in the abstract; it would be possible to define behavior in all these cases. But what would be hard is doing it in a way that doesn't compromise performance and doesn't make the language (too much) more complicated than it already is. So the recommended approach today is to build adapters instead:

struct OpaqueMixinAdapter<RawValue: Opaque>: Mixin {
  var rawValue: RawValue
  init(_ rawValue: RawValue) {
    self.rawValue = rawValue
  }

  func fin() -> Int { return 42 }
}

It does mean that anyone who wants to use an Opaque as a Mixin has to hop through your adapter type, but at least the behavior is very clear.
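
A usage sketch for the adapter, with Opaque, Mixin, Concrete and useMixin filled in as minimal assumed definitions so that the snippet stands alone:

```swift
protocol Opaque {
    associatedtype T
}

protocol Mixin {
    func fin() -> Int
}

// The adapter from above, unchanged.
struct OpaqueMixinAdapter<RawValue: Opaque>: Mixin {
    var rawValue: RawValue
    init(_ rawValue: RawValue) {
        self.rawValue = rawValue
    }

    func fin() -> Int { return 42 }
}

// Concrete only conforms to Opaque here.
struct Concrete: Opaque {
    typealias T = String
}

func useMixin<Value: Mixin>(_ value: Value) -> Int {
    return value.fin()
}

// The caller wraps the Opaque value explicitly before using it as a Mixin.
let adapted = useMixin(OpaqueMixinAdapter(Concrete()))
print(adapted)  // 42
```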

One more thing: this is absolutely a pain point, even within a single library. Splitting out AdditiveArithmetic from Numeric is something that couldn't be done today without breaking ABI compatibility because we don't have this mechanism or something like it. But it turns out to be really hard to design in practice due to other decisions that have been made about the language. (Not that I can point to one in particular that is "the problem".) Everything's a set of tradeoffs.
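
For context, AdditiveArithmetic is the standard-library protocol that Numeric refines, so code that only needs zero and + can constrain to it directly. A small sketch (total is a hypothetical helper, not a standard-library function):

```swift
// Works for any type with a zero value and addition, not only Numeric types.
func total<T: AdditiveArithmetic>(_ values: [T]) -> T {
    return values.reduce(.zero, +)
}

print(total([1, 2, 3]))   // 6
print(total([0.5, 1.5]))  // 2.0
```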

I think the problem could be tackled by making the compiler track cyclic dependencies and inform programmers about them, and by allowing a conformance to invoke the desired method by its full path.

extension A: Mixin {
  func fin() -> Int { return 1 }
}

extension B: Mixin {
  func fin() -> Int { return 2 }
}
struct Concrete: A, B {} //collision found. Resolve manually

struct Concrete: Mixin, A, B { func fin () -> Int { return Self.A.fin() } } //1
//now Concrete have implementation from A: Mixin

//or just redeclare it. Both impls from A: Mixin and B: Mixin would be
//suppressed
struct Concrete: Mixin, A, B { func fin () -> Int { 666 } } 
//ok, now implementation is in Concrete: Mixin


//module S
struct Concrete: Mixin {}
//module T
extension Concrete: Mixin { ... } //the most recent refinement
//a single redeclaration within one module should be ok

Is this not all to it? :thinking:

I would like to hear your reply to this first, but meanwhile, I object that the intention to make a language both simple and "system-level" looks like a contradiction to me. Given how dynamic the language already is, knowingly leaving the woes that lie beneath the surface unimproved is not going to do any good to anyone (because, among other things, it restricts expressivity). Solving this particular shortcoming, I believe, would not introduce any new complexity on the surface (syntax), and could probably remain hidden from some users.
Also to note: I spent some time browsing this topic and uncovered a bunch of pain regarding the dispatch facility (this one makes my skin crawl :anguished:), which made me consider not getting into Swift. I hope that this sentiment against improvement is not shared by the majority of the core team.

Speaking from experience, I'm in awe of the level of performance Swift can provide.

I was experimenting with Geometric Algebra a while ago. The details don't really matter, but there's this one add function which is highly generic and works on any class that conforms to the Storage type. The add implementation uses Storage.scalar extensively. I'd been juggling generality and performance for the longest time, then I figured out that I could inline Storage.scalar. The result is that I gained ~1000x performance, on par with manually writing only the necessary mutations by hand. What you suggest would most likely prevent that inlining, since we would need to read the whole program to know which implementation is being used for scalar. So it would make code somewhere 1000x worse.

I won't say that we don't need dynamism at all, but being unnecessarily dynamic is probably not a good thing.

Ah, I cannot be so sure, because I can neither see your code nor confidently state the exact behaviour if correct dispatching were implemented, since it is not rigorously defined yet. But despite that, I already don't see how this would incur performance degradation for static libs (which yours seems to be). Moreover, I think that in the examples I have shown the dynamic dispatch is unnecessary.
All I have talked about can be implemented completely at compile time: the compiler just has to resolve the conformance graph, pick the single witness that is set in the current context (module), and then generate assembly. Nothing prevents it from inlining (as nothing prevents inlining of generic functions and methods now).
To demonstrate:

Prelude
extension A: Mixin {
  func fin() -> Int { return 1 }
}

extension B: Mixin {
  func fin() -> Int { return 2 }
}
struct Concrete: A, B {} //collision found. Resolve manually

struct Concrete: Mixin, A, B { func fin () -> Int { return Self.A.fin() } } //1
//now Concrete have implementation from A: Mixin

//or just redeclare it. Both impls from A: Mixin and B: Mixin would be
//suppressed
struct Concrete: Mixin, A, B { func fin () -> Int { 666 } } 
//ok, now implementation is in Concrete: Mixin
//module S
struct Concrete: Mixin {}
//module T
extension Concrete: Mixin { func fin () -> Int { return 99 } } 
//the most recent refinement
//a single redeclaration within one module should be ok

//at this point if you use module T 
//runtime shouldn't give a damn about all previous 
//refinements of Concrete, because in this module it is explicitly set 
//to be with exact implementation (return 99).
//it just has to construct a correct kind 
//(Archetype it is called?)

So I don't see how this would make it worse rather than better.

PS: I have a hypothesis about how to nail it:
First, the runtime should be separated into two different domains: a private domain (for internal, private, fileprivate code, etc.) and a public domain (for public and open code). Witness picking should then follow this algorithm:

  1. Construct all the code that is internal for any external modules used
  2. Collect all most recent public declarations of witnesses in the current module
//module S
protocol A { func lp () }
struct Tau: A { func lp () {print("Tau from S")} }
//module R
import S
extension Tau: A { func lp () {print("Tau from R")} }
//this is the most recent declaration
//choose this witness and forget about S.Tau.lp()
  3. Descend the hierarchy of refinements and pick the most recent public witness from imported modules
//module S
protocol A { func lp (); func zx () }
extension A { func zx () {print("A from S")} }
struct Tau: A { func lp () {print("Tau from S")} }
Tau().lp() //"Tau from S"
Tau().zx() //"A from S"

//module R
import S
extension Tau: A { func lp () {print("Tau from R")} }
Tau().lp() //"Tau from R"
Tau().zx() //"A from S"
//this is the most recent declaration
//inherit zx() implementation
//choose new witness for lp() and forget about S.Tau.lp()
  4. If any collisions were found, raise an error and make the programmer choose a single witness manually
extension A: Mixin {
  func fin() -> Int { return 1 }
}

extension B: Mixin {
  func fin() -> Int { return 2 }
}
struct Concrete: A, B {} //Error, collision found. Resolve manually

struct Concrete: Mixin, A, B { func fin () -> Int { return Self.A.fin() } } //1
//now Concrete has the implementation from A: Mixin
//Now the compiler can eliminate unnecessary code for this particular type
//(Implementation from B: Mixin gets annihilated)

//or just redeclare it. Both impls from A: Mixin and B: Mixin would be
//suppressed and never get to assembly
struct Concrete: Mixin, A, B { func fin () -> Int { 666 } } 
//ok, now implementation is in Concrete: Mixin
//Destroy A: Mixin and B: Mixin impls

Regarding variables parameterized by an existential, I think the best way is to:

  1. Track all assignments of all valid concrete types to it
  2. Construct a box with the biggest size that could be allocated on the stack (kind of how enums work) or put it on the heap
//module R
var u: A = Tau()
u.lp() //"Tau from R". Yes, the most recent is picked
u.zx() //"A from S"
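
For comparison, a minimal single-module sketch of how a variable of existential type dispatches in Swift today (the methods return strings instead of printing so the result can be checked; the cross-module redeclaration proposed above is not expressible, so only the S side is shown):

```swift
protocol A {
    func lp() -> String
    func zx() -> String
}

extension A {
    // Default implementation of the zx() requirement.
    func zx() -> String { return "A from S" }
}

struct Tau: A {
    func lp() -> String { return "Tau from S" }
}

// The concrete type is erased; calls go through Tau's conformance to A.
let u: any A = Tau()
print(u.lp())  // "Tau from S"
print(u.zx())  // "A from S"
```

Both calls are resolved through the single Tau: A conformance stored alongside the value, which is why a second, module-specific conformance would need extra rules.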

To say a bit about generic functions: behaviour that differs from this is basically a deficiency of the current implementation:

//module R
func useA<T: A>(_ value: T) {
  value.lp()
}
useA(Tau()) 
//if it prints "Tau from R" then we are golden
//otherwise it is just a blatant bug, because the witness is explicitly set to be
//{print("Tau from R")}

Let's jump back to the Mixin example. How can I allow fin to be inlined? Suppose I have this:

// Module A
import Mixin

protocol A: Mixin {...}
extension A { /* Implements Mixin */ }

func take(a: A) {...}
struct Concrete: A {...}

// Application
import A

protocol B: Mixin {...}
extension B { /* Implements Mixin */ }

We want to compile module A into a binary, and ship it to Application, which is doable. Now, what should happen if I do this in the Application module?

extension Concrete: B {...}

If it shouldn't be allowed, I'm not sure of the utility of the manual resolution that we've been working hard for.

If it should be manually resolved, things get interesting. What should happen if I call take(a: Concrete()) from Application? It'd surely use the resolved implementation. What if I call it from within A? It could

  • Use the resolved implementation. This will surely prevent inlining when compiling a binary for A, since it can't ascertain the actual fin at that time.
  • Use the A.mixin implementation. It's a little weird, but somewhat reasonable.

Either way, welcome to An Implementation Model for Rational Protocol Conformance Behavior

I updated the reply, might wanna take a look.

This is obviously a diamond (collision) so you would have to manually choose whose implementation you want to use in the current module.

If you call it from the application module, it uses the witness from that context, unless specified otherwise.

I see you chose 2:

which I'd deem to have drifted a bit away from the simplicity that you seemed to allude to in the beginning. Nonetheless, that style of resolution does gain some support in the link I posted above. Not to say that it's easy to implement, though.

To clarify:

// Module A
import Mixin

protocol A: Mixin {...}
extension A { /* Implements Mixin */ }

func take(a: A) {...}
struct Concrete: A {...}

// Application
import A

protocol B: Mixin {...}
extension B { /* Implements Mixin */ }

extension Concrete: B {...} 
//that uses Application.Mixin implementation,
//because it is the most recent refinement
take (a: Concrete()) //uses implementation of Mixin from the current module
//because it has been explicitly redeclared

Couldn't you refine to all types conforming to CustomStringConvertible or CustomDebugStringConvertible or TextOutputStream? There are plenty of existing protocols which might give you the functionality you need, and most of the built-in types conform to them.

This is a basic hack. It doesn't solve the real problem, which has been acknowledged by many people on this site. Besides that, I want all other types to opt in to conform, while, as I already wrote, extending CustomStringConvertible and the like provides a default implementation to only a limited set of types.

Alright, I see what you mean. I was going to suggest "\(anything)" as your final solution.

The problem is that in my example, there's no one place that's at fault. This code is obviously valid:

protocol A {}
protocol B {}

struct Concrete: A, B {}

And we want this code to also be valid:

protocol Mixin {
  func fin() -> Int
}

extension A: Mixin {
  func fin() -> Int { return 1 }
}

extension B: Mixin {
  func fin() -> Int { return 2 }
}

Neither of them has done anything wrong. What about this code?

func useMixin<Value: A>(_ value: Value) {
  value.fin()
}

That should be okay. And this?

useMixin(Concrete())

If we don't look at the body of useMixin, this should be okay too. That means there's nowhere to emit the error about the collision.


There's a premise in here that I haven't stated, which is that supporting separate compilation is a design goal for Swift. Unlike several other modern compiled languages (mainly thinking of Rust), the Swift compiler cannot see the entire program at compile time. There are a few reasons for this, but most of them have to do with how Apple ships its OS: using dynamically-linked closed-source libraries that preserve binary compatibility across versions. So there will always be libraries where all the compiler can see is the interface.

When languages with separate compilation want to do cross-module analysis, they typically use some kind of runtime support for this, possibly even a JIT. While Swift doesn't currently have any sort of JIT, it does have plenty of runtime functions that set up various bits of metadata on first use. So it's totally fine to suggest features that require runtime support, though if it needs a full-on JIT the current runtime isn't set up for that.

Uhmm, what do you mean by that? I don't see any reason why the compiler wouldn't be able to emit an error about a collision. I think about it in terms of a scope that is some module, and that module should be the single point of witness resolution. See:

//module A
protocol A {}
protocol B {}
struct Concrete: A, B {}

//module B - imagine that it is a current root of compilation
protocol Mixin {
  func fin() -> Int
}
extension A: Mixin {
  func fin() -> Int { return 1 }
}
extension B: Mixin {
  func fin() -> Int { return 2 }
}
func useMixin<Value: A>(_ value: Value) {
  value.fin()
}
useMixin(Concrete())
//aha! usage of type with vague witness found.
//compiler now can emit error about that
//in current module the witness for fin() of Concrete
//is unclear, so you should resolve manually
extension Concrete: Mixin { func fin() -> Int { return Self.A.fin() } }
//ok in current module Concrete is explicitly declared to 
//have witness from Concrete: Mixin.

Regarding dynamic linking, I don't see that it is related to this because using a dynamic library is basically putting everything through a layer of indirection. Also, every library has to be compiled at some point (right?) and it should be totally sane to construct a separate witness record for each context (module) - from the example above: module A gets its own witness record, module B gets its own (witness data can be even shared between them).

//module A
protocol A {}
protocol B {}
struct Concrete: A, B {} 
//witness record of Concrete is unique to current context (module A)

//module B
protocol Mixin {
  func fin() -> Int
}
extension A: Mixin {
  func fin() -> Int { return 1 }
}
extension B: Mixin {
  func fin() -> Int { return 2 }
}
extension Concrete: Mixin { func fin() -> Int {140} }
//this Concrete gets new/modified witness record
//that is unique to module B

So for example, when you need to import a dynlib, the code that uses objects from module B is routed to the witness records from that context, and other code that uses objects from module A gets routed to that module's witness records. There is no problem here.


That is perfectly fine and not a problem: if a library has an internal witness (a private extension and the like), it should never conflict with code from another module/file

//module S
public struct Zeta {}
internal extension Zeta: Equatable { ... }

//module T
import S
internal extension Zeta: Equatable {} 
//this is other module, a different witness table here!

That is disappointing! Does it imply that Swift is stuck forever with crappy dispatch semantics?

A lot of this has already been heavily discussed in the thread I mentioned:

I would suggest that you read through it. I know it's long and arduous, but I personally find the thread very entertaining to say the least.