C++ Abstract Class Inheritance and C++-Interop (to Swift Protocols)

plotfi · September 7, 2022, 8:28pm

Hello Everyone. Recently I was having discussions with @cipolleschi about some of the C++ language constructs they are looking at for their first pass at experimenting with C++-Interop. We saw quite a lot of inheritance with virtual functions, but in many of the cases these were purely abstract classes. Currently with C++-Interop we import the following:

struct P { virtual int f() const = 0; };
struct C : public P { virtual int f() const override; };

as

struct P {
  init()
  func f() -> Int32
}
struct C {
  init()
  func f() -> Int32
}

In this instance, since the parent class only has pure virtual methods, with no member variables and no custom ctors/dtors, I could see how the parent could conceivably be imported as a protocol. So instead we would get the following:

protocol P {
  func f() -> Int32
}
struct C : P {
  init()
  func f() -> Int32
}
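For illustration only, the condition described above (abstract, no member variables) can be roughly approximated on the C++ side with standard type traits. This is a sketch under assumptions, not what the importer does: the name looks_like_pure_interface is hypothetical, the sizeof comparison assumes an ABI where a class with virtual functions and no data members is exactly one vtable pointer in size, and type traits cannot verify that every method is pure virtual or that no constructor is user-provided. WithState and the body of C::f are made up for the example.

```cpp
#include <cassert>
#include <type_traits>

struct P { virtual int f() const = 0; };
struct C : public P { int f() const override { return 1; } };  // hypothetical body
struct WithState {                                              // hypothetical: abstract but stateful
    virtual int f() const = 0;
    int cached = 0;
};

// Rough heuristic: the type is abstract and no larger than one vtable pointer.
// Assumption: the ABI stores exactly one vptr in a class like P.
template <typename T>
constexpr bool looks_like_pure_interface() {
    return std::is_abstract<T>::value && sizeof(T) == sizeof(void*);
}

static_assert(looks_like_pure_interface<P>(), "P is a protocol candidate");
static_assert(!looks_like_pure_interface<C>(), "C is concrete");
static_assert(!looks_like_pure_interface<WithState>(), "WithState carries data");
```

A real check would have to live in the importer, where the Clang AST can tell whether each method is pure and whether any constructor or destructor is user-provided.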

I would like to know if there are any strong technical or semantic reasons why this should not be the case?

Thanks

cipolleschi · September 8, 2022, 7:59am

Thanks Puyan for raising this concern.

To add some other information on top of this, I think that the compiler knows that the two are somehow related, because the error that is raised is

error build: Ambiguous use of '<name of the method>'

So it finds two methods with the same name and it doesn't know how to resolve the call.

Moreover, I'm a bit worried about the current decision to map structs that are related by inheritance to separate, unrelated structs. This prevents us from passing C, in the example, to a function that accepts a parameter of type P, the superclass.
It also does not allow us to use superclasses as the types of variables. For example:

// C++
struct P { virtual int f() const = 0; };
struct C : public P { virtual int f() const override; };
struct D : public P { virtual int f() const override; };

// Swift
let a: [P] = [C(), D()]

The last line will fail because C and D are imported as separate, unrelated structs. Moreover, an array of P could never be non-empty, because we can't instantiate P.
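For comparison, this is roughly how the same heterogeneous collection is written on the C++ side: the elements are pointers to P, never P values, and that is the piece the struct-per-class import loses. A minimal sketch; the method bodies and the helper sum_f are hypothetical and only there to make it self-contained.

```cpp
#include <cassert>
#include <vector>

struct P { virtual int f() const = 0; };
struct C : public P { int f() const override { return 1; } };  // hypothetical body
struct D : public P { int f() const override { return 2; } };  // hypothetical body

// Holds non-owning pointers, so P does not need a virtual destructor here.
int sum_f(const std::vector<const P*>& items) {
    int total = 0;
    for (const P* item : items) {
        total += item->f();  // dispatched through the vtable
    }
    return total;
}
```

Given `C c; D d;`, the call `sum_f({&c, &d})` reaches both C::f and D::f through the base pointer.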

I can see two different approaches to solve the problem:

1. Are we considering the idea of adding annotations in C++? For example, a pure virtual class could be annotated as @pureVirtual (or something like that) so the compiler knows that the class can safely be converted to a protocol.
2. A more computationally expensive solution could be that, every time the compiler finds an inheritance clause, it crawls the hierarchy upward and converts all the structs into classes. The problem here would be that C++ allows diamond inheritance while Swift doesn't... so it could get tricky.

Any thought on these?

michelf · September 8, 2022, 1:56pm

Maybe the polymorphic nature of a C++ type could be encapsulated in a protocol to which concrete types would conform:

// C++
struct P { virtual int f() const = 0; };
struct C : public P { virtual int f() const override; };
struct D : public P { virtual int f() const override; };

// Swift
protocol VirtualP { func f() -> Int32 }
// struct P is abstract so don't emit concrete type for P

protocol VirtualC : VirtualP { }
struct C : VirtualC { func f() -> Int32 {...} }

protocol VirtualD : VirtualP { }
struct D : VirtualD { func f() -> Int32 {...} }

let a: [VirtualP] = [C(), D()]

This way, hierarchies can exist for non-abstract types, and it makes diamonds expressible.

cipolleschi · September 8, 2022, 4:29pm

Hi @michelf, thanks for the answer.

Could you explain why we need the lines:

protocol VirtualC : VirtualP { }
protocol VirtualD : VirtualP { }

??

Are they potentially required to support diamond inheritance?

michelf · September 8, 2022, 5:51pm

The VirtualC protocol is not very useful in the absence of another struct derived from C, but if there was one it would be used to represent inheritance from C.

Since C is not declared final, there could effectively be another struct that we just don't have visibility into but which gets exposed as a C via a C++ pointer/reference somewhere.

Let's make it a bit more complicated, with a diamond:

   P
 /   \
C     D
|     |
E     /
 \   /
   F

// C++
struct E : public C { };
struct F : public D, public E { };

// Swift
protocol VirtualE : VirtualC { }
struct E : VirtualE { }

protocol VirtualF : VirtualD, VirtualE { }
struct F : VirtualF { }

let a: [VirtualP] = [C(), D(), E(), F()]
let b: [VirtualC] = [C(), E(), F()]
let c: [VirtualD] = [D(), F()]
let d: [VirtualE] = [E(), F()]
let e: [VirtualF] = [F()]

Each struct is still independent of the others (from Swift's perspective), so you can't put an E somewhere that expects a C, but you can put an E somewhere that expects a VirtualC.

I think I should have called this PolymorphicC instead, as it's not necessarily related to C++'s virtual.

I wonder... can we model C++ reference semantics in a way that supports inheritance using protocols like this? For instance:

// C++
void byCopy(C arg);
void byRef(const C & arg);

// Swift
func byCopy(_ arg: C)
func byRef(_ arg: VirtualC)

Here byCopy can only accept a concrete C and byRef can accept either a C, E, or F. Note that with byCopy you'd get slicing behavior in C++, so even in C++ the function effectively can't deal with derived types, whereas byRef can.
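The slicing behavior mentioned above can be seen in a few lines of C++. This is only a sketch: the thread's E has no override of its own, so a hypothetical one is added here to make the difference observable, and the method bodies are made up.

```cpp
#include <cassert>

struct P { virtual int f() const = 0; };
struct C : public P { int f() const override { return 1; } };  // hypothetical body
struct E : public C { int f() const override { return 2; } };  // hypothetical override

// Takes a copy: only the C part of the argument survives (slicing),
// so the call below always reaches C::f.
int byCopy(C arg) { return arg.f(); }

// Takes a reference: the dynamic type of the argument is preserved,
// so the call is dispatched to the most derived override.
int byRef(const C& arg) { return arg.f(); }
```

Passing an E to byCopy yields 1 (C::f), while passing the same E to byRef yields 2 (E::f), which is why mapping the by-value parameter to the concrete C and the by-reference parameter to the protocol lines up with what C++ already does.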

jrose · September 9, 2022, 1:01am

This is not quite my area of expertise, but I'll try to lay out my implementation guidance anyway. There's a lot of trickiness here, because Swift uses protocols in lots of ways:

- foo as? SomeProto to check conformance. This probably can't be done safely at all for arbitrary foo, so if you wanted to allow it for arbitrary C++ structs you'd have to have the compiler check something about the static type of foo (the precise constraint eludes me at the moment). I also don't remember whether dynamic_cast works if you turn off arbitrary RTTI in C++, and this probably has the same restriction.
- foo: any SomeProto to pass around values, roughly equivalent to std::shared_ptr<SomeProto> or std::unique_ptr<SomeProto>. Note the ownership part: this is a self-contained indirected value, not a reference. I don't actually see any problem implementing this and thunking at the FFI boundary, but you'd have to be very careful about how you implement the "copy" operation for values of this type. Unlike Swift protocols (but like Objective-C protocols), there's no need to pass a separate witness table; you're "just" using the class's intrinsic vtable. (Though there may be some implementation complexity for upcasts adjusting the pointer base offset…)
- foo: any SomeProto + Equatable: I don't think compositions make things any harder, except that if you compose two C++ protocols you might have vtable fixups. (Hm, maybe the "witness table" for a C++ protocol is just the base offset.)
- foo: some SomeProto for arguments. I think this actually ends up looking a lot like any SomeProto representation because, again, you don't have a separate witness table. You do need the basic value witness table around to copy or destroy the value, and that doesn't have to be packed in with the value like it does for any SomeProto, but it shouldn't be a problem.
- foo: some SomeProto for return values: pretty much the same reasoning, except that the value has to be owned.
- Foo: SomeProto - obviously arbitrary types cannot adopt "C++ protocols" after the fact.
- T: SomeProto for (unconditional) generic constraints: I think this imposes basically no run-time requirements, the same way as Objective-C protocols, and presumably we will already have run-time representations of C++ types for things like print(T.self), with or without specifying a base class / protocol.
- Foo: Equatable where T: SomeProto: sadly and weirdly there are places where this turns into run-time-type-checking, and I don't know if C++ supports that (a run-time check whether one type subclasses another type).
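On the C++ side, the closest thing to foo as? SomeProto is dynamic_cast, which needs a polymorphic static type and RTTI. The relation between two types, as opposed to a value and a type, is available through std::is_base_of, but only at compile time. A small sketch of both, where the helper is_a_C and the method bodies are hypothetical:

```cpp
#include <cassert>
#include <type_traits>

struct P { virtual int f() const = 0; };
struct C : public P { int f() const override { return 1; } };  // hypothetical body
struct D : public P { int f() const override { return 2; } };  // hypothetical body

// Run-time check on a value. This downcast relies on RTTI; GCC and Clang
// reject it when compiling with -fno-rtti.
bool is_a_C(const P& value) {
    return dynamic_cast<const C*>(&value) != nullptr;
}

// Type-to-type check: resolved entirely at compile time.
static_assert(std::is_base_of<P, C>::value, "C derives from P");
static_assert(!std::is_base_of<C, D>::value, "D does not derive from C");
```

Note that the run-time check only works because the static type P is polymorphic, which matches the point that the compiler would have to know something about the static type of foo.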

I think I got everything.

But a downside: once you have done all this, you still won't be able to say "C and all its subclasses conform to Swift.Hashable", because Swift conformances can't be added to protocols retroactively. (This is not just a current limitation; it's Very Hard for a language with library evolution and run-time-instantiated generics.) Admittedly you can't do that with independent structs either, but that's definitely a reason to push towards representing single-inheritance hierarchies of classes with virtual methods as classes in Swift, not protocols.

Also non-virtual diamond inheritance absolutely screws this all up, because Swift says you can always statically upcast from a concrete type to a protocol and that a type can only conform to a protocol in one way. I don't think you're going to get around that no matter what implementation strategy you pick.
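To make the diamond problem concrete, here is the hierarchy from earlier in the thread with non-virtual inheritance. It is a sketch with hypothetical method bodies and helper names (f_via_D, f_via_E, has_two_P_subobjects): F ends up containing two distinct P subobjects, and which f you get depends on the path taken.

```cpp
#include <cassert>

struct P { virtual int f() const = 0; };
struct C : public P { int f() const override { return 1; } };  // hypothetical body
struct D : public P { int f() const override { return 2; } };  // hypothetical body
struct E : public C { };
struct F : public D, public E { };

// F contains two P subobjects, so calling f() directly on an F and
// converting an F straight to P are both ambiguous; a path must be chosen.
int f_via_D(const F& x) { return static_cast<const D&>(x).f(); }
int f_via_E(const F& x) { return static_cast<const E&>(x).f(); }

// True when the two upcast paths land on different P subobjects.
bool has_two_P_subobjects(const F& x) {
    const P* through_d = static_cast<const D*>(&x);
    const P* through_e = static_cast<const E*>(&x);
    return through_d != through_e;
}
```

For an F, f_via_D returns 2 and f_via_E returns 1, so in Swift terms F would conform to the protocol for P in two different ways, which is exactly what a single conformance cannot express.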

cipolleschi · September 9, 2022, 12:22pm

I'm beginning to understand why it is so complicated!

I don't know which philosophy we are following for the interoperability mapping. But, to start with, it would be nice to manage the hierarchy without mixing the languages.
The problem I'm facing is that I have a parent class and a subclass, both in C++-land, and I'd like to call a method on the subclass after instantiating it from Swift. However, the compiler gets confused because it finds two candidates for the same function. It shouldn't, because the concrete class is known at compile time.

We would be fine, at the beginning, if inheritance worked properly within C++ alone: we could ask users to write a bit of C++ when they have to subclass a class that lives in C++, as long as they can instantiate and use it from Swift.