Adding C++ hierarchy information to the Swift AST

pschuh · August 27, 2019, 5:50pm

In "Add support to upcast for c++ classes." (Pull Request #26861, apple/swift), it was suggested that the Swift AST should have the C++ hierarchy information. Would that be an auxiliary structure like ObjCMethodLookupTable which maintains the base class information? Or would that be just some methods that use the clang decl to get the necessary base class information and then use the clang importer to (potentially lazily) load the base classes?

@jrose

jrose · August 27, 2019, 6:14pm

I meant that people are going to want to go from Derived * to Base * in Swift and we ultimately need to decide how that's represented in the language; the existing UnsafePointer API definitely does not cover it. I hadn't really thought about the compiler implementation side.

pschuh · August 27, 2019, 6:22pm

Hmm. I suppose on the syntax side, if it were possible to have a protocol like CXXSubclass, to which all C++ subclasses of T would conform, then there could be a cxx_upcast() builtin to do the actual upcast.

UnsafePointer could then have an upcast function, conditioned on conformance to CXXSubclass, that calls that builtin.

Of course, the use case for upcast that I was initially planning is one where you call a method on a base class of a C++ type; when emitting that call, it really needs to be upcast(v).some_method(...). I would hope that could be implicit and not require the upcast explicitly.

jrose · August 27, 2019, 6:38pm

Yeah, the brute force method of mirroring all inherited methods onto the subclass would work.

Another thing that would work, for some value of "would work": a C++ class is a protocol with a single-pointer existential representation. Upside: super easy to handle vtable dispatch. Downside: breaks nearly everything else, including C interop.

pschuh · August 27, 2019, 7:12pm

I'm not so sure about just mirroring all the methods. There needs to be some this-ptr adjustment in the case of multiple base classes. I would think we would want the this-ptr adjustment to be an explicit upcast in SIL? If we just clone all those decls, the IRGen code for dispatching a non-virtual method call would need to know about the this-ptr adjustment.
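To make the this-ptr adjustment concrete, here is a minimal C++ sketch. The types BaseA, BaseB, and Derived and the helpers upcast_offset, call_get_b, and check_adjustment are hypothetical names for illustration only; nothing here comes from the PR. It only shows that converting a Derived pointer to a non-first base changes the address, which is why a cloned method decl alone is not enough to call the base method correctly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; they do not come from the PR.
struct BaseA { int a = 1; int get_a() const { return a; } };
struct BaseB { int b = 2; int get_b() const { return b; } };
struct Derived : BaseA, BaseB { int d = 3; };

// Byte offset the compiler applies when converting Derived* to Base*.
template <typename Base>
std::ptrdiff_t upcast_offset(Derived *d) {
  Base *base = static_cast<Base *>(d);  // the adjustment happens here
  return reinterpret_cast<char *>(base) - reinterpret_cast<char *>(d);
}

// Calling a non-virtual BaseB method through a Derived pointer: the
// static_cast makes the this-ptr adjustment an explicit, separate step.
inline int call_get_b(Derived *d) {
  BaseB *adjusted = static_cast<BaseB *>(d);
  return adjusted->get_b();
}

// Small self-check: the two non-empty bases live at different offsets.
inline void check_adjustment() {
  Derived d;
  assert(upcast_offset<BaseA>(&d) != upcast_offset<BaseB>(&d));
  assert(call_get_b(&d) == 2);
}
```

On common ABIs the first base sits at offset 0 and the second at a non-zero offset, but the exact values are layout-dependent, so the sketch only checks that they differ.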

compnerd · August 27, 2019, 8:49pm

Just some random thoughts, without having given this the depth of consideration that it deserves, so this will probably break down horrendously.

I believe that the following cases are the ones that need to be considered:

static_cast: the types are known and can be statically converted. Can we achieve this as an "optimization" by doing this check statically in a SIL-level pass? Doing that lets us fold this into the next cast type. If we actually track the clang Decl, we can easily verify the inheritance as needed.

dynamic_cast: this is the only cast that I think matters to Swift for C++ interop. This requires checking the type (if RTTI is enabled; with RTTI disabled, dynamic_cast cannot be used) and returning the cast pointer or nullptr. This is modelled in the language as the as cast that is traditionally done. This actually does require changing the underlying runtime calls here.

reinterpret_cast: this can be left as an unsafe memory operation as it is. The reinterpretation of the memory is then effectively made explicit in the language.

C-style casts: select the correct cast, or fall back to the reinterpret_cast equivalent, which is what a C-style cast syntactically amounts to.

Basically, what this boils down to is: keep the CXXRecordDecl associated with the imported type, and perform a SIL pass over the casts to see whether they can be resolved statically. If so, great, replace the cast with the inline conversion. Failing that, treat it as a dynamic_cast: use the C++ runtime's casting to cast the pointer at runtime, and return the value as the optional that as returns. That makes this fit naturally into the language and provides a reasonable point for the casting to be handled.
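As a rough C++ illustration of the "dynamic_cast returns an optional" idea above, the sketch below wraps dynamic_cast so that failure yields an empty std::optional instead of nullptr. The types Shape, Circle, and Square and the helpers cast_or_none and check_casts are hypothetical; this is not how the Swift runtime does or would implement it, and it assumes RTTI is enabled.

```cpp
#include <cassert>
#include <optional>

// Hypothetical polymorphic hierarchy for illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Checked downcast: returns the cast pointer on success and an empty
// optional on failure, loosely mirroring a conditional cast.
template <typename To, typename From>
std::optional<To *> cast_or_none(From *p) {
  if (To *q = dynamic_cast<To *>(p)) return q;
  return std::nullopt;
}

// Small self-check: a Circle seen through a Shape* casts back to Circle
// but not to Square.
inline void check_casts() {
  Circle c;
  Shape *s = &c;
  assert(cast_or_none<Circle>(s).has_value());
  assert(!cast_or_none<Square>(s).has_value());
}
```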

When performing a method dispatch, if the method is known to be implemented directly by the type, a regular method call can be done; otherwise, make it a regular C++ virtual dispatch. That would allow the regular behaviour of C++ method dispatch to be maintained as well.
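The difference between the two dispatch kinds can be seen in plain C++, where a qualified call bypasses the vtable. This is only a sketch with hypothetical names (Logger, VerboseLogger, virtual_call, direct_call, check_dispatch); it says nothing about how the importer would actually pick between them.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration.
struct Logger {
  virtual ~Logger() = default;
  virtual int level() const { return 1; }
};
struct VerboseLogger : Logger {
  int level() const override { return 2; }
};

// Regular C++ virtual dispatch: resolved through the dynamic type.
inline int virtual_call(const Logger &l) { return l.level(); }

// Direct (non-virtual) call: the qualified name pins the implementation.
inline int direct_call(const Logger &l) { return l.Logger::level(); }

// Small self-check: the same object gives different answers depending
// on how the call is dispatched.
inline void check_dispatch() {
  VerboseLogger v;
  assert(virtual_call(v) == 2);
  assert(direct_call(v) == 1);
}
```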

Upcasts are generally unsafe and go through an explicit pointer cast - which is possibly a solution here - allow the user to take the address of the type, and assign to the explicit type that they are upcasting to. However, does the upcasting really take into account the this pointer adjustment thunk invocation?

pschuh · August 27, 2019, 9:12pm

The upcasting as I've implemented it in the PR is always safe, presuming the pointer is null or valid. I dispatch out to clang to do the proper this-ptr adjustment. I agree that reinterpret_casts are already handled as you say. Actually, all of these casts can be done now with inline helper functions in a header file that do the proper call. The reason why I've focused on upcasting (a subset of static_cast) first is that I think it is necessary for being able to call methods on base classes. Once we get to the point of adding sugar for casts, your "lower as to dynamic_cast" seems pretty reasonable to me. The one caveat is that if we're defining sugar, I would hope it could be general enough that it could eventually hook up somehow with the llvm::isa / llvm::cast casting system.
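For reference, the "inline helper functions in a header file" workaround might look roughly like the following. The names upcast_to, dog_as_animal, Animal, Tagged, Dog, and check_upcast are hypothetical, and this is only a sketch of the approach, not code from the PR. The implicit derived-to-base conversion lets the C++ compiler do the this-ptr adjustment and keeps a null pointer null.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types for illustration.
struct Tagged { int tag = 7; };
struct Animal { int leg_count = 4; int legs() const { return leg_count; } };
struct Dog : Tagged, Animal {};

// Generic upcast helper: the implicit pointer conversion performs any
// needed this-ptr adjustment, and a null input stays null.
template <typename Base, typename From>
inline Base *upcast_to(From *p) {
  static_assert(std::is_base_of<Base, From>::value,
                "upcast_to requires Base to be a base class of From");
  return p;
}

// Non-template wrapper, the kind of function an importer that cannot
// see templates could call directly.
inline Animal *dog_as_animal(Dog *d) { return upcast_to<Animal>(d); }

// Small self-check covering the valid and null cases.
inline void check_upcast() {
  Dog d;
  assert(dog_as_animal(&d)->legs() == 4);
  assert(dog_as_animal(nullptr) == nullptr);
}
```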

John_McCall · August 27, 2019, 9:27pm

If we want to provide a native C++ experience in Swift, then we can map all the C++ cases to implicit conversions, as, as!, and as? as necessary. It's not clear to me that we actually want to do that, but we certainly could, using unique or borrowed ownership as appropriate.

But right now, this all seems extremely premature to me.

pschuh · August 27, 2019, 9:58pm

Hi John,
Part of this thread is about how to look up (and call) base class decls from a subclass. This will probably involve some form of upcast in SIL (similar, perhaps, to the this-ptr adjustment in cxx_virtual_method) to make sure the this ptr is properly adjusted (even for static methods, and even if we blindly import/clone all the methods from the base class, as Jordan proposes). Do you have any comments about that? Do you think that there is some C++ feature that would be better to work on first?

John_McCall · August 27, 2019, 10:43pm

The this adjustment in cxx_virtual_method is basically non-semantic. In the MSVC ABI, the adjusted this is not necessarily pointing at any meaningful subobject, and the callee just has to compensate for the expected meaningless adjustment done by the caller. So that's quite different, even if theoretically we could use it to perform semantic adjustments as well.

It would be sensible to provide SIL instructions for upcasting and downcasting C++ pointers regardless of what we expose in the AST / language.