C++ Interop

I've been looking into C++ interop. Right now it is mostly straightforward bug fixes, as I'm following the plan laid out by @Douglas_Gregor and @John_McCall here: C++ / Objective-C++ Interop - #2 by Douglas_Gregor. However, I foresee some design questions in this space that will need discussion.

I have some meta-questions about proceeding with work on this feature:

  • Are there any similar features that I could draw inspiration from when engaging in design discussions?
  • How should I proceed towards writing a manifesto?
  • Should the individual features be discussed independently, or as part of one long design discussion?

Here is an example of the early stage design questions that will need discussion:

  • C++ types imported as Swift type objects have C++ namespaces as their decl context; how should I mangle the namespace into the generated __C.TypeName? These types are also accessible via using declarations that import them into the global namespace.

Some slightly later design questions:

  • Should names in namespaces be exposed as `namespace_name::name`, or should we import a namespace object on which you can then perform member lookup?
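
For readers less familiar with the C++ side of these namespace questions, here is a minimal sketch of the situation. The names geometry, Point, manhattan, and demo_namespace_lookup are hypothetical and only illustrate why one type can be reachable under two names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical library header: the type's declaration context is a namespace.
namespace geometry {
struct Point {
    int x;
    int y;
};

inline int manhattan(const Point &p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
}  // namespace geometry

// A using declaration makes the same type reachable from the global namespace.
using geometry::Point;

// Both spellings name one entity, so an importer has to decide which name
// (and which mangling) the Swift-side type receives.
static_assert(std::is_same<Point, geometry::Point>::value,
              "the using declaration introduces a name, not a new type");

inline int demo_namespace_lookup() {
    Point p{3, -4};                  // found through the using declaration
    assert(p.x == 3 && p.y == -4);
    return geometry::manhattan(p);   // qualified lookup into the namespace
}
```

Calling demo_namespace_lookup() exercises both lookup paths; whether Swift should see this as a flat name or as a member of a namespace object is exactly the open question.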

Hi Parker,

As you know, I'm super thrilled that you're pushing on this. I'd start with the general plan outlined by Doug in the other email: get the existing feature set to work in C++ mode, then start turning on the uncontroversial things (e.g. operator overloads in C++, methods on POD types, etc). At the same time, we can open specific design discussions on non-obvious things: e.g. how are namespaces imported (single member enums? ignored? mangled into the type names? something else?)

The thing to shoot for is to make sure the importer ignores anything that cannot be imported uncontroversially: it is perfectly fine to drop them on the floor, since they wouldn't have come in in C mode anyway.

I'm super thrilled to see this progressing!

-Chris

Thanks for pushing this forward @pschuh! I've been playing with this idea for the last few days without knowing there's anyone else driving it at this time.

My initial plan was to start with simple codegen tools that would be able to use ClangSwift to scan C++ declarations and generate C wrappers around them. These C wrappers could be imported into Swift code as proper types thanks to the implemented "Import as Member" proposal, although this would only work for simple classes without destructors. For destructors, this would probably need another class wrapper on the Swift side that would call the destructor function from deinit. This would still not allow using C++ templates as generics, but it could be a start.
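
As a rough illustration of the kind of wrapper such a tool might generate, here is a minimal sketch. Counter, CounterRef, the Counter* functions, and demo_wrapper are all hypothetical names, and the annotations needed for "Import as Member" are left out.

```cpp
#include <cassert>

// Hypothetical C++ class that a code generator would scan.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void increment() { ++value_; }
    int value() const { return value_; }

private:
    int value_;
};

// Generated C wrapper: an opaque handle plus free functions.
extern "C" {
typedef struct CounterRef CounterRef;  // opaque to C and Swift callers

CounterRef *CounterCreate(int start) {
    return reinterpret_cast<CounterRef *>(new Counter(start));
}

void CounterDestroy(CounterRef *ref) {
    delete reinterpret_cast<Counter *>(ref);
}

void CounterIncrement(CounterRef *ref) {
    assert(ref != nullptr);
    reinterpret_cast<Counter *>(ref)->increment();
}

int CounterGetValue(const CounterRef *ref) {
    assert(ref != nullptr);
    return reinterpret_cast<const Counter *>(ref)->value();
}
}  // extern "C"

// Exercises the wrapper the way a client of the C API would.
inline int demo_wrapper() {
    CounterRef *counter = CounterCreate(40);
    CounterIncrement(counter);
    CounterIncrement(counter);
    int result = CounterGetValue(counter);
    CounterDestroy(counter);  // the call a Swift class wrapper would make from deinit
    return result;
}
```

The explicit CounterDestroy call is the part that a Swift-side class wrapper would have to own, which is why destructors need the extra layer described above.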

As far as I know, this is similar to what rust-bindgen is doing, although it's a bit easier for them thanks to the availability of macros in the language.

Obviously, proper support in the Swift compiler itself would be great, but I imagine it would require more expertise in the compiler itself. IIUC, this would require extending the existing ClangImporter?

Another direction this could go is preparing a proposal for C and C++ calling convention attributes on Swift declarations (only non-generic Swift functions to start with?). This would probably require such functions to be mangled by C++ rules when exposed to C++. I'd expect this to work similarly to the existing @objc and the unofficially supported @_cdecl attribute. Maybe we could have a single @ffi (name for bikeshedding) attribute that could be reused for interop with other languages, not just C and C++?

I'm really excited about the possibility of interfacing with C++ from Swift, as this would unlock building more powerful tools that interact with the Swift compiler directly. We have SwiftSyntax, which uses a mostly manually written C wrapper, but I can't wait to be able to use other compiler modules like AST and Sema directly for Swift. Who knows, maybe when that's all working, we could consider a possibility of a bootstrapped Swift compiler? :crossed_fingers:

Hi Max,
These all sound like interesting future directions. I have some local patches that convince me that it is relatively straightforward to support basic importing in ClangImporter. This lets you write your "C wrapper" inline in the C++ header file, using the C++ types directly in the signature of the wrapper. Things C++ does not understand are just ignored. Then, over time, the verbosity of this wrapper can be reduced. Most of the benefit will come from the first couple of features, e.g. not having to wrap basic methods.

Another example of a thing to discuss:
This is a general C-interop issue: implicit conversion to const char* happens through _convertConstStringToUTF8PointerArgument, which gets emitted from StringToPointerExpr in SILGen. This in turn produces a mark_dependence between the original string and the resulting const char*.

In the C world, suppose I wrap a C++ class in a C type as a pointer, and then wrap that pointer in a Swift class that destroys the C++ object on deinit {}. If you convert that class back into a pointer via a computed property and then pass it to a C function, there is no guarantee that the class instance will stick around for the duration of the function call. Perhaps a @dependent attribute on a computed property could provide this safety for "pointer-like" types in interop definitions (this would only hold true for the body of the returned function). C++ solves this with sequence points: temporaries live until the end of the full-expression that created them.
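
The C++ behavior mentioned above can be observed directly. This is a minimal sketch with hypothetical names (Resource, Owner, use, demo_temporary_lifetime); Owner plays the role of the Swift class whose deinit destroys the underlying object.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so the lifetime rule can be observed.
static std::vector<std::string> g_events;

struct Resource {
    int payload = 7;
};

// Hypothetical owning wrapper around a raw pointer.
class Owner {
public:
    Owner() : raw_(new Resource) { g_events.push_back("create"); }
    ~Owner() {
        delete raw_;
        g_events.push_back("destroy");
    }
    Owner(const Owner &) = delete;
    Owner &operator=(const Owner &) = delete;

    // Analogous to the computed property that hands out the raw pointer.
    Resource *pointer() const { return raw_; }

private:
    Resource *raw_;
};

inline int use(Resource *resource) {
    g_events.push_back("use");
    return resource->payload;
}

inline std::vector<std::string> demo_temporary_lifetime() {
    g_events.clear();
    // The temporary Owner lives until the end of this full-expression,
    // so `use` runs before the destructor does.
    int payload = use(Owner().pointer());
    assert(payload == 7);
    return g_events;
}
```

The recorded order is create, use, destroy. Swift makes no equivalent promise for an object whose last use is the property access, which is the gap a @dependent-style attribute would have to close.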

Another related project I've just discovered: cbindgen

cbindgen creates C/C++11 headers for Rust libraries which expose a public C API.
...
C++ headers are nice because we can use operator overloads, constructors, enum classes, and templates to make the API more ergonomic and Rust-like. C headers are nice because you can be more confident that whoever you're interoperating with can handle them. With cbindgen you don't need to choose! You can just tell it to emit both from the same Rust library.

Hi @pschuh,

I’m also interested in C++ interop, and am very happy to see it being worked on.

I checked out your PR locally and have looked at trying to extend it to support member functions (mostly copying the logic from the Objective-C implementation). Before I spend too much time trying to get that working, I thought it worth asking whether you’ve gotten any further in any local branches, and if so whether you’d mind putting the work in progress up on GitHub?

Also, my two cents on namespace handling: importing into empty enums will give the right call-site behaviour for Swift, and can fairly easily be replaced by a possible better solution later on. Namespaces are used so widely in C++ libraries that I think it’s worth addressing them early.

Hi, thanks for the interest.
Tony (@allevato) put together an example that exposed methods here: https://github.com/apple/swift/compare/master...allevato:cpp-interop

I think I agree with you on the namespaces. I did some initial testing to allow using types within namespaces with using declarations and concluded that it is probably best just to support them properly.
My current priority is:

  • Namespaces.
  • Member functions.
  • Fully specialized/instantiated templates.

I'm not yet organized enough to have a personal development branch with everything brought together and up to date, but I'll make one. It would certainly be helpful to have people finding failure cases or contributing patches.

As promised, I've put together a place to keep up to date with future changes: GitHub - pschuh/swift at cpp-head
This includes the original PR and some basic prototyping for importing C++ namespaces as Decls and lowering the types inside them.
More to come.

I've also been working on my own branch, integrating @allevato and @pschuh's changes: Commits · troughton/swift · GitHub. I've added support for using NamespaceName::Type declarations (rather than using Type = NamespaceName::Type), (probably broken) importing of reference parameters (void function(const SomeType &param)), and have the mangling working with nested namespaces.

It's quite hacky and probably incorrect in the handling of a few things, and I wouldn't recommend others work off it. Still, might as well have it up in case other people find it useful – in particular, @pschuh, https://github.com/troughton/swift/commit/72f026425f7fb72ecb51ae891414f11e318cae72 might be a slightly cleaner way to handle namespace mangling?

A request for help (@John_McCall or @Joe_Groff , this might be your area if you have time?):

I've got a project with C++ interop up and mostly working, but I'm running into random EXC_BAD_ACCESS (code=1, address=something non-pointer-like like 0x46b) issues when calling methods on C++ types – sometimes the methods will work, and at other times they won't. I'm guessing that arguments are being passed incorrectly or that the stack is somehow being corrupted, but I haven't been able to properly track down the problem. I'm posting the SIL and LLVM-IR in the hope that someone else might be able to spot it.

SIL around the function call (to int MFnDagNode::objectColor(MStatus *) ) that's causing the problem:

  %465 = alloc_stack $MFnDagNode                  // users: %466, %476, %474
  store %463 to %465 : $*MFnDagNode               // id: %466
  %467 = begin_access [modify] [static] %52 : $*MStatus // users: %475, %468
  %468 = address_to_pointer %467 : $*MStatus to $Builtin.RawPointer // user: %469
  %469 = struct $UnsafeMutablePointer<MStatus> (%468 : $Builtin.RawPointer) // user: %472
  %470 = tuple ()
  %471 = tuple ()
  %472 = enum $Optional<UnsafeMutablePointer<MStatus>>, #Optional.some!enumelt.1, %469 : $UnsafeMutablePointer<MStatus> // user: %474
  // function_ref _ZNK8Autodesk4Maya16OpenMaya2019000010MFnDagNode11objectColorEPNS1_7MStatusE
  %473 = function_ref @_ZNK8Autodesk4Maya16OpenMaya2019000010MFnDagNode11objectColorEPNS1_7MStatusE : $@convention(c) (@in MFnDagNode, Optional<UnsafeMutablePointer<MStatus>>) -> Int32 // user: %474
  %474 = apply %473(%465, %472) : $@convention(c) (@in MFnDagNode, Optional<UnsafeMutablePointer<MStatus>>) -> Int32 // users: %509, %478
  end_access %467 : $*MStatus                     // id: %475
  dealloc_stack %465 : $*MFnDagNode               // id: %476

LLVM IR:

  %14 = alloca %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV, align 8
  ...
  %315 = bitcast %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14 to i8*
  call void @llvm.lifetime.start.p0i8(i64 64, i8* %315)
  %.f_path26 = getelementptr inbounds %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV, %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14, i32 0, i32 1
  %316 = bitcast %TSvSg* %.f_path26 to i64*
  store i64 %308, i64* %316, align 8
  %.f_xform27 = getelementptr inbounds %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV, %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14, i32 0, i32 2
  %317 = bitcast %TSvSg* %.f_xform27 to i64*
  store i64 %310, i64* %317, align 8
  %.f_data128 = getelementptr inbounds %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV, %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14, i32 0, i32 3
  %318 = bitcast %TSvSg* %.f_data128 to i64*
  store i64 %312, i64* %318, align 8
  %.f_data229 = getelementptr inbounds %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV, %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14, i32 0, i32 4
  %319 = bitcast %TSvSg* %.f_data229 to i64*
  store i64 %314, i64* %319, align 8
  %320 = bitcast %TSo8AutodeskJ4MayaJ16OpenMaya20190000J7MStatusV* %status to i8*
  %321 = ptrtoint i8* %320 to i64
  %322 = bitcast %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14 to %"class.Autodesk::Maya::OpenMaya20190000::MFnDagNode"*
  %323 = inttoptr i64 %321 to %"class.Autodesk::Maya::OpenMaya20190000::MStatus"*
  %324 = call i32 @_ZNK8Autodesk4Maya16OpenMaya2019000010MFnDagNode11objectColorEPNS1_7MStatusE(%"class.Autodesk::Maya::OpenMaya20190000::MFnDagNode"* %322, %"class.Autodesk::Maya::OpenMaya20190000::MStatus"* %323)
  %325 = bitcast %TSo8AutodeskJ4MayaJ16OpenMaya20190000J10MFnDagNodeV* %14 to i8*
  call void @llvm.lifetime.end.p0i8(i64 64, i8* %325)
  %objectColor._value30 = getelementptr inbounds %Ts5Int32V, %Ts5Int32V* %objectColor, i32 0, i32 0
  store i32 %324, i32* %objectColor._value30, align 4

Registers at the time of the crash:

General Purpose Registers:
       rax = 0x0000000000000003
       rbx = 0x0000000000000000
       rcx = 0x00007ffeefbfaf50
       rdx = 0x0000000000000000
       rdi = 0x00007ffeefbfaf50
       rsi = 0x00007ffeefbfb040
       rbp = 0x00007ffeefbfb070
       rsp = 0x00007ffeefbfaba8
        r8 = 0x0000000000000036
        r9 = 0x0000000000000000
       r10 = 0x0000000000000037
       r11 = 0x0000000000000037
       r12 = 0x00007ffeefbfb090
       r13 = 0x00007ffeefbfafb0
       r14 = 0x0000000138312c00
       r15 = 0x00000001c5db01c0  llamaPlugin.bundle`$s11llamaPlugin9nodeAdded0C010clientDataySpySo8AutodeskJ4MayaJ16OpenMaya20190000J7MObjectVGSg_SvSgtFTo
       rip = 0x000000016d71a37b  libOpenMaya.dylib`Autodesk::Maya::OpenMaya20190000::MFnDagNode::isInstanceable(Autodesk::Maya::OpenMaya20190000::MStatus*) const + 11
    rflags = 0x0000000000010206
        cs = 0x000000000000002b
        fs = 0x0000000000000000
        gs = 0x0000000000000000

Call-site assembly:

    0x1c5daf0e2 <+3138>: movq   -0x58(%rbp), %rax
    0x1c5daf0e6 <+3142>: movq   -0x50(%rbp), %rcx
    0x1c5daf0ea <+3146>: movq   -0x48(%rbp), %rdx
    0x1c5daf0ee <+3150>: movq   -0x40(%rbp), %rsi
    0x1c5daf0f2 <+3154>: movq   %rax, -0x160(%rbp)
    0x1c5daf0f9 <+3161>: movq   %rcx, -0x158(%rbp)
    0x1c5daf100 <+3168>: movq   %rdx, -0x150(%rbp)
    0x1c5daf107 <+3175>: movq   %rsi, -0x148(%rbp)
    0x1c5daf10e <+3182>: leaq   -0x30(%rbp), %rax
    0x1c5daf112 <+3186>: leaq   -0x180(%rbp), %rcx
    0x1c5daf119 <+3193>: movq   %rcx, %rdi
    0x1c5daf11c <+3196>: movq   %rax, %rsi
    0x1c5daf11f <+3199>: callq  0x1c69efb32               ; symbol stub for: Autodesk::Maya::OpenMaya20190000::MFnDagNode::objectColor(Autodesk::Maya::OpenMaya20190000::MStatus*) const
    0x1c5daf124 <+3204>: movl   %eax, -0x128(%rbp)
    0x1c5daf12a <+3210>: movl   $0x1, %edi
    0x1c5daf12f <+3215>: movl   %eax, -0x41c(%rbp)

I know that the MFnDagNode is a valid type and I can intermittently call other methods on it successfully. Any ideas as to what's going on here and how I might work around it? I'm fine with hacky solutions at this stage – I realise a lot of the C++ metadata isn't getting properly propagated through yet.

AFAIK, our C calling convention lowering doesn't handle the indirect SIL conventions like @in. It's also debatable whether they're appropriate for by-reference C++ arguments, since the Swift conventions make no-aliasing, no-escaping guarantees that C++ doesn't. Even if the convention lowering works, you might be seeing miscompiles as the optimizer makes assumptions that aren't valid for C++ methods. You should probably lower C++ reference arguments to pointers.
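
To make the "lower reference arguments to pointers" suggestion concrete, here is a hand-written sketch of what the pointer-based view of a by-reference method looks like on the C++ side. Status, Node, Node_objectColor, and demo_pointer_lowering are hypothetical names; this is not what the importer currently emits.

```cpp
#include <cassert>

struct Status {
    int code = -1;
};

class Node {
public:
    explicit Node(int color) : color_(color) {}

    // By-reference C++ API as a library might declare it.
    int objectColor(Status &status) const {
        status.code = 0;
        return color_;
    }

private:
    int color_;
};

// Pointer-based shim: both `this` and the reference parameter become plain
// pointers, so the caller promises nothing about aliasing or escaping.
extern "C" int Node_objectColor(const Node *self, Status *status) {
    assert(self != nullptr && status != nullptr);
    return self->objectColor(*status);
}

inline int demo_pointer_lowering() {
    Node node(5);
    Status status;
    int color = Node_objectColor(&node, &status);
    return status.code == 0 ? color : -1;
}
```

In Swift terms, this corresponds to treating the arguments as UnsafePointer and UnsafeMutablePointer values rather than as @in or @inout.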

It doesn't matter much for the Itanium ABI, but we'll probably want a distinct @convention(cpp_method) eventually too, since Visual C++ uses a different convention for instance methods from other C functions.

It's an interesting question. The semantics of C++ references (and this) certainly match Swift's UnsafePointer rules better than they do inout or borrowed arguments. On the other hand, Swift is definitely going to use C++ types in ways that aren't consistent with C++ rules about e.g. construction/destruction order; we could also just decide to strengthen our interpretation of references and this so that we make stricter guarantees when calling into C++ than C++ would (and accordingly expect them when called). I don't think that would fundamentally break anything; it would probably make it harder / more error-prone to import some kinds of C++ libraries into Swift, and it would probably require some more boilerplate on the C++ side to make the interoperation work, but it would also tend to produce better code and better-feeling APIs.

However, even if we were doing that, @in is definitely not the right convention for this; it should be either @inout (for non-const methods) or @in_guaranteed (for const methods). I don't know if that's the source of your miscompile, but it certainly could be.

Compared to the liberties we might take with object lifetimes, which are applying operations already defined in C++ but just in a different order from what you might expect in C++, it feels brittle to me to assume that C++ code is so well-behaved, and also difficult to implement and verify on the C++ side, since we'd be invisibly imposing new implicit rules on C++ APIs on top of the already-shaky pile of implicit rules they have to think about. We've had mixed success with our implicit rules on C interop, where we also impose the rules that passed-by-pointer Swift values be used by the C code in the way Swift likes—to this day, we still get people reporting bugs because their C code improperly escapes the pointer, writes through supposedly-const pointers, or other shenanigans.

For ergonomic reasons, we'd definitely want you to be able to pass references down to C++ functions and methods without explicitly taking pointers, but it seems to me like we can still lower those operations down to more lenient pointer-based operations in SILGen.

Yeah, it's definitely the more cautious approach, and you're right that we can allow the natural code and just handle it at the SIL level. On the other hand, I don't think we can avoid relying on non-standard expectations of the C++ code if we want even passably efficient/ergonomic interactions: if a C++ API takes a const &, then we'll be relying on it to not mutate through that reference, regardless of whether we model it in SIL as taking an UnsafePointer<T> or an @in_guaranteed T.
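
A small sketch of why that expectation is non-standard: C++ itself permits a function to write through a const reference when the referenced object was not declared const. Tally, sneaky_read, and demo_const_ref_mutation are hypothetical names.

```cpp
#include <cassert>

struct Tally {
    int hits = 0;
};

// The signature looks read-only, but the body casts the const away.
inline void sneaky_read(const Tally &tally) {
    const_cast<Tally &>(tally).hits += 1;
}

inline int demo_const_ref_mutation() {
    Tally tally;  // not declared const, so the write below is well-defined C++
    sneaky_read(tally);
    assert(tally.hits == 1);
    return tally.hits;
}
```

A Swift caller that modeled the parameter as @in_guaranteed would be entitled to assume hits is unchanged after the call, so the choice of convention is really a choice about how much to trust the C++ side.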

@Torust Looking closer, it seems that const functions are not translating properly. They get incorrectly labeled as @in (which does not translate to In_Indirect), and that results in a copy to a temporary, even if the value is not supposed to be copied. I'll look into it and see what can be done, but I think const methods cannot be considered functional until that is fixed. As a temporary workaround, you can override this behavior in ImportDecl.cpp to remove the distinction between const and non-const.

I'd recommend starting with the simplest possible thing (even if that means a horribly non-ergonomic solution with unsafe pointers) then experimenting to see if we can improve it.

We have lots of design room here to introduce new attributes, defaulting rules, etc. In the meantime, just basic functionality with a non-ergonomic solution is still useful to unblock other work.

-Chris

@pschuh That's fixed it, thanks! One note: result->setSelfAccessKind(SelfAccessKind::Mutating) should only apply if !ifStatic.

It seems that the semantics of C++ are so diverse that any interoperability strategy might require introducing a separate intermediate representation that can then be lowered to acceptable Swift mappings. Whichever approach is chosen, if correct interop is even possible, I believe that careful analysis and a formal proof of type correspondence are required first, rather than solving cases ad hoc, which could in the end lead to the realization that correct interaction is not feasible given C++'s complexity and its multiple compiler implementations.

You may be interested in these two more recent threads:
