C++ Interop workgroup meeting notes (June 22nd 2022)

This post contains the summarized meeting notes from the Swift and C++ interoperability workgroup sync-up meeting from the 22nd of June.

Note: the workgroup sync-ups are currently scheduled on demand instead of happening weekly, so meeting notes will appear on an irregular basis rather than weekly as they did earlier this year.

Attendees: Adrian_Prantl, compnerd, Alex_L, adlere, richard, zoecarver, tongjiew, cabmeurer, egor.zhdan, Robertorosmaninho, drodriguez, 0x41c


zoecarver: I wanted to discuss the following post: Towards a safe and ergonomic language model for Swift and C++ interop, as it will impact the direction of forward interoperability (using C++ from Swift) a lot.

So far what we’ve been doing is importing every single declaration on a best-effort basis whenever we can map it to a Swift declaration. This happens even if we don’t understand what the API is doing, whether it’s safe, or whether it’s ergonomic.

I’m proposing that we change this direction almost entirely and only import things whose semantics we have some understanding of. I propose 4 categories which I will go over. These categories will allow us to import safe APIs in a safe and ergonomic manner and really understand what the best mapping for these APIs is.

It will also allow us to have a clear boundary around what interop can do so it’s very clear what’s supported and what works. It also gives us some clear goals to work towards.

Let me go over the different categories now.

The first one is trivial types (is that the right word for it?). The idea is that they don’t have special C++ members or pointers (excluding Objective-C pointers).

compnerd: The way I read it, the trivial type name makes sense. It maps to C++ trivial types: anything that can be bit copied.

zoecarver: Yes correct, so the only exception is the Objective-C pointers. They’re safe to handle from Swift as part of a trivial struct as they’re managed by ARC.
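As a rough illustration of the "bit copied" reading, the standard `std::is_trivially_copyable` trait separates the two kinds of types. This is only a sketch: `Point`, `Named`, and `bitwiseCopy` are hypothetical names, and the trait does not express the extra "no pointers" rule mentioned above, which the category would need to check separately.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical example types; the names are illustrative only.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

struct Named {
    std::string name;  // std::string has non-trivial copy operations and destructor
};

// Point can be copied bit for bit; Named cannot.
static_assert(std::is_trivially_copyable<Point>::value,
              "Point is trivially copyable");
static_assert(!std::is_trivially_copyable<Named>::value,
              "Named is not trivially copyable");

// Copies a Point with memcpy, which is only valid for trivially copyable types.
inline Point bitwiseCopy(const Point &source) {
    Point destination;
    std::memcpy(&destination, &source, sizeof(Point));
    return destination;
}
```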

The second category is iterators. Egor has been doing awesome work on how we can map iterators to Swift iterators.

The third category is owned types, like a std::string or std::vector. Such a type’s constructors and destructor manage its own memory: it is allowed to allocate and own memory as long as its destructor destroys it.

compnerd: So this is excluding anything like std::shared_ptr.

zoecarver: Yes.
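A minimal sketch of what an owned type in this sense might look like on the C++ side. `IntBuffer` is a hypothetical class, not something from the proposal; it only illustrates "allocates in the constructor, frees in the destructor, and copies get their own memory".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical owned type: it allocates in its constructors and
// frees that memory in its destructor, so nothing is shared.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t count)
        : data_(new int[count]()), count_(count) {}  // zero-initialized

    // Copying allocates a fresh buffer, so each object owns its own memory.
    IntBuffer(const IntBuffer &other)
        : data_(new int[other.count_]), count_(other.count_) {
        for (std::size_t i = 0; i < count_; ++i) data_[i] = other.data_[i];
    }

    IntBuffer &operator=(const IntBuffer &other) {
        if (this != &other) {
            int *fresh = new int[other.count_];
            for (std::size_t i = 0; i < other.count_; ++i) fresh[i] = other.data_[i];
            delete[] data_;
            data_ = fresh;
            count_ = other.count_;
        }
        return *this;
    }

    ~IntBuffer() { delete[] data_; }

    std::size_t size() const { return count_; }
    int get(std::size_t index) const { return data_[index]; }
    void set(std::size_t index, int value) { data_[index] = value; }

private:
    int *data_;
    std::size_t count_;
};
```

By contrast, a std::shared_ptr shares one allocation between copies, which is why it falls outside this category.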

compnerd: The problem is that C++ has no annotation for that.

zoecarver: Right we are going to have our own suite of annotations such as owned or import_as_ref. The trivial type and iterator categories can be inferred though.

compnerd: I worry about annotations. There's no guarantee that they will be present in a code base, so people might try to use an API and find there's no way to use it. Maybe we can approach this similarly to nullability annotations?

zoecarver: Maybe we should have a flag to control default. But using API notes could give us enough flexibility where libraries can annotate themselves using API notes.

compnerd: API notes make sense. What I’m suggesting is something less granular. For example, add a header annotation that states that this header has been audited and you can assume everything is normal by default.

zoecarver: I’m not sure what this would look like, as we are trying to exclude certain types, or how that would map to auditing. I could see it being useful to have a pragma that says these type definitions are owned.

compnerd: Yes. It would be nice if you could encapsulate the entire header by saying that it has been audited and is fine to import. You can then refine it with API notes, or at a per-declaration level, whatever you need.

zoecarver: Yes. There are some specific concerns with projections from owned types and reference types though; we can’t just infer or expose those even if something is audited. I’m not opposed to a blanket directive saying that these N types are owned types, but I want to make sure it’s clear that this is just a bulk application of the attribute.

compnerd: Yes, this is meant for people who explicitly went through and organized their header in such a way that they can do that.

zoecarver: Let’s now cover the last category, reference types. Very interesting, we see this pattern in a lot of code bases, so a lot of types could map to this pattern.

Alex_L: Can you expand on foreign reference types? Do they cover things like custom reference counting and/or std::shared_ptr?

zoecarver: So there are two different subcategories. There’s the reference counted subcategory, which could be like a std::shared_ptr or an NS object if it didn’t have its own bridging, i.e. something that has its own retain and release functions we can point to. The compiler then applies those using existing infrastructure, which is really cool. It’s safer and cleaner, and we benefit from all the ARC optimizations in Swift.

The second subcategory is immortal types. For instance, an LLVM IR function/instruction in the Swift compiler can be immortal, as we’re intentionally not releasing it for the duration of the compilation.
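For the reference counted subcategory, the C++ side could look roughly like the sketch below: a type with an intrusive count plus a retain function and a release function that something else can point to. `Node`, `retainNode`, and `releaseNode` are hypothetical names, the count is deliberately not thread-safe, and the way such functions would be named to the Swift importer is not specified in these notes.

```cpp
#include <cassert>

// Hypothetical intrusive reference-counted type. Objects are only
// created on the heap and destroyed when the count drops to zero.
class Node {
public:
    static Node *create(int value) { return new Node(value); }

    int value() const { return value_; }
    int refCount() const { return refCount_; }

private:
    explicit Node(int value) : value_(value), refCount_(1) {}
    ~Node() = default;  // private: only releaseNode may destroy a Node

    friend void retainNode(Node *node);
    friend void releaseNode(Node *node);

    int value_;
    int refCount_;  // not atomic; a real type would need thread safety
};

// Increments the count; the caller now shares ownership.
void retainNode(Node *node) { ++node->refCount_; }

// Decrements the count and destroys the object on the last release.
void releaseNode(Node *node) {
    if (--node->refCount_ == 0) delete node;
}
```

An immortal type would be the degenerate case where no retain or release calls are needed because the object is never freed while it is in use.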

compnerd: This is extremely useful if you can replace manual ref counting with ARC in Swift. I’ve experimented with this previously so I can point you to a source tree with this pattern.

zoecarver: Yes it’s perfect for this use case. Also, this will allow us to represent inheritance and casting.

Alex_L: Do the FRTs require runtime support?

zoecarver: Yes, there’s runtime support that has been recently added.

Alex_L: So there could be back deployment concerns there.

zoecarver: Yes, we could have an additional back deployment library to link against. We could also have a hack: a limited mode that uses existing class descriptors. But we couldn’t do things like casting then.

Alex_L: Probably not a good approach to support FRTs with different functionality based on deployment target so the back compat library makes more sense.

Alex_L: What about an unsafe escape hatch for this new interop direction? Could we use the previous importer behavior to experiment or to build specific files with a flag?

zoecarver: We should discuss it. There’s a need for that, but I don’t have a specific proposal at the moment.

egor.zhdan: For the standard library overlay we will need this. The idea is that the overlay is going to call into these unsafe APIs, and then regular Swift code won’t be able to call into these unsafe methods. So it’s useful specifically to build the overlay itself.

zoecarver: This could also apply to other API vendors building their own overlays.

compnerd: Could you talk more about the iterator category? How do you determine whether something is iterable?

egor.zhdan: At the moment we don’t determine that automatically. In the current patch the user is required to add conformances to these protocols manually, but the good thing is that you don’t really have to implement anything in the extension that provides the conformance. In the future we should look into adding the conformance automatically.

zoecarver: I think the best way to start with that is to look at methods like ‘begin’ and ‘end’. Then you’ll know this type will be imported as a Sequence and then you’ll have to figure out how to bring in ‘begin’ and ‘end’, which will give you the information you’ll need to import an iterator type for that sequence.
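To make the "look at begin and end" idea concrete, here is a C++-side analogy: a compile-time trait that reports whether a type has both methods. This is only a sketch and not how the Swift importer works (the notes do not describe its implementation); `HasBeginEnd` is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume the type has no begin()/end() pair.
template <typename T, typename = void>
struct HasBeginEnd : std::false_type {};

// Selected only when both t.begin() and t.end() are valid expressions;
// otherwise substitution fails and the primary template is used.
template <typename T>
struct HasBeginEnd<T, decltype(void(std::declval<T &>().begin()),
                               void(std::declval<T &>().end()))>
    : std::true_type {};

// A std::vector looks like a sequence under this check; an int does not.
static_assert(HasBeginEnd<std::vector<int>>::value, "vector has begin/end");
static_assert(!HasBeginEnd<int>::value, "int has no begin/end");
```

A name-based check like this is exactly what the next comment pushes back on, since it misses types that expose iteration under other method names.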

compnerd: I also dislike the idea of relying on ‘begin’ and ‘end’ as names. You could have multiple sequences in a type, with different method names for iteration.

zoecarver: Right but they all use a similar begin/end pattern. We would always want to make sure we have a pair of generators. Egor do you have details on how you would conform to the sequence-like protocol in that case?

egor.zhdan: I was going to start by automatically conforming iterators to the iterator protocols, and then discuss the sequence conformance later. We can also add the sequence conformance to another type and handle different begin/end pairs by having an accessor that provides access to that sequence type.
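The accessor idea can be sketched on the C++ side as follows, assuming a type that exposes two sequences under non-standard method names. `Graph`, `EdgeView`, and `sumEdges` are hypothetical; the point is only that the view type has a plain begin/end pair, so the sequence conformance could be attached to the view instead of to `Graph`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type with two sequences and differently named iteration methods.
class Graph {
public:
    using Iterator = std::vector<int>::const_iterator;

    void addNode(int id) { nodes_.push_back(id); }
    void addEdgeWeight(int weight) { edges_.push_back(weight); }

    Iterator nodes_begin() const { return nodes_.begin(); }
    Iterator nodes_end() const { return nodes_.end(); }
    Iterator edges_begin() const { return edges_.begin(); }
    Iterator edges_end() const { return edges_.end(); }

    // View type that adapts the edges_begin/edges_end pair to begin/end.
    class EdgeView {
    public:
        explicit EdgeView(const Graph &graph) : graph_(&graph) {}
        Iterator begin() const { return graph_->edges_begin(); }
        Iterator end() const { return graph_->edges_end(); }

    private:
        const Graph *graph_;  // non-owning; the Graph must outlive the view
    };

    // Accessor that provides access to the sequence-like view.
    EdgeView edges() const { return EdgeView(*this); }

private:
    std::vector<int> nodes_;
    std::vector<int> edges_;
};

// Iterates the edge weights through the view with a range-based for loop.
inline int sumEdges(const Graph &graph) {
    int total = 0;
    for (int weight : graph.edges()) total += weight;
    return total;
}
```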

zoecarver: Any other thoughts? This is going well so far.

0x41c: What would the semantics of the unsafe escape hatch be? Will it be a compiler directive or an import attribute at the top of the Swift file, or a function provided as an alternative to the original API to access the hidden unsafe function?

zoecarver: We can apply this directly to C++ functions with an attribute. But all the ways you suggested could also work. What I was suggesting is specifically applying the attribute to specific C++ functions/APIs to force them to be imported in an unsafe manner. And I think you bring up a good point, which is that we may want a second thing on the Swift side so that people are actually aware that these are unsafe. So, maybe that's a rename. Maybe that's a certain scope you have to call it within.

And the other option is a compiler flag on the Swift side. It would be used to mark everything as unsafe and it’s useful for the overlay package.

0x41c: How would an overlay package work alongside an existing overlay package? Would you have to forgo the original one, or can you use both?

zoecarver: You can use both. This can also apply to packages. The unsafe flag is not viral; it only applies to the specific overlay/package as it’s being built.

Alex_L: How would you map this forum post to a user guide that we can present to users so that they can try this out without us handholding them in the process?

zoecarver: I think the beautiful thing about an approach like this is that it gives a very clear boundary for what kinds of things we can import and exactly how we're importing them, so it makes it easier to write docs. We should definitely add some documentation that we’ll put in the Swift compiler docs folder. Once this goes through evolution it will also be easier to complete the full documentation for this.

compnerd: You want to have some implementation/docs before going through evolution, but I think the first thing to do is to put out some more examples, such as examples of things from the C++ standard, for instance for trivial types. That can give people a good way of mapping these concepts directly to something they’re familiar with.

Alex_L: Any issues with debugging for this approach? This should probably just work but it’d be good to get some feedback.

Adrian_Prantl: The thing I’m wondering most about is to have an escape hatch that we can use for Swift expressions in LLDB.

zoecarver: Could you clarify what you’re talking about?

Adrian_Prantl: So when we’re evaluating Swift expressions in LLDB at the moment we’re leaking references to the objects that you’re using, which is something we would want to avoid. So the fact that we would actually have a specific escape hatch to avoid this is a good thing.

Specifically, the references right now are leaked whenever an expression is evaluated, as the result of that expression is stored in a result variable. So the same would apply here to foreign reference types too. Right now it makes the debugging experience just work, but it has the unintended side effect that it might cause objects to be kept alive beyond where they're used. So, it actually changes the semantics of the program.

zoecarver: It’s a problem in Swift too.

Adrian_Prantl: Yes. We don’t have a solution for this at the moment. The tradeoff that we have for C++ is that if you’re using a pointer, we store the address. And if you modify the contents of the address you’re going to see the new entity at the new address, rather than what was there when the object was created, and that’s the semantics you would get in the programming language itself.

I would imagine that people would expect something similar for FRTs from C++ on the Swift side.

zoecarver & Adrian_Prantl concluded by discussing representation details for types imported from C++ and they tried to debug interop code in LLDB to see if it can correctly identify C++ values imported into Swift when printing out variable information for such a value.