[Swift/C++] User survey: how would you use C++ interoperability?

gribozavr · February 17, 2020, 12:28pm

We are continuing to investigate Swift/C++ interop (see manifesto and the forum discussion). To help guide our work, we’d like to hear how you would use this feature.

C++ is not an opinionated language. It accommodates a variety of coding styles and usage patterns. One can easily find projects that use C++ very differently.

If infinite engineering resources were available, Swift/C++ interoperability would accommodate most coding styles and use cases that C++ supports. In practice, we have limited resources, so we need to make prioritization decisions and resolve technical tradeoffs.

I would like to invite prospective users of Swift/C++ interop to discuss the use cases you have, your goals for using C++ interop, as well as the constraints that you operate under.

How would you describe the C++ APIs that you are most interested in calling from Swift?
- Do they use C++ like “C with classes”, or do they use templates heavily?
- Do they use value types a lot, or do they have lots of polymorphic types with virtual functions?
- Do they make ownership explicit by using smart pointers, or do they require using new/delete?
To what extent, if any, would you be able to change the C++ APIs to facilitate Swift/C++ interop?
- Note that our approach requires C++ headers to compile as a Clang module. Headers that don’t meet this criterion yet need to be “modularized”, and unfortunately we don’t see a way around this.
- We’re also thinking about ways in which optional changes or annotations to the C++ API can improve the quality of the imported Swift API. For example, annotating pointers with nullability qualifiers can turn the corresponding Swift types from optionals into non-optionals. Nevertheless, if you don’t annotate nullability, the imported APIs will still be usable, but they will have lots of implicitly unwrapped optionals.
To what extent do you care about the performance overhead added by the interoperability layer? Would you prefer to see idiomatic Swift types in imported APIs at the cost of O(n) bridging conversions, or would you prefer high performance interoperability at the cost of seeing C++ types in imported APIs?
How important is calling Swift from C++ for your use cases?
Are you interested in just calling existing C++ code from Swift and vice versa, or are you also interested in migrating C++ code to Swift, using the Swift/C++ interop to facilitate an incremental rewrite?
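To make the nullability point above concrete, here is a minimal sketch of an annotated header. The `widget_*` functions are hypothetical names invented for illustration; `_Nonnull` and `_Nullable` are Clang's nullability qualifiers, and the fallback macros are an assumption about how one might keep the header compiling elsewhere.

```cpp
#include <cstddef>
#include <cstring>

// _Nonnull and _Nullable are Clang extensions. Define them away for other
// compilers so the same header still builds there.
#if !defined(__clang__)
#define _Nonnull
#define _Nullable
#endif

// Hypothetical API: never returns null, so the Swift importer can expose the
// result as a non-optional pointer instead of an implicitly unwrapped one.
const char* _Nonnull widget_default_name() { return "untitled"; }

// Hypothetical API: `name` may be null, so Swift would see an optional
// parameter and callers can pass nil explicitly.
std::size_t widget_name_length(const char* _Nullable name) {
    return name ? std::strlen(name) : 0;
}
```

Leaving both pointers unannotated would still import, but each would show up in Swift as an implicitly unwrapped optional.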

jamesgh · February 17, 2020, 12:36pm

I think it would be better if the C interop was less painful to use. Making a C wrapper for a C++ library takes less time than trying to figure out how to get that C wrapper to work with Swift.
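For readers unfamiliar with the C-wrapper approach mentioned here, a minimal sketch follows. `Greeter` and the `greeter_*` functions are hypothetical names, not part of any real library; the pattern is an opaque handle plus `extern "C"` functions that forward to the C++ object.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical C++ class that we want to reach from Swift.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    // Length of "Hello, " (7 characters) plus the name.
    std::size_t greeting_length() const { return 7 + name_.size(); }
private:
    std::string name_;
};

extern "C" {

// Opaque handle: C and Swift callers only ever see a pointer to it.
typedef struct greeter greeter;

greeter* greeter_create(const char* name) {
    if (name == nullptr) return nullptr;
    try {
        return reinterpret_cast<greeter*>(new Greeter(name));
    } catch (...) {
        return nullptr;  // exceptions must not cross the C boundary
    }
}

std::size_t greeter_greeting_length(const greeter* g) {
    return reinterpret_cast<const Greeter*>(g)->greeting_length();
}

void greeter_destroy(greeter* g) {
    delete reinterpret_cast<Greeter*>(g);
}

}  // extern "C"
```

On the Swift side the handle imports as an `OpaquePointer`, which is where the pointer handling described later in this thread begins.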

If you're gonna add a new language to support for interop, Rust would be nice.

Max_Desiatov · February 17, 2020, 12:39pm

These are pretty complex APIs such as LLVM and Clang and the C++ part of the Swift compiler codebase itself

As far as I'm aware the LLVM/Clang APIs use templates quite heavily.

These vary a lot through those APIs as far as I'm aware.

I hope that the Swift compiler codebase would be open towards evolving in that direction, but I doubt that LLVM and Clang APIs could be easily changed that way.

This makes perfect sense

I personally would lean towards having high performance. The API can then be Swift-ified with a separate wrapper, where users of the bridged API can control the performance as they see fit.

Calling C++ from Swift is a bit more important than Swift from C++, as this is the direction I would use when slowly rewriting from C++ to Swift. Additionally, is calling Swift from C already easy enough? If not, I would be quite surprised if calling Swift from C++ is supported before C interoperability is improved in that way.

gribozavr · February 17, 2020, 1:46pm

Could you provide more details about the difficulties you have in getting the Swift/C interop to work?

jamesgh · February 17, 2020, 2:10pm

Mostly it's the messiness around pointers, pointers to pointers, casting, and arrays. I know these are all surmountable challenges, but I find the way to get it to work in Swift is really non-obvious, so I, at least, have to spend a bunch of time digging through documentation and Stack Overflow posts to get the right pattern any time I need to do Swift <> C interop. The worry I have about adding C++ into the mix is that it will be like the C interop but way worse in terms of complexity.

gwendal.roue · February 17, 2020, 4:41pm

My personal pet-peeve is that variadic (...) functions are not imported into Swift. Quoting https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/using_imported_c_functions_in_swift:

Swift only imports C variadic functions that use a va_list for their arguments. C functions that use the ... syntax for variadic arguments are not imported.

A workaround is to define extra C headers / Clang modules / Swift package for the sole purpose of defining wrapper functions (example). It is verbose but doable. But it also creates unexpected issues that are really difficult to characterize. For example, I dismissed this one as an Xcode bug, but could never figure out how to report it properly.

jamesgh · February 17, 2020, 5:09pm

Yes, that's a good point - interfacing GLib is a huge pain because of this.

compnerd · February 17, 2020, 5:19pm

Most of the usage is heavily templated, but a significant portion of the templating is done a priori and most of the parameters are fixed in the interface (that is, the templating is used to generate a forest of hierarchies). The interfaces are heavily polymorphic, effectively using C++ for double dispatch. The ownership is explicit and manually performed (e.g. AddRef, DecRef). The C++ types are wrappers over COM interfaces. You can find documentation on https://msdn.microsoft.com under Windows Template Library (WTL).

The C++ APIs are hardcoded in stone and cannot be modified. They are vended by Microsoft and have compatibility requirements. The headers are not modularized sadly. I expect to have to make some change to clang eventually to support virtualizing the file system for the module to support an external module map from the source tree. I believe that if annotations are required, they should originate from an APINotes augmentation.

The overheads are significant concerns as this would be useful for implementing UI interfaces and be used in interactive scenarios. I would definitely prefer high performance interoperability by seeing the C++ types.

Calling Swift from C++ is somewhat important, although largely already possible with a bit of contortion. The calls are callbacks, and use of @convention(c) { } closures solves the need.
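A sketch of the callback shape this relies on, with hypothetical names (`EventSource`, `event_callback`, `add_to_total`). A `@convention(c)` closure cannot capture state, so C-style callback APIs conventionally pass a `void*` context alongside the function pointer; that is the assumption made here.

```cpp
#include <vector>

extern "C" {
// C-compatible callback type. The `context` pointer carries the state that a
// capture-free callback cannot hold itself.
typedef void (*event_callback)(void* context, int event_code);
}

// Hypothetical C++ event source that stores callbacks and invokes them later.
class EventSource {
private:
    struct Subscriber {
        event_callback callback;
        void* context;
    };
    std::vector<Subscriber> subscribers_;

public:
    void subscribe(event_callback callback, void* context) {
        subscribers_.push_back(Subscriber{callback, context});
    }

    // Calls every subscriber, handing back the context it registered with.
    void emit(int event_code) const {
        for (const Subscriber& s : subscribers_) {
            s.callback(s.context, event_code);
        }
    }
};

// Example callback: adds each event code to the int that `context` points to.
extern "C" void add_to_total(void* context, int event_code) {
    *static_cast<int*>(context) += event_code;
}
```

From Swift, the context would typically be an unmanaged reference to the object that should receive the event.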

I think that calling C++ code from Swift is the current primary desire, though it could be useful for porting to have the ability to have good support for gradual replacement of C++ applications to Swift.

kjteske · February 17, 2020, 7:46pm

How would you describe the C++ APIs that you are most interested in calling from Swift?

Our C++ code is largely complex business logic re-used across multiple platforms, Apple & non-Apple. On each platform, we wrap the C++ with platform-appropriate code for UI, threading, networking, etc. Platform-appropriate code provides input to and handles output from the C++ logic.

Do they use C++ like “C with classes”, or do they use templates heavily?

Our C++ classes themselves aren't heavily templated, but we use a lot of standard library containers and things like std::function, etc. We currently write/codegen wrappers around the C++ classes on each platform, e.g. converting NSArray to std::vector.

Do they use value types a lot, or do they have lots of polymorphic types with virtual functions?

Mostly value types, or non-virtual classes.

Do they make ownership explicit by using smart pointers, or do they require using new/delete?

Mostly smart pointers going forward, although plenty of legacy code using new/delete still.

To what extent, if any, would you be able to change the C++ APIs to facilitate Swift/C++ interop?

Very-doable. Would need to make it unobtrusive for non-Apple environments (i.e. use macros that become no-ops on non-Swift platforms).
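One way such a no-op macro could look, as a sketch only. `MYLIB_SWIFT_NAME` and `mylib_clamp_percent` are hypothetical; the shape mirrors CF_SWIFT_NAME and assumes a compiler that reports Clang's `swift_name` attribute through `__has_attribute`.

```cpp
// Expands to Clang's swift_name attribute where the compiler supports it and
// to nothing everywhere else, so non-Swift platforms are unaffected.
#if defined(__has_attribute)
#  if __has_attribute(swift_name)
#    define MYLIB_SWIFT_NAME(name) __attribute__((swift_name(#name)))
#  endif
#endif
#ifndef MYLIB_SWIFT_NAME
#  define MYLIB_SWIFT_NAME(name)
#endif

extern "C" {
// Hypothetical API: callable as mylib_clamp_percent from C and C++, and
// intended to surface in Swift as clampPercent(_:).
int mylib_clamp_percent(int value) MYLIB_SWIFT_NAME(clampPercent(_:));
}

// Clamps value into the range 0...100.
int mylib_clamp_percent(int value) {
    return value < 0 ? 0 : (value > 100 ? 100 : value);
}
```

The annotation sits on the declaration only, so the definition and every existing caller stay unchanged.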

To what extent do you care about the performance overhead added by the interoperability layer? Would you prefer to see idiomatic Swift types in imported APIs at the cost of O(n) bridging conversions, or would you prefer high performance interoperability at the cost of seeing C++ types in imported APIs?

Would like high performance interop on the roadmap, but our current solution with wrappers has been sufficient so far and has the same O(n) conversions.

How important is calling Swift from C++ for your use cases?
Are you interested in just calling existing C++ code from Swift and vice versa, or are you also interested in migrating C++ code to Swift, using the Swift/C++ interop to facilitate an incremental rewrite?

Primarily interested in calling C++ from Swift.

Other: we're also very interested in Bazel supporting this smoothly.

GetSwifty · February 17, 2020, 9:00pm

Do they use C++ like “C with classes”, or do they use templates heavily?
Mostly classes
Do they use value types a lot, or do they have lots of polymorphic types with virtual functions?
Both
Do they make ownership explicit by using smart pointers, or do they require using new/delete?
new/delete
To what extent, if any, would you be able to change the C++ APIs to facilitate Swift/C++ interop?
As little as possible, as they are usually cross-platform libraries.
To what extent do you care about the performance overhead added by the interoperability layer? Would you prefer to see idiomatic Swift types in imported APIs at the cost of O(n) bridging conversions, or would you prefer high performance interoperability at the cost of seeing C++ types in imported APIs?
Performance is generally less of a concern, but there are some instances where they're more of a concern. However as long as basic Type bridging (Int, Float, etc.) is performant, that shouldn't be a problem.
How important is calling Swift from C++ for your use cases?
Generally less important, but it would be very restricting if it were not possible at all.
Are you interested in just calling existing C++ code from Swift and vice versa, or are you also interested in migrating C++ code to Swift, using the Swift/C++ interop to facilitate an incremental rewrite?

My main specific use case would be using an existing cross-platform C++ library that will remain that way, with an incremental rewrite from the current Objective-C (++) portion to Swift.

dwaite · February 19, 2020, 1:51am

I'd say more generally there's a lack of best practices. With Objective-C you have a very clear list of best practices that will lead to a more idiomatic Swift interface (and they are purposely close to existing Objective-C best practices because of the Swift compatibility requirements), but with C there is a lot more flexibility.

So for example

Should I expose a C struct directly or hide it in a new Swift type?
If the struct has a foo_t* as an array of entries, should I expose that as a [foo_t] equivalent in Swift, even though that means creating arrays more often?
If the struct has a foo_t* (I'll assume caller-managed memory semantics), how do I manage the memory of the imported struct on copy?
The attributes/macros/apinotes for controlling bridging are generally undocumented, except for CF_SWIFT_NAME.

I've found it far easier to bridge a C library through Objective-C than to do so directly in Swift, even more so if I'm trying to create a binding for third parties to use. The gymnastics needed to clean up a direct C import are often more verbose, painful, and hurdle-ridden than just rewriting the library in Swift.

lukasa · February 19, 2020, 12:23pm

From working on SwiftNIO and Swift Crypto I've built up a list of best-practices that might be worth thinking about. I'll put my answers below. Note that this is not authoritative and I can't claim that this is what the community wants to do, but I think I've got more experience here than most.

You should hide it in a Swift type. In general, exposing the types of your dependencies isn't great. Wrapping the C types in Swift types gives you a type that is in your namespace and that you control, which makes it easier to (for example) add conformances to helpful Swift protocols.

This does force you to write more code than you otherwise would, but the advantage is that you can promise your types will behave in a Swifty way. This also allows you to reference count non-trivial C types to manage their lifetimes.

As an example, consider Swift Crypto's ArbitraryPrecisionInteger, a wrapper around BoringSSL's BIGNUM. This is a complex wrapper that provides lots of helper functionality (ExpressibleByIntegerLiteral, Equatable, Comparable, AdditiveArithmetic, Numeric, SignedNumeric, CustomDebugStringConvertible). It also takes a fundamentally referential type and turns it into a value type, making it much nicer to work with in Swift. (More on this later.)

Even there I'm missing some useful protocols: I could have given this a BinaryInteger conformance, too. I was just too lazy.

Wrap it in a Collection type that wraps the underlying pointer. If the pointer has a length this can be a RandomAccessCollection, otherwise it's a Collection. This avoids the need to copy, while giving you most of the friendliness you want from Arrays. In general when you think you want an Array what you actually want is something that can do what Array can do (be looped over, fast indexing, etc.): RandomAccessCollection is that.

An example of this is swift-nio-ssl's SSLConnection.PeerCertificateChainBuffers. This type provides access to the certificates provided by the peer in a TLS connection. This type is a RandomAccessCollection, which means it can be used essentially anywhere an Array can, but doesn't need a copy.

Wrap it in a class.

Any time you have an object that needs lifetime management being brought into Swift, the only way to safely manage that lifetime is to use the wrapper object in a class. In most cases you want value semantics for these things though, so you often need to then wrap that class wrapper in a struct to re-establish value semantics.

Again, Swift Crypto's ArbitraryPrecisionInteger is an example of this. It uses the CoW pattern along with a backing class that owns the C struct's lifetime. The need to involve an extra heap allocation here is a bit frustrating, but in many cases that heap allocation can be elided for objects that don't live long.

I recommend not bothering to use apinotes: writing a proper Swift wrapper will likely be better in the long-run.

dwaite · February 19, 2020, 7:13pm

Thank you for this! I have used SwiftNIO minimally so far (mostly ByteBuffer for easing implementation of parsers and binary generators in projects like Cyborg, which provides CBOR tooling.)

dwaite:

Should I expose a C struct directly or hide it in a new Swift type?

You should hide it in a Swift type. In general, exposing the types of your dependencies isn't great. Wrapping the C types in Swift types gives you a type that is in your namespace and that you control, which makes it easier to (for example) add conformances to helpful Swift protocols.

This does force you to write more code than you otherwise would, but the advantage is that you can promise your types will behave in a Swifty way. This also allows you to reference count non-trivial C types to manage their lifetimes.

As an example, consider Swift Crypto's ArbitraryPrecisionInteger, a wrapper around BoringSSL's BIGNUM. This is a complex wrapper that provides lots of helper functionality (ExpressibleByIntegerLiteral, Equatable, Comparable, AdditiveArithmetic, Numeric, SignedNumeric, CustomDebugStringConvertible). It also takes a fundamentally referential type and turns it into a value type, making it much nicer to work with in Swift. (More on this later.)

Even there I'm missing some useful protocols: I could have given this a BinaryInteger conformance, too. I was just too lazy.

Interesting. As a slight segue here, do you see this sort of strategy changing over time? For instance, would you consider moving to a common arbitrary precision integer type such as one in Swift Numerics, or would that generally be more trouble than it's worth, considering the bridging between that and the type needed by an underlying crypto engine?

dwaite:

If the struct has a foo_t* as an array of entries, should I expose that as a [foo_t] equivalent in Swift, even though that means creating arrays more often?

Wrap it in a Collection type that wraps the underlying pointer. If the pointer has a length this can be a RandomAccessCollection, otherwise it's a Collection. This avoids the need to copy, while giving you most of the friendliness you want from Arrays. In general when you think you want an Array what you actually want is something that can do what Array can do (be looped over, fast indexing, etc.): RandomAccessCollection is that.

An example of this is swift-nio-ssl's SSLConnection.PeerCertificateChainBuffers. This type provides access to the certificates provided by the peer in a TLS connection. This type is a RandomAccessCollection, which means it can be used essentially anywhere an Array can, but doesn't need a copy.

Since I imagine there are often similar memory management patterns within a codebase for multiple exposed types, did you ever consider building a generic type, e.g. one that works around BoringSSL CryptoBuffers and not specifically one for PeerCertificateChains?

The flip side of this I imagine is that your types may have some functional limitations because your wrapper doesn't have certain features you would get 'for free' with native swift code.

A CryptoKit/SwiftCrypto example might be that you can't initialize a digest to an arbitrary value, which limits your ability to do certain things (like implement Codable) and treat a digest as a specialized Data type.

Likewise, a PeerCertificateChainBuffers type would need to be copied into an Array for use in many APIs, simply because [Foo] is easier to type than RandomAccessCollection where Element: Foo (although perhaps not the best example for this class as it appears to be internal).

dwaite:

If the struct has a foo_t* (I'll assume caller-managed memory semantics), how do I manage the memory of the imported struct on copy?

Wrap it in a class.

Any time you have an object that needs lifetime management being brought into Swift, the only way to safely manage that lifetime is to use the wrapper object in a class. In most cases you want value semantics for these things though, so you often need to then wrap that class wrapper in a struct to re-establish value semantics.

Again, Swift Crypto's ArbitraryPrecisionInteger is an example of this. It uses the CoW pattern along with a backing class that owns the C struct's lifetime. The need to involve an extra heap allocation here is a bit frustrating, but in many cases that heap allocation can be elided for objects that don't live long.

Ahh, that was somewhat my expectation but I wondered if there were any generic tools to do so, such as a malloc/free'd buffer you could leverage for the inner class and do CoW patterns on.

Have you had to deal with exposing internal types for compatibility when users want to call other C API?

scanon · February 19, 2020, 7:23pm

Off topic, but: crypto has some specific requirements that don't apply to other uses of bignums, so a more generic bignum type in Swift Numerics may not be suitable for use in Swift Crypto (or it might, but that would be something that both projects would have to decide to commit to).

lukasa · February 20, 2020, 9:31am

@scanon has already given a good answer, but I want to elaborate a bit more. The reasons to move to numerics' data type, assuming that it meets our functional requirements, are that:

swift-numerics noticeably outperforms BoringSSL. My guess is that this is unlikely, but not impossible. Or,
We expose bignums in our API.

I certainly don't want Swift Crypto to vend an arbitrary precision integer type in its API that is specific to Swift Crypto: that makes life very awkward. However, I think it's highly unlikely that we'll ever need to. This also makes the decision somewhat uninteresting: we can change it whenever we want.

As a general rule I never build a generic abstraction when I have one specific use-case. Generalising too early tends to push you towards the wrong abstraction: having multiple examples helps you find the actual commonality.

In this case the thing to generalise over is not CRYPTO_BUFFER, it's BoringSSL's STACK_OF() abstraction. This is a macro that generates typed stacks of objects. The backing pointer in PeerCertificateChainBuffers is STACK_OF(CRYPTO_BUFFER), and we do have another instance in the codebase wrapping a STACK_OF() that only implements Sequence, not RandomAccessCollection (due to laziness on my part at the time).

We could try to generalise this, but I don't think it works well. The big problem here is that most of the C types (e.g. CRYPTO_BUFFER) are actually erased into OpaquePointer. That means we don't have any type we can hang the helper functions off of. This is all just a total mess, and so it's simpler not to bother.

This limitation really has nothing to do with C/C++: the digest types are entirely written in Swift. This is a deliberate design decision: we want to discourage constructing Digests from random data. Digests have comparison functions you can use to compare them against a Data, so there should be no need to construct them directly.

I mean, they could also just write PeerCertificateChainBuffers. "Easier to type" is not really the concern here. I'm fine with users copying things into Arrays if they want to, that's their prerogative: I'm just interested in giving them the option to do things the less-copying way, even if they choose to do something else.

There are many times when copying an arbitrary RandomAccessCollection into an Array is a good idea. There are times when it is not. It's good to have the choice, where you can.

Well if you really try you can use ManagedBuffer, but ManagedBuffer is a pretty nasty API that's hard to hold correctly and somewhat limiting. It tends to be easier just to hand-roll it, especially as exactly what functionality you need to expose tends to be somewhat varied.

In my experience it rarely takes more than about 10 minutes to hand-roll the basic implementation.

Can you elaborate on what you are asking about here? I'm not sure exactly what you're trying to get at.

Torust · February 20, 2020, 11:05am

Firstly, thank you for pushing forward on this! I think this is a very valuable feature and an exciting direction.

The APIs I'm most interested in are game-development oriented APIs which tend to make limited/no use of the STL and minimal usage of C++ features - C with classes and virtual methods is usually the extent of it.

Ownership is usually managed with raw pointers, stack-allocated types, and (in rare cases) new/delete. Custom allocators are common, so often a freeType method is provided or the type is manually reference counted.

Preferably, no changes would be needed; while I could modify the headers manually, that requires maintenance every time the API changes.

Performance is very much the priority. While I would like to see some accommodations in the form of converting common enum case patterns and the like, I do not want the interop layer to abstract away any of the underlying types. I would, however, think it worth considering exposing multiple overloads for C++ methods – say one that takes an std::string and one that takes a String, allowing the user to choose the right method and therefore tradeoff for their current use case (provided this would not introduce type-checker ambiguity).

In one case where I've already experimented with C++ interop, for the Maya C++ APIs, it would have been useful to be able to subclass/implement interfaces in Swift and pass those to C++ APIs. However, I don't see any real usefulness beyond that for calling Swift from C++. That's mainly because Swift is the main language for my codebase; for codebases which have more equal shares or incremental adoption of Swift I can see being able to call Swift code from C++ being very useful.

blangmuir · February 21, 2020, 5:50pm

I thought I would answer your survey with a concrete case for me: IndexStoreDB is a C++ codebase with a Swift interface that currently uses a C layer in the middle for interop.

At the interface boundary we have largish reference types such as the "IndexSystem" that have reference semantics, use either pimpl or abstract base classes, and are typically managed by std::shared_ptr. Index requests are methods on these classes and those methods use StringRef, ArrayRef and small value types as inputs and typically provide outputs using callbacks via function_ref that get called for each symbol. The index symbol types are a mix of small value types and shared_ptr-managed.
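The kind of C layer described here might be sketched as follows. All names (`SymbolIndex`, `symbol_index_*`, `count_symbol`) are hypothetical stand-ins and not IndexStoreDB's actual API; the assumption is that each C handle owns one heap-allocated `std::shared_ptr` copy, and that results are delivered through a function pointer plus context.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reference type that the C++ side manages with std::shared_ptr.
class SymbolIndex {
public:
    void add(std::string name) { names_.push_back(std::move(name)); }

    // Calls `visit` once per symbol and stops early if it returns false.
    template <typename Visitor>
    void for_each(Visitor&& visit) const {
        for (const std::string& name : names_) {
            if (!visit(name)) return;
        }
    }

private:
    std::vector<std::string> names_;
};

namespace {
using Shared = std::shared_ptr<SymbolIndex>;
}

extern "C" {

// Opaque handle; each one owns a heap-allocated shared_ptr copy.
typedef struct symbol_index_ref symbol_index_ref;
typedef bool (*symbol_visitor)(void* context, const char* name);

symbol_index_ref* symbol_index_create() {
    return reinterpret_cast<symbol_index_ref*>(new Shared(std::make_shared<SymbolIndex>()));
}

// Returns a new handle that shares ownership of the same index.
symbol_index_ref* symbol_index_retain(symbol_index_ref* ref) {
    return reinterpret_cast<symbol_index_ref*>(new Shared(*reinterpret_cast<Shared*>(ref)));
}

// Drops one handle; the index is destroyed when the last handle goes away.
void symbol_index_release(symbol_index_ref* ref) {
    delete reinterpret_cast<Shared*>(ref);
}

void symbol_index_add(symbol_index_ref* ref, const char* name) {
    (*reinterpret_cast<Shared*>(ref))->add(name);
}

void symbol_index_for_each(symbol_index_ref* ref, symbol_visitor visit, void* context) {
    (*reinterpret_cast<Shared*>(ref))->for_each(
        [&](const std::string& name) { return visit(context, name.c_str()); });
}

// Example visitor: counts symbols in the int that `context` points to.
bool count_symbol(void* context, const char*) {
    ++*static_cast<int*>(context);
    return true;
}

}  // extern "C"
```

A Swift wrapper class would call the release function in `deinit`, which is the boilerplate that direct C++ interop could remove.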

Ownership is typically "borrowed" like StringRef, or explicitly owned with std::shared_ptr.

Our C++ interface is unstable and we can make any changes we want to improve it.

If bridging were expensive, we would have to evaluate our APIs on a case-by-case basis and workaround the overhead as necessary. At the point we translate from C to Swift, we already make these kinds of decisions, so I don't see a big change here either way.

We need to efficiently call a Swift closure or else we would need to change how we pass results. It would also be handy to be able to call class/protocol methods, but it's less important.

Primarily calling existing C++ code.

Fabio_Kaminski · February 22, 2020, 2:32pm

For me, accurate management of the memory of C++ objects would be the most desired feature.

If Swift could calculate the size of an object (primitives, structs, classes) by looking at the class/struct declaration, and could use move, it would be great.

For instance, I must have stupid wrappers because the actual handle is a shared pointer.

If Swift could know the correct size, move, and use C++ constructors/destructors correctly, we could just pass the smart pointer, be it unique or shared.

Also, to allocate some C++ class/struct in (swift) stack as we can do with C would be awesome. In this case the big difference would be RAII being correctly triggered inside swift scopes.

This alone would be a game changer for swift.
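For reference, these are the C++ behaviours being asked for, shown in a small sketch. `Texture` and the helper functions are hypothetical; the live-instance counter exists only to make constructor and destructor calls observable.

```cpp
#include <memory>
#include <utility>

// Hypothetical resource that counts live instances.
struct Texture {
    static int live_count;
    Texture() { ++live_count; }
    ~Texture() { --live_count; }
    Texture(const Texture&) = delete;
    Texture& operator=(const Texture&) = delete;
};
int Texture::live_count = 0;

// Ownership moves into the function; the Texture is destroyed when it returns.
void consume(std::unique_ptr<Texture> texture) { (void)texture; }

// RAII: the destructor runs when the stack object leaves its scope.
int live_after_scope() {
    {
        Texture on_stack;  // constructor runs here
        (void)on_stack;
    }                      // destructor runs at this closing brace
    return Texture::live_count;
}

// Moving a unique_ptr transfers ownership and leaves the source null.
bool move_transfers_ownership() {
    std::unique_ptr<Texture> owner(new Texture());
    std::unique_ptr<Texture> new_owner = std::move(owner);
    return owner == nullptr && new_owner != nullptr;
}
```

For Swift to hold such values directly, it would need to know the type's size and run these same constructor, move, and destructor calls at the matching points in Swift scopes.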

Also, if feasible, it would be useful to be able to control the external C++ compiler and flags, even if this can only be done with Clang, rather than assuming that the Clang shipped with the Swift distribution should be used. I don't know if this can be done, but it would be nice to know you can use the "host" compiler that you are already using to compile your own C++ code.

Doug · February 22, 2020, 11:26pm

I’m not sure how to describe them, but I’d love a future where I could use OpenCV 4 from Swift

To what extent, if any, would you be able to change the C++ APIs to facilitate Swift/C++ interop?

Minimal to none, I would simply be a consumer of the API.

To what extent do you care about the performance overhead added by the interoperability layer? Would you prefer to see idiomatic Swift types in imported APIs at the cost of O(n) bridging conversions, or would you prefer high performance interoperability at the cost of seeing C++ types in imported APIs?

Very much so, as real-time use of the APIs with high resolution images would be amazing.

How important is calling Swift from C++ for your use cases?

In this instance not very high, although potentially being able to use Swift code in a C++ game engine would be super amazing if possible.

ctreffs · May 4, 2020, 8:27am

How would you describe the C++ APIs that you are most interested in calling from Swift?

I'm most interested in high performance computing libraries like physics engines (i.e. NVIDIA PhysX) that are currently not directly available in Swift.

To what extent, if any, would you be able to change the C++ APIs to facilitate Swift/C++ interop?

I would like to keep it minimal, I'd like to be only a consumer of the API, especially if users should be able to use the existing documentation of that particular API in my code as well.

To what extent do you care about the performance overhead added by the interoperability layer? Would you prefer to see idiomatic Swift types in imported APIs at the cost of O(n) bridging conversions, or would you prefer high performance interoperability at the cost of seeing C++ types in imported APIs?

Performance in these frameworks and libraries is critical, so I'd be definitely on the performance side here.

How important is calling Swift from C++ for your use cases?

Not very much.

Are you interested in just calling existing C++ code from Swift and vice versa, or are you also interested in migrating C++ code to Swift, using the Swift/C++ interop to facilitate an incremental rewrite?

That depends on the size of the code base, but generally speaking, yes, migrating would be part of my scope as well.