[Swift-to-C++] incorporating Swift argument labels into generated C++ function name

Alex_L · December 23, 2022, 12:01am

The current prototype implementation of Swift-to-C++ ignores argument labels when generating C++ functions and methods that represent the native Swift functions. This approach has worked so far, but it's not without its flaws. The primary issue is that functions overloaded by argument labels don't map well to C++. For example, there is no way to distinguish these two methods from the Array Swift type in C++:

index(after:)
index(before:)

Mapping them to C++ index methods would produce compiler errors in the generated header on the C++ side.
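To make the conflict concrete, here is a minimal C++ sketch. It assumes a hypothetical `ArraySketch` class with `Index` mapped to `int`; it is not the header the compiler actually generates.

```cpp
#include <cassert>

// Hypothetical stand-in for a generated C++ class; not the real header.
struct ArraySketch {
    using Index = int;

    // Swift: index(after:)
    Index index(Index i) const { return i + 1; }

    // Swift: index(before:)
    // With the label dropped, this member has the same name and parameter
    // list as the one above, so uncommenting it is a redeclaration error.
    // Index index(Index i) const { return i - 1; }
};
```

Only one of the two Swift methods can occupy the `index(Index)` slot, which is why the label has to go somewhere in the C++ name.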

I've been thinking about some potential solutions. The Swift compiler could:

1. Detect conflicting overloads like the ones shown above and not emit them in the generated C++ header. The user would need to explicitly specify their names in C++ using the @expose attribute.
2. Incorporate some argument labels into the name of the C++ function if there is a naming conflict in the overload set. For example, this overload set:

index(_:offsetBy:)
index(after:)
index(before:)

Could be mapped to these C++ functions, renaming the last two overloads to prevent a conflict:

index()
indexAfter()
indexBefore()

However, both of these approaches have issues when it comes to source compatibility for C++ users. Future changes to such a Swift API that remove, rename or add argument labels and/or overloads can cause previously exposed Swift functions to be renamed in the C++ header. That would break C++ sources for users who call these APIs from C++. For example, if we add another overload with two arguments:

index(_:offsetBy:)
index(_:subtractedBy:)

That would cause the existing index overload to be renamed to indexOffsetBy, breaking source compatibility for C++ clients. I don't think this would lead to a great user experience, so I think that always incorporating argument labels might be the best option.
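The instability can be sketched in C++ under a loudly simplified rule. The conflict test below (same base name and same argument count) and the names `SwiftName`, `withLabels`, and `conflictOnlyNames` are assumptions made for illustration, not the compiler's actual logic.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A Swift function name: base name plus one label per argument ("" for `_`).
struct SwiftName {
    std::string base;
    std::vector<std::string> labels;
};

// Appends every non-empty label to the base name, capitalizing its first character.
std::string withLabels(const SwiftName& n) {
    std::string out = n.base;
    for (const std::string& label : n.labels) {
        if (label.empty()) continue;
        out += static_cast<char>(std::toupper(static_cast<unsigned char>(label[0])));
        out += label.substr(1);
    }
    return out;
}

// Assumed conflict rule: two overloads conflict when they share a base name
// and an argument count. Only conflicting overloads get labels in their name.
std::vector<std::string> conflictOnlyNames(const std::vector<SwiftName>& overloads) {
    std::map<std::pair<std::string, std::size_t>, int> groupSize;
    for (const SwiftName& n : overloads) ++groupSize[{n.base, n.labels.size()}];

    std::vector<std::string> names;
    for (const SwiftName& n : overloads) {
        bool conflicts = groupSize[{n.base, n.labels.size()}] > 1;
        names.push_back(conflicts ? withLabels(n) : n.base);
    }
    return names;
}
```

With the three-overload set, `index(_:offsetBy:)` keeps the bare name `index`. Once `index(_:subtractedBy:)` is added to the input, the same overload comes out as `indexOffsetBy`, which is exactly the rename that would break existing C++ callers.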

Always incorporate argument labels

I tried incorporating Swift argument labels into the names of the C++ functions/methods that represent Swift functions. So far this approach has worked quite well on the set of existing APIs that we expose into C++ from the Swift standard library.

For example, these methods in Swift's Array type:

index(after:)
index(before:)
insert(_:at:)
index(_:offsetBy:limitedBy:)
distance(from:to:)

Can now map to C++ without any ambiguities:

indexAfter()
indexBefore()
insertAt()
indexOffsetByLimitedBy()
distanceFromTo()

This approach does not have the downside of inadvertent source breaks on the C++ side whenever a new overload that uses different argument labels is added to the overload set with the same base name. Because of that, I think that this approach is the best option that I've explored so far. Swift APIs are expected to evolve over time so the C++ clients need to be resilient to trivial API additions like the ones mentioned in this post that don't break Swift code.

Right now this approach generates the C++ name by appending each argument label, with its first character capitalized, to the base name of the function. In the future we can improve the heuristics; for example, certain argument labels might not need to be capitalized.
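The rule described above can be sketched as a small C++ helper. The function name `cxxNameFromLabels` and the convention of passing an empty string for an unlabeled argument are assumptions for illustration; the prototype's real implementation lives in the Swift compiler and is not shown here.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Builds a C++ name from a Swift base name and its argument labels.
// An unlabeled argument (`_`) is passed as an empty string and adds nothing.
std::string cxxNameFromLabels(const std::string& base,
                              const std::vector<std::string>& labels) {
    std::string name = base;
    for (const std::string& label : labels) {
        if (label.empty()) continue;
        // Capitalize the first character of the label, then append the rest.
        name += static_cast<char>(std::toupper(static_cast<unsigned char>(label.front())));
        name.append(label, 1, std::string::npos);
    }
    return name;
}
```

For instance, `cxxNameFromLabels("index", {"", "offsetBy", "limitedBy"})` yields `indexOffsetByLimitedBy`, matching the mapping listed earlier.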

I'm planning to make this change in the prototype implementation of Swift-to-C++ interop in the compiler and pitch this approach to evolution in the future. Let me know if you have any feedback and/or concerns about this approach.

Torust · December 23, 2022, 2:12am

Another option to consider: only include an argument label in the C++ name if the label is explicitly stated separately from the parameter name in the function declaration. For example:

func index(after i: Self.Index) -> Self.Index

would still map to

Index Collection::indexAfter(Index i)

but something like

func split(
    maxSplits: Int = Int.max,
    omittingEmptySubsequences: Bool = true,
    whereSeparator isSeparator: (Self.Element) throws -> Bool
) rethrows -> [Self.SubSequence]

would be

Array<SubSequence> Collection::splitWhereSeparator(size_t maxSplits, bool omittingEmptySubsequences, std::function<bool(Element)> isSeparator)

(approximating how the types would be mapped over).

Jumhyn · December 23, 2022, 3:41am

I'm definitely on board with always incorporating the argument labels into the C++ name.

How does (or should) this approach address ambiguities caused by the loss of boundary information in the Swift-name-to-C++-name translation? E.g., these Swift names would all map to indexAfterFirst on the C++ side, no?

index(after:first:)
indexAfter(first:_:)
index(afterFirst:_:)

stevapple · December 23, 2022, 4:11pm

Alex_L:

For example, these methods in Swift's Array type:
index(after:)
index(before:)
insert(_:at:)
index(_:offsetBy:limitedBy:)
distance(from:to:)
Can now map to C++ without any ambiguities:
indexAfter()
indexBefore()
insertAt()
indexOffsetByLimitedBy()
distanceFromTo()

I believe this is the correct approach, although it may look weird for some functions. I thought about adding words like With or And to make the names nicer, but there’s no “one rule for all”. Simply enumerating the labels as they are looks good.

For a good C++ interop user experience, library authors should always be able to annotate a function with a custom name if the synthesized name feels unnatural. Regarding compatibility, the generated header should always include the synthesized names, but mark them deprecated with a message (like what we have in Swift) once a custom name is given.
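As a rough sketch of that idea, a generated header could keep the synthesized name as a deprecated forwarding member. The class `CollectionSketch` and the custom name `next` are hypothetical, and this is not something the compiler is known to emit today.

```cpp
#include <cassert>

// Hypothetical fragment of a generated header; all names are assumptions.
struct CollectionSketch {
    using Index = int;

    // Custom C++ name chosen by the library author.
    Index next(Index i) const { return i + 1; }

    // Synthesized name, kept so existing C++ callers still compile,
    // but flagged with a message pointing at the custom name.
    [[deprecated("renamed to 'next'")]]
    Index indexAfter(Index i) const { return next(i); }
};
```

Callers of `indexAfter` would keep compiling and get a deprecation warning, while new code uses `next`.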

tevamerlin · December 23, 2022, 10:48pm

I think you’re right: boundaries should get preserved in order to avoid this source of ambiguity. Using _ for this purpose, the examples you gave would then map to:

index_after_first
indexAfter_first
index_afterFirst

Of course, even this approach doesn’t guarantee a 100% absence of ambiguity, e.g. if the Swift name index_after_first() is also present in the code. But since Swift naming conventions don’t lead to this kind of name, the compromise is probably ok.
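A minimal C++ sketch of this boundary-preserving scheme follows. The helper name `cxxNameWithBoundaries` is hypothetical, and skipping unlabeled arguments (passed as empty strings) is an assumption chosen to match the three examples above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Joins the base name and each argument label with '_' so the boundary
// between base name and labels survives in the C++ name.
// Unlabeled arguments are passed as "" and are skipped (an assumption).
std::string cxxNameWithBoundaries(const std::string& base,
                                  const std::vector<std::string>& labels) {
    std::string name = base;
    for (const std::string& label : labels) {
        if (label.empty()) continue;
        name += '_';
        name += label;
    }
    return name;
}
```

Under this rule the three Swift names that previously collapsed into `indexAfterFirst` get three distinct C++ names.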

jrose · December 23, 2022, 11:21pm

FWIW ObjC translation doesn’t distinguish index(after:) and indexAfter(_:) either. I think you can get away with the shorter-and-simpler thing here; this is, after all, not an arbitrary FFI but only the set of methods someone has chosen to expose.

Jumhyn · December 23, 2022, 11:34pm

I understood the plan of record to be that exposure of Swift constructs in the C++ interface is opt-out rather than opt-in (and that library clients should be able to use C++ interop with libraries that have not considered interop at all and for which the user does not control the source), so that does make these issues a little more germane. It would be a shame if you just… couldn’t use some function in a library across the interop boundary because it happens to be named in a way that causes collisions for the interop machinery.

I agree we shouldn’t make member names unnecessarily ugly in the common cases. An underscore looks okay for the examples above IMO (modulo the somewhat strange mixing of camel- and snake-case), but would the same be applied to unlabeled arguments?

index(_:offsetBy:)
index___offsetBy()

Jumhyn · December 24, 2022, 12:01am

Even beyond formal ambiguities/collisions, I can imagine situations where discarding boundary information would lead to downright misleading function names. E.g., if

func perform(after delay: TimeInterval, work: @escaping () -> Void)

became

performAfterWork(...)

on the C++ side, it suddenly appears as though 'after' is modifying 'work' rather than labeling its own argument. Of course, in this situation you could probably figure out what was going on just based on the argument types, but I suspect there are cases where we could construct some very misleading names. Also, I'm not sure underscores would help us much in this case:

perform_after_work(...) // not much clearer!

We could potentially mitigate this by making use of the parameter name in addition to the argument label, when available:

perform_afterDelay_work(...) // better!

but this has the highly notable downside that it would cause changing the internal parameter name to become a source-breaking change for C++ interop clients, so this doesn't seem like a road we should go down.

LucianoPAlmeida · December 24, 2022, 1:56am

An argument's type can be a tuple that has labels as part of it. For example:

struct S {
  // These are considered two different overloads.
  func f(_: (a: Int, b: Int)) { print("labeled") }
  func f(_: (Int, Int)) { print("not labeled") }
}

let s = S()
let a = (a: 1, b: 1)
s.f(a)
let b = (1, 1)
s.f(b)
// Prints:
// labeled
// not labeled

Note that the labels can be nested, e.g. func fr(_: (a: (c: Int, d: Int), b: Int)).

Such a type could also be the return type of a function, so the following are also considered different overloads.

  func g() -> (a: Int, b: Int) { fatalError() }
  func g() -> (Int, Int) { fatalError() }

Depending on how the interop for tuples works, those could be different types in C++, so there would not be an ambiguity problem in that case.

Still, I thought it was worth asking whether those labels should somehow be considered in this C++ translation.
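To illustrate the "different types" case, here is a C++ sketch under a stated assumption: the labeled tuple maps to a struct with named fields and the unlabeled one maps to `std::tuple`. Both mappings, and the names `LabeledPair`, `UnlabeledPair`, and `SSketch`, are hypothetical; how tuples are actually bridged is not specified in this thread.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical mapping of the Swift tuple (a: Int, b: Int); an assumption.
struct LabeledPair { long a; long b; };
// Hypothetical mapping of the Swift tuple (Int, Int); also an assumption.
using UnlabeledPair = std::tuple<long, long>;

struct SSketch {
    // Distinct parameter types, so C++ overload resolution can tell them apart.
    std::string f(LabeledPair) const { return "labeled"; }
    std::string f(UnlabeledPair) const { return "not labeled"; }
};
```

This only helps for parameters. The two `g()` overloads differ solely in return type, and C++ does not allow overloading on return type alone, so they would still need distinct names.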

masters3d · December 24, 2022, 2:12am

I was thinking something like this as well but I don’t think the tuples work because tuples do not support default arguments.

In fact, once upon a time Swift arguments were backed by a Swift tuple, but this was removed a few versions ago.

masters3d · December 24, 2022, 2:24am

How about adding the argument position to the name?

… index_arg1after_arg2first(…)
… indexAfter_arg1first_arg2(…)
… index_arg1afterFirst_arg2(…)

Jumhyn · December 24, 2022, 2:38am

Unfortunately I think this falls into the realm I mentioned above:

There are a lot of tricks we could do that remove any chance of collisions at the cost of being completely ugly in C++ land (e.g. just use the mangled name!) but those sorts of approaches seem like non-starters to me.

masters3d · December 24, 2022, 2:57am

Verbosity is in the eye of the beholder. C++ has always been verbose in my book.

Jumhyn · December 24, 2022, 4:38am

It's not really verbosity that I'm objecting to. IMO verbosity is perfectly fine, but we should try to make sure that in most common cases people end up with C++ names that would not be totally ridiculous to write natively. Something like index_arg1after_arg2first might save us from conflicts, but it would make the experience of using Swift APIs from C++ extremely suboptimal.

Karl · December 24, 2022, 6:28am

Apparently there are ways to fake it. Not sure how well it would work when applied to an entire library, and the "=" sign doesn't work quite as well as a ":" for this purpose, but it might be worth exploring.

(See also: The Boost Parameter library by @dabrahams)

index(after = i);

Obviously it would be better to have things bridge to more idiomatic C++, but there are expressivity differences (e.g. overloading based on a function's return type, supported in Swift but not C++), so IMO we will have to accept some compromises.
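For readers unfamiliar with the trick, the following C++ sketch shows the general idea behind `index(after = i)`: a keyword object whose `operator=` wraps the value in a label-specific type. This is written in the spirit of that idiom and is not Boost.Parameter's actual API; the namespace `labels_sketch` and every type in it are hypothetical.

```cpp
#include <cassert>

namespace labels_sketch {

// Wrapper types that carry the argument label in the type system.
struct AfterArg { int value; };
struct BeforeArg { int value; };

// Keyword objects: `after = 3` produces an AfterArg holding 3.
struct AfterKeyword {
    AfterArg operator=(int v) const { return AfterArg{v}; }
};
struct BeforeKeyword {
    BeforeArg operator=(int v) const { return BeforeArg{v}; }
};

constexpr AfterKeyword after{};
constexpr BeforeKeyword before{};

// Both overloads share the base name; the wrapper type selects one.
int index(AfterArg arg) { return arg.value + 1; }
int index(BeforeArg arg) { return arg.value - 1; }

}  // namespace labels_sketch
```

With `using namespace labels_sketch;` in scope, a call reads `index(after = 3)` or `index(before = 3)`, so the base name stays `index` at the cost of extra keyword objects and wrapper types per label.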

Alex_L · February 3, 2023, 1:59am

Thanks for the feedback here. I haven't made this change yet given some of the issues highlighted in this thread. I'm hoping to come back to this in the next 1-2 weeks.

Alex_L · February 3, 2023, 2:01am

Karl:

Apparently there are ways to fake it. Not sure how well it would work when applied to an entire library, and the "=" sign doesn't work quite as well as a ":" for this purpose, but it might be worth exploring.

(See also: The Boost Parameter library by @dabrahams)
index(after = i);
Obviously it would be better to have things bridge to more idiomatic C++, but there are expressivity differences (e.g. overloading based on a function's return type, supported in Swift but not C++), so IMO we will have to accept some compromises.

We thought about this approach while working on the initial design, but for now we decided that it's not really worth pursuing, as it doesn't fit into existing idiomatic C++ code bases that well and is a little problematic for tooling too.

austintatious · February 3, 2023, 4:59am

Would something like the following be a good approach, where you can specify the symbol/identifier for the function explicitly?

@cpp(exportAs: "nonConflictingName") func conflictingName(this: A, that: B) {}

When not supplied, the compiler could give some sort of default like conflictingName_this1_that2_cpp.

At least then you can avoid ugly names if you don't like them. @objc has something similar for selectors.

Alex_L · February 3, 2023, 2:42pm

Yes, that's already supported using the @_expose attribute:

@_expose(Cxx, "nonConflictingName")
func conflictingName(...)