C++ function template specialization and generic functions in Swift

zoecarver · November 14, 2020, 10:38pm

Hello, all. This post will discuss the issues of using Swift's generics model with C++ function templates, namely, the best way to support calling C++ function templates with one or more generic arguments.

The current state of the world

Currently, we have very preliminary support for function templates. We can import function templates with a full concrete substitution map where all replacement types can be converted to C++ types. This means we can't invoke a C++ function template with generics or even custom object types.

Here are a few examples:

template<class T>
void myCxxFunctionTemplate(T) { }

func basicCaller() {
	myCxxFunctionTemplate(0) // Works!
}

func genericCaller<T>(arg: T) {
	myCxxFunctionTemplate(arg)
}

func basicGenericCaller() {
	genericCaller(0)
}

func complexGenericCaller(condition: Bool) {
	if condition { genericCaller(0 as UInt32) }
	else         { genericCaller(0 as Int32)  }
}

// Note: *public*
public func impossibleGenericCaller<T>(arg: T) {
	genericCaller(arg)
}

Currently, basicCaller works... and that's it. We need to determine how to support or gracefully fail for the rest of the test cases above.

Proposed solution

I propose that we essentially treat Swift generics as C++ templates; that is, we force all generic functions to be fully specialized if they invoke a C++ function template. This means, somewhere in the compiler, we would visit all invocations of any function that calls a C++ function template with a generic argument.

In the basic example, the compiler would find that basicGenericCaller calls genericCaller, which it knows calls a C++ function template. So, it would create a specialization of genericCaller where T = Int.

In the more complicated example: when the compiler visited complexGenericCaller, it would see that there are two calls to genericCaller and would then create two specializations of genericCaller with T = UInt32 and T = Int32.

Last, the compiler would see that it cannot make a specialization of genericCaller based on the function call in impossibleGenericCaller, so it would create a static error. If this were a private function, then the compiler would simply ignore it.

Likely, all specializations would be "transparent" or "always inlinable," so optimization passes could remove them.
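For comparison, here is a rough C++ sketch of the behavior the proposal borrows: a generic wrapper gets one instantiation per concrete type it is called with. The `instantiationLog` helper and the logging inside `myCxxFunctionTemplate` are hypothetical additions for illustration only; the function names otherwise mirror the examples above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical helper: records which concrete instantiations actually ran.
inline std::vector<std::string>& instantiationLog() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in for the C++ function template from the post (logging added here).
template <class T>
void myCxxFunctionTemplate(T) {
    instantiationLog().push_back(typeid(T).name());
}

// C++ analogue of the Swift genericCaller: the compiler emits a separate
// copy of this function for every T it is called with.
template <class T>
void genericCaller(T arg) {
    myCxxFunctionTemplate(arg);
}

// Mirrors complexGenericCaller: both branches force an instantiation at
// compile time, even though only one of them runs per call.
inline void complexGenericCaller(bool condition) {
    if (condition) genericCaller(std::uint32_t{0});
    else           genericCaller(std::int32_t{0});
}
```

Calling `complexGenericCaller(true)` and then `complexGenericCaller(false)` goes through two distinct instantiations of `genericCaller`, which is roughly what the proposed pass would produce for the Swift version.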

Specializing each function vs. "if" statements

Another possible implementation to improve code size would be to modify the body of genericCaller to call out to specific specializations of myCxxFunctionTemplate based on T. This could be achieved with an if-statement like so:

func genericCaller<T>(arg: T) {
  if T == Int.self {
    myCxxFunctionTemplate(arg as Int)
  } else if T == Int32.self {
    myCxxFunctionTemplate(arg as Int32)
  } else if T == UInt32.self {
    myCxxFunctionTemplate(arg as UInt32)
  } else {
    fatalError()
  }
}

I propose that we implement this as a possible optimization later on, and right now, we only implement the more straightforward solution outlined above.
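To make the trade-off concrete, the same dispatch idea can be sketched in C++ with `std::any` as the type-erased argument. This is only an analogy, not how the Swift compiler would lower it: the string return value is a made-up way to observe which specialization ran, and the fixed-width integer types stand in for Swift's Int, Int32 and UInt32.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <typeinfo>

// Stand-in for the C++ function template; the returned tag is hypothetical
// and only exists so the chosen specialization is observable.
template <class T>
std::string myCxxFunctionTemplate(T) {
    return std::string(std::is_signed_v<T> ? "signed" : "unsigned") +
           std::to_string(sizeof(T) * 8);
}

// One non-generic entry point that checks the dynamic type and forwards to a
// fixed, closed set of concrete specializations.
inline std::string genericCaller(const std::any& arg) {
    if (arg.type() == typeid(std::int64_t))
        return myCxxFunctionTemplate(std::any_cast<std::int64_t>(arg));
    if (arg.type() == typeid(std::int32_t))
        return myCxxFunctionTemplate(std::any_cast<std::int32_t>(arg));
    if (arg.type() == typeid(std::uint32_t))
        return myCxxFunctionTemplate(std::any_cast<std::uint32_t>(arg));
    // Plays the role of fatalError() in the Swift version.
    throw std::invalid_argument("no specialization for this type");
}
```

The set of branches has to be known when `genericCaller` is compiled, which is the same whole-program knowledge the per-caller specialization approach needs.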

Should specialization of generics happen in the type checker or a SIL pass?

I propose that the logic for analyzing invocations of C++ functions and their generic callers be implemented in a mandatory raw SIL pass, similar to SILGenCleanup or MandatoryCombine.

The current specialization logic exists in the type checker; we could conceivably implement this logic in the type checker as well, which would allow us to error earlier.

Why a mandatory SIL pass? There are a few reasons. First, it seems like the logical place to put this type of checking and transformation. In a SIL pass, it will be self-contained and separated from mostly unrelated type checking code. Other specialization and inlining passes have similar logic, so it makes sense to group all these transformations together.

Second, this logic will be fairly expensive, and I'm not sure we want to slow down the type checker by visiting and analyzing every call made in a program. Note: no matter where this logic is implemented, it will likely only be enabled when C++ interoperability is also enabled.

Third, we could potentially accept more code if this pass were run after mandatory inlining and other optimization passes that remove logic that would otherwise create an error. For example, we could potentially specialize glob in the below example, something that could not easily be done in the type checker:

var glob = { myCxxFunctionTemplate($0) }
func caller() { glob(0) }

@_must_specialize

The C++ Interoperability Manifesto suggests the addition of the @_must_specialize attribute. I do not think this attribute is necessary for implementing generic functions that call C++ templates. This may be a helpful feature to implement down the road, but the proposed pass would be capable of generating errors itself and could easily keep track of what functions must be specialized internally.

References

The C++ Interoperability Manifesto
Initial Support for C++ Function Templates

I look forward to your feedback and guidance.

zoecarver · November 14, 2020, 10:43pm

CC @John_McCall @gribozavr @hlopko @typesanitizer based on your comments in the linked PR.

Dante-Broggi · November 14, 2020, 11:08pm

I dislike this. It is causing the implementation of the function to affect how it can be called.

I would add an @attribute, perhaps called @compileConst (though I don't really like the name)
and the examples would work as:

template<class T>
void myCxxFunctionTemplate(T) { }

func basicCaller() {
	myCxxFunctionTemplate(0) // Works!
}

func genericCaller<@compileConst T>(arg: T) {
	myCxxFunctionTemplate(arg)
}

func basicGenericCaller() {
	genericCaller(0)
}

func complexGenericCaller(condition: Bool) {
	if condition { genericCaller(0 as UInt32) }
	else         { genericCaller(0 as Int32)  }
}

// Note: *public*
public func publicGenericCaller<@compileConst T>(arg: T) {
	genericCaller(arg)
}

zoecarver · November 14, 2020, 11:28pm

If we're going to do that, I'd rather just implement @_must_specialize. I think it's a bit more clear and could be used more broadly (not just for generics or C++ functions).

John_McCall · November 15, 2020, 12:53am

We actually can’t just implement all Swift generics as if they were C++ templates and force eager specialization all the way down; there are semantic limitations on what you can do with C++ templates that we don’t impose in Swift. For example, in Swift you can recurse in a generic function with a “larger” type argument, and as long as the recursion isn’t dynamically infinite it will succeed; in a C++ template, that kind of recursion would violate instantiation-depth restrictions (which exist specifically to allow that eager-specialization implementation).

Even if Swift as a language permitted mandatory eager specialization, actually doing it for all generics would require accepting sacrifices around some of our implementation goals around compile times and code size. And it would be necessary for all code just in case it used a C++ template.

So I think some way to force eager specialization — with all the concomitant expectations and restrictions — is probably necessary on some level, even if you can figure out some way to avoid writing it in common cases.
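The recursion point can be illustrated with a small C++ sketch. The function name `nest` and the `Depth` parameter are hypothetical. The key detail is that the depth has to be a compile-time constant: with a runtime counter instead, the compiler would have to instantiate `nest` for T, vector<T>, vector<vector<T>>, and so on without end, and would stop at its instantiation-depth limit.

```cpp
#include <cassert>
#include <vector>

// Recurses with a "larger" type argument at every step. Depth is a
// compile-time bound, so the chain of instantiations is finite.
template <int Depth, class T>
int nest(T value) {
    if constexpr (Depth == 0) {
        (void)value;  // nothing left to wrap
        return 0;
    } else {
        // The recursive call is instantiated with std::vector<T>, not T.
        return 1 + nest<Depth - 1>(std::vector<T>(1, value));
    }
}
```

Here `nest<3>(0)` instantiates four functions (for int and three levels of vector) and returns 3. The equivalent Swift generic could take the depth as an ordinary runtime argument because it does not need a separate copy of the function per type.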

zoecarver · November 15, 2020, 1:54am

We actually can’t just implement all Swift generics as if they were C++ templates and force eager specialization all the way down

Just to be clear, I'm only suggesting that we create specializations of functions that call C++ function templates-- not all Swift generics. I think this proposal essentially follows what you outlined in your comment here.

there are semantic limitations on what you can do with C++ templates that we don’t impose in Swift. For example, in Swift you can recurse in a generic function with a “larger” type argument, and as long as the recursion isn’t dynamically infinite it will succeed; in a C++ template, that kind of recursion would violate instantiation-depth restrictions (which exist specifically to allow that eager-specialization implementation).

Sure, in this case, we could just emit an error because there would be no way to determine what type to specialize the C++ function template with.

So I think some way to force eager specialization — with all the concomitant expectations and restrictions — is probably necessary on some level, even if you can figure out some way to avoid writing it in common cases.

I assume you're talking about a @_must_specialize attribute (or equivalent). My thought process was, there's no behavioral change whether we require Swift functions that call C++ function templates to have that attribute or not. The only difference is that in one case they would be required to explicitly mark the function as "must specialize" and in the other case it would be assumed (because there's no alternative).

I think there's a good argument for the expressiveness and clarity of requiring @_must_specialize.

michelf · November 15, 2020, 2:45am

This all looks fine functionally. Specializing might be a good approach to calling C++ code given this is how C++ expects things to be. But I wonder about error reporting.

I understand that within a module everything could become implicitly @mustSpecialize to accommodate the C++ call without having to bother the user with it. This is actually very nice, but there's a downside.

It looks like at module boundaries you'd need an explicit @mustSpecialize, which would imply @inlinable. If you forget to annotate your public function as such, the error message will have to report that, three levels deep somewhere, there's a C++ template function being called. It'll then force you to add attributes along the path of all those calls. It seems to me that errors would be easier to follow if @mustSpecialize was explicit everywhere and you didn't have to decipher errors spanning the whole call tree. So far I'm very happy that Swift avoided those error cascades.

zoecarver · November 15, 2020, 3:28am

I'm not sure we can ever support public functions that call C++ function templates, even with the @_must_specialize attribute. Unless we have the whole body of the function in the other module, we can't statically analyze it or generate a C++ function specialization.

michelf · November 15, 2020, 11:22am

Isn't that what @inlinable does, making the function body available across modules? @_must_specialize would have to require @inlinable when the function is public.

dabrahams · November 15, 2020, 6:20pm

Some colleagues (@pschuh, @gribozavr, @saeta) and I recently had a long discussion about related issues, and while we made quite a few wrong turns in our exploration, I think we eventually began to understand what needs to be done to make C++ templates interoperate more fully with Swift. I'm sorry that we can't present a simple distillation of the conclusions yet, but I still think it might be valuable for anyone exploring this stuff to read through it.

zoecarver · November 15, 2020, 7:27pm

Potentially down the road, we could do this, but we'd also have to have access to the clang module somehow so that we could generate specializations of the C++ function template.

My suggestion is, for the time being, we assume a public Swift function that calls a C++ function template will create an error.

zoecarver · November 15, 2020, 8:43pm

Did a quick read. My main takeaways are:

1. We are all in agreement that monomorphization (or specialization) is required for C++ templates (imported as generics) and anything that directly touches them (i.e., their callers and any generic functions above them in the call stack), but nothing else.
2. Function templates and class templates go hand in hand; we should be discussing them together. I didn't even bring up class templates in my proposal; this was an oversight of mine; we need to be discussing these as well.

I think we should discuss the two-phase type-checking idea brought up in that document. I'm not convinced we'd need a second type-checking phase. Let's look at some examples that will cause errors.

template<class T> struct type_wrapper { using type = T; };

template<class T>
struct has_the_thing { T get_the_thing() { ... } };
struct does_not_have_the_thing {};

template<class T> struct type_wrapper<has_the_thing<T>> {};

template<class T>
typename type_wrapper<T>::type get_the_thing_wrapper(T value) { return value.get_the_thing(); }

template<class T>
struct get_a_type { using type = has_the_thing<T>; };

template<>
struct get_a_type<int> { using type = does_not_have_the_thing; };

protocol HasTheThing {
  associatedtype T
  func get_the_thing() -> T
}

extension has_the_thing : HasTheThing {}

func test<T: HasTheThing>(h: T) -> T.T {
	return get_the_thing_wrapper(h)
}

func caller() {
  test(does_not_have_the_thing()) // Error during first phase of type-checking (pre-silgen). 
  test(has_the_thing<Int>()) // No errors during type checking. 
  // But when we go to specialize `get_the_thing_wrapper` Clang gives us an error. 
  // So we don't have to do anything.
}

func caller2<T>(value: get_a_type<T>.type) {
	test(value) // Calls test(has_the_thing<T>), so no phase 1 type-checking errors.
}

func caller3() {
  caller2<Int>(...)  // Oops, caller2 now calls test(does_not_have_the_thing()), 
  // but we didn't know that until  after specialization. The bad news is that there 
  // should have been a Swift error here because a non-"HasTheThing" type 
  // was passed to "test". The good news is we will still get an error from Clang. 
}

I think the last example (caller3) is really what we want to focus on. Is it OK to just rely on Clang errors? Maybe only as a first step?

Also, if anyone is interested in another example (from the document) here's how I'm proposing we import std::vector/VectorManualModel.

John_McCall · November 15, 2020, 8:44pm

Okay, so it sounds like you effectively want functions to have a must-specialize property (without necessarily an attribute to make it explicit) and for this to be inferred. I agree this is potentially workable, but you need to actually lay out what you think the inference rules should be.

zoecarver · November 15, 2020, 9:09pm

Yes, that is exactly what I'm thinking.

but you need to actually lay out what you think the inference rules should be.

I am proposing that the following are all inferred to be "must specialize": any C++ function template imported as a generic Swift function; any Swift function that calls a C++ function template with one or more generic arguments; any function that calls that function with one or more generic arguments; and so on, transitively. Note: I am not proposing that any other functions should be marked as "must specialize" (C++ or otherwise).

Functions that are marked as "must specialize" must be specialized with all concrete types, i.e., they cannot have any generic parameters or protocols[1].

Does that satisfy what you're asking? If not, could you explain more specifically what you are asking or what information you would still like to know? Sorry if this proposal is unclear. Thanks for the questions and feedback.

[1] I need to flesh out exactly what the "or protocols" part of this means.

zoecarver · November 15, 2020, 10:12pm

Thinking about it more, maybe a better way to express the same behavior would be to say: any C++ function template imported as a generic Swift function is marked as "must specialize." A function marked as "must specialize" must be specialized and called with all concrete types. Then the rest of the logic sort of falls out of this and allows us to have more freedom with the actual implementation (we could specialize all callers or we could create an if-statement in the future).

michelf · November 15, 2020, 10:59pm

That's what I thought you were suggesting at first when reading the pitch, assuming "must specialize" was being made implicit to remove clutter.

dabrahams · November 15, 2020, 11:45pm

Well, I'm not sure I agree with the premise that C++ templates can be “imported as [Swift] generics” in any meaningful way. Swift generics are fundamentally very different beasts.

My claim is that this is the complete statement (from the document): Every Swift generic that uses a C++ template must be monomorphized, as must every Swift generic that uses such a generic, transitively.

That includes generic types as well as functions, per your point 2, with which I agree.

OK…

My goodness, you've made these examples a bit complicated. Next time, when claiming there's an error, could you please spell out what the error is? I'm having to run this through a C++ compiler to analyze it.

func caller() {
  test(has_the_thing<Int>()) // No errors during type checking. 
  // But when we go to specialize `get_the_thing_wrapper` Clang gives us an error. 
  // So we don't have to do anything.
}

“Clang gives us an error” is part of phase 2 type checking. But in this case, because we have the declaration

extension has_the_thing : HasTheThing {}

Having clang give us the error is ergonomically suboptimal. What we should do as I mentioned here is to check has_the_thing<Int> for conformance to HasTheThing and issue a single diagnostic about its failure to conform (because there's no nested T type), rather than allow clang to complain every time we try to use has_the_thing<int>. In this case that can happen in phase 1 because has_the_thing<Int> is fully concretized at the point where it is bound to the generic parameter of test.
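As a loose C++ analogy for checking a requirement once at the boundary instead of letting errors surface inside an instantiation, here is a sketch using the detection idiom. The trait `provides_get_the_thing` and the wrapper `checked_get_the_thing` are hypothetical names, and the body of `get_the_thing` (elided in the earlier example) is filled in with a placeholder purely so the sketch compiles.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Placeholder body: returning T{} is an assumption made for this sketch.
template <class T>
struct has_the_thing { T get_the_thing() { return T{}; } };
struct does_not_have_the_thing {};

// Detection idiom: true only when T has a callable get_the_thing() member.
template <class, class = void>
struct provides_get_the_thing : std::false_type {};
template <class T>
struct provides_get_the_thing<
    T, std::void_t<decltype(std::declval<T&>().get_the_thing())>>
    : std::true_type {};

// The requirement is checked once, up front, with a single readable message,
// rather than failing wherever the member happens to be used.
template <class T>
auto checked_get_the_thing(T value) {
    static_assert(provides_get_the_thing<T>::value,
                  "T must provide get_the_thing()");
    return value.get_the_thing();
}
```

Passing `does_not_have_the_thing` to `checked_get_the_thing` would stop at the `static_assert`, which is the kind of single, early diagnostic a Swift-side conformance check could give.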

IMO that should fail to compile in phase 1, for a couple of reasons. The first is that there's no way to deduce T from a call site, and Swift forbids the declaration of generic functions with un-deducible generic parameters. The bigger reason is that test requires its argument to conform to HasTheThing, but there's nothing constraining the type of value to have that conformance. This is just one Swift generic function calling another, notwithstanding the use of a C++ generic type in the signature, and all the standard rules should apply.

For the sake of your third example, let's assume you had written:

protocol DefaultConstructible { init() }
protocol HasANestedType { associatedtype type }
extension get_a_type: HasANestedType {}

func caller2<T>(_: T.Type)
  where get_a_type<T>.type: HasTheThing & DefaultConstructible
{
  test(get_a_type<T>.type())
}

func caller3() {
  caller2(Int.self)
}

In this case, Swift typechecking fails in phase 1 because the constraints on caller2 aren't satisfied: get_a_type<Int>.type doesn't conform to HasTheThing.

I think the last example (caller3) is really what we want to focus on.

You seem to be claiming that Swift can offload all phase 2 typechecking to Clang, but I don't think any of your examples really touch the cases that I think make it necessary. The ones I can think of off the top of my head are all about checking conformances. If I write

extension X: Y {}

and X is a Swift generic, the compiler can check the conformance, for all concretizations of X, at the moment the conformance is compiled. If X is a C++ class template, the conformance can only be checked for a given specialization.

Is it OK to just rely on Clang errors? Maybe only as a first step?

As a first step, I'm for whatever gets the job done. If conformances are really the only places where Swift could end up doing type checking in phase 2, it may turn out that we can live without it; we'll have to see. My bet, though, is that a separate conformance check that limits the Clang errors that pop up deep inside template instantiations, is a huge win for usability.

I'll take a look at your gist next.

zoecarver · November 16, 2020, 2:47am

Well, I'm not sure I agree with the premise that C++ templates can be “imported as [Swift] generics” in any meaningful way. Swift generics are fundamentally very different beasts.

What I meant to say is "for any C++ templates that are able to be imported as Swift generics". What can and cannot be imported as a Swift generic is out of the scope of this proposal.

My goodness, you've made these examples a bit complicated. Next time, when claiming there's an error, could you please spell out what the error is? I'm having to run this through a C++ compiler to analyze it.

Sorry, and will do

“Clang gives us an error” is part of phase 2 type checking.

I misunderstood what a second type-checking phase meant. If it means "check that the argument types still match and propagate any Clang errors," I'm 100% on board.

Having clang give us the error is ergonomically suboptimal. What we should do as I mentioned here is to check has_the_thing<Int> for conformance to HasTheThing and issue a single diagnostic about its failure to conform (because there's no nested T type), rather than allow clang to complain every time we try to use has_the_thing<int>. In this case that can happen in phase 1 because has_the_thing<Int> is fully concretized at the point where it is bound to the generic parameter of test.

Let me think about this and circle back. But this tentatively sounds good to me.

For the sake of your third example, let's assume you had written:

Sure, let's keep going with your version of the "third example." With my proposal, this will not actually fail in the first type-checker phase. The reason for this is because of the way we currently (on ToT) handle class templates. Here's how we will import get_a_type:

template<class T> struct get_a_type { using type = has_the_thing<T>; };
template<> struct get_a_type<int> { using type = does_not_have_the_thing; };

struct get_a_type<T> { typealias type = has_the_thing<T> }
struct _CxxSpecialization_get_a_type { typealias type = does_not_have_the_thing }

Because these are generics, caller2 is going to pick get_a_type (not _CxxSpecialization_get_a_type) and that works fine because get_a_type<T>.type = has_the_thing<T>. Once we specialize, though, we pick the other "overload" (_CxxSpecialization_get_a_type) and fail with a Clang error (or I suppose we could just add a conditional diagnostic in the pass-- either way, we fail in the "second phase").
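On the C++ side, the selection between the general template and the explicit specialization can be checked directly. This sketch reuses the declarations from the thread (with a placeholder body for `get_the_thing`, which is an assumption) and only adds the `static_assert` lines.

```cpp
#include <cassert>
#include <type_traits>

// Placeholder body: returning T{} is an assumption made for this sketch.
template <class T>
struct has_the_thing { T get_the_thing() { return T{}; } };
struct does_not_have_the_thing {};

// General definition and the explicit specialization for int.
template <class T>
struct get_a_type { using type = has_the_thing<T>; };
template <>
struct get_a_type<int> { using type = does_not_have_the_thing; };

// For most T, the general definition is used...
static_assert(std::is_same_v<get_a_type<long>::type, has_the_thing<long>>);
// ...but for int the explicit specialization wins, so ::type is unrelated.
static_assert(std::is_same_v<get_a_type<int>::type, does_not_have_the_thing>);
```

Nothing about the general definition reveals that `get_a_type<int>::type` is a different type, which is why a check against only the general definition can pass and still be wrong once T is known.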

dabrahams:

The ones I can think of off the top of my head are all about checking conformances. If I write
extension X: Y {}
and X is a Swift generic, the compiler can check the conformance, for all concretizations of X, at the moment the conformance is compiled. If X is a C++ class template, the conformance can only be checked for a given specialization.

This brings up another question. In Swift the idea of type specializations doesn't really exist. So, when we import a specialization of a type, does that count as the "same" type or a different one? And should extensions apply to both types, or only one? If we say that extensions must be applied to a specific type specialization, then we could probably make this a "phase one" error which would be nice. For example:

template<class T> struct A { T x; }; 
template<> struct A<int> { int x; };

A<int> would not be extended to conform to Y below:

extension A: Y {}

You would specifically have to extend A<int> like so:

extension A<Int> : Y {}

What do you think?

My bet, though, is that a separate conformance check that limits the Clang errors that pop up deep inside template instantiations, is a huge win for usability.

Agreed.

dabrahams · November 16, 2020, 5:06pm

I understand. What I'm saying is that I'm not sure there exists a C++ template that can sensibly be imported as a Swift generic. The properties of Swift generics are just very different from those of C++ templates, and thinking of them as one thing (and especially representing them as one thing in the compiler) may not make sense.

“Clang gives us an error” is part of phase 2 type checking.

I misunderstood what a second type-checking phase meant. If it means "check that the argument types still match and propagate any Clang errors," I'm 100% on board.

Propagating Clang errors may be required, but I'm not thinking of phase 2 typechecking as being exclusively about clang errors. There's still a valuable role for Swift's type checker to play in phase 2.

let's keep going with your version of the "third example." With my proposal, this will not actually fail in the first type-checker phase. The reason for this is because of the way we currently (on ToT) handle class templates. Here's how we will import get_a_type:
template<class T> struct get_a_type { using type = has_the_thing<T>; };
template<> struct get_a_type<int> { using type = does_not_have_the_thing; };
struct get_a_type<T> { typealias type = has_the_thing<T> }
struct _CxxSpecialization_get_a_type { typealias type = does_not_have_the_thing }
Because these are generics, caller2 is going to pick get_a_type (not _CxxSpecialization_get_a_type) and that works fine because get_a_type<T>.type = has_the_thing<T>. Once we specialize, though, we pick the other "overload" (_CxxSpecialization_get_a_type) and fail with a Clang error (or I suppose we could just add a conditional diagnostic in the pass-- either way, we fail in the "second phase").

Yeah, I'm pretty sure that approach isn't going to work out well. For one thing, get_a_type might just as easily have been defined like this:

template<class> struct get_a_type;
template<> struct get_a_type<int> { using type = has_the_thing<int>; };

Now caller2 fails to compile because the compiler doesn't even have a general definition of get_a_type, but when actually called with Int.self it ought to work.

Secondly, Swift's overload resolution happens during phase 1 type-checking. That's a feature-not-a-bug that makes Swift generics more predictable than C++ templates. I don't think we want to introduce overload resolution into phase 2 of Swift if we can help it. Of course it's unavoidable on the C++ side.

This brings up another question. In Swift the idea of type specializations doesn't really exist.

Depends which of the several C++ meanings for “specialization” you intend. That's why I am using the word “concretization” instead to refer to a Swift generic type or function with all of its generic parameters replaced by concrete types.

So, when we import a specialization of a type, does that count as the "same" type or a different one?

It acts like the same generic type but applies to one or more concretizations (multiple if it is a partial specialization). This is analogous to the way a conditional extension or conformance applies to one or more concretizations of a Swift generic.

And should extensions apply to both types, or only one?

Unconditional extensions should apply to all concretizations. Conditional extensions should apply according to their conditions.

If we say that extensions must be applied to a specific type specialization, then we could probably make this a "phase one" error which would be nice.

Yes, full specializations of C++ templates are always type-checked in phase 1 (even in C++), because they are concrete types.

For example:
template<class T> struct A { T x; }; 
template<> struct A<int> { int x; };
A<int> would not be extended to conform to Y below:
extension A: Y {}
You would specifically have to extend A<int> like so:
extension A<Int> : Y {}
What do you think?

Since you asked… to me it seems like an unnecessary limitation that would make programming verbose and tedious. I imagine we'll want to be able to write a conformance of std::vector<T> to Sequence for all Ts, don't you? If you're asking about it as a short-term step toward full interoperability, though, I say again, “whatever works!”

Speaking of specializations, from this thread I conclude that Swift generics eventually need to gain similar expressive power to C++ templates. One of the issues raised there is that non-monomorphizability would force some ambiguities to be resolved at runtime in such a world. Now that we're talking seriously about bringing C++ templates into Swift, and with it, forced monomorphization, it's probably worth asking if there's a way to unify the solutions to these two issues. Maybe there's a way to force monomorphization of just the pieces of Swift that would be needed to resolve/report the ambiguities. And if we get that far, maybe thinking of C++ templates as being imported as Swift generics does make sense after all.

dabrahams · November 16, 2020, 6:00pm

OK, from that gist I think you're maybe missing the point of VectorManualModel in the document. Remember, in that discussion we're trying to find a general way to import templates. If we want the template mechanism to work in general, we can't even assume std::vector<T> has a visible body unless, say, T is movable; the general definition might be

template <class T, class A = std::allocator<T>> struct vector;

and the details might only be filled in via a partial specialization for movable Ts. Whether or not it's technically legal for the C++ standard library to define things this way is irrelevant to the exercise.

Also, for any given category of T, vector might have a partial specialization that provides a completely different definition. While a large majority of templates probably have a general definition that could serve the purposes of phase 1 type-checking in Swift, not all do.
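A sketch of that hypothetical layout follows, using a made-up `my_vector` rather than the real std::vector. The extra `Enable` parameter, the `is_complete` trait and the `Pinned` type are all assumptions introduced for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Primary template: declared but never defined, so it has no visible body.
template <class T, class A = std::allocator<T>, class Enable = void>
struct my_vector;

// The details are only filled in for move-constructible element types.
template <class T, class A>
struct my_vector<T, A, std::enable_if_t<std::is_move_constructible_v<T>>> {
    using value_type = T;
};

// Helper trait: true when T is a complete type at the point of the check.
template <class, class = void>
struct is_complete : std::false_type {};
template <class T>
struct is_complete<T, std::void_t<decltype(sizeof(T))>> : std::true_type {};

// A type that cannot be moved or copied.
struct Pinned {
    Pinned() = default;
    Pinned(Pinned&&) = delete;
};
```

With these definitions `my_vector<int>` is a complete type with a `value_type`, while `my_vector<Pinned>` names a type with no definition at all, so there is no general API an importer could read off the primary template.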

Further, the type information provided by the general definition could easily result in an incorrect lowering of a Swift generic that uses the template, because it looks concrete when in fact it depends on a generic parameter:

template <class T> struct X { using Y = int; };    // Y looks concrete
template <class U> struct X<U*> { using Y = U; };  // …but is dependent
template <> struct X<void> { };                    // …or even missing

func g<T>(_: X<T>) -> X<T>.Y { 3 } // should this compile?
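The three cases can be verified in plain C++. The template `X` is copied from the example above; the `has_Y` detection trait is a hypothetical helper added for this sketch.

```cpp
#include <cassert>
#include <type_traits>

template <class T> struct X { using Y = int; };    // Y looks concrete
template <class U> struct X<U*> { using Y = U; };  // ...but is dependent
template <> struct X<void> { };                    // ...or even missing

// Detection idiom: true only when T has a nested type named Y.
template <class, class = void>
struct has_Y : std::false_type {};
template <class T>
struct has_Y<T, std::void_t<typename T::Y>> : std::true_type {};

static_assert(std::is_same_v<X<char>::Y, int>);    // general definition
static_assert(std::is_same_v<X<char*>::Y, char>);  // partial specialization
static_assert(!has_Y<X<void>>::value);             // full specialization: no Y
```

So a Swift function returning `X<T>.Y` would have three different answers depending on T, and only the first of them is visible from the general definition.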

Our conclusion was that a Swift generic using X must assume, in general, that X has no knowable API. The point of XManualModel is to provide a declaration of the common API shared by all specializations of X (that are used by the program):

protocol XManualModel { associatedtype Y: DefaultConstructible }
extension X: XManualModel {} // checked in phase 2 for each X concretization used

func g<X1: XManualModel>(_: X1) -> X1.Y { .init() } // normal Swift type-checking
func g<T>(_: X<T>) -> X<T>.Y { .init() }  // ditto; maybe a bit more development work

We might be able to create some tools to assist with the generation of manual model protocols, and in some cases it may be possible to annotate a general C++ template such that the compiler can synthesize a protocol, but we think a system like this is probably needed for the general case.