Nested generic trouble

C++ templates are an entirely compile-time construct. There is no runtime support, and, unlike a Swift generic function, a C++ function template does not by itself emit an exported symbol into the binary.

@inlinable/@frozen/@usableFromInline are, in my opinion, quite useless outside of resilient modules. They are only used in SPM packages because either SPM doesn't pass the correct flags to let the compiler see the entirety of the package source, or the compiler just doesn't have the free rein that @lukasa mentioned. (I have no idea what binary packages do, so this argument only applies to packages whose sources are readily available.)


around 4 or 5 years ago i discovered underspecialization was responsible for about a factor 10x slowdown in swift-png, because its pixel types are generic over FixedWidthInteger & UnsignedInteger to handle varying color depth.

about a year ago i discovered a very similar issue in swift-json, because its parser is generic over RandomAccessCollection<UInt8>.

you can also diagnose this issue directly with profiling tools like perf, by looking for clumps of "type metadata"-related samples.

but more often i discover it because i add a public convenience API somewhere else that uses some concrete type like [UInt8], and all of a sudden the generic benchmarks run 10x faster because the compiler generates a specialization for the convenience API to use, and the benchmarks all start calling the specialized function instead of the unspecialized one.
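
a minimal sketch of that pattern, assuming a hypothetical `checksum` routine (not taken from swift-png or swift-json). whether the compiler actually emits and reuses a `[UInt8]` specialization depends on optimization settings and module boundaries, so treat the comments as a description of the effect reported above rather than a guarantee:

```swift
// hypothetical library routine, generic over its input bytes.
public func checksum<Bytes>(_ bytes: Bytes) -> UInt32
    where Bytes: RandomAccessCollection, Bytes.Element == UInt8
{
    var sum: UInt32 = 0
    for byte in bytes {
        // wrapping arithmetic so large inputs cannot trap on overflow
        sum = sum &* 31 &+ UInt32(byte)
    }
    return sum
}

// hypothetical convenience API over a concrete type. calling the generic
// routine with [UInt8] inside the defining module gives the optimizer a
// reason to generate a [UInt8] specialization there.
public func checksum(bytes: [UInt8]) -> UInt32 {
    checksum(bytes)
}

let input: [UInt8] = [1, 2, 3]
print(checksum(input))        // generic entry point
print(checksum(bytes: input)) // concrete convenience entry point
```

both calls compute the same value; the only difference is which entry point the caller goes through.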

as a side note, even @inlinable isn't enough sometimes to fix this problem. sometimes even the mere presence of generics from outer scopes can cause major performance changes when nesting and unnesting types from namespaces, see "Un-nesting a type from an enum namespace results in a 6x slowdown".


This is good information, but it doesn’t sound related to “copy-by-default”.

copy-by-default is simply the observation that marshaling a pixel buffer of [RGBA<UInt16>] is always going to be faster than marshaling a pixel buffer of [RGBA<T>], because the compiler knows that RGBA<UInt16> is a trivial type, but it has no idea how to copy RGBA<T> without consulting the value witness.
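
a rough sketch of the two cases, assuming a simplified `RGBA` type and hypothetical `flatten` helpers (swift-png's real types differ). the comments describe what the compiler can know in each case, not measured behavior:

```swift
// simplified stand-in for a pixel type that is generic over its channel type
struct RGBA<T> {
    var r: T, g: T, b: T, a: T
}

// generic version: if this is not specialized, the compiler does not know
// the size of T or how to copy it, so each copy of a T goes through
// T's value witnesses at runtime.
func flatten<T>(_ pixels: [RGBA<T>]) -> [T] {
    var samples: [T] = []
    samples.reserveCapacity(pixels.count * 4)
    for pixel in pixels {
        samples.append(pixel.r)
        samples.append(pixel.g)
        samples.append(pixel.b)
        samples.append(pixel.a)
    }
    return samples
}

// concrete version: RGBA<UInt16> is known to be four plain 16-bit integers,
// so copies can be compiled down to ordinary loads and stores.
func flatten16(_ pixels: [RGBA<UInt16>]) -> [UInt16] {
    var samples: [UInt16] = []
    samples.reserveCapacity(pixels.count * 4)
    for pixel in pixels {
        samples.append(pixel.r)
        samples.append(pixel.g)
        samples.append(pixel.b)
        samples.append(pixel.a)
    }
    return samples
}

let pixels: [RGBA<UInt16>] = [RGBA(r: 1, g: 2, b: 3, a: 4)]
print(flatten(pixels), flatten16(pixels))
```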

on the other hand, passing the array itself wouldn't be affected. pass by reference would only really help when you have the "giant structs" pattern.

It is my understanding (from chats with @Erik_Eckstein) that there are good reasons that "just behave as though everything is @frozen and @inlinable" is not a behaviour that SPM chooses by default. However, a good start might be for SPM to start passing -cross-module-optimization, whose default heuristic targets generic and small functions.

Apropos of nothing, but the much more substantial limitations to highly generic code IMO come in places where @inlinable simply cannot fix the problem in Swift today, as those areas are also resistant to cross-module optimization. The most notable of these is "generic types held in protocol existentials". In this circumstance, the protocol witness table will always be one for the generic functions, even if at the point of constructing the existential the compiler knew the generic types and could provide specialized alternatives.
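
A minimal sketch of the shape of code being described, using hypothetical `Shape` and `Square` names. The comments restate the claim above as I understand it; they are not something this snippet can demonstrate by itself:

```swift
protocol Shape {
    func area() -> Double
}

// A generic type conforming to the protocol.
struct Square<Scalar: BinaryFloatingPoint>: Shape {
    var side: Scalar

    func area() -> Double {
        Double(side * side)
    }
}

// At this point the compiler knows Scalar == Float, but the value is stored
// in an existential. Per the point above, the witness table carried by the
// existential refers to the generic implementation of area(), not to a
// version specialized for Float.
let shape: any Shape = Square<Float>(side: 3)

// Calls through the existential dispatch via that witness table.
print(shape.area())
```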


This is somewhat in contradiction to the spirit if not the letter of the fourth goal from the mentioned document:

Even if you monomorphize all of your generics (the way Rust does), there's still benefit to type checking generic definitions separately (like Rust and Swift do), and not after template expansion.


The @inlinable discussion in this thread actually doesn't get to the crux of the issue. To implement the behavior you're asking for in Swift would require the call to bar to be dynamically dispatched from inside foo. That's not how it works today. The compiler type checks the body of foo, and finds the most specific overload of bar which applies to the generic parameter T at compile time. That is always going to be bar<T: P>, since foo calls bar with a T: P and not an S.

The alternative would be for the compiler to insert a dynamic dispatch there, checking the type of T at runtime to see if it's actually an S (and possibly elide the dispatch if foo was inlined and specialized at the call site). But that would be a big change.
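
For reference, the current static behavior can be reproduced with a sketch like the following. The names P, S, foo, and bar follow the discussion; the exact signatures, and returning strings instead of printing, are assumptions made for illustration:

```swift
protocol P {}
struct S: P {}

// Two free-function overloads: one generic, one for the concrete type S.
func bar<T: P>(_ value: T) -> String { "P call" }
func bar(_ value: S) -> String { "S call" }

func foo<T: P>(_ value: T) -> String {
    // Overload resolution happens once, when foo's body is type checked.
    // All that is known here is T: P, so bar<T: P> is selected.
    bar(value)
}

print(bar(S())) // S call (the concrete overload is visible at this call site)
print(foo(S())) // P call (the choice was already made inside foo)
```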

Instead, you might want to turn bar into a requirement of P with a default implementation and another one for S:

protocol P { func bar() }
extension P { func bar() { print("P call") } }
struct S: P { func bar() { print("S call") } }

func foo<T: P>(_ value: T) {
  value.bar()
}

Java uses <> syntax for a completely different generics model. As explained in 'compiling Swift generics', it's even more dynamic than Swift, as all generic types are type-erased at compile time. C++ is not the only game in town for generics implementation strategies, or even just for angle-bracket syntax, among programming languages.


This is really the key point. You can opt-in to the behavior that you want (or expect) by making the operation that you want to customize be a protocol requirement. We have both kinds of method dispatch: country and western.
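
A small sketch contrasting the two kinds of dispatch, assuming a hypothetical extra method `baz` that is declared only in the extension (it is not part of the earlier example), with strings returned instead of printed so the results are easy to compare:

```swift
protocol P {
    func bar() -> String // a requirement: a customization point
}

extension P {
    func bar() -> String { "P bar" } // default implementation of the requirement
    func baz() -> String { "P baz" } // NOT a requirement, only an extension method
}

struct S: P {
    func bar() -> String { "S bar" } // satisfies the requirement
    func baz() -> String { "S baz" } // merely shadows the extension method
}

func foo<T: P>(_ value: T) -> (String, String) {
    // bar() is dispatched through the conformance, so S's version runs.
    // baz() is resolved statically against P, so the extension's version runs.
    (value.bar(), value.baz())
}

print(foo(S())) // ("S bar", "P baz")
print(S().baz()) // S baz (the concrete type is known at this call site)
```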


In addition to not being able to ship binary frameworks, the fundamental flaw with the C++ model is that type checking after template expansion is not really amenable to good tooling or diagnostics. Say you have something like


func foo<T>(t: T) {
  t.bar(123)
}

With the C++ model, the type signature of foo() doesn't describe the fact that it expects T to have an instance method named bar which can take an integer (or something expressible by an integer literal). When I invoke foo() with some specific concrete type, I expand the template and only then do I look up the definition of bar() on this type. It might take an Int, or it might take a String, or it might not exist at all. In the latter two cases, the error message at the call site will point inside the template expansion. This is simply unworkable for complex templates and leads to unusable error messages where it is not really clear what the source of the error is. By enforcing generic requirements as part of the function's type, Swift is able to separately type check the body of foo() and the caller, which is really much better.
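
For comparison, here is a sketch of how the same function has to be written in Swift, assuming a hypothetical protocol `HasBar` and conforming type `Widget`. The requirement on T is stated in foo's signature, so the body and each caller can be checked independently:

```swift
// Hypothetical protocol naming the capability that foo needs from T.
protocol HasBar {
    func bar(_ value: Int) -> String
}

// The constraint T: HasBar is part of foo's type. Without it, the call
// t.bar(123) is rejected when foo itself is type checked, before any
// caller exists.
func foo<T: HasBar>(t: T) -> String {
    t.bar(123)
}

struct Widget: HasBar {
    func bar(_ value: Int) -> String { "bar(\(value))" }
}

print(foo(t: Widget())) // bar(123)

// Passing a type that does not conform, e.g. foo(t: "text"), is diagnosed
// at the call site as a missing conformance to HasBar, not inside foo's body.
```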


If I may try to take @taylorswift’s POV for a second, how often are framework vendors forced to publish source code (via @inlinable) rather than binary interfaces for performance reasons?

i’m curious, why does SPM turning everything @inlinable cause problems? i always understood the attribute as meaning “this thing can be inlined” as opposed to “this thing must be inlined”, like @inline(__always).

You use SwiftPM to build an artifact against the source of its dependencies, while distributing a binary that is linked dynamically against binary dependencies.

I don't know, but there's no possibility that switching to the C++ model of type-checking after template expansion would allow framework vendors to publish less source code in their interfaces.

this is true of any implicitly added @inlinable, is it not?

This is undeniably true (you can’t type-check after substitution if you have no source to substitute into), but not what I intended to argue. The point I wanted to raise is that there might be a gap between the perception and reality of Swift’s success at implementing a workable generics model that does not expose source code. @taylorswift has expressed very strong opinions about the need for liberal use of @inlinable to get acceptable performance, but that’s arguably off-topic for this thread.

FWIW, it looks like generics in C# behave the same for this.

interface P {}
struct S : P {};


static class C
{
    static void foo<T>(T value) where T : P {
        bar(value);
    }

    static void bar(S _) { Console.Write("S call\n"); }

    static void bar<T>(T _) where T : P { Console.Write("P call\n"); }

    public static void test() {
        S s;
        bar(s);
        foo(s);
    }
}

produces:

S call
P call

Likewise with Java generics: Online Compiler and Editor/IDE for Java, C/C++, PHP, Python, Perl, etc

And Kotlin: Kotlin Playground: Edit, Run, Share Kotlin Code Online

…

And that's about the limit of OO languages I could find that support generic functions and overloading functions with a generic and non-generic variant.