Nested generic trouble

hassila · February 18, 2023, 11:08am

While I do agree with the overall sentiment, I’d just caution that, at least in very simple testing, we saw performance regressions. We stopped testing because it was unclear whether this was still being pursued.

Perhaps time to give it another spin…

taylorswift · February 18, 2023, 7:25pm

i don’t think “does not expose source code” needs to be the only criterion for a workable generics model, generics are useful even when they are totally inlinable. so i don’t think there ever was a perception that generics are about resilience, resilience is just something generics can also do alongside templating.

what @stuchlej was asking about with the overload resolution has to do with swift choosing a particular set of overload resolution rules that makes sense for resilient generics, and that set of rules is different from the one that makes sense for inlinable generics, like C++ has. so for consistency, swift just made the resilient overload resolution rules apply to the inlinable case as well. that’s why you can’t override the default overload resolution rules with @inlinable and get C++ style dispatch.

it is confusing to people coming from a C++ background, but it’s mostly unrelated to the performance aspects of generics.

to me, the difficulties i’ve had working with generics come from the fact that we don’t have any compile-time checking for inlinable generics to make sure they are always specialized when you expect them to be. so the monomorphization often fails for interesting reasons that are hard to understand or anticipate. and the compiler doesn’t tell you that it failed: you have to read the assembly at every usage site to see what the compiler did.

i think many of these issues with swift generics could be resolved if we just had some kind of attribute like @specialized that attaches to type parameters and tells the compiler that the type parameter it’s attached to is expected to be used only in declarations emitted into the client. (this would be different from the current behavior of the underscored @_specialize attribute.)

@specialized(T)
struct RGBA<T>

public and @usableFromInline internal things that use the type parameter would have to be @inlinable public, or the compiler would emit a warning.

extension RGBA
{
    public
    var premultiplied:Self
//  ^~~~~~
//  warning: computed property 'premultiplied' must be marked @inlinable,
//  because it uses a specialized type parameter 'T'

but things that don’t use the type parameter wouldn’t need to be inlinable.
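For contrast with the proposed attribute, here is a minimal sketch of the existing underscored attribute, spelled `@_specialize`. As far as I understand it, it asks the compiler to emit pre-specialized copies of a generic function inside the defining module for the listed types; it does not check that anything in a call chain is inlinable. The function name `total` is made up for illustration, and because the attribute is unofficial its spelling and behavior may change.

```swift
// Underscored, unofficial attribute: treat this as a sketch, not a stable API.
// Each attribute requests one pre-specialized copy of `total` in this module.
@_specialize(where T == Int)
@_specialize(where T == Double)
public func total<T: AdditiveArithmetic>(_ values: [T]) -> T {
    var sum = T.zero
    for value in values {
        sum += value
    }
    return sum
}

print(total([1, 2, 3]))    // 6
print(total([0.5, 1.5]))   // 2.0
```

Calls with other element types still go through the unspecialized generic entry point, which is the silent fallback the hypothetical @specialized attribute would be meant to diagnose.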

stuchlej · February 18, 2023, 7:58pm

Nice summary.

I would just add, for the sake of this discussion, that there is a private annotation, @_assemblyVision, that informs you about such things. (See PSA: Compiler optimisation remarks.) I wasn't able to get it working in my Xcode project, but it works well with SPM (and the VSCode plugin).

ksluder · February 18, 2023, 8:04pm

@inlinable works by publishing the function source.

I’m confused by your use of “inlinable”. Calling template<typename T> void func(T *p) as func(reinterpret_cast<typeof(&someIntVariable)>(someVoidPtr)) doesn’t “inline” anything; it generates a new function definition where T is replaced by the deduced type int (so p has type int *), not by the tokens reinterpret_cast<typeof(&someIntVariable)>.

The call to this newly generated function may or may not itself be inlined into its callers, enabling further optimizations. Swift can do the same, if it has access to the function’s source code. C++ guarantees that it’s available because it’s up to the caller to trigger template instantiation, even if the caller is in a different library.

Fun fact: there apparently used to be an export keyword that allowed C++ templates to be generated by the defining library, not by the caller. In other words, you could make your C++ templates link like Swift generics. That doesn’t change the token substitution model, though!

Terminology issues aside, your experience is informative.

Slava_Pestov · February 18, 2023, 8:31pm

There are two mostly orthogonal issues here: whether generics are separately type checked, and whether calls to generic functions are monomorphized. As I explained in my earlier post (Nested generic trouble - #31 by Slava_Pestov), the C++ behavior of type checking after template expansion is undesirable for various reasons, even in a language that otherwise monomorphizes generics. For example, Rust monomorphizes generics, but it still type checks them separately first because it’s a better semantic model. The “C++ style dispatch”, as you call it, can already be achieved in a more principled way using protocols (Nested generic trouble - #28 by Slava_Pestov).

taylorswift · February 18, 2023, 10:51pm

it’s reasonable to assume that when @stuchlej asked about

public struct Container<T: Encodable> {
  public var value: T

  public func bytes(_ val: Int) { print("Int") }
  public func bytes(_ val: String) { print("String") }
  public func bytes<A: Encodable>(_ val: A) { print("A") }

  public func store() {
    bytes(value)
  }
}

he was asking why the generics weren’t inlined into

public struct _Container_Int {
  public var value: Int

  public func bytes(_ val: Int) { print("Int") }
  public func bytes(_ val: String) { print("String") }
  public func bytes<A: Encodable>(_ val: A) { print("A") }

  public func store() {
    bytes(value)
  }
}

before type checking, which would allow the call to bytes(_:) inside store to resolve to the one that prints "Int". the call to store would never have any resilience overhead, because it just calls the (Int) -> () overload.
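The behavior in question can be reproduced in a single file. The sketch below is the same Container, changed only so that the overloads return a String instead of printing (an adjustment made here so the result is easy to inspect).

```swift
// Overload resolution inside a generic body happens once, against the
// constraints on T, and not again for each concrete type argument.
struct Container<T: Encodable> {
    var value: T

    func bytes(_ val: Int) -> String { "Int" }
    func bytes(_ val: String) -> String { "String" }
    func bytes<A: Encodable>(_ val: A) -> String { "A" }

    // `value` has static type T here, so only the generic overload is viable.
    func store() -> String {
        bytes(value)
    }
}

let container = Container(value: 42)
print(container.store())    // "A": resolved inside the generic body
print(container.bytes(42))  // "Int": the argument is statically an Int here
```

The second call shows that the (Int) overload is still selected whenever the caller's static type is Int; only the call made from inside the generic body is pinned to the generic overload.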

you are completely correct that there are lots of downsides to inlining generics before typechecking, probably more downsides than upsides, but one of those upsides is that using _Container_Int never incurs resilience overhead.

i personally think that swift made the right choice in typechecking generics instead of specializations, and that inlining generics before typechecking was never going to scale. but my point this entire time has been that inlining after typechecking doesn’t work very well today, because we don’t have tools in the language today to statically assert that @inlinable has been correctly applied to all participants in a call chain that clients who do not care about resilience might call.

let’s walk through how protocol-based dispatch might work for Container<Int>.

public
protocol HasBytesTypeName
{
    static
    var name:String { get }
}
extension HasBytesTypeName where Self:Encodable
{
    public static
    var name:String { "A" }
}
extension Int:HasBytesTypeName
{
    public static
    var name:String { "Int" }
}

then we would refactor Container<T> into something like:

public
struct Container<T> where T:HasBytesTypeName & Encodable
{
    public
    var value:T

    public
    func bytes<Value>(_ value:Value)
        where Value:HasBytesTypeName & Encodable
    {
        print(Value.name)
    }
    public
    func store()
    {
        self.bytes(self.value)
    }
}

now, how would we use such a type?

import ContainerModule

let container:Container<Int> = ...
container.store()

in the absence of @inlinable, this call to store would call the unspecialized implementation, because ContainerModule doesn’t know which specializations of Container<T> its clients might ask for, and the client cannot see the body of store to specialize it itself. this is going to be slow.

sometimes we do not care, because we do not consult type metadata in the implementation, or we do consult type metadata but the implementation does so many other things that the overhead of the generics is unimportant. but oftentimes we do care, and it would be helpful if the type system could help with things like diagnosing missing @inlinables.
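As a rough sketch of what annotating the whole call chain looks like today, the version below marks every participant @inlinable by hand. It is a single-file approximation of ContainerModule: the String conformance and the String return values are additions made here for illustration, and nothing in the language verifies that the annotations are complete.

```swift
public protocol HasBytesTypeName {
    static var name: String { get }
}

extension HasBytesTypeName where Self: Encodable {
    // Default for conforming types that do not provide their own `name`.
    @inlinable
    public static var name: String { "A" }
}

extension Int: HasBytesTypeName {
    @inlinable
    public static var name: String { "Int" }
}

// Hypothetical extra conformance: String falls back to the default above.
extension String: HasBytesTypeName {}

public struct Container<T> where T: HasBytesTypeName & Encodable {
    public var value: T

    public init(value: T) { self.value = value }

    // Every link in the chain has to be annotated manually; forgetting one
    // silently falls back to the unspecialized implementation.
    @inlinable
    public func bytes<Value>(_ value: Value) -> String
        where Value: HasBytesTypeName & Encodable
    {
        Value.name
    }

    @inlinable
    public func store() -> String {
        self.bytes(self.value)
    }
}

print(Container(value: 7).store())     // "Int"
print(Container(value: "x").store())   // "A"
```

The output is the same with or without the annotations, which is the problem being described: a missing @inlinable changes performance only, so the only way to notice it is to inspect the generated code.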

Slava_Pestov · February 20, 2023, 7:33pm

Ok, yeah, that’s a fair point, and to be honest this criticism applies to any kind of “best effort” optimization pass, not just generic specialization. Elision of ARC traffic suffers from similar issues due to performance cliffs as well. It seems the solution we’ve settled on for the latter is to encode this in the type system itself using move-only values and types, and perhaps one day we can investigate a similar approach to force specialization of generics or error out when it can’t be done.

However I also think that performance of unspecialized runtime generics is not a totally lost cause either, and there’s work we can do to make it faster. It will never be as fast as specialized code but perhaps if we narrow the gap it will be less of an issue.