Specializing functions across modules

Lantua · May 31, 2020, 3:35pm

I've got a function, libFunction, which is intended to be called by users of the library:

// module Lib
public func libFunction<...>() { ... }

// User land
import Lib
func userFunction() {
  libFunction<...>() { ... }
}

The problem is that libFunction relies heavily on specialisation. So ideally I would do:

@_specialize(...)
public func libFunction<...>() { ... }

But I'd like to let the user specify @_specialize(...). Is there a way to achieve this? Wrapping libFunction doesn't seem to carry the specialization across the module boundary:

@_specialize(...) // Doesn't get faster.
func userFunction<...>() { libFunction() }

I could use @inlinable all the way, but I still get 3-4x better performance with @_specialize on libFunction*. I don't mind experimental attributes (I've been using @_specialize already anyway).

* Interestingly, @_specialize(...) seems to be ignored/unused on @inlinable functions.

Andrew_Trick · May 31, 2020, 4:25pm

@_specialize is a weaker form of specialization that inserts a type check for the type you specify behind the regular unspecialized entry point.
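
As a rough illustration of that description, the sketch below puts the attribute form next to a hand-written dispatch that behaves similarly. This is not what the compiler actually emits; sum, sumInt and sumDispatching are made-up names, and the real mechanism may differ in detail.

```swift
// Attribute form: the generic entry point is kept, and a pre-specialized
// Int body is reached through a type check inside it.
@_specialize(where T == Int)
public func sum<T: AdditiveArithmetic>(_ values: [T]) -> T {
    var total = T.zero
    for value in values { total += value }
    return total
}

// Hypothetical hand-written counterpart of the specialized body.
func sumInt(_ values: [Int]) -> Int {
    var total = 0
    for value in values { total += value }
    return total
}

// Hypothetical hand-written counterpart of the dispatching entry point.
func sumDispatching<T: AdditiveArithmetic>(_ values: [T]) -> T {
    // Type check behind the unspecialized entry point.
    if T.self == Int.self {
        return sumInt(values as! [Int]) as! T
    }
    // Generic fallback for every other type.
    var total = T.zero
    for value in values { total += value }
    return total
}
```

The caller still pays for the runtime type check and the call through the generic entry point, which is why this is weaker than full specialization at the call site.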

@inlinable should allow full specialization. If libFunction is emitted in the user module without being fully specialized, I think that's a bug. @Erik_Eckstein is there any reason that might happen?

Lantua · May 31, 2020, 4:59pm

A large part of libFunction uses checks of the form Foo.self == ....self (using the type as an argument) rather than actual Foo instances; maybe that makes it a better fit for @_specialize? You seem to imply that @inlinable subsumes @_specialize, though, so it is indeed quite baffling. libFunction looks something like this:

public protocol SomeProtocol {
  mutating func libFunction<A, B>(...)
}
public extension SomeProtocol {
  mutating func libFunction<A, B>(...) {
    if A.self == Pivot1.self {
      libFunction(A.Next.self, B.Next.self, leftData)
    } else if A.self == Pivot2.self {
      libFunction(A.self, B.Next.self, rightData)
    } else {
      ...
      perform(...)
      ...
    }
  }
}

The hope is that it would ultimately compile down to:

mutating func libFunction(...) {
  perform(...)
  perform(...)
  ...
}

The actual source is this _add function. I could try to create a simpler scenario, but that would take time.
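
For readers without access to the real code, a much-reduced sketch of the same pattern (recursing over types and comparing metatypes) might look like the following. It is not the _add function; Level, Leaf, Mid, Top and totalWidth are hypothetical stand-ins chosen only to make the shape concrete.

```swift
// Hypothetical chain of types, each naming the next one to visit.
public protocol Level {
    associatedtype Next: Level
    static var width: Int { get }
}

public enum Leaf: Level {
    public typealias Next = Leaf            // the chain stops here
    public static var width: Int { return 1 }
}

public enum Mid: Level {
    public typealias Next = Leaf
    public static var width: Int { return 2 }
}

public enum Top: Level {
    public typealias Next = Mid
    public static var width: Int { return 3 }
}

// Walks the chain using only metatype comparisons, no instances.
@inlinable
public func totalWidth<A: Level>(_: A.Type) -> Int {
    if A.self == Leaf.self {
        return A.width
    }
    return A.width + totalWidth(A.Next.self)
}

print(totalWidth(Top.self))
```

When A is known at compile time and every body involved is visible to the optimizer, each A.self == Leaf.self check can in principle be folded away, leaving only the straight-line work. Whether that actually happens depends on the optimizer seeing through every call in the chain.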

Andrew_Trick · May 31, 2020, 5:12pm

It's possible that the "inlined" function was only specialized on some of the generic parameters, but not the ones that you have explicit checks for.

I think you're doing the work of @_specialize manually. If you see a speedup with @_specialize on the types that you already check explicitly, then it's probably because @_specialize is a more efficient implementation of those checks. When the code is inlined and specialized on other parameters, you might lose that efficiency.
I'm just guessing though. In general, if @inlinable slows the code down, I think it's worth filing a bug.

You might be able to use lldb to look at the symbols that were emitted in the user binary to see what happened.

Lantua · May 31, 2020, 5:16pm

To be clear, @inlinable gives a speedup over the unannotated function (~30x); it's just that @_specialize is even better (an extra 2-3x). If that's what you meant, I'll file a bug report some time later.

Andrew_Trick · May 31, 2020, 5:19pm

Yes, that's what I meant. I'm mainly curious to know if my hypothesis about the speedup is correct, but can't debug it now. It would also be good to have a strategy for giving you the benefits of using both annotations.

SDGGiesbrecht · May 31, 2020, 7:34pm

I suspect that means @_specialize, which happens within the module, has access to some internal information that @inlinable is unaware of, since it happens outside the module.

When this happens to me, I start profiling and usually find an unspecialized method in the offending call stack that is being called by the method I’m worried about, but isn’t the method itself. At that point, adding @inlinable to that method to open it up for the client module to see usually solves the problem (drilling down as many layers as necessary). Unless I’m using library evolution mode and care about ABI stability, @inlinable has always been faster in the end once it’s done right.
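
A minimal sketch of that drilling-down, with hypothetical names (process and scale are not from the library under discussion): the public entry point is @inlinable, but it only specializes cleanly in a client if the helper it calls is visible too.

```swift
// module Lib (hypothetical)

// Entry point the client calls; its body is visible to other modules.
@inlinable
public func process<T: BinaryInteger>(_ values: [T]) -> T {
    var result: T = 0
    for value in values { result += scale(value) }
    return result
}

// One layer down. If this were not @inlinable, a client could specialize
// process<Int> but would still call scale through its generic entry point.
@inlinable
public func scale<T: BinaryInteger>(_ value: T) -> T {
    return value * 2
}
```

The helper is public here only to keep the sketch simple; the point is that every generic layer on the hot path needs its body exposed, not just the outermost one.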

Cross‐module optimization will probably make the whole problem go away, as the implementation in master already essentially applies @inlinable automatically to all generic methods.

Lantua · May 31, 2020, 8:23pm

Turns out this is what happened. There were a few protocol conformances that I forgot to mark @inlinable. With each additional @inlinable, the performance gets better, until it reaches parity with in-module @_specialize. That makes sense, since those implementations weren't visible to the user module. So I won't be filing a bug.
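
To make the conformance case concrete, here is a small sketch with made-up names (Combiner, Adder and fold are assumptions, not the real types): the generic function is inlinable, but the method that satisfies the protocol requirement needs its own annotation.

```swift
// module Lib (hypothetical)
public protocol Combiner {
    static func combine(_ lhs: Int, _ rhs: Int) -> Int
}

public enum Adder: Combiner {
    // The easy one to forget. Without @inlinable the client sees only the
    // declaration, so a specialized fold cannot inline this body.
    @inlinable
    public static func combine(_ lhs: Int, _ rhs: Int) -> Int {
        return lhs + rhs
    }
}

// Generic over the conformance; specialized in the client module.
@inlinable
public func fold<C: Combiner>(_ values: [Int], using _: C.Type) -> Int {
    var result = 0
    for value in values { result = C.combine(result, value) }
    return result
}
```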

One thing remains: when I annotate libFunction with both @inlinable and @_specialize, the compiler seems to optimise via @inlinable, even though the @_specialize is a full specialisation that exactly matches the generic arguments. That is a separate issue, though. I'm not sure whether I should file this one, or whether it is a bug to begin with.

SDGGiesbrecht · June 1, 2020, 1:53am

You may as well file it. I don’t know how they are intended to work together either.