Call proper function based on parameter's dynamic type

Consider the following set of functions:

func myDump(_ value: Any) {
    print("any \(value)")
}

func myDump(_ value: String) {
    print("string \(value)")
}

func myDump(_ value: Int) {
    print("int \(value)")
}

let vInt: Int = 1
myDump(vInt) // prints int 1

let vStr = "str"
myDump(vStr) // prints string str

let vDouble = 0.0
myDump(vDouble) // prints any 0.0

let vAny: Any = "str"
myDump(vAny) // prints any str

In the latter case I want "string str" to be printed, based on the dynamic type.
Here is the only way I found to achieve this:

func myDump(_ value: Any) {
    switch value {
    case let value as Int:
        myDump(value)
    case let value as String:
        myDump(value)
    // add more cases to handle other types
    default:
        print("any \(value)")
    }
}

But I don't like it for obvious reasons.

Am I missing something?

There's no way around it. myDump(_:Int) is a different function from myDump(_:String), so you need something to funnel them down to the same function (like what you just did).

One trick you can do is to use a protocol.

protocol Dumpable {
  func dump() -> String
}

extension Int: Dumpable {
  func dump() -> String { "int \(self)" }
}
...

func myDump(_ value: Any) {
  if let value = value as? Dumpable {
    print(value.dump())
  } else {
    print("any \(value)")
  }
}

To add to Lantua's answer, there are three ways to get type-based dynamic dispatch behavior in Swift:

  • Call a protocol requirement
  • Call a (non-final) method on a class
  • Manually check types with as?

It can get confusing sometimes because the compiler picks the best overload based on the static type, which can look like dynamic dispatch, but the list above covers the only real mechanisms.
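A minimal sketch showing all three mechanisms side by side (the names here are illustrative, not from the thread):

```swift
// 1. A protocol requirement
protocol Greeter { func greet() -> String }
struct English: Greeter { func greet() -> String { "hello" } }

// 2. A non-final method on a class
class Animal { func sound() -> String { "..." } }
class Dog: Animal { override func sound() -> String { "woof" } }

// 3. A manual check with as?
func describe(_ value: Any) -> String {
    if let n = value as? Int { return "int \(n)" }
    return "other"
}

let g: Greeter = English()
assert(g.greet() == "hello")      // dispatched via the protocol witness table

let a: Animal = Dog()
assert(a.sound() == "woof")       // dispatched via the class vtable

assert(describe(3) == "int 3")    // resolved by a runtime type check
assert(describe("x") == "other")
```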


Can that ever be inlined?

The method specification in the protocol can't be inlined, because there is no code to inline. The method(s) implemented in concrete types that conform to the protocol, and in a protocol extension as a "default" method, can be inlined as appropriate.

Even based on a dynamic type? So when do we get dynamic dispatch through witness table vs a devirtualized call?

For example, in this case: Operator overloading with generics - #4 by scanon

Can Scalar.gemm(m1, m2) be inlined?

The short answer is No, I think. The only thing that the compiler knows about Matrix is that the concrete type Scalar has a method gemm that can be invoked on the type. Scalar is an existential type; the only guarantee is that whatever type is used to instantiate the Matrix at a specific call site, that type implements a method gemm. The witness tables are resolved by the linker, not the compiler, since the actual functions may be defined in different modules, different libraries, etc.

Now, if there is some clever optimization in the compiler to recognize that the call site is using Matrix<Float>, so it can use the Float MatrixScalar default gemm and thus inline the call to cblas_..., I don't know about it, but that would be an implementation detail that users don't control. I personally don't think such an optimization exists, since the witness table only involves an indirect jump to subroutine via the function pointer in the witness table, which costs almost nothing these days.

This problem does exhibit a crucial difference between class inheritance and protocol conformance. Classes that can be sub-classed go through a similar process via the class dispatch tables. However, if a class is declared to be final, then methods that are implemented in that class are invoked directly, even if the method is inheritable from a base class. By using the attribute final, the programmer is telling the compiler this class cannot be further sub-classed, so it's okay to invoke these directly, with no need for any further indirection. Protocols, as far as I know, don't have a similar mechanism.
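For illustration, here is a sketch of the class-side behavior described above (illustrative names; whether a given call is actually devirtualized is up to the optimizer):

```swift
class Shape {
    func name() -> String { "shape" }
}

// `final` promises no further subclassing, so a call on a value
// statically known to be Circle can skip the vtable entirely.
final class Circle: Shape {
    override func name() -> String { "circle" }
}

let c = Circle()
assert(c.name() == "circle")   // eligible for a direct (devirtualized) call

let s: Shape = c
assert(s.name() == "circle")   // goes through the vtable: static type is Shape
```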


Isn't that the point of cross-module compilation, or the inline-all-the-things flag that @dabrahams was talking about?

Not with respect to generics, as far as I know. The biggest issue could be a pragmatic one. If you're making the compiler do all of this inference and theorem proving, it will never be able to complete a compile in a timely fashion, only to eke out a couple of microseconds of performance gain, especially for a program of moderate to heavy complexity.

Certainly, if the compiler can see the type passed to myDump at the point of the call.

Many of these questions can be easily answered just by compiling programs; I use this command to dump readable assembly for a single function or method that I call testMe:

swiftc -O -S sourcefile.swift \
  | sed -e '/testMe/,/retq/!d;/retq/q' \
  | xcrun swift-demangle

(The second line keeps only the region from the first mention of testMe through its return instruction, stripping everything else. The third line demangles names to make things readable.)

Interestingly, Swift's optimizer forgets important type information when faced with an Any, as you can see by running it on this program (I removed printing and string interpolation for simplicity):

protocol Dumpable {
  func dump() -> String
}

extension Int: Dumpable {
  @inlinable
  func dump() -> String { "int" }
}

extension String: Dumpable {
  func dump() -> String { "string" }
}

func myDump(_ value: Any) -> String {
  return !(value is Dumpable) ? "any" : (value as! Dumpable).dump()
}

func testMe() -> String {
  return myDump(3)
}

In the result you can see the creation of boxed existentials and calls to the Swift runtime's dynamic casting machinery.

Assembly output
	.private_extern	x.testMe() -> Swift.String
	.globl	x.testMe() -> Swift.String
	.p2align	4, 0x90
x.testMe() -> Swift.String:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	pushq	%r15
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%rbx
	subq	$104, %rsp
	.cfi_offset %rbx, -56
	.cfi_offset %r12, -48
	.cfi_offset %r13, -40
	.cfi_offset %r14, -32
	.cfi_offset %r15, -24
	movq	type metadata for Swift.Int@GOTPCREL(%rip), %rax
	movq	%rax, -48(%rbp)
	movq	$3, -72(%rbp)
	leaq	-144(%rbp), %r15
	leaq	-72(%rbp), %rdi
	movq	%r15, %rsi
	callq	outlined init with copy of Any
	leaq	-112(%rbp), %r12
	leaq	demangling cache variable for type metadata for x.Dumpable(%rip), %rdi
	callq	___swift_instantiateConcreteTypeFromMangledName
	movq	%rax, %r14
	movq	type metadata for Any@GOTPCREL(%rip), %rbx
	addq	$8, %rbx
	movl	$6, %r8d
	movq	%r12, %rdi
	movq	%r15, %rsi
	movq	%rbx, %rdx
	movq	%rax, %rcx
	callq	_swift_dynamicCast
	testb	%al, %al
	je	LBB10_1
	leaq	-112(%rbp), %rdi
	callq	___swift_destroy_boxed_opaque_existential_0
	leaq	-144(%rbp), %r15
	leaq	-72(%rbp), %rdi
	movq	%r15, %rsi
	callq	outlined init with copy of Any
	leaq	-112(%rbp), %r12
	movl	$7, %r8d
	movq	%r12, %rdi
	movq	%r15, %rsi
	movq	%rbx, %rdx
	movq	%r14, %rcx
	callq	_swift_dynamicCast
	movq	-88(%rbp), %rbx
	movq	-80(%rbp), %r14
	movq	%r12, %rdi
	movq	%rbx, %rsi
	callq	___swift_project_boxed_opaque_existential_1
	movq	%rax, %r13
	movq	%rbx, %rdi
	movq	%r14, %rsi
	callq	*8(%r14)
	movq	%rax, %rbx
	movq	%rdx, %r14
	movq	%r12, %rdi
	callq	___swift_destroy_boxed_opaque_existential_0
	jmp	LBB10_3
LBB10_1:
	movabsq	$-2089670227099910144, %r14
	movl	$7958113, %ebx
LBB10_3:
	leaq	-72(%rbp), %rdi
	callq	___swift_destroy_boxed_opaque_existential_0
	movq	%rbx, %rax
	movq	%r14, %rdx
	addq	$104, %rsp
	popq	%rbx
	popq	%r12
	popq	%r13
	popq	%r14
	popq	%r15
	popq	%rbp
	retq

However, you only need to write myDump as a generic to get the performance back:

func myDump<T>(_ value: T) -> String {
  return !(value is Dumpable) ? "any" : (value as! Dumpable).dump()
}

which results in:

	.private_extern	x.testMe() -> Swift.String
	.globl	x.testMe() -> Swift.String
	.p2align	4, 0x90
x.testMe() -> Swift.String:
	pushq	%rbp
	movq	%rsp, %rbp
	movabsq	$-2089670227099910144, %rdx
	movl	$7630441, %eax
	popq	%rbp
	retq

Since the generic formulation and the Any formulation are semantically equivalent, there's no reason the optimizer couldn't have made the former as efficient as the latter; it just didn't.


Basically, yes. The flag is called -sil-cross-module-serialize-all and it landed in this PR, thanks to @Michael_Gottesman.

Within module boundaries the ability to make that optimization is unconditional. Across module boundaries, for generics, it depends on -cross-module-optimization and for everything else, on -sil-cross-module-serialize-all.

To add to what Dave's said, the compiler may choose to inline the function that matches a protocol requirement if it can figure out what function is going to be used at run time. (Usually this is done by proving what type is going to be used at run time.) As @Ratingulate pointed out, that's usually multiple steps: "devirtualization" is the step that identifies the function being called, and "inlining" is a second step. As @jonprescott pointed out, simply devirtualizing isn't too much of a win, but it can still be worth it because it opens up the opportunity for inlining and other optimizations. (For example, if the compiler knows you're passing in an integer instead of an arbitrary value, it doesn't have to worry about retains or releases.)

(There's actually a third piece to this too: "specialization". If you're calling a generic function with concrete arguments, and the compiler can see the implementation of the function—because it's in the same module, or because it's inlinable—the compiler has the opportunity to produce a specialized version of the code in the function where the generic type is replaced with the concrete type. This is usually what allows devirtualization, and it means that you get a sort of "virtuous cycle" between specialization, devirtualization, and inlining. As @dabrahams pointed out, specialization could also apply to Any parameters, but the compiler doesn't do it yet, or at least not in as many circumstances as it could.)
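As a sketch of that specialization scenario (an illustrative function, not from the thread): a generic function called with concrete argument types is a candidate for a per-type specialized copy, which in turn lets the optimizer devirtualize and inline the `<` requirement from Comparable.

```swift
// With the body visible to the caller (same module, or @inlinable),
// the optimizer can emit an Int specialization of this function and
// then inline the Comparable witness for Int.
func smaller<T: Comparable>(_ a: T, _ b: T) -> T {
    a < b ? a : b
}

assert(smaller(1, 2) == 1)         // candidate for an Int specialization
assert(smaller("a", "b") == "a")   // and a separate String specialization
```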

As for final, there's no equivalent of that for protocols because there's only ever one level of "overriding" for a protocol: either the concrete type implements the protocol requirement, or it doesn't and the implementation comes from a protocol extension. So in effect everything's final, in that if the compiler knows the concrete type, it will always use a concrete implementation.
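A small sketch of that "one level of overriding" behavior (illustrative names):

```swift
protocol Named {
    func name() -> String                   // a requirement
}

extension Named {
    func name() -> String { "default" }     // the protocol-extension "default"
}

struct Custom: Named {
    func name() -> String { "custom" }      // concrete implementation wins
}

struct Plain: Named {}                      // falls back to the default

assert(Custom().name() == "custom")
assert(Plain().name() == "default")
// Even through the protocol type, the concrete type's witness is used:
assert((Custom() as Named).name() == "custom")
assert((Plain() as Named).name() == "default")
```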

All of this comes with a caveat: it doesn't quite apply the same way to modules with library evolution support. In that case, the compiler has to prove not just that the function is going to be called now, but also that the library is not allowed to change it in the future. (The precise rules for this are a little tricky right now, but that's the basic idea.)


This is off topic, but I think it's worth mentioning. I have occasionally wanted to be able to provide a "default" that cannot be "overridden" (i.e. implemented by the conforming type). This has generally come up when a refining protocol provides the "default" for a requirement of a base protocol. This "default" calls through to a requirement on the refining protocol, and the library wants to ensure its implementation is used. I think of these "defaults" as final and have thought about pitching this as a feature.
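A hypothetical sketch of the pattern being described (names are made up for illustration): a refining protocol supplies the "default" for a base-protocol requirement, and that default calls through to the refinement's own requirement. Today nothing stops a conforming type from shadowing describe() with its own witness, which is what the wished-for final would forbid.

```swift
protocol Base {
    func describe() -> String
}

protocol Refined: Base {
    func name() -> String
}

extension Refined {
    // The library would like this to be non-overridable ("final"),
    // but currently a conforming type may still supply its own describe().
    func describe() -> String { "refined: \(name())" }
}

struct Thing: Refined {
    func name() -> String { "thing" }
}

assert(Thing().describe() == "refined: thing")
```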

Ooh, that’s interesting. We actually do have optimisations which turn existential-type parameters into generic parameters so they can be specialised.

If I’m reading it correctly, it seems to be explicitly disabled for Any/AnyObject, apparently because of concerns about code size: https://github.com/apple/swift/blob/ea142dba029b9c14bd778b04daca646319d53c4e/lib/SILOptimizer/FunctionSignatureTransforms/ExistentialSpecializer.cpp#L151

Wow, thanks for all the information everyone. That clears things up for me, save for one question.

Can a protocol default implementation dynamically dispatch on more than just the type of the first argument, by giving multiple implementations depending on the other types?

How about an extension?

And if so, can these again be specialized, devirtualized and inlined under the right circumstances?

Sorry if that's a basic question; I'm new to Swift.