Why does storing a function in an Array produce a reabstraction thunk?

Nickolas_Pohilets · October 27, 2019, 1:04pm

The following code produces 3 reabstraction thunks:
func makeFunc(_ x: Int) -> () -> Void {
    return { print(x) }
}
var funcs = [makeFunc(1), makeFunc(2), makeFunc(3)]

reabstraction thunk helper from @escaping @callee_guaranteed () -> () to @escaping @callee_guaranteed () -> (@out ())partial apply forwarder with unmangled suffix ".8" at )
reabstraction thunk helper from @escaping @callee_guaranteed () -> () to @escaping @callee_guaranteed () -> (@out ())partial apply forwarder with unmangled suffix ".8" at )
reabstraction thunk helper from @escaping @callee_guaranteed () -> () to @escaping @callee_guaranteed () -> (@out ())partial apply forwarder with unmangled suffix ".12" at )

They seem to be optimised away in the release build, but why are they even there in the first place?

I understand why reabstraction would be needed in the cases described in the ABI Stability Manifesto (swift/ABIStabilityManifesto.md at 269d306b9d275aab5bd10f380a999e51901c5832, apple/swift on GitHub), but I don't see any change in abstraction levels here.

Is this intentional or a bug?

Nickolas_Pohilets · October 29, 2019, 9:13pm

I think I now understand why reabstraction is needed here. Please correct me if I'm wrong.

The code that puts a function into an array does not know how that array will be used afterwards. The array may end up being used from generic code, which would need to reabstract the function signature into something like (T) -> U. But to reabstract from the original representation, the compiler needs to know the specific types of the arguments and return values. At the site where the function gets reabstracted from Array.Element into (T) -> U, this information is not available. So instead, the compiler places into the array a representation that is generic enough for any other representation to be created from it: one that passes every argument and return value by reference.
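
To make that scenario concrete, here is a minimal sketch of generic code consuming an array of closures. The names callAll and makeAdder are hypothetical and not from the thread, and the closures use (Int) -> Int rather than () -> Void so that the result is observable. Inside callAll the elements are known only as (T) -> U, which is presumably why they have to be stored in the most general representation beforehand.

```swift
// Hypothetical generic helper: it sees the elements only as (T) -> U
// and knows nothing about the concrete Int-based signature.
func callAll<T, U>(_ functions: [(T) -> U], with argument: T) -> [U] {
    return functions.map { $0(argument) }
}

// Hypothetical closure factory, analogous to makeFunc in the question.
func makeAdder(_ x: Int) -> (Int) -> Int {
    return { $0 + x }
}

// The code building the array cannot know that it will later be
// handed to generic code such as callAll.
let adders = [makeAdder(1), makeAdder(2), makeAdder(3)]
let results = callAll(adders, with: 10)
print(results) // [11, 12, 13]
```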

But it is still not clear to me why multiple thunk helpers are needed.

Joe_Groff · October 29, 2019, 9:19pm

You are correct as to why the thunks are needed. It doesn't look like the compiler is directly generating multiple thunks. Rather, it is optimizing together the reabstraction thunks and the partial application forwarders (which take the capture context for a closure and expand it out into multiple arguments to be used by the closure implementation function) into single functions, which causes multiple functions to be formed.
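
As a rough mental model only, the pieces mentioned above can be written out by hand in Swift. This is not what the compiler emits: every name here (Context, closureImpl, forwarder, thunk, fusedThunk) is hypothetical, and an Int result passed through an inout parameter stands in for the indirect @out result so that the effect can be checked.

```swift
// Hypothetical stand-in for a closure's capture context.
final class Context {
    let x: Int
    init(x: Int) { self.x = x }
}

// Closure implementation function: captures arrive as plain arguments.
func closureImpl(_ x: Int) -> Int {
    return x * 2
}

// Model of a partial application forwarder: unpacks the context
// into the arguments expected by the implementation function.
func forwarder(_ context: Context) -> Int {
    return closureImpl(context.x)
}

// Model of a reabstraction thunk: turns the direct result into an
// indirect one written through `out`.
func thunk(_ context: Context, _ out: inout Int) {
    out = forwarder(context)
}

// Model of the fused form: forwarder and implementation folded into
// the thunk, giving a function specific to this one closure body.
func fusedThunk(_ context: Context, _ out: inout Int) {
    out = context.x * 2
}

var viaThunk = 0
thunk(Context(x: 21), &viaThunk)
var viaFused = 0
fusedThunk(Context(x: 21), &viaFused)
print(viaThunk, viaFused) // 42 42
```

In this model thunk could be shared by every closure of the same type, while fusedThunk is tied to a single closure body, which would explain seeing more than one such function.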

Nickolas_Pohilets · October 29, 2019, 9:39pm

Thanks for the explanation. This makes sense for the case where partial apply forwarders are inlined. Do you happen to know what happens with duplicated reabstraction thunk helpers if the partial application forwarders cannot be inlined?

Joe_Groff · October 29, 2019, 9:52pm

The unoptimized thunk functions should not be duplicated; the compiler will only emit one for each type.

Nickolas_Pohilets · October 29, 2019, 10:12pm

Hm... it looks like they are still duplicated, and so are the metadata records. Should I report this as a bug?

Joe_Groff · October 29, 2019, 10:14pm

Are you talking about the number of functions emitted, or the number of closures allocated? It will have to allocate a separate closure for each reabstracted closure. They should all share one thunk function implementation.
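
The distinction between the two counts can be sketched in plain Swift. This is a hand-written analogy, not compiler output: reabstract is a hypothetical wrapper whose single body plays the role of the shared thunk implementation, while each call to it creates a separate wrapper closure. The closures return Int instead of Void so the result can be checked.

```swift
// Hypothetical wrapper: one function body, shared by all callers.
// Each call creates a new closure capturing its own `f`.
func reabstract(_ f: @escaping () -> Int) -> () -> Any {
    return { f() as Any }
}

// Analogous to makeFunc in the question, but returning the captured value.
func makeFunc(_ x: Int) -> () -> Int {
    return { x }
}

// Three wrapper closures are created, all backed by the same code.
let wrapped = [makeFunc(1), makeFunc(2), makeFunc(3)].map { reabstract($0) }
let values = wrapped.compactMap { $0() as? Int }
print(values) // [1, 2, 3]
```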

Nickolas_Pohilets · October 29, 2019, 11:35pm

The number of functions (reabstraction thunk helpers). Compiling with -O -whole-module-optimization, I do see multiple functions being produced, both in the output of nm and in the LLVM IR. In the LLVM IR I can also see that multiple metadata records and capture descriptors are being created.

Test code (Pastebin.com): import Foundation @inline(never) func makeFunc(_ x: Int) -> () -> Void { ...
Generated IR (Pastebin.com): ; ModuleID = '-' source_filename = "-" target datalayout = "e-m:o-i64:64-f80:1 ...

Joe_Groff · October 30, 2019, 12:28am

That's probably an effect of inlining into the multiple thunks.