IIUC, the linker is making the decision to ignore the object file from the archive without taking into account `llvm.linker.used`. That directive only gets considered on symbols in object files that aren't wholly excluded from linkage. (I believe the protocol conformance metadata already has that directive today.)
Ah! I see what you mean - yes, because the archive is lazily loaded. The dummy reference that @Joe_Groff is mentioning would be responsible for force-loading the necessary cross-module references. I wonder if we could just get away with an extern reference forcing the linker to load the object.
You're probably more familiar with the problem space here than me. We definitely have more cross-platform control over LLVM-level behavior than platform linkers, so requiring LTO makes implementing language-specific dead stripping behavior more accessible (using "LTO" imprecisely here to generally mean "build to IR during the parallel compile stage, and do the entire IR -> executable image lowering during the link stage", not necessarily involving heavy optimization beyond dead stripping at that stage). It's my understanding that switching to LTO can be a big hurdle for existing ad-hoc build systems, though, so my thinking was that a program operating on object files could be more readily integrated into builds that don't or can't change to an LTO style build. There are also potentially other Swift-specific optimizations we can do more effectively at link time than per-compile, such as generating metadata prespecializations we know the linked program will need, which could be done at either the binary object file level or LLVM IR level.
In my experience with binaries at my company, LTO has been totally infeasible; not because of build system integration issues, but for performance reasons. Our infrastructure doesn't permit single build actions to run for the hour-plus that a full LTO link requires for binaries the size we're dealing with. Thin LTO is supposed to be better in that regard, but I haven't had much success with it either. Maybe Thin LTO is the right answer but there's still a lot of work needed to get there.
Given that, as you said, there are multiple platform linkers to wrestle with, I've avoided placing my hopes too high in a solution that requires so many moving parts to work together in unison. We have `lld` on Linux to deal with and then `ld64` on Apple, the latter of which itself has gone through a major overhaul in the past year or so.
So, the tighter control over behavior is why I'm more eager to explore ideas that can be done purely in the context of Swift, especially if those actions can be parallelized and distributed between compilation and the final linking step, vs. LTO's "bet it all on the linker" approach.
Sorry, I should have been more clear about this. I was suggesting the "pre-linker" just be an LTO plugin. This wouldn't require a proper LTO; it would just allow a "pre-link" phase to hook into the existing systems, basically doing what Microsoft calls LTCG (link-time code generation).
*nod* Last I remember about this, the conservative condition for "conformance is unused" is "metadata for the conforming type is (otherwise) unused, or metadata for the protocol is (otherwise) unused", because otherwise one function can wrap an instance up in `Any` and another (possibly in another module) can look up the conformance with `as?`. That's not a condition that can easily be expressed in existing binary formats. So yeah, some kind of synthetic reference might be the way to go.
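A minimal illustration of that pattern, with made-up names: the conformance is declared but never referenced statically at any use site, yet it must survive for the dynamic cast to succeed.

```swift
// Hypothetical module A: erases the concrete type without mentioning
// the protocol at the use site at all.
protocol Describable { var summary: String { get } }
struct Widget: Describable { var summary: String { "a widget" } }

func erase() -> Any { Widget() }

// Hypothetical module B: recovers the conformance dynamically with `as?`,
// so the conformance record has to be present in the linked image even
// though no code statically refers to `Widget: Describable`.
func describe(_ value: Any) {
    if let described = value as? Describable {
        print(described.summary)
    }
}

describe(erase())  // prints "a widget"
```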
For reference:
- https://github.com/kateinoigakukun/swift-lto-benchmark
- [GSoC] LTO support progress report (with some design discussion and status)
This was the reason why we went with object linking in SwiftPM, btw. IIRC, library linking with `-ObjC` did give us some issues which we didn't encounter with object linking, but I don't remember the exact details.
Even with object linking, could we use `.subsections_via_symbols`/`-function-sections`/whatever the equivalent is on Windows to allow the linker to remove dead definitions independently from each other? Right now we blanket `llvm.used` on stuff like type and conformance metadata, but it could be possible for the compiler to do the sort of dependency analysis we discussed above to be more precise in labeling things as used.
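To make the distinction concrete, a hypothetical library (illustrative names only; the sectioning and stripping behavior is what the toolchain flags control, not anything this code does):

```swift
// With per-symbol sections (.subsections_via_symbols on Mach-O,
// -function-sections/-data-sections on ELF) and a dead-stripping link
// (-dead_strip, --gc-sections, /OPT:REF), the linker can drop
// `unusedHelper` even though the object file that defines it was loaded,
// because no live section references it.
public func usedHelper() -> Int { 1 }
public func unusedHelper() -> Int { 2 }

// The type and conformance metadata for `Point`, by contrast, is
// currently emitted with a blanket "used" marking, so it survives dead
// stripping even if nothing in the final image refers to it statically.
public struct Point: Equatable {
    public var x: Int
    public var y: Int
}
```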
`.subsections_via_symbols` is implicitly enabled on Mach-O. I added support for `-function-sections` and `-data-sections` along with Alex some years ago for ELF (but it is not enabled by default). For COFF, `-Xlinker -opt:ref` (or `-Xlinker /OPT:REF` if you are of the Windows persuasion) should do something similar. We should hoist that into the driver. I should note that I didn't even enable that in antimony, so there might still be more room for improvement.
So, I didn't do it exactly the same way ... I just ran the commands by hand instead of with Bazel.
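For context, the sources involved are presumably along these lines (reconstructed from the commands and output below, not the exact files):

```swift
// L/Type.swift
public struct MyType {
    public init() {}
}

// L/Type+CustomStringConvertible.swift
extension MyType: CustomStringConvertible {
    public var description: String { "I got called!" }
}

// E/E.swift
import L
print(MyType())  // "I got called!" if the conformance is linked in,
                 // "MyType()" (the default reflected description) otherwise
```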
> swiftc -emit-library -o out\bin\L.dll -Xlinker -implib:out\lib\L.lib -emit-module -emit-module-path out\swift\L.swiftmodule\x86_64-unknown-windows-msvc.swiftmodule -module-name L .\L\Type.swift .\L\Type+CustomStringConvertible.swift
> swiftc -emit-executable -o out\bin\E.exe -Iout\swift -Lout\lib .\E\E.swift
PS C:\Users\compnerd\AppData\Local\Temp\allevato> .\out\bin\e.exe
I got called!
> swiftc -emit-library -static -use-ld=lld -o out\lib\libL.lib -emit-module -emit-module-path out\swift\L.swiftmodule\x86_64-unknown-windows-msvc.swiftmodule -module-name L .\L\Type.swift .\L\Type+CustomStringConvertible.swift
> swiftc -emit-executable -o out\bin\E.exe -Iout\swift -Lout\lib .\E\E.swift -llibL
PS C:\Users\compnerd\AppData\Local\Temp\allevato> .\out\bin\e.exe
MyType()
> swiftc -emit-library -static -use-ld=lld -o out\lib\libL.lib -emit-module -emit-module-path out\swift\L.swiftmodule\x86_64-unknown-windows-msvc.swiftmodule -module-name L .\L\Type.swift .\L\Type+CustomStringConvertible.swift
> swiftc -emit-executable -o out\bin\E.exe -Iout\swift -Lout\lib .\E\E.swift -llibL -Xlinker -wholearchive:libL
PS C:\Users\compnerd\AppData\Local\Temp\allevato> .\out\bin\e.exe
I got called!
> swiftc -emit-library -static -use-ld=lld -o out\lib\libL.lib -emit-module -emit-module-path out\swift\L.swiftmodule\x86_64-unknown-windows-msvc.swiftmodule -module-name L .\L\Type.swift .\L\Type+CustomStringConvertible.swift
> swiftc -emit-executable -o out\bin\E.exe -Iout\swift -Lout\lib .\E\E.swift -llibL -Xlinker '-include:$s1L6MyTypeVs23CustomStringConvertibleAAMc'
PS C:\Users\compnerd\AppData\Local\Temp\allevato> .\out\bin\e.exe
I got called!
So the behaviour is as expected. A simpler solution occurs to me: `-static` is already used for emitting static libraries (at least on Windows), and that is serialised into the module. When the module is deserialised at compile time, we could easily deserialise the protocol conformances for referenced types in the module and emit a reference (defsym/include) for each, which would cause the linker to preserve the appropriate definition tree without `-wholearchive` or `-ObjC`. Not ideal, but slightly better than `-wholearchive`.
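Until the compiler emits such a reference itself, the effect can be approximated by hand on the client side. A rough, hypothetical sketch: the mangled name is the conformance descriptor from the experiment above, and `@_silgen_name` is an unsupported, underscored attribute standing in for what the compiler would really do at the IR level.

```swift
// Hypothetical, hand-written stand-in for the reference the compiler
// could emit automatically from deserialised conformance information.
// Declares a Swift-visible name whose underlying symbol is the
// conformance descriptor for L.MyType: CustomStringConvertible.
@_silgen_name("$s1L6MyTypeVs23CustomStringConvertibleAAMc")
func _myTypeConformanceDescriptor()

// Merely referencing the symbol (never calling it; it is data, not code)
// puts an undefined reference into this object file, so a lazy archive
// search has to pull in the member that defines the conformance.
public let _forceLoadLConformances: Any = _myTypeConformanceDescriptor
```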
I think that might be sufficient to tide us over until we can properly prune the set by WHOPR conformance analysis via an LTO plugin.
Just curious: does this issue only apply to Windows, or did you run the same tests on Linux / macOS and see similar behavior?
I think that Windows will do a better job of exhibiting the issue. Linux and macOS should see some improvements as well, but they may be less drastic. I think that the overall performance of the Swift toolchain on Apple Silicon hides the latency much better, and Linux will also hide some of the latency due to its icache/dcache heuristics.
Unfortunately, I did not run the tests on Linux or macOS due to a lack of easy access and the complexity inherent in the shortcuts that I took to expedite the data collection (I figured that if there was no benefit to be had, then there was no value in spending the time).
I've been playing around with the supporting infrastructure to migrate from the temporary representation towards the proper representation on a branch (GitHub - compnerd/antimony at compnerd/parser) which would make it easier to test on other platforms as well. It seems that @Karl is also interested in exploring what the implications are on other platforms. I feel like there is a little bit more to go with the work on that branch before we can get proper numbers on all platforms.
Wow, thinking about performance seems like quite a luxury; I just hope we can get to correctness sometime soon.
The biggest problem SPM has right now, as far as I can tell, is that it's severely under-resourced. Just looking at the still-open issues I've filed, it looks like approximately one bug a week since mid-September, and several of the enhancement requests there are of the form "give me feature X so I can work around bug Y." The impression I've gotten is that the team is too busy with… something else; I don't know what… to do more than speculate about possible causes for these problems. Actual diagnosis, to say nothing of fixing problems, appears to be out of reach. I do get the sense there are some serious architectural problems that prevent SPM from tracking dependencies properly, from working properly on Windows, etc., but those could all be addressed if the team were given the resources to do so.
Yeah, more developer control is crucial. The current SPM design approach makes it extremely difficult, and sometimes impossible, to work around a bug when you find one. For example, it appears `-whole-module-optimization` is broken, but there's no supported way to turn it off in SPM. Even though I know the sequence of swiftc commands that would be required to build my project in release, turning off WMO in the right place, there's no way to get SPM to issue those commands. Right now, I'm stuck. The ability to get in and adjust how these things are handled would save me.