We would like to leverage the existing LLVM/linker optimizations for symbols with hidden visibility, with the goal of reducing app size. The idea is to force hidden visibility by default and then mark symbols as exported when necessary. We are considering two ways to mark symbols as exported:
leverage the new package access modifier (https://github.com/apple/swift-evolution/blob/main/proposals/0386-package-access-modifier.md). Say we package only one binary in each Swift package, and pass the same -package-name binaryName to all the Swift modules that live in this binary when compiling. Effectively, symbols marked with package will be exported to be accessible/visible to other binaries in other packages. The idea is to tailor the concept of a package to a link unit. Is this feasible?
alternatively, we can implement __attribute__((visibility("..."))) in the Swift compiler, similar to clang.
Please let us know your thoughts/feedback! Thanks!!
That is feasible provided those binaries are built from source together. However, the package symbols are currently stored in the final executable so you'll have to manually modify their visibility with -exported_symbols_list or -no_exported_symbols. They are also stored in llvm.used so there are currently limitations as to how much of them can be optimized. These are areas we are currently looking into improving, along with other size optimizations by treating a package as a whole unit, as mentioned in the future direction of the proposal referenced.
However, the package symbols are currently stored in the final executable so you'll have to manually modify their visibility with -exported_symbols_list or -no_exported_symbols.
Yeah, we are trying to avoid extra steps to generate the exported symbol list. We would like a way to explicitly mark symbols as default visibility and to have symbols get hidden visibility by default. As an example, if we use a script to analyze the symbols in each link unit and generate the exported list, we can reduce the binary size by a few percent.
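To make the script idea concrete, here is a minimal sketch of that kind of analysis. It is not the actual script: the `LinkUnit` type, the `exportedSymbols(for:)` function, and the symbol names are all hypothetical, and it assumes the defined and undefined symbols of each link unit have already been extracted by some other tool.

```swift
// Hypothetical model of a link unit: the symbols it defines and the
// symbols it references but does not define itself.
struct LinkUnit {
    let name: String
    let defined: Set<String>
    let undefined: Set<String>
}

// For each link unit, keep only the defined symbols that some *other*
// link unit references. Everything else is a candidate for hidden visibility.
func exportedSymbols(for units: [LinkUnit]) -> [String: [String]] {
    var result: [String: [String]] = [:]
    for unit in units {
        var needed = Set<String>()
        for other in units where other.name != unit.name {
            needed.formUnion(other.undefined)
        }
        result[unit.name] = unit.defined.intersection(needed).sorted()
    }
    return result
}

// Made-up symbol names, for illustration only.
let units = [
    LinkUnit(name: "libA", defined: ["_a_api", "_a_helper"], undefined: ["_b_api"]),
    LinkUnit(name: "libB", defined: ["_b_api", "_b_detail"], undefined: []),
    LinkUnit(name: "App", defined: ["_main"], undefined: ["_a_api"]),
]

let exportLists = exportedSymbols(for: units)
for name in exportLists.keys.sorted() {
    print(name, exportLists[name] ?? [])
}
```

Each resulting list could then be written to a file and passed to the linker with -exported_symbols_list. A real script would also have to keep symbols that are needed for reasons this model ignores, such as entry points or symbols looked up at runtime.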
By default, package symbols are exported in the final libraries/executables. It would be useful to introduce a build setting that allows users to hide package symbols for statically linked libraries; this would help with code size and build time optimizations.
From the future optimization work in the package proposal, will the build setting allow us to use hidden visibility for package symbols?
The entities marked with ?(a) and ?(b) from the matrix above both require accessing and subclassing cross-modules in a package (open within a package). The only difference is that (b) hides the symbol from outside of the package and (a) makes it visible outside. Use cases involving (a) should be rare but its underlying flow should be the same as (b) except its symbol visibility.
These are areas we are currently looking into improving, along with other size optimizations by treating a package as a whole unit
I am not sure if our goals are aligned with the future directions of package, but they look pretty close.
That might be an easy win if we never expect package symbols to be exported from dynamic libraries, which seems reasonable to me at first glance. Still, we should also generally have no reason to give symbols default visibility unless we're building the API for a dynamic library, so it would be nice if executables and statically-linked packages generally got hidden visibility for even their public declarations as part of their normal build process.
Do you mean that we should force hidden visibility for all Swift symbols when building executables and static libs, regardless of access modifiers, whereas when building a dylib we should default to hidden visibility but respect the access modifiers, e.g. export public/open symbols?
One question is how the build system knows at compile time whether it's building an executable, a static lib, or a dylib. Or should we implement something like -fvisibility=hidden for Swift too?
Yeah, the current behavior should be appropriate for dynamic libraries, where public symbols are given default visibility, and internal or private symbols are not exported at all. For executables or static libraries giving everything hidden visibility is usually more appropriate.
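The policy described here can be written down as a small decision function. This is only a sketch of the suggestion in this thread, not of what the compiler does today: all type and function names are made up, and the `package` case follows the statement elsewhere in the thread that package symbols are currently exported.

```swift
// Hypothetical names throughout; this models the suggested policy only.
enum AccessLevel {
    case privateAccess, internalAccess, packageAccess, publicAccess, openAccess
}

enum ProductKind {
    case executable, staticLibrary, dynamicLibrary
}

enum SymbolVisibility {
    case hidden    // not exported from the linked image
    case exported  // "default" visibility in ELF/Mach-O terms
}

func suggestedVisibility(of access: AccessLevel, in product: ProductKind) -> SymbolVisibility {
    switch product {
    case .executable, .staticLibrary:
        // Nothing needs to be visible outside the final image.
        return .hidden
    case .dynamicLibrary:
        switch access {
        case .publicAccess, .openAccess:
            return .exported
        case .packageAccess:
            // Assumption: package symbols stay exported, as they are today.
            return .exported
        case .privateAccess, .internalAccess:
            return .hidden
        }
    }
}

print(suggestedVisibility(of: .publicAccess, in: .dynamicLibrary))
print(suggestedVisibility(of: .publicAccess, in: .executable))
```

The interesting part is the first branch: for executables and static libraries the access level no longer matters, which is exactly the piece that needs the build system to tell the compiler what kind of product is being built.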
I think there are some existing flags to control this, but maybe the build system(s) such as swiftpm, xcodebuild, etc. don't set them appropriately. @compnerd might know, since on Windows it's necessary to know this in order to correctly emit the symbols in the library.
I don't know that it makes sense to assume that intra-package module boundaries never cross dylib boundaries — I'm sure it's uncommon, but I'm not sure we want to make it impossible.
I think the compiler should make conservative assumptions by default, but the build system should be able to give the compiler "assembly instructions" that specify these sorts of relationships between modules so that we can do something better. For example, the build system can tell the compiler that it's making a dylib with modules A, B, and C in it, and that that dylib is (or is not) a stable-ABI boundary, and that the dylib needs to expose the interface of modules A and C but not B; and then it can also say that it's making a separate executable for modules D and E that will need to link that dylib. And then at every step of the build process, the compiler can make optimal decisions (at least at module granularity) about which symbols to export and how to reference symbols from other modules.
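As a rough illustration of what such "assembly instructions" might contain, the example above can be encoded as plain data. No such format exists today as far as this thread says; `ProductPlan`, its fields, and the product names `Core` and `Tool` are all hypothetical.

```swift
// Hypothetical description of one linked product, as a build system
// might hand it to the compiler. None of this is an existing format.
struct ProductPlan {
    enum Kind { case executable, dynamicLibrary }

    let name: String
    let kind: Kind
    let isStableABIBoundary: Bool
    let modules: Set<String>         // modules linked into this product
    let exposedModules: Set<String>  // modules whose interface leaves the product
    let linkedProducts: [String]     // other products this one links against
}

// Module-granularity decision: public symbols of a module need to be
// exported only if the module's interface is exposed from a dylib.
func shouldExportPublicSymbols(ofModule module: String, in plan: ProductPlan) -> Bool {
    return plan.kind == .dynamicLibrary
        && plan.modules.contains(module)
        && plan.exposedModules.contains(module)
}

// The example from the post: a dylib with A, B, C exposing A and C,
// and an executable with D and E that links that dylib.
let core = ProductPlan(name: "Core", kind: .dynamicLibrary, isStableABIBoundary: false,
                       modules: ["A", "B", "C"], exposedModules: ["A", "C"],
                       linkedProducts: [])
let tool = ProductPlan(name: "Tool", kind: .executable, isStableABIBoundary: false,
                       modules: ["D", "E"], exposedModules: [],
                       linkedProducts: ["Core"])

for module in ["A", "B", "C"] {
    print(module, shouldExportPublicSymbols(ofModule: module, in: core))
}
print("D", shouldExportPublicSymbols(ofModule: "D", in: tool))
```

With this information, module B's public declarations could get hidden visibility even though they are public in source, because nothing outside the dylib is supposed to see them.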
Having that kind of information is also very important on Windows because the code-gen / linking model is much less forgiving about failing to annotate imported symbols.
We'll need to work together to make sure that the information is something that the build system can reasonably provide, but that doesn't seem too difficult — e.g. SPM is already making all these policy decisions, it just needs to write them down. And in the long run, maybe some of that information will also become an optional input to SPM for users that want to make specific decisions about module arrangement.
With a dependency graph of App -> libA -> libB -> libC and App -> libD -> libC, Xcode will build libC as a dylib and link libB against that even if B and C are from the same package, so "intra-package module boundaries will never cross dylib boundaries" is not currently true.
We are already making package symbols accessible across dylib boundaries today in Swift 5.9, and in my mind that was an explicit use case of the package proposal. For example, a team that owns a resilient framework and also an application ought to be able to use package to create framework APIs that are only for the application.
I think John has the right idea with respect to how the build system and compiler would conspire to adjust symbol visibility contextually. It should be possible to optimize symbols associated with package decls in the final executable when we know that the interface of the module they belong to isn't exposed.
Yes, that's the idea. The setting would primarily target statically linked products; for dylibs it would be more tricky as it would require knowing which of the package symbols in dylibs are not accessed (directly or indirectly) by the clients in the same package boundary.
Yeah, the build system has the information about dylibs/static libraries, but I am not sure whether it knows if a dylib needs to expose the interface of a module (I don't know much about the build system :]). There is also the limitation of module granularity: some modules can be pretty big, and I feel module granularity is not good enough.
For our use case of code size, we are assuming LTO is on and we need a way to express the visibility around dylib partitions similar to clang.
There are a few Swift optimizations that may need visibility in order to be functional or work better:
internalize-at-link, conditional-runtime-records CC @kubamracek
Maybe, but that seems like something that you should only need to reach for in specialized situations, like if you're conforming to a C plugin interface that needs specific symbols to be exported from a plugin dylib or something like that. In most normal cases, it seems like the compiler ought to be able to figure out the appropriate visibility given the current set of access control modifiers and some build plan information to describe what kind of final product is being built.
It's already included in Swift 5.9. The visibility setting will be included in the future.
LTO is not on by default for Swift. In Swift 5.9, though, you can pass the flag --experimental-lto-mode thin|full to swift build, along with -lto=llvm-thin|llvm-full as a linker flag. It works for a simple project (e.g. an executable without other package dependencies) but is buggy otherwise (there are also discrepancies between thin and full LTO). We're looking into improving this as well.
Unfortunately, AFAICT, SPM's policy is currently "a module is a dylib" effectively. This has been a serious problem for Windows as it prevents us from doing more appropriate things such as re-exposing interfaces from a static library, building static libraries without warnings, etc.
I'm not sure how easy this will be on all targets, particularly thinking about PE/COFF. It isn't just an instruction to annotate the symbol; it is part of the symbol definition itself. I don't think that creating an alias symbol would work, and we would end up with extra indirection to export a symbol (beyond just the moral equivalent of the GOT).
I think you’re misunderstanding what I mean by “assembly instructions”. I’m not talking about machine instructions; I’m saying that the build system would tell the compiler the arrangements between the module and its dependencies and dependents. On Windows, that would be enough to correctly treat things as dllimport/dllexport only when necessary to cross a shared library boundary.
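A small sketch of the decision this would enable on Windows: if the build system tells the compiler which linked image each module ends up in, a cross-module reference only needs dllimport when the two modules land in different images. The names below are hypothetical, and treating unknown modules as imported is an assumption of this sketch, not a statement about the compiler.

```swift
// Hypothetical names; a model of the decision, not compiler behavior.
enum ReferenceKind {
    case direct     // same linked image: no import annotation needed
    case dllimport  // crosses a shared library boundary
}

// `imageOf` maps each module to the linked image (executable or DLL)
// that contains it, as provided by the build system.
func referenceKind(from user: String, to definer: String,
                   imageOf: [String: String]) -> ReferenceKind {
    guard let userImage = imageOf[user], let definerImage = imageOf[definer] else {
        // Assumption: without build-plan information, fall back to
        // treating the symbol as imported.
        return .dllimport
    }
    return userImage == definerImage ? .direct : .dllimport
}

// Modules A, B, C are linked into Core.dll; D and E into Tool.exe.
let imageOf = ["A": "Core.dll", "B": "Core.dll", "C": "Core.dll",
               "D": "Tool.exe", "E": "Tool.exe"]

print(referenceKind(from: "B", to: "A", imageOf: imageOf))
print(referenceKind(from: "D", to: "A", imageOf: imageOf))
```

The same map answers the export side as well: a module's symbols need dllexport only if some module in a different image refers to them.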