Embedded Swift uses a different linkage model from “Desktop” Swift. These differences become especially pronounced when building and linking static libraries with Swift code. This document explores their differences and provides a suggested design direction for a more suitable Embedded Swift linkage model that accounts for static libraries and provides more control over where Embedded Swift symbols are emitted.
Linkage models
Desktop Swift linkage model
When compiling a library, Desktop Swift generates symbols for all public, package, and open declarations in a module [1] into the resulting object file (.o). The definitions of some of these declarations might also be available for clients to inline (e.g., through the @inlinable attribute or cross-module optimization), but the canonical definition is always present in the object file.
Generic functions and types in this model require extensive use of type metadata. For example, a generic identity function
func identity<T>(_ value: T) -> T { value }
compiled into an object file will require type metadata so that it can copy the T value. Protocol conformance requirements like T: P would require additional metadata. This metadata is not present in Embedded Swift, which therefore cannot emit object code for generic functions.
This linkage model allows some implementation hiding from clients. A non-@inlinable function’s definition will only be present in the compiled object file, so it can refer to entities made available via an internal or @_implementationOnly import. However, without library evolution, this does not in practice fully hide implementation dependencies from clients, as discussed in SE-0409.
Embedded Swift linkage model
When compiling a library, Embedded Swift always retains definitions of every entity in its intermediate representation (SIL) stored in the compiled module file (.swiftmodule). The compiler generates object code for either the transitive closure of the current module and all of its dependencies (the default, suitable for executables) or an empty object file (with -emit-empty-object-file, suitable for libraries).
Embedded Swift relies on specialization of generics. For example, when using the identity function above with a given generic argument (say, Int), Swift will create a “specialized” function identity<Int>. While specialization is an optimization in Desktop Swift (which has the generic implementation to fall back on), it is required in Embedded Swift because it eliminates all uses of type or protocol conformance metadata. Specialization requires access to the definition of the generic function (including across module boundaries), which is always possible in Embedded Swift because the compiler retains all function definitions in the .swiftmodule (as SIL).
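To make the effect of specialization concrete, here is a minimal sketch. The function identity_Int is a hand-written stand-in (a hypothetical name) for what the compiler produces when it specializes identity for Int; the real specialization is generated automatically and gets a mangled symbol name.

```swift
// The generic function from the text. In Desktop Swift the unspecialized
// body needs type metadata for T in order to copy the value.
func identity<T>(_ value: T) -> T { value }

// Hand-written stand-in for the specialization the compiler would create
// for T == Int. The name `identity_Int` is hypothetical. Because the type
// is fixed, no type metadata is needed to copy the value.
func identity_Int(_ value: Int) -> Int { value }

// A call site that fixes T to Int is what triggers specialization.
let viaGeneric = identity(42)
// After specialization, that call behaves like this direct call.
let viaSpecialized = identity_Int(42)

print(viaGeneric, viaSpecialized)
```

The point of the sketch is only that the specialized function has a concrete signature and body, which is why the compiler needs the generic definition to be available at the call site.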
The fact that all definitions are stored in the .swiftmodule means that there is no implementation hiding: importing a module via internal or @_implementationOnly does not hide the dependency on that module from clients, because the full implementation is already exposed.
Analogy with the C(++) linkage model
C and C++ effectively provide both of these models, with fine-grained control over how a particular symbol is exposed. The “Desktop Swift” model is similar to defining a symbol in a .c or .cpp file, e.g.,
// hello.c or hello.cpp
#include <stdio.h>

void hello(void) {
    printf("Hello!\n");
}
The resulting object file will contain the _hello symbol (or __Z5hellov in C++ with the Itanium ABI) with the implementation. The interface that clients use is expressed in a separate header file that doesn’t have the definition:
// hello.h
void hello(void);
The “Embedded Swift” model is similar to making the function (static [2]) inline in the header:
// hello.h
#include <stdio.h>

/*static in C*/ inline void hello(void) {
    printf("Hello!\n");
}
Here, no symbol will be emitted unless hello is actually called. A client that calls hello will produce object code for the implementation of hello.
C++ templates follow a similar model to inline. When instantiating a template with a given set of template arguments, the definition of that template has to be provided in the header, and its object code will be emitted into the client. This is essentially what Embedded Swift does with specialization of generics.
C++ template instantiations and inline functions produce symbols that can be de-duplicated by the linker. Those definitions all need to be the same according to C++’s One Definition Rule (ODR).
Issues with the Embedded Swift linkage model
There are a few issues we’ve run into with the Embedded Swift linkage model.
Non-Swift clients of Swift libraries
The Embedded Swift linkage model assumes that the final executable will be built as Swift, and can generate all of the object code for that module and everything it depends on. If instead the Embedded Swift code is meant to be packaged into a static library for use by non-Swift clients (such as a C client calling into a @_cdecl Swift function), this model does not work as well.
One of the issues is that the Swift compiler will need to emit definitions for the module being compiled into the static library as well as the transitive closure of everything that module depends on. Let’s say we do this for two Swift modules B and C, both of which depend on a common Swift module A. If another module D links the static libraries for both B and C, it will result in duplicate symbol errors for the symbols in A.
[copy of A]   [copy of A]
     |             |
     B             C
     |             |
     +------+------+
            |
            D
The Embedded Swift flag -mergeable-symbols emits all Swift symbols such that they can be de-duplicated by the linker, using the same mechanism as C++ does for template instantiations and inline functions. This eliminates linker errors due to duplication. However, it does not provide a way to ensure that specific symbols only have a single definition in one object file, the way that defining a function in a .c[pp] file does.
No link-time polymorphism
Link-time polymorphism refers to a technique where there are multiple implementations of a given interface, and one selects which implementation to use by linking in the static archive corresponding to one of the implementations. For example, in the C world, one can have a standard “hello” header:
// hello.h
void hello(void);
There might be multiple implementations of this interface built into different static libraries. For example, libhello_english.a might contain:
void hello(void) {
    printf("Hello");
}
whereas libhello_français.a might contain:
void hello(void) {
    printf("Bonjour");
}
Then, you can select the appropriate implementation at link time via -lhello_english or -lhello_français, respectively. If somehow both get linked into a binary, the linker will produce a duplicate-symbol error, which flags the problem. The Desktop Swift model of compilation supports this approach in the same way C does for non-inlined code, although it takes care to ensure that these two modules share the same interfaces.
Embedded Swift does not work well here. The implementation of hello would be inlined into each of the clients, so it’s not possible to replace those implementations later. This would normally be detected by the linker as an error (so at least one wouldn’t accidentally make this mistake), but trying to address the first problem by using -mergeable-symbols would hide this problem.
No implementation hiding
All of the definitions within a Swift module are exposed in the .swiftmodule file, so that clients can generate code for them. If these definitions depend on some C headers in a module, those C headers and the module map that covers them must be available to all clients. In practice, this means that any dependency on a C header requires that header to be modularized, including in all of the clients.
The internal import feature (or its predecessor, @_implementationOnly import) does not adequately protect against this. It will detect attempts to use entities from the internally-imported module in the signatures of public APIs, but does not hide those dependencies from client compiles. With Desktop Swift’s linkage model, one can hide those dependencies using library evolution and non-inlinable code, but no such affordance exists for Embedded Swift.
Proposed solution
Explicitly mark symbols as being part of the object file
We propose to introduce an attribute @alwaysEmitIntoObjectFile that specifies that a given non-inlinable function should be emitted into the object file, and that its definition should not be placed into the corresponding .swiftmodule file. For example, given this definition:
@_cdecl("hello") @alwaysEmitIntoObjectFile
public func hello() {
    printString("Hello")
}
The Swift compiler would emit the definition in the symbol _hello in the object file, as a strong definition (i.e., one that cannot be merged with others of the same name). However, it would not emit the definition into the .swiftmodule file, so clients cannot inline the function: they must call into that _hello symbol. Since this is a @_cdecl entry point, a C program could also link the library and call this function. From a C perspective, one can think of @alwaysEmitIntoObjectFile as moving the definition of the function into a .c file (rather than it being in the header file).
The Swift compiler may also need to emit additional code into the object file to support the definition of hello. For example, the printString function might come from another Swift module Printing, where its definition is only in the .swiftmodule file. Or printString might be another function within the same module. Either way, the definition of printString will be emitted into the object file (so hello can call it) as a symbol that can be merged by the linker.
There are some necessary restrictions on @alwaysEmitIntoObjectFile, including:
- It cannot be used on a generic function, or a function within a generic type.
- It cannot be used on a function marked @inlinable or @_alwaysEmitIntoClient.
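As a rough illustration of these restrictions, the sketch below annotates a few declarations with how the proposed attribute would be expected to treat them. @alwaysEmitIntoObjectFile is only proposed here and is not assumed to exist in any shipping compiler, so every use of it is left in a comment; greet, firstElement, Box, and twice are hypothetical names chosen for the example.

```swift
// Allowed under the proposed rules: non-generic and not inlinable.
// @alwaysEmitIntoObjectFile
public func greet() -> String { "Hello" }

// Expected to be rejected: a generic function. Embedded Swift cannot emit
// object code for an unspecialized generic.
// @alwaysEmitIntoObjectFile   // error under the proposed rules
public func firstElement<T>(of values: [T]) -> T? { values.first }

public struct Box<T> {
    public var value: T
    public init(_ value: T) { self.value = value }

    // Expected to be rejected: a function within a generic type.
    // @alwaysEmitIntoObjectFile   // error under the proposed rules
    public func describe() -> String { "Box(\(value))" }
}

// Expected to be rejected: the function is marked @inlinable.
// @alwaysEmitIntoObjectFile   // error under the proposed rules
@inlinable
public func twice(_ x: Int) -> Int { x * 2 }

print(greet(), twice(21), Box(7).describe())
```

With the attribute lines commented out, the file compiles with a current toolchain; the comments only record what the proposal would be expected to accept or diagnose.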
For _main and other entry points, some mechanism will need to ensure that object code is generated for that entry point, as well as triggering object code generation for all of its dependencies. This object code must not be stripped from the resulting binary.
Module-level flag for “build an object file linkable by non-Swift tools”
Some Swift modules are “leaf” modules that are intended to be consumed by non-Swift tools. For example, they might be linked by C programs that don’t know about Swift, or be passed into the linker to build an executable or shared library.
Swift should provide a module-level flag that indicates that it’s building such a “leaf” module. This has the effect of inferring @alwaysEmitIntoObjectFile on every public, package, and open declaration in the module that is well-formed by the rules above (i.e., it is non-generic and isn’t marked @inlinable or @_alwaysEmitIntoClient). It should also emit any serialized @_used declarations and entry points from imported modules into object code, effectively encapsulating all of the Swift code.
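The inference rule for “leaf” modules can be written down as a small predicate. This is only a model of the rule as stated in the text, not compiler code: AccessLevel, Declaration, and infersAlwaysEmitIntoObjectFile are hypothetical names invented for the illustration.

```swift
// Hypothetical model of the facts the inference rule looks at. None of
// these types are compiler API; they exist only to make the rule concrete.
enum AccessLevel {
    case privateAccess, internalAccess, packageAccess, publicAccess, openAccess
}

struct Declaration {
    var name: String
    var access: AccessLevel
    var isGeneric = false                // generic, or nested in a generic type
    var isInlinable = false              // marked @inlinable
    var isAlwaysEmitIntoClient = false   // marked @_alwaysEmitIntoClient
}

/// Returns true when a "leaf" module build would infer
/// @alwaysEmitIntoObjectFile for `decl`, following the rule in the text.
func infersAlwaysEmitIntoObjectFile(_ decl: Declaration) -> Bool {
    let isExported: Bool
    switch decl.access {
    case .publicAccess, .packageAccess, .openAccess:
        isExported = true
    case .privateAccess, .internalAccess:
        isExported = false
    }
    return isExported
        && !decl.isGeneric
        && !decl.isInlinable
        && !decl.isAlwaysEmitIntoClient
}

let declarations = [
    Declaration(name: "hello", access: .publicAccess),
    Declaration(name: "identity", access: .publicAccess, isGeneric: true),
    Declaration(name: "fastPath", access: .publicAccess, isInlinable: true),
    Declaration(name: "helper", access: .internalAccess),
]

for decl in declarations {
    print(decl.name, infersAlwaysEmitIntoObjectFile(decl))
}
```

In this model only hello qualifies. The others would keep the default Embedded Swift behavior, where the definition is emitted on demand into whichever object file needs it.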
Don’t infer @alwaysEmitIntoObjectFile from other attributes
There are a few cases where we might be tempted to infer @alwaysEmitIntoObjectFile. For example:
- @_used declarations could infer @alwaysEmitIntoObjectFile, because they will eventually need to end up in an object file.
- The _main entry point will need to be emitted into object code, and we could infer @alwaysEmitIntoObjectFile on it to ensure that happens.
- @cdecl @implementation functions provide Swift implementations of C functions that were declared in a C header. By definition, at least some of the clients of such functions are C clients, so they will need the symbols defined in object code, which could be accomplished by inferring @alwaysEmitIntoObjectFile.
In all of these cases, inferring @alwaysEmitIntoObjectFile defeats whole-program optimizations that could be important in Embedded Swift, because the object code is generated in the library’s context rather than in the context of the whole “leaf” module. Therefore, we should not pursue inference of @alwaysEmitIntoObjectFile from other attributes, and will instead rely on knowledge of which modules are “leaf” modules.
Implementation
I've started an implementation of this in a pull request here.
Doug