One source of memory and performance overhead in Swift code is the instantiation and fetching of type metadata. Even though generic specialization eliminates the need for type metadata in most fully-specialized code, we still need the metadata in many frequently-occurring situations:
- Objects always need their class metadata, which serves as the "isa" pointer with the object's method table and other dynamic metadata.
- When putting a value inside an existential box, the type metadata for the value's type is stored in the box to represent its dynamic type.
- When calling into unspecialized code, type metadata for the generic type arguments has to be formed. Code may remain unspecialized because it crosses ABI boundaries or is invoked via dynamic reflection.
Currently, when Swift needs the metadata for a generic type or for builtin structural types such as tuples and functions, it always calls into the runtime, which will allocate and initialize metadata records for these types on demand. Although the runtime maintains caches for the resulting records, these calls can still be expensive and lead to noticeable time spent in functions like swift_getGenericMetadata. This cost can be particularly compounded by libraries that rely on deep composition of generic adapter types, like the standard library's lazy collections. If you compose a bunch of transformer wrappers, then put the result in AnySequence or some other dynamically-typed container, like:
let seq = AnySequence(array.lazy.concat().filter { ... }.map { ... })
then we need the metadata for the composed sequence, and getting metadata for a deeply nested generic type like LazyMap<LazyFilter<LazyConcat<...>>> through the runtime requires instantiating every level of generic type, which can take a significant amount of time and memory if it occurs in a hot path.
In many cases, we only really need the metadata for the outermost type, and building the component types' metadata is just a side effect of relying on the runtime to dynamically build the metadata on demand. We also know in most cases exactly what types we need metadata for at compile time, when we emit code that instantiates the metadata. We can minimize the runtime's involvement in metadata instantiation and reduce memory use by having the compiler generate pre-specialized metadata records for these types, and updating the runtime and ABI in forward-compatible ways to efficiently accommodate prespecializations. @nate_chandler has been working on this optimization in a PR currently open on the Swift compiler: "Generic metadata prespecialization, part 1" (apple/swift pull request #28610).
Maintaining uniqueness of metadata records
Prespecialization is limited by some ABI decisions we've already made. The Swift runtime relies on the pointer identity of metadata records to correspond to the identity of types; in other words, if two runtime values have the same dynamic type, their type metadata records should be the same identical object at the same address in memory, so that pointer equality corresponds to type equality. This poses a challenge for pre-specialized metadata records, because they may not be unique:
- Different dynamic libraries in the process may have generated the same specialization. Only one can be picked as the process-wide canonical metadata for the type.
- Dynamic requests to instantiate the type metadata need to produce the same metadata record as direct references to the specialized type. In other words, a request to build the metadata for Array with the dynamic element type (Int) -> String has to return the same pointer as a direct reference to specialized metadata for Array<(Int) -> String>.
Therefore, we need a way to register specialized type metadata with the runtime so that it can be "blessed" as canonical metadata. We can do this with various levels of confidence and overhead, depending on how much control we have over the types involved in the specialization.
Alternatively, we might be able to relax the uniqueness requirement in some situations, since there are relatively few operations in Swift that fundamentally rely on that uniqueness, such as == for metatypes and dynamic casts. We could potentially relax the ABI requirement for new code, allowing "surrogate" metadata records to be used up to the point the canonical metadata is actually required, such as when performing one of those operations, or calling into code compiled with the older ABI. Establishing that latter criterion is tricky, however, and having multiple metadata records active for the same type has its own potential for increasing the working set of a process, or for exposing surprising behavior because of differences between different metadata records.
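To make the uniqueness problem concrete, here is a minimal toy model in C. It is a sketch and not the real runtime: ToyMetadata, canonicalize, and the "first registration wins" policy are all assumptions made up for illustration. It shows two hypothetical libraries that each carry their own record for the same type, why an address comparison tells them apart, and how routing both through one registration point restores pointer equality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a metadata record; NOT the real runtime layout. */
typedef struct ToyMetadata {
    const char *generic_name;            /* e.g. "Array" */
    const struct ToyMetadata *argument;  /* single generic argument, or NULL */
} ToyMetadata;

/* The assumption the runtime relies on: same type iff same address. */
bool same_type(const ToyMetadata *a, const ToyMetadata *b) {
    return a == b;
}

static const ToyMetadata int_metadata = { "Int", NULL };

/* Two hypothetical dynamic libraries each emit their own Array<Int> record. */
static const ToyMetadata lib_a_array_of_int = { "Array", &int_metadata };
static const ToyMetadata lib_b_array_of_int = { "Array", &int_metadata };

/* Hypothetical process-wide policy: the first record registered wins. */
static const ToyMetadata *canonical_array_of_int = NULL;

const ToyMetadata *canonicalize(const ToyMetadata *candidate) {
    assert(candidate != NULL);
    if (canonical_array_of_int == NULL) {
        canonical_array_of_int = candidate;
    }
    return canonical_array_of_int;
}

/* True when the two unregistered copies are distinguishable by address. */
bool copies_differ_without_registration(void) {
    return !same_type(&lib_a_array_of_int, &lib_b_array_of_int);
}

/* True when both libraries agree after going through canonicalize(). */
bool copies_agree_after_registration(void) {
    const ToyMetadata *from_a = canonicalize(&lib_a_array_of_int);
    const ToyMetadata *from_b = canonicalize(&lib_b_array_of_int);
    return same_type(from_a, from_b);
}
```

Both helper functions return true in this model: the raw records compare unequal even though they describe the same type, and they compare equal only once every reference goes through the single registration point.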
Specialized metadata for generic types defined in the current compilation unit
The current ABI already gives a module full control over how the generic types it defines are instantiated, by having references from outside the module call a metadata accessor function defined by the module. This makes specialization of metadata for types defined in the current compilation unit relatively straightforward. Since the defining module controls metadata instantiation, it can ensure that any prespecialized metadata records it created itself are canonical, by using them to serve dynamic metadata instantiations of the same type. For instance, if a module contains the following code:
struct Foo<T> {}
then it will define a metadata accessor function for Foo, which looks something like this pseudo-C:
const Type *`metadata accessor for Foo`(const Type *T) {
    return swift_getGenericMetadata(&`type descriptor for Foo`, T);
}
where swift_getGenericMetadata is the Swift runtime's default mechanism for dynamically instantiating and caching metadata records at runtime. However, if within the same compilation unit, we know that specific metadata records are used:
func foo() {
    // Instantiate metadata for the given generic instances, by passing the type
    // as a dynamic value to print()
    print(Foo<Int>.self)
    print(Foo<String>.self)
    print(Foo<Float>.self)
}
then we can generate pre-specialized metadata records for Foo<Int>, Foo<String>, and Foo<Float>, and to serve dynamic metadata requests, we can extend the metadata accessor to check for these specializations before falling back to swift_getGenericMetadata:
const Type *`metadata accessor for Foo`(const Type *T) {
    if (T == &`metadata for Int`) {
        return &`metadata for Foo<Int>`;
    }
    if (T == &`metadata for String`) {
        return &`metadata for Foo<String>`;
    }
    if (T == &`metadata for Float`) {
        return &`metadata for Foo<Float>`;
    }
    return swift_getGenericMetadata(&`type descriptor for Foo`, T);
}
In turn, this guarantees that the specialized metadata records are canonical, so code within the compilation unit can directly reference the metadata records by address instead of calling the metadata accessor. The prespecialized metadata objects themselves would normally remain private to the module, because the exact set of prespecializations that happened to be used inside the implementation of the module should not be ABI. We could conceivably extend the @_specialize attribute, which currently applies to functions to emit specializations for specific types or constraints as ABI, to allow a library to explicitly export certain metadata prespecializations. This would allow the standard library in particular to pre-specialize metadata for common Array, Set, Dictionary, and other types, and let other modules directly access those metadata records.
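The accessor scheme in this section can be exercised with a small runnable model in C. This is only a sketch under stated assumptions: ToyMetadata, toy_get_generic_metadata, and the fixed-size cache are hypothetical stand-ins for the real metadata records and for swift_getGenericMetadata, and the model handles a single generic parameter. It shows that the prespecialized records are returned without touching the runtime, and that other arguments still get one cached record each.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a metadata record; NOT the real runtime layout. */
typedef struct ToyMetadata {
    const char *name;
    const struct ToyMetadata *argument;
} ToyMetadata;

static const ToyMetadata int_metadata    = { "Int", NULL };
static const ToyMetadata string_metadata = { "String", NULL };
static const ToyMetadata double_metadata = { "Double", NULL };

/* Prespecialized records emitted alongside the definition of Foo. */
static const ToyMetadata foo_of_int    = { "Foo", &int_metadata };
static const ToyMetadata foo_of_string = { "Foo", &string_metadata };

/* Hypothetical stand-in for the runtime's instantiation cache. */
enum { CACHE_CAPACITY = 8 };
static ToyMetadata runtime_cache[CACHE_CAPACITY];
static size_t runtime_cache_count = 0;
static int runtime_calls = 0;

/* Plays the role of swift_getGenericMetadata for Foo in this model. */
const ToyMetadata *toy_get_generic_metadata(const ToyMetadata *argument) {
    runtime_calls++;
    for (size_t i = 0; i < runtime_cache_count; i++) {
        if (runtime_cache[i].argument == argument) {
            return &runtime_cache[i];
        }
    }
    assert(runtime_cache_count < CACHE_CAPACITY);
    ToyMetadata *fresh = &runtime_cache[runtime_cache_count++];
    fresh->name = "Foo";
    fresh->argument = argument;
    return fresh;
}

/* The accessor checks known specializations before asking the runtime. */
const ToyMetadata *foo_metadata_accessor(const ToyMetadata *argument) {
    if (argument == &int_metadata) return &foo_of_int;
    if (argument == &string_metadata) return &foo_of_string;
    return toy_get_generic_metadata(argument);
}

/* Number of times the slow runtime path has been taken so far. */
int toy_runtime_call_count(void) {
    return runtime_calls;
}
```

Because the accessor is the only way to obtain a Foo record dynamically in this model, &foo_of_int is the one and only Foo<Int> record, which is what lets code in the defining module refer to it directly by address.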
Specialized metadata for generic types defined in other modules
For generic types defined outside the current module, we don't have the benefit of completely controlling the type's metadata instantiation, but we can build mechanisms into the runtime that give modules the opportunity to influence other modules' generic metadata instantiation.
Because specialized metadata is not guaranteed to be unique in cases like this, we will not be able to access it by direct reference. We'll need to feed accesses through a runtime call that caches the canonical metadata record for the type, similar to what swift_getForeignTypeMetadata does for type metadata generated by the Clang importer from C struct and enum types. So for something like:
func bar() {
    print(Array<Int>.self)
}
we could generate a metadata record specialized for Array<Int>, along with a global variable to cache the instantiated canonical record, and access it through a runtime call, like this pseudo-C:
const Type `type metadata for Array<Int>`;
const Type *`cache for type metadata for Array<Int>` = 0;
void bar(void) {
    // Fetch the canonical metadata, registering our specialization if possible
    const Type *type = swift_getSpecializedGenericMetadata(
        &`type metadata for Array<Int>`,
        &`cache for type metadata for Array<Int>`);
    print(type);
}
swift_getSpecializedGenericMetadata would return the cached pointer if it has already been set to non-null. Otherwise, it would do the one-time instantiation work of trying to register the metadata as canonical if possible and, failing that, getting the canonical metadata record from the runtime.
As to how that registration occurs, there are a number of possibilities. The most straightforward one is probably to introduce a new registration section to the binary, akin to the __swift5_proto and __swift5_types sections for protocol conformances and types, to register the set of specializations in each binary with the runtime. This would impose some overhead when the runtime needs to instantiate a generic type, since it would have to scan these sections first, though once all the used generic types are cached, that overhead should eventually be amortized.
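The caching and registration behaviour described above can be sketched as a toy model in C. Everything here is hypothetical: ToyMetadata, the string key used to identify a type, the linear table standing in for the runtime's registry, and toy_get_specialized_generic_metadata standing in for the proposed swift_getSpecializedGenericMetadata. The model also simplifies the order of operations to "look up the canonical record, register ours if there is none", and it ignores thread safety entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a prespecialized record; NOT the real runtime layout. */
typedef struct ToyMetadata {
    const char *type_name;        /* hypothetical key identifying the type */
    const char *emitting_module;  /* which module emitted this copy */
} ToyMetadata;

/* Hypothetical table standing in for the runtime's set of canonical records. */
enum { TABLE_CAPACITY = 16 };
static const ToyMetadata *canonical_table[TABLE_CAPACITY];
static size_t canonical_count = 0;

static const ToyMetadata *lookup_canonical(const char *type_name) {
    for (size_t i = 0; i < canonical_count; i++) {
        if (strcmp(canonical_table[i]->type_name, type_name) == 0) {
            return canonical_table[i];
        }
    }
    return NULL;
}

const ToyMetadata *toy_get_specialized_generic_metadata(
        const ToyMetadata *candidate, const ToyMetadata **cache) {
    assert(candidate != NULL && cache != NULL);
    /* Fast path: this call site has already resolved the canonical record. */
    if (*cache != NULL) {
        return *cache;
    }
    /* Slow path: reuse an existing canonical record, or register ours. */
    const ToyMetadata *canonical = lookup_canonical(candidate->type_name);
    if (canonical == NULL) {
        assert(canonical_count < TABLE_CAPACITY);
        canonical_table[canonical_count++] = candidate;
        canonical = candidate;
    }
    *cache = canonical;
    return canonical;
}

/* Two hypothetical modules, each carrying its own Array<Int> record. */
static const ToyMetadata module_a_array_of_int = { "Array<Int>", "ModuleA" };
static const ToyMetadata *module_a_cache = NULL;
static const ToyMetadata module_b_array_of_int = { "Array<Int>", "ModuleB" };
static const ToyMetadata *module_b_cache = NULL;

const ToyMetadata *module_a_bar(void) {
    return toy_get_specialized_generic_metadata(&module_a_array_of_int,
                                                &module_a_cache);
}

const ToyMetadata *module_b_bar(void) {
    return toy_get_specialized_generic_metadata(&module_b_array_of_int,
                                                &module_b_cache);
}
```

If module_a_bar runs first, its record becomes canonical and module_b_bar returns that same pointer, leaving ModuleB's own copy unused. That unused copy is the code-size cost of not knowing ahead of time which module will win.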
There may be ways to improve on this basic approach, particularly to allow for pre-specialized metadata records to be assumed to be canonical in more situations. One rule we could implement is that, if the generic arguments consist only of types defined in a specific module, the runtime always favors specialization records from that module as canonical. This would allow the module that defines a type Foo to also generate the canonical specialized metadata for Array<Foo>, Set<Foo>, and other standard library types.
Specialized metadata for structural types
Metadata for specialized structural types, such as tuples and functions, presents similar issues to those for generic types defined in the standard library. One complication is that the structural type metadata layouts are currently mostly private to the runtime. We would need to establish some longer-term guarantees about the layouts for kinds of types we want to be able to prespecialize.
Lazifying access to generic arguments through metadata
One of the optimization opportunities of metadata pre-specialization is the ability for the compiler to skip over emitting metadata for unnecessary intermediate types; for instance, if the metadata for Array<(Int, String) -> (Int, String)> is needed, then the compiler can generate a metadata record for specifically that type, without also generating metadata for (Int, String) -> (Int, String) and (Int, String). Unfortunately, the current ABI interferes with this ability, because code generation expects to be able to load the generic arguments directly out of the metadata for a generic type. For example, a metadata record for an instantiation of a generic struct Foo<T, U, V> looks something like this in memory:
struct FooMetadata {
    uintptr_t kind;
    const TypeContextDescriptor *contextDescriptor;
    const Metadata *T;
    const Metadata *U;
    const Metadata *V;
};
And if code needs to derive the generic arguments from the metadata record, such as when calling a generic function from a protocol witness, then the compiler today will generate code that loads directly from those offsets. Something like this:
protocol P {
    func foo()
}
struct S<T, U, V>: P {
    func foo() {
        bar(T.self, U.self, V.self)
    }
}
func bar<T, U, V>(_: T.Type, _: U.Type, _: V.Type) {}
Then the implementation of foo might end up looking something like this pseudo-C:
void `S.foo`(const FooMetadata *Self) {
    bar(Self->T, Self->U, Self->V);
}
This isn't great for prespecialized metadata, since it means that we would have to instantiate all of the generic argument metadata anyway, either with additional metadata prespecializations or at runtime, which would increase the code size cost of specialization and introduce instantiation overhead when using pre-specialized records.
We're stuck with the ABI we have for code that has to deploy to existing OSes, which means that we will have to fully instantiate the generic arguments before passing metadata prespecializations to code that might be compiled with older compilers. We can do better for code built with new compilers, by having the compiler generate code to trigger instantiation of the generic arguments on demand, making the above more like:
struct FooMetadata {
    uintptr_t kind;
    const TypeContextDescriptor *contextDescriptor;
    // These fields point at a mangled name or instantiated type metadata
    // depending on the instantiation state of the metadata record
    union {
        const Metadata *T;
        const char *T_mangled_name;
    };
    union {
        const Metadata *U;
        const char *U_mangled_name;
    };
    union {
        const Metadata *V;
        const char *V_mangled_name;
    };
};
void `S.foo`(const FooMetadata *Self) {
    // Runtime call forces all the type arguments to be instantiated
    swift_instantiateGenericArguments(Self);
    bar(Self->T, Self->U, Self->V);
}
which adds a small amount of overhead, but would let us put off recursively instantiating metadata for the type's generic arguments until we need to, in code compiled for newer platforms. The compiler-generated prespecialized metadata would then include something like a mangled name for each of its arguments that could be used by the runtime to instantiate the metadata, instead of pointers directly to metadata.
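The on-demand scheme can be modelled with a short runnable C sketch. All names are hypothetical: ToyPairMetadata is a made-up two-argument record, toy_metadata_by_name stands in for whatever the runtime would use to resolve a mangled name, and the explicit arguments_instantiated flag is an assumption about how the instantiation state might be tracked, which the pseudo-C above leaves open. Plain type names are used in place of real mangled names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a metadata record; NOT the real runtime layout. */
typedef struct ToyMetadata {
    const char *name;
} ToyMetadata;

static const ToyMetadata int_metadata    = { "Int" };
static const ToyMetadata string_metadata = { "String" };

/* Hypothetical by-name lookup, counting how often it is used. */
static int name_lookups = 0;

static const ToyMetadata *toy_metadata_by_name(const char *name) {
    name_lookups++;
    if (strcmp(name, "Int") == 0) return &int_metadata;
    if (strcmp(name, "String") == 0) return &string_metadata;
    return NULL;
}

/* Each argument slot holds a name until instantiated, then a pointer. */
typedef union ToyGenericArgument {
    const ToyMetadata *metadata;
    const char *name;
} ToyGenericArgument;

typedef struct ToyPairMetadata {
    bool arguments_instantiated;  /* assumed state flag, see lead-in */
    ToyGenericArgument T;
    ToyGenericArgument U;
} ToyPairMetadata;

/* Plays the role of swift_instantiateGenericArguments in this model. */
void toy_instantiate_generic_arguments(ToyPairMetadata *self) {
    assert(self != NULL);
    if (self->arguments_instantiated) {
        return;
    }
    const ToyMetadata *t = toy_metadata_by_name(self->T.name);
    const ToyMetadata *u = toy_metadata_by_name(self->U.name);
    assert(t != NULL && u != NULL);
    self->T.metadata = t;
    self->U.metadata = u;
    self->arguments_instantiated = true;
}

/* Prespecialized record for a hypothetical Pair<Int, String>, names only. */
static ToyPairMetadata pair_of_int_string = {
    .arguments_instantiated = false,
    .T = { .name = "Int" },
    .U = { .name = "String" },
};

/* Code that needs an argument forces instantiation first, then loads it. */
const ToyMetadata *pair_first_argument(ToyPairMetadata *self) {
    toy_instantiate_generic_arguments(self);
    return self->T.metadata;
}

int toy_name_lookup_count(void) {
    return name_lookups;
}
```

Note that the record has to be writable in this model, since the slots are overwritten in place on first use. Code that never asks for the arguments never pays for the lookups, which is the saving the lazy scheme is after.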
Code size and performance tradeoffs with metadata prespecialization
Generating specialized metadata records at compile time should reduce memory usage by replacing dynamic allocations with statically-allocated metadata records, and by avoiding instantiating intermediate types when building composed generic types. The prebuilt metadata records will of course increase the code size of binaries. One issue is that type metadata records are not "true const" because they contain absolute pointers, so generating prespecialized metadata records as pre-instantiated metadata could also carry a load-time cost. It may still make sense for prespecialized records to use a type metadata pattern and be instantiated lazily.
Metadata prespecialization also opens up further specialization opportunities; at the cost of yet more code size, we could also specialize the value witness methods for specialized types, as well as the destructor and virtual methods of classes. We'll want to experiment to see how the performance/code size tradeoff works out if we take those opportunities.
Compatibility issues
There are places in the Swift runtime that need to be modified to handle pre-specialized metadata records, particularly the metadata cache. Since metadata prespecialization is primarily an optimization, we can disable it when targeting OSes with older Swift runtimes. Some of the ABI changes discussed would also only be possible to take advantage of in programs that don't link in any binaries built with the older ABI, which is a trickier condition to establish. It may not immediately be practical to implement these changes.