TL;DR: we can get much faster manifest loading for packages that stick within a declarative subset of the language. Details below---I'd love your thoughts on the approach.
Swift package manifests (Package.swift) are executable Swift code written in a mostly-declarative style. The current package manifest loader calls the compiler to build the package manifest into a program, then runs that program and parses the JSON output describing the package. This allows arbitrary code execution in the package manifest, which is useful, but can be somewhat slow due to the cost of all of these steps. Times are at best in the hundreds of milliseconds, with the average across the whole of the Swift Package Index being around 1.5 seconds on my machine (the median is around 1.3 seconds). The results are cached, but the first-build experience, CI systems, and places where user interaction is involved (e.g., editing the manifest in an IDE) still can feel slow.
Many package manifests are simple and fit into a declarative style: a single Package initialization that declares the various products, dependencies, and targets. Here is a (simplified) version of the package manifest for swift-argument-parser that illustrates what many packages look like:
import PackageDescription
var package = Package(
name: "swift-argument-parser",
products: [
.library(
name: "ArgumentParser",
targets: ["ArgumentParser"]),
],
dependencies: [],
targets: [
// Core Library
.target(
name: "ArgumentParser",
dependencies: ["ArgumentParserToolInfo"],
exclude: ["CMakeLists.txt"]),
]
)
For these package manifests, we could implement a separate manifest loader that parses the Swift code and evaluates its syntax tree, without calling the compiler or running any code. Such an implementation should be significantly faster and is naturally sandboxed, but it is limited to those package manifests that it can fully understand from looking at the syntax.
Approach
This pull request implements a new parsing manifest loader, which uses swift-syntax to parse the manifest directly, and then walks the resulting syntax tree to find the package declaration, its targets, dependencies, and so on. The result is a package manifest that is equivalent to the one produced by building and executing the manifest, but faster.
Limitations
One particularly important aspect of the parsing manifest loader is how it deals with manifests that include executable code or, more generally, code that it does not recognize. When the parsing manifest loader encounters a syntax node it does not recognize, it records a "limitation" associated with that syntax tree node. For example, an if like this
if buildDynamicLibrary {
products = [
.library(
name: "_SwiftSyntaxDynamic",
type: .dynamic,
targets: [ /* some targets elided for brevity*/ ]
)
]
} else {
products = [
.library(name: "SwiftBasicFormat", targets: ["SwiftBasicFormat"]),
// more targets elided for brevity
]
}
will be recorded as an unrecognized syntax node on the if. The presence of any limitations while processing a manifest indicates that the result of the parsing manifest loader should not be trusted. In such cases, SwiftPM can fall back to the existing (executable) manifest loader, allowing the parsing manifest loader to be treated as an optimization.
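To make the "record limitations, then fall back" idea concrete, here is a minimal sketch. It uses a toy node model rather than real swift-syntax nodes, and every name in it (ToyNode, Limitation, ParsedManifest, parseManifest, loadManifest) is hypothetical and not taken from the pull request:

```swift
// Toy stand-in for top-level manifest syntax; NOT swift-syntax's node types.
enum ToyNode {
    case packageInit(name: String, targets: [String])
    case unrecognized(kind: String, line: Int)
}

// Hypothetical record of a syntax node the loader could not interpret.
struct Limitation: Equatable {
    let line: Int
    let message: String
}

struct ParsedManifest: Equatable {
    var name: String
    var targets: [String]
}

/// Walks the toy nodes, recording a limitation for anything it does not recognize.
func parseManifest(_ nodes: [ToyNode]) -> (manifest: ParsedManifest?, limitations: [Limitation]) {
    var manifest: ParsedManifest? = nil
    var limitations: [Limitation] = []
    for node in nodes {
        switch node {
        case let .packageInit(name, targets):
            manifest = ParsedManifest(name: name, targets: targets)
        case let .unrecognized(kind, line):
            limitations.append(
                Limitation(line: line, message: "Unsupported syntax '\(kind)' in package manifest"))
        }
    }
    return (manifest, limitations)
}

/// Trusts the parsed result only when no limitations were recorded;
/// otherwise defers to the (much slower) executing loader.
func loadManifest(_ nodes: [ToyNode], executingLoader: () -> ParsedManifest) -> ParsedManifest {
    let (manifest, limitations) = parseManifest(nodes)
    if limitations.isEmpty, let manifest = manifest {
        return manifest
    }
    return executingLoader()
}
```

The key property is that a single recorded limitation discards the whole parsed result, even if a Package initialization was found along the way.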
With the implementation, one can request that the limitations, if found, are printed to the terminal using the experimental flag --experimental-show-manifest-parser-limitations. For example, the swift-syntax package itself (from which the above excerpt was taken) will report the limitation as follows:
swift-syntax/Package.swift:6:24: error: Unsupported syntax 'expressionStmt' in package manifest [#PackageManifest]
6 | let products: [Product]
7 |
8 | if buildDynamicLibrary {
| `- error: Unsupported syntax 'expressionStmt' in package manifest [#PackageManifest]
9 | products = [
10 | .library(
This flag can be used to determine why a particular package manifest isn't handled by the parsing manifest loader. For package authors, this can be useful to ensure that your particular manifest takes the fast path. For developers of SwiftPM itself, it can be used to identify places where the parsing manifest loader could be extended to support more common patterns, thereby making the parsing manifest loader apply to more manifests.
Validation
The parsing manifest loader is a second implementation of a manifest loader. As such, the primary way to validate it is to ensure that it produces exactly the same results as the existing (executing) manifest loader. Fortunately, it is easy to get a complete view of the package manifest, because SwiftPM can already dump it as JSON with the swift package dump-package operation. The parsing manifest loader therefore has two possible modes of success for any given package manifest:
- It can produce identical JSON to what the executing manifest loader produces, or
- It can record one or more limitations, indicating that it does not support this manifest.
Either result is correct. The first result is more desirable, because it indicates that we could skip the executing manifest loader entirely. If the parsing manifest loader is sufficiently fast, and enough packages produce the first result, then the parsing manifest loader works as an optimization.
The implementation of the parsing manifest loader adds an experimental command-line option, --experimental-manifest-processing-mode, that can enable the parsing manifest loader. It must be provided with one of the following options:
- only-parsed: only use the parsing loader. This is not correct, because some manifests require execution, but can be used for testing.
- only-executed: only use the executing loader. This matches SwiftPM's current behavior, and remains the default at this point.
- parsed-with-fallback: try using the parsing loader. If it encounters limitations, fall back to the executing loader without failing. This option is what can provide build-time performance improvements when the parsing manifest loader takes over.
- crosscheck: try both the parsing and the executing loaders and compare the results to ensure they match. If the parsing loader encounters limitations, no further checking will be done. On success, it will print the relative times. If the results differ, it will indicate that the parsing manifest loader has a bug and print the JSON form of the manifest from each loader for future reference.
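The four modes can be sketched as a small dispatch function. This is only an illustration: the raw values match the option names above, but the type names, the closure-based loaders, and the choice to throw when only-parsed hits limitations are my assumptions, not SwiftPM's actual implementation.

```swift
// Hypothetical model of the values accepted by --experimental-manifest-processing-mode.
enum ManifestProcessingMode: String {
    case onlyParsed = "only-parsed"
    case onlyExecuted = "only-executed"
    case parsedWithFallback = "parsed-with-fallback"
    case crosscheck = "crosscheck"
}

// Hypothetical result of the parsing loader: manifest JSON or a list of limitations.
enum ParseOutcome {
    case manifest(json: String)
    case limitations([String])
}

enum LoadError: Error, Equatable {
    case limitations([String])
    case crosscheckMismatch(parsed: String, executed: String)
}

/// Returns the manifest as JSON text, choosing loaders according to `mode`.
func loadManifestJSON(
    mode: ManifestProcessingMode,
    parse: () -> ParseOutcome,
    execute: () -> String
) throws -> String {
    switch mode {
    case .onlyExecuted:
        return execute()
    case .onlyParsed:
        // Assumption: with no fallback available, limitations surface as an error.
        switch parse() {
        case .manifest(let json): return json
        case .limitations(let found): throw LoadError.limitations(found)
        }
    case .parsedWithFallback:
        if case .manifest(let json) = parse() { return json }
        return execute()
    case .crosscheck:
        let executed = execute()
        // Limitations mean there is nothing to compare against.
        guard case .manifest(let parsed) = parse() else { return executed }
        guard parsed == executed else {
            throw LoadError.crosscheckMismatch(parsed: parsed, executed: executed)
        }
        return executed
    }
}
```

Comparing JSON text directly is a simplification; a real cross-check would want a comparison that is insensitive to key ordering and formatting.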
Additional benefits
The parsing manifest loader uses swift-syntax, and therefore can benefit from other tools based on swift-syntax. For example, if a manifest declares two targets with the same name, SwiftPM will provide an error message like this:
error: 'pkg': duplicate target named 'pkg'
The parsing manifest loader has complete source location information in the syntax tree and access to the same diagnostics printing machinery that the Swift compiler uses, so it could provide a proper diagnostic pointing at the source location of both target definitions. We could also leverage tools such as Fix-Its to help make the experience of writing package manifests smoother.
As noted earlier, the parsing manifest loader is naturally sandboxed, because it isn't running code at all. This also means that its evaluation of a manifest isn't tied to the host where SwiftPM runs: SwiftPM running on Windows could process a package manifest as if it were on macOS (e.g., so #if os(macOS) code is included), or could emulate a different set of environment variables.
Risks
The primary risk of the parsing manifest loader is that, as a second implementation, it gets out-of-sync with the executing manifest loader. At worst, it would accept a manifest (without recording any limitations) but produce a different, incorrect result, causing incorrect package builds. The parsing manifest loader is architected to avoid this, always matching what syntax it understands and recording limitations along the failure paths. However, it's ~3,000 lines of code and it's possible that bugs along these lines still exist.
A softer failure condition is that the parsing manifest loader doesn't keep up with changes to the manifest format, so fewer packages can take advantage of the performance improvements from using the parsing manifest loader. This won't be seen as a hard failure, but as a slow degradation in aggregate manifest loading performance over time.
There is also the risk that package manifests only work with the parsing manifest loader, because they take advantage of the fact that it doesn't validate everything about a package manifest that the compiler would. For example, it doesn't check that the arguments to a Package initializer or the target function are provided in the right order. A package that depends on the parsing manifest loader being more lax would not work with the executing loader, for example in an older version of SwiftPM. This would also be annoying when moving from a declarative to an executable package manifest: your first if statement could cause you to need to reshuffle your package manifest a bit to meet the compiler's stricter constraints. Any single problem like this can be addressed by making the parsing manifest loader more strict, but doing so adds complexity and creeps the parsing manifest loader closer to becoming a real compiler and interpreter, possibly undermining its performance advantage.
Results
Assuming that the parsing manifest loader implementation is correct, weighing the benefits and risks comes down to two basic numbers:
- How fast is the parsing manifest loader?
- How often do package manifests fit within the limitations of the parsing manifest loader?
Across all of the packages in the Swift Package Index, the parsing manifest loader processes manifests in an average of ~1.36ms, with a median of ~1.28ms. That's about three orders of magnitude faster than the executing manifest loader, which makes this an excellent optimization when it kicks in. Moreover, it's fast enough that it is reasonable to run the parsing manifest loader for every manifest as a silent optimization: if it encounters limitations, we've only wasted a millisecond to find out and can silently fall back. This is the parsed-with-fallback strategy mentioned earlier.
In its current form, the parsing manifest loader successfully handles ~87% of the manifests in the Swift Package Index. The remaining package manifests encounter limitations, so they would still need to be processed using the executing manifest loader. These measurements were taken by performing a swift package describe on every manifest using the crosscheck mode mentioned above. I've fixed all of the cross-checking failures encountered in the SPI, which doesn't guarantee that the parsing manifest loader is correct, but is a large-enough corpus that we're probably close.
Together, these numbers mean that we can get a 1000x speedup for manifest loading for ~87% of the packages in the wild, and we can do so as an optimization---without users having to do anything. The main cost is in maintaining this second implementation of a manifest loader over time.
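As a back-of-the-envelope check on what parsed-with-fallback means in aggregate, the numbers above can be combined as follows. This assumes, purely for illustration, that the manifests needing fallback cost the same ~1.5 seconds on average as the corpus as a whole; they may well be slower, since they tend to be the more complex ones.

```swift
// Figures quoted in this post (my machine, Swift Package Index corpus).
let parseAverageMs = 1.36        // average parsing-loader time
let executeAverageMs = 1500.0    // average executing-loader time
let handledFraction = 0.87       // share of manifests with no limitations

// Every manifest pays for the parse attempt; only the unhandled share
// also pays for compiling and running the manifest.
let expectedMs = parseAverageMs + (1.0 - handledFraction) * executeAverageMs
let aggregateSpeedup = executeAverageMs / expectedMs

print("expected average load time (ms):", expectedMs)
print("aggregate speedup:", aggregateSpeedup)
```

Under that assumption the expected cost works out to roughly 196 ms per manifest, so the corpus-wide average is dominated by the ~13% that still need execution. That is why raising the handled percentage matters so much.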
Possible improvements
The 87% of packages that already work with the parsing manifest loader is very good, but improving that number makes this optimization more beneficial. There are two complementary approaches to improving on that number.
Make the parsing manifest loader smarter
The parsing manifest loader is, in essence, a greatly simplified version of the Swift compiler that cuts out most of the checking. It can be extended to process additional aspects of the Swift language. For example, it's fairly common for a package to pull out common settings into a separate global variable, like this example inspired by the swift-collections package:
let extraSettings: [SwiftSetting] = [
.enableUpcomingFeature("MemberImportVisibility"),
.strictMemorySafety(),
.enableExperimentalFeature("Lifetimes"),
]
and then use those global variables as part of the package manifest definition, e.g.
.target(
kind: .exported,
name: "_RopeModule",
settings: extraSettings + [.swiftLanguageMode(.v5)]
)
The parsing manifest loader could keep track of these global variables and perform the substitution when processing the target. It's still effectively declarative, but allows natural de-duplication.
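A rough sketch of what that substitution could look like, using a toy expression model in place of real syntax nodes. SettingsExpr and evaluate are hypothetical names, and settings are represented as plain strings to keep the example small:

```swift
// Toy model of the expressions that can appear in a settings argument.
indirect enum SettingsExpr {
    case literal([String])                 // e.g. [.swiftLanguageMode(.v5)]
    case reference(String)                 // e.g. extraSettings
    case concat(SettingsExpr, SettingsExpr) // e.g. lhs + rhs
}

/// Evaluates an expression against the recorded global `let` bindings.
/// Returns nil for an unknown variable, which a loader would presumably
/// treat as a limitation rather than guess.
func evaluate(_ expr: SettingsExpr, globals: [String: [String]]) -> [String]? {
    switch expr {
    case .literal(let values):
        return values
    case .reference(let name):
        return globals[name]
    case .concat(let lhs, let rhs):
        guard let left = evaluate(lhs, globals: globals),
              let right = evaluate(rhs, globals: globals) else { return nil }
        return left + right
    }
}

// Bindings recorded while walking the top-level declarations.
let globals = ["extraSettings": ["strictMemorySafety()", "enableExperimentalFeature(Lifetimes)"]]

// Models: extraSettings + [.swiftLanguageMode(.v5)]
let settings = evaluate(
    .concat(.reference("extraSettings"), .literal(["swiftLanguageMode(.v5)"])),
    globals: globals)
```

Returning nil for anything unresolved keeps the loader on the safe side: the manifest simply goes down the executing path.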
We could also handle simple mutation patterns that aren't really declarative, but are nonetheless easy to model. For example, swift-argument-parser adds some targets using append(contentsOf:) :
#if os(macOS)
package.targets.append(contentsOf: [
// Examples
.executableTarget(
name: "count-lines",
dependencies: ["ArgumentParser"],
path: "Examples/count-lines"),
// Tools
.executableTarget(
name: "changelog-authors",
dependencies: ["ArgumentParser"],
path: "Tools/changelog-authors"),
])
#endif
The parsing manifest loader already handles the #if using the SwiftIfConfig library. However, it could recognize the append(contentsOf:) call to add these new targets.
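Modeling this pattern might look something like the sketch below. It is a toy: ToyStatement and apply are hypothetical, targets are reduced to their names, and the #if handling here is a plain string comparison rather than the SwiftIfConfig library.

```swift
// Toy model of the statements that follow the Package initialization.
enum ToyStatement {
    case appendTargets([String])                   // package.targets.append(contentsOf: ...)
    case ifOS(String, then: [ToyStatement])        // #if os(...) ... #endif
}

/// Applies the statements to the target list, evaluating #if os(...)
/// against an emulated OS rather than the host SwiftPM is running on.
func apply(_ statements: [ToyStatement], to targets: inout [String], emulatedOS: String) {
    for statement in statements {
        switch statement {
        case .appendTargets(let newTargets):
            targets.append(contentsOf: newTargets)
        case .ifOS(let os, then: let body):
            if os == emulatedOS {
                apply(body, to: &targets, emulatedOS: emulatedOS)
            }
        }
    }
}

// Models the swift-argument-parser excerpt above.
let trailingStatements: [ToyStatement] = [
    .ifOS("macOS", then: [.appendTargets(["count-lines", "changelog-authors"])])
]

var macTargets = ["ArgumentParser"]
apply(trailingStatements, to: &macTargets, emulatedOS: "macOS")

var linuxTargets = ["ArgumentParser"]
apply(trailingStatements, to: &linuxTargets, emulatedOS: "Linux")
```

Because the OS is just an input, the same manifest can be evaluated for macOS and Linux from any host.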
These two improvements have been implemented in a separate pull request, because they've shown up fairly often in packages. However, there is a potentially infinite number of such changes we could make to improve the percentage of packages that are handled by the parsing manifest loader. The only cost is complexity in the implementation.
Make the ecosystem more declarative
The performance benefits of the parsing manifest loader as an optimization are likely to push more packages to stay within the limitations of the parsing manifest loader. We could extend the package manifest format to formalize this notion a bit. For example, we could have a syntax to request that the manifest only be used with the parsing manifest loader, e.g., by stating that the manifest is meant to be declarative:
// swift-tools-version: 6.5; (declarative)
In this case, the parsing manifest loader will always be used with this manifest. Any limitations will be treated as user errors, and SwiftPM will not implicitly fall back to the executing manifest loader. Along with this, we could also make it possible for the manifest to state that it is executable, skipping the parsing manifest loader.
// swift-tools-version: 6.5; (executable)
This would be useful for avoiding bugs in the parsing manifest loader, or simply documenting the intent that this work with the executable loader. Looking forward, a future version of SwiftPM could make declarative the default behavior. Users could opt in to executable manifests, but the defaults would strongly nudge them toward staying within the limitations of the parsing manifest loader to keep more of the ecosystem's manifests declarative and fast.
Doug
