SwiftPM: 2x faster resolves, 3x smaller disk footprint

At Ordo One, we have a server-side Swift project with 48 dependencies (soto, swift-protobuf, swift-nio, gRPC, etc.), and as the dependency graph grew we noticed dependency resolution and download times becoming a significant part of our development and CI cycle.

SPM currently fetches the full git history for every dependency. For our project, resolution takes 60+ seconds and .build/ reaches 1.8 GB. There have been previous discussions on improving this - shallow cloning, depth-1 clones, reduced download sizes - each with its own challenges. We'd like to suggest a different approach that sidesteps git cloning entirely for the common case.

For context on the scale: soto is 381 MB of git history, while its source archive is 18 MB. swift-protobuf transfers ~210 MB because of C++ submodules that are not needed to build the Swift library; source archives reduce this to ~20 MB.

We spent some time investigating approaches to improve this and have put up a PR with an implementation.
The public API is identical - no changes to Package.swift or any user-facing interfaces.

The improvement

GitHub (and GitLab, Bitbucket) already serve source archives for any tagged release. swift-nio 2.97.1 is a single 2 MB HTTP GET vs a 70 MB git clone --mirror.

The implementation downloads ZIP archives directly from GitHub, re-using SPM's existing registry download architecture:

  1. git ls-remote --tags - discover available versions (same as today)
  2. GET Package.swift from CDN - check tools-version compatibility
  3. Download ZIP archive from GitHub CDN

Packages with submodules fall back to shallow clones. Any other failure falls back to git clone --mirror, on a per-dependency basis.
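
For illustration, steps 2 and 3 boil down to two URL templates. The sketch below builds them for a GitHub HTTPS repository, using the same URL shapes that appear elsewhere in this thread. The function names (repo_slug, manifest_url, archive_url) are made up for this example and are not taken from the PR.

```shell
# Hypothetical helpers for illustration only; not the PR's code.

# Reduce "https://github.com/apple/swift-nio.git" to "apple/swift-nio".
repo_slug() {
  slug=${1#https://github.com/}
  slug=${slug%/}
  printf '%s\n' "${slug%.git}"
}

# Step 2: manifest at a given commit SHA (or tag).
manifest_url() {
  printf 'https://raw.githubusercontent.com/%s/%s/Package.swift\n' "$(repo_slug "$1")" "$2"
}

# Step 3: ZIP archive for a tagged release.
archive_url() {
  printf 'https://github.com/%s/archive/refs/tags/%s.zip\n' "$(repo_slug "$1")" "$2"
}

archive_url https://github.com/apple/swift-nio.git 2.97.1
# prints https://github.com/apple/swift-nio/archive/refs/tags/2.97.1.zip
```

Other hosts would need their own templates; only the GitHub shape is shown here.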

Zero GitHub REST API calls. Private repos work with the same auth that git clone already uses. SSH-only repos gracefully fall back to git. Package.resolved format is unchanged - existing lockfiles work without modification.

Benchmarks

Benchmarked against 6 real-world projects across two machines (Mac and Linux datacenter server with 1 Gbps internet), 5 runs per machine (10 total).
Times shown as p50 (median), p75, and p99 percentiles.

Cold resolve (shared SPM cache + .build/ + Package.resolved wiped - the CI scenario)

Project                        Deps  zip p50  zip p75  zip p99  git p50  git p75  git p99  Faster (zip vs git)
spi-server                     67    68s      76s      92s      100s     104s     111s     1.2-1.5x
swiftpm-large-project          48    46s      46s      48s      93s      95s      101s     2.0-2.1x
penny-bot                      47    44s      46s      49s      78s      80s      81s      1.7-1.8x
container                      29    26s      27s      34s      42s      44s      47s      1.4-1.6x
swift-composable-architecture  17    12s      13s      17s      16s      17s      18s      1.1-1.3x
SwiftLint                      9     11s      12s      13s      12s      13s      18s      1.1-1.4x
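
The "Faster" column appears to be the git time divided by the zip time at the same percentile, reported as a range across p50, p75 and p99. That is an inference from the numbers rather than something stated in the post. A small sketch (speedup is a hypothetical helper name) that reproduces two of the swiftpm-large-project figures:

```shell
# speedup GIT_SECONDS ZIP_SECONDS -> ratio rounded to one decimal place
speedup() {
  awk -v git="$1" -v zip="$2" 'BEGIN { printf "%.1f\n", git / zip }'
}

# swiftpm-large-project, cold resolve (values from the table above)
speedup 93 46    # p50: prints 2.0
speedup 101 48   # p99: prints 2.1
```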

Warm resolve (.build/ wiped, shared caches retained)

Project                        Deps  zip p50  zip p75  zip p99  git p50  git p75  git p99  Faster (zip vs git)
container                      29    5s       6s       11s      19s      20s      24s      2.2-3.8x
swiftpm-large-project          48    11s      20s      28s      32s      32s      38s      1.4-2.9x
swift-composable-architecture  17    2s       3s       3s       4s       4s       5s       1.7-2.0x
penny-bot                      47    11s      13s      29s      14s      16s      18s      0.6-1.3x
spi-server                     67    20s      21s      28s      22s      25s      26s      0.9-1.1x
SwiftLint                      9     4s       5s       6s       4s       6s       7s       1.0-1.2x

swift package update (on warm .build/)

Update times are network-dependent and show high variance between runs. Neither approach consistently wins - both perform git ls-remote for version discovery, and the resolution/download phase depends on network conditions at that moment. For smaller projects (< 20 deps) both complete in 1-3 seconds. For larger projects (40+ deps) both are in the 10-25 second range.

.build/ disk usage

Project                        Deps  Source archives  Git       Reduction
spi-server                     67    514 MB           1,546 MB  3.0x
swiftpm-large-project          48    609 MB           1,871 MB  3.1x
penny-bot                      47    459 MB           1,484 MB  3.2x
container                      29    352 MB           889 MB    2.5x
swift-composable-architecture  17    102 MB           255 MB    2.5x
SwiftLint                      9     240 MB           342 MB    1.4x

Source Archives vs Package Registry

Another option for reducing download sizes is hosting a package registry (SE-0292). A registry serves pre-built ZIP archives via a standardized HTTP API, but requires deploying and maintaining a server, populating it with packages, and configuring each client to use it (including URL-to-identity mapping for every dependency). For comparison, we benchmarked source archives against a stateless registry proxy (redirecting ZIP downloads to GitHub, using swift package resolve --replace-scm-with-registry).

For swiftpm-large-project (48 deps): cold resolve takes 93–101s with git, 46–48s with source archives, and 47–66s with a registry. Since both source archives and the registry download the same ZIP files, disk usage is the same. swift package update is where the registry pulls ahead: ~20s for both git and source archives vs 2–3s for the registry, thanks to more efficient version listing. Source archives capture most of the improvement over git for initial resolves without requiring a hosted registry, URL-to-identity mappings, or client configuration.

Cold Resolve

Project                        Deps  zip p50  zip p75  zip p99  reg p50  reg p75  reg p99  Faster (registry vs zip)
swiftpm-large-project          48    47s      52s      58s      47s      57s      66s      ~same
container                      29    29s      30s      35s      24s      28s      35s      1.0-1.2x

Warm Resolve (.build/ wiped, shared caches retained)

Project                        Deps  zip p50  zip p75  zip p99  reg p50  reg p75  reg p99  Faster (registry vs zip)
swiftpm-large-project          48    20s      21s      21s      3s       5s       6s       3.5-7x
container                      29    5s       6s       12s      3s       4s       4s       1.5-3x

swift package update (on warm .build/)

Project                        Deps  zip p50  zip p75  zip p99  reg p50  reg p75  reg p99  Faster (registry vs zip)
swiftpm-large-project          48    20s      20s      20s      2s       2s       3s       7-10x
container                      29    8s       9s       12s      1s       2s       2s       6-8x

Manifest reading: local git vs CDN

On macOS, concurrent HTTP fetches from CDN are actually faster than local git for reading manifests - 120ms vs 600ms for 20 swift-nio tags. On Linux, local git is faster (83ms) because process forking is much cheaper than on macOS. In both cases, results are cached permanently by commit SHA, so subsequent resolves pay zero network cost.

Why is HTTP faster than local git?

During resolution, SPM checks Package.swift for every candidate version to determine tools-version compatibility. With git mirrors, this is a local git ls-tree + git cat-file against the bare repo on disk. With source archives, it's an HTTP GET to raw.githubusercontent.com.

Method                             macOS    Linux
git ls-tree + cat-file (local)     ~600ms   ~83ms
HTTP GET (8 concurrent, warm CDN)  ~120ms   ~134ms

  • Git path: SPM spawns two separate git processes per manifest - git ls-tree to find the blob hash, then git cat-file -p to read the content. Each process fork + exec costs ~5-10ms on macOS, and they run sequentially because each process holds a lock on the repo. For 20 manifests that's 40 process spawns.
  • HTTP path: In-process network calls over a single HTTP/2 connection with connection reuse inside HTTPClient. First request pays TLS handshake (~150ms), but subsequent requests reuse the connection (~30-40ms round trip to CDN edge). With 8 concurrent requests, the latency is amortized across the batch.

# Try it yourself - setup (run in bash 4.3 or newer; `wait -n` below needs it)
git clone --mirror https://github.com/apple/swift-nio.git /tmp/nio
cd /tmp/nio
# Take the 20 most recent 2.x tags and resolve each one to its commit SHA
git tag -l '2.*' | sort -V | tail -20 > /tmp/nio-tags.txt
while read -r t; do git rev-parse "$t^{commit}"; done < /tmp/nio-tags.txt > /tmp/nio-shas.txt

# Git path (~600ms): two git processes per manifest, run sequentially
time (while read -r t; do
  git ls-tree "$t" -- Package.swift > /dev/null
  git cat-file -p "$t":Package.swift > /dev/null
done < /tmp/nio-tags.txt)

# HTTP concurrent (~120ms): at most 8 requests in flight at a time
time (while read -r sha; do
  curl -sL -o /dev/null "https://raw.githubusercontent.com/apple/swift-nio/$sha/Package.swift" &
  [ $(jobs -r | wc -l) -ge 8 ] && wait -n
done < /tmp/nio-shas.txt; wait)

What about the edge cases?

  • Submodules: Detected early via a CDN check for .gitmodules. Packages with submodules get a shallow clone (--depth 1 --recurse-submodules --shallow-submodules) instead of a ZIP. Still much smaller than a full mirror.
  • Branch/revision pins: Fall back to standard git. Source archives only work with version-pinned dependencies.
  • SSH URLs: Fall back to standard git. Archive downloads require HTTPS.
  • Private repos: Work if HTTPS auth is configured (netrc, GITHUB_TOKEN, keychain). SSH-only auth falls back to git.
  • Git LFS: ZIP archives contain pointer files, not actual LFS content. Rare in Swift packages.
  • Any failure: Falls back to git clone --mirror per-dependency. One failing package doesn't affect others.
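
The rules above can be read as a small decision function. This is a minimal sketch, not the PR's code: choose_strategy and its three string arguments are hypothetical, and the order of the checks is an assumption.

```shell
# choose_strategy URL REQUIREMENT HAS_SUBMODULES
#   REQUIREMENT:    version | branch | revision
#   HAS_SUBMODULES: yes | no   (outcome of the .gitmodules check)
choose_strategy() {
  case "$1" in
    https://*) ;;                        # archive downloads require HTTPS
    *) echo git-mirror; return ;;        # SSH and any other scheme
  esac
  # Branch and revision pins stay on the standard git path.
  if [ "$2" != version ]; then echo git-mirror; return; fi
  # Submodules get a shallow clone instead of a ZIP.
  if [ "$3" = yes ]; then echo shallow-clone; return; fi
  echo archive
}

choose_strategy https://github.com/apple/swift-nio.git version no
# prints archive
```

Whatever the function picks, a failure at download time would still fall back to git clone --mirror for that one dependency.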

Manifest variants

Version-specific manifests (Package@swift-6.0.swift) are a rarely used feature - scanning all 9,873 packages from the Swift Package Index, only 81 (0.8%) use them, mostly for the Swift 5/6 transition. Source archives handle them via lightweight HEAD requests to probe for variants, with negligible overhead for the 99.2% that use only the base Package.swift.
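
To make the probing concrete: for a given tools version there are only a handful of file names to check. A minimal sketch, assuming SwiftPM's documented most-specific-first lookup order (major.minor.patch, then major.minor, then major, then the base manifest). manifest_candidates is a hypothetical name, and the actual selection logic in SwiftPM and in the PR may have additional rules.

```shell
# Print candidate manifest file names for a tools version, most specific first.
manifest_candidates() {
  version=$1
  while :; do
    printf 'Package@swift-%s.swift\n' "$version"
    case "$version" in
      *.*) version=${version%.*} ;;   # drop the last component: 6.0.1 -> 6.0 -> 6
      *) break ;;
    esac
  done
  printf 'Package.swift\n'            # base manifest, always the last resort
}

manifest_candidates 6.0.1
# prints:
#   Package@swift-6.0.1.swift
#   Package@swift-6.0.swift
#   Package@swift-6.swift
#   Package.swift
```

Each name could then be probed with an HTTP HEAD request (for example curl -fsI against the raw file URL), stopping at the first hit.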

All 81 Swift Package Index packages using manifest variants, by Swift version
Variant Count Packages
@swift-4 2 SwiftyJSON, swift-atomics
@swift-4.2 1 SwiftyJSON
@swift-5 3 MetaCodable, SwiftStella, SwiftyJSON
@swift-5.0-5.4 7 OpenCombine (5.0-5.4), swift-nio-ssh (5.4), swift-win32 (5.4)
@swift-5.5 4 GoogleSignIn-iOS, OpenAIKit, swift-argument-parser, swift-nio-ssh
@swift-5.6-5.8 7 OpenAIKit (5.6-5.8), swift-async-algorithms (5.7, 5.8), swift-async-algorithms-fork (5.7, 5.8), swift-win32 (5.7)
@swift-5.9 37 DIKit, DateKit, DefferedTaskKit, Engine, EventDispatcherKit, FetchRequests, FoundationKit, IDKit, Ignition, Lockman, SmartText, Threading, Transmission, Turbocharger, WindowSceneReader, XibKit, async-collections, cocoa-aliases, combine-cocoa, combine-interception, composable-architecture-extensions, console-kit, package-resources-cli, sqlite-kit, swift-associated-objects, swift-casification, swift-cocoa-extensions, swift-declarative-configuration, swift-foundation-extensions, swift-hashed, swift-interception, swift-keypath-mapping, swift-keypaths-extensions, swift-mockable, swift-resettable, typed-notifications, userinfo-representable
@swift-5.10 10 APNSwift, IndexStore, SmartImages, SmartNetwork, SpryKit, WindowSceneReader, flare, network-layer, swift-aws-lambda-events, swift-mockable
@swift-6.0 28 Aria2Kit, CodableDatastore, EmojiText, JapaneseHoliday, MacroTemplateKit, composable-architecture-extensions, flare, hummingbird, network-layer, soto-core, swift-aws-lambda-runtime, swift-builders, swift-cocoa-extensions, swift-composable-loadable, swift-declarative-configuration, swift-foundation-extensions, swift-keypath-mapping, swift-keypaths-extensions, swift-log, swift-openapi-async-http-client, swift-openapi-generator, swift-openapi-runtime, swift-password-validation, swift-resettable, swift-spyable, swift-standard-webhooks, swift-structured-queries, swift-url-routing-translating
@swift-6.1 5 Aria2Kit, JWSETKit, flare, hummingbird, swift-log
@swift-6.2 2 FaviconFinder, JavaScriptKit
@swift-6.2.3 1 IkigaJSON
@swift-6 1 swift-currency

Background: how we got here

As our project grew we explored several approaches to reduce dependency resolution time and disk usage. We have 50+ repositories on a development machine, and SPM clones the full git mirror per dependency for each project - swift-nio alone is 70 MB, duplicated across every project that uses it. With that many repositories, disk usage compounds quickly and swift package update becomes a significant part of the development cycle.

The journey from CI caches to package registries to source archives

We first tried caching .build/ between CI runs - save it to S3 and restore on the next build to get incremental builds. GitHub's cache action wasn't an option because our self-hosted runners are geographically far from GitHub infrastructure, so we had to run our own S3 cluster locally. Even with local S3, compressing and shipping the bytes took longer in most cases than just re-resolving from scratch. We also have a SwiftUI Xcode project that pulls in all our server code, where the build state is even larger (~4 GB). We tried various timestamp restoration tricks to preserve Xcode's incremental build state, but the time spent on zstd compression, S3 transfer, and restoration was often longer than the actual build.

We investigated the package registry next - AWS CodeArtifact, Tuist, and Artifactory - each viable, but with setup and operational overhead that didn't fit our use case. So we built our own registry. It went through several iterations:

  1. Full archive mirroring - download ZIPs from GitHub, store in S3, serve from the registry. Works, but need to populate it with external tooling.
  2. Signed S3 URLs - pre-signed URLs so clients download directly from S3 (Cloudflare R2 for free egress). Still need to populate the registry.
  3. Stateless on-demand proxy - redirects ZIP downloads straight to GitHub, caches only metadata. No S3 storage, no population step, can run multiple instances.

Each iteration simplified the infrastructure, but even with a working registry there were usability gaps when mixing registry and git sources:

  • The mirror-based URL mapping requires exact string matching (.git suffix, case sensitivity) between GitHub URLs and registry package IDs. For 100 dependencies, that means building and distributing a mapping file with every exact URL-to-ID pair.
  • Even after converting all top-level dependencies to registry, any transitive dependencies still went via the git path. Another flag (--replace-scm-with-registry) is needed on every command.
  • The Swift project identified a public registry as a 2023 focus area, and work is ongoing, but in the meantime there's an opportunity to improve things without registry infrastructure.
  • Xcode shows duplicate packages when mixing registry and git sources - one from git, one from registry, with different icons.

Looking at how other ecosystems handle this was instructive. Go uses ZIP archives via an HTTP proxy - downloading the CockroachDB source tree as a ZIP took 10 seconds vs nearly 4 minutes for git clone. Rust has served tarballs from crates.io since 2014, never using git-based source distribution. This suggested a similar approach could work for Swift, using existing hosting infrastructure directly.

And no - in case someone is dubious of the date - it's not an April Fools' joke!

Hoping this approach can be acceptable - it would save many engineers a lot of time.

Thanks to @Sarunas for the relentless push to try to improve this in various ways!

I don't have any experience in SwiftPM's codebase. I just wanted to post a message of support to emphasize how much of a pain point resolve/fetch time is:

I'm extremely excited about this, having recently spent a fair bit of time trying to improve our Swift server project's build time. In a project depending on just soto, a tremendous amount of time is spent resolving and fetching nio, crypto, and other such big packages. One of our server projects takes 7 minutes to resolve on a fresh VM with no caching. A much more complex Python project with many more dependencies finishes its entire CI run in that time.

So even just supporting public, https GitHub packages would be a tremendous improvement. If this lands, I'll definitely be investigating swapping our internal/private dependencies from SSH to HTTPS.

And I'm glad to see an approach which doesn't depend on getting someone to run a package registry, since that's clearly proven to be a big lift.

Yes please, I'll take anything. Xcode resolves packages so often any improvement would be worthwhile.

One thing this loses is the integrity check inherent in a git hash: Package.resolved includes revisions (even for packages referenced by tag), and those revisions must (a) exist in the repo, and (b) contain what they say they contain. How does that get checked here? (I'm sure it's in your PR, so I'm being a bit lazy about asking here!)

Thank you for addressing this huge problem!

Yours is a great improvement, well-founded.

(As a separate proposal for the disk footprint, should SPM be using a common directory for local repos, and each project .build directory would only have worktrees? I have hundreds of projects, many with the same dependencies. (Has that not been proposed?))

I'm sorry to lose the revision pins; tags being mutable (cf Trivy[1]), I've been migrating to hashes as better hygiene. But I'm sure I'm in a tiny minority.

For that and to avoid many network calls, I wonder if @taylorswift would be interested in making a metadata database available from https://swiftpackageindex.com. It would have content from Package.swift files per SHA==tag (in some SPM-optimized form e.g., a sqlite database) for the indexed packages (updated weekly) (perhaps separated into top-1000 and all). Then SPM version+tools-solving for dependency analysis has a fast path, and can map tags to revision (if not a link to the validated tagged-source binary?).

As for that .zip file: since source compresses well, I imagine swiftpackageindex publishing a signed, compressed, indexed archive of the tagged source from the top 100 projects, or all the swiftlang/apple projects. I'm reluctant to rely on GitHub or GitLab given their outage history and lack of concern for issues we might be having.

I like the swiftpackageindex solution because it's one the community can manage, it incentivizes publishing there, it can focus on the high-traffic cases, and doing this might send corresponding resources in that direction. When people have these archives locally, it would avoid most of the lookup cost.

[1] Trivy compromise via tag re-writing: "Trivy ecosystem supply chain temporarily compromised", security advisory in aquasecurity/trivy on GitHub

How does that get checked here?

Package.resolved revision remains the same.

The tag discovery is the same as the current path:

git ls-remote --tags https://github.com/apple/swift-nio | grep 2.97.1
558f24a4647193b5a0e2104031b71c55d31ff83a	refs/tags/2.97.1

Downloading https://github.com/apple/swift-nio/archive/refs/tags/2.97.1.zip gives you the tree at 558f24a4647193b5a0e2104031b71c55d31ff83a, the same code that you would fetch from git. GitHub creates and caches the source archives on request.

Package.resolved with this change still matches:

      {
        "identity" : "swift-nio",
        "kind" : "remoteSourceControl",
        "location" : "https://github.com/apple/swift-nio.git",
        "state" : {
          "revision" : "558f24a4647193b5a0e2104031b71c55d31ff83a",
          "version" : "2.97.1"
        }
      }

There is a very small window, in the millisecond range, between listing the tags and starting the download for a tag, during which the repo owner could remove the tag and re-create it. It is possible to switch back to downloading by SHA, though. The git checkout is probably affected by the same problem, since it is also two non-atomic operations.

When Package.resolved already has a revision, there is a check that compares it to what ls-remote currently reports and aborts if they do not match.
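
As a rough illustration of that comparison (not the PR's code; tag_commit and check_pin are hypothetical names), the check can be expressed over git ls-remote output. One assumption made here: annotated tags are listed twice, and the peeled "^{}" entry is the one that carries the commit.

```shell
# Print the commit a tag points to, reading `git ls-remote --tags` output on stdin.
tag_commit() {
  awk -v tag="$1" '
    $2 == ("refs/tags/" tag "^{}") { peeled = $1 }
    $2 == ("refs/tags/" tag)       { direct = $1 }
    END { if (peeled != "") { print peeled } else if (direct != "") { print direct } }
  '
}

# check_pin PINNED_REVISION TAG
# Succeeds only if the tag exists and still points at the pinned revision.
check_pin() {
  current=$(tag_commit "$2")
  [ -n "$current" ] && [ "$current" = "$1" ]
}
```

Usage would look like: git ls-remote --tags https://github.com/apple/swift-nio | check_pin 558f24a4647193b5a0e2104031b71c55d31ff83a 2.97.1 || echo "tag moved, aborting"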

Implicitly, yes, that's what would happen, but how would the client verify the zip they received actually contained the proper revision? When using git directly, even a shallow clone guarantees the hash, where a simple source download has no way of doing so. This seems like a major flaw, no?

Yes, with this approach it is possible to de-duplicate via symlinks instead of the copy that is made now:

$ du -sh .build/source-archives/swift-nio/2.97.1
 10M	.build/source-archives/swift-nio/2.97.1

The original zip is downloaded and extracted to ~/Library/Caches/org.swift.swiftpm/source-archives/downloads/swift-nio/2.97.1, so creating a symlink would allow deduplication across all projects and save even more space.

Tag vs SHA is a good discussion to have; seeing the SHA in the download path was a bit hard to read. It is not possible to switch between the two on the fly, because the zip files have different folder paths inside and the checksum is then different.

I did end up with a stateless, self-populating Package Registry that internally redirects to GitHub downloads; metadata and version info is tiny for 10k projects. IIRC I loaded ~1.5k SPI projects that actually had dependencies on other projects, and for the latest versions this was < 8 GB.

Hosting this on several GeoDNS-backed VPS servers wouldn't be a problem. What is blocking is the usability of Package Registry and Xcode (see the Background section). This would save 10-15 seconds per update, while the generic proposal gives more benefit, and for everyone.

Go's GOPROXY environment variable defaults to proxy.golang.org, and one can change that to a self-hosted instance or mirror such as gomods/athens (a Go module datastore and proxy) on GitHub.

i’m not sure if you’ve got the wrong user, but i am not associated with the swiftpackageindex.com project, you probably want to talk to @daveverwer for that :)

(i do maintain an index-like website called swiftinit.org, perhaps you got them mixed up?)

Or @finestructure :)

Is this package registry implementation published anywhere? It would be very nice to have a default registry for the community.

Not yet, but I can look into publishing it in a week or two. Currently it ignores signing of packages.

There are also adam-fowler/swift-package-registry and ehyche/swift-package-registry-service (a Swift Package Registry service which proxies the GitHub API), both on GitHub.

GitHub allows you to download the source code of a specific commit as a ZIP or TAR file. If a branch or tag is first resolved to a commit hash through the REST API, you have a stable reference. SwiftPM could then keep the archive file along with its integrity hash (different from the commit hash), to ensure files aren't changed locally.

All of this does rely on GitHub's archiving API not lying to you, and giving you the exact commit you asked for. But if a bad actor has control of a GitHub project, it's still much more difficult to swap a version tag without SwiftPM noticing the commit sha is changing (because GitHub will say something different).

Looking at the PR, it doesn't seem to make use of this approach, and downloads archives for tags directly. So you're right, this is definitely unsafe in the current state.
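
A sketch of what that could look like, with made-up helper names (commit_archive_url, sha256_of, verify_archive). The digest here is a SHA-256 of the downloaded file, recorded at download time. It is unrelated to the git commit hash, so it can only detect later changes to the archive, not prove that the archive matches the commit.

```shell
# Hypothetical helpers; the PR as discussed downloads by tag instead.

# Archive URL pinned to a commit rather than a (mutable) tag.
commit_archive_url() {
  printf 'https://github.com/%s/archive/%s.zip\n' "$1" "$2"
}

# SHA-256 of a file, using whichever common tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{ print $1 }'
  else
    shasum -a 256 "$1" | awk '{ print $1 }'
  fi
}

# Succeeds only if the file still matches the digest recorded at download time.
verify_archive() {
  [ "$(sha256_of "$1")" = "$2" ]
}
```

For example, after fetching "$(commit_archive_url apple/swift-nio 558f24a4647193b5a0e2104031b71c55d31ff83a)" to a file, the output of sha256_of would be stored next to the pin and re-checked with verify_archive before every use.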

I'll redo this to use the SHA instead of the tag, but I do not believe this is a huge issue: it only affects an interval of < 1 second. The next update will report that the tag has changed.

The main issue is not tag/sha desync during the operation, it's "I pin on my machine and push, the upstream tag changes, my colleague pulls, and now they are running different code from me". If you're checking that already then good.

The secondary issue, which I do still think is an issue, is "what if GitHub's lying to you about the zip file corresponding to the commit". You can say "oh, if GitHub was going to lie about this, surely they can lie earlier in the process too", but, well, that's how malware works; if someone sneaks extra code into GitHub archives specifically, the current way would catch it and the new way won't.

it’s sort of a tangential issue, but i do wish SwiftPM became more tolerant of floating tags, they are quite a common feature in other language ecosystems (like reusable GitHub Actions), and force-rewrites are kind of just a fact of life if you care about security and privacy of contributors. SwiftPM uses a lot of caches scattered across a lot of file system locations, and it is very painful right now to get it to recover from a SHA desync.

As the maintainer of penny-bot and author of the mentioned "caching .build/ between CI runs" article who has had to deal with a lot of build performance issues and tried to work around them, it's a strong +1 from me.

I could try to get deep into the details and perhaps find some shortcomings of this approach, but even assuming this solution is not an "excellent" solution and is only "good"/"fine", I'd still rather have it in SwiftPM than not, assuming no critical/security issues.
Better solutions can always be added later. We've had this issue for a long time and no such solutions were adequately presented/implemented, so it'd be great to finally have a noticeable improvement.

Both ways catch it: on update and on recheck.

Update: nearly 7x less disk space, 3x faster package update

Softlinks to the shared cached packages work, which means the disk space savings are huge, nearly 7x. A check on Linux for swiftpm-large-project now shows 1,916 MB going down to 278 MB (the remainder is shallow clones, which are not shared because submodules might not be pinned; how to deal with this is TBD).

Structure is the following:

.build/source-archives/github.com/apple/swift-metrics
.build/source-archives/github.com/apple/swift-metrics/2.8.0-f17c111c
.build/source-archives/github.com/apple/swift-http-types
.build/source-archives/github.com/apple/swift-http-types/1.5.1-45eb0224
.build/source-archives/github.com/apple/swift-argument-parser
.build/source-archives/github.com/apple/swift-argument-parser/1.7.1-626b5b7b
.build/source-archives/github.com/apple/swift-nio
.build/source-archives/github.com/apple/swift-nio/2.97.1-558f24a4
.build/source-archives/github.com/apple/swift-distributed-tracing
.build/source-archives/github.com/apple/swift-distributed-tracing/1.4.1-dc403018

Softlink to a specific version:

ls -l .build/source-archives/github.com/apple/swift-nio/2.97.1-558f24a4
lrwxrwxrwx 1 sarunas sarunas 107 Apr  2 19:24 .build/source-archives/github.com/apple/swift-nio/2.97.1-558f24a4 -> /home/sarunas/.cache/org.swift.swiftpm/source-archives/downloads/github.com/apple/swift-nio/2.97.1-558f24a4
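
A minimal sketch of the linking step, assuming the layout shown above (directory name = version plus the first 8 characters of the revision). link_from_cache and its argument order are hypothetical, and the real implementation lives in SwiftPM itself rather than in a script.

```shell
# link_from_cache CACHE_ROOT BUILD_ROOT IDENTITY VERSION REVISION
# Points .build at the shared, already-extracted copy instead of duplicating it.
link_from_cache() {
  name="$4-$(printf '%.8s' "$5")"                      # e.g. 2.97.1-558f24a4
  target="$1/source-archives/downloads/$3/$name"
  link="$2/source-archives/$3/$name"
  [ -d "$target" ] || return 1                         # nothing cached yet
  mkdir -p "$(dirname "$link")"
  ln -sfn "$target" "$link"
}
```

Called as link_from_cache ~/.cache/org.swift.swiftpm .build github.com/apple/swift-nio 2.97.1 558f24a4647193b5a0e2104031b71c55d31ff83a, it would produce the link shown above once the package has been downloaded.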

I also managed to make swift package update 3x faster. Our project update goes from ~22 seconds to ~7 seconds.

This is a massive win! I really hope this lands
