Swift vs Rust on a single Lambda

I wrote a Swift 6 Lambda that zips 15 GB of S3 objects in one invocation, streaming end to end on 512 MB arm64. The challenge was posted by Jérémie Rodon (RustyServerless), who built the Rust reference implementation.

The Swift contender lands at 1.04× Rust's median run time (219 s vs 211 s). It is a three-stage pipeline built with TaskGroups, actors, and AsyncStreams, with a pure-Swift CRC32 (slicing-by-8), a hand-rolled ZIP64 encoder, and Soto for the S3 layer.
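
To give a rough picture of the pipeline shape, here is a minimal sketch of three stages connected by AsyncStreams inside one TaskGroup. It is not the code from the repository: `Part`, `runPipeline`, and the fake download and encode steps are hypothetical stand-ins, and there is no S3 or ZIP logic. Note that `AsyncStream` buffers without bound by default, so a real pipeline needs some form of backpressure on top of this.

```swift
// Hypothetical payload handed from stage to stage.
struct Part: Sendable {
    let index: Int
    let bytes: [UInt8]
}

/// Runs download -> encode -> upload concurrently and returns the indices
/// "uploaded", in the order the last stage saw them.
func runPipeline(keys: [String]) async -> [Int] {
    let (downloaded, downloadInput) = AsyncStream.makeStream(of: Part.self)
    let (encoded, encodeInput) = AsyncStream.makeStream(of: Part.self)

    return await withTaskGroup(of: [Int].self, returning: [Int].self) { group in
        // Stage 1: "download" each key (here: just turn the key into bytes).
        group.addTask {
            for (index, key) in keys.enumerated() {
                downloadInput.yield(Part(index: index, bytes: Array(key.utf8)))
            }
            downloadInput.finish()
            return []
        }
        // Stage 2: "encode" each part (here: pass it through unchanged).
        group.addTask {
            for await part in downloaded {
                encodeInput.yield(part)
            }
            encodeInput.finish()
            return []
        }
        // Stage 3: "upload" each part (here: record its index).
        group.addTask {
            var uploaded: [Int] = []
            for await part in encoded {
                uploaded.append(part.index)
            }
            return uploaded
        }
        // Only stage 3 returns a non-empty array.
        var result: [Int] = []
        for await partial in group {
            result += partial
        }
        return result
    }
}
```

Each stage finishes its output stream when its input ends, which is what lets the downstream `for await` loops terminate.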

A few findings relevant to this community:

  1. Soto vs aws-sdk-swift: same app code, same pipeline. Soto finishes in 250 s. The official SDK hits the 600 s timeout. The gap is uploadPart latency: 200 ms (AsyncHTTPClient) vs 680 ms (aws-crt-swift) per 10 MiB part.

  2. ByteBuffer vs Data: switching the upload path from Data to ByteBuffer eliminated ~15 GB of copies per run. Data's per-append reallocation and the SDK's internal flattening are the culprits.

  3. Pure-Swift CRC32 vs ARM __crc32d: the intrinsic has a serial dependency chain. Slicing-by-8 issues 8 parallel table lookups. On a 0.29 vCPU allocation, memory-level parallelism wins: 4 ms vs 76 ms per 5 MB file.
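
For readers who have not seen slicing-by-8, here is a minimal sketch of the technique for the standard CRC-32 used by ZIP (reflected polynomial 0xEDB88320). It is not the code from the repository; `crcTables`, `loadLE32`, and `crc32` are illustrative names, and it works on `[UInt8]` for simplicity rather than on ByteBuffer.

```swift
/// Eight 256-entry tables. crcTables[0] is the classic byte-at-a-time table;
/// crcTables[k][i] is the CRC state for byte i followed by k zero bytes.
let crcTables: [[UInt32]] = {
    var tables = [[UInt32]](repeating: [UInt32](repeating: 0, count: 256), count: 8)
    for i in 0..<256 {
        var c = UInt32(i)
        for _ in 0..<8 {
            c = (c & 1) != 0 ? (c >> 1) ^ 0xEDB8_8320 : c >> 1
        }
        tables[0][i] = c
    }
    for i in 0..<256 {
        var c = tables[0][i]
        for k in 1..<8 {
            c = tables[0][Int(c & 0xFF)] ^ (c >> 8)
            tables[k][i] = c
        }
    }
    return tables
}()

/// Reads four bytes starting at `i` as a little-endian UInt32.
func loadLE32(_ b: [UInt8], _ i: Int) -> UInt32 {
    let b0 = UInt32(b[i])
    let b1 = UInt32(b[i + 1]) << 8
    let b2 = UInt32(b[i + 2]) << 16
    let b3 = UInt32(b[i + 3]) << 24
    return b0 | b1 | b2 | b3
}

/// CRC-32 of `bytes`. Pass a previous result as `seed` to continue a running CRC.
func crc32(_ bytes: [UInt8], seed: UInt32 = 0) -> UInt32 {
    let t = crcTables
    var crc = ~seed
    var i = 0
    // Main loop: consume 8 bytes per iteration with 8 independent lookups.
    while bytes.count - i >= 8 {
        let lo = crc ^ loadLE32(bytes, i)
        let hi = loadLE32(bytes, i + 4)
        var next = t[7][Int(lo & 0xFF)]
        next ^= t[6][Int((lo >> 8) & 0xFF)]
        next ^= t[5][Int((lo >> 16) & 0xFF)]
        next ^= t[4][Int(lo >> 24)]
        next ^= t[3][Int(hi & 0xFF)]
        next ^= t[2][Int((hi >> 8) & 0xFF)]
        next ^= t[1][Int((hi >> 16) & 0xFF)]
        next ^= t[0][Int(hi >> 24)]
        crc = next
        i += 8
    }
    // Tail: fall back to one byte at a time.
    while i < bytes.count {
        crc = t[0][Int((crc ^ UInt32(bytes[i])) & 0xFF)] ^ (crc >> 8)
        i += 1
    }
    return ~crc
}
```

A quick sanity check is the standard test vector: `crc32(Array("123456789".utf8))` should equal `0xCBF43926`. The eight lookups in the main loop do not depend on each other, which is the property the finding above relies on.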

Code: demo-s3-archiving/contenders/swift at main · RustyServerless/demo-s3-archiving · GitHub

PR: Add Swift contender (sebsto-soto): pure-Swift, ~1.04× Rust by sebsto · Pull Request #1 · RustyServerless/demo-s3-archiving · GitHub

Blog post with full context: Can Swift Match Rust on a Lambda Micro-Benchmark? Almost. | Seb in the ☁️

I'm calling on the Swift community here: I'm sure I missed something. The code is about 1,200 lines of Swift 6 strict concurrency. If you spot inefficiencies in how I use TaskGroups, AsyncStreams, actor isolation, or ByteBuffer management, I'd genuinely appreciate the feedback. Same if you see a way to reduce the cold-start variance (9 s sigma vs Rust's 0.5 s at 512 MB).

A few specific questions I still have:

  • Is there a better pattern for the producer/consumer handoff than an actor with manual slot counting?

  • Could the download stage benefit from a custom AsyncSequence instead of collecting into a pre-sized ByteBuffer?

  • Any known pitfalls with AsyncHTTPClient connection pooling on very low vCPU (0.29)?
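
To make the first question concrete, this is roughly the kind of actor with manual slot counting I mean. It is a simplified sketch, not the code from the repository: `SlotGate`, `PeakTracker`, and `demoPeak` are hypothetical names, and the sketch ignores task cancellation.

```swift
/// Hypothetical counting gate that bounds how many parts are in flight.
actor SlotGate {
    private var freeSlots: Int
    private var waiters: [CheckedContinuation<Void, Never>] = []

    init(slots: Int) {
        precondition(slots > 0)
        self.freeSlots = slots
    }

    /// Suspends until a slot is free, then takes it.
    func acquire() async {
        if freeSlots > 0 {
            freeSlots -= 1
            return
        }
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
    }

    /// Hands the slot to the oldest waiter, or returns it to the pool.
    func release() {
        if waiters.isEmpty {
            freeSlots += 1
        } else {
            waiters.removeFirst().resume()
        }
    }
}

/// Demo-only helper: records the highest number of concurrent slot holders.
actor PeakTracker {
    private var current = 0
    private(set) var peak = 0
    func enter() { current += 1; peak = max(peak, current) }
    func leave() { current -= 1 }
}

/// Runs `jobs` tasks through a gate with `slots` slots and reports the peak
/// number of tasks that held a slot at the same time.
func demoPeak(slots: Int, jobs: Int) async -> Int {
    let gate = SlotGate(slots: slots)
    let tracker = PeakTracker()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<jobs {
            group.addTask {
                await gate.acquire()
                await tracker.enter()
                await Task.yield()      // stand-in for the real work
                await tracker.leave()
                await gate.release()
            }
        }
    }
    return await tracker.peak
}
```

The producer calls `acquire()` before buffering a part and the consumer calls `release()` once the part is uploaded, so memory stays bounded by the number of slots times the part size.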

The STATS=1 instrumentation is left in the code so anyone investigating can get per-stage timings from a single run. PRs and issues welcome.


Blog post with full context: https://stormacq.com/2026/06/02/swift-vs-rust-s3-challenge/

I get an error loading the blog post.


Fixed: Can Swift Match Rust on a Lambda Micro-Benchmark? Almost. | Seb in the ☁️


I love this Sébastien!
