AWS Lambda functions and the Linux Static SDK - works!

sebsto · June 19, 2024, 5:48pm

Good news, AWS Lambda functions can be compiled with the Static Linux SDK (aka musl) just released.

The advantage of using the Static Linux SDK when building AWS Lambda functions is twofold: 1/ we no longer need to build inside Docker, and 2/ the binary works on any version of Linux. We don't depend on Amazon Linux 2, 2023, or 202x (whatever comes next).
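
For readers who want to try this, here is a minimal sketch of the build step only. It assumes a Swift 6 toolchain with the Static Linux SDK already installed through `swift sdk install` (using the URL and checksum published on swift.org, not reproduced here). The `run` and `build_static` helpers are hypothetical names chosen for illustration, and the sketch does not cover any Lambda-specific packaging steps.

```shell
#!/bin/sh
# Sketch: cross-compile with the Static Linux SDK (musl).
# Assumption: the SDK is already installed and visible in `swift sdk list`.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Build a release binary for the given architecture (aarch64 or x86_64).
build_static() {
    arch="${1:-aarch64}"
    run swift build -c release --swift-sdk "${arch}-swift-linux-musl"
}
```

Calling `build_static aarch64` should leave the executable under `.build/aarch64-swift-linux-musl/release/`. Setting `DRY_RUN=1` prints the command instead of running it.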

There is a bit of ceremony to make this work at the moment and this is not ready for prime time yet IMHO, but we'll get there when Swift 6 lands.

I created an issue to open the discussion and keep track of the specific steps to take care of when building manually for now.

I'm interested to hear about community feedback.

Thanks

t089 · June 19, 2024, 8:07pm

Have you checked the size of the final „static musl“ binary? I think there might still be a good reason to get platform support for AL2023 in Swift 6 and then prepare a „Swift Lambda SDK“. Ideally, we would link against as many system libraries as possible to reduce the binary size and only statically link what is missing (eg the swift runtime). Using a special „AL2023 lambda SDK“ we could still do cross-compilation from macOS and then deploy to AWS without docker.

This article compares binary sizes for different compilation modes for Alpine Linux. But I guess something similar applies to AL2023? Hopefully the binary size for static binaries can still be improved…

sebsto · June 20, 2024, 7:23am

Hello, thank you @t089 for the feedback and links. This series of posts is really interesting.

The musl-linked executable is 115 MB vs. 66 MB for the statically linked + dynamic glibc (and others) executable created with Docker on Amazon Linux 2.

.build/plugins/AWSLambdaPackager/outputs/AWSLambdaPackager//UrlLambda:
total 181280
-rwxr-xr-x  1 stormacq  staff    66M Jun 19 23:09 UrlLambda
-rw-r--r--  1 stormacq  staff    22M Jun 19 23:09 UrlLambda.zip
lrwxr-xr-x  1 stormacq  staff     9B Jun 19 23:09 bootstrap -> UrlLambda

.build/aarch64-swift-linux-musl/release/UrlLambda*         
-rwxr-xr-x  1 stormacq  staff   115M Jun 19 18:21 .build/aarch64-swift-linux-musl/release/UrlLambda
-rw-r--r--  1 stormacq  staff    75M Jun 19 19:13 .build/aarch64-swift-linux-musl/release/UrlLambda.zip

While I understand the importance of producing small executables for mobile and desktop applications, I wonder about its importance in the context of Lambda.

Each Lambda executable is deployed in one microVM. There is a one-to-one relationship between the executable and its execution environment. Using shared libraries will not have a significant disk space benefit.

However, I see two areas where the executable size is important:

1. The Lambda quotas impose a maximum size of 250 MB for a function uploaded through a ZIP file (we can upload up to 10 GB when using container images instead of a ZIP).

   A simple example of "Hello World" complexity is already at 46% of that size.

   I can imagine adding an alternate packaging option with Docker images. But that method has additional dependencies on Amazon ECR that would increase the list of prerequisites.

2. The executable size impacts the cold start time. A larger executable means more code to download to the microVM, more disk reads, etc.

   Larger executables clearly impact the download time.

   However, I don't think loading a large executable is slower than loading a small one + all its shared libraries. I don't know how Linux loads its binaries and performs the dynamic name resolution, but, intuitively, I think loading a statically linked binary is faster than loading a smaller executable + all its libraries and resolving all symbols.
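
The 46% figure in the first point follows directly from the sizes in the listing above. As a minimal sketch (the `quota_percent` helper is a hypothetical name, and the 250 MB default is simply the ZIP quota quoted in the post), the arithmetic is:

```shell
#!/bin/sh
# Percentage of a size quota used by a binary, with integer arithmetic.
# Sizes are in MB; the quota defaults to the 250 MB figure quoted above.
quota_percent() {
    size_mb="$1"
    quota_mb="${2:-250}"
    echo $(( size_mb * 100 / quota_mb ))
}

quota_percent 115   # musl static binary: prints 46
quota_percent 66    # binary built in Docker on Amazon Linux 2: prints 26
```

The result is rounded down, so the 66 MB binary (26.4%) shows as 26.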

t089 · June 20, 2024, 7:51am

I was coming from the POV that reusing what is already included in the microVM saves on "provisioning time", ie cold-start time. But maybe this is not so significant in the end... Still, I wonder, if a simple Hello World really needs to weigh in at ~100 MB. And why is the static binary larger than the dynamic binary + its shared libraries?

sebsto · June 20, 2024, 8:54am

The Static Linux SDK comes with libxml2, ICU, BoringSSL, curl, zlib ... in addition to the glibc replacement.

The list is here:

https://github.com/swiftlang/swift-docker/blob/main/swift-ci/sdks/static-linux/Dockerfile#L19

          #
          #  See https://swift.org/LICENSE.txt for license information
          #  See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
          #
          # ===----------------------------------------------------------------------===
          
          FROM ubuntu:22.04
          
          # Versions to fetch
          
          ARG SWIFT_VERSION=scheme:release/6.0
          ARG MUSL_VERSION=1.2.5
          ARG LIBXML2_VERSION=2.12.7
          ARG CURL_VERSION=8.7.1
          ARG BORINGSSL_VERSION=fips-20220613
          ARG ICU_VERSION=maint/maint-69
          ARG ZLIB_VERSION=1.3.1
          
          # ............................................................................
          
          # Install development tools

@al45tair does the linker optimize the binary size by dropping unused code? Is that possible?

al45tair · June 20, 2024, 9:58am

It does.

BTW, the SDK tarball has an embedded SPDX SBOM that lists the versions built in to it, so you can inspect that to see exactly what you've got.

Did you strip the binary? A lot of the size of a binary built with the Static Linux SDK is debug information, so it's worth stripping it — though right now the easiest way is to use strip on an actual Linux machine (you can use llvm-objcopy too, if you have it built).

There is one additional reason to prefer using the Static Linux SDK for things like this also — namely that the container in which it runs really doesn't need any other components in it. In particular, you don't need a Linux userland, which means it's much harder to break into your systems by exploiting bugs in your program.

dima_kozhinov · June 20, 2024, 10:12am

If it does, why should I strip the binary?

al45tair · June 20, 2024, 10:44am

Stripping and dead code elimination are two different things. The linker does the latter (on Linux, it does so by removing or including things on a per-section basis; on Apple platforms it can actually do a more sophisticated analysis where it can add and remove individual functions without having to put them into separate sections). Stripping, on the other hand, refers to the removal of debug information, which isn't normally something the linker will do itself.

One of the reasons for leaving this part up to you to do is that it's possible you will want to archive the debug information somewhere so that you can use it in the event that your program crashes — you may need it in order to interpret a backtrace.
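
A minimal sketch of that workflow, assuming `llvm-objcopy` is available (as mentioned above, `strip` on a Linux machine is the other option). The `run` and `strip_and_archive` helpers and the `.debug` file suffix are hypothetical choices for illustration:

```shell
#!/bin/sh
# Sketch: save the debug information to a separate file for later
# backtrace interpretation, then strip it from the deployed binary.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

strip_and_archive() {
    bin="$1"
    debug="${bin}.debug"
    # 1. Copy only the debug sections into a separate file to archive.
    run llvm-objcopy --only-keep-debug "$bin" "$debug"
    # 2. Remove the debug sections from the binary, in place.
    run llvm-objcopy --strip-debug "$bin"
    # 3. Record the debug file's name in the binary so tools can find it.
    run llvm-objcopy --add-gnu-debuglink="$debug" "$bin"
}
```

For example, `strip_and_archive .build/aarch64-swift-linux-musl/release/UrlLambda` would leave a smaller `UrlLambda` plus a `UrlLambda.debug` file to keep somewhere safe.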

dima_kozhinov · June 20, 2024, 10:50am

Why should we use a mysterious musl to link the binary statically? Why isn't a simple command-line option enough? And why is a simple hello world app ~100 MB in size? If the app does nothing more than output a line to stdout, shouldn't the linker drop networking and similar functions from the binary?

dima_kozhinov · June 20, 2024, 10:59am

Why then is a simple hello world app ~100 MB in size?

If I use swift build -c release, shouldn't the debug information already be removed?

sebsto · June 20, 2024, 11:03am

There are two reasons why Apple released the Static Linux SDK.

1. Linux programs written in Swift need to ensure that a copy of the Swift runtime, and all of its dependencies, is installed on the target system. Not all Linux distributions have a Swift runtime built for them. So it makes sense, for some use cases, to statically link the whole Swift runtime into the executable.
2. A program built for a particular distribution, or even a particular major version of a particular distribution, would not necessarily run on any other distribution or, in some cases, even on a different major version of the same distribution. So, to simplify the distribution of your executable, the idea is to statically link the libraries that are traditionally shipped by the OS. Typical examples in this list are the libc and SSL/TLS libraries.

Building your Swift executable statically is not aimed at being a catch all solution for everybody building Swift applications on Linux, but it solves many challenges for specific use cases. Deploying on AWS Lambda is one of them.

If you know you're going to deploy your application on Ubuntu, Debian, or any other officially supported Linux distribution, you don't need to use the Static Linux SDK. But when you do not know on which distribution your executable will run or even if there is a Swift runtime available for that distribution, it helps a lot.

Additionally, as Alastair mentioned, there is a security benefit. You know exactly which version of which libraries are statically linked, there are no dependencies on the distribution, and your executable doesn't depend on Linux userland libraries, so you can remove them from your deployment platform, making it much harder to break into.

sebsto · June 20, 2024, 11:07am

Because the resulting binary includes the Swift runtime, the libc runtime, crypto libraries, XML libraries, the unicode libraries and all it needs to run without any dependency on the libraries provided by the underlying Linux distribution. This executable should work on any Linux distribution, without modification.
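
One way to check that claim on a given binary is to look for shared-library dependencies in its ELF dynamic section. The sketch below assumes `readelf` (or `llvm-readelf`) is available; `linkage_of` is a hypothetical helper that classifies the output of `readelf -d` read from standard input:

```shell
#!/bin/sh
# Sketch: classify `readelf -d` output read from stdin.
# Dynamically linked ELF files list each required library as a (NEEDED)
# entry; a fully static binary has none.
linkage_of() {
    if grep -q '(NEEDED)'; then
        echo "dynamic"
    else
        echo "static"
    fi
}

# Usage on a real binary:
#   readelf -d .build/aarch64-swift-linux-musl/release/UrlLambda | linkage_of
```

A binary built with the Static Linux SDK would be expected to report `static`, while the one built in Docker against glibc would report `dynamic`.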

dima_kozhinov · June 20, 2024, 11:09am

My app does not use crypto, XML, etc. Shouldn't the linker drop that unused functionality from the binary?

sebsto · June 20, 2024, 11:15am

musl is not that mysterious :-)
https://musl.libc.org/

musl is an implementation of the C standard library built on top of the Linux system call API, including interfaces defined in the base language standard, POSIX, and widely agreed-upon extensions. musl is lightweight, fast, simple, free, and strives to be correct in the sense of standards-conformance and safety.

In my personal opinion, it also has the advantage over glibc of not being encumbered by the LGPL.

sebsto · June 20, 2024, 11:18am

Maybe. You should give it a try. The Lambda Hello World example I shared at the start of this thread is a Lambda function running with a custom AWS Lambda Runtime. It is not a regular Hello World (print("Hello World"))

The AWS Lambda runtime requires a full HTTPS client and server, with Swift NIO, TLS etc.

Maybe your executable will be much smaller than mine.

dima_kozhinov · June 20, 2024, 11:18am

Why doesn't the Swift compiler use it by default?

sebsto · June 20, 2024, 11:20am

Because of all the objections you shared before. Static linking produces larger executables. For most cases, it's better to rely on the libraries provided by the Linux distributions, and glibc is almost always available by default. But for some cases, it makes sense to build a statically linked file. I guess many embedded device developers will love it too.

dima_kozhinov · June 20, 2024, 11:45am

I meant a simple "hello world" app, not an AWS Lambda app. I do use AWS Lambda, this is why I am interested in this thread. I do not yet use Swift for my Lambda functions, but would like to.

Max_Desiatov · June 20, 2024, 12:13pm

Musl is pretty much the only widely supported [1] C standard library on Linux that allows static linking in the first place. Glibc doesn't support static linking.

[1] This point is crucial: I know there are more obscure alternatives, and I'm not counting Android as Linux here for the purposes of this discussion.

dima_kozhinov · June 20, 2024, 12:28pm

I was under the impression that AWS Lambda functions just do stdout text output.