AWS Lambda functions and the Linux Static SDK - works!

So the -static-executable and -static-stdlib command line options of the swiftc compiler do not work?

They do work; this has been discussed at length before, e.g. in the original announcement thread.

The crux of it is: you can build a -static-executable but only if it does not use anything from libc because libc really does not want to be statically linked.

So, if you want a fully static executable AND you need something from libc, you can now use musl.

-static-stdlib has worked for a long time and lets you statically link only the Swift runtime libraries while still dynamically linking libc (and a few others). The downside is that the binary is less portable: it mostly runs reliably only on the same distro, and only if those libraries are present.
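
As a quick way to tell which of these cases a given binary falls into, here is a minimal sketch that classifies the one-line description printed by the `file` utility. The function name `classify_linkage` is made up for illustration, and the exact wording of `file` output can vary between versions, so treat the patterns as an assumption.

```shell
# classify_linkage is a hypothetical helper, not part of any toolchain.
# It takes the description printed by `file -b <binary>` and reports
# whether the binary is statically or dynamically linked.
classify_linkage() {
    case "$1" in
        *"statically linked"*|*"static-pie linked"*) echo "static" ;;
        *"dynamically linked"*) echo "dynamic" ;;
        *) echo "unknown" ;;
    esac
}
```

It would be called as `classify_linkage "$(file -b path/to/binary)"`. A binary built with only -static-stdlib should still come out as "dynamic" because libc is loaded at run time, while a musl-based fully static build should come out as "static".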


How do I know if I need something from libc? I use Swift, not C.

I just tried a simple hello world to answer your question:

print("hello world")

on macOS with

swift build -c release 

ls -al .build/arm64-apple-macosx/release/HelloWorld
-rwxr-xr-x  1 stormacq  staff  55928 Jun 20 15:20 .build/arm64-apple-macosx/release/HelloWorld

When using the Static Linux SDK

export TOOLCHAINS=org.swift.600202406131a                 
PATH_TO_TOOLCHAIN=/Library/Developer/Toolchains/swift-6.0-DEVELOPMENT-SNAPSHOT-2024-06-13-a.xctoolchain

DYLD_LIBRARY_PATH=$PATH_TO_TOOLCHAIN/usr/lib/swift/macosx $PATH_TO_TOOLCHAIN/usr/bin/swift build -c release --swift-sdk aarch64-swift-linux-musl --target HelloWorld

ls -al .build/aarch64-swift-linux-musl/release/HelloWorld
-rwxr-xr-x  1 stormacq  staff  42116112 Jun 20 15:24 .build/aarch64-swift-linux-musl/release/HelloWorld

It's 40 MB vs 55 KB :-)

Then I stripped it, and it's now 5.9 MB. That's an impressive reduction of roughly 85%.
Given that musl's libc.a is 2.4 MB and libc++.a is 10 MB, I find that 5.9 MB for an executable that contains both libc and the Swift runtime is not that bad :-)

# on Linux
strip -o stripped HelloWorld

$ ls -alh stripped 
-rwxrwxr-x. 1 ec2-user ec2-user 5.9M Jun 20 13:37 stripped
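
The arithmetic behind the size reduction can be sketched as below. `size_bytes` and `reduction_pct` are illustrative helper names, and 6186598 is only an approximation of 5.9 MiB, since the exact stripped byte count is not given in this thread.

```shell
# Hypothetical helpers for comparing binary sizes before and after strip.

# Print the size of a file in bytes.
size_bytes() {
    wc -c < "$1" | tr -d '[:space:]'
}

# Print the reduction from $1 bytes to $2 bytes as a percentage,
# truncated to one decimal place, using integer arithmetic only.
reduction_pct() {
    tenths=$(( ($1 - $2) * 1000 / $1 ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

# 42116112 is the unstripped size quoted above; 6186598 approximates 5.9 MiB.
reduction_pct 42116112 6186598

# On Linux, with both files present, the same check on the real files:
if [ -f HelloWorld ] && [ -f stripped ]; then
    reduction_pct "$(size_bytes HelloWorld)" "$(size_bytes stripped)"
fi
```

With the approximate figure this lands between 85% and 86%, depending on whether the 5.9M reported by `ls -alh` is read as MB or MiB.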

You always need something from libc, unless you're using the embedded mode. The Swift runtime is built on top of libc++ or libstdc++, which are themselves built on top of a libc.

If we're making binary size measurements for "Hello, World!" printers on Linux, one could produce an executable taking 1096 bytes on arm64 with Embedded Swift.

For posterity, here's the source code for that, showing what using syscalls without any libc looks like. No allocator is included, to reduce the number of dependencies.
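import LinuxSyscall

// _start is the default entry symbol the linker looks for; no C runtime startup code is linked in.
@_cdecl("_start")
func start() {
    let hello: StaticString = "Hello, World!\n"

    // write is syscall 64 on aarch64 Linux; 1 is the stdout file descriptor.
    __syscall3(64, 1, Int(bitPattern: hello.utf8Start), hello.utf8CodeUnitCount)

    // exit is syscall 93 on aarch64 Linux; 0 is the exit status.
    __syscall1(93, 0)
}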

where the LinuxSyscall directory contains this module.modulemap:

module LinuxSyscall  {
    header "aarch64/syscall_arch.h"
    export *
}

and LinuxSyscall/aarch64/syscall_arch.h looks like this:

// ----------------------------------------------------------------------
// Copyright © 2005-2020 Rich Felker, et al.

// Permission is hereby granted, free of charge, to any person obtaining
// a copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to
// permit persons to whom the Software is furnished to do so, subject to
// the following conditions:

// The above copyright notice and this permission notice shall be
// included in all copies or substantial portions of the Software.

// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
// EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
// IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
// CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
// SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
// ----------------------------------------------------------------------

#define __asm_syscall(...) do { \
	__asm__ __volatile__ ( "svc 0" \
	: "=r"(x0) : __VA_ARGS__ : "memory", "cc"); \
	return x0; \
	} while (0)

static inline long __syscall0(long n)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0");
	__asm_syscall("r"(x8));
}

static inline long __syscall1(long n, long a)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	__asm_syscall("r"(x8), "0"(x0));
}

static inline long __syscall2(long n, long a, long b)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	register long x1 __asm__("x1") = b;
	__asm_syscall("r"(x8), "0"(x0), "r"(x1));
}

static inline long __syscall3(long n, long a, long b, long c)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	register long x1 __asm__("x1") = b;
	register long x2 __asm__("x2") = c;
	__asm_syscall("r"(x8), "0"(x0), "r"(x1), "r"(x2));
}

static inline long __syscall4(long n, long a, long b, long c, long d)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	register long x1 __asm__("x1") = b;
	register long x2 __asm__("x2") = c;
	register long x3 __asm__("x3") = d;
	__asm_syscall("r"(x8), "0"(x0), "r"(x1), "r"(x2), "r"(x3));
}

static inline long __syscall5(long n, long a, long b, long c, long d, long e)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	register long x1 __asm__("x1") = b;
	register long x2 __asm__("x2") = c;
	register long x3 __asm__("x3") = d;
	register long x4 __asm__("x4") = e;
	__asm_syscall("r"(x8), "0"(x0), "r"(x1), "r"(x2), "r"(x3), "r"(x4));
}

static inline long __syscall6(long n, long a, long b, long c, long d, long e, long f)
{
	register long x8 __asm__("x8") = n;
	register long x0 __asm__("x0") = a;
	register long x1 __asm__("x1") = b;
	register long x2 __asm__("x2") = c;
	register long x3 __asm__("x3") = d;
	register long x4 __asm__("x4") = e;
	register long x5 __asm__("x5") = f;
	__asm_syscall("r"(x8), "0"(x0), "r"(x1), "r"(x2), "r"(x3), "r"(x4), "r"(x5));
}

You can build it with a recent development snapshot using this command:

swiftc -Osize -enable-experimental-feature Embedded \
    -wmo \
    --target=aarch64-none-none-elf \
    -I . \
    -nostartfiles \
    -Xfrontend -function-sections \
    -Xlinker --gc-sections \
    -c -o hello.o \
    hello.swift

and link with lld:

ld.lld hello.o --gc-sections -Bstatic -EL -o hello

What is this magic spell? Is it documented somewhere?

Yes, see the snapshot installation instructions page.


So the release build does not strip debug information from the binary. We need to use the additional strip command.

  1. I could not find org.swift.600202406131a there;
  2. These instructions are for macOS, while we are talking about Amazon Linux here.

OK, now I realize that this is related to a development snapshot.

To bring the topic back on track: if I read the arguments in this thread correctly, there are good reasons (security, simplicity, acceptable overhead) to say that building and deploying a fully static binary to AWS Lambda is actually better than building a dynamically linked binary (with a static Swift stdlib) tailored to the specific known runtime environment (AL2023)?


The doc has

export TOOLCHAINS=$(plutil -extract CFBundleIdentifier raw /Library/Developer/Toolchains/<toolchain name>.xctoolchain/Info.plist)

We're talking about cross-compiling on macOS to produce Linux binaries


I'm still trying to figure out :-) Thank you for this discussion that helps to gather diverse opinions.
Here is what I think about the Static Linux SDK in the context of AWS Lambda functions

PROs

  • simplicity of the build (no need for Docker, but the SDK must be installed)
  • no dependency on Amazon Linux 2 / 2023 or next

Neutral

  • binary size. It is clearly larger, even when stripped (note that stripping must happen on Linux, which makes the deployment pipeline more complex). That is not necessarily bad in the context of the Lambda execution environment. Does it affect cold start time? I don't know. Intuitively, I would answer "no", but I need to measure this.

  • I don't think there is an advantage in terms of security. I understood Alastair's comment as "we can strip down the OS and remove libraries that are not used, hence reducing the attack surface". But in the context of AWS Lambda, you typically don't provide the OS image; the AWS Lambda service does, unless you provide your full OS image as an OCI container image. But even if you do, what are the security risks you want to protect against in the context of an AWS Lambda execution environment?

Unless it has a negative impact on cold start time, I don't have a con yet.

Would you mind trying a

swift build -c release -Xswiftc -gline-tables-only -Xcc -gline-tables-only

That should give you a release binary which still has line information embedded (very useful for debugging crashes), but likely smaller than -g.

Thank you @johannesweiss for the suggestion. If I applied them correctly, these two command-line options do not move the needle much: 192 bytes less (about 0.0005% smaller).

without

swift build -c release --swift-sdk aarch64-swift-linux-musl

ls -al .build/aarch64-swift-linux-musl/release/HelloWorld 
-rwxr-xr-x  1 stormacq  staff  42116112 Jun 22 14:04 .build/aarch64-swift-linux-musl/release/HelloWorld

with

swift build -c release --swift-sdk aarch64-swift-linux-musl -Xswiftc -gline-tables-only -Xcc -gline-tables-only

ls -al .build/aarch64-swift-linux-musl/release/HelloWorld
-rwxr-xr-x  1 stormacq  staff  42115920 Jun 22 14:04 .build/aarch64-swift-linux-musl/release/HelloWorld
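
For reference, the difference between the two builds can be checked with a few lines of shell arithmetic. `tiny_pct` is a hypothetical helper, and it assumes the change is well under 1% so that the result fits the fixed `0.xxxxxx` format.

```shell
# Byte counts quoted in the two listings above.
size_without=42116112
size_with=42115920

# Absolute difference in bytes.
echo $(( size_without - size_with ))

# tiny_pct: hypothetical helper printing the change from $1 to $2 as a
# percentage truncated to six decimals. Assumes the change is below 1%.
tiny_pct() {
    printf '0.%06d\n' "$(( ($1 - $2) * 100000000 / $1 ))"
}

tiny_pct "$size_without" "$size_with"
```

This prints 192 and 0.000455, that is, a fraction of about 0.0000046 of the binary.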

How about

swift build -c release -Xswiftc -gnone

do we still need strip?

That's the easiest way currently, for sure. We should probably start shipping llvm-objcopy et al with the toolchain, which would let you strip it on any platform Swift runs on.


Since the libraries being linked also contain debug information, you probably do need to explicitly strip the result (-Xswiftc -gnone just controls debug information for your program, not for the libraries that are already built).
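
One way to check whether a given binary still carries debug information is to look for DWARF `.debug_*` sections in its section table. A minimal sketch, assuming `readelf` is available on the machine; `has_debug_sections` is a made-up helper name and the binary path is just the one used earlier in this thread.

```shell
# has_debug_sections: hypothetical helper. Reads a section listing on stdin
# (for example the output of `readelf -S`) and succeeds if any DWARF
# .debug_* section is named in it.
has_debug_sections() {
    grep -q '\.debug_'
}

bin=.build/aarch64-swift-linux-musl/release/HelloWorld
if [ -f "$bin" ] && command -v readelf >/dev/null 2>&1; then
    if readelf -S "$bin" | has_debug_sections; then
        echo "debug sections present, strip will shrink the binary"
    else
        echo "no debug sections found"
    fi
fi
```

Running this on a build made with -Xswiftc -gnone and again on the stripped file would show whether the prebuilt libraries still contribute debug sections.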


I don't know what the Lambda cold start time includes, but one thing that doesn't have to happen here is dynamic library loading and the associated run-time linking. So, naïvely, I would expect cold start time to be lower for the fully statically linked binary.

I agree. There are probably a few ms added to download the binary into the runtime environment, but given networks these days, that should be really minor. Then, as you mentioned, there is no need to do any symbol resolution. I also intuitively think the statically linked binary should decrease the cold start time. I'll try to measure that if time permits, although I'm a bit wary of Lambda benchmarks: they are so easy to interpret incorrectly.