Recommended way to develop for Amazon Linux 2023?

i’ve been running swift applications on Amazon Linux 2023 EC2 instances for a while now. our current setup uses a 5.9.2 toolchain, and i install it like this:

curl https://download.swift.org/swift-5.9.2-release/amazonlinux2/swift-5.9.2-RELEASE/swift-5.9.2-RELEASE-amazonlinux2.tar.gz -o toolchain.tar.gz
tar -xf toolchain.tar.gz
sudo mv swift-5.9.2-RELEASE-amazonlinux2/usr/lib/* /usr/lib/
sudo mv swift-5.9.2-RELEASE-amazonlinux2/usr/libexec/* /usr/libexec/

i’ve gotten surprisingly far with this setup, but i have more than a few misgivings about it, and i cannot escape the feeling that this is unsustainable. i have a few reasons for this:

  1. i am installing a runtime that was compiled for a different OS (Amazon Linux 2) than the OS i am actually using.

  2. due to the much older Glibc version associated with the Amazon Linux 2 toolchain, it is not really tenable to continue developing locally with this toolchain, as VSCode devcontainers no longer support Amazon Linux 2.

  3. i don’t really have a solid procedure for upgrading this runtime, as i am installing it in a completely ad-hoc manner. right now i just hope that overwriting the files in /usr/lib is sufficient.

  4. the 5.9 toolchain itself is deeply flawed - the compiler crashes frequently, it miscompiles code that segfaults randomly at runtime, its concurrency checking is reportedly incomplete, and key libraries have already dropped support for it. therefore i have reached the conclusion that switching to a 5.10 toolchain is the only path forward. but because swift lacks ABI stability on linux, it would be difficult to keep the runtime version in sync with the compiler version, as the 5.10 toolchain changes continuously.

what is the recommended way to use swift on the server in 2024?

perhaps it should have been obvious in hindsight, but the 5.10 compiler is even more unreliable than the 5.9 compiler. one could hardly blame it for crashing or miscompiling code, as 5.10 is experimental and makes no claim of being production-ready, so i did not think it was worthwhile to continue exploring this direction.

so: it would appear that getting 5.9.2 to work on Amazon Linux 2023 is the only path forward. but despite expending countless hours trying to build the toolchain from source, i could not compile a basic project with the custom-built toolchains. i must have built them incorrectly, because SIL verification fails somehow. (more on that here.)

investigating why SILModuleTransform "MandatoryInlining" is hitting assertions that the official toolchain does not is an interesting tangent, but unfortunately it is not reasonable for us to spend the rest of Q1 2024 debugging the swift compiler.

the blocking issue here is that the official swift docker images are no longer compatible with VSCode, because VSCode has bumped its Glibc requirement. are there any (realistic) options for those of us who wish to continue using swift on the server, without holding back VSCode from updating indefinitely?
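
for anyone wanting to check which side of that requirement a given image falls on, here is a minimal sketch. the function names are hypothetical, and the 2.28 minimum used in the note below is my understanding of the VSCode remote server requirement, not something stated in this thread.

```shell
# Succeed if dotted version $1 is at least dotted version $2.
# Relies on `sort -V` (GNU coreutils) for version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the Glibc version of the current system, e.g. "2.34".
# `getconf GNU_LIBC_VERSION` prints a line like "glibc 2.34".
glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{ print $2 }'
}
```

running `version_at_least "$(glibc_version)" 2.28 || echo "glibc too old"` inside a container should print the warning on an Amazon Linux 2 image (which i believe ships Glibc 2.26) and stay quiet on Amazon Linux 2023.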

5.10 is getting close to release next month, so any remaining issues should be filed and looked at.

Building the 5.9.2 or 5.10 toolchain from source, now with assertions disabled to avoid that SIL verification issue, and reporting the remaining issues appears to be your only choice.

Did you ever talk to @sebsto about using his AL2023 toolchain and maybe making that official?

yes, i figured as much, so i’ve been setting aside a couple hours each day to investigate these crashes as i run into them, and gotten minimal reproducers for nine of them so far:

(kudos to @Joe_Groff for fixing the last one!)

i have not been in contact with him, but i actually stumbled upon his dockerfile gist earlier today and tried to build it myself this afternoon, although i ended up having to interrupt the docker build when i went home for the day.

i do not have any authority at Apple (or Amazon for that matter) so i don’t know what role i could play in making these images “official”. i’d be eager to help in any way i can though.

You do not need any authority at those companies to submit a Docker image for AL2023, though I believe the core team has to approve a new official platform. I suggest you and @sebsto submit an image and maybe we can get it in time to have an official 5.10 toolchain for AL2023. :smiley:

the dockerfiles in that repository do not compile the toolchains, instead they just download and extract a prepackaged toolchain from https://download.swift.org. i could submit the full dockerfile i used to compile the AL2023 toolchain, but that would be a very large image that takes much longer to build than the other images in that repository. is this something likely to be accepted?

You're right about most of them, but not those in the swift-ci/ directory: those are used to build the official release and snapshot toolchains, which is why you'll notice they install an older prebuilt Swift 5.8.1 toolchain initially, to build the Swift source in a fresh toolchain on the CI.

Hello @taylorswift - I'm just discovering this thread.
Thank you for working on this project!

TL;DR
I managed to compile Swift 5.8 and Swift 5.9.x on Amazon Linux 2023 x64 and Graviton. I have scripts to build on EC2 and to build on Docker to produce an RPM file, and I'm working on the CI integration.
(Thank you @Finagolfin for your continuous support and your patience)

Here is the script I use to build on EC2

The key points to pay attention to are

  • You need to install ld.gold as it is the default linker used by the Swift project. I use this script to build it from the sources.
  • On aarch64 (Amazon Graviton chipset), we must add a new triplet definition to the clang compiler. I use this patch to do so.
  • On both platforms, the compilation and packaging works, but there are 4 unit tests that fail. More about that later.
  • To compile Swift 5.9.x, you must install the binaries for Swift 5.8 first and have swift available in the PATH
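
The last point can be checked before starting a long build. This is only a minimal sketch: the function name check_bootstrap_swift is hypothetical, and it assumes that `swift --version` prints a line containing "Swift version X.Y.Z", as the official toolchains do.

```shell
# Succeed only if a `swift` on PATH reports the wanted major.minor
# version (default 5.8), i.e. the bootstrap compiler for a 5.9.x build.
check_bootstrap_swift() {
    want="${1:-5.8}"
    if ! command -v swift >/dev/null 2>&1; then
        echo "swift not found in PATH" >&2
        return 1
    fi
    # Extract "X.Y" from the "Swift version X.Y[.Z]" line.
    have="$(swift --version 2>&1 \
        | sed -n 's/.*Swift version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' \
        | head -n 1)"
    if [ "${have}" != "${want}" ]; then
        echo "found Swift ${have:-unknown}, expected ${want}.x" >&2
        return 1
    fi
    echo "bootstrap Swift ${have} found at $(command -v swift)"
}
```

Calling `check_bootstrap_swift` (or `check_bootstrap_swift 5.8`) at the top of a build script stops early with a non-zero status when the Swift 5.8 binaries are missing or a different version comes first in the PATH.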

Here are the deliverables I created and you can use / test (I'm keen to receive feedback)

Actions pending

  • Review and possibly merge the docker and RPM build scripts. Owner: the Swift Installer Script project maintainers @tomerd or @compnerd.
  • Resolve the 4 failing unit tests for Swift 5.8 on the Swift CI Docker project. Owner: me. (I have an open thread with @mishal_shah, and maybe @Finagolfin you have ideas too! Let me know.)
  • Submit the PR to produce official builds of Swift 5.8 on Amazon Linux 2023
  • Once official builds are available, update the swift CI docker file to include swift 5.8 in the build container. That will allow building Swift 5.9 and newer.

More details about the 4 failing unit tests
I started a new thread to discuss the unit test errors I received.

To address the Swift 5.8 dependency to build Swift 5.9, I proposed the following plan.

  • start to build 5.8 in the nightly builds and eventually promote the binary package as an official download on swift.org
  • once available, add the 5.8 download and installation instructions to the Swift CI Docker file and start to build Swift 5.9 and beyond

i have opened a pull request here:

it also includes speculative dockerfiles for downloading the AL2023 toolchains from download.swift.org. however, this would likely require swift.org to also distribute ld.gold binaries, as we probably don’t want to build those in the dockerfile.

sigh… even with assertions disabled, this custom toolchain is unable to compile anything in release mode (debug mode is unaffected)

/usr/bin/../lib/gcc/x86_64-amazon-linux/11/libstdc++.a(eh_throw.o)(.note.stapsdt+0x14): error: relocation refers to local symbol ".text.__cxa_throw" [4], which is defined in a discarded section
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)

although the error message suggests to pass a -v flag, this does not do anything except parrot the ld.gold version:

$ swift build -c release -Xlinker -v
Building for production...
error: link command failed with exit code 1 (use -v to see invocation)
GNU gold (GNU Binutils 2.42.50.20240218) 1.16
/usr/bin/../lib/gcc/x86_64-amazon-linux/11/libstdc++.a(eh_throw.o)(.note.stapsdt+0x14): error: relocation refers to local symbol ".text.__cxa_throw" [4], which is defined in a discarded section
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
error: fatalError

i’m guessing this has something to do with the unofficial ld.gold linker in the image, although i don’t know enough about the linker to understand what is going on here.


update: so i found that passing -Xswiftc -use-ld=ld to the swift build -c release command works, as the problem is specific to the gold linker. sadly, this will not fly with SPM users due to that build system’s aversion to unsafe flags, but for our purposes it is a satisfactory workaround for compiling a server application, as supporting library users is not a top priority right now.
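
to avoid retyping the flags, the workaround can be wrapped in a small helper. this is only a sketch: the function name swift_release_build and the DRY_RUN variable are hypothetical, and the only flags it uses are the ones mentioned above.

```shell
# Build in release mode, forcing the linker named in $1 (default: ld)
# instead of the toolchain's default ld.gold.
# With DRY_RUN=1 the command is printed instead of executed.
swift_release_build() {
    linker="${1:-ld}"
    # Reuse the positional parameters to hold the full command line.
    set -- swift build -c release -Xswiftc "-use-ld=${linker}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

running `DRY_RUN=1 swift_release_build` prints the exact command without invoking the compiler, which is handy for checking what a CI step would execute.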

I am not sure I understand.

These containers do not seem to include ld.gold, therefore the build is likely to fail. Also, these containers do not include tools such as tar and diff, which are required for the packaging and the tests. I'm pretty sure the build with these is going to fail. Did I miss something?

On a side note, yum is deprecated in Amazon Linux 2023, you should use dnf instead.
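
As a minimal sketch of what that could look like in a build script: the function names below are hypothetical, and the package names tar and diffutils (which provides diff) are my assumption for the tools mentioned above.

```shell
# Print the package manager to use: dnf when present, otherwise yum.
pick_pkg_manager() {
    if command -v dnf >/dev/null 2>&1; then
        echo dnf
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    else
        echo "neither dnf nor yum found" >&2
        return 1
    fi
}

# Print (not run) the install command for the tools needed by the
# packaging and test steps.
install_build_tools_cmd() {
    pm="$(pick_pkg_manager)" || return 1
    echo "${pm} install -y tar diffutils"
}
```

On Amazon Linux 2023 `install_build_tools_cmd` prints a dnf command line, while the same script still produces a yum command on an older image that lacks dnf.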

Also, it is possible to build ld.gold in a container and only include the binary in the container used for the build, to avoid overhead.

Look at the solution I used here

based on @Finagolfin ’s explanation, i believe that the first dockerfile provides the environment that the swift CI loads before running utils/build-script, and this is the dockerfile that needs to have ld.gold. i built a toolchain overnight using this dockerfile as a base layer, so i know it works in at least one environment.

the version-specific dockerfiles install tar ephemerally before extracting the package and the “slim” images remove it immediately afterwards.

i originally anticipated that we would need to ship ld.gold with the version-specific dockerfiles as well, but as i mentioned earlier, ld.gold isn’t usable in the images anyways, so toolchain users need to specify -use-ld=ld to compile anything in release mode.

updated here

Thank you - I think I need to dive deeper and fully understand the purpose of the Dockerfile under version-name directories :-)

You're correct for the Dockerfile nested under swift-ci that's the one the CI uses to build the sources. Sources are hosted outside of the container. For my tests, I have them in a separate directory.

i originally anticipated that we would need to ship ld.gold with the version-specific dockerfiles as well, but as i mentioned earlier, ld.gold isn’t usable in the images anyways, so toolchain users need to specify -use-ld=ld to compile anything in release mode.

It is for me. I managed to build 5.8 in the container, without changing anything in the source tree for the x64 build and with one tiny patch to clang for the aarch64 build

I am now trying to build 5.9 with the container + Swift 5.8

i probably wasn’t clear enough: ld.gold is usable for building the toolchain itself, but this toolchain will not be able to use ld.gold to compile a project of sufficient complexity. instead, you must use the ld linker. one way to do this is to pass -Xswiftc -use-ld=ld when building your application. but anecdotally, symlinking /usr/bin/ld.gold to /bin/ld also worked for me.
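
the symlink variant can be sketched roughly like this. the function name redirect_gold_to_ld is hypothetical, and it assumes ld and ld.gold live in the same directory (on Amazon Linux 2023 /bin points at /usr/bin, as far as i know). on a real system it needs root.

```shell
# Replace <bindir>/ld.gold with a symlink to the regular ld.
# $1 is the bin directory (default: /usr/bin).
redirect_gold_to_ld() {
    bindir="${1:-/usr/bin}"
    if [ ! -x "${bindir}/ld" ]; then
        echo "no ld found in ${bindir}" >&2
        return 1
    fi
    # Keep the original gold binary around so the change can be undone.
    if [ -e "${bindir}/ld.gold" ] && [ ! -L "${bindir}/ld.gold" ]; then
        mv "${bindir}/ld.gold" "${bindir}/ld.gold.orig"
    fi
    ln -sfn "${bindir}/ld" "${bindir}/ld.gold"
}
```

the backup step is my own addition: moving ld.gold.orig back into place restores the original linker if it is needed again, for example to rebuild the toolchain itself.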

i was also confused by the layout of the swift-docker repository.

Have you tried using lld instead? If it works well, you guys should submit patches to change the Swift default on Amazon Linux to lld. Should be easy to do in the legacy C++ Driver with a compile-time check and you can do something similar in the new swift-driver too.

I confirm the Dockerfiles under the VERSION directories (i.e. 5.8, 5.9 etc) are runtime environments for Swift apps. The SLIM one is the absolute minimum OS. The Dockerfile under swift-ci is the build environment for the toolchain itself.

I think the long term solution is indeed to modify Swift to use ld on Amazon Linux 2023. I like the short term solution proposed by @taylorswift : just symlinking /usr/bin/ld.gold to /bin/ld

@Finagolfin talking about patches

  1. to compile on Graviton (aarch64), there is a patch needed. Here is my upstream PR. Is there a chance to get this accepted? And, ideally, to have it backported to the 5.9 branches?
  2. Are these the only places to modify the swift driver to force the usage of ld on Amazon Linux 2023?

5.9 is no longer taking patches at this point. The bar for 5.10 is very high as well.