Status of migration of Swift compiler build-script to Python (SR-237)?

Hi all, I'm currently trying to make the WebAssembly compiler target work, but this requires cross-compiling libicu, musl, libc++ and libc++abi, as these are dependencies of the Swift standard library (a few more details are available here).

We already have libicu building for Linux and Android, so I was able to reuse some of those build script helpers. libc++ and libc++abi can be built as part of LLVM by symlinking their sources into the llvm/projects directory.
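As a rough illustration of the symlinking step, here is a minimal Python sketch. The function name, the checkout layout and the directory names `libcxx` and `libcxxabi` are assumptions made for the example; this is not code taken from utils/build-script.

```python
import os


def link_runtime_sources(checkout_root, llvm_projects_dir,
                         names=("libcxx", "libcxxabi")):
    """Symlink each runtime's source directory into llvm/projects.

    Hypothetical helper: the names and layout are assumptions, not
    what the Swift build scripts actually do.
    Returns the list of links that were newly created.
    """
    created = []
    os.makedirs(llvm_projects_dir, exist_ok=True)
    for name in names:
        source = os.path.join(checkout_root, name)
        link = os.path.join(llvm_projects_dir, name)
        if not os.path.isdir(source):
            raise FileNotFoundError(source)
        # Leave anything that is already in place untouched.
        if os.path.islink(link) or os.path.exists(link):
            continue
        os.symlink(source, link, target_is_directory=True)
        created.append(link)
    return created
```

Calling it a second time with the same arguments creates nothing new, so it should be safe to run on every build.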

The main problem here is adding a musl build product. My understanding is that new products should be added to the giant build-script-impl. But I've also found the --no-legacy-impl argument that can be passed to utils/build-script. It looks like that flag is a work in progress towards resolving SR-237. Is my understanding correct?

If so, what is the preferred way to add new build products that are built conditionally depending on the target platform? Should I add it to build-script-impl first? Should I add it as a Python module in utils/build_swift? Is it even worth trying to build new target platforms with --no-legacy-impl?
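To make the question more concrete, this is roughly the shape of a conditionally built product I have in mind. It is only a sketch: `Target`, `Product`, `ICU`, `Musl` and `products_for` are hypothetical names invented for this example and are not the actual utils/build_swift API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of a target platform."""
    platform: str  # e.g. "linux", "android", "wasm"
    arch: str


class Product:
    """Hypothetical base class: a product is built for listed platforms."""
    name = "product"
    platforms = frozenset()

    def should_build(self, target):
        return target.platform in self.platforms


class ICU(Product):
    name = "icu"
    platforms = frozenset({"linux", "android", "wasm"})


class Musl(Product):
    name = "musl"
    # Assumption for this sketch: only the WebAssembly target needs it.
    platforms = frozenset({"wasm"})


def products_for(target, products):
    """Return the names of the products that apply to the given target."""
    return [p.name for p in products if p.should_build(target)]
```

With this shape, `products_for(Target("wasm", "wasm32"), [ICU(), Musl()])` would select both products, while a Linux target would only select ICU.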

My current thinking is that if the work on migrating away from build-script-impl is in progress, it may be better not to add new target platforms to the legacy build script and to rely only on the new Python build scripts for them.

Hope this can be clarified.

Thanks!

We're pretty much nowhere near eliminating build-script-impl, but at the same time I don't see why you need to build musl during the Swift build. Is it not possible to have it pre-built on the side?

I'm not quite sure this would be a good solution long-term. It's a fork of musl that's adapted for WebAssembly. I'd anticipate it could receive updates quite regularly, with the WebAssembly standard being a moving target, or it could eventually be replaced with the upstream version. In addition to that, what makes it different from other libraries that are compiled together with the Swift compiler, like libicu? (with this PR hopefully merged soon)

I could also imagine a need for a Linux build of Swift that uses musl instead of Glibc, to be able to run the compiler or executables produced by it on distributions like Alpine. If the WebAssembly build bundles its own prebuilt musl, would that Linux build require its own prebuilt version of musl too? It would be beneficial to reuse the musl build-script adjustments for both platforms.

Even if bundling a prebuilt version of musl is considered, could you please clarify what the best distribution method for it would be? I don't see any prebuilt binaries of this specific WebAssembly musl fork readily available. How would the build process decide whether a source for this binary is trusted? In the case of musl Linux targets, if you're targeting Alpine and the compiler is hosted on, let's say, Ubuntu, you could reuse an existing musl-dev Ubuntu package, but this wouldn't work if you're cross-compiling from macOS.

I guess I'm thinking about how libicu is distributed today. There are a handful of reasons to prefer building it as part of the greater Swift build, but I don't think that's necessarily applicable to all libraries. Foundation on Linux, for example, currently depends on libcurl (or some other networking library; I forget), and we don't build that.

Sure, but libcurl is a good example that I think confirms the point: as far as I know there are 2 main reasons for using precompiled libcurl:

  • it's only used on Linux
  • so far no one swapped different versions of libcurl during Swift compiler build time

From what I've seen in the thread about potentially replacing libcurl with SwiftNIO, the second point does not really hold: differing versions of libcurl do cause problems, as mentioned by @johannesweiss:

Another issue with using the platform's libcurl is that URLSession in swift-corelibs-foundation does support HTTP/2 if and only if the distribution's libcurl was compiled with HTTP/2 support (which it AFAIK isn't on Ubuntu 14.04 and 16.04).

It's also interesting to consider how URLSession works on Windows, if it works at all. Would that require building libcurl on Windows from a source distribution?

In any case, I hope it can be clarified what's the best way to bundle different versions of musl that works for different platforms including cross-compilation scenarios.

So far (as @Tony_Parker and @millenomi were saying elsewhere) the plan is to first split up Foundation a bit so that URLSession (and XMLParser) become their own modules.

After that's done we can start porting URLSession to use SwiftNIO instead of libcurl. In my personal ideal world we'd also make (at least) the URLSession part updateable through SwiftPM, but that's a larger piece of work. For the interim I think we would need to have a private copy of SwiftNIO inside of Foundation where all the symbols are prefixed, so that the regular SwiftNIO can still be developed and updated as we know and like it. Every so often we would then import a stable NIO release into Foundation.

Now that does sound difficult, and some may ask why not just compile libcurl as part of the Swift toolchain compilation (as I think @Max_Desiatov suggested). The problem is that we would then still depend on the system's Open/LibreSSL version, and we'd still be in the same messy situation as today, where the only option is for everybody to use the system's Open/LibreSSL version. That is a problem because not all OpenSSLs support ALPN etc. Of course we could also decide to ship OpenSSL with the Swift toolchain (and libxml2 and libsasl and ...), but then the Swift toolchain would essentially become a Linux distribution, which is certainly not a good idea.
The only way out I see is to make the libraries that ship with Swift (Foundation, Dispatch & stdlib) depend on nothing but the absolute minimum (which does include ICU, but ICU supports symbol prefixing).
Now I should point out that swift-nio-ssl does depend on OpenSSL today, but as soon as Foundation loses its libcurl & OpenSSL dependencies, swift-nio-ssl can embed a BoringSSL version with __attribute__((visibility("hidden"))). Then we can finally have Swift programs on Linux that use Foundation, can still be statically linked, and don't have any OpenSSL symbols in the global symbol tables.

Why is this only a problem on Linux? Basically because Darwin's dyld supports two-level namespaces and Linux does not...

I realise that this is probably all a bit dense so please do ask if you're interested in more details.

Also CC @lukasa and @kevints, who have also spent some non-trivial portion of their lives discussing this issue :)

I'm emptying my queue before getting into split work, but an eye toward SwiftPM usage for the split is absolutely part of my investigation.

@johannesweiss Many thanks for the detailed write-up! To clarify, I'm not suggesting compiling libcurl as part of the toolchain build. I'm using libcurl as an example of how relying on prebuilt libraries that stdlib and/or Foundation depend on causes various problems.

I hope this makes my suggestion to build a custom fork of musl for WebAssembly from source, instead of using prebuilt musl binaries, more realistic. And I hope people interested in having Swift work on distros like Alpine, where musl is used instead of Glibc, would support me here.

Sorry, I'm mixing things up here a bit. Once static linking works well, it wouldn't matter which target Linux distro a Swift binary is copied to. But a binary statically linked with musl should still be lighter than one linked with Glibc. And I guess people using Alpine (which is mostly used to produce small Docker images) would still appreciate the space savings from Swift builds with musl, even if the latter is statically linked.

Hopping a bit off-topic, but I'd like to address this misconception quickly.

As a general rule, static linking on Linux does not include statically linking libc, at least when glibc is in use. If you statically link glibc several things stop working, and this is a code path that is very poorly tested. In fact there are a few other libraries that even "static" Linux binaries should not link. I don't have an exhaustive list, but this includes libc when provided by glibc, libpthread, and libresolv.

As a result, unless you're willing to accept this tradeoff, it still matters to which Linux machine you copy your binary. Specifically, the rule is that you cannot copy your binary to a machine that has an older copy of glibc. This is a fairly well-understood problem in binary distribution for Linux: for example, the Python community's tool for distributing binary Python extensions, wheel, required a special enhancement with many rules to run on Linux correctly (see manylinux). Heck, even Go, the standard bearer for "static binaries FTW", runs into this problem with cgo on Linux.
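The "no older glibc" rule can be written down in a few lines of Python. This is only a sketch of the rule as stated above, with hypothetical helper names; it ignores everything else that manylinux checks. `platform.libc_ver()` is a real standard-library call, but it may return empty strings on systems where it cannot identify the C library.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.17' into a comparable tuple (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def can_run_on(built_against, target_glibc):
    """A binary built against one glibc runs only on the same or a newer glibc."""
    return parse_version(target_glibc) >= parse_version(built_against)


def local_libc():
    """Report the C library of the running interpreter, e.g. ('glibc', '2.31').

    May be ('', '') when the library cannot be identified.
    """
    return platform.libc_ver()
```

Comparing parsed tuples rather than raw strings matters here: as strings, "2.9" sorts after "2.17", which would get the rule backwards.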

This is one of the compelling reasons to use musl: unlike glibc, musl does support static linking as a first-class feature. Of course, it does this at the cost of the dynamic platform features that are the reason glibc doesn't like being statically linked. I say all this just to say that while static Swift builds are a really great idea, we're quite a long way from supporting them at this time.
