porting to musl (was: building static binaries / reducing library dependencies?)

https://lists.swift.org/pipermail/swift-users/2015-December/000020.html

(Sorry I don't have the reply link handy)

Joe Groff wrote:
Porting to musl libc might be interesting too, but I'm not sure how dependent the core libs are/will be on glibc stuff.

I'm going to do a braindump on this. I spent several hours researching a musl port. I've decided I won't actually do one, but my research may be useful to the next person who picks up the torch.

The real problem with musl is that the Swift ecosystem presumes linux == glibc, and it presumes this very, very, very deeply and very, very broadly. Divorcing those two concepts is going to be very, very radical at this stage, and this comes from a person who makes radical changes for fun.

# Ecosystem woes

Probably the biggest problem is the ecosystem. What people actually do in the Swift modules they write is:

#if os(Linux)
import Glibc
#else
import Darwin
#endif

For example, in this tutorial here <http://blog.krzyzanowskim.com/2015/12/04/swift-package-manager-and-linux-compatible/>. So unless you want to prosecute this "linux is not glibc" patch across every project in the Swift ecosystem you're gonna have a bad time. IMO there should be some better way to handle "pick a libc" in the package manager or something, but I'm not completely sure what it is.

# The problem started in LLVM

The module authors aren't being arbitrarily stupid. They have learned this "linux == glibc" philosophy from the Swift package itself, and ultimately it goes all the way down to LLVM:

// On linux we have a weird situation. The stderr/out/in symbols are both
// macros and global variables because of standards requirements. So, we
// boldly use the EXPLICIT_SYMBOL macro without checking for a #define first.
#if defined(__linux__) and !defined(__ANDROID__)

Whereas what we wanted was:

#if defined(__GLIBC__)

The comment is not quite accurate for musl either, but I'm not enough of a language lawyer to be able to fix it.

Anyway, I have written this patch (attached) which does build llvm with musl, although it breaks glibc builds, so obviously it's not mergeable. There are similar patches floating around, but they're out of date AFAIK. This one is current.

# One does not simply

Research suggests that I am not the first to be vaguely interested in getting musl into the llvm family tree of projects/languages, based on the patches floating around for various components. And some downstream people have actually managed to build various musl-hacked llvms and llvm-backed languages (Alpine's llvm, rust-musl). So why can't we get it done upstream where it belongs?

Well, there is a requirements conflict:

* llvm/clang (and I assume Swift) want some kind of strongish guarantee against borking the behavior on other non-musl systems that work just fine already, thank you
* musl refuses to define a __MUSL__ macro to test with #ifdef, and calls the feature a bug (see the musl libc FAQ)

These both seem reasonable independently but when you put them together it's kind of a heavy rock / immovable object situation.

# tl;dr

* swift at every level from third-party packages to low-level llvm assumes linux == glibc
* a requirements conflict prevents those concepts from getting properly divorced in relevant upstream projects
* solving this requires a tower of hacks even a crazy person like me can't stomach

As a result of all this, I have resigned myself to maintaining the glibc dependency for the foreseeable future.

swift-llvm.patch (6.35 KB)

I can believe LLVM and Clang only build against glibc, but that's not necessarily a problem for compiled executables. Does target codegen for Linux have dependencies on glibc semantics? LLVM supposedly targets Darwin, BSD, glibc, and Windows, among other platforms, and as you noted people have gotten Rust and other LLVM-based languages to target musl already.

At the source level, it would be nice to standardize a portable POSIX module, though that of course is a fairly involved project.

-Joe

On Dec 28, 2015, at 5:11 PM, Drew Crawford <drew@sealedabstract.com> wrote:

[swift-users] building static binaries / reducing library dependencies?

I was attempting to build Swift on a system (specifically, Alpine) which does not have a glibc package.

It may very well be possible to design a cross-compile toolchain to build glibc-free Swift executables from a glibc system. That kind of question is well outside my knowledge.

But that particular rabbit hole does not lead to a Swift package for Alpine, so unfortunately it is of little help for what I am trying to do.

On Dec 28, 2015, at 7:20 PM, Joe Groff <jgroff@apple.com> wrote:

I can believe LLVM and Clang only build against glibc, but that's not necessarily a problem for compiled executables.

According to musl's site, they deliberately don't provide a __MUSL__ macro, in order to encourage maintainers to make their software more POSIX-compliant. I do think they should add one to encourage adoption, but I was wondering: how much does LLVM/Swift depend on glibc? Which glibc-specific features does it use?

FWIW, Alpine Linux does support LLVM, and I don't think it depends on glibc when installed on Alpine. If that's so, I hope Alpine doesn't carry a considerable number of patches on top of upstream LLVM to get this working. If it does, one would need to either cherry-pick those patches into Swift's LLVM fork or make sure they get merged upstream.

FWIW, there are a few counterpoints worth noting.

Swift runs just fine on other platforms whose libc is not glibc (e.g., the BSDs), but the module is still called Glibc there, which is not ideal. There is a pitch to paper over that by pushing all the conditionals into a module called CStdlib, but as you note, adopting it requires a large-scale code change.

Writing #if os(Linux) followed by import Glibc is wrong, however, and most "modern practice" seems to be to use #if canImport(Glibc) instead.
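
To make the canImport approach concrete, here is a minimal sketch, not taken from any particular project. Each branch asks whether a module can be imported instead of asking which OS is being targeted. The Musl branch assumes a module of that name exists on musl-based toolchains, which is the naming hoped for in this thread rather than something confirmed here, and activeLibcModule is a name made up for illustration.

```swift
// Pick the C library module by availability rather than by OS name.
#if canImport(Darwin)
import Darwin
let activeLibcModule = "Darwin"
#elseif canImport(Glibc)
import Glibc
let activeLibcModule = "Glibc"
#elseif canImport(Musl)
// Assumption: a module named Musl, as hoped for in this thread.
import Musl
let activeLibcModule = "Musl"
#else
#error("No known C library module is available on this platform")
#endif

// strlen is standard C, so it resolves whichever branch was taken.
let length = strlen("musl")
print("imported \(activeLibcModule), strlen(\"musl\") = \(length)")
```

A branch naming a module that does not exist simply evaluates to false, so the same file should compile on toolchains that have never heard of Musl. It still only tells you which module name was importable, not which libc actually sits behind it.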

I actually suspect trying to build against musl will probably work reasonably fine. I suspect in the worst case some conditionals might be needed, but it's important to note that we have platform conditionals os() and canImport(), but no immediately reliable way to distinguish whether a module masquerading as Glibc is really musl under the hood or glibc.

I hope that when musl support works, it would make sense to use import Musl (or import musl if possible?).

The last time I tried, a year or two ago, I had trouble building either upstream LLVM or Swift's fork of it on Alpine with musl. The issues were with header search paths and were quite hard to diagnose, at least for me. I hope things have changed since then and it's much easier to build now.

Given that we now use import WASILibc for WASI, whose libc is in fact based on musl, I think import Musl is the right way to go. And yes, we'd better have a CStdlib module, but it should be fine-tuned to include only standard C APIs.
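
As a rough illustration of what "only standard C APIs" could mean, here is a sketch of a shim that keeps the conditional import in one place and exposes a handful of ISO C functions through wrappers. CStdlibShim and its method names are hypothetical and do not come from the pitch, the Musl module name is again an assumption, and a real CStdlib would be a separate module rather than an enum in the same file.

```swift
// Hypothetical shim: the conditional import lives in exactly one place.
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl        // assumed module name, as proposed in this thread
#elseif canImport(WASILibc)
import WASILibc
#else
#error("No known C library module is available on this platform")
#endif

/// Wrappers restricted to functions defined by ISO C, so callers never
/// need to know which libc module was actually imported.
enum CStdlibShim {
    /// Length in bytes of the string's UTF-8 C representation (strlen).
    static func length(ofCString text: String) -> Int {
        return Int(strlen(text))
    }

    /// Parses a leading decimal integer (atoi); yields 0 if none is found.
    static func parseInt(_ text: String) -> Int {
        return Int(atoi(text))
    }

    /// Upper-cases a single ASCII byte (toupper).
    static func uppercasedASCII(_ byte: UInt8) -> UInt8 {
        return UInt8(truncatingIfNeeded: toupper(Int32(byte)))
    }
}

print(CStdlibShim.length(ofCString: "Alpine"))
print(CStdlibShim.parseInt("42"))
```

Code written against the wrappers contains no #if at all, which is the point of the pitch. The cost is that anything outside the standard C subset still needs a platform-specific import somewhere.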

Are there any additional blockers to getting Alpine supported as a Linux distribution? Alpine is usually the distribution I suggest for production use.

In my latest investigation of this a couple of months ago, a freshly built Clang from Apple's fork of the LLVM project repo still assumed that libc and libgcc are available and can be dynamically linked, even on Alpine. I haven't gotten past that, and it prevents building the Swift compiler and runtime together with the core libraries.