Binary dependencies on Linux

Continuing the discussion from [PITCH] Support for binary dependencies:

As discussed in that thread, shipping binary dependencies on Linux is hard. However, given SwiftPM's desire to keep its build system simple, it is probably still something we want to do: many C/C++ libraries cannot be forced into the shape SwiftPM needs in order to compile them.

The best route for an interested party to go down is probably to borrow the hard work of the Python community in terms of manylinux. This was an attempt to define a restrictive subset of the Linux platform that can be safely used as a baseline for shipping compiled dependencies.

The current state of the art for manylinux is the manylinux2010 variant, defined here. Even this very conservative target is probably not sufficient for us, as it uses CentOS 6 as its baseline Linux, which I am not certain Swift currently supports.

For a detailed analysis of manylinux I will refer you all to the PEPs linked above, but here's a rough summary. manylinux notes that the only way to produce binary packages that are truly portable across multiple Linuxes is to restrict the libraries you link against to a tiny set that has a stable ABI across a wide range of Linux versions. To make that range acceptably wide you end up with a very restrictive environment. As of manylinux2010 you may link the following libraries at these specific SONAMEs:

libgcc_s.so.1
libstdc++.so.6
libm.so.6
libdl.so.2
librt.so.1
libc.so.6
libnsl.so.1
libutil.so.1
libpthread.so.0
libresolv.so.2
libX11.so.6
libXext.so.6
libXrender.so.1
libICE.so.6
libSM.so.6
libGL.so.1
libgobject-2.0.so.0
libgthread-2.0.so.0
libglib-2.0.so.0

As some additional limitations, you may only ship binaries compiled for the two architectures supported by CentOS 6: x86_64 and i686, and for libraries with versioned symbols there is a restricted maximum version you may use.
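To make the allowlist idea concrete, here is a minimal sketch of how a binary's dependencies could be checked against it. It assumes you have already captured the text output of `readelf -d` for the binary; the function names (`needed_sonames`, `disallowed_sonames`) are hypothetical and not part of any existing tool, and the sketch does not cover the symbol-version ceiling.

```python
import re

# SONAMEs permitted by manylinux2010, copied from the list above.
MANYLINUX2010_ALLOWED = frozenset({
    "libgcc_s.so.1", "libstdc++.so.6", "libm.so.6", "libdl.so.2",
    "librt.so.1", "libc.so.6", "libnsl.so.1", "libutil.so.1",
    "libpthread.so.0", "libresolv.so.2", "libX11.so.6", "libXext.so.6",
    "libXrender.so.1", "libICE.so.6", "libSM.so.6", "libGL.so.1",
    "libgobject-2.0.so.0", "libgthread-2.0.so.0", "libglib-2.0.so.0",
})

# Matches dynamic-section lines such as:
#   0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_sonames(readelf_dynamic_output):
    """Return the DT_NEEDED SONAMEs found in `readelf -d` text, in order."""
    return _NEEDED.findall(readelf_dynamic_output)


def disallowed_sonames(readelf_dynamic_output, allowed=MANYLINUX2010_ALLOWED):
    """Return the sorted SONAMEs the binary links that are not allowed."""
    return sorted(set(needed_sonames(readelf_dynamic_output)) - allowed)
```

A binary would pass this part of the check when `disallowed_sonames(...)` returns an empty list. Anything else it links would have to be bundled or dropped.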

Naturally, this may appear to the Swift community to be an extremely restrictive environment. That is intentional. In principle we as a community could define a slightly less restrictive one, but in practice it would be very hard to do so in a meaningful way.

An added source of difficulty here is that Swift is pretty happy to link a wide range of libraries itself. Those libraries become something that a manylinux-type distributed binary cannot include within itself, as there is a risk of symbol clash leading to subtle breakage. We would, as a community, need to address how we handle the ABIs for those dependencies as well, or whether it is even possible for binary packages on Linux to use them.

Regardless, if we want to support binary packages on Linux without being extremely fine-grained about how we advertise their supported platforms, this appears to be the only possible strategy.


This is libgcc - which may be substituted with LLVM's compiler-rt equivalent: clang_rt.builtins-*.

This is not standalone. This needs an underlying ABI library. This can be replaced with STLport, LLVM's libc++, or Dinkumware’s implementation (there are others as well). This must also match what the standard library is built against.

These are part of the C library and come from different components. There is no requirement that the math library be split up (so libm is optional). IIRC FreeBSD does this (consider Gentoo/FreeBSD). libdl is similar, and is not needed for dynamic symbol resolution. librt is an extension IIRC, and not always available. This set must again match with the standard library.

This is an interesting choice. Does anyone even use Yellow Pages (I suppose NIS+ is the more modern name) anymore? I also believe that this is an extension.

What is the motivation for the name resolution service here? Is this meant for the DNS resolution or the general NSS mechanism in glibc?

Should X11 be in the set given that even RedHat is moving away from X11 to Wayland?

I think that we could possibly be even more restrictive in Swift, since we are making the call on the set of libraries much later and so we have an idea of where things are currently and where they are moving towards.


I don’t know the motivation behind those choices, but I can find it. In this case I placed the list there mostly to indicate how little is there.

Ah, okay. I'm suggesting that we actually be even more restrictive than that set. However, the interesting libraries are really classes of libraries, each with equivalent implementations that can be selected from. That makes that set less interesting to the Swift project, I think.

So my understanding of the rationale for the original selection is that it was the minimal subset of libraries that a) was available and b) had an actually-observed-to-be-stable ABI across a wide range of Linux distributions. We can certainly afford to be more choosy than that list.


The interesting wrinkle is that I am suggesting that we reduce the list further, but at the same time widen it!

libgcc_s -> libgcc_s/libclang_rt.builtins-*
libstdc++ -> libstdc++/libc++/libstlport/...
libc -> libc/libmusl/...

Basically, we can take a small subset of it which we need, create equivalence classes, and permit a selection in between the equivalence classes.
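As a rough illustration of the equivalence-class idea, the sketch below groups a binary's SONAMEs by class and reports any class from which more than one implementation was picked. The class labels, the function names, and the rule that mixing two members of one class counts as a conflict are my assumptions for illustration. The member names are taken as written in the list above, not verified against the names each project actually installs.

```python
from fnmatch import fnmatchcase

# Equivalence classes as listed above; class labels are hypothetical.
EQUIVALENCE_CLASSES = {
    "compiler runtime": ["libgcc_s", "libclang_rt.builtins-*"],
    "C++ standard library": ["libstdc++", "libc++", "libstlport"],
    "C library": ["libc", "libmusl"],
}


def library_stem(soname):
    """Drop the '.so' suffix and version: 'libstdc++.so.6' -> 'libstdc++'."""
    return soname.split(".so", 1)[0]


def classify(sonames):
    """Return (class -> matching SONAMEs, SONAMEs that match no class)."""
    selected = {name: [] for name in EQUIVALENCE_CLASSES}
    unclassified = []
    for soname in sonames:
        stem = library_stem(soname)
        for name, patterns in EQUIVALENCE_CLASSES.items():
            if any(fnmatchcase(stem, pattern) for pattern in patterns):
                selected[name].append(soname)
                break
        else:
            unclassified.append(soname)
    return selected, unclassified


def conflicts(sonames):
    """Return classes from which more than one implementation was selected."""
    selected, _ = classify(sonames)
    return {
        name: sorted(members)
        for name, members in selected.items()
        if len({library_stem(member) for member in members}) > 1
    }
```

Under these assumptions, a package linking both `libstdc++.so.6` and `libc++.so.1` would be flagged, while one linking only `libgcc_s.so.1` and `libc.so.6` would not.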

I assume the libc equivalence class would be the module currently known as Glibc?

That's a separate issue that needs to be resolved. But yes, let's assume for the time being, for the sake of this conversation, that they would all be covered by Glibc.

Why this doesn't really work

An example of where this really breaks down is glibc vs bionic. They are just completely different implementations. The headers are organised differently, the library setup is different. This means that the modulemap needs to be different. As a first step towards a saner world - bionic and glibc are now separate modulemaps.

Windows took the better approach of naming it MSVCRT, and libSystem, Darwin's C library provider, does so under the Darwin name, so really, it's all kinda wonky. I hope that I will have some time to put up a proposal for this soonish.

I think that is a separate concern, and let's not derail this conversation with the specifics of the C library handling.


Leaving aside questions of whether we can even do that (does musl actually provide a stable ABI? does bionic?), how much do we gain from going down that path, vs standardising on what the majority of Linux distributions actually do?

I think we would also want to support ARM.


We can, but this comes with tradeoffs. Specifically, it multiplies the number of supported architectures (and thus build targets) by at least two (there is more than one "arm" ABI per word size). It also requires us to validate what libraries expose a stable ABI on those platforms.

All of this is do-able, but none of it is trivial.

These are all C libraries - they are about as stable as glibc I would imagine.

Supporting Arm would be important. Many if not most mobile and embedded devices are Arm. It would certainly increase the complexity of the dependency matrix but it seems very much necessary.
