Building libicu as part of the Linux Swift build

Hopefully this is the correct category to post this.

Currently on Ubuntu Linux, the Swift toolchain uses the system-provided libicu, which presents two problems:

  1. The version of ICU is quite old on 14.04 and 16.04, which is the source of some bugs in swift-corelibs-foundation. Currently some tests are disabled to prevent CI failures.

  2. The way that libicu is built requires the use of dlopen(), thus making it unusable in a statically linked executable.

I propose building libicu by default as part of the Linux build. This would allow using the latest version, and it could also be built without the use of dlopen().

Ideally I want to build with the version used by Darwin, https://opensource.apple.com/tarballs/ICU/ICU-59152.0.1.tar.gz, which I have had working before. However, if there are reasons that version can't be used, the official version at https://github.com/unicode-org/icu could be used instead.

Does this sound reasonable?

For reference:

ICU versions for Linux/Darwin (from apple/swift-corelibs-foundation pull request #1586, "Add support for .withFractionalSeconds to ISO8601DateFormatter." by armadsen):

macOS 10.12 ships with ICU 57.1
macOS 10.13 ships with ICU 59.1
Ubuntu 14.04 ships with ICU 52.1
Ubuntu 16.04 ships with ICU 55.1
Ubuntu 18.04 ships with ICU 60.2

ICU Jiras:

Static Linking Jiras:

I want to sound a note of caution here: while this would work, it presents risks if you want to link (either directly or transitively) to the system libicu.

If this approach is taken, we need to ensure that the shipped ICU does not expose clashing symbols with the ICU provided by the system. This will likely require mangling them, which may be troublesome.

libicu can be built with symbol mangling (a version suffix appended to each symbol name), and this is how it is currently done on Ubuntu 16.04, e.g.:

$ nm ~/swift-DEVELOPMENT-SNAPSHOT-2018-08-26-a-ubuntu16.04/usr/lib/swift/linux/libswiftCore.so | grep 'U ubrk'
                 U ubrk_close_55
                 U ubrk_following_55
                 U ubrk_open_55
                 U ubrk_preceding_55
                 U ubrk_setText_55
                 U ubrk_setUText_55

and on Ubuntu 18.04:

$ nm swift-4.2-RELEASE-ubuntu18.04/usr/lib/swift/linux/libswiftCore.so | grep 'U ubrk'
                 U ubrk_close_60
                 U ubrk_following_60
                 U ubrk_open_60
                 U ubrk_preceding_60
                 U ubrk_setText_60
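
The version suffix in these listings can be read off mechanically. This is only a minimal sketch, assuming nm output in the format shown above and that the ICU C symbols end in _<major version>; the function name icu_suffixes is made up for illustration:

```shell
# Print the distinct ICU version suffixes found in `nm` output read from stdin.
# Assumption: undefined ICU C symbols look like " U ubrk_open_55", i.e. a name
# starting with "u" and ending in "_<digits>".
icu_suffixes() {
    grep -E ' U u[a-z0-9]*_[A-Za-z0-9_]+_[0-9]+$' \
        | sed -E 's/.*_([0-9]+)$/\1/' \
        | sort -u
}
```

Used as `nm path/to/libswiftCore.so | icu_suffixes`, this would be expected to print 55 for the 16.04 snapshot above and 60 for the 18.04 release. More than one line of output would suggest that two different ICU versions are being referenced.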

In fact, the libxml2 used by swift-corelibs-foundation will still link to the system ICU unless it is recompiled.

It might be possible to define a custom U_ICU_ENTRY_POINT_RENAME macro.

(See unicode/urename.h and unicode/uvernum.h)

I think we've talked about this before, but this is a direction I'd absolutely like to go with the support of the standard library team (so we're all on the same page on using one version of ICU).

+1, this is very much the desired direction for the standard library build on Linux.

I couldn't find a JIRA for it, so I opened: [SR-8876] Build recent ICU on Linux (apple/swift-corelibs-foundation issue #3625)

+1 from me.

This also might help move us to a position where we don't have to create a toolchain per version of Ubuntu. Not sure what other factors are forcing that, but this is a big one.

Huge +1. I've mentioned this in a couple PRs and JIRAs but the Unicode.Scalar.Properties APIs are difficult to trust on Linux without a known version of ICU being built into the standard library. On Apple platforms we can at least make specific properties @available based on OS releases that map to ICU versions, but we don't have that ability for Linux, so there's no way (without something extreme, like checking the ICU version at runtime) to distinguish between "this property is false" vs. "this property isn't supported" in client code.

Personally I would love to have access to libicu even outside of Linux.

What would be the next step if we were to go with Apple's fork? Does it just need a git repo to be created and initialised with the contents of https://opensource.apple.com/tarballs/ICU/ICU-59152.0.1.tar.gz ?

The latest Apple platforms (macOS 10.14, iOS 12, tvOS 12, watchOS 5) are using ICU 62, which isn't available from https://opensource.apple.com/ yet. So that might be a reason to choose http://site.icu-project.org/download or https://github.com/unicode-org/icu.

Apple's fork has some additions, but are any of them needed for stdlib or corelibs?

Has anyone got any updates on this? It looks like having a standard way to compile ICU together with the toolchain would definitely help in unifying the build process for other Linux distributions, the Android toolchain, and the WebAssembly toolchain. While developing the WebAssembly toolchain, I currently clone https://github.com/unicode-org/icu one level above the workspace directory and then link icu/icu4c from that clone to ${WORKSPACE}/icu, but that's not very convenient. It would be great if the utils/update-checkout script could clone or download the correct version of ICU and put it in a place that the current build infrastructure can agree on across all platforms.
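
For anyone wanting to reproduce that workaround, here is a rough sketch of the linking step. The function name link_icu and both directory arguments are hypothetical; it assumes the ICU repository has already been cloned and contains an icu4c subdirectory:

```shell
# Symlink the icu4c subdirectory of an existing ICU clone into a workspace
# directory under the name "icu". Both paths are supplied by the caller.
link_icu() {
    icu_clone="$1"    # e.g. the result of: git clone https://github.com/unicode-org/icu
    workspace="$2"    # the directory the build expects to contain "icu"
    if [ ! -d "$icu_clone/icu4c" ]; then
        echo "error: $icu_clone/icu4c not found" >&2
        return 1
    fi
    # Resolve to an absolute path so the link works from any directory.
    # -f replaces a stale link; -n avoids following an existing link to a directory.
    ln -sfn "$(cd "$icu_clone/icu4c" && pwd)" "$workspace/icu"
}
```

With a clone sitting next to the workspace, this could be called as `link_icu ../icu "$WORKSPACE"`.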

PR: SR-8876: Always build libicu on Linux (apple/swift pull request #19860, by spevans)
Jira: [SR-8876] Build recent ICU on Linux (apple/swift-corelibs-foundation issue #3625)

It's working; it is just waiting on approval to use an extra external library.

Yes, having official Linux Swift packages usable in any distro would be a huge step. Other languages distribute a single package for all Linux distros instead of one per Ubuntu release.

Right, there are multiple possibilities I see here:

  • Bundling libicu allows the Swift compiler to be used on Linux distros that don't have a specific version of libicu installed. But there are still a couple of other dependencies, like Glibc, that won't allow us to run the compiler on musl distros like Alpine, right?
  • Statically linked, self-contained ELF executables compiled from Swift packages. I assume these would link with the bundled libicu? And would they also require some support from SwiftPM? These binaries would still link with Glibc, but statically, which would allow a user to just copy one to an Alpine instance and run it there without a problem, I guess?

Any update on this? Would this allow us to build and use Swift on Alpine?

This was implemented for Swift 5. If you are building on Linux you can add the --libicu option to the build script and it will build libICU as part of the build.

The option has also been added to the buildbot_linux preset, so it will build automatically if that preset is used. This is how CI builds and tests it.
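
As a rough illustration of the first route, a small wrapper could always pass --libicu to the build script. The function name build_with_icu is made up, and the path to build-script depends on where your swift checkout lives:

```shell
# Run the Swift build script with ICU built as part of the build.
# $1 is the path to utils/build-script in your swift checkout (an assumption
# about your layout); any remaining arguments are passed through unchanged.
build_with_icu() {
    build_script="$1"
    shift
    "$build_script" --libicu "$@"
}
```

For example, `build_with_icu ./swift/utils/build-script --release` is equivalent to running `./swift/utils/build-script --libicu --release` directly; the --release flag here is just one choice of build configuration and is not something this thread prescribes.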