Shipping clang with the Swift toolchains

cc also @Devin_Coughlin, since this would allow us to use sanitizers from toolchains.

1 Like

Shipping clang in the Swift toolchain will solve a long outstanding request for adding support for sanitizers in the toolchain. (Swift and clang use the same sanitizer library, which is version locked with the compilers.)

1 Like

Thanks everybody for the input, this is really valuable and shows that there's a common root to some of the issues we're experiencing.

Yes please, working sanitisers on Linux would be really important for us :slightly_smiling_face:. Shipping compiler-rt and libc++ seems like a no-brainer to me too; this would solve real issues.
@compnerd seems to have had good experiences with this already, and we've been running things in a controlled environment where the compiler and runtime libraries all truly fit together. Again, that made things actually work by construction rather than by accident. @kevints is the expert from our side here.

@ahti obviously has a good point that adding more binaries in 'standard locations' causes issues, but as @tkremenek says, we can put clang and friends in a place that SwiftPM can find and that is safe from causing trouble. This also really makes sense: the reason someone would download the Swift toolchain is obviously to use Swift, but using Swift also means (maybe unknowingly) using clang, the runtime libraries, and the sanitisers. And it makes total sense to have the Swift binaries (swift, swiftc, swift-*) in a more prominent place than, say, clang, even though they're all needed by almost everybody.

2 Likes

I have long agreed we should ship the Clang in the toolchain. I think it is fine to give it an alternate name and teach SwiftPM and other Swift focused things to use that name.

Another good reason to include clang with the toolchain is for Xcode, since currently when you have a Swift toolchain selected, C/ObjC/C++ IDE support significantly degrades until you switch back to the default Xcode toolchain.

6 Likes

Maybe have an equivalent to xcrun that would know how to execute and return paths to clang etc? This would give some redirection to the actual naming and locations, perhaps making those choices less fragile.

3 Likes

I like this idea, something like swift tool clang so you can launch the clang associated with the current swift executable.

Swift already has logic for looking up tools next to itself if they're named "swift-<something>" when it is invoked with "swift <something>" so swift-clang could be a good option.
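A rough sketch of that lookup, written in Python purely for illustration (the function name find_sibling_tool is hypothetical, and the real driver's lookup may differ in detail, for example in how it treats symlinks):

```python
import os
from typing import Optional


def find_sibling_tool(driver_path: str, subcommand: str) -> Optional[str]:
    """Return the path of `swift-<subcommand>` next to the driver, or None.

    Assumption: the tool lives in the same directory as the `swift`
    executable and is named with a `swift-` prefix, as described above.
    """
    bin_dir = os.path.dirname(os.path.abspath(driver_path))
    candidate = os.path.join(bin_dir, f"swift-{subcommand}")
    # Only accept a regular file that the current user may execute.
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None
```

With this shape, `swift clang` would resolve to a `swift-clang` binary shipped inside the toolchain, so nothing named plain `clang` has to appear on the user's PATH.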

4 Likes

I think I'm jumping into cold water in this thread, but can you elaborate on what that would mean in more detail? Is there any chance this would enable the thread sanitizer on all Apple platforms rather than on macOS and simulator targets only? (I really wish that were possible today, because the thread sanitizer is completely useless in our case: we're working with CoreBluetooth on iOS, which is unavailable in the Simulator.)

Oh, I have forgotten about that. Yes, that’s a great option.

Okay, so, since there seems to be quite a bit of interest in this, I am putting up my personal setup: https://github.com/compnerd/toolchain-infrastructure.

This is what I have been using to generate a complete toolchain (replete with support for 2-stage bootstrapped cross-compiling). There are a lot of rough edges still, but, it works great for my purposes (and I welcome patches from others to actually improve this).

I am able to generate a full toolchain for Windows on Linux with this. It expects a monorepo style layout, and requires, in addition to the standard repositories that swift has, lld and clang-tools-extras. It is designed to effectively run llvm.org master for llvm, clang, and lldb with the upstream-with-swift patches merged in (plus local patches). It will run the other repositories with a single master model. For maximal utility, it expects that clang, llvm, and lldb have the "upstream" remote set to llvm.org and the "swift" remote set to swift's repositories. The other non-llvm repositories should have "upstream" set to their upstream, and the llvm ones to llvm's repositories. Everything can be rebased with the rebase.bash script (infrastructure/scripts/rebase.bash).

The repository itself must be checked out as infrastructure.

In order to cross-compile to Windows, you must have a WinSDK directory that contains MSVC/<version> which contains the MSVC content, and SDK/{Include,Lib}/<version> which contains the SDK content.
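As a sanity check before starting a long build, the expected layout can be verified with a few lines. This is only a sketch: the helper name missing_winsdk_parts is hypothetical, and it assumes the MSVC version and the SDK version are passed separately, since they are usually different numbers.

```python
from pathlib import Path
from typing import List


def missing_winsdk_parts(root: str, msvc_version: str, sdk_version: str) -> List[str]:
    """List the expected WinSDK subdirectories that are absent under `root`."""
    base = Path(root)
    expected = [
        base / "MSVC" / msvc_version,            # MSVC content
        base / "SDK" / "Include" / sdk_version,  # SDK headers
        base / "SDK" / "Lib" / sdk_version,      # SDK import libraries
    ]
    # Report paths relative to the WinSDK root, with forward slashes.
    return [p.relative_to(base).as_posix() for p in expected if not p.is_dir()]
```

An empty list means the directory has the shape described above; it says nothing about whether the contents themselves are complete.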

With Makefile symlinked to the top level, you can simply do make toolchain DESTDIR=... to generate the toolchain image. My common invocation is something like:

make DESTDIR=${PWD}/prebuilt/Windows-x86_64/Developer/Toolchains/unknown-Asserts-Default.xctoolchain/usr BuildType=Release Host=Windows-x86_64 toolchain
make DESTDIR=${PWD}/prebuilt/Windows-x86_64/Developer/Toolchains/unknown-Asserts-Default.xctoolchain/usr swift-stdlib-windows
make DESTDIR=${PWD}/prebuilt/Windows-x86_64/Developer/Toolchains/unknown-Asserts-Default.xctoolchain/usr BuildType=Release Host=Windows-x86_64 swift-corelibs-libdispatch swift-corelibs-foundation

That generates a full toolchain that runs on Windows from Linux.

3 Likes

Throw in the ability to download versions of toolchains (including nightlies) and we have ourselves a little clone of rustup, which I'd love to see for easier access to nightlies etc on Linux.

That could also come in handy to ease access to Swift on distributions without a packaged version, at least once all dependencies are compiled into the toolchain (except glibc, I guess :fearful:).

Sounds like something that could even be done as a third-party tool (I believe rustup started as one), provided the toolchains are flexible in their install location etc.

1 Like

Does Linux not support installing multiple versions of an app or library? I thought it did, but I haven’t tried lately.

Regarding libc++, I have a few questions:

  • Would you want to ship the libc++ dynamic library AND the headers, or just the headers? Shipping the dynamic library with the toolchain requires that the libc++abi dynamic library used when running a program is compatible with the libc++abi headers that libc++ (in the toolchain) was built with. This may not be a problem for libc++, however we've had similar problems with libc++abi/libunwind mismatches during the LLVM 7.0 release (https://llvm.org/PR38473). Basically, we were building libc++abi against one set of libunwind headers, but the application was linking against a libunwind.dylib that had been built with newer (and incompatible) headers. This could conceivably happen with libc++/libc++abi. Including only the headers in the toolchain bypasses some of these problems because we always rely on the system's dynamic libraries (and we're careful to always be ABI compatible in the libc++ headers), but it introduces other problems (you can't use features in the headers when they require something in the dynamic library that your system libc++ does not implement yet).

  • Would you also want to ship libc++abi/libunwind in the toolchain, or just libc++?

The discussion of including the full LLVM toolchain suggests that all of libc++, libc++abi and libunwind would be shipped in the toolchain. I believe this would make the most sense, as those three should be consistent for a given release. Otherwise, I would probably suggest not shipping libc++ with the toolchain, but I'm not 100% familiar with how the Swift toolchain is being used so my concerns may not be valid.

So, I packaged Swift for Fedora Linux and I'd be in favor of shipping clang similarly to lldb if it was prefixed with something like swift-. Personally I'd love it if everything was prefixed with "swift" so I don't have to worry about whether the person installing it already has clang and/or lldb installed.

Ron

3 Likes

@ldionne my general thought here would be that we want to build the stdlib (and everything else that compiles against that stdlib) using the same libc++. If the headers are enough to do that (and we get proper errors if we use something we shouldn't) then I think that would be fine.

That being said, I would rather we go for something simple and complete, provided that adding these things does not add too much compile/testing time to our package builds. But again, that is just my opinion.

@ldionne, @Michael_Gottesman,

Personally, I think that I really want to see something slightly different. I would like to see the content in the resource dir to be thinned.

The current approach really doesn't work very well for Linux to be honest. What exactly does the Linux OS mean? Does it mean Ubuntu? Well, that is alright, but that certainly doesn't work on my Linux distribution - exherbo. Android gets away with this because it has specially suffixed everything to deal with this conflict (clang and LLVM treat android as an environment to Linux, similar to how Apple is starting to treat the iOS simulator).

I think that Xcode has a wonderful layout strategy that scales amazingly well. I think that we can easily replicate that across the platforms, and this would only be jarring on Windows, where I think that even Microsoft is trying to move to this model [1], with a slight difference in names. Consider the following layout:

Developer/Toolchains/<vendor>[-...]-<version>/usr/bin/...
          Platforms/ubuntu-arm-14.04/usr/lib/...
          Platforms/ubuntu-x86_64-18.04/usr/lib/...
          Platforms/exherbo/usr/aarch64-unknown-linux-gnu/lib/...
          Platforms/exherbo/usr/x86_64-pc-linux-gnu/lib/...
          Platforms/exherbo/usr/ia64-unknown-linux-gnu/lib/...

This allows us to have multiple toolchains co-installed simultaneously and to switch across versions easily. Additionally, it allows the multiple variants of Linux to be handled cleanly, with the different versions of the libraries kept separate. The top level directory names can be renamed of course. Since exherbo has a co-installed environment, we can flatten the structure there as an example. The exherbo "SDK" would give you the traditional sysroot, while on Ubuntu, the top level directory can separate the architectures and be versioned.
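To make the two flavours of platform directory concrete, here is a small sketch that builds the paths from the layout above. The function names and the root argument are hypothetical, and the vendor/version naming is simplified to a single "<vendor>-<version>" component.

```python
from pathlib import PurePosixPath


def toolchain_bin(root: str, vendor: str, version: str) -> PurePosixPath:
    """Developer/Toolchains/<vendor>-<version>/usr/bin"""
    return (PurePosixPath(root) / "Developer" / "Toolchains"
            / f"{vendor}-{version}" / "usr" / "bin")


def versioned_platform_lib(root: str, distro: str, arch: str, release: str) -> PurePosixPath:
    """Ubuntu style: architecture and release are encoded in the platform name."""
    return (PurePosixPath(root) / "Developer" / "Platforms"
            / f"{distro}-{arch}-{release}" / "usr" / "lib")


def multiarch_platform_lib(root: str, distro: str, triple: str) -> PurePosixPath:
    """exherbo style: one platform directory, one subdirectory per target triple."""
    return (PurePosixPath(root) / "Developer" / "Platforms"
            / distro / "usr" / triple / "lib")
```

For instance, versioned_platform_lib("/opt", "ubuntu", "x86_64", "18.04") yields /opt/Developer/Platforms/ubuntu-x86_64-18.04/usr/lib, while multiarch_platform_lib("/opt", "exherbo", "x86_64-pc-linux-gnu") yields /opt/Developer/Platforms/exherbo/usr/x86_64-pc-linux-gnu/lib.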

Note that if you really want the traditional layout for a single installation, it is just replacing the top of the tree, and you are done, since the toolchain layout is the traditional unix layout. You merge that with the SDK and you get the normal Linux layout. So, really, it is just replacing the root in the image, which seems perfectly reasonable. The toolchain can be relocated to /opt or /usr/local if they so desire.

This also means that we can easily cross-compile to and from the environments.

Also note, I would include the C++ runtime builds even for Windows, as I have already ported libc++ to Windows replete with MS ABI and all. It is effectively a drop in (source level) replacement for Microsoft's C++ runtime without any additional dependencies.

[1] https://blogs.msdn.microsoft.com/vcblog/2017/11/15/side-by-side-minor-version-msvc-toolsets-in-visual-studio-2017/

1 Like

Shipping clang with the Swift toolchains is critical to ensure that the Swift importer works properly. Currently, the system clang on Linux doesn't recognize swift_name attributes, so type names that are refined for Swift will fail to compile at use site on Linux when the same code compiles OK on macOS.

For example, shipping clang would effectively unblock https://github.com/apple/swift-llbuild/pull/404

2 Likes

Sanitizers are not currently supported in the Swift developer toolchains. The problem is that the sanitizer runtime revision needs to be in sync with a compiler revision. When one uses a Swift toolchain, the swift compiler comes from the toolchain and the clang compiler comes from the Xcode toolchain. If we wanted to apply a sanitizer to a mixed source project, we would not know which version of the sanitizer library to use.
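One way to picture the constraint is the sketch below. It is illustrative only, not how the driver decides anything today: the function name is hypothetical, and it assumes both compilers are installed under a <root>/usr/bin directory.

```python
from pathlib import PurePosixPath
from typing import Optional


def shared_toolchain_root(swiftc_path: str, clang_path: str) -> Optional[str]:
    """Return the toolchain root if both compilers share one <root>/usr/bin, else None.

    When this returns None the two compilers come from different toolchains,
    so there is no single sanitizer runtime known to match both of them.
    """
    swift_bin = PurePosixPath(swiftc_path).parent
    clang_bin = PurePosixPath(clang_path).parent
    if swift_bin != clang_bin:
        return None
    # <root>/usr/bin -> <root>
    return str(swift_bin.parent.parent)
```

If clang shipped inside the Swift toolchain, both paths would share a root and the sanitizer runtime under that root would be the unambiguous choice for a mixed source project.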

Unfortunately, availability of Thread Sanitizer on iOS is a separate limitation and would not be solved by this.

1 Like

@Anna_Zaks, it doesn't sound like there are any objections to the idea of just creating a package of the full toolchain with the tools under a directory. This would allow for a simpler model and address this. I've mentioned to @Michael_Gottesman the problem areas for making such a distribution a reality. They primarily involve the component model in the swift build system. It should be addressable, with some work required in CMake and build-script-impl. I think that doing this is going to significantly lower the barrier to entry for working with nightly builds of swift. The obvious question is whether there is anyone who would be able to work on this (I think that it will require some help from Apple to make sure that this doesn't break anything). I believe that @Devin_Coughlin is also interested in this.