RFC: Building Swift Packages in build-script

blangmuir · December 18, 2018, 6:05pm

Hi all,

We have a number of SwiftPM packages (swift-syntax, sourcekitd stress tester, and now SourceKit-LSP) that we want to be able to build, test and install along with the rest of the Swift toolchain in continuous integration and packaging infrastructure. I started looking into this to add sourcekit-lsp to Swift CI, and I ran into a number of difficulties. The key challenge is that we want to build and run these packages using the just-built toolchain, including the compiler, package manager, and corelibs, but our tooling is not set up well for that. The rest of this post dives into what the problems are and how I propose we tackle them.

Thanks to @Michael_Gottesman, @compnerd, @Rostepher, and @mishal_shah for their feedback as I've tried to figure out these issues. None of them have seen this proposal though, so any bad ideas are mine.

Motivation

We want to be able to build, test and install SwiftPM packages along with the rest of the Swift toolchain. This needs to build and run using the just-built compiler, package manager and corelibs for a variety of reasons.

swift-syntax is a Swift library and we don't have Module Stability yet, so it must be compiled with the same swift compiler that it will be used with
swift-syntax generates syntax node definitions as part of its build process using information from the compiler repo, and must match the language syntax of the compiler in the toolchain
All of these packages are designed to interoperate with other parts of the toolchain (e.g. sourcekit-lsp loads sourcekitd), and therefore need to test against the just-built tools to prevent shipping broken toolchains.

In addition to those hard requirements, there are other practical benefits to using a single toolchain for the whole process:

Avoid needing to statically link corelibs/stdlib, which we would otherwise need until ABI stability on Linux
Avoid needing a Swift toolchain in order to build Swift

Further, it is important that building these SwiftPM packages happens as part of build-script, because much of our infrastructure is built on this tool, and it's what we've been teaching Swift developers to use for reproducing issues (e.g. build-script --preset=X to reproduce a CI failure).

The existing build-script[-impl] and cmake build system is not set up well for building SwiftPM packages. We currently build swift-syntax and the stress tester from build-script-impl by passing paths of individual components from the build directories (e.g. swiftc, swift-build, and swift-test executables) to the build scripts for the package, and this only works on macOS right now. I experimented with extending this approach to work for Linux (i.e. handle the corelibs) and I believe it is unmaintainable to duplicate so much knowledge of how the toolchain is composed (see Alternatives Considered; the concerns I have about building a symlink toolchain are basically the same here).

Another concern is that this method of building looks very little like what a developer would normally do to build a SwiftPM package, because it requires passing a large number of additional search paths and run paths. Ideally we want building a Swift package with build-script to be equivalent to selecting a toolchain and then invoking swift build. A toolchain contains most of the components needed to build and test our packages. Each component (e.g. the Foundation swiftmodule) is installed to a known location that allows the compiler and package manager to find them. With build-script today, all of the components are spread across several build directories, each with their own ad hoc layout, and only come together if installed. Without installing, it is brittle to get the complete set of compiler and run path arguments that will find all of the corelibs, swiftpm runtimes, etc. (see Alternatives Considered).

Finally, while I limit this proposal to what is needed to build SwiftPM packages, it's worth thinking about how a solution could be extended to help building other parts of the toolchain. In particular, the bootstrap script in SwiftPM itself is solving some of the same problems.

Proposal Summary

During a build-script invocation that will build one of the SwiftPM packages, perform an install of swift, swiftpm, and the corelibs to a toolchain directory inside the build directory.
Drive the build of SwiftPM packages from build-script (Python), passing the toolchain path to their respective builds.
Turn on the --no-legacy-impl option to build-script, making build-script-impl perform a single operation at a time (e.g. macosx-swift-install) so that we can control the overall build and install process from build-script, and enable inserting additional install steps in between testing and the final packaging.

Details

I propose that we install the toolchain during the build so that SwiftPM packages such as SourceKit-LSP can build directly against the installed toolchain rather than try to use the individual components spread across several build directories. Any build-script invocation that requests building one of the SwiftPM packages would perform an install of swift, stdlib, overlay modules, and corelibs (dispatch, foundation, and xctest) as necessary. The installation would go into a per-platform subdirectory of the build directory (e.g. toolchain-macosx-x86_64) by default. Builds that did not include one of these SwiftPM packages would not need the additional install step.
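To make the directory naming and the "only when needed" rule concrete, here is a minimal Python sketch. The function names and the product names in SWIFTPM_PRODUCTS are hypothetical and chosen for illustration; they are not existing build-script identifiers.

```python
import os

# Hypothetical product names for the SwiftPM-based packages discussed here.
SWIFTPM_PRODUCTS = {"swiftsyntax", "skstresstester", "sourcekit-lsp"}


def toolchain_dir(build_root, platform, arch):
    """Return the per-platform toolchain directory inside the build directory."""
    return os.path.join(build_root, "toolchain-{}-{}".format(platform, arch))


def needs_toolchain_install(requested_products):
    """The extra install step is only needed if a SwiftPM package is requested."""
    return bool(SWIFTPM_PRODUCTS & set(requested_products))
```

With these assumptions, a request for llvm and swift alone would skip the toolchain install, while adding sourcekit-lsp to the request would trigger it.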

When building the SwiftPM packages, we would pass the path to this toolchain, allowing us to invoke swift build mostly like a desktop build. At least for now, the packages are likely to still need a lightweight build script of their own in order to support install actions that aren't directly part of SwiftPM. However, they shouldn't need to pass a large number of custom search paths for finding the corelibs or standard library.
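As a rough illustration of what "mostly like a desktop build" could look like from the Python side, the sketch below assembles a swift build command line from a toolchain path. The helper name is hypothetical, and the usr/bin layout inside the toolchain directory is an assumption about how the install is configured.

```python
import os


def swift_build_command(toolchain_path, package_path, configuration="release"):
    """Build the argument list for `swift build` using the given toolchain.

    Assumes the toolchain was installed with a `usr/bin` layout (an
    assumption for this sketch, not something build-script guarantees).
    """
    swift = os.path.join(toolchain_path, "usr", "bin", "swift")
    return [
        swift, "build",
        "--package-path", package_path,
        "--configuration", configuration,
    ]
```

The returned list could be handed to subprocess.check_call. The point is that no extra -I/-L search paths appear, because the compiler and package manager find the corelibs relative to the toolchain they were launched from.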

To perform the new installation step for the toolchain, we need to modify build-script[-impl]. I propose that instead of baking the changes into the build-script-impl Bash script, we drive this process from the build-script Python code. There seems to be general agreement that in the long run we want to move away from the monolithic and difficult to maintain build-script-impl towards build-script on one end, and cmake/swiftpm on the other. Today, the entire build + test + install + package process happens in a single monolithic invocation of build-script-impl. In order to drive the new installation and swift package builds from the python code, we need to split that up.

To split up the build-script-impl process, I propose that we fix up and enable the --no-legacy-impl option that was previously added to build-script. This makes build-script-impl execute only a single action at a time - e.g. macosx-swift-install, which allows the python code in build-script to drive the overall build, including adding the additional install steps for the toolchain, and performing the final packaging steps (dsymutil, codesign, etc.) after building and installing the SwiftPM packages. Using --no-legacy-impl also helps us get closer to killing build-script-impl in the long run by allowing us to factor the actions out individually.

The way --no-legacy-impl works is that build-script-impl is invoked multiple times - once for each action, such as build-llvm, build-swift, build-foundation, test-swift, test-foundation, etc. Each invocation is constrained to only perform a single action, but there is still some overhead from executing the script multiple times. I believe that the reason --no-legacy-impl was not turned on originally is the performance impact on the null build. I have a WIP patch to fix and enable the no-legacy-impl option in apple/swift Pull Request #21020 ("DO NOT MERGE [build-script] Flip the default value of legacy_impl"), which includes an optimization to help with the null build time. For a null build of just llvm+swift I measured approximately 0.5 second slowdown on Linux (from 1.6 s to 2.1 s) and ~1 second on macOS (from 5 s to 6 s). For anything other than a null or nearly null build this is a rounding error, and if you build any components that come after swift in the pipeline (e.g. corelibs) the build seems to always be non-incremental anyway.
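The one-action-at-a-time model can be sketched in Python roughly as follows. This is not the actual build-script code: the function names, the host-product-phase naming scheme (modeled on the macosx-swift-install example), and the extra_steps hook are assumptions made for illustration. The run_action callback stands in for a single invocation of build-script-impl.

```python
def plan_actions(host, products, phases=("build", "test", "install")):
    """List one action per (phase, product) pair, e.g. 'macosx-swift-install'."""
    return ["{}-{}-{}".format(host, product, phase)
            for phase in phases
            for product in products]


def run_plan(actions, run_action, extra_steps=None):
    """Run each action separately, then any extra steps registered after it.

    `run_action` is called once per action (standing in for one
    build-script-impl invocation); `extra_steps` maps an action name to
    callables that the driver runs right after that action.
    """
    extra_steps = extra_steps or {}
    for action in actions:
        run_action(action)
        for step in extra_steps.get(action, []):
            step()
```

Because the driver owns the loop, a step such as "install the toolchain, then build the SwiftPM packages" can be registered after macosx-swift-install without touching the Bash script.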

It's worth pointing out that enabling --no-legacy-impl changes the assumptions you can make if you are working on build-script-impl. Because it skips executing code for all but one action, you cannot rely on global variables defined in one action to be available in a "later" action - for example, a variable set during a build action will not be automatically set when running the test or install action. In many ways this was already true, because of options like --skip-build and --skip-test, and even where it works it's still mostly spaghetti code. However, I think that because of this difference in behaviour, we should commit to one model or the other. So if we adopt --no-legacy-impl, I propose we do so across the board.

While I am only proposing migrating swift-syntax, the stress tester and SourceKit-LSP to use the temporary toolchain approach proposed above, I think this could also be extended to simplify other parts of our build in the future. For example, building SwiftPM against a toolchain would allow us to simplify some of the ugly parts of its bootstrap script (in particular, the way it hacks up a toolchain symlink tree). In Alternatives Considered below I talk a bit about the problems with the existing symlink tree.

Alternatives Considered

Build on top of SwiftPM's bootstrap script, or factor something like it into a standalone tool

As mentioned above, SwiftPM has to solve some of the same problems in its bootstrap script when building under build-script. The solution that is used there is to pass a number of command-line arguments that specify the paths to the build and source directories of swift, libdispatch, foundation, xctest, etc. Within the bootstrap script, these components are symlinked into a fake toolchain. The fake toolchain partially matches the layout of a real toolchain, but it is not identical, and in order to work around the differences, SwiftPM's bootstrap script passes some additional search paths during the build. Moreover, the script needs to bake in assumptions about the layout of the sources and build directories of its dependencies, and how those map to the final installed locations. Finally, the way SwiftPM itself is built and run requires passing in locations of the swiftpm runtime libraries, which is a necessary complexity for bootstrapping SwiftPM, but creates additional headaches for any tools that want to build on top of the bootstrapped SwiftPM.

I prototyped a more complete version of the symlinked toolchain approach (https://raw.githubusercontent.com/benlangmuir/indexstore-db/build-toolchain-swift-package/Utilities/build-toolchain-swift-package.py). The fake toolchain built by this script was sufficient to create a toolchain that could build and test SourceKit-LSP without passing additional -I/-L/etc. paths during the build. However, this experience revealed to me how tightly coupled the symlinked toolchain script ends up being to the layout of the build directories. For an egregious example, see how CoreFoundation headers and module map are handled.

Ultimately I believe this approach is unmaintainable. To make something like this work, I think the knowledge of how to lay out (install) the toolchain needs to come from the individual projects themselves.

Have the Build System provide the necessary configuration for each step

When we build swift we get information about how to find the llvm components from source/build directories directly within the cmake. Effectively we export the needed configuration as API during the build. This approach is also used to varying extents for building the corelibs. This solves the biggest maintenance issue of the symlink toolchain script I mentioned above, because each project is responsible for telling later projects how to configure themselves to find the necessary libraries and headers.

The biggest problem with extending this approach is that our SwiftPM projects do not use CMake. In order to take advantage of this, we need a way to communicate the configuration (i.e. compiler/linker search paths and run paths) from CMake builds such as swift and the corelibs to the SwiftPM bootstrap, and from both of those to the leaf SwiftPM package projects like SourceKit-LSP. I think that if we wanted to fully standardize on CMake for all our build tooling, this might be a viable approach. However, it would require a substantial upfront investment to (a) teach the corelibs to forward their configuration on to SwiftPM, (b) write a CMake module for SwiftPM packages that the leaf projects could tie into, and (c) add logic to drive all of this from build-script-impl.

jrose · December 18, 2018, 7:02pm

@ddunbar would definitely be interested in seeing this.

If we're going to do something new, can we drop the superfluous target extension? We have the build subdir for that.

blangmuir · December 18, 2018, 7:04pm

I don't understand, how is the target superfluous? This would be a peer directory of swift-macosx-x86_64.

jrose · December 18, 2018, 7:43pm

It's superfluous there too. Cross-compiling should add an additional component to the "Ninja-DebugAsserts+Blah+Blah+Blah", since it can't share any of the build products anyway. See SR-199.

blangmuir · December 18, 2018, 7:53pm

Ah, in that case my understanding is that the target is not yet superfluous since it has not been added to the parent directory name, right? I have build directories with swift-linux next to swift-macosx right now. Moving the target to the parent directory and then dropping it from all the children (including my proposed toolchain directory) seems fine to me, and I'd be happy to adapt to that change, but I see that as orthogonal to what I'm proposing.

Edit: to clarify, I have both of these right now:

build/Ninja-RelWithDebInfoAssert/swift-macosx-x86_64
build/Ninja-RelWithDebInfoAssert/swift-linux-x86_64

after doing a build on my host machine and in docker.

ddunbar · December 18, 2018, 8:19pm

I haven't digested your post yet, but obligatory link to my thoughts in this direction (which remained unimplemented, sadly)

[RFC] Toolchain based build process

blangmuir · December 18, 2018, 9:14pm

Thanks for the link! I think what I'm proposing is mostly a step towards the more extensive changes that you described. Something that surprised me was:

One of the things I've been trying to avoid is having multiple sources of truth about how the install happens. Is there no way for cmake itself to drive whatever we need? Duplicating the install configuration between the cmake install steps and a separate lightweight install within each project is certainly better than doing so outside the project, which is what we're doing to some extent today. On the other hand it still adds a fair bit of work to build this new mechanism and to refactor the existing install so that the two installation mechanisms can share configuration.

Was performance the only motivation for doing this custom installation instead of letting cmake manage it? I haven't measured the install step, but anecdotally it hasn't been an issue for me. That may be workflow dependent, because I very rarely want to incrementally build more than one project at a time - I mostly get away with doing a full build then working only on a single project.

blangmuir · December 19, 2018, 9:28pm

Here's a question I'm running into: what should be the relationship between the toolchain in the build directory and the existing installation step? I've been working from the assumption that they should be separate, since you might want to install slightly different components in each, but is that reasonable? It is possible to perform a second install to a different location and configure different sets of projects to install. However, a key option for installing swift is swift-install-components, which gets baked into the cmake configuration and cannot be altered without a reconfigure. That seems like a non-starter for incremental builds, since they would have to reconfigure again the next time you build.

On the other hand, if we say the toolchain in the build is the one true install location, should we override the user's --install-destdir? Or maybe we could rsync to that location?

blangmuir · January 24, 2019, 5:20pm

This preliminary step is now merged: [build-script] Turn on --no-legacy-impl by default (benlangmuir, Pull Request #21772, apple/swift).