Implementing Parts of the Swift Compiler in Swift

Hi all,

In the past few years, some components of the Swift compiler have started being implemented in Swift, including:

  • The new Swift Driver
  • Regular expression literals
  • The new SIL optimization passes

All of these components are optional for one reason or another. The new Swift Driver is optional because we are still maintaining the existing C++ Driver implementation, which can be used for building a compiler with a host that doesn’t support Swift. Regular expression literals and the new SIL optimization passes are optional because we can build a compiler without them, then use that compiler to build a new compiler with them. All of this means that it is still possible to build a (mostly) working Swift compiler on a host where there is no existing Swift compiler, using the host C++ compiler.

I propose that we start requiring an existing Swift compiler to build the Swift compiler. This opens the door to implementing non-optional (mandatory) parts of the compiler in Swift.

Requirements for mandatory Swift code in the compiler

For a mandatory part of the compiler to be implemented in Swift, it must:

  • Build with CMake, which will be used when building the compiler.
  • Be part of a SwiftPM package, to allow a package-based workflow for most development.
  • Build with the current main compiler, release branch, and all Swift releases shipped in the last 12 months. For example, at the time of this writing, main will become Swift 5.8, and the release branch is Swift 5.7. A mandatory part of the compiler implemented in Swift would have to build with Swift 5.5, 5.6, 5.7, and 5.8. This gives ample time for anyone who wants to work with the Swift compiler to update their host tools (possibly building newer versions) without having to go through a multi-stage bring-up.
  • Deploy back to the Swift 5.1 runtime. This would allow Swift Concurrency to be used in the code base (via the back-deployment libraries), as well as opaque result types. However, it would be a significant bump in requirements for using the compiler on macOS: currently, the compiler can run on macOS 10.9 or newer. This would move that requirement to macOS 10.15. Other platforms are unaffected because they don't ship Swift as part of the OS.
  • Support cross-compiled builds to other host architectures. For example, this means that the code base must be free of #if os(...) checks (and similar) that conflate the host and target environments. Once we accept that Swift code can be a mandatory part of the compiler, bringing up a new host environment means cross-compiling all of the compiler’s code.
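As a rough illustration of the 12-month rule in the list above, the following sketch computes which host compiler versions a mandatory component would need to build with. The release dates are approximate, and the `required_host_versions` helper, the `as_of` date, and the 365-day reading of "12 months" are assumptions made for illustration, not part of the proposal.

```python
from datetime import date, timedelta

# Approximate release dates, used only for illustration.
RELEASES = {
    "5.4": date(2021, 4, 26),
    "5.5": date(2021, 9, 20),
    "5.6": date(2022, 3, 14),
}

# Branches that have not shipped yet (release branch and main) are always in scope.
UNRELEASED = ["5.7", "5.8"]


def version_key(version):
    """Sort key that orders "5.10" after "5.9"."""
    return tuple(int(part) for part in version.split("."))


def required_host_versions(as_of, window_days=365):
    """Return versions released within the window, plus the unreleased branches."""
    cutoff = as_of - timedelta(days=window_days)
    recent = [v for v, released in RELEASES.items() if released >= cutoff]
    return sorted(recent + UNRELEASED, key=version_key)


# Hypothetical "time of this writing" date.
print(required_host_versions(date(2022, 8, 1)))
```

With these assumed dates, Swift 5.4 falls outside the window while 5.5 through 5.8 remain in it, which matches the example given in the list.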

Use cases for mandatory Swift code in the compiler

The first few candidates for mandatory Swift code in the compiler are:

  • The new Swift Driver implementation. The new driver is a standalone replacement for the C++ driver. We can stop building the C++ driver executable, and instead make the new Swift Driver mandatory. As follow-up work, getSingleFrontendInvocationFromDriverArguments can be reimplemented by using the Swift Driver library, allowing the C++ driver to be removed entirely.
  • Regular expression literals, which could be enabled all of the time instead of conditionally. This is mostly a simplification to ensure that the language dialect doesn’t depend on how the compiler was built.
  • Various uses of gyb and tablegen for code generation could be replaced with Swift code in the build process.

Under this proposal, the new SIL optimization passes are not good candidates for becoming mandatory Swift code. These passes, and indeed the use of SIL instructions from Swift, are being used as testbeds for Swift/C++ Interoperability, so this code cannot become mandatory until Swift/C++ interoperability has been stabilized in a release that is older than the 12-month cutoff.

Concrete Build Process

Here’s a proposed build process for the Swift compiler with Swift code in it:

  1. Build C++ bits with the host C++ compiler
  2. Build mandatory Swift bits with the host Swift compiler
  3. Link a “minimal stage 1” Swift compiler
  4. Build optional Swift bits with the minimal stage 1 compiler. Note that these bits may not be fully optimized because the stage 1 compiler may lack some optimizer passes.
  5. Link a “full stage 2” Swift compiler
  6. Rebuild optional Swift bits with the stage 2 compiler.
  7. Link a “final stage 3” Swift compiler

The above does create a productivity risk: The optional bits get built twice, and this happens any time the stage 1 compiler changes. In practice, this risks slowing down developers who must wait for additional build steps to get a fully optimized result. Any developer working on the optimizer will need at least the stage 2 compiler; developers interested in compiler performance will need to work with the stage 3 compiler. A workaround would be to have a separate build mode that builds the optional bits with the host compiler; that would provide faster build turnaround for those cases where the host compiler can be fully up-to-date.

Also note: it is possible to do the above with dependency-based build systems (such as Make or Ninja), but it’s tricky to get right. A naive version would have the optional bits depend on both the stage 1 and stage 2 compilers.
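To make the dependency problem concrete, here is a minimal sketch that models the seven build steps as a graph and asks Python's standard `graphlib` module (Python 3.9 or newer) for a valid order. The step names and both graphs are hypothetical. The "naive" graph is only one way such a setup could go wrong, not a description of any real build configuration.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the set of steps it depends on.
# The optional bits are modeled as two separate steps, one per stage.
STAGED = {
    "cxx_bits": set(),
    "mandatory_swift_bits": set(),  # built with the host Swift compiler
    "stage1_compiler": {"cxx_bits", "mandatory_swift_bits"},
    "optional_bits_stage1": {"stage1_compiler"},
    "stage2_compiler": {"cxx_bits", "mandatory_swift_bits", "optional_bits_stage1"},
    "optional_bits_stage2": {"stage2_compiler"},
    "stage3_compiler": {"cxx_bits", "mandatory_swift_bits", "optional_bits_stage2"},
}

# Naive variant: a single "optional_bits" step that depends on both compilers,
# while the stage 2 compiler also depends on it.
NAIVE = {
    "cxx_bits": set(),
    "mandatory_swift_bits": set(),
    "stage1_compiler": {"cxx_bits", "mandatory_swift_bits"},
    "optional_bits": {"stage1_compiler", "stage2_compiler"},
    "stage2_compiler": {"cxx_bits", "mandatory_swift_bits", "optional_bits"},
    "stage3_compiler": {"cxx_bits", "mandatory_swift_bits", "optional_bits"},
}


def build_order(graph):
    """Return one valid build order, dependencies first."""
    return list(TopologicalSorter(graph).static_order())


def has_cycle(graph):
    """Report whether the graph cannot be ordered at all."""
    try:
        build_order(graph)
    except CycleError:
        return True
    return False


print(build_order(STAGED))
print("naive graph has a cycle:", has_cycle(NAIVE))
```

Splitting the optional bits into one step per stage keeps the graph acyclic, at the cost of building them twice, which is the productivity risk described above.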

There are two cross-compiling scenarios to consider:

  • Initial platform bringup: For a new platform, someone will have to cross-build a stage 1 compiler one time and make it available. That is sufficient for everyone involved to do native builds from there on.
  • Platforms that are unable to build the compiler natively. This includes targets like Raspberry Pi that are capable of running a Swift compiler but not necessarily building it. These will require a different build process, as the stage 1 compiler above would be for the target platform, not the host.
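One possible shape for that different build process is sketched below: when the host and target differ, the stage 1 compiler cannot run on the build machine, so the optional bits fall back to the host Swift compiler. This fallback is an assumption borrowed from the "build the optional bits with the host compiler" workaround mentioned earlier. The function name and the example triples are hypothetical.

```python
def compiler_for_optional_bits(host_triple, target_triple):
    """Pick which compiler builds the optional Swift bits.

    host_triple:   the machine doing the build
    target_triple: the machine the resulting compiler will run on
    """
    if host_triple == target_triple:
        # Native build: the freshly linked stage 1 compiler can run here.
        return "stage1"
    # Cross build: stage 1 is a target binary, so it cannot run on the host.
    return "host"


print(compiler_for_optional_bits("x86_64-unknown-linux-gnu", "x86_64-unknown-linux-gnu"))
print(compiler_for_optional_bits("x86_64-unknown-linux-gnu", "aarch64-unknown-linux-gnu"))
```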

Continuous Integration

To ensure that the compiler build succeeds with older compilers, we will need to bring up additional CI to build the main compiler with all of the supported host compiler versions, e.g., Swift 5.5, 5.6, and 5.7. As new versions of Swift are released, we can drop the CI jobs for older versions when adding the new one. For example, once a new version is released (say, 5.8), the oldest compiler can be removed (e.g., 5.5).
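The rotation of CI jobs can be pictured as a fixed-size window over released versions. The sketch below is illustrative only: the `add_release` helper and the fixed window of three jobs are assumptions, and the actual rule in the proposal is based on the 12-month cutoff rather than a job count.

```python
from collections import deque


def add_release(ci_versions, new_version, max_jobs=3):
    """Return the list of host-compiler CI jobs after a new release ships."""
    window = deque(ci_versions, maxlen=max_jobs)
    window.append(new_version)  # a full deque drops its oldest entry
    return list(window)


print(add_release(["5.5", "5.6", "5.7"], "5.8"))
```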

Personally, I'm excited to open the door to having more Swift code in the compiler, but I want to make sure we're doing so in a way that doesn't make it unduly complicated to develop the Swift compiler or port to other host architectures. Thoughts?

Doug

Which side is responsible for testing? Do we need to teach lit to work with XCTest, or SwiftPM to work with lit?

Will there be performance test cases for the portions of the compiler that are migrated?

The latter, I think, because we'll want to be able to initiate testing via SwiftPM and make use of its integration in Xcode/VS Code. With the Swift Driver, we did a bit of both: all of its new tests are written in XCTest, but we have a way to swap in that driver to run the lit tests.

We evaluate the performance of any new component, regardless of implementation language. However, I don't think that belongs in this proposal; it's part of the normal development flow already.

Doug

I think it sounds like a great direction but:

I’ve never done a cross-compile bring-up, but my only concern is that the bar for someone who wants to bring Swift to a new platform will be even higher. It seems this would require great documentation and concrete examples so that it isn't completely inaccessible to most people.

There are already other bits, such as libdispatch, that cause friction even for the major “other” platform (Linux) due to Mach dependencies, AFAICT. One example is the lack of support for the new concurrent executor (there is a big PR that got stuck last year that may include it), but I digress.

It’s likely still worth it, as the people working on the compiler surely deserve better tools; that's good for future progress and very reasonable.

One simpler, concrete question: would this affect how one would, e.g., support a new version of Ubuntu, or would that require setting up cross-compilation? Given the “we build everything from source” mentality of the Linux world, it’s unclear to me whether a Swift toolchain built for an older distribution would run on a newer one (I assume that’s not always the case, given that we build for multiple versions and distribute them separately at swift.org).

Perhaps it's obvious to most how that would work, but I just wanted to clarify. Thanks.

I’m not sure how this will affect Windows, where we have a Swift toolchain, but it is not capable of bootstrapping at the moment.

Introducing mandatory Swift code certainly raises the bar for bringing Swift onto new platforms, unless we have a pure C++ “stage 0” compiler that can be used for compiling stage 1 pieces, so that no cross-compilation is required.

It raises the bar for a new host platform, yes.

That's a separate issue, unless the compiler were to depend on Dispatch itself.

I don't know what you're referring to, but I suspect it belongs in a different thread.

Amongst different Linux versions, cross-compiling should be quite simple. Compilers are mostly self-contained, and we could statically link executables to make them more portable. Indeed, we might even want to do this anyway to make it easier to get Swift compilers on Linux.

It'll take some work, but we've been talking through the steps, and it's manageable.

This proposal specifically plans to not have a "pure C++" Swift compiler any more.

To be clear, cross-compiling compilers is a normal thing done throughout the industry. We want to make sure we do it well, but it's not something to fear.

Doug

I love it, but there may be a performance challenge down the line.
I wonder where we are on Apple's engineers all using Xcode?

I'm sure there is awareness of this but I'd be wary of a few points that you suggest that would make it inordinately difficult to do platform bootstrap not just on conventional systems but on more esoteric platforms.

Specifically, the dependency graph needs to be considered. Since Swift concurrency features depend on Dispatch, this means that bringup critically requires Dispatch when it hasn't before. Dispatch either requires some degree of tight kernel coupling to implement, or at the very least, requires a new Dispatch backend to be written using portable primitives (that would not be very performant, but at least wouldn't be a strong blocker to getting the basics of a Swift toolchain bootstrapped).

I'd strongly advise against that, because primarily SPM depends on Dispatch and Foundation, which have the same problems as mentioned above.

Overall, I don't disagree with the overall goal here, so this post is just a reminder that we can make the process of bootstrap extremely difficult if we aren't mindful enough of what we are suggesting be mandated.

That's not currently the case: there's already an implementation of single-threaded cooperative concurrency that we use when targeting WebAssembly, where Dispatch is not available.

Oh, neat! Maybe I've missed that since I last looked. The points about SPM still stand, though, I believe.

+1 - I mentioned in the previous thread that I didn't think the heroic efforts to enable bootstrapping were worth the cost. Swift's availability is better than it has ever been - I'm not sure that getting access to a machine which can build a port of the compiler, even if it is written in Swift, is actually a real barrier to supporting more platforms.

I would also like to recommend that the project start adopting swift-format for the large amount of new Swift code. One thing that I find a bit annoying about the standard library is that it isn't automatically formatted. DocC support would also be great, and its improved organisational capabilities could really help make the compiler more approachable.

Finally, I mention WebAssembly every time this comes up. Apparently it is a very simple bitcode and there are portable interpreters for just about every platform under the sun, and some allow compiling the files to native code ahead-of-time for better performance. Perhaps, on platforms where an existing "full" compiler is not available, something like that would make a reasonable "stage 1"/minimal compiler. I think we're talking about a real fallback scenario here; where you can't even access an x86/arm linux box to have a build machine with a native compiler for your initial bootstrap. It's important to support that case, but I wonder again whether this much complexity will actually amount to a tangible gain over something which could be simpler and have broader benefits for Swift's cross-platform story.

I'd strongly advise against that, because primarily SPM depends on Dispatch and Foundation, which have the same problems as mentioned above.

Maybe you missed the bullet point prior to that, where he says mandatory Swift packages in the compiler must build with CMake too. The idea is that you use SPM to build those compiler packages on one of the platforms that already has a working Swift toolchain, not that you use SPM running on a new platform to compile parts of the compiler for that new platform. Frankly, even when cross-compiling from an existing Swift platform to a new platform, I found SPM to be better at it than CMake, though that may be partially because I can't stand the CMake build language.

Under this proposal, the new SIL optimization passes are not good candidates for becoming mandatory Swift code. These passes, and indeed the use of SIL instructions from Swift, are being used as testbeds for Swift/C++ Interoperability, so this code cannot become mandatory until Swift/C++ interoperability has been stabilized in a release that is older than the 12-month cutoff.

@Douglas_Gregor, I was surprised by this: will the mandatory Swift code only use C interop to call the existing LLVM/Swift functions from that C++ codebase? I've certainly run into an issue with that experimental C++ interop, so avoiding it for now makes sense.

Totally makes sense! Using WebAssembly or Docker should provide a nice start for initial support, and I hope we can have a detailed guide on this topic.

I'm not sure what that would look like, e.g., if we build a SPM package with CMake, then why use SPM? Maybe we need a more detailed proposal...

Porting a compiler to a new platform is not something lots of people should be doing. It is an occasional task, generally done by people who are expert with both the compiler and the target platform and its SDK. Almost all compiler development should continue to be local development.

In particular, a new version of an existing platform should not require cross-compiling the compiler. A new version of Ubuntu (for instance) should be able to run binaries (including compilers) for the old version. That means that people working on Swift for Ubuntu should never need to cross-build any part of Swift, since it already runs on Ubuntu.

Tim

if we build a SPM package with CMake, then why use SPM?

I think the idea is that the CMake support maintains the current build process, while the new SPM build support for building Swift portions of the compiler is another option for those who are building on a host that has a full Swift toolchain. People may find it more convenient if they then want to use those compiler packages as libraries in external Swift projects and so on.

When I tried to do something with the Swift compiler, CMake utterly turned me off, FWIW. I was shocked that Swift didn't just use Xcode. Generating Xcode project files with CMake didn't work either at the time.

I'd do everything possible to get rid of CMake. What a revolting kluge it is.

Right, and this is the problem I want to highlight: to get that "full Swift toolchain" to properly bootstrap would require bringing up Dispatch and Foundation (and llbuild! and swift-crypto! and...). That is a lot of work relative to just bringing up Swift as it stands today.

I'd do everything possible to get rid of CMake. What a revolting kluge it is.

I agree, but it appears to be the de facto standard for cross-platform C++ projects, so the Swift toolchain uses it heavily. It is not realistic to migrate off of CMake any time soon, but we can certainly start going in that direction now.