[Second review] SE-0387: Cross-Compilation Swift SDKs (previously Destination Bundles)

Canadian Cross should not be supported.
This concept is only necessary for compilers such as GCC, where the target is fixed at the time of its own build.
Such compiler architectures are not adopted today.
For example, swiftc allows a single compiler binary to support multiple targets.
The need for dedicated binaries for WASM and Android support is merely an implementation issue, not an intrinsic design constraint.
Furthermore, for SwiftPM, the task of building a compiler is only a small part of its various uses.
So the Canadian Cross story is doubly rare.
Introducing complex terminology is confusing to users.
This means that the disadvantages are too large and the advantages are too small.

Personally, I think it is better to help the untold number of future readers, even if it means breaking some existing links.

For example, if I'm trying to understand some subtle behaviour of the consume operator by reading its SE proposal, I'll go to the repository, look through the list of proposals, and find nothing. The file is actually called 0366-move-function.md. If I'm looking to understand the details of the SDK format, the file name 0387-cross-compilation-destinations.md is not very obvious.

But I do understand your position as well. Perhaps this points to a more general flaw in how we organise proposals.

Even though LLVM supports multiple targets with the same compiler binary, the default target triple is still set at build time. It defaults to the host triple. And the docs intentionally don’t mention the config variable to set the default target triple, preferring to only mention the host triple instead.

https://www.swift.org/swift-evolution/#?search=consume

i have also been doing github searches, as it never occurred to me that the swift website can also perform proposal searches. this is way better than digging through the swift-evolution repository!

I can agree with not inventing new terminology for this process. However, if that is the case, we should really be using the de facto standard terminology for this, which is GCC's (as @ksluder pointed out). The problem that people have expressed with this is that it is frame relative:

  • build: where the code is built
  • host: where the output is run
  • target: where generated code is run [cross-compilation only]

That is, when building the tools, the host is the machine that will host the compiler, and target is the target that the compiler will generate code for. However, when you build the runtime, the host is the value of the target.
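The frame-relative shift described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Triples` type, the `runtime_triples` helper, and the example triple strings are hypothetical and do not come from the proposal or from any tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triples:
    """Hypothetical container for the three roles in the list above."""
    build: str                    # where the code is built
    host: str                     # where the output is run
    target: Optional[str] = None  # where generated code is run (cross-compilers only)


def runtime_triples(tools: Triples) -> Triples:
    """Derive the triples used when building the runtime for a cross-compiler."""
    if tools.target is None:
        raise ValueError("the tools have no target, so there is no runtime to cross-build")
    # The runtime is built on the same build machine, but its output runs
    # wherever the compiler's generated code runs: host takes the value of target.
    return Triples(build=tools.build, host=tools.target)


# Example triples chosen purely for illustration.
tools = Triples(
    build="x86_64-unknown-linux-gnu",
    host="aarch64-unknown-linux-gnu",
    target="wasm32-unknown-wasi",
)
runtime = runtime_triples(tools)
print(runtime.host)  # prints "wasm32-unknown-wasi"
```

The same word, "host", names two different machines depending on whether the thing being built is the compiler or the runtime, which is the confusion people have raised.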

I am fine with that, and expecting someone who is cross-compiling to have a grasp of the concept is acceptable. We could add documentation for that if desired, and use the terminology build and host generally and add target for cross-compilation scenarios.

I liked the original color of the bike shed. :wink: The historical terminology of "host" and "target" is not as easily understandable as "build-time" and "run-time," and we create basic terminology like this primarily for casual and intermediate devs, not for the experts, who are aware of all the shades of "target." I agree with those that say we should just ignore Canadian Cross for that reason, as those experts are capable of coming up with their own terms.

While I also agree that renaming SPM's use of the target term would make sense, I think we should use "build-time" and "run-time" regardless of whether SPM changes.

This is not true anymore: the vast majority of Swift code is cross-compiled from macOS to iOS. As such, I think it makes sense to update our cross-compilation terminology so as not to confuse new devs with old, less descriptive terms, even if old devs like us are used to the old terms.

First of all thanks to everyone for all of the feedback, this is incredibly useful.

Secondly, I'd like to focus everyone's attention on the nomenclature of the basic cross-compilation scenario with only two triples present. Whether Canadian Cross is kept or not as a separate concept we explicitly support, basic cross-compilation is still the most common use case. When writing the original SE-0387 text we had a consensus for how triples are named in the Canadian Cross scenario. It's how one selects two specific names out of those three for the basic scenario that made it so complicated.

I'd like to ask review participants to explicitly specify what terms they'd like to see for basic cross-compilation, which is what SE-0387 was designed to support in the first place. Canadian Cross is a future direction, and we'd like to have names that play well with that if it's ever supported.

Our main goal in the second review is to find answers to these two questions:

  1. What do you call a triple that the toolchain is running on that produces the code you're cross-compiling?
  2. What do you call a triple on which cross-compiled code will run?

After answering these two questions in this specific review it's easier to talk about more complex scenarios as future directions.

Please use the existing, familiar terminology that Clang and LLVM, technologies underlying and required for compiling Swift, already use: target for the platform triple the artifact was compiled for, and host for the platform triples where the compiler that consumes the artifact will be run.

For that matter, as CMake is required for compiling Swift, why wouldn't we prefer their "build host" terminology? Personally, I don't find this argument convincing: we use multiple underlying technologies that don't seem to be consistent with each other, so why would we prefer one over the other? Just looking at this thread, I've seen people proposing build/host and build/target in addition to your host/target suggestion, and in support of their arguments they all refer to other tools in the ecosystem that use their own nomenclature.

In fact, LLVM itself refers to a "build host" in its cross-compilation docs and says that it tries to follow autoconf (albeit inconsistently):

Also note that LLVM_HOST_TRIPLE specifies the triple of the system that the cross built LLVM is going to run on - the flag is named based on the autoconf build/host/target nomenclature. (This flag implicitly sets other defaults, such as LLVM_DEFAULT_TARGET_TRIPLE.)

You may also want to set the LLVM_NATIVE_TOOL_DIR option - pointing at a directory with prebuilt LLVM tools (llvm-tblgen, clang-tblgen etc) for the build host, allowing you to them reuse them if available.

Because their term for “target” is CMAKE_SYSTEM_NAME, which is extremely ambiguous.

CMake is required for compiling the compiler, but that's not what this proposal is talking about; it's talking about using the compiler to build Swift code for other platforms. So it wouldn't be appropriate to use CMake as an analogy here. And yes, while Swift client code can be built by CMake, it's not the officially supported solution for most Swift users; SPM is.

As far as I'm concerned, the GCC nomenclature shouldn't be on the table here at all; whether or not it's the "standard", it's not what Clang and LLVM, the underlying technologies of Swift, use. There would be far more harm caused by having Swift diverge from its own related technologies than would be caused by simply ignoring terms that aren't directly related to Swift in any fashion.

I’m confused. GCC uses the same autoconf terminology that LLVM uses.

I'm admittedly not as familiar with GCC's nomenclature as I used to be, but I'm referring to this table from the original post:

...which has "host" inverted compared to LLVM/Clang.

Tony expressed all of my feelings better than I could have. I think using the host + target nomenclature is the best option, and ambiguity with target modules is not a large enough reason to diverge from LLVM.

I also don't fully understand the value of special casing the "Canadian Cross". It seems like you could make this cross arbitrarily deep, e.g.:

Machine A            | Machine B             | Machine C                | Machine D
LLVM (A) -> LLVM (B) | LLVM (B) -> Swift (C) | Swift (C) -> Program (D) | Program (D)

In each of these compilation steps, there's a host machine and a target machine. One could add as many layers of cross-compilation stages, but I don't think there's value in naming each machine differently.
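To make the point about arbitrarily deep chains concrete, here is a minimal Python sketch. The `Stage` type and `stages` helper are hypothetical names used only for illustration; the machine names A to D follow the chain above.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    host: str    # machine the compiler for this step runs on
    target: str  # machine the output of this step runs on


def stages(machines: List[str]) -> List[Stage]:
    """Pair each machine with the next one in the chain."""
    return [Stage(host=h, target=t) for h, t in zip(machines, machines[1:])]


chain = stages(["A", "B", "C", "D"])
for step in chain:
    print(f"compiler on {step.host} produces code for {step.target}")
```

However long the list of machines is, each step needs only two names, host and target. The same machine is the target of one step and the host of the next.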

I think the explanation is that @mishal_shah made an error by combining “LLVM/Clang” with “CMake”. LLVM and autotools have the same definition of “host” and “target”. CMake’s “host” (CMAKE_HOST_SYSTEM_NAME) is equivalent to autotools’s “build”, and CMAKE_SYSTEM_NAME is equivalent to LLVM’s “host” (LLVM_HOST_TRIPLE). From CMake’s perspective, LLVM/Clang is itself the product, but since we’re building a compiler, it has its own target (LLVM_DEFAULT_TARGET_TRIPLE).
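The equivalences stated in this post can be written down as a small lookup table. This is a sketch only: the `ROLES` table and `translate` helper are hypothetical, and a `None` entry just means this post does not name a variable for that role, not that none exists.

```python
from typing import Optional

# One row per role; each column is the name a given tool uses for it,
# as described in the post above.
ROLES = [
    {"autotools": "build",  "cmake": "CMAKE_HOST_SYSTEM_NAME", "llvm": None},
    {"autotools": "host",   "cmake": "CMAKE_SYSTEM_NAME",      "llvm": "LLVM_HOST_TRIPLE"},
    {"autotools": "target", "cmake": None,                     "llvm": "LLVM_DEFAULT_TARGET_TRIPLE"},
]


def translate(name: str, source: str, dest: str) -> Optional[str]:
    """Translate a role name from one tool's vocabulary to another's."""
    for row in ROLES:
        if row[source] == name:
            return row[dest]
    raise KeyError(f"{name!r} is not a known {source} term")


print(translate("CMAKE_HOST_SYSTEM_NAME", "cmake", "autotools"))  # prints "build"
print(translate("CMAKE_SYSTEM_NAME", "cmake", "llvm"))            # prints "LLVM_HOST_TRIPLE"
```

Read this way, CMake's "host" and autotools's "host" land on different rows, which is where the apparent inversion in the table comes from.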

Technically, GCC doesn’t belong on this chart at all, because all cross-compilation is handled by autoconf.

To clarify, I'm the author of the proposal and the table you quoted, Mishal posted it as a review manager.

I'm totally fine with expanding the table and adding a new column for CMake, especially as that would show that there's no universal "established convention". The majority of these tools don't agree with each other and use build/host/target terms inconsistently, which seems like a good opportunity to introduce new clearly defined terms that we can use to avoid the confusion.

While GCC itself may not handle --build, --host, and --target options directly, I'm relying on GCC documentation for defining these terms in that column.

In addition, I took a look at the API naming guidelines @allevato pasted above and they largely don't apply to this situation.

The first two rules he pasted are in the context of "Stick to the established meaning if you do use a term of art," i.e., don't use old terms but change their meaning, which we wouldn't be doing if we used new terms.

As for "Embrace precedent," the examples there assume that there is long agreed-upon terminology that is widely used, like "array" or "sin(x)". In this case, while cross-compilation terms have certainly been around for decades, you've shown there's not much agreement between the various tools and, as Tony himself admits, cross-compilation was a niche that few devs used.

Now that cross-compilation has become the dominant way code is built and deployed, I agree with you that we need new terms and liked your original choice of "build-time" and "run-time."

I haven’t seen any evidence of inconsistency in the use of “host” and “target” so far. “Host” is where the developer runs the compiler. “Target” is where the developer’s code runs. In a Canadian cross situation, “build” is where the compiler is compiled.

The compiler-compiling situation is naturally confusing no matter what terminology is used. But “target” universally means “where the code runs”. Since CMake doesn’t understand that it’s building a compiler, “the code” is the compiler itself.

Please refer to my previous posts in this thread and the original text submitted for second review here, where I've been pointing out these inconsistencies. Here's another example in autoconf, where there is not a single mention of the --target option and it uses --build and --host options exclusively:

configure enters cross-compilation mode if and only if --host is passed.
[...]
Therefore, whenever you specify --host, be sure to specify --build too.

./configure --build=i686-pc-linux-gnu --host=m68k-coff
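The rule quoted above can be modelled in a few lines of Python. This is a toy sketch of the documented behaviour, not how configure is implemented; the helper names `parse_triples`, `is_cross_compiling`, and `missing_build` are hypothetical.

```python
from typing import Dict, List


def parse_triples(args: List[str]) -> Dict[str, str]:
    """Collect the values of --build=, --host= and --target= from an argument list."""
    triples = {}
    for arg in args:
        for name in ("build", "host", "target"):
            prefix = f"--{name}="
            if arg.startswith(prefix):
                triples[name] = arg[len(prefix):]
    return triples


def is_cross_compiling(args: List[str]) -> bool:
    """Per the quoted rule: cross-compilation mode if and only if --host is passed."""
    return "host" in parse_triples(args)


def missing_build(args: List[str]) -> bool:
    """Flag the case the quoted text warns about: --host without --build."""
    triples = parse_triples(args)
    return "host" in triples and "build" not in triples


example = ["--build=i686-pc-linux-gnu", "--host=m68k-coff"]
print(is_cross_compiling(example))  # prints "True"
print(missing_build(example))       # prints "False"
```

Note that --target plays no part in either check, which matches the point that basic cross-compilation in autoconf is described with --build and --host alone.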

Yet another example in automake documentation:

The --host and --build options are usually all we need for cross-compiling. The only exception is if the package being built is itself a cross-compiler: we need a third option to specify its target architecture.

One more from Meson documentation:

if you are doing regular cross compilation, you only care about build_machine and host_machine. Just ignore target_machine altogether and you will be correct 99% of the time. Only compilers and similar tools care about the target machine.