SE-0387: Cross-Compilation Destination Bundles

Hello Swift community,

The review of SE-0387: Cross-Compilation Destination Bundles begins now and runs through February 14th, 2023.

Reviews are an important part of the Swift evolution process. All review feedback should be either posted on this forum thread or, if you would like to keep your feedback private, sent directly to the review manager. When emailing the review manager directly, please keep the proposal link at the top of the message.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at https://github.com/apple/swift-evolution/blob/main/process.md

Thank you,

Mishal Shah
Review Manager

I think I'm missing something here.

One of the big problems generating xc tools for me has always been creating the minimal runtime lib set necessary for the target. In particular, it is hard to pick out the dependencies necessary for distro-less docker containers. Ideally, it would be possible to identify the minimum set of shlibs necessary to be present to run a tool on the target which might access any part of the swift standard library.

The last time I tried this, it had ballooned to somewhere north of 350MB for Ubuntu Focal on the R/Pi, with a lot of guessing. Will it be possible with this proposal to create not only the SDK but also the runtime?

First, thanks to the authors and community for their hard work and reasoning. This is a pitch I am excited to read.

I have questions around the use of --toolset. Per the pitch:

With multiple --toolset options, passing both of those files will merge them into a single configuration. Tools passed in subsequent --toolset options will shadow tools from previous options with the same names

I assume the names of the examples given (toolset1.json and toolset2.json) are entirely arbitrary. Is what matters the order in which they are passed? Can “multiple” mean more than two? (If so, “passing both of those files” could be rephrased as “passing those files“.) Can multiple files be passed per --toolset parameter?

Example:
Does…

swift build --toolset mouse.json cat.json hound.json --toolset fox.json

… mean that toolsets mouse.json, cat.json, and hound.json will “merge into a single configuration” and tools options from fox.json will shadow any options defined in the prior toolsets?

Nitpick, this should be artifactBundle not a lowercased artifactbundle.

There's nothing in the proposal that prevents you from creating any runtime distribution, but we aren't making any assumptions about it either. The proposal is intentionally limited in scope to build-time matters, specifying only configuration metadata, basic directory layout for SDK+toolchain bundles, and some CLI helpers to operate on those.

I also don't think we can specify in a proposal the amount of space a runtime for a given triple takes. This feels like an implementation detail to me and not something we could reasonably restrict through Swift Evolution.

Yes, names of toolset files are arbitrary and an arbitrary number of files can be passed. The order matters in that they're "applied" left-to-right, with tools that have the same identifiers in subsequent toolsets shadowing tools from preceding toolsets.

You can't group multiple toolsets with a single --toolset option. The invocation would have to look like this (backslashes only added for formatting purposes):

swift build            \
  --toolset mouse.json \
  --toolset cat.json   \
  --toolset hound.json \
  --toolset fox.json

Let's say mouse.json specifies swiftCompiler, cat.json has cCompiler, hound.json has cxxCompiler, and fox.json has linker: they will all be merged together in the end without overriding each other. If fox.json also specifies swiftCompiler, that one wins as it shadows swiftCompiler defined in mouse.json.

You can also view it as a sequence of "scopes" where each file defines a new one that can shadow a preceding one. When "translated" to Swift it would look like this:

_ = { // `mouse` toolset
  let swiftCompiler = "/usr/bin/swift"
  { // `cat` toolset
    let cCompiler = "/usr/bin/clang"
    { // `hound` toolset
      let cxxCompiler = "/usr/bin/clang++"
      { // `fox` toolset
        let swiftCompiler = "/custom/swiftc"
        let linker = "/usr/bin/lld"
        print([swiftCompiler, cCompiler, cxxCompiler, linker])
        // prints [
        //   "/custom/swiftc", 
        //   "/usr/bin/clang",
        //   "/usr/bin/clang++",
        //   "/usr/bin/lld"
        // ]
      }()
    }()
  }()
}()
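
The same left-to-right shadowing can also be sketched as a plain dictionary merge. This is only an illustration, assuming each toolset is reduced to a map from tool identifier to path; the `Toolset` alias and `mergeToolsets` function are hypothetical names and not SwiftPM API.

```swift
// Hypothetical model: a toolset is just a map from tool identifier to tool path.
typealias Toolset = [String: String]

// Merges toolsets in the order given; on a key collision the later toolset wins,
// mirroring how a later --toolset option shadows an earlier one.
func mergeToolsets(_ toolsets: [Toolset]) -> Toolset {
    toolsets.reduce(into: [:]) { merged, next in
        merged.merge(next) { _, later in later }
    }
}

let mouse: Toolset = ["swiftCompiler": "/usr/bin/swift"]
let cat: Toolset = ["cCompiler": "/usr/bin/clang"]
let hound: Toolset = ["cxxCompiler": "/usr/bin/clang++"]
let fox: Toolset = ["swiftCompiler": "/custom/swiftc", "linker": "/usr/bin/lld"]

let merged = mergeToolsets([mouse, cat, hound, fox])
```

Here `merged["swiftCompiler"]` comes from `fox`, while the other three entries are carried over unchanged.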

I don't think there's a precedent for file system path extensions that are not lowercase, although on a case-insensitive filesystem this wouldn't matter. Primarily, we're following the naming established in SE-0305, which already uses .artifactbundle for binary targets. If the intention is to change that, it would have to be discussed outside of this proposal and provide a migration strategy for case-sensitive filesystems.

Love that this is being considered and seems good overall, but I have some questions and notes, after going over the proposal in detail:

  • You should have an example for the destination JSON, so that we know exactly how it will be used.
  • What is the use of buildTimeTriples, which didn't exist in the previous JSON format? Since SPM queries the compiler to figure out the host triple and the bundle already places each destination JSON in a host variant triple directory, this array of buildtime triples seems superfluous.
  • Is the idea that the length of the runTimeTriples array will match the length of the swiftResourcesPaths array, with the first of each corresponding and so on? If so, would the same be done with include and library search paths? Since you might want to provide multiple include or library paths per runtime triple, you may want to specify that one could also pass in an array of include/library paths per runtime triple, ie include/librarySearchPaths could be an array of arrays.
  • Having an array of runtime triples is so one destination can cross-compile for multiple arches, but you are then going to run into the issue that the Swift resource directory doesn't support multiple arches for platforms that don't have "fat" multi-arch libraries. That means you'll have to duplicate the non-platform-specific directories in a resource directory, that mostly have C headers and some clang libraries, which currently account for 20+ MB out of a 270 MB resource directory that comes with the current linux x86_64 toolchain, for example. If we ever fix that, so that all the arches can be installed in a single Swift resource directory, then we'll have the same resource directory path listed multiple times in the swiftResourcesPaths array. I'm not sure you can do anything about this issue right now, but it is worth thinking about.
  • I'm skeptical of all the toolset combinations posited, which presumably comes from CMake experience of the authors. Maybe that can be motivated more.
  • I like that the toolset allows explicitly specifying some of the tools used, but why stop with those four tools? SPM currently looks for a lot more, like a librarian archiver for static libraries and lldb.
  • Speaking of tools, many of those are currently looked for in the toolchain and if not found, taken from the system PATH. That is going to break cross-compilation in all kinds of subtle ways because it may end up inadvertently depending on random tools in the user's PATH. Instead, we should audit all the tools that SPM calls and make sure cross-compilation is only done by the tools in the bundle or explicitly specified in a toolset, ie disable looking in the system path when using these bundles.
  • The first future direction mentions hostTriple and destinationTriple, which are no longer part of the latest proposal and should be updated.
  • Another possible future direction would be to have bundle registries, similar to the current package registries.
  • Thought of one more, the proposal should explicitly specify how the triple is chosen when a destination supports multiple runtime triples, ie write out an example of that command.

I've updated the proposal text based on this feedback, please have a look at the diff of the changes.

That's a good point, the example is included in the diff linked above.

This is a fair point, I've cleaned that up by removing buildTimeTriples from destination.json and explicitly called out supportedTriples in info.json.

I don't like the requirement for arrays to match in length and possible use of arrays of arrays. In the updated version runTimeTriples is a dictionary from triples to corresponding paths. All of the paths are explicitly specified to be relative to destination.json, which makes it easier to share them between triples.
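
To make the dictionary shape concrete, here is a rough sketch of how such a file could be decoded. It assumes the property names from the schema quoted later in this thread; the `Destination` and `TripleProperties` types, the triples, and the paths are made up for illustration and are not SwiftPM's actual implementation.

```swift
import Foundation

// Hypothetical per-triple properties; only the SDK root is required.
struct TripleProperties: Codable {
    let sdkRootPath: String
    let swiftResourcesPath: String?
    let toolsetPaths: [String]?
}

// Hypothetical top-level model: a dictionary keyed by run-time triple.
struct Destination: Codable {
    let schemaVersion: String
    let runTimeTriples: [String: TripleProperties]
}

// Example content; all paths are meant to be relative to destination.json.
let json = """
{
  "schemaVersion": "3.0",
  "runTimeTriples": {
    "aarch64-unknown-linux-gnu": {
      "sdkRootPath": "aarch64/sdk",
      "toolsetPaths": ["toolset.json", "aarch64/toolset.json"]
    },
    "x86_64-unknown-linux-gnu": {
      "sdkRootPath": "x86_64/sdk",
      "toolsetPaths": ["toolset.json"]
    }
  }
}
"""

let destination = try JSONDecoder().decode(Destination.self, from: Data(json.utf8))
```

Both triples list the same `toolset.json`, which is the kind of sharing that a dictionary with relative paths makes straightforward, with no arrays that have to match in length.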

Can you elaborate on this concern? In the updated version the example of toolset combinations is provided in the ubuntu_jammy destination, where there is a toolset.json shared between triples and there are triple-specific toolsets present too, I hope that clarifies this potential use case.

That's a good suggestion, I've added librarian and also testRunner (right now only used for xctest on macOS, but could become useful for other platforms).

I think we've listed most of them in the proposed toolsets JSON schema; the only ones left that I know of off the top of my head are zip, tar, and git. I don't think it's useful to be able to customize those in the cross-compilation context. As for other contexts, there's a larger discussion to be had about whether we should shell out to call those as processes or replace their use with libraries that provide the same functionality, which would make SwiftPM more portable and self-contained in general.

Great catch, that's been updated in the diff.

I've addressed that point in a separate commit.

Can you elaborate? The text in the proposal seems clear enough for me:

When multiple destinations support the same triple, an error message will be printed listing these destinations and asking the user to select a single one via its identifier instead.
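
As a rough sketch of the rule quoted above (not the actual SwiftPM implementation), selecting a destination by triple could look like the following. The `Destination` type, the `SelectionError` cases, and `selectDestination` are hypothetical names used only for illustration.

```swift
// Hypothetical model of an installed destination.
struct Destination {
    let id: String
    let supportedTriples: Set<String>
}

enum SelectionError: Error, Equatable {
    case noMatch(triple: String)
    // Carries the identifiers the user would be asked to choose between.
    case ambiguous(triple: String, candidates: [String])
}

// Returns the only destination supporting `triple`, or throws if there is none
// or more than one.
func selectDestination(forTriple triple: String, from installed: [Destination]) throws -> Destination {
    let matches = installed.filter { $0.supportedTriples.contains(triple) }
    switch matches.count {
    case 0:
        throw SelectionError.noMatch(triple: triple)
    case 1:
        return matches[0]
    default:
        throw SelectionError.ambiguous(triple: triple, candidates: matches.map(\.id).sorted())
    }
}
```

With two installed destinations that both support the same triple, the call throws `.ambiguous` carrying both identifiers, which corresponds to the error message described above.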

Updated text of the proposal currently lives in a separate branch of swift-evolution#1942.

Can you elaborate on this concern?

Given that most tools will come with flags specific to a destination, I'm skeptical that there will be many ad-hoc combinations of these toolsets outside of the bundles.

I hope that clarifies this potential use case.

It does for the bundles, yes.

I think we've listed most of them in the proposed toolsets JSON schema,

Great, but more importantly the proposal should state that SPM invocation of external tools for these bundles will never look in the system PATH, though it is fine for git and a few other tools that don't generate code to continue to be used for now.

Can you elaborate? The text in the proposal seems clear enough for me

I mean you should write out the command for choosing both a destination and a runtime triple.

I've added more review comments on the update.

For Android one might need to use tools from the NDK. When targeting embedded platforms, their vendors might provide their own tools under licenses that don't allow redistributing them in destination bundles.

This seems wrong to me. We'd like to reuse as many tools already installed on the build-time system as possible. If one needs to override them, they should list them in a destination toolset, that's what toolsets were designed for.

Both of those commands are explicitly listed in the "Using a CC Destination" section of the proposal.

For Android one might need to use tools from the NDK. When targeting embedded platforms, their vendors might provide their own tools under licenses that don't allow redistributing them in destination bundles.

Sure, we already discussed all that in the pitch thread: this local toolset override is obviously needed for some bundles. Instead, I'm skeptical the toolsets will find much use with SPM when there is no bundle involved, ie "We find that properties dedicated to tools configuration are useful outside of the cross-compilation context... Users familiar with CMake can draw an analogy between toolset files and CMake toolchain files."

I'm not saying that claim is wrong, as I don't have experience with CMake toolchain files, but it could be explained more for people like me who don't. My sense is that since most toolsets will come with CLI flags specific to a particular triple, they will not be reusable in other ways, ie without bundles.

This seems wrong to me. We'd like to reuse as many tools already installed on the build-time system as possible. If one needs to override them, they should list them in a destination toolset, that's what toolsets were designed for.

It is not clear what all the tools pulled in are: I doubt even the toolchain authors know at this point. I disagree about reusing tools from the build-time system that generate code (tools like git or zip are fine), as most will be geared to the host triple and could silently break the code generated by these cross-compilation bundles. Instead, I'm suggesting that we proactively go in and disable all SPM system PATH lookups like this when we know we are building with one of these bundles.

We already ship a lot of the needed tools in our Swift toolchain, which is very cross-platform, but we sometimes fall back to the system PATH to look for missing tools. I think this proposal should ban falling back to looking for tools in the PATH for the bundles and require all needed tools to be specified in a toolset or be in the toolchain already, ie disable all those system PATH lookups if a bundle is involved.

Both of those commands are explicitly listed

I'm saying it should list the command needed when two or more destinations supply the same triple.

+1 I'm a huge fan of this proposal; I currently maintain an iOS cross-compilation toolchain for Linux and this proposal would enable distributing a minimal set of tools rather than an entire patched toolchain (--destination can sort of be used for this purpose too, but I've found it to be opaque and inflexible). Two questions:

  1. Will there be a way to specify an entire bin directory like destination.json's toolchain-bin-dir? If not, I agree with @Finagolfin in that we need more options to configure individual tools like ar.

  2. Will this only support the current set of triples handled by TSCUtility.Triple? Or does this give toolchain authors the ability to define custom run-time triples? And if we can define custom triples, is there a way we can make them "inherit" from existing ones? For example, SPM's Triple doesn't know about arm64-apple-iphoneos (even though swift-driver does) so it'd be ideal if I could define said triple to inherit from arm64-apple-darwin.

They could be reused in scenarios where users are already passing CC and SWIFT_EXEC environment variables, in the SwiftPM bootstrapping script for example. Env vars are fragile: when you make a typo in one of those, you won't get a proper error message, whereas toolset files that have a defined schema are harder to get wrong IMO. AFAIR there are some cases where for Windows we need to hardcode some paths, and moving those into a toolset file seems cleaner to me, but that would be for Windows maintainers to decide.
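
To illustrate the point about typos, here is a minimal sketch of schema-style validation. It assumes a much simplified toolset that maps tool identifiers directly to paths; the `validateToolNames` function and `ToolsetError` type are hypothetical, and the set of known identifiers is only the one mentioned in this thread, so the real schema may well differ.

```swift
import Foundation

enum ToolsetError: Error, Equatable {
    case unknownTool(String)
}

// Tool identifiers mentioned in this thread (an assumption, not the full schema).
let knownTools: Set<String> = [
    "swiftCompiler", "cCompiler", "cxxCompiler", "linker", "librarian", "testRunner",
]

// Decodes a simplified toolset and rejects any identifier that is not known,
// so a misspelled key produces an explicit error instead of being ignored.
func validateToolNames(_ json: String) throws -> [String: String] {
    let tools = try JSONDecoder().decode([String: String].self, from: Data(json.utf8))
    for name in tools.keys.sorted() where !knownTools.contains(name) {
        throw ToolsetError.unknownTool(name)
    }
    return tools
}
```

A misspelled environment variable such as `SWIFT_EXC` would simply go unread, whereas a misspelled key such as `swiftCompilr` is reported by name here.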

It's up to the user to choose one of those by id and disambiguate explicitly. Per the proposal text:

users can refer to it via its identifier passed to the --destination option, e.g.

swift build --destination ubuntu_jammy

They could be reused in scenarios where users are already passing CC and SWIFT_EXEC environment variables

OK, so good for cases where you want to reuse certain tool configs without any flags. To really enable that, you may want to add a way to use just the tools from a toolset, but wipe or overwrite all the command-line flags that come with that toolset.

I notice you skipped over my concern about the system PATH lookups.

It's up to the user to choose one of those by id and disambiguate explicitly.

Yes, but in your example destination after the update to this proposal, jammy supports multiple runtime triples, so this command of simply specifying a destination alone will fail.

I like all these CLI shortcuts you provide, I'm just saying lay out the full command needed when specifying the destination alone won't suffice.

For visibility, I cross-post a summary of my review comments here.

Some of them are nitpicking things, but my main concern is swiftResourcesPath schema.

Currently, swiftResourcesPath takes a single path to the resource directory in the bundle, but there is more than one resource directory. In a non-CC context, the Swift driver picks one of the resource paths, lib/swift or lib/swift_static, based on the given options (-static-stdlib, -static-executable, ..), assuming they are placed relative to the compiler executable.

However, when we pass the resource path to the compiler by -resource-dir, lib/swift or lib/swift_static should be determined by those who are passing the option. (in this case, SwiftPM has the responsibility)

So I suggest making swiftResourcesPath a dictionary containing resource directories for both static and dynamic linking like below

      "swiftResourcesPath": {
        "static": "<an optional path relative to `destination.json` containing Swift resources for static linking>",
        "dynamic": "<an optional path relative to `destination.json` containing Swift resources for dynamic linking>"
      }

Specifying swiftResourcesPath is equivalent to passing -resource-dir compiler option. If it were to be split into dynamic and static properties of a dictionary, what compiler options would those map to?

I mean passing -resource-dir as well, but switching the path based on linking type.

When destination.json has the following entry:

      "swiftResourcesPath": {
        "static": "./usr/lib/swift_static",
        "dynamic": "./usr/lib/swift"
      }

Then:

  • swift build --destination XXX --no-static-swift-stdlib should pass -resource-dir ./usr/lib/swift to the driver
  • swift build --destination XXX --static-swift-stdlib should pass -resource-dir ./usr/lib/swift_static to the driver
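
The switching described in the two bullets above can be sketched roughly as follows. This is an illustration under the assumption that both paths are optional; `SwiftResources` and `resourceDirArguments` are hypothetical names, not SwiftPM API.

```swift
// Hypothetical model of the two optional resource paths from destination.json.
struct SwiftResources {
    let dynamicPath: String?
    let staticPath: String?
}

// Returns the driver arguments for the requested linking mode, or no arguments
// when the destination does not provide a path for that mode.
func resourceDirArguments(for resources: SwiftResources, staticStdlib: Bool) -> [String] {
    let selected = staticStdlib ? resources.staticPath : resources.dynamicPath
    guard let path = selected else { return [] }
    return ["-resource-dir", path]
}

let resources = SwiftResources(dynamicPath: "./usr/lib/swift", staticPath: "./usr/lib/swift_static")
let dynamicArguments = resourceDirArguments(for: resources, staticStdlib: false)
let staticArguments = resourceDirArguments(for: resources, staticStdlib: true)
```

Returning an empty array when a path is missing leaves the choice to the driver's default behavior; whether that is the right fallback is an open design question.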

Ok, that's interesting. I'd expect it to switch to a peer /usr/lib/swift_static directory automatically based on whether --no-static-swift-stdlib or --static-swift-stdlib is passed, even when /usr/lib/swift is given as -resource-dir. Isn't this the current behavior? Do we really need to make the path for static resources explicit?

Update:

I see your point now, I need to have a closer look at how SwiftPM handles this to give you my opinion on this. But at a second glance this makes sense to me overall.

Update 2:

I've introduced "swiftStaticResourcesPath" in swift-evolution pull request #1942 (Update SE-0387: Cross-Compilation Destination Bundles) to handle this; converting it to a separate dictionary seems like overkill. Some platforms may support only static linking, some only dynamic, so having separate properties without a new dictionary seems more suitable to me.

Thus the currently proposed destination.json schema is:


```json5
{
  "schemaVersion": "3.0",
  "runTimeTriples": {
    "<triple1>": {
      "sdkRootPath": "<a required path relative to `destination.json` containing SDK root>",
      // all of the properties listed below are optional:
      "swiftResourcesPath": "<a path relative to `destination.json` containing Swift resources for dynamic linking>",
      "swiftStaticResourcesPath": "<a path relative to `destination.json` containing Swift resources for static linking>",
      "includeSearchPaths": ["<array of paths relative to `destination.json` containing headers>"],
      "librarySearchPaths": ["<array of paths relative to `destination.json` containing libraries>"],
      "toolsetPaths": ["<array of paths relative to `destination.json` containing toolset files>"]
    },
    // a destination can support more than one run-time triple:
    "<triple2>": {
      "sdkRootPath": "<a required path relative to `destination.json` containing SDK root>",
      // all of the properties listed below are optional:
      "swiftResourcesPath": "<a path relative to `destination.json` containing Swift resources for dynamic linking>",
      "swiftStaticResourcesPath": "<a path relative to `destination.json` containing Swift resources for static linking>",
      "includeSearchPaths": ["<array of paths relative to `destination.json` containing headers>"],
      "librarySearchPaths": ["<array of paths relative to `destination.json` containing libraries>"],
      "toolsetPaths": ["<array of paths relative to `destination.json` containing toolset files>"]
    }
    // more triples can be supported by a single destination if needed, primarily for sharing files between them.
  }
}
```

nit: this seems to be located within frontend logic, not driver logic, but it seems that it is relative to the compiler executable, not to a given -resource-dir value.

If there's a need for wiping or overwriting all CLI options specified by a toolset, just copy it and redefine its extraCLIOptions values accordingly. Thus even when you have both toolsets specified in a single invocation, if the new one comes after the original one, it will overwrite the options of the original.

I've clarified my stance on this in one of the previous posts and I don't have anything to add. I'm currently not convinced that this should be codified in the proposal, especially as it's not an already established behavior and not something I've seen to be frequently requested. I'm open to reviewing concrete examples where the existing behavior caused active harm to reconsider.

Good catch, I've updated to swift build --destination ubuntu_focal, which in the context of the example provides only one run-time triple.

Check out the updated proposal text, it does mention the new librarian toolset field for ar, and if the proposal is accepted with the toolsets feature, we'll be adding new properties for other tools as needed.

It does not. Supporting an arbitrary triple is not just a matter of adding cases to an enum. All of the triple components (CPU architecture, OS, object format, libc flavor etc) need to be supported in all of the different parts of the toolchain and core libraries and each step requires a non-trivial amount of work and integration testing, as it does for any language. I'm personally not sure if this can be specified in a Swift Evolution proposal, it's primarily an implementation detail of the toolchain and core libraries.

Sorry, I'm not sure what "inherit" is supposed to mean in this context. If you think this is something that should be fixed in SwiftPM specifically, please create an issue or a PR on GitHub and we can discuss it there.

If there's a need for wiping or overwriting all CLI options specified by a toolset, just copy it and redefine its extraCLIOptions values accordingly.

That invalidates the claim of toolset reuse in the proposal, which I was skeptical of to begin with. The problem with most toolsets is that they will have flags specific to a particular usecase, so they will not be reusable, unless you can easily wipe their CLI flags as I suggest. If you have to write a whole new toolset config to add new flags instead, as you now suggest, there is no reusability, as you might as well write the config from scratch at that point.

Anyway, this is just a suggestion on the toolsets expansion of the bundle proposal, not too important.

I've clarified my stance on this in one of the previous posts

That would be where you wrote, "This seems wrong to me," not that confident.

I'm currently not convinced that this should be codified in the proposal, especially as it's not an already established behavior and not something I've seen to be frequently requested.

How many people do you know using the previous destination JSON config? I suspect they can be counted on one hand, and I'm the one using it the most.

The point of this bundles proposal is to expand that greatly, and disabling these system lookups would keep them safe rather than sorry.

Good catch, I've updated to swift build --destination ubuntu_focal, which in the context of the example provides only one run-time triple.

No, it simply does not specify how many runtime triples it has. I just don't think it's good to write a proposal only listing CLI shortcuts, but not a single example of the full command.