Package Manager Extensible Build Tools

So far I don't really see a solution for how I can use a library to extend my Package.swift file while also declaring the library within that file.

The API complexity and model complexity are increased quite a bit. Core concepts went from packages, targets, products, and dependencies to packages, targets, products, dependencies, package extensions, custom rules, and tools. And that's keeping fringe concerns like pkg-config for system libraries, package providers, module maps, testing, etc., off the table.

It's a lot. As the Package spec gets more and more specific, it feels like we're fighting a losing battle against the complexity of the problem. Hence my desire to expand the scope of the conversation to include a different approach that may enable more flexibility more simply, by combining the workspace concept and the extensibility concept into one (kinda major) shift in direction.

Maybe a new proposal is in order. Either way, I've voiced my objections.

This whole paragraph is not clear to me. Can somebody explain?

Hey David!

Currently, SwiftPM generates an llbuild manifest and builds it using swift-build-tool. This means that all build tools must be defined inside llbuild or invoked using the shell tool. llbuild is written in C++, but it has a C API (and there are Swift bindings now!). We can use these bindings in SwiftPM to provide the implementation for tools from package extensions (see this). So, the first step is to switch SwiftPM from using swift-build-tool to the C API (or rather the Swift API).

PS: This is going to be awesome!

4 Likes

This proposal looks sound, and I can see the need for the complexity (which isn’t too onerous I think, and in any case shouldn’t get in the way when not required).

I’m wondering though about the way that custom build rules receive their inputs and report their outputs.

You seem to have gone for a model where the inputs are given explicitly in the manifest when the rule is used:

              .build(
                  sources: ["misc.proto", "ADT/*.proto"],    <-- explicit inputs
                  withPackageExtension: "PBPackageExt",
                  ...
              ),

as opposed, for example, to an approach where the tool simply declares a class of files that it operates on (*.proto, for example) and leaves it to the build system to throw inputs at it.

I have two possible issues with that.

  1. In complicated project setups it may lead to unnecessary boilerplate.

    When all you really want to be able to say is "any time you find a .proto file, run it through this tool please", it would be better if the tool itself could express this, and then a project wanting to use it could just say "use this tool please".

  2. I don’t see any obvious way in which custom tools will be able to operate on the output of other custom tools if you have multi-pass transformations.

    Let’s say I have one custom tool which outputs .proto files, and then I want a second tool to transform them into .swift files. I can see that the first tool reports its output with addDerivedSource (and I can see that’s essential for the underlying build system to be able to cache etc).

    How do you specify the inputs to the second phase as being the outputs from the first one? The manifest is real Swift, of course, so perhaps there is a way to programmatically obtain the output of one .build item and set it as the sources: parameter of another - but even if this is possible, I can imagine that it might become messy.
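To make the alternative from point 1 concrete, here is a sketch of a rule that declares its own input class inside the extension. This is entirely hypothetical API: CustomBuildRule, inputFileSuffix, basenameWithoutExt, and createCommand are invented names for illustration, not part of the proposal.

```swift
// Hypothetical sketch: the extension itself declares which files it
// handles, so consuming packages never enumerate inputs explicitly.
struct ProtobufRule: CustomBuildRule {
    // Any .proto file found in a target is routed to this rule.
    static let inputFileSuffix = "proto"

    func createCommands(for inputs: [Path], context: TargetBuildContext) {
        for input in inputs {
            // One command per discovered input; the output path is
            // derived mechanically from the input path.
            let output = context.buildDirectory
                .appending("\(input.basenameWithoutExt).pb.swift")
            context.createCommand(inputs: [input], outputs: [output])
        }
    }
}
```

A consuming project would then only need something like rules: [.build(withPackageExtension: "PBPackageExt")], with no sources list at all.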

Am I missing something here?

1 Like

Great proposal, I'm excited to delete some checked-in generated code!

I took a stab at designing a hypothetical SwiftProtobuf PackageExtension using the APIs in this proposal with the real SwiftProtobuf package and came up with a few questions.

For background, the SwiftProtobuf project contains a plugin executable protoc-gen-swift, implemented in pure Swift and exported as an executable product by Swift Package Manager. Adding a new packageExtension product implementing this proposal's API to that package should be relatively straightforward. However, that package alone is not enough to generate Swift source code from a .proto file; it also needs the Protobuf Compiler tool (protoc).

To illustrate, a typical invocation of protoc to generate Swift source files looks like this (with variables filled in by the build system):

${ProtocToolPath} \
    --plugin=protoc-gen-swift=${ProtocGenSwiftToolPath} \
    --swift_out=${TargetGeneratedSourcesDir} \
    --swift_opt=ProtoPathModuleMappings=${ModuleMappingsFilePath} \
    --swift_opt=Visibility=Public \
    -I ${TargetSourcesDir}/Proto \
    -I ${TargetDependencyIncludePath} \
    -I ${PackageDependencyIncludePath} \
    -I ${SystemIncludePath} \
    ${TargetSourcesDir}/Proto/example.proto
# Produces ${TargetGeneratedSourcesDir}/example.pb.swift

Looking over the proposed API I'm not sure that all of these variables can be filled in.

ProtocToolPath is the path to the protoc tool executable. The protoc tool is typically installed on the system somewhere in PATH or downloaded into a project build dir and run from there [1]. If the tool is on the system PATH we need a way to express this (and escape the sandbox for it). If the tool is to be downloaded into the build dir we need a way to express that so that the package extension can find the tool.

[1] The Protobuf Gradle Plugin allows this to be configured.

ProtocGenSwiftToolPath is the path to the protoc-gen-swift tool, which is an executable product from the SwiftProtobuf package. This tool could be obtained with TargetBuildContext.lookup(tool: "protoc-gen-swift"), but a new API will be needed on the Tool protocol to get its path.

TargetGeneratedSourcesDir is a directory for generated sources that will be compiled into the current target. This can be obtained with TargetBuildContext.buildDirectory.appending("ProtobufGeneratedSources").string.

ModuleMappingsFilePath is the path to a generated file containing metadata for the Swift Protobuf plugin. It contains mappings of .proto file names to their corresponding Swift module names so that generated code contains the correct import statements. It will need to be generated prior to the above protoc invocation and take metadata for the current target and its transitive dependencies as arguments. It seems possible to create another Tool to generate this file, and extend the TargetBuildContext protocol to have a new property var dependencies: [TargetBuildContext] { get }. This tool would have an empty set of inputs in the current target.

TargetSourcesDir is the root of the sources directory for the current target, for example Sources/ExampleAPI. It's required to allow protos to write their imports without regard to which target they're in (similar to allowing chevron includes in C projects). It's unclear whether this is easily obtainable from TargetBuildContext.inputs.

TargetDependencyIncludePath is a path to a directory in a target dependency containing .proto files that can be imported similar to include in a C target. It will require an API to walk a target's dependencies and query metadata about attached Build Rules and Package Extensions.

PackageDependencyIncludePath is the same as TargetDependencyIncludePath, except that it points to a directory in a package dependency's checkout.

SystemIncludePath is a path to a directory containing "well-known" protos, similar to /usr/include. If protoc is installed on the system this will probably be a nearby system directory and it will thus need to be allowed from the sandbox. If protoc was downloaded into the build directory this will be an adjacent resource directory and the tool lookup API will need to be able to find it.
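To make the gaps concrete, here is a sketch of how such a package extension might assemble the protoc invocation above. This is purely illustrative: systemTool(named:), path, dependencies, sourcesDirectory, and createCommand are hypothetical APIs that the proposal does not currently provide; each marks a place where new API would be needed.

```swift
// Hypothetical sketch only. The marked calls do not exist in the
// proposal as written; each corresponds to one of the gaps above.
struct SwiftProtobufTool {
    func createCommands(context: TargetBuildContext) throws {
        // Gap: locating a system-installed tool (and escaping the sandbox).
        let protoc = try context.systemTool(named: "protoc")
        // Gap: getting the path of a tool built from an executable product.
        let plugin = try context.lookup(tool: "protoc-gen-swift")
        let outDir = context.buildDirectory.appending("ProtobufGeneratedSources")

        var args = ["--plugin=protoc-gen-swift=\(plugin.path)",
                    "--swift_out=\(outDir)",
                    "--swift_opt=Visibility=Public"]
        // Gap: walking transitive dependencies to collect include paths.
        for dep in context.dependencies {
            args += ["-I", dep.sourcesDirectory.appending("Proto").string]
        }
        context.createCommand(tool: protoc, arguments: args,
                              inputs: context.inputs, outputs: [outDir])
    }
}
```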

Summarizing my main questions:

  • How do you imagine supporting "system" tools and their associated resources?
  • How do you imagine supporting "downloaded" tools and resources (that SwiftPM can't build)?
  • How do you feel about extending the API to allow querying transitive dependencies?
  • How do we express llbuild-level dependencies on inputs in transitive dependencies?
2 Likes

I feel like this is the main objection I have to this proposal. It doesn't have a mechanism to configure tools beyond passing them CLI arguments, and it doesn't have a mechanism that allows authors to provide a library or API to end users who might use these tools. This pushes the complexity outward to developers using these libraries, much as rpath and similar mechanisms do.

While we've been reassured that importing custom APIs into SwiftPM would be "easy", I'm not so sure... and therefore I think pushing this type of workflow out of the package and into the workspace proposal is a far better approach.

Is this still in the works? I'm looking to use swiftlint with a project that relies on SPM and someone pointed me to this thread.

Specifically the issues I'm running into right now are:

  • There doesn't seem to be a way to generate Xcode projects that include the Run Script build phase required by swiftlint. Will this proposal include something like that, or would that have to be a separate proposal?
  • There's no way to swift build and run other tools at the same time AFAIK. I think this proposal covers that feature if I'm reading it correctly though.
  • Having access to the environment variables that Xcode provides would be exceptionally convenient. The one I'm utilizing right now is DWARF_DSYM_FOLDER_PATH, because that's the path where the built swiftlint executable is placed. Would we retain access to those environment variables under this proposal?

Thanks!

1 Like

One major issue with this proposal (and indeed with most build systems in general, not just SwiftPM) is that for many build tasks it is not possible to statically know the paths of the output files produced by a task from the set of input file paths alone; there is also a dependency on the contents of those inputs.

For example, implementing a C compiler task is easy: there's always one input file, and always one output file. We can compute a suitable output file path based on the input file path, i.e. input.c produces output.o, and we pass both of these paths to the compiler. The contents of input.c are completely irrelevant when constructing the build graph.

However, other tools can be problematic, such as the protobuf compiler. Given a file such as input.proto, any number of output files may be generated, depending on the output language. You can only control the output directory; you can't know from the file paths alone which files the tool will generate there. To know this, you must also understand the content of input.proto.

With a solution requiring outputs to be listed at task construction time, you either have to let developers hardcode which output files a given protoc invocation and input file will generate (which is not scalable and pushes the problem onto the wrong audience), or you have to forgo declaring some of the outputs to the build process (which harms parallelism and correctness, if it's even possible at all in a given scenario).

Essentially, we need some sort of two-part solution: a mechanism for rule authors to declare what WILL happen, to the build system, and a mechanism for the build system to report back to rule authors what DID happen, providing the opportunity to cycle back additional information into the build graph (i.e. newly discovered output nodes that now need to be attached to the task we just ran). This also makes ordering more difficult (how do you guarantee the discovered outputs don't affect tasks which already ran, or how do you know to defer tasks which might have been or will be affected?) but will need to be solved for proper integration of arbitrary build tools.
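For illustration, such a two-part mechanism might be shaped roughly like this. Every name here is hypothetical, invented for this sketch, not drawn from the proposal:

```swift
// Hypothetical two-phase rule: declare what WILL happen before the
// build, then report what DID happen so the graph can be amended.
protocol DynamicOutputRule {
    /// Phase 1: known before the task runs. Only the output
    /// *directory* can be promised; the individual files cannot.
    func declaredOutputDirectory(for input: Path) -> Path

    /// Phase 2: called by the build system after the task completes,
    /// returning the files actually produced so they can be cycled
    /// back into the build graph as newly discovered output nodes.
    func discoveredOutputs(for input: Path, in directory: Path) -> [Path]
}
```

The scheduling question raised above - what to do when discovered outputs would affect tasks that already ran - remains open regardless of the API's shape.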

4 Likes

Rules should have dependency checks that allow skipping work. This could be done implicitly by the build system; MSBuild does this quite well by tracking all the inputs used to create a set of outputs, using a file tracker that inspects the files read and written by a given process (the rule's tool), though this could be tricky in some cases.

There is also the possibility for the tools to provide this info using a standardized format or API.

I'm working on a custom compiler that outputs Swift code. In my case, having the compiler (a C++ tool) run from Swift Package Manager would be great:

let package = Package(
    name: "MyPkg",
    dependencies: [
        .package(url: "https://github.com/zeroc-ice/Slice2Swift", from: "1.0.0"),
    ],
    rules: [.build(withPackageExtension: "Slice2Swift")], // Install this rule for all targets
    targets: [
        // Compile all Slice files (.ice) with default options
        .target(name: "MyPkg", dependencies: []),
        // Override the rule for the "Other" target
        .target(name: "Other", dependencies: [],
                rules: [.build(withPackageExtension: "Slice2Swift", options: ["-I.", "-DFOO"])]),
        // Compile Foo.ice with -DXXX and the remaining Slice files (.ice) with -DNO_XX
        .target(name: "More", dependencies: [],
                rules: [.build(withPackageExtension: "Slice2Swift", options: ["-DXXX"], inputs: "Foo.ice"),
                        .build(withPackageExtension: "Slice2Swift", options: ["-DNO_XX"], exclude: "Foo.ice")])
    ])

I think for simple cases, installing a rule that applies to all targets would be nice, with the ability to override it per target. Having implicit inputs is also nice; for the outputs, the build system must query the tool.

How will I make the outputs of my tool the inputs of another? Perhaps by allowing the installed rules to be queried, so that a tool can add its buildDirectory to the inputs of a second rule; but it would be even better if that could be discovered automatically, even if not in all cases.
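For instance, chaining could be expressed by letting a later rule reference an earlier one by name. This is purely hypothetical syntax (the name: and inputs: .outputs(ofRule:) parameters are invented for this sketch):

```swift
// Hypothetical: a later rule consumes the outputs of an earlier one
// by referencing it by name, instead of hardcoding output paths.
rules: [
    .build(withPackageExtension: "ProtoGen", name: "generate"),
    .build(withPackageExtension: "Slice2Swift",
           inputs: .outputs(ofRule: "generate"))
]
```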

A way to make these build rules even more useful - especially in the server-side world - is to have the ability to move the binaries post-build. Consider the following structure for a server:

Public/
Source/
   server
   view

So the server might import the view for server-side rendering, though the view might also be built into wasm for client-side rendering. Everything in Public/ gets served to the public. So when the separate views are built into view.wasm (or whatever the case may be), the results could then be moved to the Public/ directory.

I'll soon be experimenting with a makefile, which feels dirty in a Swift context, and I thoroughly hope this proposal gets further!

1 Like

I put down some thoughts around SPM plugins here and got directed to this thread.

I'd be very interested in code-generation and this seems like a workable approach. A couple of things to consider-

In the example MyPackage -> Package.swift file provided, the SwiftyCURL dependency is probably a requirement of the generated code rather than of the static code in the package itself. In that situation, the dependency is not really modelled correctly:

  1. MyPackage is making an assumption that the generated code requires SwiftyCURL as a dependency. This requires some kind of knowledge of the code that is going to be generated - probably through documentation of the generator, which is not a strong contract.
  2. MyPackage is making an assumption that version 1.0.0 of SwiftyCURL is acceptable for the generated code. Conversely, the generator has no way of safely using non-breaking API additions introduced in minor versions of its runtime dependencies.
  3. MyPackage is making an assumption that the generated code uses no dependencies other than SwiftyCURL. Conversely, the code generator has no way of adding dependencies, even if those additions are considered non-breaking.

This proposal provides a mechanism for generating the actual code, but it leaves dependency management to the consuming package rather than to the generator package, where it would more appropriately belong.
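To illustrate the mismatch, the consuming manifest ends up looking something like this. This is a reconstruction of the situation described above, with a placeholder URL, not the proposal's actual example:

```swift
// Hypothetical reconstruction: MyPackage must declare a dependency
// that only the *generated* code (which it never sees) imports.
let package = Package(
    name: "MyPackage",
    dependencies: [
        // Guessed from the generator's documentation, not from any
        // machine-checked contract (placeholder URL).
        .package(url: "https://github.com/example/SwiftyCURL", from: "1.0.0"),
    ],
    targets: [
        .target(name: "MyPackage", dependencies: ["SwiftyCURL"]),
    ]
)
```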

Is there any update on this? Really need to be able to call cmake from SPM for the project I am working on currently at AWS.

2 Likes

No updates right now, but we are aware that extensibility is an important missing part of the Swift package story.

3 Likes

Hello, are there any new developments in this matter? 🙂

2 Likes

Hi @stuchlej, no updates right now, but we will post here when we have something to share.

1 Like

Is there anything that the community can help with in any way? Documenting a specific area of code in SwiftPM to prepare it for this feature, drafting a pitch/proposal, identifying use cases? This feature has been sorely missing from the start, and there's enough tooling that could benefit from it (GYB, Sourcery, image/localization resource generators, etc.). What's the best way to make it finally happen?

This is particularly frustrating for non-Apple platforms. You can use Xcode build phases, but as soon as you need to target other platforms this area feels completely neglected. Custom build scripts may work fine for a root package, but what if it depends on some other package that needs to generate resources at build time? Committing generated resources to repositories is not a practical solution after all.

3 Likes

@Max_Desiatov coming from the swift-server side of the ecosystem the sentiment resonates with me.

The community could absolutely help. Identifying use cases would be a great start, and once the SwiftPM team puts together an initial pitch, working together with them to refine and mature the proposal and implementation. cc @abertelrud

Does this imply that the pitch can only come from the SwiftPM team and anything else pitched by the community in the meantime will be disregarded?

No, but it implies that the SwiftPM team intends to make a pitch, since it thinks this is an important feature.

5 Likes

I have put up a new draft proposal for build tool extensibility here: Pitch: SwiftPM Extensible Build Tools.

Since that's a new proposal and not a continuation of this one, I figured it was best to start a new thread and keep the comments separate.

5 Likes