Pitch: SwiftPM Extensible Build Tools

FranzBusch · February 17, 2021, 7:09am

I have to agree with the general sentiment of @lukasa's reply, and I also think that in most cases it is even desirable to have the same version for the codegen tool and the runtime lib. However, what about the following scenario?

You are depending on a package that uses build tools like SwiftLint and SwiftFormat, but you are also using the same two tools in your package. However, the package you are depending on is not getting updated regularly. Are you now locked to the version of SwiftLint and SwiftFormat from your dependency, or can two different versions of a build extension be used in a dependency graph?

I think this could be something you can run into quite regularly.

georgebarnett · February 17, 2021, 10:15am

Thanks for the context there. Exposing the options via the extension would be great since it makes the options much more discoverable. This sounds like a great future direction; the question then of course is when will that future be realized?

I have pretty big concerns about config files being enough for the time being though, especially when targets may want to use different configuration. In gRPC, for example, a package may want to generate client code with public visibility in one module but server code with internal visibility in another. This would require two separate sets of configuration.

Fortunately protoc does have some support for configuration files, although it's quite limited: it can read command line arguments from a named file. In this case, to support per-target configuration, gRPC Swift would have to define conventions for reading configuration. It could, for example, specify that configuration files must reside in the source directory for the target and be named "TargetName.grpc-swift.config" or similar. That works, but it's not discoverable or obvious how to drive the configuration. SwiftProtobuf would have to do something similar if it wanted to offer per-target configuration.
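A minimal sketch of what such a convention could look like on the tool side. The function names are hypothetical, and the file format (one command line argument per line, blank lines skipped) is an assumption made for illustration, not something gRPC Swift or protoc defines here:

```swift
import Foundation

// Hypothetical naming convention taken from the post above:
// "<TargetName>.grpc-swift.config" inside the target's source directory.
func configFileURL(forTarget targetName: String, sourceDirectory: URL) -> URL {
    return sourceDirectory.appendingPathComponent("\(targetName).grpc-swift.config")
}

// Assumed file format: one command line argument per line.
// Surrounding whitespace is trimmed and blank lines are skipped.
func parseArguments(fromConfig contents: String) -> [String] {
    return contents
        .split(whereSeparator: { $0.isNewline })
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
}
```

The tool would look up the file for each target and fall back to defaults when it is missing, which is exactly the part a user cannot discover from the manifest.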

designatednerd · February 17, 2021, 10:22pm

This proposal made my day. I work on the Apollo GraphQL iOS SDK, and the level of shenanigans we have to send people through in order to set up our codegen is very, very high, and I wholeheartedly support any effort to make that easier.

Couple thoughts:

It does seem a little weird to not have options be at all a part of this. I definitely get the idea for separating how to set that up to a separate proposal to keep scope reasonable, but particularly with the requirement that extensions can't have dependencies (so one couldn't bring in a YAML parsing dependency, for instance), it does seem like a pretty big gap from a practical standpoint.
This would be absolutely amazing for Swift Package Manager, but fixing this at the SPM level does leave some questions about how to implement codegen for people who use solutions other than SPM to integrate our code, and for various reasons can't switch. This could be a good incentive to get people to switch, but from talking to users that's often a big ask.
One thing to consider is often the files used to generate code don't need to be included in the final app/lib - that's the case for .graphql files with Apollo, since we take those files and codegen our way into actual swift files. Is that distinction something that would be considered when passing the sourceFiles? It seems like yes with the .proto example, but I can't remember if you have to include the proto files in the final product.
One thing that would be particularly useful information to add to TargetBuildContext for code generation tools to have is what files have changed recently (maybe since the last build?). You often don't need to regenerate unless a specific type of file or folder of files has changed - it'd be super helpful to have the information which would allow tools like ours to skip over work when it's not necessary.
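As a rough sketch of the kind of check a tool could do with that information, assuming it only has file modification dates to go on (the function names are hypothetical, and nothing like this is part of the proposed TargetBuildContext):

```swift
import Foundation

// Reads a file's modification date, or nil if the file does not exist.
func modificationDate(of url: URL) -> Date? {
    let attributes = try? FileManager.default.attributesOfItem(atPath: url.path)
    return attributes?[.modificationDate] as? Date
}

// Regenerate when there are no outputs yet, or when any input
// is newer than the oldest existing output.
func needsRegeneration(inputDates: [Date], outputDates: [Date]) -> Bool {
    guard let oldestOutput = outputDates.min() else { return true }
    return inputDates.contains { $0 > oldestOutput }
}
```

Timestamps are only a heuristic; a build system that tracks inputs and outputs itself could make this decision more reliably.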

Overall: Yes, please!

adam-fowler · February 18, 2021, 9:40am

+1 for this.

I maintain a repository which involves over 200 libraries and 50MB of generated code. Requiring users of the project to download all this code has always been a big ask. If I could get users to generate the code themselves via SPM that would be awesome. My code generator is written in Swift so this proposal seems to cover everything I would need to do that.

One question though. I'm not sure this has been covered. I augment some of the generated code libraries with additional code to provide extra features. Would it be possible with these additions to SPM to have a library that is built from a combination of generated code and hand written code?

feldur · February 19, 2021, 2:37am

I suspect it's no more at home there than here, in that the discussion there has a different objective than mine. The one there is explicitly focused on identity, while my point is that identity is insufficient to validate the supply chain, and that we need to solve the supply chain problem. I (regrettably) lack the time to do the research necessary to create a well-researched pitch for what to do about securing the supply chain; do you think it would be productive to open a topic in "Using Swift" posing this conversation over there to gather interest / support / ideas?

John_McCall · February 19, 2021, 4:24am

I think Evolution | Discussion would be the right place.

Jens_Nerup · February 19, 2021, 12:12pm

I’m very excited to read the draft and I like that the proposed design is extensible enough to extend various parts of a build graph later on. I also find it very appealing that the Package definition and the proposed extension are using an API-based (and typesafe) approach. I especially like the idea outlined by @ktoso on how to make the extension options typesafe.

Just like @lukasa I think we need to have another look at how the extensions would work on different platforms, as it seems like we can’t limit an extension to only one or a range of platforms, if I'm not mistaken. One could think that we’ll end up in a situation where an extension is only available on certain platforms and not others (I’ve come to think of an extension for CoreData).

One way of attacking that problem could be to apply some of the same logic to ExtensionUsage as seen in the linkerSettings by using when, but that is probably not the right place to solve the problem.

…
    targets: [
        .executableTarget(
            name: "MyExe",
            using: [
                .extension("GenSwifty", package: "gen-swifty", .when(platforms: [.macOS]))
            ]
        ),
    ]
…

Alternatively this could be handled by the extension using a platforms attribute just like the regular Package definition. We could then match the platforms from the extension with the platforms from the user of the extension to verify if the extension can be used or not. But as stated in a previous comment - how would SPM fail when trying to use an extension on an unsupported platform?
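The matching step could be sketched roughly like this. The types below are hypothetical stand-ins for illustration only; they are not SwiftPM's Platform type or anything defined in the draft:

```swift
// Hypothetical platform model; not SwiftPM's actual Platform type.
enum Platform: String, Hashable {
    case macOS, linux, windows
}

// Hypothetical description of an extension that declares the platforms it supports.
struct ExtensionDescription {
    let name: String
    let supportedPlatforms: Set<Platform>
}

// Returns the client's platforms on which the extension cannot be used.
// An empty result means the extension is usable everywhere the client builds.
func unsupportedPlatforms(of client: Set<Platform>, for ext: ExtensionDescription) -> Set<Platform> {
    return client.subtracting(ext.supportedPlatforms)
}
```

A non-empty result is where the open question starts: whether that should be a hard error during package resolution, or only a failure when building for one of the unsupported platforms.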

It could also be interesting to know how the commands are executed. Is it in process or out of process? Would SPM terminate unresponsive extensions to prevent hanging builds? Both of these questions are maybe more of an implementation detail.

Last but not least. I share the concerns about the extension dependencies but to get the ball rolling I'm willing to address those concerns later as it seems like a larger task.

AliSoftware · February 19, 2021, 2:51pm

Like @designatednerd, this pitch made my day… especially as the author of SwiftGen, seeing it taken as a good candidate in the first example.

Really excited for this. I really like where this is going.

I get (and share) the frustrations of not being able to pass options to the extensions in the first iteration. I think I'd still prefer to have the pitch go forward and the feature implemented first even without support for options in its first version, as we could always document for adopters of our extensions to use JSON/YAML configuration files that could follow a specific standard (eg swiftgen-$targetName.yml or similar). It's not perfect, but I also understand the complexity of supporting options right from the first iteration, so we could live with it imho (especially if we know that options will hopefully be implemented in a future iteration)

PS: for the record, we're already working on making SwiftGen itself be built via SPM (instead of our internal Rakefile scripts and xcodebuild), and SwiftGen is also already distributed as a precompiled binary executable. So this seems like good timing.

tomerd · February 19, 2021, 7:01pm

For configuration, we need to weigh the tradeoffs between three main options:

Typesafe configuration directly in the package manifest itself
Non typesafe configuration (e.g. some kind of unstructured dictionary) directly in the package manifest itself
External configuration file(s) defined by each extension if/when needed

The sentiment in the thread, which I personally share, is that option #1 would give SwiftPM users the best experience and most safety. The technical challenge with achieving such an experience is that the package manifest is Swift code, and as such the SwiftPM/Xcode build systems would need to pre-compile the extensions before the package manifest can make use of types the extensions define. In some build tools, e.g. Gradle and sbt, there is a "build for a build" for exactly this reason. Since neither SwiftPM's nor Xcode's build system is set up for this today, this could be a pretty significant undertaking, so the main downside of choosing this option is that it will delay the extensibility feature in a significant way.

Option #2 - non typesafe configuration in the manifest - is nice in that it keeps the configuration close to the plugin binding in the manifest, and does not suffer from the chicken and egg issue of option #1. The downside is that it is not type safe and as such could lead to a subpar and frustrating user experience when making configuration mistakes. It would also likely need to be limited to a somewhat flat list of key values, which is not very flexible. The Maven build tool takes this approach, and suffers from these shortcomings, but is still fairly successful in its ecosystem. We should also consider the evolution of the Swift packages ecosystem: since we want to get to option #1 eventually, the transition from non-typesafe configuration in the manifest to a typesafe one could be painful, given that SwiftPM must support older versions of the manifest and as such would need to support a "mixed mode" for a long time.

Option #3 - external configuration files - is nice in that it gives the extension author full flexibility in how to design their configuration, and does not suffer from the chicken and egg issue of option #1. The downsides have already been brought up in this thread earlier, but the tl;dr is that configuration files are at arm's length from the manifest and so less intuitive, require the extension author to advertise information about how to use them outside of the manifest validation, and also put more responsibility on the extension author for something that SwiftPM could theoretically provide as a utility for those authors. When it comes to evolution of the proposal / feature, external files may have an advantage over non-typesafe configuration because the extensions can evolve to accept manifest based configuration (option #1) when such is available in addition to configuration files, which makes for a natural transition. With regards to the location and structure of the external configuration - for some plugins it may be enough to have a single file at the root of the project, and some may require one per target, which can be achieved by naming convention or by putting one at the root of each target.
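To make the difference between options #1 and #2 concrete, here is a small sketch. All names (GeneratorOptions, the "visibility" and "generateClient" keys) are hypothetical and not part of the proposal. With a typed struct a misspelled option is a compile error; with a flat dictionary it can only be caught when the extension runs:

```swift
// Option #1 style: a typed configuration the compiler can check.
enum Visibility: String {
    case publicAPI = "public"
    case internalAPI = "internal"
}

struct GeneratorOptions: Equatable {
    var visibility: Visibility
    var generateClient: Bool
}

// Option #2 style: a flat [String: String] that has to be validated at run time.
enum OptionError: Error, Equatable {
    case unknownKey(String)
    case invalidValue(key: String, value: String)
}

func parseOptions(_ raw: [String: String]) -> Result<GeneratorOptions, OptionError> {
    // Hypothetical defaults used when a key is absent.
    var options = GeneratorOptions(visibility: .internalAPI, generateClient: true)
    // Keys are visited in sorted order so the first reported error is deterministic.
    for key in raw.keys.sorted() {
        let value = raw[key]!
        switch key {
        case "visibility":
            guard let visibility = Visibility(rawValue: value) else {
                return .failure(.invalidValue(key: key, value: value))
            }
            options.visibility = visibility
        case "generateClient":
            guard let flag = Bool(value) else {
                return .failure(.invalidValue(key: key, value: value))
            }
            options.generateClient = flag
        default:
            return .failure(.unknownKey(key))
        }
    }
    return .success(options)
}
```

Every extension taking the dictionary route would need to write and maintain validation like this itself, and a typo such as "visiblity" only surfaces as a build-time failure rather than in the editor.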

SDGGiesbrecht · February 19, 2021, 10:15pm

It might also be important to note that there is absolutely nothing SwiftPM can do to prevent extensions from electing for Option #3 in spite of whatever SwiftPM recommends.

If SwiftPM implements Option #1, users will have to learn a mix of #1 and #3.
If SwiftPM implements Option #2, users will have to learn a mix of #1, #2 and #3.
If SwiftPM recommends Option #3, users will only have to learn #3.

Because of this, I think I would consider #3 a better interim solution than #2. It has two desirable properties: (a) It’s less for users to learn now. And (b) later on, finalizing to #1 will not involve any breaking changes or deprecations.

AliSoftware · February 19, 2021, 10:28pm

Also the benefit of option 3 is that many tools (SwiftGen, Sourcery, swiftlint, …) already use this way to configure and read options today, outside of the context of SwiftPM.

So for anyone who was already using those tools in their repos/products, they already have that configuration file (swiftgen.yml, etc) in their repo, and migrating from their current setup (having SwiftGen installed in their project via CocoaPods for example) to an integration via SwiftPM would not require them to change anything, they would just continue to use the swiftgen.yml file like before.

And for tools that are not already ready to support a config file, making the main.swift of their Package Extension read its options from a JSON file (simple key:value flat dictionary) would be easy enough as a bridge and would not require any new dependency for the tool (just parse the JSON using Foundation), only documentation. And since every good tool out there has a README with instructions to install it already, and end users of the package extension will have to go to the repo or doc to discover the right name to use in their manifest for the Package Extension anyway…
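For illustration, the Foundation-only bridge could be as small as the following sketch. The function name is made up, and how the extension locates the file is left out:

```swift
import Foundation

// Decodes a flat JSON object such as {"visibility": "public"} into a dictionary.
// Throws if the data is not a JSON object whose values are all strings.
func loadFlatOptions(fromJSON data: Data) throws -> [String: String] {
    return try JSONDecoder().decode([String: String].self, from: data)
}
```

The extension's main.swift would read the file contents, call this, and translate the key:value pairs into command line arguments for the tool.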

Again, this is all while still aiming for Option #1 in the future, but as it was said before, Option #1 being far from ready and involving more complexity (and 2-pass-builds), this seems like a very good and acceptable compromise to me in the meantime, to avoid postponing the core idea of package extensions until forever.

abertelrud · February 23, 2021, 7:09am

Thanks for pointing this out. I have corrected this in the latest version of the draft proposal.

abertelrud · February 23, 2021, 7:14am

This should be possible by declaring a dependency from the extension target on the tool, using .product() notation (since the executable target that provides the tool has to be vended as an executable product). The idea is that the extension would be able to see any tool that it declared a dependency on.

Lastly, is it possible to use generated files from extensions of dependencies? From my feeling this proposal allows that but I just want to double check if it really does allow it. An example could be a tool like mockolo. This tool is generating mocks for protocols. An important feature for that tool is to be able to generate mocks with protocols that inherit from other protocols across module boundaries. A setup could look like this:
A --> B --> C
Where a protocol X is declared inside C and A defines a protocol Y that inherits from X. Mockolo would now generate a Mocks.generated.swift for module C. Then it would need to generate a Mocks.generated for module B using the generated file from module C and in the end it generates a file for module A using both generated files from B & C.
Is it possible with the current proposal to access the generated files from the dependencies easily?

If I understand the example correctly, this should work because the code that is generated for C would be compiled into that module, and the symbols and types it provides would be no different than if they had been regular source files in C.

abertelrud · February 23, 2021, 8:33am

Thanks a lot for all the feedback!

Thanks for pointing these out. This is fixed in the latest version of the draft, but the spelling of the capability intentionally had the parentheses. The reason is that I think we'll want to extend this with parameters in the future, such as for specifying file patterns to optimize when to invoke the extension (and to help diagnostics — something that this proposal doesn't yet address is how to avoid the "unknown file type" warnings that SwiftPM currently emits). That seems more naturally extensible if the parentheses are already there.

The intent would be that any IDE that uses SwiftPM would support code completion in the same way as for PackageDescription. As you point out, that should be improved in some cases, but the expectation in this proposal is that PackageExtension would be a peer of PackageDescription and would be treated the same.

The question about Swift LSP is a good one. Since it uses libSwiftPM to parse manifests it should be able to run extensions in the same way, but of course, the commands would need to be run in order for the source files to be generated. That requires further discussion.

This is a good point and I think it should indeed be the subject of another concurrent proposal. There are a lot of details there, but I think it's somewhat separable from extensions, as long as there is a well-defined way for an extension to access the executables and auxiliary files in a binary target.

It would of course need to be defined and implemented at the same time as this proposal in order for this to be useful, but it should be separate, I think.

That's a good point. If this proposal isn't modified to have a specific affordance for this, then the package as a whole would need to be overridden, with a modified tools definition. This, too, deserves more discussion.

It would be a fairly large undertaking to support separate package dependency graphs for the various targets (or alternatively to allow multiple different versions of a package in a single graph). To support that we would also need to have a way to specify per-target package dependencies in the manifest, of course, and it gets complicated: if a package vends a build tool but also a runtime library used by the generated code, then the dependency versions of the runtime library would still need to be compatible with the dependency versions of the client package, while the tool could technically use a different set of dependencies since it runs in its own address space.

As for a build tool that itself uses another extension, that should work fairly well, as long as the extension can generate buildTool commands. There can't be circular references, of course, so you might need two extensions in the package; but this proposal should allow the build of protoc-gen-grpc-swift to use an extension that generates protoc commands, while also vending a separate extension that generates protoc commands that use the protoc-gen-grpc-swift generator. I might be missing something about the details involved, and would need to take a closer look at swift-grpc in particular, but in general an executable target that builds the tool has no further restrictions than any other executable target.

This is assuming that the extension can use the buildTool capability, so the commands can be incorporated into the build graph of any build system. I think it can do that in the case you mention, because it can know the names of the outputs before it runs protoc (at least with the source generators that are relevant here, including, I think, swift-gen-grpc-swift).

The reason for prebuild commands as distinct from buildTool commands is that SwiftPM's build system (and those of some IDEs that use libSwiftPM) can't currently adapt their build plans based on output files whose names aren't known until the command runs, i.e. to "discover more work" as a result of running commands. So the names of outputs of a buildTool have to be known up-front. That's very restrictive, so the idea was to also allow build commands that run before the build plan is made so they can emit arbitrary outputs that build systems that need to know all work up-front would then see.

The reason for the restriction in this proposal that prebuild commands can only use binaries, and not executables built by SwiftPM, is that SwiftPM can't currently do separate preparatory builds to create the set of tools to be used during the actual build (tiered builds, as ktoso said). SwiftPM could be changed to do that. But any IDE that uses its own build system to build packages would need to be able to do that too. It's possible that there could be some hybrid where SwiftPM's build system is used to build the host-side tools and then the IDE's build system is used for the package products.
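A simplified model of that distinction, purely for illustration. The Command type and both functions below are hypothetical and are not the API from the draft:

```swift
// Simplified model of the two command kinds discussed above; not SwiftPM's API.
enum Command {
    // Output paths are declared up front, so they can be wired into the build graph.
    case buildTool(name: String, inputs: [String], outputs: [String])
    // Outputs are only discovered by looking in outputDirectory after the command has run.
    case prebuild(name: String, outputDirectory: String)
}

// Outputs the planner can add to the build graph without running anything.
func plannableOutputs(_ commands: [Command]) -> [String] {
    return commands.flatMap { command -> [String] in
        switch command {
        case .buildTool(_, _, let outputs): return outputs
        case .prebuild: return []
        }
    }
}

// Names of the commands that must run before the build plan can be made.
func commandsToRunBeforePlanning(_ commands: [Command]) -> [String] {
    return commands.compactMap { command -> String? in
        if case .prebuild(let name, _) = command { return name }
        return nil
    }
}
```

In this model a build system would first run everything returned by commandsToRunBeforePlanning, scan the output directories, and only then create the plan that includes the buildTool commands.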

So to lift this restriction would require some non-trivial build system work, and wouldn't work for all IDEs. But the proposal doesn't preclude lifting that restriction in the future.

Ideally any build system that builds Swift packages would be able to run commands, see what they produced, and then generate more build commands based on the outputs after running the command by applying the same build rules as for the source files. But not all IDEs can do that.

So a challenge with this proposal is to try to define these build commands and the capabilities in a general way that can be adopted by different IDEs, as well as by SwiftPM itself. I expect that we'll be able to lift those restrictions over time, however, assuming that the build systems become a bit more flexible.

Longwinded answer there but hopefully that at least partially answers it.

FranzBusch · February 23, 2021, 8:49pm

It is not so much about where the code is getting compiled, but rather that a build phase for target A needs an output of a build phase from target C. To rephrase: is it possible to use the output of a build phase of a dependency as an input of another build phase?

abertelrud · February 25, 2021, 9:40pm

No, not in this proposal. In the initially proposed API, the output from one build tool can only be passed to the next within one target.

abertelrud · February 25, 2021, 9:44pm

Thanks a lot for all the feedback! A revised version of this proposal is up for review at SE-0303: Package Manager Extensible Build Tools.

The proposal under review is similar to the one pitched here but tries to incorporate the feedback from this thread. One visible change that doesn't make much semantic difference is that "extensions" are now called "plugins" to better align with other build tools and to reduce confusion with existing use of the word "extension" in Swift.

Thanks!

feldur · May 31, 2022, 8:24pm

John - did you ever see this discussion? It seems like I missed something in how to generate interest. Might you have any further suggestion?

John_McCall · June 1, 2022, 2:12am

I'll respond there.