[Pitch] Package traits

tshortli · May 31, 2024, 9:00pm

I think it's worth spelling out what you mean by "compile" here. I've been making the compiler more aware of code that is unreachable at runtime due to un-availability constraints. For example, a function that is @available(macOS, unavailable) is unreachable at runtime^[1] on macOS and can therefore either be omitted in the resulting binary (or at least stubbed out if it still needs to exist for linking purposes). Taking advantage of this knowledge, we can get some of the best of both worlds: we still type check code that is unreachable at runtime, since it's not conditionally compiled, but we don't do any code generation for it so the size impact is minimized. This helps avoid a problem with conditional compilation where you can't be sure that code you are changing will still compile in other build configurations without actually building in those configurations. Embedded Swift takes advantage of this feature to keep various declarations available for type checking, while not allowing those unavailable-in-embedded declarations to actually wind up in the final binary.

Modulo various compiler bugs that defeat availability checking, or the owner of a resilient library retroactively marking a previously available function as unavailable. ↩︎

Alejandro · May 31, 2024, 9:10pm

It's perfectly reasonable that any documentation target can have all of the features enabled to be able to compile and see all definitions to emit documentation, but clients depending on a package without feature X won't even compile/typecheck/potentially see a file for feature X.

taylorswift · May 31, 2024, 9:11pm

does this imply that all traits must be mutually compatible with one another?

Alejandro · May 31, 2024, 9:27pm

There's lots of different ways to tackle the documentation issue, if the traits are all mutually compatible then it can just turn them all on in one shot. If not, then it could turn each one on individually and attempt to merge artifacts, etc. The problem's been solved before, we just have to find how it fits into our language and ecosystem

FranzBusch · May 31, 2024, 10:19pm

Yes and that’s called out in the proposal. A package must support compiling with all possible trait combinations including all traits enabled. A trait must never remove API or change API it can only be additive.

lukasa · June 1, 2024, 6:11am

The problem with the “type check but don’t compile” pattern both you and @Joe_Groff bring up is that to do it requires that you still have all the necessary modules and headers to do that type checking. This is particularly challenging in SwiftPM with system dependencies.

A common example that we used to see was OpenSSL: if you depend on something that uses OpenSSL and needs to type check, you have to have the OpenSSL headers. Those aren’t shipped by default on Apple platforms and SwiftPM can struggle to find them elsewhere, so this makes everyone’s build story harder, even if they never use OpenSSL.

Relatedly, in SwiftPM land today we have no way to emit the type checking information for a module without compiling it first. So the impact here is that you do actually have to compile your whole dependency tree. Using an availability guard here would require us to be able to produce Swift module files without compiling the whole tree first, and also requires us to get internal imports actually working.

FranzBusch · June 1, 2024, 10:51am

Sure. When I state that code does not get compiled I mean that the compiler doesn't even see the code since it wouldn't be able to type check it in many cases due to the symbols not being present. Especially when optional dependencies are guarded by a trait then we don't even have to checkout the code for the dependency nor compile it.

As @lukasa said this is common with system dependencies like OpenSSL but it would be the same for any other package dependency in the graph that is optional. If such a package dependency is not enabled then we shouldn't have to fetch it.

tshortli · June 2, 2024, 5:40pm

Thanks! The main reason I asked is that I think conditional compilation has some significant tradeoffs, and to me it is not an ideal solution to either management of code size or access to experimental features. If we implement package traits, they are undoubtedly going to be used for those purposes but I would love to have different tools to solve those problems that don't involve conditional compilation. However, the example of dependency control that @lukasa provided does feel like a much clearer motivation for the unique capabilities of a package traits feature.

tachyonics · June 3, 2024, 11:38pm

Overall a good proposal that addresses a number of existing issues, particularly around optional dependencies.

However I do find the syntax of traits: [.defaults, ...] awkward.

I do see that Cargo/Rust also includes a default mechanism and has some justification for it as providing convenience but I am still not 100% sure about a concrete use case.

As for the syntax, my preference would be to follow Rust/Cargo's lead and have something like a defaultTraits: Boolean parameter (defaulting to true) rather than mixing a weird proxy into the list of traits.

dschaefer2 · June 5, 2024, 4:42pm

I'll just add some flavour to something @allevato mentioned. With scalable builds, you generally never have the entire action graph. It's one way that it scales. Dependencies are only calculated for the requested targets. Invoking the build twice for different build targets will result in a different action graph. You really have no way of knowing all the traits that dependents may want.

The same issue would arise if we want to cache built artifacts from locally run swiftpm builds of multiple projects. Think SwiftSyntax as an example everyone is running into.

At the end of the day, we probably need to treat the combination of traits dependents request as different build configurations and cache multiple variations of the artifacts. Now that's perfectly OK. I just don't think unifying the traits is possible at scale.

FranzBusch · June 7, 2024, 10:12am

I did start off with this in the beginning. I think an argument can be mode for both approaches. The approach of .defaults means that there is only a single configuration option that changes how traits behave and the if users configure custom traits they have to pass .defaults to get the default traits. The approach of the boolean flag means that if users configure custom traits they might or might not have the default traits enabled based on that flag.
I gravitated towards the .defaults spelling because it made it clearer in my opinion what traits are enabled or not.

I totally agree with this. It is similar to regular C macros in a way. If you have a build system that builds multiple binaries which share the same libraries you have two approaches. Either you can unify all the passed macros for the libraries so you can share them for every single binary or you have to build the libraries with different macros for each binary. The same applies for package traits; however, the one added benefit is that package traits guarantee that all combinations of traits must work together. So in theory you could just unify the traits for each module across all the binary graphs. The trade off is that you might have a larger binary size in the end.

Now as you said you can just build each module with the different required trait configurations and resolve this again. This just requires a bit of work on the build system side to setup the configurations correctly.

filip-sakel · June 7, 2024, 12:03pm

Also as a future direction, I wonder if we could have enum-like traits. For example, if a trait selects a renderer, it would be more expressive and less error-prone to say “—trait renderer=A.” With the current boolean traits, one could enable 3 renderers and then we’d need to manually issue errors if more than 1 renderer was active.

allevato · June 7, 2024, 12:15pm

What guardrails are in place to verify that a package developer has maintained this requirement, rather than an author being able to push out a package that is only discovered later to have conflicting code with a particular combination of traits?

This is where the "typecheck but don't compile" would shine better than "compiler doesn't even see the code". If the former approach were used, the compiler would still see the union of all possible declarations in the package regardless of the trait configuration used, and it could report that it's seen a declaration that isn't present in some trait configuration, or that a certain configuration produces multiple conflicting declarations. Then, as long as the package author is building their code in any configuration, errors would be caught.

I'd be more worried if we're depending on package authors using a specific workflow to catch this, like relying on documentation generation or expecting them to test their package manually with --enable-all-traits.

FranzBusch · June 7, 2024, 12:34pm

In theory we could have spelled traits using an enum; however, I decided against that to not diverge the current String based approach that the manifest choose. I agree that a stronger typed manifest would be great even for targets and products but I valued the consistency more here.

Your example for using a trait to choose between options is valid but if you decide to throw an error this can only happen during runtime. Every package must support building with all traits enabled at once. So in your example you must expect that all renderer traits are active at once and you have to decide in your code what should happen then. Pick one of them or just fail at runtime. Importantly you shouldn't fail the build!

I think this is a fair concern but I am personally not too worried about this since other ecosystems, such as Rust, have shown that this can work and scale.

In general, there is nothing holding back package authors to push invalid releases. Importantly, this is the case currently as well. Package regularly ship API breaks without bumping major versions due to a lack of CI that catches such issues or an incomplete understanding of what qualifies as a SemVer breaking change.
Package authors should invest into CI tooling which gives them confidence in releasing new versions. This is an area that we can improve as a community by providing a recommended GH actions setup for packages.
Now coming back to traits, when a package adopts traits it should setup CI to run with all traits enabled and potentially all combinations that make sense to test individually.

Once we have more usage of package traits in the ecosystem we can see what additional tooling we can provide to developers to make their lives easier to verify if their traits are all working as expected such as a swift build/test --all-trait-combinations.

gwendal.roue · June 7, 2024, 1:07pm

In this thread I describe a wrong usage of traits that could not automatically be caught automatically, even with swift build/test --all-trait-combinations.

FranzBusch · June 7, 2024, 1:15pm

I am not sure if I am understanding this thread correctly. How is the OtherPackage conflicting with the GRDB package? Is it because the traits of two separate packages are conflicting with each other and this only surfaces at the end binary?
Could you spell out a rough setup that you have in mind here which would resolve in a broken build using package traits?

taylorswift · June 7, 2024, 1:16pm

i agree with Tony that we should make it a goal to provide some sort of enforcement assistance so that trait compatibility is actually meaningful. we already have a widespread problem of packages that compile successfully as dependencies but fail to generate documentation. this is a difficult situation because the developers responsible for generating the documentation (Swiftinit, SPI, etc) are not always the same people who wrote the package and cannot correct the source code without the cooperation of the author. the prevailing approach so far has simply been “ignore all documentation errors” which doesn’t feel sustainable or something we should be doubling down on.

FranzBusch · June 7, 2024, 1:23pm

I understand the desire to have an enforcement but I don't think there is anything here that we can do for traits nor any other thing that can break builds or documentation generation. We don't have a centralised place that builds all packages and approves releases. The SwiftPackageIndex shows if a package builds on various Swift versions and if the SPI maintainers are interested they can incorporate trait builds into this.
However, in the end it is up to the package authors to make sure they are releasing a package that is capable of building on all supported Swift versions with all trait combinations and the package generates documentation properly.
Now I agree that we can provide tools to help authors to understand if their package is in a healthy state to release such as the --all-trait-combinations flag but I don't think we have the power to enforce anything here. Maybe in the future if we get a package registry we can run pre-checks locally but even then everybody can release broken package versions.

gwendal.roue · June 7, 2024, 1:26pm

Sure, I can expand.

The GRDB package, currently, links with the SQLite library that ships on the operating system. Some users request support for SQLCipher, the SQLite "fork" that adds support for encryption.

Putting aside the fact that SPM can not (yet?) build SQLCipher, let's imagine that it is possible, and that GRDB adds a trait "SQLCipher" that replaces the system SQLite with SQLCipher.

At first glance, this looks a correct usage of traits, because SQLCipher adds new apis and does not remove any feature.

In reality, this is in incorrect usage of traits, because a new operating system will ship eventually, with a newer version of SQLite. The vanilla GRDB (no trait) will ipso facto become more powerful than GRDB+SQLCipher. And this is an incorrect usage of traits, as defined here.

For example:

MyApp depends on GRDB (+ SQLCipher)
MyApp depends on OtherPackage
OtherPackage depends on GRDB (no trait)

When iOS 19 ships, the app may break, because OtherPackage is entitled to execute SQL that is only valid on iOS 19:

// Works with GRDB without trait, fails with GRDB+SQLCipher
#if available(iOS 19, *) {
    try db.execute(sql: "Some SQL that requires iOS 19")
}

See? Even if GRDB (+SQLCipher) is 100% compatible with GRDB (no trait) today, it will become incompatible on the next OS release.

That was the scenario that can't be detected automatically, because breakage will happen in the future.

(To make me very clear: GRDB will not define a SQLCipher trait, because this would create the problem described above)

EDIT: I'm not sure I'm adding much to the thread with those posts. It's just a reminder that automatic detection of invalid traits can't be 100% perfect.

It's an opportunity to remind SPM authors that there exists:

Requests that SPM can build SQLCipher and SQLite (some details if one is interested)
Requests for a sort of package variant that is more versatile than traits (so that the MyApp described above would NOT build, because MyApp and OtherPackage have conflicting dependencies)

filip-sakel · June 7, 2024, 1:34pm

I must have missed something because I don't understand why a project must be able to build with all traits enabled. For instance, couldn't we design a trait that controls whether to use a system library? If we then build on a system without said library, that trait needs to be disabled. Consequently, enabling all traits in that case would automatically result in a build failure.