[Pitch] Adoption tooling for upcoming features

Hello there!

I just finished drafting a proposal that I’ve been looking forward to sharing with you all. The basic idea is an integrated mechanism for producing source-compatible adjustments to code that can be applied to preserve its behavior once a given eligible upcoming feature is enabled. Put simply: a new, feature-oriented code migrator.

Although the focus of this proposal is quite deliberately mechanical migration, this feature has a lot of potential beyond the scope of preserving semantics.
Ultimately, it could enable us to implement comprehensive adoption and educational experiences for arbitrary features.

Please feel free to leave editorial feedback on the corresponding pull request, and use this thread to discuss the solution in question and suggest improvements or alternatives.

And in case you’re wondering, more naming ideas are most welcome too.

Thanks!

26 Likes

This is great! We recently pitched an overhauled version of Extensible enums for non-resilient modules, which uses a language feature to change the behavior of enums within a module. That would be a perfect candidate for an adoption mode: each existing public enum in the module would generate a warning with a fix-it to add @frozen to it, in preparation for turning on the language feature without any API breaks.
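
To make that concrete, here's a rough sketch of what such a fix-it could produce, assuming the pitched extensible-enums feature (the enum and the exact diagnostic are made up):

```swift
// A public enum in a non-resilient module. Clients can currently switch over
// it exhaustively; with the pitched extensible-enums feature enabled, it would
// become extensible by default, which is a potential API break for them.
//
// The hypothetical adoption-mode fix-it would insert '@frozen' to keep the
// existing exhaustive-switching guarantee once the feature is turned on:
@frozen
public enum PaymentMethod {
    case card
    case cash
}
```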

3 Likes

This is cool and will be a helpful tool!

For a large codebase, having a way to automatically apply these fix-its would be huge. It makes sense for that to be separate future work from this proposal, though.

In theory, a third-party tool could take the compiler output and automatically read, digest, and apply the given fix-its. Would it be possible for a third-party tool to consume them? From what I remember, fix-its don't get printed to standard output.

1 Like

I have other thoughts I'm still writing down, but if this doesn't produce new warnings, how will the fix-its be surfaced? AFAIK, there's no standalone fix-it diagnostic.

1 Like

Yes, there's no way to produce a standalone fix-it, so the proposal does use warnings as the mechanism. From the first sentence of the detailed design:

Upcoming features that have mechanical migrations will support an adoption mode, which is a new mode of building a project that will produce compiler warnings with attached fix-its that can be applied to preserve the behavior of the code once the upcoming feature is enacted.

I couldn't find a place in the proposal that says otherwise, but if I missed one, please let me know.
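
To make the shape of it concrete, here's a sketch of the kind of warning-plus-fix-it I'd expect, using ExistentialAny purely as an illustration (the exact diagnostic text, and which features end up supporting adoption mode, is up to the proposal):

```swift
protocol Drawable {
    func draw()
}

// Under ExistentialAny, spelling an existential as a bare protocol name is an
// error. In adoption mode the compiler would instead warn on the old spelling
// and attach a fix-it inserting 'any', which keeps the code's meaning intact:
func render(_ shape: any Drawable) {
    shape.draw()
}
```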

Yeah, not sure what I was seeing. I thought it said it couldn't produce any new errors or warnings, but it just says errors. Makes my actual response much easier.

2 Likes

They do get printed with the LLVM formatter: -diagnostic-style llvm. But I think such tools should instead rely on -serialize-diagnostics-path (a frontend flag) or -emit-module-serialize-diagnostics-path, plus -print-diagnostic-groups (currently a hidden frontend flag) to filter for the relevant diagnostics until we serialize the group information.

2 Likes

The serialized diagnostics format has a "Category" field, which clang uses to store diagnostic flags and which is currently unused by Swift outside of some edge cases. It may be worth considering storing the diagnostic group flags there for use by tools.


Overall, I think this is a good approach to mechanical migration. Integrating it into a user's standard build process via a single additional flag makes it very easy to adopt reliably across any build system or non-standard project configuration. Treating adoption as something that's done at feature granularity rather than language-mode granularity also gives users a lot of flexibility in how the mode is used, which I think is valuable.
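
For reference, here's how an upcoming feature is enabled per target in a SwiftPM manifest today; presumably adoption mode would hang off the same mechanism, though the exact spelling is the proposal's to define:

```swift
// Package.swift excerpt: enabling an upcoming feature for a single target.
// How adoption mode is layered on top of this is not something I'm asserting
// here; this just shows the "one extra flag per feature" integration point.
.target(
    name: "MyLibrary",
    swiftSettings: [
        .enableUpcomingFeature("ExistentialAny")
    ]
)
```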

3 Likes

I've recently been advising on an internal clang-based tool for finding uses of specific APIs, and I wonder if there's a more powerful feature we should be aiming to build here.

Compilers are in a good position to automatically collect the information for certain kinds of auditing tasks as a side-product of a build. Since the compiler processes all the code anyway, it's easy for it to check whether that code matches some predicate as it goes along. Once you have a list of matching code sites, there are a lot of tools you can straightforwardly write on top of that.

Now, it would be easy to generalize this to the point that it becomes a problem. In its full generality, this would require running some kind of script (or macro) over the AST as the compiler builds it. That sounds really expensive, and it would likely become a significant burden for the maintainers of both the compiler and the script. But I think there's a lesser generalization that would be quite reasonable to maintain and still very efficient.

We start by building in support for certain simple predicates, like:

  • Does this code use a declaration with a particular name?
  • Does this code use a particular language feature?
  • Would this code change behavior if a particular upcoming feature were enabled?

The compiler would then just take in a configuration file with a list of these predicates, apply them as it goes, and collect all the matching code sites in an output file. Since it knows exactly which predicates it's looking for ahead of time, it can avoid doing unnecessary work just to set up for predicates that it hasn't been asked about. As a result, this should have negligible overhead on compiler invocations that aren't doing any auditing.

This would enable a large category of auditing tools to be written without adding any custom code to the compiler, as long as the existing built-in predicates are adequate for the job. For example, if there's an API internal to your project that you only want used in one specific place in the code, you can configure your build with an audit configuration that collects uses of it into a file, then write a simple script to verify that the number of uses in that file doesn't go up (see the sketch below). We'd probably want to allow multiple independent configurations to be specified at once, but that doesn't seem problematic.
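
As a sketch of that last workflow, assuming the audit output is a plain text file with one matched code site per line and a checked-in baseline holding the allowed count (both formats are made up here):

```swift
// audit-check.swift: fail the build if uses of a restricted API grow.
import Foundation

let auditFile = "audit/restricted-api-uses.txt"        // hypothetical audit output
let baselineFile = "audit/restricted-api-baseline.txt" // hypothetical baseline

// Count one matched code site per line in the audit output.
let uses = try String(contentsOfFile: auditFile, encoding: .utf8)
    .split(whereSeparator: \.isNewline)
    .count

// The baseline file just contains the currently allowed number of uses.
let allowed = Int(
    try String(contentsOfFile: baselineFile, encoding: .utf8)
        .trimmingCharacters(in: .whitespacesAndNewlines)
) ?? 0

if uses > allowed {
    print("error: \(uses) uses of the restricted API found; baseline allows \(allowed)")
    exit(1)
}
```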

Now, the tool you're proposing would emit diagnostics and fix-its instead of an output file, but that just seems like an alternative output rather than something fundamentally different.

10 Likes

Oh, I really love these; especially limiting the use of unsafe or otherwise "don't use this anymore" APIs just by name or by language feature would be great.

Similarly, collecting declarations of some specific kind; I was asked at some point about an auditing feature for distributed, and it would slot into this idea perfectly: just collect all distributed declarations into a report file and there we go, that's your 'exposed' remote API surface :partying_face:
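
For anyone who hasn't played with it, this is roughly the kind of declaration such a report would pick up; nothing here is specific to the auditing idea, it's just ordinary distributed-actor code:

```swift
import Distributed

distributed actor Greeter {
    typealias ActorSystem = LocalTestingDistributedActorSystem

    // Both the actor itself and this method are 'distributed' declarations,
    // i.e. exactly what an audit of the remote API surface would collect.
    distributed func greet(name: String) -> String {
        "Hello, \(name)!"
    }
}
```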

5 Likes

Yeah. Of course, the “does this use this language feature” tests would need custom logic for each feature, so there would be a catching-up period, but I feel like it would be straightforward to add a check for most new features.

2 Likes