[Pitch] Extensible enums for non-resilient modules

The major issue is that simply adding a case doesn't just produce a warning; it can break the build for everyone depending on the package until the new case is handled. This is unacceptable because builds can break without warning for consumers of the package, often at random in CI or for new team members, rather than for anyone who has an older version of the package cached.

1 Like

It's not something to be avoided entirely, but there usually should be some time between semver-major releases. Major releases can contain breaking changes, such as adding new cases to frozen enums.

For example Swift 5 -> Swift 6 took 5 years.

At any rate, doing a new major release for each new enum case, or every few new enum cases, is too much. You'd usually want a few more meaningful changes before doing a major release.

That's just how things have grown to be. I haven't done a proper study as to why ... that might be interesting to do tbh.

I pitched this very recently and even implemented it:

There is lots of useful feedback in that thread which I don't see addressed in the pitch, including on the naming. For instance:

The same criticism also applies to @extensible.

Personally, I continue to prefer @nonfrozen, but I don't really care as long as the functionality exists under some name. If we do actually manage to change the default behaviour for enums in a future language mode (as the LSG says they are open to), the attribute would just be temporary anyway.

We generally ask that people pitching a feature read past discussions on the topic and incorporate them in to their new pitch. That just saves everybody a lot of time, not having to re-hash arguments or miss some important details.

5 Likes

I will add that, for library developers, tagging semver majors is generally undesirable unless there is a strong reason to do so.

Particularly in low-level/foundational libraries (like SwiftNIO), a major would most likely force anyone depending on it to also have to tag a new major for their package if they choose to upgrade. This creates a chain of new major releases that ends up fragmenting the ecosystem since projects that haven't yet updated to the new majors will block others from upgrading.

3 Likes

One question -- if I have an exhaustive switch statement plus an @unknown default case, and then the library adds another enum case later, what happens? Does my switch statement continue to compile even though it is no longer exhaustive? If not, then I don't think this fixes the API evolution problem for libraries.

For example, when I test this locally with a non-frozen enum, I see that even with @unknown default I have to switch exhaustively (or use default). If we have:

switch pizzaFlavor {
case .hawaiian:
    throw BadFlavorError()
case .pepperoni:
    try validateNoVegetariansEating()
    return .delicious
case .cheese:
    return .delicious
@unknown default:
    try validateNoVegetariansEating()
    return .delicious
}

and then later the library adds the .veggieSupreme flavor, wouldn't we now get the error:

error: switch must be exhaustive
   | `- note: add missing case: '.veggieSupreme'

So, as is, I'm not sure how this helps API authors confidently add new enum cases without the possibility of breaking existing consumers.

+1. In a large codebase with many modules that are compiled together as a unit (e.g. a large and complex app), this extra overhead from @unknown default is not valuable. Treating code within the same package the same as code within the same module is a good way to help handle this.

At Airbnb, our entire app is compiled with all of the modules being part of the same package. We like this approach because it relaxes warnings that we don't find relevant for our use case (like for retroactive conformances to types defined in other modules within the app). For example, we think it's perfectly fine for one team to add a retroactive extension to a type defined in another team's module.

However, this definitely isn't the default, and I imagine it might not be that common.

2 Likes

Thanks for all the great feedback so far. I just pushed an update of the proposal trying to incorporate most of it. I want to call out a few important differences compared to the initial one:

  • Added a short note about the differences between this proposal and previous ones
  • Reworked the sections about how the language feature works, with examples. Importantly, this is the first language feature that also has an impact on consuming modules.
  • Clarified that the intention is to enable this language feature by default in the next language mode
  • Added a migration path for modules adopting the language feature and a potential future language mode, including a second attribute, @nonExtensible
  • Expanded on how this affects code inside the same module/package.

I just updated the proposal to include a note about the differences to previous proposals:

Differences to previous proposals
This proposal expands on the previous proposals and incorporates the language
steering group's feedback of exploring language features to solve the
motivating problem. It also provides a migration path for existing modules.

I also reworked parts of the proposal to focus more on how the proposed language feature works and what migration paths we would enable for maintainers of packages.

Having looked through the previous pitch thread and talked with @lukasa about it, I can say there was nothing blocking it back then. Cory provided the implementation; the proposal just never got scheduled for review.

The pitch from @Karl got LSG feedback and we incorporated this into the proposal by using a language feature and providing migration paths.

Thanks for bringing this up. I added @unknown catch as a potential future direction, since it is possible to handle typed enum errors like this:

do {
    try throwingMethod()
} catch {
    switch error {
    case .bar: break
    case .foo: break
    @unknown default: break
    }
}
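
For reference, a minimal sketch of the declarations the snippet above assumes, using typed throws (SE-0413); the error type and its cases are hypothetical and exist only to make the example self-contained:

// Hypothetical declarations assumed by the snippet above. Because the
// function uses typed throws, the implicit `error` value in the catch
// block has the concrete type LibraryError and can be switched over.
enum LibraryError: Error {
    case foo
    case bar
}

func throwingMethod() throws(LibraryError) {
    // ...
}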

I agree that extensible enums are not the right tool for a single code base that includes both the enum definition and all places where the enum is consumed. That's why this pitch still requires exhaustive matching for enums from the same module or the same package. Xcode is outside of Swift's control, but in Xcode you can set the package identifier, so as long as all modules share the same one you can still match exhaustively.
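
As a rough illustration of that same-package carve-out under the pitched behavior, assume both modules are built with the same package name (SwiftPM passes this automatically for targets of one package; outside SwiftPM the underlying compiler flag is -package-name, which can be passed e.g. via OTHER_SWIFT_FLAGS for an Xcode target). Module and type names below are hypothetical:

// Module PizzaKit (library target), part of package "MyAppPackage":
public enum PizzaFlavor {
    case hawaiian, pepperoni, cheese
}

// Module PizzaApp (app target), part of the *same* package: exhaustive
// matching is still allowed under the pitch, with no @unknown default.
func describe(_ flavor: PizzaFlavor) -> String {
    switch flavor {
    case .hawaiian:  return "controversial"
    case .pepperoni: return "classic"
    case .cheese:    return "reliable"
    }
}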

Just to be clear, I am very empathetic to this, and I agree handling @unknown default cases is often impossible. However, that's why the compiler emits a warning when a known case isn't handled even when an @unknown default case is defined. The unknown case really should never be hit, or only for a short period of time until the adopter gets around to properly addressing the new case.
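
Concretely, under the behavior described here, a client switch like the sketch below keeps compiling when the library later adds .veggieSupreme; the compiler surfaces the unhandled known case as a warning rather than an error (the PizzaFlavor example from earlier is reused, and FlavorVerdict is a hypothetical return type):

// Client code; PizzaFlavor is an extensible enum from another package.
func evaluate(_ pizzaFlavor: PizzaFlavor) -> FlavorVerdict {
    switch pizzaFlavor {
    case .hawaiian, .pepperoni, .cheese:
        return .delicious
    @unknown default:
        // A newly added case such as .veggieSupreme lands here at runtime;
        // at compile time it is reported as a warning until the client
        // handles it explicitly.
        return .delicious
    }
}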

I intentionally used the previously proposed @extensible attribute here since @frozen also has ABI implications which @extensible does not. I added an alternatives considered section for this.
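
For reference, a minimal sketch of the pitched spelling on a public enum in a non-resilient module (type name hypothetical):

// Opts the enum into extensibility without the ABI promises that @frozen
// carries for resilient libraries; clients outside the package must use
// @unknown default when switching over it.
@extensible
public enum PizzaTopping {
    case basil, olives
}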

As others have said, it is not something to be avoided completely, but the bar for a new SemVer major release should be high, especially for packages that sit low in the dependency graph, e.g. swift-collections or swift-nio. If those packages were to release a new major, it would most likely fragment the ecosystem, since an end-user application (iOS app, package with an executable target, etc.) won't be able to use the latest major until all of its transitive dependencies have updated to that new major as well.
SwiftPM could hypothetically allow multiple different majors of a single package in the dependency and build graph, but even then the bar for a new major of a foundational package would still be incredibly high.
The result of that high bar is that enums become practically unusable for any public API in such packages, taking away an important Swift language construct.

2 Likes

Thanks for the feedback. We have certainly read the previous threads, both yours and Cory's original one. That's how we ended up incorporating the LSG's feedback of exploring language features to stage in this change. I added a reference at the top of the proposal to your previous pitch.

Regarding the associated value and naming feedback from your previous pitch thread: I added sections in future directions and alternatives considered, respectively, addressing those.

Thanks—can you provide specifics on this point? It would be necessary to know how it's expanded on prior proposals and how it's incorporated or explored past feedback in order to evaluate the pitch.


This addition is ill-considered: it breaks your own stated goal to "align the behavior" between language dialects and instead now adds further differences.

I also do not understand why there are now two attributes that do the same thing. What would be the semantic difference between @frozen and @nonExtensible for a non-resilient library that necessitates a distinct attribute and new errors?

3 Likes

Sure, the proposal at a high level takes a different approach than the previous ones by using a language feature to stage in the alignment of behavior between the dialects. It is similar to the previous ones by using attributes to explicitly control the extensibility of enumerations. However, the attribute to mark an enumeration as extensible is only provided as a migration tool with the expectation that once the new language mode arrives the attribute is no longer needed in modules adopting the language mode.

That's a fair criticism, and I have been going back and forth on it. In previous pitch threads, it was raised that @frozen implies more than just extensibility on enums and that using the attribute in non-resilient modules could lead to confusion. I personally agree with this and think we are better off having a new set of attributes decoupled from the concept of frozen.

However, you are right that the goal is to align both modes. So as far as I can see, we have three options:

  1. Re-use @frozen to mark enums as non-extensible in non-resilient modes and introduce a @nonFrozen to provide a migration path
  2. Re-use @frozen to mark enums as non-extensible in non-resilient modes and expect that the migration path is to enable the language feature and mark all existing public enums as @frozen
  3. Introduce a new set of attributes. This leaves the open question of whether it is possible to have a @nonExtensible non-frozen public enum in a resilient module; otherwise, we have divergent behavior again.

Now, after having spelled all of this out, I think the most sensible choice is actually 2: it only requires a new language feature, re-uses the existing attribute that works in both dialects, and still provides a migration path.
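
A minimal sketch of what option 2 could look like for a non-resilient library adopting the feature (type names hypothetical, and the exact feature spelling still open):

// Existing public enum: kept @frozen so current clients can keep switching
// over it exhaustively without a source break.
@frozen
public enum ExistingFlavor {
    case hawaiian, pepperoni, cheese
}

// New public enum: extensible by default under the feature, so external
// clients need @unknown default and the library can add cases later
// without a new SemVer major.
public enum NewFlavor {
    case margherita, veggieSupreme
}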

1 Like

Wouldn't @frozen imply non-extensibility?

1 Like

Just to clarify, isn't @frozen non-extensible?

What would the expectation be for a package that is available to build from source but can also be distributed as an ABI-stable library (e.g. Foundation)? It seems like in this paradigm, owners of such a library would have to maintain confusingly contradictory attributes that are conditionally compiled depending on the context they are being built for. When the library is built for ABI-stable distribution, its enums cannot be @frozen unless they are in fact actually frozen. When built as a package, on the other hand, they would be expected to conditionally add @frozen in order to maintain source compatibility with clients that have always used them as a package.
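
For illustration, the kind of conditional compilation this paragraph worries about might look roughly like this (the compilation condition is hypothetical):

#if ABI_STABLE_DISTRIBUTION
// Built for ABI-stable distribution: the enum must stay non-frozen so the
// library can keep evolving it without breaking ABI.
public enum ColorScheme {
    case light, dark
}
#else
// Built from source as a package: marked @frozen purely to preserve source
// compatibility for clients that have always matched it exhaustively.
@frozen
public enum ColorScheme {
    case light, dark
}
#endif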

I personally am not sure it should be a goal to maintain source compatibility for uses of existing enums that should have been treated as non-frozen by the language all along but weren't. In my mind, the source break is part of the point here. However, it does of course need to be opt-in. What if the source break required an agreement on both sides of the module boundary to take effect? That would mitigate the difficulty for libraries wanting to migrate to this new behavior without requiring them to add @frozen to existing enums when that isn't semantically correct. In this approach, if a library opts in to ExtensibleEnums the change does not have any immediate effect on clients until they also opt-in (maybe by enabling the same upcoming feature or a different one if we need to make them orthogonal).
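
In package-manifest terms, that both-sides agreement could look roughly like the client manifest below, assuming the feature is surfaced as an upcoming-feature flag named ExtensibleEnums (the name used in this thread; package names and URL are hypothetical). The library's own manifest would carry the same swiftSettings entry:

// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "PizzaApp",
    dependencies: [
        .package(url: "https://example.com/PizzaKit.git", from: "1.0.0"),
    ],
    targets: [
        .executableTarget(
            name: "PizzaApp",
            dependencies: [.product(name: "PizzaKit", package: "PizzaKit")],
            // The client only sees the source break once it, too, opts in.
            swiftSettings: [.enableUpcomingFeature("ExtensibleEnums")]
        )
    ]
)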

2 Likes

I anticipate an objection to my idea would be that library owners want to be able to opt in to ExtensibleEnums and immediately start reaping some benefit, even when they have existing clients that are not going to opt in to a new language mode any time soon. Ideally, if the library owner introduces new enums, they should start getting the benefit of extensibility for those enums regardless of the language mode of their clients. But I also think there should be a migration path for those clients to eventually start treating the existing enums as extensible too, so I don't think re-using the existing @frozen attribute to maintain source compatibility is great, as it locks in non-extensibility.

What if there were a @preExtensibleEnums attribute that can be applied to existing enums as part of updating a package to use the ExtensibleEnums feature? This attribute would essentially indicate that the enum should continue to be treated as non-extensible, but only for clients that haven't yet opted-in themselves. That way package owners don't have to commit to non-extensibility for existing enums indefinitely.

I still hold that, as someone working in a project with multiple internal packages seeing frequent changes (which are compiled together statically with each new build), this is a bad move and leads to annoying and, for us, nonsensical additions and warnings. I am probably in the minority but I want to make sure it's heard nonetheless as I don't think I'm the only one.

9 Likes

AFAICT, all approaches discussed treat extending the enum type as a whole; this is the most general solution.

Would it help to consider a partial solution, where a library creates a new case that can be treated as extending an existing case, instead of using a catch-all for completely unknown cases?

To avoid magic of automatic conversions, the client would have to adopt it expressly, but then could omit the catch-all case:

switch EnumType.as(item) { ...exhaustive cases... }

The catch-all case is mostly pointless and even harmful, because it has to interrupt processing for the unknown situation, injecting errors into the entire call chain. If instead the client could call some API to get a (prior) known value, then the library could add cases and still provide the client with reasonable behavior, so long as the new case is consistent with an existing one.

That creates a second class of non-frozen enums, where the library promises to support backwards compatibility. That means libraries can defer breaking clients, and clients don't have unknown case behavior percolating through their code with extraneous errors.

When clients upgrade to the new library version, they can just remove the API call, add cases as needed, and then restore it to handle newer versions of the library in the future. That has the additional benefit of allowing clients to decide whether to accept library extensions or to balk by using the catch-all.

I use a ridiculous number of enums - hundreds per package - to model behavior and errors. I'd say the majority of possibly new cases are only slight variations of existing cases. I imagine for something like swift-syntax, where new AST features are mostly subtypes of old, it would work well to avoid churn.

Libraries might try this now with a pattern using protocols and distinct types, but it could be seamless with compiler support.
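
One simplified way a library might approximate this today, using a plain mapping property rather than the protocols-and-distinct-types pattern mentioned above (all names hypothetical):

public enum PizzaFlavor {
    case hawaiian, pepperoni, cheese
    case veggieSupreme   // added in a newer library version

    /// Collapses cases the client may not know about yet onto the closest
    /// previously existing case, as chosen by the library.
    public var closestKnownCase: PizzaFlavor {
        switch self {
        case .veggieSupreme: return .cheese
        default:             return self
        }
    }
}

A client written against the older case set would switch over flavor.closestKnownCase instead of flavor, and could drop that call (and handle the new cases directly) once it updates to the newer library version.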

For non-resilient libraries, an agreement was never expressible in code as to the @frozen-ness of enums.

A library turning on this feature would be the moral equivalent of adding a new case to every enum declared in that library—which (unless it explicitly documented otherwise) the library could always do. For cases where there was a documented promise of being frozen, now there is a way to express it in code.

Turning on this feature does not require any existing library to lie about whether its enums are frozen: rather it exercises an option that the library has always explicitly reserved. If that cannot be tolerated in practice by a library’s clients, then that library’s enums were in fact frozen anyway.

1 Like

I drafted a very long response to this issue for the previous thread. I'll see if I can still find it and post it as a new thread.

But basically, you can start with this from semver.org:

In systems with many dependencies, releasing new package versions can quickly become a nightmare. If the dependency specifications are too tight, you are in danger of version lock (the inability to upgrade a package without having to release new versions of every dependent package). If dependencies are specified too loosely, you will inevitably be bitten by version promiscuity (assuming compatibility with more future versions than is reasonable). Dependency hell is where you are when version lock and/or version promiscuity prevent you from easily and safely moving your project forward.

There are times when you want a kind of version lock - when you have the power to, as it says, release new versions of every dependent package. But generally you don't have that power.

Now, recently we have introduced the package access level, and IMO that changes the game significantly. Now I think we actually have a clearer idea about what a "package" is - it is a unit of distribution, and a unit of versioned code. If your "internal packages" are versioned together (you want version lock), they should perhaps be considered modules in the same logical package.

We might need better tools for arranging different folders of code into a single logical package as part of constructing your Xcode workspace or whatever (maybe your whole workspace is one logical package). Perhaps that is a path forward.

3 Likes

That all sounds very accurate to me. If the way forward is moving forward with what is pitched in this thread, in addition to better tools for mixing an Xcode project alongside SPM packages, I'd be quite satisfied with that solution.

2 Likes

Right: I was going to start drafting a response but you’ve put it well here.

The language's notion of a small-p "package" is roughly a unit of distribution of versioned code.

We envisioned that in many cases—or, at least, as the right default starting point—this would be a SwiftPM package, but it doesn’t need to be (and you don’t even need to use SwiftPM). I believe someone has already pointed out above that you can specify the same package identifier for multiple packages; an orthogonal discussion might be whether more knobs need to be exposed in the various build tools to surface this sort of customization.

For the purposes of language design, though, we have a semantic notion of a small-p package, and it is that semantic boundary to which the package access level refers.

1 Like

In that case, I'd update my take on this pitch to be +1 if and only if it is accompanied by changes to build tools that allow marking code in a "separate SPM package within an Xcode workspace alongside a traditional Xcode project" as being within the same "package." Without that, my previous -1 comments apply.

(I realize that is essentially a request for a completely separate team, which probably cannot comment on whether or not that will happen for now, but it's worth stating anyway IMO.)

Swift-Build is the Xcode build system, and it's open-source now.

So I think (I'm not sure if I've got this completely right) it could be in-scope to ask for those kinds of build-system improvements alongside language changes, and we could potentially even implement it ourselves and see how it feels. I believe you can also run Xcode with a self-built build system, so we wouldn't even need to wait for the next Xcode release to try it.