[Pitch] Non-Frozen Enumerations

sspringer · November 14, 2023, 1:57pm

From my understanding of your example, it seems that there a new case should to be handled explicitly, i.e. adding a new case should result in a new major version of the library, so a "classical" enum should be the right fit.

Conversely, enums with a "forced" default behavior (or "extensible" or "non frozen") are the right fit if some kind of a default behavior is sensible.

wadetregaskis · November 14, 2023, 6:23pm

sspringer:

What about demanding a default case by writing the following, no introduction of @nonfrozen (or @extensible) or usage of @unknown default necessary:
public enum FormatStyle {
    case short
    case verbose
    default
}

I find that a bit unintuitive, or at least ambiguous. My first impression as a reader is that this somehow means there's a default case, so I could do e.g. let x: FormatStyle and it'd default to some default, I guess?

That's not as naive as it might sound - plenty of UI API enums have a case that means "use the system default". So it wouldn't surprise me to see and use a case that is literally called default (and in fact sometimes it is, today, using backticks to dodge around the keyword).

If one thinks that through it seems to be invalid, of course, but I don't like that one has to think it through; it should be immediately apparent even to a newbie. Prior suggestions like @nonExhaustive are better in this respect.

I like the base idea of having a symbolic 'case' or keyword otherwise to mark that additional cases may be added in future, but I think a clearer name is warranted.

Though to be clear, I still think extensibility should be the implicit default, and we need only special annotations - i.e. @frozen - for the other case.

tera · November 14, 2023, 9:06pm

This is what I do not fully understand. The premise of SE-0192 was that adding an enumeration case is (or should be) a non-breaking change, hence could be a "minor" version change.

To me having default in the enumeration would indicate the need of having the corresponding default handling in the switch, and vice versa, having no default in the enumeration would indicate there is no need for default handling in the switch. There would be a very nice symmetry between declaration and use sites. Separate to that there indeed could be cases named default, those would be in the tick marks:

enum E {
    case a
    case b
    case `default` // a case named "default"
    default // non-frozen enum marker
}

enum E {
    case a, b, `default`
    default // non-frozen enum marker
}

Jon_Shier · November 14, 2023, 9:42pm

One other thing I'm interested in here is, how do we tell users when new cases are available without a compiler warning? I'd like to expand my error enums without breaking user builds but I'd also like them to be able to know when they've changed, without using some external tool.

taylorswift · November 14, 2023, 9:50pm

i’m also interested in this, but i feel like these are sort of contradictory goals…

on one hand, we want the addition of new enum cases to be a non-disruptive as possible: no errors, no warnings, everyone can just keep chugging along as-is without ever having to worry about new enum cases.

on the other hand, we want to grab users’ attention when new enum cases appear, which kind of implies that we want this to be a disruptive event.

i don’t think adding a new enum case can be disruptive and non-disruptive at the same time…

wadetregaskis · November 14, 2023, 10:17pm

You could potentially get something of both worlds if you had e.g.: an @ignorable attribute:

enum Foo {
    case A
    case B
    // Then, in the next version of this module, we add:
    @ignorable case C
}

The behaviour being that exhaustivity checking ignores case C, so you don't get any new warnings about unhandled cases (perhaps only if you have a default handler though).

Just thinking out loud; I'm not sure how useful that would actually be.

tera · November 15, 2023, 12:18am

Looks like a job for a warning to me (which would be an error for "treat-warning-errors" users). The attention is brought, users will glance over the warning/error, if they think it's safe to ignore the new case (for now or forever) will put:

case theNewCase: print("TODO later") // TODO

and call it a day, no warning/error afterwards. Possibly a wrong behaviour still.

We may want to invest into some "don't-treat-this-particular-warning-error" override.

And another override: "treat-this-particular-error_warning"

+1

itaiferber · November 15, 2023, 12:21am

FWIW, as a consumer of enums from frameworks (both 3rd-party, and local packages), I (personally) want new enum cases to be loud breaking changes for me, because new cases represent an underlying change in logic that I need to be aware of, and respond to.

I make sure switch statements in the codebase I work on are always fully-exhaustive so the compiler can start warning if new cases sneak in, even at the occasional cost of aesthetics/ergonomics. I'd much prefer this over the alternative, and I suspect I'm not the only one.

Jon_Shier · November 15, 2023, 12:23am

Me too, except for error enums. Perhaps it would also be useful here if the user could insist on warnings for new cases despite what the enum says.

taylorswift · November 15, 2023, 5:09am

going further, i think this is an area where structs as enums really falls short. when you add cases to such a struct, you have no way of forcing code that switches on the struct to adapt to the new cases.

hassila · November 15, 2023, 7:17am

I think what may be needed in reality is a new level of diagnostic in addition to errors/warning that is opt in - or you view it as the job for the linter.

MPLewis · November 15, 2023, 8:35am

We already have remarks that do this, could we just make this a remark and let people opt into making remarks errors (or warnings, and then transitively errors if they enable -warnings-as-errors)?

wadetregaskis · November 15, 2023, 4:18pm

I don't think a linter should be required for this. Imposing selection & use of a linter is onerous, and this is fundamental to the language so it belongs in the compiler.

One of the best things about enums in Swift is that the compiler tells you if you're overlooking something. This is just one key aspect of how Swift works well "out of the box", unlike most languages which rely on linters or other 3rd party tools for core functionality.

sspringer · November 16, 2023, 8:44am

I think that this syntax would easily be adopted on a “mental” level. What I do not like are those “@“-annotations, they say to the beginner “Here we have an expert mode, stay away from it!” when actually we talk about an important and very understandable feature. At least, then introduce a new keyword. But I think this syntax is very nice if enums are “non-extensible” by default.

The non-extensible enum is the easier to understand and a good fit in many cases, and always very usable if you are not creating a library. Therefore I would think the current behaviour should be the default (which would also make the new feature a less breaking change). In addition, you then have the nice syntax above (if you do not dislike it for some reason).

With the new feature you should be able to decide that adding a new case to your enum is not a breaking change. If you have a use case where an upcoming new case should be handled explicitly, avoiding the need to make this a breaking change would not be sensible.

As already mentioned, the decisive question is, can a sensible default behaviour be formulated for upcoming new cases of the enum?

lukasa · November 16, 2023, 10:20am

I think everyone agrees with this. I just want those to be loud breaking changes for me, but not for my users. That fits naturally with the idea of emitting warnings, which I can promote to errors in my CI but which don't otherwise punish my users.

tera · November 16, 2023, 10:30am

Sorry to bring this example again but API users will be punished (2a or 2b in the example) one way (a compilation warning treated as error that would block compilation until fixed) or another (an ignored compilation warning and the app not working correctly).

In general a new enumeration case is a breaking change.

lukasa · November 16, 2023, 10:44am

I just simply don't agree with this. There are a wide range of enumerations where not handling new values provides no degradation in functionality than handling them.

Error enumerations are the most obvious example, but let me provide a more complex one: HTTP2Frame.FramePayload. This type holds the data inside a HTTP/2 frame:

    public enum FramePayload {
        /// A `DATA` frame, containing raw bytes.
        ///
        /// See [RFC 7540 § 6.1](https://httpwg.org/specs/rfc7540.html#rfc.section.6.1).
        case data(FramePayload.Data)
        
        /// A `HEADERS` frame, containing all headers or trailers associated with a request
        /// or response.
        ///
        /// Note that swift-nio-http2 automatically coalesces `HEADERS` and `CONTINUATION`
        /// frames into a single ``HTTP2Frame/FramePayload/headers(_:)`` instance.
        ///
        /// See [RFC 7540 § 6.2](https://httpwg.org/specs/rfc7540.html#rfc.section.6.2).
        indirect case headers(Headers)
        
        /// A `PRIORITY` frame, used to change priority and dependency ordering among
        /// streams.
        ///
        /// See [RFC 7540 § 6.3](https://httpwg.org/specs/rfc7540.html#rfc.section.6.3).
        case priority(StreamPriorityData)
        
        /// A `RST_STREAM` (reset stream) frame, sent when a stream has encountered an error
        /// condition and needs to be terminated as a result.
        ///
        /// See [RFC 7540 § 6.4](https://httpwg.org/specs/rfc7540.html#rfc.section.6.4).
        case rstStream(HTTP2ErrorCode)
        
        /// A `SETTINGS` frame, containing various connection--level settings and their
        /// desired values.
        ///
        /// See [RFC 7540 § 6.5](https://httpwg.org/specs/rfc7540.html#rfc.section.6.5).
        case settings(Settings)
        
        /// A `PUSH_PROMISE` frame, used to notify a peer in advance of streams that a sender
        /// intends to initiate. It performs much like a request's `HEADERS` frame, informing
        /// a peer that the response for a theoretical request like the one in the promise
        /// will arrive on a new stream.
        ///
        /// As with the `HEADERS` frame, `swift-nio-http2` will coalesce an initial `PUSH_PROMISE`
        /// frame with any `CONTINUATION` frames that follow, emitting a single
        /// ``HTTP2Frame/FramePayload/pushPromise(_:)`` instance for the complete set.
        ///
        /// See [RFC 7540 § 6.6](https://httpwg.org/specs/rfc7540.html#rfc.section.6.6).
        ///
        /// For more information on server push in HTTP/2, see
        /// [RFC 7540 § 8.2](https://httpwg.org/specs/rfc7540.html#rfc.section.8.2).
        indirect case pushPromise(PushPromise)
        
        /// A `PING` frame, used to measure round-trip time between endpoints.
        ///
        /// See [RFC 7540 § 6.7](https://httpwg.org/specs/rfc7540.html#rfc.section.6.7).
        case ping(HTTP2PingData, ack: Bool)
        
        /// A `GOAWAY` frame, used to request that a peer immediately cease communication with
        /// the sender. It contains a stream ID indicating the last stream that will be processed
        /// by the sender, an error code (if the shutdown was caused by an error), and optionally
        /// some additional diagnostic data.
        ///
        /// See [RFC 7540 § 6.8](https://httpwg.org/specs/rfc7540.html#rfc.section.6.8).
        indirect case goAway(lastStreamID: HTTP2StreamID, errorCode: HTTP2ErrorCode, opaqueData: ByteBuffer?)
        
        /// A `WINDOW_UPDATE` frame. This is used to implement flow control of DATA frames,
        /// allowing peers to advertise and update the amount of data they are prepared to
        /// process at any given moment.
        ///
        /// See [RFC 7540 § 6.9](https://httpwg.org/specs/rfc7540.html#rfc.section.6.9).
        case windowUpdate(windowSizeIncrement: Int)
        
        /// An `ALTSVC` frame. This is sent by an HTTP server to indicate alternative origin
        /// locations for accessing the same resource, for instance via another protocol,
        /// or over TLS. It consists of an origin and a list of alternate protocols and
        /// the locations at which they may be addressed.
        ///
        /// See [RFC 7838 § 4](https://tools.ietf.org/html/rfc7838#section-4).
        ///
        /// - Important: ALTSVC frames are not currently supported. Any received ALTSVC frames will
        ///   be ignored. Attempting to send an ALTSVC frame will result in a fatal error.
        indirect case alternativeService(origin: String?, field: ByteBuffer?)
        
        /// An `ORIGIN` frame. This allows servers which allow access to multiple origins
        /// via the same socket connection to identify which origins may be accessed in
        /// this manner.
        ///
        /// See [RFC 8336 § 2](https://tools.ietf.org/html/rfc8336#section-2).
        ///
        /// > Important: `ORIGIN` frames are not currently supported. Any received `ORIGIN` frames will
        /// > be ignored. Attempting to send an `ORIGIN` frame will result in a fatal error.
        case origin([String])
    }

The problem with this design is that HTTP/2 frames are extensible: new frame types can be added by arbitrary users. The semantic of this is that the new frame types are to be ignored:

Implementations MUST ignore unknown or unsupported values in all extensible protocol elements. Implementations MUST discard frames that have unknown or unsupported types.

This is a perfect use of having an unknown default. The behaviour of seeing an unknown frame is well-defined: discard it. So long as you are handling all existing frames, it is not a problem to discard new ones.

More generally, it is incumbent on API designers to communicate to their users what it might mean for a new case to be added to an enumeration, and what a reasonable behaviour will be. It's just not true that all of these should be breaking changes.

tera · November 16, 2023, 10:54am

I think we actually agree, just putting accents differently (note how you had to reach out for best API practices / guidelines / documentation). When I say that "In general a new enumeration case is a breaking change" I do not preclude many cases where adding a new case is benign. I'm just stressing the point that even having a single example like the above means that in general adding a new case can be a catastrophic breaking change.

Maybe we need some new language mechanism to distinguish between important changes:

enum LengthType {
    case byte
    case short
    case word
    case long // new case in v1.1
}

and unimportant changes:

enum TextStyle {
    case headline
    case subhead
    ...
    case caption3 // new case in v 1.1
}

Edit. Could that new language mechanism be the above suggested "default" case?

enum LengthType {
    case byte, short, word
    case long // new case in v1.1
    // there is no default case here
    // 🆕 having "default" handling in the corresponding switch is a hard error
}
enum TextStyle {
    case headline, subhead, ...
    case caption3 // new case in v 1.1
    default // 🆕
    // 🆕 the corresponding switch must have "default" handling.
}

Note that the LengthType above is not frozen and expected to change in the future.

wadetregaskis · November 16, 2023, 5:49pm

It seems like there's developing consensus on this. This discussion has made it clearer - to me at least - that the best thing that will genuinely improve things is providing enum authors more control over whether new cases are 'breaking' or not.

Approaches like using a default "case" are on the right track but limited because they commit the whole enum to one approach or the other (and in advance). Approaches like my earlier suggestion of a @ignorable attribute provide more flexibility, as specific new cases can be ignorable while others are not.

@ignorable also preserves what I think is an important default, extensibility. Just like with final and other such keywords & annotations, it's better to err on the side of flexibility and require explicit consent to give it up.

So, to formalise my proposal (with new behaviours in bold):

Enum @frozen?	`default` handler?	`@unknown default` handler?	New case uses `@ignorable`?	Result
Yes	Irrelevant	N/A	N/A	Cannot add new cases, not even ignorable ones (same as today). [technically: binary- and source-incompatible]
No	No	No	Irrelevant	Compiler error about the missing `default` / `@unknown default`.
No	No	Yes	No	Compiler warning¹ about the new case (same as today).
No	No	Yes	Yes	Nothing breaks nor complains.
No	Yes	N/A	Irrelevant	Nothing breaks nor complains (same as today).

¹ This could arguably be upgraded to an error. But I'm inclined to err on the side of letting users decide (via whether they use Treat Warnings as Errors).

i.e.:

'Library evolution mode' is irrelevant (because binary compatibility is no different to source compatibility, in the end).
Having a default or @unknown default handler provides forward-compatibility (i.e. won't crash) and is now required for all non-frozen enums.
- default provides full (binary and source) forward-compatibility (in the sense of not causing any compiler or runtime complaints - it of course may have undesirable results for the application's semantics, like mishandling of the new case).
- @unknown default provides source forward-compatibility where appropriate (as controlled by the enum author via @ignorable).

This gives both enum users and authors greater control. There's still room for conflict between the two, but less so than is currently the case.

Enum users must be more explicit about how unknown cases are handled - no more "implicitly frozen [but not really]" enums. Enum authors can help lessen this burden by being more rigorous in marking their enums as @frozen when that applies.

Enum users have flexibility in whether they want to know about new cases or not - @unknown default vs default, respectively - and enum authors can also control which new cases users are pro-actively notified about [by the compiler], via the @ignorable attribute.

Limitations

default is a blunt instrument. A new enum case might genuinely be important, but be muffled by the presence of an existing default handler. This could be fixed with e.g. an @important enum case attribute, that prevents the given case being covered by default handlers. I'm uncertain as to whether that's worth the additional complexity. And it's not unequivocally conclusive (i.e. then users could complain they don't like enum author choices and want some way to suppress @important cases too, and so on ad infinitum).

There would still be a clash, or at least open question on the behaviour, of default when the switch otherwise appears exhaustive. Today you get a compiler warning that the default is redundant. This would be annoying if the use of default becomes (as proposed) necessary to say "handle - and don't bother me about - all new enum cases in future". I'm inclined to remove this warning (for non-frozen enums). Especially since it's already undesirable in some cases (e.g. where some enum cases are compile-time conditional, such as based on architecture or debug vs release, or when moving to an older version of a module).

tera · November 16, 2023, 6:28pm

Interesting approach. Not overly fine? I wonder of the amount of "@ignorable" noise though before (potentially) every case, but it could be minimised:

enum E {
    case a, b, c            // important cases
    @ignorable case d, e, f // unimportant cases
}

If the total enum overhaul was on the table maybe just this:

enum E {
    case a       // important case
    case b       // important case
    case c, d    // important cases
    e            // no `case` prefix → ignorable case
    f            // no `case` prefix → ignorable case
    g, h         // no `case` prefix → ignorable cases
}

Alternatively:

enum E {
    case a       // important case
    case b       // important case
    case c, d    // important cases
    default e    // ignorable case
    default f    // ignorable case
    default g, h // ignorable cases
}

Or this version of spelling:

enum E {
    case a       // important case
    case b       // important case
    case c, d    // important cases
    default case e    // ignorable case
    default case f    // ignorable case
    default case g, h // ignorable cases
}

to match the use site:

switch e {
case a: ...
case b: ...
case c: ...
case d: ...
// would be a compilation error to not handle important cases explicitly
// other cases could be handled via default
default: ...
}

There would still be a need for an empty trailing "default" in the declaration of "unimportant enums" (or some other mechanism to the same effect):

enum E {
    case a       // important case
    case b       // important case
    case c, d    // important cases
    default
}

as otherwise there would be no "default case handling" in the user source and once you add a non important case in the next library version there would be a compilation error.