SE-0192 — Non-Exhaustive Enums — review #2

allevato · March 31, 2018, 3:59pm

I'm not sure I follow your reasoning here. From a user's point of view, there's no difference between the ICU enums I'm proposing adding to the standard library and the numerous non-frozen enums provided by Apple system frameworks. They both face the same potential problems from what I can tell, so why are you recommending a different solution for the former?

The only difference is that Apple's can be easily and directly annotated by the language authors while ICU's cannot, but that seems like an odd place to draw the line, based on the wording of the proposal.

But, as I said in a previous post, maybe I should investigate shadowing the enums on the C side with the right attributes to get the correct behavior. I haven't had an opportunity to try that yet to see if it solves this problem.

Karl · March 31, 2018, 4:25pm

Here's what it looks like if you use an enum to try and do a 1:1 mapping:

Using the C API directly

My C library returns a new enum value, which I don't understand. It's just an integer. Even though I don't understand this value, the library does - I can pass it back in to query properties of this value, bit-mask it to extract options, etc. It's still pretty usable.

Using the Swift overlay

My Swift overlay maps the C integer into a Swift enum. When I map from C -> Swift, I lose the underlying C value as the Swift compiler picks a discriminator. However, I have a RawRepresentable conformance mapping my enum's cases to the C values, so I can recover the library values of known cases and push these values back to the library for follow-up queries.

Unknown cases will have to map to something - either a single, special case in your enum, or nil. In that situation, you lose all information about the new value, and all ability to inspect the value. Even RawRep cannot recover the library value.

I think the 2nd situation is worse for something which tries to follow the C library closely. It's only acceptable to lose the underlying C value if the enum is frozen or if you're only interested in presenting a curated subset of functionality in the first place.
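
A minimal sketch of the overlay situation described above. The type name `LineBreak`, its cases, and the integer values 0, 1, and 7 are all hypothetical and not taken from any real C library:

```swift
// Hypothetical Swift overlay enum mirroring a C enum's integer values.
enum LineBreak: Int32 {
    case mandatory = 0
    case soft = 1
}

// A value this overlay knows about: it round-trips through rawValue,
// so it can be handed back to the C library for follow-up queries.
let known = LineBreak(rawValue: 1)

// A value introduced by a newer version of the C library: the
// synthesized init?(rawValue:) returns nil and the integer is lost.
let newer = LineBreak(rawValue: 7)

print(known as Any, newer as Any)
```

Once `newer` is nil there is nothing left to pass back to the library, which is the loss of information the second scenario describes.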

allevato · March 31, 2018, 4:28pm

I think we might be talking past each other. Why doesn't everything you said above apply to Apple's own enums imported from their Objective-C frameworks into Swift?

Karl · March 31, 2018, 4:35pm

Enums imported by Clang are different - they stay as integers AFAIK. They are not the same enums that we write in Swift.

The only time when this might come up in the Apple libraries is Foundation, because it does this kind of mapping from CF enums; but since they are bundled together, I think they manage to avoid it.

allevato · March 31, 2018, 4:38pm

Right, which is why I suggested above that I may need to revisit my enums to see if I can write them in the C portion of stdlib and import them, instead of wrapping them in Swift—so that they're treated the same way. So I think we're on the same page. (Either way, the enums either need to be shadowed in C [to add annotations] or shadowed in Swift, but it looks like the former is the only way that will handle "unknown" values the way this proposal intends.)

I don't want to derail this thread any further—I was just trying to get a feeling from someone closer to the proposal about whether this situation was worth consideration, since it came up during my implementation of something intended for the stdlib.

Karl · March 31, 2018, 4:47pm

We're talking about a feature which we think hardly any library authors will really have to worry about. You are literally in the middle of it right now. Others may disagree, but I don't see it as derailing. Is not the whole motivation for this proposal to handle C enums?

I think it's just as important to get this kind of feedback and work out these details as it is to bike-shed the naming and placement of attributes. Handling unknown cases has been a high concern of users since the first drafts of this proposal.

jrose · April 2, 2018, 11:49pm

@allevato, your situation is why the original version of the proposal provided @frozen to everybody (well, spelled @exhaustive at the time), and not just "the stdlib for now, other libraries potentially in the future if they have binary-compatibility concerns". It was determined by a strong majority on the core team that applying that change everywhere was not worth the cost and the additional frustration when nearly all the time someone is going to ship the library with their app and therefore they know all of the possible enum cases.

I think case unknown(/*rawValue*/ Int32) is a reasonable answer here precisely because your API and ICU's API are evolving independently—unless you're talking about building your own copy of ICU separate from a system one. Even in that case, though, it's just about what your clients have to do when they update to a newer version of your library. Without any additional help, they'll get errors; if they want, they can use unknown-or-whatever and get warnings.
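
A rough sketch of what an `unknown` case carrying the raw value could look like. Everything except the `unknown(Int32)` idea itself is assumed for illustration: the type name, the other cases, and the integer values are hypothetical.

```swift
// Hypothetical wrapper enum; case names and raw values are invented.
enum LineBreak: RawRepresentable {
    case mandatory
    case soft
    case unknown(Int32)   // holds any value this version does not name

    // A non-failable initializer satisfies RawRepresentable's
    // init?(rawValue:) requirement, so no value is ever rejected.
    init(rawValue: Int32) {
        switch rawValue {
        case 0: self = .mandatory
        case 1: self = .soft
        default: self = .unknown(rawValue)
        }
    }

    // The original integer can always be recovered, even for unknown values.
    var rawValue: Int32 {
        switch self {
        case .mandatory: return 0
        case .soft: return 1
        case .unknown(let value): return value
        }
    }
}

let seen = LineBreak(rawValue: 7)
if case .unknown(let value) = seen {
    print("library returned a value this wrapper does not name:", value)
}
```

One consequence of this design: if a later version of the wrapper adds a named case for 7, the same library value stops arriving as `.unknown(7)`, so clients that matched on `.unknown` see different behavior after updating.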

(And for those following along, the difference with C enums in Apple frameworks is that if an end user updates their OS, the possible enum cases change without the app recompiling. And if the stdlib ever starts shipping with Apple's OSs instead of being bundled into an app, the same thing will happen with (non-frozen) enums declared there.)

jrose · April 3, 2018, 12:01am

I continue to think that annotating either default or switch makes for worse-reading code than a new case kind—particularly since we don't have any existing statement or case attributes—but I don't really have any new arguments for that. (I'm not concerned about searchability because "swift unknown switch" would work fine.)

If the core team were to overrule me on that, I don't love @warnIfUsed (because it might be used, at run time), but it works okay. @dabrahams' @unnamed doesn't make sense to me because it introduces the idea of "unnamed cases", which…kind of describes how C enums work but isn't how I'd want to explain future cases to people (or private cases, if we ever get those).

To @DevAndArtist's suggestion specifically, I don't think the attribute is terrible on switch, but it makes the initial diagnostic where you forgot it a little more complicated. Instead of "you got all the known enum cases but you need to add one more case to your switch, which might have an attribute", it's now "you need to add one more case and add this attribute, or else you won't find out about the next one". It seems too easy to forget the attribute part.

(I guess that's a new argument against the attribute on default as well: leaving it out still produces correct code, whereas switching unknown for default is a less likely mistake. But at least it'd be the same place where you make the change.)

I'd request that the core team just pass a ruling on the name and then I'll go implement it.

dabrahams · April 3, 2018, 11:09pm

Yah, I don't know why I was thinking “unnamed” was any better than “unknown” here, and I have no objections to the readability of

@unknown case _:

so maybe I would be happy if we could just agree to encourage that spelling when you want the warning.
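
A small sketch of a catch-all marked this way. It uses `@unknown default:`, the spelling that Swift 5 accepts, rather than the `@unknown case _:` form written above; `Direction` and `describe` are hypothetical names.

```swift
// Hypothetical enum standing in for a non-frozen library enum.
enum Direction {
    case north
    case south
}

func describe(_ direction: Direction) -> String {
    switch direction {
    case .north:
        return "north"
    case .south:
        return "south"
    // Catch-all for cases added after this code was compiled. If a known
    // case is missing from the switch, the compiler warns about it here.
    @unknown default:
        return "a case this code does not know about"
    }
}

print(describe(.north))
```

The warning is a compile-time check on the known cases. Nothing extra happens at run time when the catch-all is reached.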

The problem with warnIfUsed is that it says the wrong thing, and getting it to say the right thing is hard. “Used” typically means “uttered,” and obviously you have uttered the default case. It's more like “matched,” but then it seems to imply—as does “used”—that it's a runtime check that fires only when the case is actually reached in execution. To be accurate, we'd have to say @warnWhenKnownCaseMatches or @warnWhenKnownCaseNotAlreadyHandled. Personally, I have no serious objection to those spellings, but I don't know about others.

jrose · April 3, 2018, 11:12pm

*raises hand* I do have an objection to any annotation that is likely to be significantly longer than the other cases being matched. I agree with Chris that this will be uncommon but I don't want that to mean it should look drastically different from other code where it appears.

Nevin · April 3, 2018, 11:16pm

I think we have a winner.

jrose · April 3, 2018, 11:17pm

Also I hate to prolong the discussion but @unknown case _ also allows for @unknown case let value, i.e. you can say this is meant to match unknown cases but also bind values. I think we'd restrict it to catch-all cases still, though, or it gets too confusing.

dabrahams · April 3, 2018, 11:47pm

@unknown case let value: is a catch-all case, just like case let value: is. I don't see the problem here.

jrose · April 3, 2018, 11:49pm

Yes, sorry, I meant we would want to disallow things like @unknown case (.foo, _), which would mean about the same thing as a true unknown pattern: case (.foo, #unknown). That would be a significant increase in the proposal's complexity over what was discussed in this review period.

dabrahams · April 3, 2018, 11:54pm

Given that case _: is a synonym for default: I can't see any justification for disallowing adornment by @unknown. I also can't see any reason that would have to imply support for something like case (.foo, @unknown _) or @unknown case (.foo, _). What's the actual problem here?

jrose · April 3, 2018, 11:56pm

No problem. I pointed out something else cool we could potentially do with your syntax, but wanted to make it clear that we wouldn't go further. Nothing you've said is wrong, though.

(That said, I can see us wanting to artificially limit the syntax here anyway. I did propose unknown: to start and still think that's the best alternative myself.)

allevato · April 4, 2018, 1:56am

"My" library in this case is the Swift standard library—I'm proposing additions to Unicode.Scalar. Since Swift links to the system ICU, it's possible we end up in a situation where stdlib v.N is written aware of the enum values in ICU v.M, and then someone runs code with v.N's stdlib on a system with ICU v.(M+1), which has introduced new values that will then try to pass back through the Swift layer.

So my concern is that the proposal says that unknown cases are supported in the standard library, but as written, this particular situation is not supported. The way I'm reading it, this proposal handles the following:

- Enums implemented in Swift stdlibs that do not have raw values can be non-frozen
- Enums imported from C frameworks can be non-frozen

But the following situation cannot be handled:

- Enums implemented in Swift that are constructed using raw values (whether RawRepresentable or ad hoc) cannot be non-frozen

This seems to be because (1) if you use RawRepresentable, then init?(rawValue:) returns nil if the raw value does not correspond to a case at the time of writing, and (2) if you write your own initializer by hand, there's no way to map unknown raw values to an "unknown" case.

Given this, is unknown(Int32) still what you recommend for this problem, even if the enum lives in stdlib? Is that safe to evolve with new cases in the future? Is there something I'm missing?

jrose · April 4, 2018, 2:33am

That's two different kinds of unknown, then. You're trying to map from one set of values, which may change at run-time, to another set of values, which you may want to change in the future. They happen to match up, which is what makes this tempting, but I don't think it's an innate part of the problem. That said, we could come up with some special way to support that (and @Joe_Groff has talked about it before in precisely this way, non-exhaustive enums with raw values), but that's out of scope for any version of this proposal.

That said, doing this at all is questionable, rather than just extending the underlying enum. I get that the stdlib probably doesn't have that option but that makes me wonder if it's even an enum to begin with. Maybe it's just a struct.

allevato · April 4, 2018, 3:07am

When you say "extending the underlying enum", do you mean annotating it with the necessary Clang attributes on the C side to rename it and its cases to something "Swiftier" and keep it non-frozen? I brought that up as a possible path forward further up in the thread, but wasn't sure if it was the best approach. As far as I can tell, my situation (exposing an ICU enum in stdlib that has added cases over time) doesn't feel any more unique than something like UIKeyboardType (another enum that evolved over time), with the main difference being that we can't add Clang attributes to the ICU ones.

So, one idea I considered was creating a new C enum that has the same numeric values as the original ICU enum and expressing the public API in terms of that enum, but I'm not sure of any other cases in the Swift module where a C type is directly exposed to Swift clients. Does this seem remotely reasonable?

Karl · April 4, 2018, 3:27am

I think you need to explain why you're so intent on this thing being an enum.

90% of the point of enums is exhaustive switching. Besides that property, they are just a group of named values. If you're wrapping a non-frozen enum coming from C, you will not have exhaustive switching, and you are almost certainly better off with a struct.

Mapping the C enum to a Swift enum is something a lot of people will probably do initially, but I think it's poor advice; since RawRep does not affect the layout of the enum, every library call will have to dynamically un-map back to a library value via the rawValue getter, and it doesn't handle evolution of the C library as seamlessly.
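
A sketch of the struct alternative, under the assumption that the wrapper simply stores the library's integer. The type name, the constants, and the values 0, 1, and 7 are hypothetical.

```swift
// Hypothetical struct wrapper around a C enum's integer value.
struct LineBreak: RawRepresentable, Equatable {
    // The stored property is the C value itself, so there is no
    // mapping step when calling back into the library.
    let rawValue: Int32

    init(rawValue: Int32) {
        self.rawValue = rawValue
    }

    // Named constants for the values known when this code was written.
    static let mandatory = LineBreak(rawValue: 0)
    static let soft = LineBreak(rawValue: 1)
}

// Switching still works, but a default clause is always required.
func name(of value: LineBreak) -> String {
    switch value {
    case .mandatory: return "mandatory"
    case .soft: return "soft"
    default: return "unrecognized (\(value.rawValue))"
    }
}

// A value from a newer library version is preserved as-is.
let fromLibrary = LineBreak(rawValue: 7)
print(name(of: fromLibrary))
```

The trade-off is the one stated above: the struct gives up exhaustive switching, which a wrapper around a non-frozen C enum could not offer anyway, and keeps every library value intact.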