Handling unknown cases in enums [RE: SE-0192]

DeclModifiers are a syntax/parser thing, not a representational concept in the AST. Storing it as a bit would be fine, I’d also prefer that.

-Chris

···

On Jan 17, 2018, at 4:19 PM, Jordan Rose <jordan_rose@apple.com> wrote:

The parser has fairly general support for declmodifiers, and my proposal fits directly into that. The only extension is that ‘default’ isn’t a decl, and I don’t think we have a statement modifier yet. That said, we’ve always planned to have them when the need arose.

‘case’ isn’t a decl in switches either. Nor is it a statement, the way things are modeled today.

Bleah, sorry, I forgot that it is a statement at the moment. But I don't think it needs to be modeled as a statement attribute/modifier. A flag on CaseStmt is fine.

The latest news: SE-0192 — Non-Exhaustive Enums - #337 by jrose

I’m working on cutting down the proposal text (and the implementation) to match this feedback from the core team, but there’s also one more issue that needs to be worked out about unknown case—and no, I’m not just talking about the name.

During these discussions, @Chris_Lattner3 and I realized we were talking past each other because of one crucial point:

It is an error to use unknown case with a non-enum-typed value.

Chris pointed out that this makes it much more annoying to use switches with patterns that contain non-frozen enums.

switch (excuse, notifiedTeacherBeforeDeadline) {
case (.eatenByPet, false):
  // 1
case (_, false):
  // 2
case (let excuse, true):
  switch excuse {
  case .eatenByPet: // 3
  case .thoughtItWasDueNextWeek: // 4
  unknown case: // 5
  }
}

Okay, this one's particularly contrived, but you see the point: if you want to keep your exhaustiveness diagnostics, you need to pull the enum value out into its own switch. That was something I had always considered a necessary evil, since it would come up relatively rarely, but Chris proposed an alternative rule: unknown default can match any pattern that contains a non-frozen enum.

switch (excuse, notifiedTeacherBeforeDeadline) {
case (.eatenByPet, false):
  // 1
case (_, false):
  // 2
case (.eatenByPet, true):
  // 3
case (.thoughtItWasDueNextWeek, true):
  // 4
unknown default:
  // 5
}

I personally see this as convenient but confusing; it's no longer obvious what relation the "unknown" has to the value being matched, and it can catch multiple unknown values if you have a pattern containing multiple enums. Or an enum with another enum as a payload.

(The other features that work like this are try, which applies to an entire subexpression rather than just the immediate call so that you can do things like try riskyThingOne().riskyThingTwo(), and pattern-based let, so that you can say case let (x, y) instead of case (let x, let y).)

Whatever we pick, however, this is really the thing that was keeping me and Chris from agreeing on a name. If this new kind of catch-all can only match enum values, then unknown or unknown case is a good name. If it can match any value that contains a non-frozen enum, then unknown, unknown default, or default unknown is a good name. (@dabrahams in particular had a strong distaste for unknown default since it isn't the "default" that's "unknown".)

(Yes, unknown is back on the table, because we think we can give good enough diagnostics if someone is using it as a label.)

So, what do you all think? Should the new catch-all be allowed to match "patterns containing non-frozen enums" or "non-frozen enums only", and why?

(Please save comments on the name until we see if there's a clear preference for one or the other of these choices.)

Yes, this is a great idea.

Even though you are switching over something other than an enum, the patterns being matched inside the switch are still called “cases”, and they still use the keyword case. Therefore the most consistent, easiest to understand, and (in my opinion) best-looking choice is to use the spelling “unknown case”.

It is the pattern-matching case which is unknown, irrespective of whether the value being matched is an enum, a tuple containing an enum, or what-have-you. Or rather, “unknown case” is a pattern-matching case which matches unknown values, and consequently raises a warning if a known value would match it.

I agree with Chris's formulation (and, indeed, was the first to suggest default unknown--but won't comment on the name here). I disagree that it's confusing: if a case contains an enum value that's unknown, then it itself is a value that's unknown and it's sensible to match a pattern that's named "unknown" or some flavor of it. And, as you say, the way this feature would work is precedented by how try works.

By contrast, I think (and I believe Chris made this point in the original post also) it'd be very unfortunate to give switching over enums something that doesn't apply to switching over other types or to if case and other pattern matching. The awkwardness you demonstrate of having to pull enums into its own sub-switch is a consequence of such a design, and being able to avoid it with the alternative demonstrates, in my view, that alternative's superiority.

1 Like

This topic has been so controversial it almost feels destined to end up like the fileprivate debacle, another language wart. As no proposal feels like an unequivocally good idea I'd favor leaving the status quo.

3 Likes

As mentioned previously, this will never apply to if case and other pattern matching. This new case is a catch-all case like default except that you can get warnings out of it.

That was a bit of a muddled sentence on my part. Let me rephrase:

As a flavor of case (as in your design, however it'd be spelt), it'd be unfortunate that it's allowed only in switch...case--and only for enums--and not in other scenarios in which case is used. Since the intent is indeed that it'll never apply to if case and other pattern matching, it makes more sense for the feature to function like (and be spelled like--but again, we'll leave this aside for now) default.

1 Like

I'm not sure, but I think there's a meaningful typo in SE-0192. It says this:

Source compatibility

It is now a source-compatible change to add a case to a non-exhaustive enum.

then this:

Effect on Library Evolution

It is now a binary-compatible change to add a case to a non-exhaustive enum.

then this:

Future direction: compatibility checking

Of course, the compiler can't stop a library author from adding a new case to a non-exhaustive enum, even though that will break source and binary compatibility.

That should be "to an exhaustive enum" in the last quote, right?

1 Like

This is a probably stupid idea, but here goes… .

What I’m suggesting has 2 parts:

Part 1 (old syntax)

Instead of having default unknown or unknown case cases, have that kind of switch throw a unique type of error, say (for the sake of explicitness, it doesn't matter what the name is) UnknownCaseError. That’s the entire behavior, except for one small dollop of special sauce: the compiler knows that the switch statement (separate from what errors its contained statements can throw) can throw no other kind of error. That means this:

	do {
		try switch excuse {
		case .eatenByPet:
			…
		case .thoughtItWasDueNextWeek:
			…
		}
	catch is UnknownCaseError {
		…
	}

has an exhaustive catch, so works fine in an enclosing non-catching scope.

Part 2 (new syntax)

Then, make the switch itself a catching scope, if it needs to be (that is, if it would have needed default unknown or unknown case under the existing proposal), but with a simpler syntax. The above example would be written more simply like this:

	try switch excuse {
	case .eatenByPet:
		…
	case .thoughtItWasDueNextWeek:
		…
	catch is UnknownCaseError:
		…
	}

This means that an UnknownCaseError could be thrown at any level of enum non-exhaustion, and be caught at any higher level of switch. Rewriting earlier examples:

	switch (excuse, notifiedTeacherBeforeDeadline) {
	case (.eatenByPet, false):
	  // 1
	case (_, false):
	  // 2
	case (let excuse, true):
	  try switch excuse {
	  case .eatenByPet: // 3
	  case .thoughtItWasDueNextWeek: // 4
	  catch is UnknownCaseError: // 5
	  }
	}

or:

	try switch (excuse, notifiedTeacherBeforeDeadline) {
	case (.eatenByPet, false):
	  // 1
	case (_, false):
	  // 2
	case (let excuse, true):
	 switch excuse {
	  case .eatenByPet: // 3
	  case .thoughtItWasDueNextWeek: // 4
	  }
	catch is UnknownCaseError: // 5
	}

or:

	try switch (excuse, notifiedTeacherBeforeDeadline) {
	case (.eatenByPet, false):
	  // 1
	case (_, false):
	  // 2
	case (.eatenByPet, true):
	  // 3
	case (.thoughtItWasDueNextWeek, true):
	  // 4
	catch is UnknownCaseError:
	  // 5
	}

This borrows, approximately, the known semantics of catching nested errors to clarify what it means to have such special catches at various levels.

The idea here is that any other “normal” errors thrown inside the try switch would slip past this catch is UnknownCaseError . And vice versa: any additional catch cases would catch matching errors in the obvious way:

	try switch (excuse, notifiedTeacherBeforeDeadline) {
	case (.eatenByPet, false):
	  // 1
	case (_, false):
	  // 2
	case (.eatenByPet, true):
	  // 3
	case (.thoughtItWasDueNextWeek, true):
	  // 4
	catch is UnknownCaseError:
	  // 5
	catch SchoolError.retiredTeacherError:
	  print (“Saved by the bell!”)
	catch:
	  print (“Explanation was rejected: \(error)”)
	}

No doubt there’s lots wrong with this idea, but the discussion in this thread does seem to be trying to “reinvent” catching errors as a case, so why not just allow it?

3 Likes

I think we should be able to match "patterns containing non-frozen enums".

Every once in a while I write a switch over a pair of enum values of different types like this: switch (myEnum, otherEnum). Consider the case where one of these is non-frozen and the other was either frozen or from my own module. I really I would be able to include more than one case by using the #unknown expression that has been discussed: case (.something, #unknown) and case (.somethingElse, #unknown). If we're not going to have this, I would at least want to be able to use unknown to get a warning if the rest of the cases are not exhaustive.

Why would we not allow somebody to benefit from exhaustiveness checking in this case? This would force them to either use nested switches or a default clause both of which are suboptimal.

I don't want to sidetrack the discussion on naming, but I do want to point out that unknown is the one name that works for both of the semantics. This has always been the name I thought best. The others have always felt like compromises to me. I'm happy to hear unknown is on the table.

3 Likes

It's an interesting idea, but I don't think UnknownCaseError is the right mental model here. This is a situation that's only encountered if your switch does not have some other kind of catch-all case, and therefore the error is only thrown based on analysis of the other cases. That seems inside-out to me; it's like rethrows but even more complicated. I also don't think we want try switch to silently work without handling the UnknownCaseError immediately, but that's at odds with a normal try, which can propagate up out of a function whenever it wants.

1 Like

Support #unknown if achievable, since it generalises better.
Otherwise the standalone unknown keyword, matching any subexpression, because I think this is a similar distinction to open vs public.

My suggestion of default(unknown) might work for him. (Analogous to unowned(unsafe))

case #unknown is implementable Handling unknown cases in enums [RE: SE-0192] - #8 by Chris_Lattner

I would really want
case #known + case #unknown == case _ for non-exhaustive enums.

1 Like

I agree with the former (containing non-frozen enums) for convenience and keeping things simple.

Let me summarize interaction that I envision between switch statements and non-frozen enums:

If you have an exhaustive switch on a non-frozen enum, compiler makes you put this limited type of default (I prefer to call it default(unknown)). Compiler will warn you if you use default(unknown) (default for unknowns) and do not have cases for all known cases. This is the key purpose of it being different from ordinary default.

It would be a warning to use default(unknown) when the enum is frozen.

it would be a warning to use default when the switch is already exhaustive (for currently known cases) for a non-frozen enum, since such default should only be intended to catch the cases that are unknown at this time. An auto fix can be suggested to change it to default(unknown).

I think that is all there is to it for the time being to keep things simple. Basically, it is just some compiler diagnostic to help when you intend to write exhaustive switches over non-frozen enums.

1 Like

This is intriguing. As far as I recall, it had not been previously suggested that “default” would warn when used on an otherwise-exhaustive switch over a non-frozen enum. I think it is a great idea.

After all, it would be quite easy to write “default” out of simple habit, and thus lose the benefit of “unknown case”. Having the compiler nudge people toward the latter when appropriate is beneficial and I support it.

I don't think we should have a warning for default when the switch is already exhaustive. That's useful signal too: "I happen to have handled all the present cases, but I don't need to be notified when more are added." I suspect in practice no one will try to write a default on a fully-covered switch anyway; the diagnostic you get for leaving it out when you need one will definitely suggest unknown default (or whatever).

Thanks for the suggestion. I personally find default(unknown) to be harder to read. I did mention it to the other core team members and they didn't seem to care for it either.

Apart from whether or not we ever do a pattern-based #unknown—and I'm not at all sure we would want to, because it makes the notion of "catch-all" much more murky, and probably isn't the right name (see below)—the core team decided that it's okay to have a top-level spelling that's equivalent, just like default is equivalent to case _.

This is explicitly not the design for unknown case, because I don't want the case that gets picked to change just from recompiling with a newer SDK. It's a catch-all regardless; the "unknown"-ness is just about diagnostics.

1 Like

Since this new type of default is only going to provide a warning, why not just repurpose default to always provide the warnings? if somebody wants to silence the warnings then can switch (no pun intended) to case _ ? Has this been discussed before?

1 Like

That's interesting. I'd be wary of breaking expectations from C, but maybe it'd be worth it. @Chris_Lattner3, what do you think?

EDIT: Ah, wait, never mind. You can't tell if someone meant for a switch to be exhaustive or not.

I also considered and liked this idea, but expected it to meet resistance from users of default over case _, since it's forcing them to use a style they may have deliberately chosen not to use. For new users, it would reduce the complexity of switch statements considerably.