SE-0243: Codepoint and Character Literals

This seems like the worst of both worlds: you lose out on compile-time validation and get saddled with unneeded domain widening, but you don’t gain any clarity, as @xwu points out.

String literals are constructed at runtime due to ICU dependencies, so validating this is going to be pretty complicated. It seems like a perfect use case for the @expressible(none) compile-time literal attribute I proposed a few posts back, which takes a [Unicode.Scalar] array instead of a String. I’m sure you’re also aware of the pitfalls inherent in converting Unicode-aware Strings into ASCII bytestrings.
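
To make that last pitfall concrete, here’s a minimal sketch (my own illustrative example, not from the thread): going through .utf8 silently widens non-ASCII scalars into multi-byte sequences, so a checked conversion has to be failable.

```swift
let s = "café"

// The naive route gives no error: 'é' just becomes two UTF-8 bytes.
let naive = Array(s.utf8)   // [99, 97, 102, 195, 169]

/// A checked conversion: nil unless every scalar is ASCII.
func asciiBytes(of s: String) -> [UInt8]? {
    var bytes: [UInt8] = []
    for scalar in s.unicodeScalars {
        guard scalar.isASCII else { return nil }
        bytes.append(UInt8(scalar.value))
    }
    return bytes
}

asciiBytes(of: "cafe")  // Optional([99, 97, 102, 101])
asciiBytes(of: s)       // nil — 'é' has no ASCII byte
```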

Let’s not swat a fly with a nuclear warhead. I would hate to see people compiling regexes just to test if an ASCII byte is a digit or a letter.
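
For those common cases, plain range checks on the byte are all it takes — a sketch of my own, with hypothetical property names:

```swift
extension UInt8 {
    /// True for ASCII '0'...'9' (0x30...0x39).
    var isASCIIDigit: Bool { self >= 0x30 && self <= 0x39 }
    /// True for ASCII 'A'...'Z' (0x41...0x5A) or 'a'...'z' (0x61...0x7A).
    var isASCIILetter: Bool {
        (self >= 0x41 && self <= 0x5A) || (self >= 0x61 && self <= 0x7A)
    }
}

let byte: UInt8 = 0x37  // '7'
byte.isASCIIDigit       // true
byte.isASCIILetter      // false
```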

I need to remind everyone that the first drafts of the proposal specified exactly this behavior, but there was a lot of pushback from people in favor of 'a' for Character literals. (Read basically the 30 posts before the one I linked.) Backtracking on this is likely to bring a lot of the pro-Character-literal people out of the woodwork to defend their syntax.

I have no objection to ':'.ascii, but there are a lot of practical challenges that would make this API hard to actually use:

  • We can’t vend this on Character, because it would be too confusing to have an optional asciiValue and a trapping ascii on the same type, and the latter seems to go against the spirit of what Character is trying to model.

  • We’re left with vending this on Unicode.Scalar, but then we have to sacrifice Character literals to avoid contortions like (':' as Unicode.Scalar).ascii (see the sketch after this list). I would also say that many of the arguments against ascii on Character also apply to ascii on Unicode.Scalar. Unicode.Scalar can model 1,111,998 codepoints; it would be weird and against the spirit of the type to consider 1,111,870 of them “edge cases”, which is the assumption we make when we make something trapping instead of optional.

  • ':'.ascii just doesn’t tell a great compile-time validation story. Of course, we could special-case it and make this particular expression known to the compiler, but that doesn’t sound particularly generalizable to me. I disagree with @xwu’s assertion that compile-time validation should be heuristic and implicit. It’s far more useful to know when and where to trust the compiler to handle things, so that I know to add manual runtime validation (or static #asserts at the call site) in the situations where it’s not.
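
For reference, here’s a sketch of the two shapes being weighed above, written in today’s double-quote syntax. asciiValue is the existing standard library property on Character; the trapping ascii property on Unicode.Scalar is hypothetical, spelled out only to show the contortion involved:

```swift
// Hypothetical trapping accessor — NOT a standard library API.
extension Unicode.Scalar {
    var ascii: UInt8 {
        precondition(isASCII, "scalar outside the 128 ASCII codepoints")
        return UInt8(value)
    }
}

// Existing API: optional, honest about the 1,111,870 non-ASCII scalars.
let optionalByte = (":" as Character).asciiValue   // UInt8? — Optional(58)

// The contortion: without character literals, the scalar needs an
// explicit coercion before the trapping property can be used.
let trappingByte = (":" as Unicode.Scalar).ascii   // 58; traps if non-ASCII
```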