@taylorswift, @johnno1962, what are your thoughts on things like '\u{200D}', interpolations, multi-line, and raw literals for Character literals? It's conceivable that Character's conformance could construct from raw scalar values or interpolations that way. Off the cuff, I'd say this is a "rejected" direction, because users can always use double quotes to access all of that.
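To illustrate the point about double quotes, here is a minimal sketch using only existing standard library behavior. The names `zwj`, `scalar`, and `fromInterpolation` are just for illustration.

```swift
// An escaped scalar in a double-quoted literal, typed as Character.
let zwj: Character = "\u{200D}"

// Interpolation goes through String first, then Character.init(_: String),
// which traps unless the string holds exactly one Character.
let scalar = Unicode.Scalar(0x41 as UInt8)
let fromInterpolation = Character("\(scalar)")
```

Both spellings already work today without any single-quote syntax.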
For the content of the pitch itself, I feel like it can be distilled down into its essence:
This can just mention that we have awesome String literals (which keep getting more awesome), but that it's also common in programming to want to use things that appear as characters to users in numeric contexts, using their Unicode scalar value (example: C chars).
The bytestrings concept is totally something we'll be exploring more in the future, but seems very out of place in this pitch because we're not pitching bytestrings. We can just drop it.
We can just drop this entire section. The discussion of encodings is unrelated, the terminology used can just be standard terminology, canonical equivalence is unrelated, bytestrings are an unrelated concept except as a literary device for the motivation section, same for machine strings, etc.
Again, can drop bytestrings concept and encoding validity discussion, which is unrelated to this pitch except as a motivator. The motivation is simply that it's common to want to use the visual representation of a numeric value in code when that corresponds to a character.
One of the future directions for String (a more recent link escapes me, but an old one is here) is to provide performance-sensitive or low-level users with direct access to code units. In that world, it would be much nicer to have numeric-character literals for use in conjunction with this hypothetical future API:
extension String {
  func withCodeUnits<T>(_ f: (UnsafeBufferPointer<UInt8>) throws -> T) rethrows -> T { ... }
}
The value of character literals which can convert to UInt8 for the body of f is hugely motivating compared to raw numbers in code.
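As a rough sketch of the readability gap, the following compiles today. `withCodeUnitsSketch` is a hypothetical stand-in for the future API above (it copies the UTF-8 code units rather than exposing storage directly), and `UInt8(ascii:)` is the existing initializer standing in for what a character literal would express.

```swift
// Hypothetical stand-in for the future API sketched above;
// this is not a real standard library method.
extension String {
    func withCodeUnitsSketch<T>(_ f: (UnsafeBufferPointer<UInt8>) throws -> T) rethrows -> T {
        // Copy the UTF-8 code units so a contiguous buffer is guaranteed.
        let units = Array(self.utf8)
        return try units.withUnsafeBufferPointer { try f($0) }
    }
}

let line = "a,b,c"

// Raw number: the reader has to know that 44 is a comma.
let rawCount = line.withCodeUnitsSketch { buffer in
    buffer.filter { $0 == 44 }.count
}

// Same intent with the existing UInt8(ascii:) initializer.
let asciiCount = line.withCodeUnitsSketch { buffer in
    buffer.filter { $0 == UInt8(ascii: ",") }.count
}
```

With the pitched single-quote literals, the comparison could presumably be written against the character directly instead of going through `UInt8(ascii:)`.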
If we want to go with the tables in Prepitch: Character integer literals - #180 by Michael_Ilseman, then the proposed solution is fairly straightforward. The tables are pretty self-explanatory: we use single quotes for character literals, and we can list the protocol declarations under "Detailed Design". The deck-chair rearrangement necessary for source compatibility can go under the "Source Compatibility" section to keep it out of the spotlight.
Actually, this has a nice ABI impact: it purges all the unnecessary intermediary protocols. If this doesn't make the deadline, we can keep the entry points if necessary.
It seems like several have been debated on this thread. You can also mention that we're not going to extend anything fancy like interpolations or scalar values into the character literal syntax.
If you'd like, I can also help drive this proposal because I think it is a compelling future direction for String, if you're willing to wait a few weeks ;-)