That (CR + LF → LF) is not Unicode normalization, and the two are not canonically equivalent by Unicode’s standards. Hence they do not satisfy
The “normalization” we are talking about here (in the basic dictionary sense, not the Unicode technical term) is done simply because CR + LF is one Character (an extended grapheme cluster in Unicode parlance), but two ASCII values. For the Character instance to produce a single UInt8, it has to somehow handle two as one. To do that, it was decided to convert the pair to the equivalent UNIX line ending when it needs to be expressed as a single ASCII byte. The alternative design choice would have been to return nil, as is done for ≠ or any other Unicode‐only character, but that design seems even less intuitive.
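
To make the behaviour concrete, here is a small sketch you can paste into a playground. It assumes Swift 5 or later (where Character has asciiValue, unicodeScalars and ==); the constant names are just mine for illustration.

```swift
// CR + LF forms one extended grapheme cluster, so it fits in a single Character.
let crlf: Character = "\r\n"
let lf: Character = "\n"
let notEqualSign: Character = "≠"

// One Character, but two Unicode scalars (U+000D and U+000A).
let scalarCount = crlf.unicodeScalars.count

// asciiValue folds the pair into LF (0x0A) instead of returning nil.
let crlfASCII = crlf.asciiValue
let lfASCII = lf.asciiValue

// A character outside the ASCII range has no ASCII value at all.
let notEqualASCII = notEqualSign.asciiValue

// Equality is a separate matter: CR + LF and LF remain different characters.
let sameCharacter = (crlf == lf)

print(scalarCount, crlfASCII as Any, lfASCII as Any, notEqualASCII as Any, sameCharacter)
```

So the folding only happens inside asciiValue. Comparing the two characters directly still reports them as different, which is consistent with them not being canonically equivalent.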