Why isn't Character Codable?

Jens · September 25, 2019, 7:31am

(I guess this question has been asked before but I couldn't find the answer.)

SDGGiesbrecht · September 25, 2019, 7:08pm

Possibly because its Unicode definition (extended grapheme cluster) is not stable across Unicode versions, forward or backward: the segmentation rules can change between releases. Loading a Character from an older file on disk that predates an OS update, or passing a Character between devices, could lead to decoding failures (or even crashes if it were implemented poorly), because what one version treats as a single Character another may split into several. For almost all use cases, it is probably better to encode them as String. Any segmentation into characters should be redone or rechecked after loading anyway.
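If you do want a single-character field, here is a minimal sketch of the "encode as String, recheck after loading" idea. `CodableCharacter` is a hypothetical wrapper name, not anything in the standard library, and throwing on a mismatch is just one possible policy:

```swift
import Foundation

// Hypothetical wrapper (not a standard library type): stores a Character
// as a String and rechecks on decode that it is still exactly one Character.
struct CodableCharacter: Codable, Equatable {
    var value: Character

    init(_ value: Character) { self.value = value }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let string = try container.decode(String.self)
        // Grapheme breaking is redone here with whatever Unicode rules
        // the current runtime uses, so stale data fails loudly.
        guard string.count == 1, let character = string.first else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected exactly one Character, found \(string.count)."
            )
        }
        self.value = character
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(String(value))
    }
}

// Round trip through JSON (wrapped in an array to avoid a top-level fragment).
let data = try JSONEncoder().encode([CodableCharacter("é")])
let decoded = try JSONDecoder().decode([CodableCharacter].self, from: data)
print(decoded == [CodableCharacter("é")])
```

A payload that no longer segments into one Character (here simulated with `["ab"]`) throws a `DecodingError` instead of trapping, so the caller decides what to do with it.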

String indices and offsets are vulnerable to the same thing, which is why there is no good way to do coding with them either. Offsets into the Unicode scalar, UTF‐8 and UTF‐16 views are even worse, because if, say, the JSON undergoes normalization, they could all become invalid, and that can happen even without an update to Unicode.
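A small illustration of the normalization problem, assuming Foundation's `precomposedStringWithCanonicalMapping` (NFC) stands in for whatever normalization a pipeline might apply to the JSON. The sample string is made up for the example:

```swift
import Foundation

// "é" written as "e" + U+0301 COMBINING ACUTE ACCENT (decomposed), then "x".
let decomposed = "e\u{301}x"
// NFC normalization via Foundation; an assumed stand-in for any normalizing step.
let composed = decomposed.precomposedStringWithCanonicalMapping

// Swift compares Strings by canonical equivalence, so these are "equal"...
print(decomposed == composed)                                          // true
// ...and they contain the same number of Characters...
print(decomposed.count, composed.count)                                // 2 2
// ...but every lower-level view has a different length, so a stored
// offset of "x" in any of them would be off after normalization.
print(decomposed.unicodeScalars.count, composed.unicodeScalars.count)  // 3 2
print(decomposed.utf8.count, composed.utf8.count)                      // 4 3
print(decomposed.utf16.count, composed.utf16.count)                    // 3 2
```

The Character count happens to survive here, but the scalar, UTF‐8 and UTF‐16 offsets of "x" all shift by one, with no change in Unicode version involved.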