(I guess this question has been asked before but I couldn't find the answer.)
Possibly because its Unicode definition (extended grapheme cluster) is not forward or backward stable. Loading a Character from an older file on disk that predates an OS update, or passing a Character between devices, could lead to decoding failures (or even crashes if it were implemented poorly). For almost all use cases, it is probably better to encode them as String. Any parceling into characters should be redone or rechecked after loading anyway.
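As a minimal sketch of that suggestion, a wrapper type can store a Character but encode it as a String, and recheck on decoding that the string still segments into exactly one character. The name CodableCharacter is hypothetical and not part of the standard library; the values are wrapped in an array only so the JSON has a non-fragment top level.

```swift
import Foundation

// Hypothetical wrapper (not a standard library type): holds a Character
// but is written to and read from the archive as a String.
struct CodableCharacter: Codable, Equatable {
    var value: Character

    init(_ value: Character) {
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let string = try container.decode(String.self)
        // Recheck after loading: text that was one grapheme cluster when it
        // was saved may segment differently under the Unicode rules in use now.
        guard string.count == 1, let character = string.first else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected exactly one Character, found \(string.count)."
            )
        }
        self.value = character
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(String(value))
    }
}

// Round trip through JSON.
let original = [CodableCharacter("é"), CodableCharacter("🇯🇵")]
let data = try JSONEncoder().encode(original)
let decoded = try JSONDecoder().decode([CodableCharacter].self, from: data)
```

Throwing on a mismatch is only one possible policy; a caller that can tolerate re-segmentation might prefer to decode the plain String and split it again itself.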
String indices and offsets are vulnerable to the same thing, which is why there is no good way to do coding with them either. Offsets into the scalar, UTF-8 and UTF-16 views are even worse, because if, say, the JSON undergoes normalization, they could all become invalid, and that can happen even without an update to Unicode.
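To illustrate the normalization point, the sketch below compares a precomposed and a decomposed spelling of the same text. The two literals stand in for a string before and after some normalizing step; the helper scalarOffset is hypothetical and exists only for this example.

```swift
// The same text in two canonically equivalent spellings.
let precomposed = "caf\u{E9}!"    // "é" as the single scalar U+00E9
let decomposed = "cafe\u{301}!"   // "e" followed by U+0301 COMBINING ACUTE ACCENT

// Swift compares strings by canonical equivalence, so these are equal
// and have the same number of Characters.
let sameText = (precomposed == decomposed)
let sameCharacterCount = (precomposed.count == decomposed.count)

// The lower-level views differ in length, so stored offsets into them
// no longer point at the same place.
let scalarCounts = (precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)
let utf8Counts = (precomposed.utf8.count, decomposed.utf8.count)
let utf16Counts = (precomposed.utf16.count, decomposed.utf16.count)

// Hypothetical helper: offset of the first occurrence of a scalar
// within the string's unicodeScalars view.
func scalarOffset(of scalar: Unicode.Scalar, in string: String) -> Int? {
    let scalars = string.unicodeScalars
    guard let index = scalars.firstIndex(of: scalar) else { return nil }
    return scalars.distance(from: scalars.startIndex, to: index)
}

let offsetBefore = scalarOffset(of: "!", in: precomposed)
let offsetAfter = scalarOffset(of: "!", in: decomposed)
```

Here the strings compare equal, yet the scalar offset of "!" moves from 4 to 5, so an offset saved against one spelling would land on the combining accent in the other.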