Why is Character not Codable?

itaiferber · March 23, 2022, 6:19pm

Theoretically, Character could be Codable, though there are some big pitfalls to consider when doing so:

The stability of grapheme clusters across Unicode versions isn't guaranteed, so you'd be liable to get different results with the same data across versions of Swift. For example:

1. Version Sᵢ of Swift has knowledge of Unicode version Uᵢ, which defines a sequence of Unicode code points P to be multiple grapheme clusters.
2. A later version Sⱼ of Swift has knowledge of Unicode version Uⱼ, which redefines P to be a single grapheme cluster.
3. You build a program with version Sⱼ of Swift and encode your single Character(P). You then try to decode that same data with the same program built with version Sᵢ of Swift, and it blows up. Well, maybe not blows up, but the outcome depends on what Character.init(from:) does when it encounters encoded data representing more than a single Character. Likely it would throw an error, but either way the data remains undecodable.

(This might sound contrived, but Unicode 11 redefined grapheme clusters in terms of a regular expression and extended what might be considered an "allowable" grapheme cluster, especially around pictographic sequences [emoji]. The yellow highlighting in that link shows a diff from the previous version of the report [TR29-31] to the next [TR29-33]. So what is considered a single grapheme cluster in Unicode 11 may be considered multiple grapheme clusters in earlier versions of Unicode.)
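To make the distinction concrete, here is a small sketch that compares the scalar count of an emoji sequence with its Character count. The choice of sequence (a man-woman-girl family joined by zero-width joiners) is only an illustration and not taken from the report. The scalar count is a property of the data, whereas the Character count depends on the grapheme-breaking rules the standard library was built with, which is why no expected value is given for it.

```swift
// A ZWJ emoji sequence: man, ZWJ, woman, ZWJ, girl.
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}"

// The number of Unicode scalars is fixed by the data itself.
let scalarCount = family.unicodeScalars.count

// The number of Characters (grapheme clusters) depends on the Unicode
// rules known to the Swift version in use, so it may differ across versions.
let characterCount = family.count

print("scalars: \(scalarCount), characters: \(characterCount)")
```

If two builds of the same program disagree on `characterCount` for the same string, a value that one build treats as a single Character is no longer a single Character for the other.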

As @SDGGiesbrecht notes in the other thread, it's likely you'll want to encode your Character as a String explicitly so that at decode time, you can figure out how you might want to deal with the failure modes directly.
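A minimal sketch of that approach, assuming you want to reject anything that doesn't decode to exactly one Character. `CodableCharacter` is a hypothetical wrapper name, not a standard library type, and throwing is only one possible policy; you could just as well keep the first Character or fall back to a default.

```swift
import Foundation

// Hypothetical wrapper: stores a Character but encodes it as a String.
struct CodableCharacter: Codable, Equatable {
    var value: Character

    init(_ value: Character) { self.value = value }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let string = try container.decode(String.self)
        // Policy decision (an assumption of this sketch): fail unless the
        // decoded String is exactly one Character under the current rules.
        guard string.count == 1, let character = string.first else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected exactly one Character, found \(string.count)")
        }
        value = character
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(String(value))
    }
}

// Round trip through JSON, wrapped in an array to avoid top-level fragments.
let encoded = try? JSONEncoder().encode([CodableCharacter("é")])
let decoded = encoded.flatMap { try? JSONDecoder().decode([CodableCharacter].self, from: $0) }

// Data holding more than one Character is rejected at decode time.
let rejected = try? JSONDecoder().decode([CodableCharacter].self, from: Data(#"["ab"]"#.utf8))
```

Because the failure is handled in your own `init(from:)`, the decision about what to do with data that no longer forms a single Character stays in your hands.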