Theoretically, Character could be Codable, though there are some big pitfalls to consider when doing so:
The stability of grapheme clusters across Unicode versions isn't guaranteed, so you'd be liable to get different results with the same data across versions of Swift. For example:
- Version Sᵢ of Swift has knowledge of Unicode version Uᵢ, which defines a sequence of Unicode code points P to be multiple grapheme clusters.
- A later version Sⱼ of Swift has knowledge of Unicode version Uⱼ, which redefines P to be a single grapheme cluster.
- You build a program with version Sⱼ of Swift and encode your single Character(P). You then try to decode that same data with the same program built with version Sᵢ of Swift and it breaks. Well, maybe not catastrophically, but it would depend on what happens if Character.init(from:) encounters encoded data representing more than a single Character. Likely it would throw an error, but the data remains undecodable.
(This might sound contrived, but Unicode 11 changed grapheme clusters to be defined by a regular expression, and extended the definition of what might be considered an "allowable" grapheme cluster, especially around pictographic sequences [emoji]. The yellow in that link is showing a diff from the previous version of the report [TR29-31] to the next [TR29-33]. So what might be considered a single grapheme cluster in Unicode 11 may be considered multiple in earlier versions of Unicode.)
As @SDGGiesbrecht notes in the other thread, it's likely you'll want to encode your Character as a String explicitly so that at decode time, you can figure out how you might want to deal with the failure modes directly.
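Here is a rough sketch of what that could look like. CodableCharacter is a hypothetical wrapper name of my own, not anything in the standard library, and throwing on anything other than exactly one grapheme cluster is just one possible policy. The values are wrapped in an array only because some older Foundation versions reject top-level JSON fragments.

```swift
import Foundation

// Hypothetical wrapper (not a standard library type): holds a Character
// but always encodes it as a String.
struct CodableCharacter: Codable, Equatable {
    var value: Character

    init(_ value: Character) {
        self.value = value
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let string = try container.decode(String.self)
        // Under the decoding runtime's Unicode rules, the stored string
        // may be zero, one, or several grapheme clusters.
        guard string.count == 1, let character = string.first else {
            // Assumed policy: reject the data. Other policies are possible.
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected exactly one Character, found \(string.count)")
        }
        self.value = character
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(String(value))
    }
}

do {
    // Round trip through JSON.
    let encoded = try JSONEncoder().encode([CodableCharacter("é")])
    let roundTripped = try JSONDecoder().decode([CodableCharacter].self, from: encoded)
    print(roundTripped.map { $0.value })

    // Data that holds more than one Character makes the decode throw.
    let twoCharacters = Data("[\"ab\"]".utf8)
    _ = try JSONDecoder().decode([CodableCharacter].self, from: twoCharacters)
} catch {
    print("Decoding failed: \(error)")
}
```

If throwing is too strict for your data, the same init(from:) is where you could instead keep the whole String around, or take only its first Character, since you're the one deciding what to do with the mismatch.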