SE-0221 – Character Properties

Michael_Ilseman · July 23, 2018, 6:24pm

I feel like there are two separable aspects to what you brought up here:

One is a request to continue adding APIs for Unicode-savvy users, à la Unicode Scalar Properties and the recently pitched String case folding and normalization APIs. Specifically, you want the ability to request version-specific information.

It sounds like you want something akin to versioned Unicode Scalar Properties, but I feel it would be out of place for the standard library proper as currently designed. It would require shipping all versions of Unicode data files, reconciling availability (so all properties might end up being optional), etc. I think it would make a very interesting SPM package today, and in the future could have a place in a form of “extended” libraries or package catalogue for Unicode experts/enthusiasts.
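To make the gap concrete, here is a rough sketch of the one piece of version information that Unicode Scalar Properties already exposes: `age`, the Unicode version in which a scalar was first assigned, according to whatever Unicode data the runtime uses. The helper `wasAssigned(_:by:)` is a hypothetical name for illustration, not a standard library API.

```swift
// Hypothetical helper (not a standard library API): reports whether `scalar`
// was already assigned as of the given Unicode version, according to the
// Unicode data the running standard library uses.
func wasAssigned(_ scalar: Unicode.Scalar, by version: Unicode.Version) -> Bool {
    // `age` is nil for scalars that are unassigned in the runtime's data.
    guard let age = scalar.properties.age else { return false }
    if age.major != version.major { return age.major < version.major }
    return age.minor <= version.minor
}

let eAcute: Unicode.Scalar = "\u{E9}"  // LATIN SMALL LETTER E WITH ACUTE

// Per-scalar queries available today.
let isAlphabetic = eAcute.properties.isAlphabetic
let category = eAcute.properties.generalCategory

// Asks only "was this scalar assigned by Unicode 6.0?", nothing more.
let knownInUnicode6 = wasAssigned(eAcute, by: (major: 6, minor: 0))
```

Everything here is answered from a single data set, so it cannot say how `isAlphabetic` would have answered under an older Unicode version. That is the part that would need the extra data files.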

The other is a concern about the stability of answers to this query. Many String APIs suffer from this, including String.count and String.lowercased(), whose results can vary from one version of Unicode to the next. Do you see something particularly troubling for Character Properties that doesn’t already apply to String?
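As a small illustration of why `String.count` is tied to the Unicode version, here is a sketch that compares scalar counts (fixed by the string's contents) with `Character` counts (decided by grapheme-breaking rules). The variable names are mine, and I have deliberately not written down a result for the emoji case, since it depends on the rules the runtime implements.

```swift
// "e" followed by COMBINING ACUTE ACCENT: two scalars, one Character.
let decomposed = "e\u{301}"
let decomposedScalars = decomposed.unicodeScalars.count
let decomposedCharacters = decomposed.count

// WOMAN + ZWJ + WOMAN + ZWJ + GIRL: an emoji sequence joined by
// ZERO WIDTH JOINER. The scalar count is always 5, but how many
// Characters it forms depends on the grapheme-breaking rules of the
// Unicode version the standard library implements.
let family = "\u{1F469}\u{200D}\u{1F469}\u{200D}\u{1F467}"
let familyScalars = family.unicodeScalars.count
let familyCharacters = family.count

print(decomposedScalars, decomposedCharacters)
print(familyScalars, familyCharacters)

// Case mapping likewise comes from the Unicode data in use.
let lowered = "STRASSE".lowercased()
```

The same reasoning carries over to Character Properties: any answer derived from grapheme breaking or from Unicode data tables can shift when that data is updated.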