When applied with grapheme-cluster semantics (the default when applied to String), it would match grapheme cluster by grapheme cluster, and comparison would obey canonical equivalence. Some features might not be supported, e.g. generalizing some scalar properties to grapheme clusters. Resulting indices would be grapheme-cluster aligned.
When applied with scalar semantics (the default when applied to String.UnicodeScalarView), it would match scalar by scalar with binary semantics. Resulting indices would be scalar-aligned.
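To illustrate the difference between the two semantics, here is a minimal sketch that uses only existing standard library comparisons, not the proposed matching API (which is still undecided). The sample strings are chosen for illustration.

```swift
// Precomposed "é" (U+00E9) versus "e" followed by a combining acute accent (U+0301).
let precomposed = "caf\u{E9}"
let decomposed = "cafe\u{301}"

// Grapheme-cluster semantics: String comparison obeys canonical equivalence.
let equalAsStrings = (precomposed == decomposed)  // true

// Scalar semantics: the scalar views are compared scalar by scalar (binary).
let equalAsScalars = precomposed.unicodeScalars
    .elementsEqual(decomposed.unicodeScalars)  // false

// The same text has a different length under each view.
let graphemeCount = decomposed.count                 // 4
let scalarCount = decomposed.unicodeScalars.count    // 5

// A bare "e" is not an element of the String (it is part of the "é" cluster),
// but it is an element of the scalar view.
let e: Unicode.Scalar = "e"
let foundAsCharacter = decomposed.contains(Character("e"))  // false
let foundAsScalar = decomposed.unicodeScalars.contains(e)   // true
```

A matcher with grapheme-cluster semantics would presumably behave like the String comparisons here, and one with scalar semantics like the UnicodeScalarView comparisons.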
Application to one of the encoded views is TBD, but it will probably adhere closely to scalar semantics.
These would also likely have different character classes: one that maps to a Character property (and we should add new ones as part of this effort) and one that follows the conventional character classes, defined in terms of scalar properties. Character classes would likely be customizable (e.g. a POSIX mode, or even supplying a custom one), mechanism TBD.
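As a rough sketch of why the two kinds of character class can disagree, the following uses properties that already exist in the standard library. How the proposed character classes would actually map onto them is an assumption, since the mechanism is TBD.

```swift
// One grapheme cluster made of two scalars: "e" + combining acute accent.
let cluster: Character = "e\u{301}"

// Character-level property: asks about the cluster as a whole.
let clusterIsLetter = cluster.isLetter  // true

// Scalar-level properties: each scalar is classified on its own.
let categories = cluster.unicodeScalars.map { $0.properties.generalCategory }
// [.lowercaseLetter, .nonspacingMark]
```

A class defined on Character would treat the whole cluster as a letter, while a scalar-based class would see a letter followed by a combining mark.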