There is Unicode.Scalar.Properties.isEmoji, but an emoji character can be composed of multiple scalars, so it seems there should be a Character.isEmoji? According to the doc comment on Unicode.Scalar.Properties.isEmoji, determining whether a Character is an emoji is not simple:
"testing isEmoji alone on a single scalar is insufficient to determine if a unit of text is rendered as an emoji; a correct test requires inspecting multiple scalars in a Character. In addition to checking whether the base scalar has isEmoji == true, you must also check its default presentation (see isEmojiPresentation) and determine whether it is followed by a variation selector that would modify the presentation."
Shouldn't the logic discussed above be encapsulated in the missing Character.isEmoji? That would save everyone from having to know the details of how emoji work in Unicode.
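To make the idea concrete, here is a rough sketch of what such a property might look like if it followed the three checks from the doc comment. The name isEmojiSketch is hypothetical (nothing like it exists in the standard library), and it only looks at the scalar right after the base, so treat it as an illustration of the rule rather than a complete emoji test:

```swift
extension Character {
    /// Hypothetical helper, not part of the standard library.
    /// Follows the doc comment: base scalar must have isEmoji, then a
    /// variation selector (if present) overrides the default presentation.
    var isEmojiSketch: Bool {
        var iterator = unicodeScalars.makeIterator()
        guard let base = iterator.next(), base.properties.isEmoji else {
            return false
        }
        // Assumption: only the scalar immediately after the base is inspected.
        let selector = iterator.next()?.value
        if selector == 0xFE0F { return true }   // VS16 requests emoji presentation
        if selector == 0xFE0E { return false }  // VS15 requests text presentation
        // No selector: fall back to the base scalar's default presentation.
        return base.properties.isEmojiPresentation
    }
}

let digit: Character = "1"                    // isEmoji is true, but default is text
let keycap: Character = "1\u{FE0F}\u{20E3}"   // same base, with VS16 and keycap
print(digit.isEmojiSketch, keycap.isEmojiSketch)
```

Note how the digit "1" passes the isEmoji check on its own, which is exactly why the single-scalar property is not enough.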
Next, how do I get all the emoji scalars? The NSHipster article on CharacterSet does it this way:
import Foundation

var emoji = CharacterSet()
for codePoint in 0x0000...0x1F0000 {
    guard let scalarValue = Unicode.Scalar(codePoint) else {
        continue
    }
    // Implemented in Swift 5 (SE-0221)
    // https://github.com/apple/swift-evolution/blob/master/proposals/0221-character-properties.md
    if scalarValue.properties.isEmoji {
        emoji.insert(scalarValue)
    }
}
So it brute-forces every code point in a range that is larger than it needs to be (Unicode ends at U+10FFFF). Isn't that inefficient?
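For what it's worth, the upper bound of that loop can be tightened: Unicode.Scalar only accepts values up to 0x10FFFF, so everything past that just fails the guard and continues. A minimal sketch that scans only the valid range and keeps the result in a constant, so the cost is paid once (emojiScalars is a name I made up, and this is still a brute-force scan, just a bounded one):

```swift
// Scan every valid code point once and keep the scalars whose
// Emoji property is set. Surrogate values fail the initializer and are skipped.
let emojiScalars: [Unicode.Scalar] = (UInt32(0)...UInt32(0x10FFFF)).compactMap { value -> Unicode.Scalar? in
    guard let scalar = Unicode.Scalar(value), scalar.properties.isEmoji else {
        return nil
    }
    return scalar
}

print("scalars with isEmoji == true:", emojiScalars.count)
```

The exact count depends on the Unicode version the runtime ships with, so I would not hard-code it.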
I found this on Stack Overflow:
extension Character {
    /// A simple emoji is one scalar and presented to the user as an Emoji
    var isSimpleEmoji: Bool {
        guard let firstScalar = unicodeScalars.first else { return false }
        return firstScalar.properties.isEmoji && firstScalar.value > 0x238C
    }

    /// Checks if the scalars will be merged into an emoji
    var isCombinedIntoEmoji: Bool { unicodeScalars.count > 1 && unicodeScalars.first?.properties.isEmoji ?? false }

    var isEmoji: Bool { isSimpleEmoji || isCombinedIntoEmoji }
}
It does not implement all the logic described in the Unicode.Scalar.Properties.isEmoji comment, so it may not be completely correct. And I don't know the reason for this condition:
&& firstScalar.value > 0x238C
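One way to investigate is to list which scalars have isEmoji set at or below that threshold. My guess (an assumption, not something the answer's author states) is that the cutoff is meant to filter out things like digits, "#" and "*", which carry the Emoji property but render as text by default. A small exploratory sketch, where threshold and belowThreshold are names I made up:

```swift
// Scalars that `value > 0x238C` would reject even though isEmoji is true.
let threshold: UInt32 = 0x238C
let belowThreshold = (UInt32(0)...threshold).compactMap { value -> Unicode.Scalar? in
    guard let scalar = Unicode.Scalar(value), scalar.properties.isEmoji else {
        return nil
    }
    return scalar
}

for scalar in belowThreshold {
    let hex = String(scalar.value, radix: 16, uppercase: true)
    // isEmojiPresentation tells whether the scalar renders as emoji by default.
    print("U+\(hex) \(scalar) emojiPresentation=\(scalar.properties.isEmojiPresentation)")
}
```

If that guess is right, the cutoff is only a heuristic: U+231A (watch) and U+231B (hourglass) sit below it and have emoji presentation by default, so isSimpleEmoji would reject them.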
So how do I get a list of all emoji scalars?
// NOTE: These ranges are still just a subset of all the emoji characters;
// they seem to be all over the place...
let emojiRanges = [
    0x1F601...0x1F64F,
    0x2702...0x27B0,
    0x1F680...0x1F6C0,
    0x1F170...0x1F251
]

for range in emojiRanges {
    for i in range {
        guard let scalar = UnicodeScalar(i) else { continue }
        let c = String(scalar)
        print(c)
    }
}
In this SO post, the emoji scalar ranges are given as:
unicode-range:
U+0080-02AF, U+0300-03FF, U+0600-06FF, U+0C00-0C7F, U+1DC0-1DFF, U+1E00-1EFF, U+2000-209F, U+20D0-214F, U+2190-23FF, U+2460-25FF, U+2600-27EF, U+2900-29FF, U+2B00-2BFF, U+2C60-2C7F, U+2E00-2E7F, U+3000-303F, U+A490-A4CF, U+E000-F8FF, U+FE00-FE0F, U+FE30-FE4F, U+1F000-1F02F, U+1F0A0-1F0FF, U+1F100-1F64F, U+1F680-1F6FF, U+1F910-1F96B, U+1F980-1F9E0;
If this is correct, is it the complete list?
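One way to sanity-check a hard-coded list like that is to compute the contiguous ranges directly from isEmoji and compare. The list above includes U+0600-06FF (the Arabic block) and U+E000-F8FF (the Private Use Area), so it looks more like a font's coverage than a list of emoji scalars. A sketch, where computedEmojiRanges is a hypothetical helper of mine:

```swift
/// Hypothetical helper: collapses all scalars with isEmoji == true
/// into contiguous closed ranges of code point values.
func computedEmojiRanges() -> [ClosedRange<UInt32>] {
    var ranges: [ClosedRange<UInt32>] = []
    for value in UInt32(0)...UInt32(0x10FFFF) {
        guard let scalar = Unicode.Scalar(value), scalar.properties.isEmoji else {
            continue
        }
        if let last = ranges.last, last.upperBound + 1 == value {
            // Extend the current run by one code point.
            ranges[ranges.count - 1] = last.lowerBound...value
        } else {
            // Start a new run.
            ranges.append(value...value)
        }
    }
    return ranges
}

let ranges = computedEmojiRanges()
for range in ranges {
    let lower = String(range.lowerBound, radix: 16, uppercase: true)
    let upper = String(range.upperBound, radix: 16, uppercase: true)
    print("U+\(lower)...U+\(upper)")
}
```

The result reflects whatever Unicode version the Swift runtime ships with, so it can change between OS and toolchain releases.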
Where is the meat of Unicode.Scalar.Properties.isEmoji? I can’t find it in the GitHub source.