MrMage
(Daniel Alm)
1
I recently came across Daniel Lemire's work on fast UTF8 validation. Since Swift has to UTF8-validate nearly every string it reads from an external source, e.g. a SQLite database, I was wondering whether it would be possible and worthwhile to use Lemire's work to speed up Swift's current UTF8 validation code.
I am not an expert in UTF8 parsing and validation, but thought it would at least be worth discussing.
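For context, here is a minimal scalar sketch of the checks a UTF8 validator has to perform. It is neither Swift's actual implementation nor Lemire's algorithm; `isValidUTF8` is a hypothetical helper name used only for illustration. It walks the input one sequence at a time, which is the kind of byte-by-byte loop that a vectorized approach would aim to replace by handling many bytes per step.

```swift
/// Returns true if `bytes` is well-formed UTF-8.
/// Hypothetical helper for illustration, not a standard library API.
func isValidUTF8(_ bytes: [UInt8]) -> Bool {
    var i = 0
    let n = bytes.count
    while i < n {
        let lead = bytes[i]
        let length: Int      // total number of bytes in this sequence
        let minimum: UInt32  // smallest code point allowed for this length
        var scalar: UInt32   // code point decoded so far
        switch lead {
        case 0x00...0x7F:
            i += 1           // ASCII: always valid on its own
            continue
        case 0xC2...0xDF:
            length = 2; minimum = 0x80; scalar = UInt32(lead & 0x1F)
        case 0xE0...0xEF:
            length = 3; minimum = 0x800; scalar = UInt32(lead & 0x0F)
        case 0xF0...0xF4:
            length = 4; minimum = 0x10000; scalar = UInt32(lead & 0x07)
        default:
            // Stray continuation byte, 0xC0/0xC1, or 0xF5...0xFF.
            return false
        }
        // The sequence must not run past the end of the input.
        guard i + length <= n else { return false }
        for k in 1..<length {
            let byte = bytes[i + k]
            // Continuation bytes have the bit pattern 10xxxxxx.
            guard (byte & 0xC0) == 0x80 else { return false }
            scalar = (scalar << 6) | UInt32(byte & 0x3F)
        }
        // Reject overlong encodings, values above U+10FFFF, and surrogates.
        guard scalar >= minimum, scalar <= 0x10FFFF else { return false }
        guard scalar < 0xD800 || scalar > 0xDFFF else { return false }
        i += length
    }
    return true
}
```

Calling it as `isValidUTF8(Array(someString.utf8))` should return `true` for any Swift `String`, while inputs such as the overlong pair `[0xC0, 0x80]` are rejected.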
7 Likes
stevapple
(YR Chen)
2
I think that’s great work! I’m not familiar with UTF8, so I believe we should first evaluate the algorithm to confirm that it fits Swift’s use case and that its license is compatible with Swift’s Apache 2.0 license. You could also try contacting Daniel Lemire and inviting him to submit the PR himself.
Vectorizing UTF8 handling has been on our list for a while. I expect at some point @scanon will get a moment in between numerics projects and do something amazing there 
6 Likes