Unicode is only a problem if the data tables are part of a system dependency (ICU, or the standard library on Apple platforms). If you statically link those things, your view of Unicode characters is also fixed at compile time and could, in theory, also be evaluated by the compiler. I believe Unicode also makes forward-compatibility guarantees, although I haven't examined them in much detail and they may or may not be sufficient for everything we'd want.
One issue that I've found with Swift is that it won't statically allocate objects (i.e. serialise them in the binary); `StaticString` is basically the only exception, and even simple arrays of `RawOptionSet`s get initialised at start-time in the `main` function.
For example, take the percent-encoding table used by WebURL (Godbolt). If you check out the `main` function, you'll see that the compiler evaluated the array enough to reduce it to a series of magic numbers, but still initialises it at runtime. For us, part of the goal of the project is to be 100% Swift and not use any C shims, and it's only 256 bytes so the overhead is small, but it is a significant reason why the standard library's own Unicode support has to be written in C rather than Swift.
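To make the shape of the problem concrete, here is a minimal sketch of that kind of table. It is not WebURL's actual code: `EncodeSets`, `encodeTable`, `entry(for:)`, and the handful of entries are hypothetical stand-ins, and the values are only illustrative. Every element is a literal, so the whole array is known at compile time, yet as a top-level variable it is still built by code that runs in `main`.

```swift
// Hypothetical stand-in for a percent-encoding table; not WebURL's real code.
// Each entry records which (illustrative) encode sets a byte belongs to.
struct EncodeSets: OptionSet {
    let rawValue: UInt8
    init(rawValue: UInt8) { self.rawValue = rawValue }

    static let path     = EncodeSets(rawValue: 1 << 0)
    static let query    = EncodeSets(rawValue: 1 << 1)
    static let fragment = EncodeSets(rawValue: 1 << 2)
}

// Covers only bytes 0x20...0x27 to keep the sketch short.
// All elements are constant, but the array storage is still
// allocated and filled in when the program starts.
let encodeTable: [EncodeSets] = [
    [.path, .query, .fragment], // 0x20 space
    [],                         // 0x21 !
    [.path, .query, .fragment], // 0x22 "
    [.path, .query],            // 0x23 #
    [],                         // 0x24 $
    [],                         // 0x25 %
    [],                         // 0x26 &
    [],                         // 0x27 '
]

// Looks up the entry for a byte, or nil if the sketch has no entry for it.
func entry(for byte: UInt8) -> EncodeSets? {
    guard byte >= 0x20, byte <= 0x27 else { return nil }
    return encodeTable[Int(byte) - 0x20]
}
```

Compiling something like this with optimisations and reading the generated `main` is a quick way to check whether a given compiler version still emits the runtime initialisation.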
So what I wonder is: will we be able to support compile-time constant values that allocate memory? It doesn't seem like we can today. I'm not just talking about Arrays; also user-defined types built with `ManagedBuffer` etc.
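As a sketch of the kind of user-defined type I have in mind, assume a small fixed-size byte table backed by `ManagedBuffer`. The names `FixedTable` and `hexDigits` are made up for illustration. The contents are fully determined by a literal, but constructing the value needs a heap allocation, so it can only happen at runtime today.

```swift
// Hypothetical fixed-size byte table backed by ManagedBuffer.
// The header stores the element count; elements are trivial (UInt8),
// so no custom deinitialisation is needed in this sketch.
struct FixedTable {
    private let storage: ManagedBuffer<Int, UInt8>

    init(_ bytes: [UInt8]) {
        // Allocates tail storage on the heap; this is the step that
        // would have to be serialised into the binary for a constant.
        storage = ManagedBuffer<Int, UInt8>.create(minimumCapacity: bytes.count) { _ in
            bytes.count
        }
        storage.withUnsafeMutablePointerToElements { elements in
            elements.initialize(from: bytes, count: bytes.count)
        }
    }

    var count: Int { storage.header }

    subscript(index: Int) -> UInt8 {
        precondition(index >= 0 && index < count, "index out of range")
        return storage.withUnsafeMutablePointerToElements { $0[index] }
    }
}

// Known entirely at compile time, but built at startup.
let hexDigits = FixedTable(Array("0123456789ABCDEF".utf8))
```

Ideally a value like `hexDigits` could be laid out in the binary the same way a `StaticString` is, rather than being allocated and filled in on launch.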