I've moved this discussion to a more appropriate subforum from the original thread ([Pitch] Compile-Time Constant Values). I'm hoping that someone who works heavily on SILOptimizer like @Erik_Eckstein might have some more insight here.
I've been looking into taking advantage of `swift_initStaticObject`-based values for other projects, but if they're sensitive to small perturbations in the input like this, then that gets a lot trickier. To frame the findings below, I have a couple of questions:
- Should we be able to rely on the optimizer to be able to outline certain forms of global/static data without any undocumented or hard-to-understand limitations around the size of that data? Should hitting such a limitation be considered a bug?
- Could this outlining be made to work in debug builds as well? Even though performance isn't as critical in those cases, it would also be great if we could avoid penalizing debug builds in situations where we want to use static data.
If we could solve these two questions (or at least the first one), it would be a great way to handle simple static data in Swift without having to introduce new keywords or attributes into the language.
There definitely are situations where the compiler can outline complex values directly into the data segment of the binary, but I don't know what the limitations are. For instance, this example (godbolt) with a simple struct containing a `String` gets converted to a data blob:
    output.staticArray : [output.Foo]:
            .zero   8

    mainTv_:
            .zero   8
            .zero   16
            .quad   6
            .quad   12
            .quad   10
            .quad   7234932
            .quad   -2089670227099910144
            .quad   20
            .quad   133540975310708
            .quad   -1873497444986126336
            .quad   30
            .quad   133541042677876
            .quad   -1873497444986126336
            .quad   100
            .quad   7236850741311532655
            .quad   -1513209474789907086
            .quad   1000
            .quad   8462097072821464687
            .quad   -1441151879073603213
            .quad   100000
            .quad   -3458764513820540908
            .quad   .L__unnamed_1+9223372036854775776

    .L__unnamed_1:
            .asciz  "one hundred thousand"
...where the code to initialize it consists only of a type metadata lookup and then a call to `swift_initStaticObject`, which appears to be fast: it just populates the runtime metadata pointer at the beginning of the inlined data.
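For readers who don't want to open the godbolt link, here is a sketch of source that would be consistent with the blob above. The field names `number` and `name` are my own placeholders; only `Foo`, `staticArray`, and the values themselves can be read off the dump (the strings are decoded from the `.quad` words).

```swift
// Hypothetical reconstruction: field names are assumptions, values come from the dump.
struct Foo {
    var number: Int
    var name: String
}

// Six elements, matching the `.quad 6` count word in the outlined object.
let staticArray: [Foo] = [
    Foo(number: 10, name: "ten"),
    Foo(number: 20, name: "twenty"),
    Foo(number: 30, name: "thirty"),
    Foo(number: 100, name: "one hundred"),
    Foo(number: 1000, name: "one thousand"),
    Foo(number: 100000, name: "one hundred thousand"),
]
```

Whether a given compiler version outlines this into a static object is exactly the open question here, so treat this as the shape of the input rather than a guarantee about the output.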
One cool thing about this is that I was able to use `String`s instead of `StaticString`s and it still worked. In fact, the strings that could fit into the small string representation used that instead of using a pointer to separate character data.
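To make the small-string observation concrete, here is a sketch that reproduces the two `.quad` words for the short strings in the dump. The layout it assumes (UTF-8 bytes packed little-endian, with the top byte of the second word holding `0xE0 | count` for ASCII content) is inferred from the dump on a 64-bit little-endian target; it is a standard library implementation detail, not something to rely on. The helper name `smallASCIIWords` is mine.

```swift
/// Packs an ASCII string of at most 15 bytes into two 64-bit words,
/// mirroring the pattern seen in the dump. Returns nil for anything else.
/// Assumption: 64-bit little-endian target; layout inferred, not guaranteed.
func smallASCIIWords(_ s: String) -> (UInt64, UInt64)? {
    let bytes = Array(s.utf8)
    guard bytes.count <= 15, bytes.allSatisfy({ $0 < 0x80 }) else { return nil }

    var words: [UInt64] = [0, 0]
    for (i, byte) in bytes.enumerated() {
        // Byte i lands in word i / 8, at bit offset (i % 8) * 8.
        words[i / 8] |= UInt64(byte) << ((i % 8) * 8)
    }
    // The top byte of the second word carries the discriminator and the count.
    words[1] |= UInt64(0xE0 | UInt8(bytes.count)) << 56
    return (words[0], words[1])
}

if let (lo, hi) = smallASCIIWords("ten") {
    // Matches `.quad 7234932` and `.quad -2089670227099910144` in the dump.
    print(lo, Int64(bitPattern: hi))
}
```

"one hundred thousand" is 20 bytes, so it does not fit, which is why the last element points at `.L__unnamed_1` instead.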
I'm not sure what it is about `OptionSet`s that prevents this from working, though. I think this happens as part of the ObjectOutliner SILOptimizer pass in the compiler, so that would be the place to improve.
The other drawback of this is that since it's an optimization pass, unoptimized builds will still get the slow code path that initializes everything at runtime. I wonder if this transformation could happen as a mandatory pass, so that debug builds could still rely on static data.
(A few days later)
I poked at this a little bit more and the outcome was interesting (and beyond my understanding of how the optimizer works).
It turns out `OptionSet`s by themselves aren't a problem; some of them are able to outline into static objects fine. Your WebURL case does if we stop the array after element `0x67`. But once we add element `0x68`, or anything after that, it stops outlining.
I looked more closely at the generated post-opt SIL and it looks like:
- In the `0x00-0x67` case, the ObjectOutliner pass outlines the whole array into a single SIL value `mainTv_`, which is what we want.
- In the `0x00-0x68` case, the individual arrays inside the larger array (the option set unions) are outlined separately (`mainTv_...mainTv7_`), and the overall array is never outlined.
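As a rough illustration of where those inner arrays presumably come from: an option set union written as `[.a, .b]` is an array literal, because `OptionSet` is `ExpressibleByArrayLiteral` through `SetAlgebra`. The sketch below uses made-up names (`CharClass`, `table`) and is not the WebURL code; it only shows the shape of a table whose elements are unions.

```swift
// Hypothetical option set; names are illustrative only.
struct CharClass: OptionSet {
    let rawValue: UInt8
    static let alpha = CharClass(rawValue: 1 << 0)
    static let digit = CharClass(rawValue: 1 << 1)
    static let hex   = CharClass(rawValue: 1 << 2)
}

// Outer array: the lookup table we would like outlined as one static object.
// Inner brackets: unions spelled as array literals, one per element.
let table: [CharClass] = [
    [],               // no flags
    [.digit, .hex],   // union of two flags
    [.alpha, .hex],   // another union
    .alpha,           // single flag, no array literal involved
]
```

Each bracketed union is a candidate for being built as its own small array before it is folded into a `CharClass` value, which would line up with seeing several separately outlined globals next to a table that is still initialized at runtime.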
Maybe the inner array outlining is an intermediate step to outlining the whole thing, and the `0x68`th element pushes the number of basic blocks or instructions past some limit coded into an optimizer pass, which then decides the code is too complex to analyze and gives up from that point on?