Consistency of the ObjectOutliner SILOptimizer pass

I've moved this discussion to a more appropriate subforum from the original thread ([Pitch] Compile-Time Constant Values). I'm hoping that someone who works heavily on SILOptimizer like @Erik_Eckstein might have some more insight here.

I've been looking into taking advantage of swift_initStaticObject-based values for other projects, but if they're sensitive to small perturbations in the input like this, then that gets a lot trickier. To frame the findings below, I have a couple of questions:

  1. Should we be able to rely on the optimizer to be able to outline certain forms of global/static data without any undocumented or hard-to-understand limitations around the size of that data? Should hitting such a limitation be considered a bug?
  2. Could this outlining be made to work in debug builds as well? Even though performance isn't as critical in those cases, it would also be great if we could avoid penalizing debug builds in situations where we want to use static data.

If we could resolve these two questions (or at least the first one), it would be a great way to handle simple static data in Swift without having to introduce new keywords or attributes into the language.


(My initial reply to @Karl's comment about the cost of initializing effectively static data at runtime)

There definitely are situations where the compiler can outline complex values directly into the data segment of the binary, but I don't know what the limitations are. For example, this snippet (godbolt), an array of a simple struct holding an Int and a String, gets converted to a data blob:

output.staticArray : [output.Foo]:
        .zero   8

mainTv_:
        .zero   8
        .zero   16
        .quad   6
        .quad   12
        .quad   10
        .quad   7234932
        .quad   -2089670227099910144
        .quad   20
        .quad   133540975310708
        .quad   -1873497444986126336
        .quad   30
        .quad   133541042677876
        .quad   -1873497444986126336
        .quad   100
        .quad   7236850741311532655
        .quad   -1513209474789907086
        .quad   1000
        .quad   8462097072821464687
        .quad   -1441151879073603213
        .quad   100000
        .quad   -3458764513820540908
        .quad   .L__unnamed_1+9223372036854775776

.L__unnamed_1:
        .asciz  "one hundred thousand"

...where the code to initialize it consists only of a type metadata lookup and then a call to swift_initStaticObject, which appears to be fast—it just populates the runtime metadata pointer at the beginning of the inlined data.
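For readers without the godbolt link, the source behind that output can be sketched as follows. This is a reconstruction from the assembly rather than the original snippet: the names staticArray and Foo come from the symbol names and the element values from the data words, but the field names number and name are assumptions.

```swift
// Reconstructed sketch of the input; the field names are assumptions.
struct Foo {
    var number: Int
    var name: String
}

// Six elements, matching the `.quad 6` count word in the outlined object.
let staticArray: [Foo] = [
    Foo(number: 10, name: "ten"),
    Foo(number: 20, name: "twenty"),
    Foo(number: 30, name: "thirty"),
    Foo(number: 100, name: "one hundred"),
    Foo(number: 1000, name: "one thousand"),
    Foo(number: 100000, name: "one hundred thousand"),
]

print(staticArray.count)
```

Whether a given compiler version outlines this into a static object still has to be checked in the optimized output.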

One cool thing about this is that I was able to use Strings instead of StaticStrings and it still worked—and in fact, the strings that could fit into the small string representation used that instead of using a pointer to separate character data.
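The small-string claim can be checked against the data words above. The sketch below assumes the layout those words suggest on a 64-bit little-endian target: UTF-8 bytes packed from the low byte of the first word upward, with the length in the low four bits of the top byte of the second word. That layout is an implementation detail of the standard library, not a stable API, and decodeSmallString is a hypothetical helper.

```swift
// Hypothetical helper: decodes the two 64-bit words of a small string,
// assuming the little-endian layout described above.
func decodeSmallString(_ first: Int64, _ second: Int64) -> String {
    let low = UInt64(bitPattern: first)
    let high = UInt64(bitPattern: second)
    // Assumed: the low nibble of the top byte holds the byte count (0...15).
    let count = Int((high >> 56) & 0x0F)
    var bytes: [UInt8] = []
    for index in 0..<count {
        let word = index < 8 ? low : high
        let shift = UInt64((index % 8) * 8)
        bytes.append(UInt8((word >> shift) & 0xFF))
    }
    return String(decoding: bytes, as: UTF8.self)
}

// Word pairs copied from the assembly listing above.
print(decodeSmallString(7234932, -2089670227099910144))              // ten
print(decodeSmallString(133540975310708, -1873497444986126336))      // twenty
print(decodeSmallString(7236850741311532655, -1513209474789907086))  // one hundred
```

The last element, "one hundred thousand", is 20 bytes long and does not fit in this form, which is consistent with it being the only entry that refers to a separate .asciz blob.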

I'm not sure what it is about OptionSets that prevents this from working, though. I think this happens as part of the ObjectOutliner SILOptimizer pass in the compiler, so that would be the place to improve.

The other drawback of this is that since it's an optimization pass, unoptimized builds will still get the slow code path that initializes everything at runtime. I wonder if this transformation could happen as a mandatory pass, so that debug builds could still rely on static data.


(A few days later)

I poked at this a little bit more and the outcome was interesting (and beyond my understanding of how the optimizer works).

It turns out OptionSets by themselves aren't a problem; some of them are able to outline into static objects fine. Your WebURL case does too if we stop the array after element 0x67. But once we add element 0x68, or anything after that, it stops outlining.

I looked more closely at the generated post-opt SIL and it looks like in the 0x00-0x67 case, the ObjectOutliner pass outlines the whole array into a single SIL value mainTv_, which is what we want.

In the 0x00-0x68 case, it looks like the individual arrays inside the larger array (the option set unions) are outlined separately (mainTv_...mainTv7_), and the overall array is never outlined.

Maybe the inner array outlining is an intermediate step to outlining the whole thing, and the 0x68th element pushes the number of basic blocks or instructions past some limit hard-coded in an optimizer pass, so that the pass decides the code is too complex to analyze and gives up from that point on?
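To make "the individual arrays inside the larger array" concrete, here is a minimal sketch of the shape of input being discussed. It is not the WebURL table itself; CharFlags and table are hypothetical names. Each union such as [.digit, .hex] is an array literal that goes through the OptionSet's init(arrayLiteral:), which is presumably where the separately outlined inner arrays come from.

```swift
// Hypothetical option set standing in for the one in the WebURL table.
struct CharFlags: OptionSet {
    let rawValue: UInt8
    static let alpha = CharFlags(rawValue: 1 << 0)
    static let digit = CharFlags(rawValue: 1 << 1)
    static let hex = CharFlags(rawValue: 1 << 2)
}

// A lookup table whose elements are mostly unions written as array literals.
// The real table has one entry per byte value (0x00, 0x01, ...).
let table: [CharFlags] = [
    [],               // empty set
    [.digit, .hex],   // union built via init(arrayLiteral:)
    [.alpha, .hex],
    .alpha,           // single member, no inner array literal
]

print(table.map { $0.rawValue })
```

Comparing the optimized SIL for a short and a long version of a table like this is one way to look for the point where whole-array outlining stops.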


Ping @Erik_Eckstein (who was pinged in the original before it was moved). I'd love to know if anybody has any ideas about this.

Thanks so much for investigating this. The findings are really fascinating. It's also interesting to see how much better the compiler has done with each release; there's a clear improvement from 5.3 -> 5.4 -> 5.5 -> nightly.
