PSA: Compiler optimisation remarks

There is a super-hidden feature in the compiler which gives you insight into all kinds of optimisation decisions, such as the cost-benefit calculations for inlining, where exclusivity is enforced, where ARC operations happen, which generic specialisations get generated, and which calls get devirtualised.

To use it, add the @_assemblyVision attribute to a function and build in release mode. On a pre-5.6 compiler, use @_semantics("optremark") instead; I'm using 5.5.2, so that's what I'm showing. Be careful not to mark too many functions: on just one function, I received 805 remarks.
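
As a minimal sketch of the usage described above (assuming a Swift 5.6 or later toolchain, where the @_assemblyVision spelling is accepted), a single function can be marked like this. The `Counter` type and the `bump` function are hypothetical names made up for illustration, and the exact remarks you get will depend on your compiler version and optimisation level.

```swift
// Build with optimisations enabled, for example:
//   swiftc -O main.swift
//   swift build -c release
// Remarks for the marked function are reported in the compiler output.

// Hypothetical reference type, so the function has some ARC-relevant work to do.
final class Counter {
    var value = 0
}

// Only this one function is marked, to keep the number of remarks manageable.
@_assemblyVision
func bump(_ counter: Counter, times: Int) -> Int {
    for _ in 0..<times {
        counter.value += 1
    }
    return counter.value
}

print(bump(Counter(), times: 3))
```

Marking one small function at a time keeps the output readable, which matters given how many remarks a single function can produce.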

Make sure you edit your scheme settings to build in release mode:

I haven't seen it mentioned anywhere and couldn't find any documentation about it, but it's really helpful, so I figured it was worth letting people know about it.

Add this to the list of things that are undocumented for no clear reason.

I'm gonna take a stab and guess that this was probably developed as a tool for folks working on the optimizer, not for developers to use for guidance on rearranging their code to be more optimizer friendly.

As such, I would expect the decisions and factors that are laid bare by this attribute to be able to change with any dot release, and I certainly wouldn't base any critical decisions on the info gleaned from it.

E.g. some code is too slow to ship without optimization, but if we just move this line here a little, the cost and benefit shift slightly and it gets optimized. Then some dot release comes along at a random point in the future, and suddenly the critical section is slow again without any code changes.

On the one hand, you’re almost certainly correct. On the other hand, this is really cool.

You are actually looking at two different features.

The first is called opt remarks. Opt remarks show the decisions the optimizer is making (e.g. did I specialize this, did I inline this, etc.).

The second is something that I invented called "Assembly Vision". This is the thing that tells you where the ARC operations, exclusivity checks, and runtime casts are. Assembly Vision also enables the normal opt remarks, since that information can be useful when determining where and why ARC is there. The idea is that instead of having to read the assembly yourself, you can have vision on approximately where these calls are.
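
To make those categories concrete, here is a small sketch of code containing the kinds of operations Assembly Vision is meant to point at. `Node` and `totalWeight` are hypothetical names, a Swift 5.6+ toolchain is assumed, and the comments mark where such operations could plausibly occur rather than quoting real compiler output; the optimizer may remove some of them entirely.

```swift
// Hypothetical linked-list node; a class, so references to it are ARC-managed.
final class Node {
    var weight: Int
    var next: Node?

    init(weight: Int, next: Node? = nil) {
        self.weight = weight
        self.next = next
    }
}

@_assemblyVision
func totalWeight(from head: Node?, extras: [Any]) -> Int {
    var total = 0
    var current = head              // copying a class reference: possible retain/release
    while let node = current {
        total += node.weight        // access to a class stored property: possible exclusivity check
        current = node.next         // another reference copy
    }
    for item in extras {
        if let number = item as? Int {  // dynamic cast out of Any: a runtime cast
            total += number
        }
    }
    return total
}

let list = Node(weight: 1, next: Node(weight: 2))
print(totalWeight(from: list, extras: [3, "not a number"]))
```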

The proper way to invoke this is to put the @_assemblyVision attribute on a nominal type or a function. I believe that it is on trunk/5.6 (it might be in 5.5, I don't remember).
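
A sketch of the type-level form, assuming a toolchain that accepts @_assemblyVision on nominal types as described above. `Accumulator` is a hypothetical type; the intent is that its methods are all covered without each one being marked individually.

```swift
// Marking the type rather than each method.
@_assemblyVision
struct Accumulator {
    private(set) var total = 0
    private(set) var count = 0

    // Adds every value to the running total.
    mutating func add(_ values: [Int]) {
        for value in values {
            total += value
            count += 1
        }
    }

    // Returns the mean of everything added so far, or 0 if nothing was added.
    func mean() -> Double {
        count == 0 ? 0 : Double(total) / Double(count)
    }
}

var accumulator = Accumulator()
accumulator.add([1, 2, 3])
print(accumulator.mean())
```

Since a type can have many methods, this form can produce a lot of output; marking a single function is the more targeted option.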

The reason I haven't made a bigger deal about it is that I want to extend it further and make it more powerful before I really shout about it.

I hope it is useful for you! The concept came from me trying to automate how I optimize Swift code as a compiler engineer so that other engineers who aren't compiler people can be just as effective. The best way to see it in practice is to look at the test cases that I have committed into the tree (I posted some links to them below). Another thing to keep in mind is that -O gives worse remarks due to function signature optimization messing with some stuff. -Osize, though, works really well.

This is not true. I just wanted to improve it further before I really shouted about it. If it is useful to you... use it! That being said, you are correct that it doesn't provide guarantees per se, but it /can/ help you to understand your code (which is the point).

@Karl I would appreciate it if you could fix your example to use @_assemblyVision so that people do not use @_semantics("optremark").

FYI to anyone interested in these features: the new Swift build system integration launched in Xcode 13.2 seems to hide remarks, so if you aren't seeing the expected output, turn it off.

That’s an incredibly impressive feature, thank you! I think I saw it in the underscored attribute list before.

Though it’s a little weird that you’d put it into production unfinished, without even a compiler flag to toggle it. Is that common?

"usable but not fully ready" is not uncommon, yeah. For example _cdecl is in a similar state. Just need to make sure that we don't accidentally commit to a future plan prematurely, and that the unpolished bits don't impact production-ready stuff (e.g. _assemblyVision shouldn't have any impact on code that doesn't use it).

The key thing here is the underscore. The underscore signals that it is unfinished and not intended to be treated as a final finished thing. It is sort of like saying this is experimental or unstable.

Sure, but wouldn’t it make more sense to lock it behind a feature flag too?

There isn't any real advantage to doing that and I wanted to be able to get feedback.

Oh! I'm sorry! I assumed it was something being used internally or for optimiser development. But it is super-helpful, so thank you very much for creating it!

Yeah in one day it has already helped me discover that:

  • withContiguousStorageIfAvailable fallbacks for String.UTF8View were resulting in more specializations than I expected. It's really subtle and easy to miss, but I managed to reduce my binary size by 20% (!) by avoiding that.

  • Some of my algorithms should be split into small functions for better inlining.

  • Some theoretical fast paths in my algorithms were incurring ARC. I've seen big benefits by slimming them down.

  • wCSIA is not eliminated until late in the optimiser pipeline, so the compiler specialises my program 3 or 4 times more than it needs to, then discards most of what it did. I filed SR-15624 about it. No impact on size or performance, but there are potentially compile-time savings there.
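
For readers unfamiliar with the pattern behind the first and last points, here is a rough sketch of a withContiguousStorageIfAvailable (wCSIA) fast path with a generic fallback. `checksum` is a hypothetical function, not code from the post, and a Swift 5.6+ toolchain is assumed for the attribute. The point is only that each concrete sequence type passed in can lead to both the fast path and the fallback loop being specialised for that type.

```swift
// Hypothetical example: sums bytes from any sequence of UInt8.
@_assemblyVision
func checksum<S: Sequence>(_ bytes: S) -> UInt32 where S.Element == UInt8 {
    // Fast path: use contiguous storage directly when the sequence offers it.
    let fast = bytes.withContiguousStorageIfAvailable { buffer in
        buffer.reduce(UInt32(0)) { $0 &+ UInt32($1) }
    }
    if let result = fast {
        return result
    }
    // Fallback: plain element-by-element iteration for everything else.
    var sum: UInt32 = 0
    for byte in bytes {
        sum = sum &+ UInt32(byte)
    }
    return sum
}

print(checksum([1, 2, 3] as [UInt8]))
print(checksum("abc".utf8))
```

Calling this with both an array and a String.UTF8View, as above, is the kind of situation where the remarks can show more specialisations than you might expect.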

That spelling doesn't seem to work on 5.5.2/Xcode 13.2, but I added it to the post as the preferred spelling.

The feedback I can give is that it's great :smiley:, but if you have a function which is specialised multiple times, the results can look a bit cluttered and it's hard to tell them apart. For example, this function is inlined in some specializations but not in others, and it's not clear which:

Also, it's not clear why some specializations are generated. I was looking everywhere for what was causing the String.UTF8View specializations to be generated (bear in mind that it may not even be reachable code, as in SR-15624).

Hm, phase ordering issues like this are super subtle; I wouldn't be surprised if there actually are lurking performance issues hidden behind that. Neat find.

No worries!

Great! That makes me really happy!

Ok.

I am not sure what the specialization issue is about, but what I can say is that:

  1. I want to add to @_assemblyVision the ability to select which of the perf remarks you get. The idea is that you could select individual ones, or we could make specific views (like ARC or Exclusivity), and maybe you could add the marker multiple times with different categories that we OR together. I think this would help cut down on the verbosity of the output.

  2. I think it may be clearer if you look at it on the command line, or use something like emacs where you can jump to definition from the compile output. I find that makes it easier to read than in Xcode.

Xcode integration with the remarks system in general would be great. Not just for this and other custom remarks, but the educational output and other systems as well.

I tossed a PR up to see what happens if we make those always-inline.

Looks inconclusive perf-wise, and a notable code size regression, so I think we can conclude that the optimizer is doing its job and cleaning up after itself despite the phase ordering issue as @Karl noted. Alas, no trivially easy wins to be had here.
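
For context on what "always-inline" means at the source level, here is a small sketch using the @inline(__always) attribute. This is not the standard library change from the PR; `clamp` and `clampAll` are hypothetical functions made up for illustration.

```swift
// Asks the compiler to inline this function into every caller.
@inline(__always)
func clamp(_ value: Int, to range: ClosedRange<Int>) -> Int {
    min(max(value, range.lowerBound), range.upperBound)
}

// Each call to clamp here gets a copy of its body instead of a function call.
func clampAll(_ values: [Int], to range: ClosedRange<Int>) -> [Int] {
    values.map { clamp($0, to: range) }
}

print(clampAll([-5, 3, 12], to: 0...10))
```

Because the body is copied into every call site, forcing inlining can grow the binary, which is consistent with the code size regression reported above.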

Oh well, thanks for trying.