There is a super-hidden feature in the compiler which gives you insight into all kinds of optimisation decisions, such as the cost-benefit calculations for inlining, where exclusivity is enforced, where ARC operations happen, which generic specialisations get generated, and which calls get devirtualised.
To use it, add the @_assemblyVision attribute to a function and build in release mode (or use @_semantics("optremark") on a pre-5.6 compiler; I'm using 5.5.2, so that's what I'm showing). Be careful not to mark too many functions: on just one function, I received 805 remarks.
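To make the setup concrete, here is a minimal sketch of what the annotation looks like. The function name and build command are made up for illustration; the attribute itself only emits remarks when you compile with optimisation enabled.

```swift
// Hypothetical example: annotate a hot function to request optimizer
// remarks. Build in release mode, e.g.: swiftc -Osize SumOfSquares.swift
// (on pre-5.6 compilers, use @_semantics("optremark") instead).

@_assemblyVision
func sumOfSquares(_ values: [Int]) -> Int {
    var total = 0
    for v in values {
        total += v * v
    }
    return total
}

print(sumOfSquares([1, 2, 3]))  // 14
```

In an optimised build, the compiler then reports its decisions (inlining, specialisation, ARC placement) against these source lines.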
I'm gonna take a stab and guess that this was probably developed as a tool for folks working on the optimizer, not for developers to use as guidance for rearranging their code to be more optimizer-friendly.
As such, I would expect the decisions and factors laid bare by this attribute to change with any dot release, and I certainly wouldn't base any critical decisions on the info gleaned from it.
E.g. some code is too slow to ship without optimization, but if we just move this line here a little, the cost and benefit shift slightly and it gets optimized; then some dot release comes along at a random point in the future, and suddenly the critical section is slow again without any code changes.
You are actually looking at two different features.
The first is called "opt remarks". Opt remarks show the decisions the optimizer is making (e.g.: did I specialize this, did I inline this, etc.).
The second is something that I invented called "Assembly Vision". This is the thing that tells you where the ARC calls, exclusivity checks, and runtime casts are. Assembly Vision also enables the other, normal form of opt-remarks, since that information can be useful when determining where/why ARC is there. The idea is that instead of having to read the assembly yourself, you get vision on approximately where these calls are.
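To illustrate the kind of ARC traffic Assembly Vision points at, here is a sketch (the type names are made up, and the exact remark text is compiler output I won't reproduce here). Storing a class reference inside a struct means every copy of the struct retains and releases that reference, and the remarks show where those calls land in your source:

```swift
// Sketch: code where Assembly Vision would surface ARC retain/release
// traffic. Hypothetical types for illustration.

final class Buffer {
    var bytes: [UInt8] = []
}

struct Wrapper {
    var buffer: Buffer   // copying Wrapper retains `buffer`
}

func duplicate(_ w: Wrapper) -> (Wrapper, Wrapper) {
    let copy = w         // a retain lands here (unless optimized away)
    return (w, copy)     // further retains to return both copies
}

let w = Wrapper(buffer: Buffer())
let (a, b) = duplicate(w)
print(a.buffer === b.buffer)  // true: both wrappers share one Buffer
```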
The proper way to invoke this is to put the @_assemblyVision attribute on a nominal type or a function. I believe that it is on trunk/5.6 (it might be in 5.5, I don't remember).
The reason why I haven't made a bigger deal about it is that I want to extend it further and make it more powerful before I really shout about it.
I hope it is useful for you! The concept came from me trying to automate how I optimize Swift code as a compiler engineer, so that other engineers who aren't compiler people can be just as effective. The best way to see it in practice is to look at the test cases that I have committed into the tree (I posted some links to them below). Another thing to keep in mind is that -O gives worse remarks, due to function signature optimization messing with some stuff; -Osize, though, works really well.
This is not true. I just wanted to improve it further before I really shouted about it. If it is useful to you... use it! That being said, you are correct that it doesn't provide guarantees per se, but it /can/ help you to understand your code (which is the point).
"usable but not fully ready" is not uncommon, yeah. For example _cdecl is in a similar state. Just need to make sure that we don't accidentally commit to a future plan prematurely, and that the unpolished bits don't impact production-ready stuff (e.g. _assemblyVision shouldn't have any impact on code that doesn't use it).
Oh! I'm sorry! I assumed it was something being used internally or for optimiser development. But it is super-helpful, so thank you very much for creating it!
Yeah in one day it has already helped me discover that:
withContiguousStorageIfAvailable fallbacks for String.UTF8View were resulting in more specializations than I expected. It's really subtle and easy to miss, but I managed to reduce my binary size by 20% (!) by avoiding that.
Some of my algorithms should be split into small functions for better inlining
Some theoretical fast paths in my algorithms were incurring ARC. I've seen big benefits by slimming them down.
wCSIA is not eliminated until late in the optimiser pipeline, so the compiler specialises my program 3 or 4 times more than it needs to, then discards most of what it did. I filed SR-15624 about it. No impact on size or performance, but there are potentially compile-time savings there.
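For anyone unfamiliar with the withContiguousStorageIfAvailable pattern being discussed, here is a sketch (the function is hypothetical). For native Swift strings the UTF-8 bytes are contiguous, so the closure runs on a raw buffer; the nil-returning generic fallback is the path that dragged in the extra specialisations:

```swift
// Sketch of the wCSIA pattern: fast path over contiguous UTF-8 bytes,
// with a generic element-by-element fallback. Function name is made up.

func utf8Sum(_ s: String) -> Int {
    let fast = s.utf8.withContiguousStorageIfAvailable { buffer -> Int in
        // Fast path: buffer is an UnsafeBufferPointer<UInt8>.
        buffer.reduce(0) { $0 + Int($1) }
    }
    if let fast = fast { return fast }
    // Fallback: iterates the UTF8View element by element; this is the
    // branch that can pull in unexpected specialisations.
    return s.utf8.reduce(0) { $0 + Int($1) }
}

print(utf8Sum("abc"))  // 97 + 98 + 99 = 294
```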
That spelling doesn't seem to work on 5.5.2/Xcode 13.2, but I added it to the post as the preferred spelling.
The feedback I can give is that it's great, but if you have a function which is specialised multiple times, the results can look a bit cluttered and it's hard to tell them apart. For example, this function is inlined in some specialisations but not in others, and it's not clear which:
Also, it's not clear why some specializations are generated. I was looking everywhere for what was causing the String.UTF8View specializations to be generated (bear in mind that it may not even be reachable code, as in SR-15624).
I'm not sure what the specialization issue is about, but what I can say is that:
I want to add to @_assemblyVision the ability to select which of the perf remarks you are getting, with the idea that you could pick individual ones, or we could make specific views (like ARC or Exclusivity), and maybe you could add the marker multiple times with different categories that we OR together. I think this would help cut down on the verbosity of the output.
I think it may be clearer if you look at it on the command line, or use something like emacs where you can jump to definitions from the compile output. I find that makes it easier to read than in Xcode.
Looks inconclusive perf-wise, and a notable code size regression, so I think we can conclude that the optimizer is doing its job and cleaning up after itself despite the phase ordering issue as @Karl noted. Alas, no trivially easy wins to be had here.