Is there any way to check where ARC calls are inserted by the compiler?

Hi there, new forum user here.

I'm using Swift to build a GameBoy emulator to learn both the language (in hard mode, some might say :slight_smile:) and the performance tools available.

When optimising a part of the emulator I tried and profiled several approaches, and one of the things that came up as a performance killer was retain/release calls (I'm using the phrase 'performance killer' in a really relative way, as I was using a synthetic benchmark).

I've made a lot of effort to minimise those calls, from the basics like using structs to more complex things like using UnsafePointers for fixed-size arrays, and Unmanaged, but the retain/release calls were still there. The main point is: I don't know exactly why those calls are being inserted, so I wonder if there is any way to get the compiler to show the code or pseudocode it injects to add those calls. Something slightly more high level than ASM code, I mean.

Thanks a lot


Unfortunately, it's not as easy as you might expect, but I can tell you what I use. First of all, never look at ARC if you're not using an optimised build.

For (manual) static analysis:

  • assembly (look for swift_retain, swift_bridgeObjectRetain, swift_retain_n, and maybe a few others, a regex like swift_[a-zA-Z_]+etain will probably work)
  • SIL (I usually put the code I want to optimise in a separate file, then you can run swiftc -O -emit-sil the-file.swift). In the SIL, you'll find things like strong_retain and retain_value
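
As a rough sketch of what such a separate file could look like, here is a small example that walks a linked list of class instances. The names Node and sum(_:) are made up for illustration and are not from the emulator. Whether and where retains show up for this code depends on the compiler version and on what the optimiser manages to remove, so treat it as something to experiment with, not a guaranteed reproduction.

```swift
// the-file.swift: hypothetical example; Node and sum(_:) are illustrative names.

final class Node {
    let value: Int
    let next: Node?

    init(value: Int, next: Node? = nil) {
        self.value = value
        self.next = next
    }
}

// @inline(never) keeps sum(_:) as its own function, which makes it easier
// to locate in the SIL or assembly output.
@inline(never)
func sum(_ head: Node?) -> Int {
    var total = 0
    var current = head
    // Each step reads a class reference out of another object; this is the
    // kind of place where the compiler may need to retain and release.
    while let node = current {
        total &+= node.value
        current = node.next
    }
    return total
}

let list = Node(value: 1, next: Node(value: 2, next: Node(value: 3)))
print(sum(list)) // prints 6
```

Searching the swiftc -O -emit-sil output for the body of sum(_:) should then show whether any strong_retain or retain_value instructions survived optimisation.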

Often however, it's quite a bit easier to actually instrument a running program using:

  • lldb/Xcode's lldb support
  • dtrace

For both lldb/dtrace I would recommend putting the code you want to optimise in a large loop, i.e. run it 10,000 or 100,000 times. Then set a breakpoint on swift_retain and tell lldb to ignore it about 500 times or so (break set -n swift_retain -i 500). The reason to ignore it 500 times is to jump over all the retains that happen before we even reach your program.

In Xcode, that's also quite easy: add a symbolic breakpoint on swift_retain and set its ignore count.

My preferred tool however is dtrace. dtrace has a slightly steeper learning curve but there's plenty of documentation out there, and please also feel free to ask me if something doesn't make sense. As you can see in one of the bugs I filed, I usually use something akin to

swiftc -O test.swift && sudo dtrace -n 'pid$target::swift_retain*:entry { @c = count(); } :::END { printa(@c); } ' -c ./test

which will print (once the program exits) the number of swift_retain* calls the program has done. If you set up a big loop (10_000/100_000), then you can usually figure out how much ARC your particular algorithm does.
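
A minimal sketch of such a test.swift with a big loop might look like the following. Counter and bump(_:) are hypothetical stand-ins for whatever code you actually want to measure, and the iteration count is just one of the values suggested above.

```swift
// test.swift: hypothetical harness; Counter and bump(_:) are illustrative names.

final class Counter {
    var value = 0
}

// @inline(never) keeps bump(_:) as a separate symbol so it is easy to spot
// in a stack trace or in the assembly.
@inline(never)
func bump(_ counter: Counter) -> Int {
    counter.value &+= 1
    return counter.value
}

let iterations = 100_000
let counter = Counter()
var total = 0

// Run the code under test many times so its ARC calls dominate the
// retains that happen during program startup.
for _ in 0..<iterations {
    total &+= bump(counter)
}

// Use the result so the optimiser cannot discard the loop as dead code.
print(total)
```

Dividing the reported swift_retain* count by the number of iterations gives a rough per-iteration figure; a count that stays small and does not grow with the iteration count suggests the loop body itself is free of retains.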

The other thing I use is

swiftc -O test.swift && sudo dtrace -n 'pid$target::swift_retain*:entry { @c[ustack()] = count(); } :::END { printa(@c); } ' -c ./test

which does the same thing as above, except that it aggregates the swift_retain counts by stack trace. So it'll count how many times which stack trace called retain. That leads to nice output like

[...]

              libswiftCore.dylib`swift_retain
              libswiftCore.dylib`swift_bridgeObjectRetain+0x46
              libswiftCore.dylib`_StringGuts.append(_:)+0x21c
              libswiftCore.dylib`specialized String.init(repeating:count:)+0x16e
              libswiftCore.dylib`String.init(repeating:count:)+0x12
              test`main+0x26a
              libdyld.dylib`start+0x1
              test`0x2
          1000000

              libswiftCore.dylib`swift_retain
              libswiftCore.dylib`swift_bridgeObjectRetain+0x46
              test`specialized find<A>(_:needle1:needle2:needle3:)+0x133
              test`main+0x2d8
              libdyld.dylib`start+0x1
              test`0x2
          1000001

Unfortunately, dtrace for certain operations uses privileged system access that is disabled with SIP (System Integrity Protection). So to get dtrace to work you have to change the SIP settings so that SIP stays enabled but without the dtrace restrictions. Below, you can find instructions to keep SIP enabled but make dtrace work. Please be aware that this might lower your security against an attacker who has root access on your machine.

  1. reboot
  2. as soon as you hear the "Mac chime" and see the Apple logo (pretty much immediately after startup), press Cmd+R to enter Recovery mode
  3. Open Utilities->Terminal
  4. Run the command csrutil enable --without dtrace
  5. Reboot, you'll land in the normal OS with SIP still enabled but without dtrace protections

To undo this setting, do the procedure again but run csrutil enable.


"Unfortunately, dtrace for certain operations uses privileged system access that is disabled with SIP (System Integrity Protection)."

I solve this problem by running stuff like this in a VM. I disable SIP in the VM and then take a snapshot of that. I can then access a SIP-disabled machine by simply restoring that snapshot.

VMs are not great for performance testing in general, but here you just want to count retains and releases and that won’t be affected by the VM.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple
