Okay, let's start with an apology: This post is getting a bit into the nitty gritty and might sound like a lot of work to actually use. And yes, it's a lot of information and yes, this is definitely getting a bit weird but once you wrap your head around it I think it shouldn't be too bad.
Our agenda is the following:
- Create a "special" module which holds a "special" type whose copies will always call a function (one-off effort, you can build this module once and use it forever)
- Integrate that special module into any code base (pretty easy, just two extra swift build flags)
- Use lldb (or any other debugger really) to find the copies
With that out of the way, let's get going.
Intro
To get what you want, wouldn't it be awesome if there were a function that gets called every time a struct is copied, so that we could set a breakpoint on it? Yes, it would. "Unfortunately", Swift is often pretty good at optimising, so it doesn't actually need real functions to copy but instead inlines whatever work the copy requires.
But: What if we could undo that? What if we could make it such that the Swift compiler couldn't ever inline the copies and would need to call a named function? Hmm, this sounds pretty much like what Library Evolution does, no?
Precisely, let's assume in your real codebase you have a struct CopiedTooMuch defined like
struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable
}
And let's assume you want to find out why CopiedTooMuch is being copied too much.
Step 1: Creation of the special module
What I am proposing is to add (for debugging only) another member to this struct, called findMe:
import FindMeModule
struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable
    // We don't need it but we need its side effects
    let findMe: FindMe = FindMe() // <<< THIS IS NEW
}
So far, nothing really changes but one thing is certain: To copy a CopiedTooMuch we will also need to copy FindMe. And how can we make sure that the Swift compiler never inlines the FindMe copies but always calls a "please copy a FindMe" function? Library evolution.
If you create a findme.swift file with this content
class Clazz {
    init() {}
}
public struct FindMe {
    private let x0: Clazz = .init()
    private let x1: Clazz = .init()
    public init() {}
}
and then compile it like so (assuming findme.swift lives in /tmp and /tmp is also your current working directory)
swiftc -O -o /tmp/libFindMeModule.dylib \
-emit-module -module-name FindMeModule \
-emit-library -emit-module-interface \
-enable-library-evolution findme.swift
That will yield you a real Swift module FindMeModule which is completely opaque to the compiler (after all it has library evolution on).
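As an aside on why FindMe holds class references at all: my assumption is that it's because copying a struct that stores class references is not just a matter of moving bytes around, since each reference has to be retained, so there is real work for a copy function to do. The following is a minimal standalone sketch of that effect. It is not part of FindMeModule, and the names Marker, Holder and uniquenessAroundCopy are made up for illustration.

```swift
// Hypothetical stand-ins for illustration, not part of FindMeModule.
final class Marker {}

struct Holder {
    var marker = Marker()
}

/// Reports whether `marker` is uniquely referenced before and after
/// a copy of the struct that stores it has been made.
func uniquenessAroundCopy() -> (before: Bool, after: Bool) {
    var original = Holder()
    let before = isKnownUniquelyReferenced(&original.marker)

    // Copying the struct copies the reference, which retains the object.
    let copy = original
    let after = withExtendedLifetime(copy) {
        // `copy` is guaranteed to still be alive here,
        // so the Marker object has two owners at this point.
        isKnownUniquelyReferenced(&original.marker)
    }
    return (before, after)
}

let result = uniquenessAroundCopy()
print("unique before copy:", result.before)
print("unique after copy:", result.after)
```

I'd expect this to report that the object is unique before the copy and no longer unique after it, which is the retain that a FindMe copy has to perform as well.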
Step 2: Integration of FindMe into your codebase
Finally, adjust the file which has CopiedTooMuch in your real codebase to add the import FindMeModule as well as the let findMe = FindMe(). To compile your real codebase you now need to use
swift build -Xswiftc -I/tmp -Xlinker /tmp/libFindMeModule.dylib
so that the Swift compiler can find our special little module. Again, I'm assuming here that it's in /tmp.
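Since the extra member is meant for debugging only, one option might be to guard it with Swift's `#if canImport(...)` condition, so that it only exists in builds where the compiler can actually find the module (i.e. when you pass the -I flag). This is a sketch under assumptions and not something the setup above requires: Int is a stand-in for ActualVariable, and I'm assuming FindMeModule is not otherwise on your import search path.

```swift
#if canImport(FindMeModule)
import FindMeModule
#endif

struct CopiedTooMuch {
    // `Int` is a stand-in for the real `ActualVariable` type.
    var some: Int
    var another: Int

    #if canImport(FindMeModule)
    // Only present when the build can find FindMeModule (e.g. via -I/tmp).
    let findMe: FindMe = FindMe()
    #endif
}

// The memberwise initialiser looks the same either way, because
// `findMe` is a `let` with a default value and so is not a parameter.
let value = CopiedTooMuch(some: 1, another: 2)
print(value.some + value.another)
```

Note that if the compiler can find the module but you forget the linker flag, the build should fail at link time rather than silently dropping the member.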
Step 3: Using lldb to find the copies
Good, with the compilation out of the way we can run lldb .build/debug/MyActualProject. And there we can finally set our breakpoints!
break set -n $s12FindMeModule0aB0VwCP
break set -n $s12FindMeModule0aB0Vwca
break set -n $s12FindMeModule0aB0Vwcp
(FWIW, Vw stands for 'value witness' and the weird CP, ca, cp suffixes are explained in the name mangling doc)
If you set those breakpoints, lldb will say
(lldb) break set -n $s12FindMeModule0aB0VwCP
Breakpoint 2: where = libFindMeModule.dylib`initializeBufferWithCopyOfBuffer value witness for FindMeModule.FindMe, address = 0x0000000000003ca0
(lldb) break set -n $s12FindMeModule0aB0Vwca
Breakpoint 3: where = libFindMeModule.dylib`assignWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003d30
(lldb) break set -n $s12FindMeModule0aB0Vwcp
Breakpoint 4: where = libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003cfc
Perfect! Now what's left is to run the binary
(lldb) run
Process 7429 launched: '/tmp/package/.build/debug/package' (arm64)
Once a copy happens, lldb will then stop:
Process 7429 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe:
-> 0x100473cfc <+0>: stp x20, x19, [sp, #-0x20]!
0x100473d00 <+4>: stp x29, x30, [sp, #0x10]
0x100473d04 <+8>: add x29, sp, #0x10
0x100473d08 <+12>: mov x19, x0
Target 0: (package) stopped.
Cool, let's see the backtrace (bt):
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
* frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
frame #1: 0x0000000100006e98 package`outlined init with copy of CopiedTooMuch at <compiler-generated>:0
frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
frame #3: 0x0000000100006798 package`package_main at main.swift:24:7
frame #4: 0x000000018eb91058 dyld`start + 2224
Okay, frame 2 is my "actual" code, let's go there
(lldb) frame select 2
frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
8 @inline(never)
9 func FOO(_ a: CopiedTooMuch) -> Int {
10 // we'll cause some copies here
-> 11 var a = a
12 var b = a
13 BAR(&a)
14 BAR(&b)
Nice! See how the compiler can now show us that var a = a is what causes this copy?
To see the next copy, just continue with cont in lldb and so on.
A few extra bits of information:
- This would likely also work without library evolution, but it's not guaranteed to, because depending on the optimisation mode the compiler might inline even across modules these days
- If you prefer to use the FindMeModule as a SwiftPM dependency that should also work but the cross module inlining might be a problem
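For the SwiftPM route, a manifest along these lines might be a starting point. This is an untested sketch under assumptions: the package layout (findme.swift placed in Sources/FindMeModule/) and the tools version are my choices, and I'm assuming that passing -enable-library-evolution via unsafeFlags has the same effect as passing it to swiftc directly.

```swift
// swift-tools-version:5.9
// Hypothetical Package.swift for a standalone FindMeModule package.
import PackageDescription

let package = Package(
    name: "FindMeModule",
    products: [
        // A dynamic library, mirroring the dylib built by hand with -emit-library.
        .library(name: "FindMeModule", type: .dynamic, targets: ["FindMeModule"])
    ],
    targets: [
        .target(
            name: "FindMeModule",
            // Expects findme.swift in Sources/FindMeModule/.
            swiftSettings: [
                // Assumption: the same flag that was given to swiftc by hand.
                .unsafeFlags(["-enable-library-evolution"])
            ]
        )
    ]
)
```

As far as I know, SwiftPM only accepts targets with unsafe flags from the root package or from local (path-based) dependencies, so you would probably need to reference this package by path. The cross module inlining caveat still applies.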