Recently I found that a surprisingly large number of structs are being copied here and there in our codebase. Are there techniques (maybe a compiler warning) that I can use to effectively catch unintentional copies?
You can use the new ownership modifiers which will force you to explicitly copy values.
Adding to @dnadoba's suggestion:
Are you talking about Release builds specifically (i.e. with use of an -O, -Osize, or -Ounchecked flag)? The Swift compiler omits a lot of basic optimisations in Debug builds (for the sake of making compilation faster).
Can you utilise non-copyable structs for some of your needs? Note that you can still copy a non-copyable struct, you just have to do it manually with e.g. a clone method you write yourself.
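A minimal sketch of that idea, assuming Swift 5.9 or later. PixelBuffer, clone() and totalAfterClone() are hypothetical names made up for illustration:

```swift
// Hypothetical type for illustration; ~Copyable requires Swift 5.9 or later.
struct PixelBuffer: ~Copyable {
    var pixels: [UInt8]

    init(pixels: [UInt8]) {
        self.pixels = pixels
    }

    // The only way to duplicate a PixelBuffer is this explicit, searchable call.
    func clone() -> PixelBuffer {
        PixelBuffer(pixels: pixels)
    }
}

func totalAfterClone() -> Int {
    let original = PixelBuffer(pixels: [1, 2, 3])
    // `let other = original` would move `original` rather than copy it,
    // so using `original` afterwards would no longer compile.
    let duplicate = original.clone()
    return original.pixels.count + duplicate.pixels.count
}
```

Every duplication then shows up in the source as a call to clone(), which is easy to grep for or flag in review.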
If you can precisely describe (or provide simple example code for) problem cases, the Swift compiler folks would probably appreciate a bug report.
Yes, the ownership modifiers can indeed reduce wasteful copying, but I want an efficient way to find where copying occurs in the code, especially of huge structs. There are ~1K people working on our codebase and it is hard to teach all the developers to write copy-free code, so we want an automatic solution, like compiler warnings, to keep the code reasonably clean.
I think the copying problem exists regardless of whether it is a release build. The compiler behavior should be consistent (a copyable struct should always be copied when being passed around, at any optimization level).
From my testing on Swift 5.9, the functionality of ~Copyable is extremely limited. It does not work with protocols and generics at all.
I get where you're coming from, but it'll be hard to get help from the compiler on this. Not every copy is unnecessary nor a performance problem, and the compiler doesn't really have great insight into which cases will be acceptable and which won't (though maybe when PGO is in use, it could).
It'll probably be more viable - and to the point of what you care about, performance - to focus on profiling and benchmark unit tests. That way you can ignore any copies that don't actually matter, and you'll also catch performance problems due to more than just copies. You can make performance testing part of your development process - e.g. part of your CI pipeline - and even block integration of patches that introduce regressions.
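As a rough sketch of what such a performance check could look like, assuming Swift 5.7 or later (and macOS 13 or later on Apple platforms) for ContinuousClock. Sample, checksum and finishes(within:_:) are hypothetical names, and the time budget is a placeholder you would tune for your own CI machines:

```swift
// Hypothetical 32-byte value type standing in for a "large" struct.
struct Sample {
    var values: (Int64, Int64, Int64, Int64) = (1, 2, 3, 4)
}

// Hypothetical hot path; @inline(never) keeps it visible in profiles.
@inline(never)
func checksum(_ sample: Sample, rounds: Int) -> Int64 {
    var total: Int64 = 0
    for round in 0..<rounds {
        let local = sample  // a copy the optimiser may or may not remove
        total &+= local.values.0 &+ Int64(round)
    }
    return total
}

// Returns true if `work` completes within `budget`.
func finishes(within budget: Duration, _ work: () -> Void) -> Bool {
    let elapsed = ContinuousClock().measure(work)
    return elapsed <= budget
}
```

A CI job could call finishes(within:_:) around the code paths that matter and fail the build when it returns false. A dedicated benchmarking harness will give more stable numbers than a single timed run.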
I don't think that's accurate (nor a desirable constraint). Swift is a pretty high-level language that nonetheless aspires to be very performant, and that means it has to do some pretty advanced optimisations, such as eliminating entire instantiations of structs, let alone mere copies (e.g. to make lazy Collections and Sequences fast, you can't actually allocate all the intermediary structures nor actually have them chain calls through each other for every element you fetch - it all gets completely eliminated by the optimiser until you're left with a trivial loop and the essential operations, potentially just a few machine instructions).
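For illustration, here is the kind of lazy chain being described. sumOfEvenSquares is a hypothetical function. Semantically, each step wraps the previous one in another struct (LazyMapSequence, LazyFilterSequence), yet in an optimised build you would expect the wrappers to be optimised away, though that is worth confirming in the generated code:

```swift
// Each lazy step nests the previous sequence inside a new wrapper struct.
func sumOfEvenSquares(upTo n: Int) -> Int {
    (1...n).lazy
        .map { $0 * $0 }          // wraps the range in a lazy map
        .filter { $0 % 2 == 0 }   // wraps the map in a lazy filter
        .reduce(0, +)             // drives the whole chain element by element
}
```

For example, sumOfEvenSquares(upTo: 10) adds 4 + 16 + 36 + 64 + 100.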
It's true, non-copyable structs currently don't play well with those features. There's work underway to rectify that, although it's probably still some way off from landing in a Swift release.
Okay, let's start with an apology: This post is getting a bit into the nitty gritty and might sound like a lot of work to actually use. And yes, it's a lot of information and yes, this is definitely getting a bit weird but once you wrap your head around it I think it shouldn't be too bad.
Our agenda is the following:
- Create a "special" module which holds a "special" type whose copies will always call a function (one-off effort, you can build this module once and use it forever)
- Integrate that special module into any code base (pretty easy, just two extra swift build flags)
- Use lldb (or any other debugger really) to find the copies
With that out of the way, let's get going.
Intro
To get what you want, wouldn't it be awesome if we could have a function that we could set a breakpoint on every time a struct gets copied? Yes, it would. "Unfortunately", Swift is often pretty good at optimising so it doesn't actually need real functions to copy but inlines the effects that it needs to copy something.
But: What if we could undo that? What if we could make it such that the Swift compiler couldn't ever inline the copies and would need to call a named function? Hmm, this sounds pretty much like what Library Evolution does, no?
Precisely. Let's assume that in your real codebase you have a struct CopiedTooMuch defined like
struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable
}
And let's assume you want to find out why CopiedTooMuch is being copied too much.
Step 1: Creation of the special module
What I am proposing is to add (for debugging only) another member to this struct, called findMe:
import FindMeModule

struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable
    // We don't need it but we need its side effects
    let findMe: FindMe = FindMe() // <<< THIS IS NEW
}
So far, nothing really changes but one thing is certain: To copy a CopiedTooMuch we will also need to copy FindMe. And how can we make sure that the Swift compiler never inlines the FindMe copies but always calls a "please copy a FindMe" function? Library evolution.
If you create a findme.swift file with this content
class Clazz {
    init() {}
}

public struct FindMe {
    private let x0: Clazz = .init()
    private let x1: Clazz = .init()
    public init() {}
}
and then compile it like so (assuming findme.swift is in /tmp and /tmp is also your current working directory)
swiftc -O -o /tmp/libFindMeModule.dylib \
-emit-module -module-name FindMeModule \
-emit-library -emit-module-interface \
-enable-library-evolution findme.swift
That will yield a real Swift module FindMeModule which is completely opaque to the compiler (after all it has library evolution on).
Step 2: Integration of FindMe into your codebase
Finally, adjust the file which has CopiedTooMuch in your real codebase to add the import FindMeModule as well as the let findMe = FindMe(). To compile your real codebase you now need to use
swift build -Xswiftc -I/tmp -Xlinker /tmp/libFindMeModule.dylib
so that the Swift compiler can find our special little module. Again, I'm assuming here that it's in /tmp.
Step 3: Using lldb to find the copies
Good, with the compilation out of the way we can run lldb .build/debug/MyActualProject. And there we can finally set our breakpoints!
break set -n $s12FindMeModule0aB0VwCP
break set -n $s12FindMeModule0aB0Vwca
break set -n $s12FindMeModule0aB0Vwcp
(FWIW, Vw stands for 'value witness' and the weird CP, ca, cp suffixes are explained in the name mangling doc)
If you set those breakpoints, lldb will say
(lldb) break set -n $s12FindMeModule0aB0VwCP
Breakpoint 2: where = libFindMeModule.dylib`initializeBufferWithCopyOfBuffer value witness for FindMeModule.FindMe, address = 0x0000000000003ca0
(lldb) break set -n $s12FindMeModule0aB0Vwca
Breakpoint 3: where = libFindMeModule.dylib`assignWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003d30
(lldb) break set -n $s12FindMeModule0aB0Vwcp
Breakpoint 4: where = libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003cfc
Perfect! Now what's left is to run the binary
(lldb) run
Process 7429 launched: '/tmp/package/.build/debug/package' (arm64)
Once a copy happens, lldb will then stop:
Process 7429 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe:
-> 0x100473cfc <+0>: stp x20, x19, [sp, #-0x20]!
0x100473d00 <+4>: stp x29, x30, [sp, #0x10]
0x100473d04 <+8>: add x29, sp, #0x10
0x100473d08 <+12>: mov x19, x0
Target 0: (package) stopped.
Cool, let's see the backtrace (bt):
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
* frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
frame #1: 0x0000000100006e98 package`outlined init with copy of CopiedTooMuch at <compiler-generated>:0
frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
frame #3: 0x0000000100006798 package`package_main at main.swift:24:7
frame #4: 0x000000018eb91058 dyld`start + 2224
Okay, frame 2 is my "actual" code, let's go there
(lldb) frame select 2
frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
8 @inline(never)
9 func FOO(_ a: CopiedTooMuch) -> Int {
10 // we'll cause some copies here
-> 11 var a = a
12 var b = a
13 BAR(&a)
14 BAR(&b)
Nice! See how the compiler can now show us that var a = a is what causes this copy?
To see the next copy, just continue with cont in lldb and so on.
A few extra bits of information:
- This would likely also work without library evolution but it's not guaranteed to work because the compiler and optimisation modes might inline even across modules these days
- If you prefer to use the FindMeModule as a SwiftPM dependency that should also work but the cross module inlining might be a problem
Little addition, if you integrate like this (with #if FIND_ME)
#if FIND_ME
import FindMeModule
#endif

struct CopiedTooMuch {
    var x = 0
    #if FIND_ME
    var findMe = FindMe()
    #endif
}

@inline(never)
func FOO(_ a: CopiedTooMuch) -> Int {
    // we'll cause some copies here
    var a = a
    var b = a
    BAR(&a)
    BAR(&b)
    return a.x + b.x
}

@inline(never)
func BAR(_ a: inout CopiedTooMuch) {
    a.x += 1
    precondition(a.x == 1)
}

print(FOO(CopiedTooMuch()))
then you can compile your code normally (just swift build) without any FindMe stuff as well as with
swift build \
-Xswiftc -I/tmp -Xlinker /tmp/libFindMeModule.dylib \
-Xswiftc -DFIND_ME
when you want to use FindMe to find copies.
I appreciate your post but your approach is not really suitable for my need: I would like to detect ALL the copying of structs, especially the huge ones, in a project that ~1K devs are working on. I understand your idea of preventing the copying code from being inlined, setting breakpoints based on the mangling rules, and then catching the copies at runtime, but that obviously does not scale.
Also this is why I thought a compiler warning could be the best option: to warn on any ASTs (except the new copy operator or other things expressing an explicit copy) that would lead to a huge struct being copied. The size threshold could be customized with an additional command line option. After a fresh compile of the whole project, we would have all the sites where copying happens by analyzing the compiler's output. Moreover, the warning would give direct hints to the devs, so they could realize what they were doing early.
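This is not the warning being asked for, but as a stopgap, MemoryLayout can at least tell you which structs are above a given size and therefore worth investigating. A minimal sketch, in which SmallPoint, HugeRecord, exceedsCopyBudget and the 32-byte threshold are all hypothetical:

```swift
// Hypothetical types for illustration.
struct SmallPoint {
    var x: Int64 = 0
    var y: Int64 = 0
}

struct HugeRecord {
    var first: (Int64, Int64, Int64, Int64) = (0, 0, 0, 0)
    var second: (Int64, Int64, Int64, Int64) = (0, 0, 0, 0)
}

// True when a value of type T occupies more than `thresholdInBytes` inline.
// Heap storage behind references (arrays, strings, classes) is not counted.
func exceedsCopyBudget<T>(_ type: T.Type, thresholdInBytes: Int) -> Bool {
    MemoryLayout<T>.size > thresholdInBytes
}
```

A unit test could run exceedsCopyBudget over a list of types the team cares about, so the large ones are known before anyone goes hunting for their copies.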
That's a fair criticism but there are other technologies which can help with that. For example:
- one of my favourite technologies of all time: DTrace
- scripting lldb to automatically continue and print
- interposing the copy function with DYLD_INTERPOSE
I'll demonstrate the first one here: The DTrace program pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); } will aggregate (by stack) and count all the copies of our FindMe. And DTrace even ships with macOS, so you can run:
sudo dtrace \
-n 'pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); }' \
-c .build/debug/MyActualProgram
which will yield
$ sudo dtrace -n 'pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); }' -c .build/debug/MyActualProgram
dtrace: system integrity protection is on, some features will not be available
dtrace: description 'pid$target::*Copy*FindMe*:entry ' matched 3 probes
2
dtrace: pid 8473 has exited
libFindMeModule.dylib`initializeWithCopy for FindMe
package`outlined init with copy of CopiedTooMuch+0x6c
package`FOO(_:)+0x84
package`main+0x70
dyld`start+0x8b0
1
libFindMeModule.dylib`initializeWithCopy for FindMe
package`outlined init with copy of CopiedTooMuch+0x6c
package`FOO(_:)+0x90
package`main+0x70
dyld`start+0x8b0
1
In my demo program I only cause two copies, one at this stack
libFindMeModule.dylib`initializeWithCopy for FindMe
package`outlined init with copy of CopiedTooMuch+0x6c
package`FOO(_:)+0x84
package`main+0x70
dyld`start+0x8b0
1 <<<< how many calls
and one at that
libFindMeModule.dylib`initializeWithCopy for FindMe
package`outlined init with copy of CopiedTooMuch+0x6c
package`FOO(_:)+0x90
package`main+0x70
dyld`start+0x8b0
1 <<<< how many calls
I hope that helps.
But yes, it'd be awesome if for example Instruments.app, Swift itself or some other technology could help us with finding these copies without us having to build it ourselves. I'd suggest you file a GitHub issue (for Swift) and/or an Apple Feedback (for Instruments) with your request.
As Swift leans towards using structs more and more, +1 on that!
You can eliminate copies with the ownership modifiers, but they also show you where copies are made, through compiler errors, if you don't use the copy operator. Once you have worked through the error messages, you see directly in your source code where copies are necessary. This is a large undertaking but doesn't actually require that much knowledge about the ownership modifiers.
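A minimal sketch of that workflow, assuming Swift 5.9 or later. Settings and describe are hypothetical names:

```swift
// Hypothetical type for illustration.
struct Settings {
    var name: String
    var retries: Int
}

// `borrowing` turns off implicit copies of the parameter inside the function.
func describe(_ settings: borrowing Settings) -> String {
    // let local = settings    // rejected: a borrowed parameter is not implicitly copyable
    let local = copy settings  // the copy is now spelled out in the source
    return "\(local.name): \(local.retries)"
}
```

Marking a parameter as borrowing and fixing the resulting errors therefore leaves a `copy` at every place where a copy really happens.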
I also want to mention that if you have large structs you can make copies cheap (i.e. roughly just a call to swift_retain) by implementing CoW (Copy on Write) instead of eliminating copies. This will only provide benefits if the copy isn't mutated and the copy wasn't actually needed in the first place. CoW can be implemented manually but nowadays you can probably write a macro for it as well.
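A minimal sketch of manual CoW, using the standard library's isKnownUniquelyReferenced. Record and RecordStorage are hypothetical names, and a real type would forward every stored property this way:

```swift
// Reference-typed backing storage; copying a Record only retains this object.
final class RecordStorage {
    var id: Int
    var payload: [Double]

    init(id: Int, payload: [Double]) {
        self.id = id
        self.payload = payload
    }
}

struct Record {
    private var storage: RecordStorage

    init(id: Int, payload: [Double]) {
        storage = RecordStorage(id: id, payload: payload)
    }

    var id: Int {
        get { storage.id }
        set {
            makeUnique()  // clone the storage only if it is shared
            storage.id = newValue
        }
    }

    // Gives this value its own storage before a mutation, if needed.
    private mutating func makeUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = RecordStorage(id: storage.id, payload: storage.payload)
        }
    }
}
```

Copying a Record is then a single reference copy regardless of how many fields the storage holds, and the full duplication is deferred to the first mutation of a shared value.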
Just throwing an idea out here: if you use -emit-silgen (I assume it's SIL as handed over by Sema without any optimization pass), you might search for alloc_box with store (and optionally copy_value) (and/or alloc_stack, other cases ...). It even seems all the instructions contain the information you might need (type name, file location, ...).
I would like to ask: can this idea even work, or is there some obvious problem I don't see, @dnadoba?
I'm not a SIL expert but I think copy_addr is closer to my understanding of "copy". And the unoptimized SIL may not reflect the actual codegen. So I tried to emit a diagnostic in IRGenSILFunction::visitCopyAddrInst() and it seemed to work really well.
Also with the help of the custom diagnostic I found an actual compiler issue.
Diving into the code, it seems like the frontend translates chained logical operations into nested closure calls, and the struct is copied as a capture of the implicit closure. This behavior can be confirmed by setting a breakpoint on the outlined helper. Since this post is getting some attention, I decided to cross-reference the issue here to make more people aware of it.
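To illustrate the pattern, assuming the logical operations meant here are chains of && (whose right-hand operand is declared @autoclosure in the standard library). Limits, allPositive and shortCircuitAnd are hypothetical names, and the second function only approximates what the frontend produces:

```swift
// Hypothetical struct for illustration.
struct Limits {
    var low: Int
    var mid: Int
    var high: Int
}

// As written in source: each right-hand operand of && is an implicit closure.
func allPositive(_ limits: Limits) -> Bool {
    limits.low > 0 && limits.mid > 0 && limits.high > 0
}

// Stand-in for &&, with the closure parameter made explicit.
func shortCircuitAnd(_ lhs: Bool, _ rhs: () -> Bool) -> Bool {
    lhs ? rhs() : false
}

// Roughly the same expression with the closures written out by hand.
// Both closures capture `limits`, which is where a copy can come from.
func allPositiveExplicit(_ limits: Limits) -> Bool {
    shortCircuitAnd(
        shortCircuitAnd(limits.low > 0, { limits.mid > 0 }),
        { limits.high > 0 }
    )
}
```

Whether the capture really results in a copy depends on the compiler version and optimisation level, so treat this as a picture of the source-level shape rather than of the generated code.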
Can you elaborate, preferably with a somewhat small spoon?
The copy is semantically necessary but it might be eliminated in an optimisation pass if the compiler can prove that it is not actually necessary. Do you emit the diagnostics after or before all optimisation passes have run? Also do you build in release configuration (i.e. -O)?
I assume that by IRGenSIL::visitCopyAddrInst(), you mean IRGenSILFunction::visitCopyAddrInst, am I correct?
I would be really interested too, if you would share the diagnostic with us :)
My bad, I only checked the file name and assumed that the class name was the same.
As for the diagnostic, I am actually afraid that I am not doing it 100% right, so I choose not to post the code to avoid confusion for now.

The copy is semantically necessary but it might be eliminated in an optimisation pass if the compiler can prove that it is not actually necessary. Do you emit the diagnostics after or before all optimisation passes have run? Also do you build in release configuration (i.e. -O)?
FWIW, the above issue of multiple copy calls is visible in godbolt with -O.
At the stage of IRGenSILFunction, the SIL should be finalized, right?
All the testing and diagnostics emission I've done were built with -O.