Find unintentional copy of struct

nocchijiang · November 18, 2023, 9:07am

Recently I found that a surprisingly large number of structs are being copied here and there in our codebase. Are there any techniques (maybe a compiler warning) that I can use to effectively catch unintentional copies?

dnadoba · November 18, 2023, 4:37pm

You can use the new ownership modifiers which will force you to explicitly copy values.
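
A minimal sketch of what that looks like, assuming Swift 5.9 or later (the function name `duplicated` is made up for illustration). A parameter marked `borrowing` can no longer be copied implicitly, so every copy has to be written out with the `copy` operator:

```swift
// Requires Swift 5.9+. `duplicated` is a hypothetical name.
// Because `text` is declared `borrowing`, the compiler rejects implicit copies of it.
func duplicated(_ text: borrowing String) -> (String, String) {
    // return (text, text)        // rejected: this would copy `text` implicitly
    return (copy text, copy text) // accepted: each copy is spelled out
}

let pair = duplicated("hello")
print(pair.0, pair.1)
```

The same rule applies to `consuming` parameters, so the modifiers double as a way to make the remaining copies visible in the source.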

wadetregaskis · November 18, 2023, 4:42pm

Adding to @dnadoba's suggestion:

Are you talking about Release builds specifically (i.e. with use of an -O or -Osize or -Ounchecked flag)? The Swift compiler omits a lot of basic optimisations in Debug builds (for the sake of making compilation faster).

Can you utilise non-copyable structs for some of your needs? Note that you can still copy a non-copyable struct, you just have to do it manually with e.g. a clone method you write yourself.
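
As a rough sketch of the manual-copy idea, assuming Swift 5.9 or later (`Samples`, `clone()` and `cloneDemo()` are hypothetical names, not an existing API):

```swift
// Requires Swift 5.9+. All names here are made up for illustration.
struct Samples: ~Copyable {
    var values: [Int]

    init(values: [Int]) {
        self.values = values
    }

    // Methods borrow `self` by default, so calling this leaves the original usable.
    // It is the only way to duplicate a `Samples`.
    func clone() -> Samples {
        Samples(values: values)
    }
}

func cloneDemo() -> Int {
    let original = Samples(values: [1, 2, 3])
    // `let moved = original` would move the value; using `original` afterwards
    // would then be a compile-time error rather than a silent copy.
    let duplicate = original.clone()
    return original.values.count + duplicate.values.count
}

print(cloneDemo())
```

Every duplication now shows up as a `clone()` call that you can search for in the source.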

If you can precisely describe (or provide simple example code for) problem cases, the Swift compiler folks would probably appreciate a bug report.

nocchijiang · November 19, 2023, 5:25am

Yes, the ownership modifiers can indeed reduce wasteful copies, but I want an efficient way to find where copying occurs in the code, especially of huge structs. There are ~1K people working on our codebase and it is hard to teach all of them to write copy-free code, so we want an automatic solution, like compiler warnings, to keep the code reasonably clean.

nocchijiang · November 19, 2023, 5:35am

I think the copying problem exists whether or not it is a release build. The compiler behavior should be consistent (a copyable struct should always be copied when being passed around, at any optimization level).

From my testing on Swift 5.9, the functionality of ~Copyable is extremely limited. It does not work with protocols and generics at all.

wadetregaskis · November 19, 2023, 5:55am

I get where you're coming from, but it'll be hard to get help from the compiler on this. Not every copy is unnecessary nor a performance problem, and the compiler doesn't really have great insight into which cases will be acceptable and which won't (though maybe when PGO is in use, it could).

It'll probably be more viable - and to the point of what you care about, performance - to focus on profiling and benchmark unit tests. That way you can ignore any copies that don't actually matter, and you'll also catch performance problems due to more than just copies. You can make performance testing part of your development process - e.g. part of your CI pipeline - and even block integration of patches that introduce regressions.
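
A minimal sketch of such a timing check, assuming Swift 5.7 or later (on Apple platforms `ContinuousClock` needs macOS 13 / iOS 16). `Payload`, `checksum`, `runBenchmark` and the five-second budget are all made up for illustration. A real setup would more likely use XCTest's `measure` block or a dedicated benchmarking package:

```swift
// Hypothetical workload: a struct passed by value to a non-inlined function.
struct Payload {
    var a = [Int](repeating: 1, count: 64)
    var b = [Int](repeating: 2, count: 64)
}

@inline(never)
func checksum(_ payload: Payload) -> Int {
    payload.a.reduce(0, +) + payload.b.reduce(0, +)
}

// Runs the workload repeatedly and reports the result plus the elapsed time.
func runBenchmark(iterations: Int) -> (total: Int, elapsed: Duration) {
    let payload = Payload()
    var total = 0
    let elapsed = ContinuousClock().measure {
        for _ in 0..<iterations {
            total &+= checksum(payload)
        }
    }
    return (total, elapsed)
}

let result = runBenchmark(iterations: 10_000)
// Hypothetical budget; a CI job could fail when it is exceeded.
let budget = Duration.seconds(5)
print("elapsed:", result.elapsed, "within budget:", result.elapsed < budget)
```

Wall-clock thresholds are noisy, so comparing against a recorded baseline on dedicated hardware is usually more reliable than a fixed budget like this one.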

I don't think that's accurate (nor a desirable constraint). Swift is a pretty high-level language that nonetheless aspires to be very performant, and that means it has to do some pretty advanced optimisations, such as eliminating entire instantiations of structs, let alone mere copies. For example, to make lazy Collections and Sequences fast, you can't actually allocate all the intermediary structures, nor have them chain calls through each other for every element you fetch. It all gets completely eliminated by the optimiser until you're left with a trivial loop and the essential operations, potentially just a few machine instructions.
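
For illustration, here is the kind of lazy chain meant here (`sumOfEvenSquares` is a hypothetical name). In source, each step wraps the previous one in a small wrapper struct such as `LazyMapSequence` or `LazyFilterSequence`. Whether they all disappear in the generated code depends on the compiler version and on building with `-O`:

```swift
@inline(never)
func sumOfEvenSquares(upTo n: Int) -> Int {
    // Semantically: range -> lazy wrapper -> map wrapper -> filter wrapper -> reduce.
    (1...n).lazy
        .map { $0 * $0 }
        .filter { $0 % 2 == 0 }
        .reduce(0, +)
}

print(sumOfEvenSquares(upTo: 10)) // prints 220
```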

It's true, non-copyable structs currently don't play well with those features. There's work underway to rectify that, although it's probably still some way off from landing in a Swift release.

johannesweiss · November 19, 2023, 11:20am

Okay, let's start with an apology: This post is getting a bit into the nitty gritty and might sound like a lot of work to actually use. And yes, it's a lot of information and yes, this is definitely getting a bit weird but once you wrap your head around it I think it shouldn't be too bad.

Our agenda is the following:

Create a "special" module which holds a "special" type whose copies will always call a function (one-off effort, you can build this module once and use it forever)
Integrate that special module into any code base (pretty easy, just two extra swift build flags)
Use lldb (or any other debugger really) to find the copies

With that out of the way, let's get going.

Intro

To get what you want, wouldn't it be awesome if we could have a function that we could set a breakpoint on every time a struct gets copied? Yes, it would. "Unfortunately", Swift is often pretty good at optimising so it doesn't actually need real functions to copy but inlines the effects that it needs to copy something.

But: What if we could undo that? What if we could make it such that the Swift compiler couldn't ever inline the copies and would need to call a named function? Hmm, this sounds pretty much like what Library Evolution does, no?

Precisely, let's assume in your real codebase you have a struct CopiedTooMuch defined like

struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable
}

And let's assume you want to find out why CopiedTooMuch is being copied too much.

Step 1: Creation of the special module

What I am proposing is to add (for debugging only) another member to this struct, called findMe:

import FindMeModule

struct CopiedTooMuch {
    var some: ActualVariable
    var another: ActualVariable

    // We don't need it but we need its side effects
    let findMe: FindMe = FindMe()  // <<< THIS IS NEW
}

So far, nothing really changes but one thing is certain: To copy a CopiedTooMuch we will also need to copy FindMe. And how can we make sure that the Swift compiler never inlines the FindMe copies but always calls a "please copy a FindMe" function? Library evolution.

If you create a findme.swift file with this content

class Clazz {
    init() {}
}

public struct FindMe {
    private let x0: Clazz = .init()
    private let x1: Clazz = .init()

    public init() {}
}

and then compile it like so (assuming findme.swift is in /tmp and /tmp is also your current working directory)

swiftc -O -o /tmp/libFindMeModule.dylib \
    -emit-module -module-name FindMeModule \
    -emit-library -emit-module-interface \
    -enable-library-evolution findme.swift

That will yield you a real Swift module FindMeModule which is completely opaque to the compiler (after all it has library evolution on).

Step 2: Integration of `FindMe` into your codebase

Finally, adjust the file which has CopiedTooMuch in your real codebase to add the import FindMeModule as well as the let findMe = FindMe(). To compile your real codebase you now need to use

swift build -Xswiftc -I/tmp -Xlinker /tmp/libFindMeModule.dylib

so that the Swift compiler can find our special little module. Again, I'm assuming here that it's in /tmp.

Step 3: Using `lldb` to find the copies

Good, with the compilation out of the way we can run lldb .build/debug/MyActualProject. And there we can finally set our breakpoints!

break set -n $s12FindMeModule0aB0VwCP
break set -n $s12FindMeModule0aB0Vwca
break set -n $s12FindMeModule0aB0Vwcp

(FWIW, Vw stands for 'value witness' and the weird CP, ca, cp suffixes are explained in the name mangling doc)

If you set those breakpoints, lldb will say

(lldb) break set -n $s12FindMeModule0aB0VwCP
Breakpoint 2: where = libFindMeModule.dylib`initializeBufferWithCopyOfBuffer value witness for FindMeModule.FindMe, address = 0x0000000000003ca0
(lldb) break set -n $s12FindMeModule0aB0Vwca
Breakpoint 3: where = libFindMeModule.dylib`assignWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003d30
(lldb) break set -n $s12FindMeModule0aB0Vwcp
Breakpoint 4: where = libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe, address = 0x0000000000003cfc

Perfect! Now what's left is to run the binary

(lldb) run
Process 7429 launched: '/tmp/package/.build/debug/package' (arm64)

Once a copy happens, lldb will stop:

Process 7429 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
    frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe:
->  0x100473cfc <+0>:  stp    x20, x19, [sp, #-0x20]!
    0x100473d00 <+4>:  stp    x29, x30, [sp, #0x10]
    0x100473d04 <+8>:  add    x29, sp, #0x10
    0x100473d08 <+12>: mov    x19, x0
Target 0: (package) stopped.

Cool, let's see the backtrace (bt):

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
  * frame #0: 0x0000000100473cfc libFindMeModule.dylib`initializeWithCopy value witness for FindMeModule.FindMe
    frame #1: 0x0000000100006e98 package`outlined init with copy of CopiedTooMuch at <compiler-generated>:0
    frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
    frame #3: 0x0000000100006798 package`package_main at main.swift:24:7
    frame #4: 0x000000018eb91058 dyld`start + 2224

Okay, frame 2 is my "actual" code, let's go there

(lldb) frame select 2 
frame #2: 0x0000000100006b5c package`FOO(a=package.CopiedTooMuch @ 0x000000016fdff2a0) at main.swift:11:13
   8   	@inline(never)
   9   	func FOO(_ a: CopiedTooMuch) -> Int {
   10  	    // we'll cause some copies here
-> 11  	    var a = a
   12  	    var b = a
   13  	    BAR(&a)
   14  	    BAR(&b)

Nice! See how the compiler can now show us that var a = a is what causes this copy?

To see the next copy, just continue with cont in lldb and so on.

A few extra bits of information:

This would likely also work without library evolution but it's not guaranteed to work because the compiler and optimisation modes might inline even across modules these days
If you prefer to use the FindMeModule as a SwiftPM dependency that should also work but the cross module inlining might be a problem

johannesweiss · November 19, 2023, 11:34am

Little addition, if you integrate like this (with #if FIND_ME)

#if FIND_ME
import FindMeModule
#endif

struct CopiedTooMuch {
    var x = 0
#if FIND_ME
    var findMe = FindMe()
#endif
}

@inline(never)
func FOO(_ a: CopiedTooMuch) -> Int {
    // we'll cause some copies here
    var a = a
    var b = a
    BAR(&a)
    BAR(&b)
    return a.x + b.x
}

@inline(never)
func BAR(_ a: inout CopiedTooMuch) {
    a.x += 1
    precondition(a.x == 1)
}

print(FOO(CopiedTooMuch()))

then you can compile your code normally (just swift build) without any FindMe stuff as well as with

swift build \
    -Xswiftc -I/tmp -Xlinker /tmp/libFindMeModule.dylib \
    -Xswiftc -DFIND_ME

when you want to use FindMe to find copies.

nocchijiang · November 20, 2023, 1:20am

I appreciate your post but your approach is not really suitable for my needs: I would like to detect ALL the copying of structs, especially the huge ones, in a project that ~1K devs are working on. I understand your idea of preventing the inlining of the copying code, then setting up breakpoints based on the mangling rules and trying to catch the copies at runtime, but that obviously does not scale.

Also this is why I thought a compiler warning could be the best option: to warn on any ASTs (except the new copy operator or other things expressing explicit copy) that would lead to a huge struct being copied. The size threshold could be customized with an additional command line option. After a fresh compile of the whole project, we would have all the sites where copying happens by analyzing compiler's output. Moreover, the warning would give direct hints to the devs, so they could realize what they were doing early.
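
No such warning exists today as far as this thread establishes. As a rough approximation of the size threshold only, one can at least list which types would count as "huge" using `MemoryLayout`. The types, the 64-byte threshold and `isOversized` below are hypothetical, and this is a runtime check, not the proposed diagnostic:

```swift
// Hypothetical types, for illustration only.
struct Small {
    var x = 0
}

struct Huge {
    var a = (0, 0, 0, 0, 0, 0, 0, 0)
    var b = (0, 0, 0, 0, 0, 0, 0, 0)
    var c = (0, 0, 0, 0, 0, 0, 0, 0)
}

// Hypothetical threshold in bytes.
let threshold = 64

// True when the inline storage of `T` exceeds the threshold.
func isOversized<T>(_ type: T.Type) -> Bool {
    MemoryLayout<T>.size > threshold
}

print("Small:", MemoryLayout<Small>.size, "bytes, oversized:", isOversized(Small.self))
print("Huge:", MemoryLayout<Huge>.size, "bytes, oversized:", isOversized(Huge.self))
```

Note that `MemoryLayout` only counts inline storage: an `Array` or class reference field contributes a pointer-sized value, however much heap memory it refers to.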

johannesweiss · November 20, 2023, 7:27am

That's a fair criticism but there are other technologies which can help with that. For example:

one of my favourite technologies of all time: DTrace
scripting lldb to automatically continue and print
interposing the copy function with DYLD_INTERPOSE

I'll demonstrate the first one here: The DTrace program pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); } will aggregate (by stack) and count all the copies of our FindMe. And DTrace even ships with macOS, so you can run:

sudo dtrace \
    -n 'pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); }' \
    -c .build/debug/MyActualProgram

which will yield

$ sudo dtrace -n 'pid$target::*Copy*FindMe*:entry { @copy_stacks[ustack()] = count(); }' -c .build/debug/MyActualProgram
dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target::*Copy*FindMe*:entry ' matched 3 probes
2
dtrace: pid 8473 has exited


              libFindMeModule.dylib`initializeWithCopy for FindMe
              package`outlined init with copy of CopiedTooMuch+0x6c
              package`FOO(_:)+0x84
              package`main+0x70
              dyld`start+0x8b0
                1

              libFindMeModule.dylib`initializeWithCopy for FindMe
              package`outlined init with copy of CopiedTooMuch+0x6c
              package`FOO(_:)+0x90
              package`main+0x70
              dyld`start+0x8b0
                1

In my demo program I only cause two copies, one at

this stack

              libFindMeModule.dylib`initializeWithCopy for FindMe
              package`outlined init with copy of CopiedTooMuch+0x6c
              package`FOO(_:)+0x84
              package`main+0x70
              dyld`start+0x8b0
                1    <<<< how many calls

and one at that

              libFindMeModule.dylib`initializeWithCopy for FindMe
              package`outlined init with copy of CopiedTooMuch+0x6c
              package`FOO(_:)+0x90
              package`main+0x70
              dyld`start+0x8b0
                1    <<<< how many calls

I hope that helps.

But yes, it'd be awesome if, for example, Instruments.app, Swift itself or some other technology could help us find these copies without us having to build it ourselves. I'd suggest you file a GitHub issue (for Swift) and/or an Apple Feedback (for Instruments) with your request.

jaleel · November 20, 2023, 10:12am

As Swift leans towards using structs more and more, +1 on that!

dnadoba · November 20, 2023, 10:37am

You can eliminate copies with the ownership modifiers, but they also show you where copies are made: wherever a copy is needed and you don't use the copy operator, you get a compiler error. Once you have cleaned up the error messages, you can see directly in your source code where copies are necessary. This is a large undertaking but doesn't actually require that much knowledge about the ownership modifiers.

I also want to mention that if you have large structs you can make copies cheap (i.e. roughly just a call to swift_retain) by implementing CoW (Copy on Write) instead of eliminating copies. This will only provide benefits if the copy isn't mutated and the copy wasn't actually needed in the first place. CoW can be implemented manually but nowadays you can probably write a macro for it as well.
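
A minimal hand-written CoW sketch, with hypothetical names (`Storage`, `BigValue`, `sharesStorage`). The fields live in a class instance, copying the struct copies a single reference, and the storage is duplicated only on the first write to a shared instance:

```swift
// Heap storage holding the "large" payload. All names are made up for illustration.
final class Storage {
    var width: Int
    var height: Int
    var label: String

    init(width: Int, height: Int, label: String) {
        self.width = width
        self.height = height
        self.label = label
    }

    func duplicate() -> Storage {
        Storage(width: width, height: height, label: label)
    }
}

struct BigValue {
    private var storage: Storage

    init(width: Int, height: Int, label: String) {
        storage = Storage(width: width, height: height, label: label)
    }

    var width: Int {
        get { storage.width }
        set {
            // Duplicate the storage only if another BigValue still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.duplicate()
            }
            storage.width = newValue
        }
    }

    // Exposed only so the sketch can show when storage is shared.
    func sharesStorage(with other: BigValue) -> Bool {
        storage === other.storage
    }
}

let first = BigValue(width: 1, height: 2, label: "a")
var second = first                       // copies one reference, not three fields
print(first.sharesStorage(with: second)) // prints true
second.width = 10                        // the first write performs the real copy
print(first.sharesStorage(with: second)) // prints false
print(first.width, second.width)         // prints 1 10
```

The trade-off is an extra allocation and an indirection on every access, so this tends to pay off only for structs that are large and copied far more often than they are mutated.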

stuchlej · November 20, 2023, 2:53pm

Just throwing an idea here,

if you use -emit-silgen (I assume it's SIL as handed over by sema without any optimization pass), you might search for alloc_box with store (and optionally copy_value) (and/or alloc_stack, other cases ...). It even seems all the instructions contain the information you might need (type name, file location, ...).

I would like to ask you, can this idea even work, or is there some obvious problem I don't see @dnadoba

nocchijiang · November 21, 2023, 8:16am

I'm not an SIL expert but I think copy_addr is closer to my understanding of "copy". And the unoptimized SIL may not reflect the actual codegen. So I tried to emit a diagnostic in IRGenSILFunction::visitCopyAddrInst() and it seemed to work really well.

Also with the help of the custom diagnostic I found an actual compiler issue.

Diving into the code, it seems like the frontend translates consecutive logical operations into nested closure calls, and the struct is copied as a capture of the implicit closure. This behavior can be confirmed by setting a breakpoint on the outlined helper. Since this post is getting some attention, I decided to cross-ref the issue here to make more people aware of it.
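
To illustrate where the implicit closures come from: the standard library declares the right-hand side of `&&` as an `@autoclosure`, so each right-hand operand is wrapped in a closure that captures whatever it mentions. The sketch below uses a hypothetical stand-in `both` with the same shape, plus a made-up `Flags` struct. It shows only the source-level desugaring, not the generated code, and whether a copy is actually emitted depends on the compiler version and optimisation level:

```swift
// Hypothetical struct standing in for a large value.
struct Flags {
    var a = true
    var b = true
    var c = false
}

// Hypothetical stand-in with the same shape as the standard library's `&&`:
// the right-hand side is an autoclosure, evaluated only if needed.
func both(_ lhs: Bool, _ rhs: @autoclosure () -> Bool) -> Bool {
    lhs ? rhs() : false
}

func allSet(_ flags: Flags) -> Bool {
    flags.a && flags.b && flags.c
}

// Roughly what the expression above means: `&&` is left-associative, and
// `flags.b` and `flags.c` each become a closure that captures `flags`.
func allSetDesugared(_ flags: Flags) -> Bool {
    both(both(flags.a, flags.b), flags.c)
}

print(allSet(Flags()), allSetDesugared(Flags()))
```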

sveinhal · November 21, 2023, 10:37am

Can you elaborate, preferably with a somewhat small spoon?

dnadoba · November 21, 2023, 10:42am

The copy is semantically necessary but it might be eliminated in an optimisation pass if the compiler can prove that it is not actually necessary. Do you emit the diagnostics after or before all optimisation passes have run? Also do you build in release configuration (i.e. -O)?

stuchlej · November 21, 2023, 2:25pm

I assume that by IRGenSIL::visitCopyAddrInst(), you mean IRGenSILFunction::visitCopyAddrInst(), am I correct?

I would be really interested too, if you would share the diagnostic with us :)

nocchijiang · November 21, 2023, 10:36pm

My bad, I only checked the file name and assumed that the class name was the same.

As for the diagnostic, I am actually afraid that I am not doing it 100% right, so I choose not to post the code to avoid confusion for now.

tera · November 21, 2023, 10:38pm

FWIW, the above issue of multiple copy calls is visible in godbolt with -O.

nocchijiang · November 21, 2023, 10:45pm

At the stage of IRGenSILFunction, the SIL should be finalized, right?

All the testing and diagnostics emission I've done were built with -O.
