Consume and memory performance

EngOmarElsayed · June 27, 2024, 10:06am

let ex = PlayerStruct()
var mut = consume ex

What I understand is that this piece of code transfers ownership of the PlayerStruct value to mut. So at this point mut is the only variable that holds the PlayerStruct value, and there is only one copy of it in memory, owned by mut.

VS.

let ex = PlayerStruct()
var mut = ex

Here we have two copies of the PlayerStruct, one held by ex and one by mut, so it is not memory efficient at this point.

So the first solution is better, right? My question is: by using the consume keyword, can we make our memory use more efficient by removing unused copies of instances that were easy to create in the past? Is my understanding correct?

johannesweiss · June 27, 2024, 10:56am

You may have two copies, yeah. But the optimiser is often able to elide this.

Here's how I check:

struct PlayerStruct {
    // some storage
    var int0: Int = #line
    var int1: Int = #line
    var int2: Int = #line
    var int3: Int = #line

    @_optimize(none) // just to make sure it's actually created
    init() {}
}

@inline(never)
func blackhole<T>(_ ps: inout T) {}

func HARNESS_consume() {
    let ex = PlayerStruct()
    var mut = consume ex
    blackhole(&mut)
}

func HARNESS_copy() {
    let ex = PlayerStruct()
    var mut = ex
    blackhole(&mut)
}

then run

swiftc -O -emit-assembly -parse-as-library -module-name T test.swift | swift demangle | grep -A15 ^T.HARNESS

which compiles this with optimisations, emits the assembly, and then greps for the right functions

output:

T.HARNESS_copy() -> ():
	.cfi_startproc
	sub	sp, sp, #48
	stp	x29, x30, [sp, #32]
	add	x29, sp, #32
	.cfi_def_cfa w29, 16
	.cfi_offset w30, -8
	.cfi_offset w29, -16
	bl	T.PlayerStruct.init() -> T.PlayerStruct
	mov	x0, sp
	bl	generic specialization <T.PlayerStruct> of T.blackhole<A>(inout A) -> ()
	ldp	x29, x30, [sp, #32]
	add	sp, sp, #48
	ret
	.cfi_endproc

--
T.HARNESS_consume() -> ():
	.cfi_startproc
	b	T.HARNESS_copy() -> ()
	.cfi_endproc

In this case, the optimiser totally got it. See how HARNESS_consume literally just (tail) calls HARNESS_copy because they have the exact same code?

EngOmarElsayed · June 27, 2024, 11:01am

Oh okay, so this means that using consume will not make a big difference, right?

johannesweiss · June 27, 2024, 11:10am

Not quite. You can use consume to guarantee things that otherwise may or may not happen. So using consume can absolutely make a difference, but it's not always required to get the best performance. Especially trivial cases like the one in your example are often taken care of by the optimiser.

But what the optimiser can optimise is a little volatile: the Swift compiler of course evolves and so does your code. So if you have a piece of code where it's critical not to get a copy, I'd use consume explicitly. But in most places I'd just leave it out, as it won't make a difference either way and you can always add it later.
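As a rough illustration of that guarantee (a minimal sketch, assuming Swift 5.9 or later; the score property and the demo function are made up for this example): once a binding has been consumed, the compiler rejects any later use of it, regardless of what the optimiser would have done.

```swift
// Hypothetical stand-in for the PlayerStruct from the question.
struct PlayerStruct {
    var score = 0
}

func demo() -> Int {
    let ex = PlayerStruct()
    var mut = consume ex      // ends the lifetime of `ex`; `mut` now owns the value
    // _ = ex.score           // uncommenting this should fail to compile: `ex` was consumed
    mut.score += 1            // `mut` is an ordinary mutable variable
    return mut.score
}

print(demo())
```

The point is that the "no second copy" property is checked at compile time instead of being left to the optimiser.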

Where you may want to think about it a little more carefully is function signatures. Adding consuming to a function signature is a breaking change.

EngOmarElsayed · June 27, 2024, 11:23am

I got you, and thank you for the amazing explanation. I have one more question to make sure I understand the consuming keyword. When this keyword is added to a function, the function removes self from the caller, right? That means that after calling this method, self will no longer be available. Am I right?

johannesweiss · June 27, 2024, 11:51am

I meant consuming on arguments like func foo(_ x: consuming Player) but you're asking about consuming func foo().
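To make the two spellings concrete, here is a small sketch (assuming Swift 5.9 or later; the Player type, its name property, and both function bodies are invented for illustration). The first puts consuming on a parameter, the second puts it on a method so that it applies to self.

```swift
// Hypothetical type, only here to show the two positions of `consuming`.
struct Player {
    var name: String

    // `consuming` on the method: the method takes ownership of `self`.
    consuming func retire() -> Player {
        return self           // hands the owned value straight back
    }
}

// `consuming` on a parameter: the function takes ownership of `x`.
func foo(_ x: consuming Player) -> Player {
    return x                  // forwards ownership to the caller
}

let a = foo(Player(name: "a"))
let b = Player(name: "b").retire()
print(a.name, b.name)
```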

So yes, kinda, you're definitely on the right track, but it doesn't really remove self from the caller (except if a copy is impossible, see later); it merely says 'I will consume ownership of self'. This may sound a little abstract, so let's do examples. Let's start with a class C; note that for classes, copying means increasing the ref count.

Let's start with a program that doesn't use consuming:

class C { func bye() {} }
func HARNESS(_ c: C) -> C { c.bye(); return c; }

if we feed this through the compiler we get

$ echo 'class C { func bye() {} }; func HARNESS(_ c: C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -A20 ^T.HARNESS | grep -v .cfi_
T.HARNESS(T.C) -> T.C:
	stp	x20, x19, [sp, #-32]!
	stp	x29, x30, [sp, #16]
	add	x29, sp, #16
	mov	x20, x0
	ldr	x8, [x0]
	ldr	x8, [x8, #80]
	blr	x8    // <-- CALL to virtual function `bye`
	mov	x0, x20
	ldp	x29, x30, [sp, #16]
	ldp	x20, x19, [sp], #32
	b	_swift_retain  // <-- a retain at the end before return

So when we're calling bye we don't need to do anything to the ref count because we borrowed c (actually @guaranteed) from the caller (the standard for function parameters). And bye by default will also just borrow (actually @guaranteed) self.

Before the return however we need to increase the reference count because the default for return values is @owned.

Now, if we change our little program and make it a consuming func bye() that means bye is no longer happy with a @guaranteed self, it now wants to consume it.

class C { consuming func bye() {} }
func HARNESS(_ c: C) -> C { c.bye(); return c; }

gives us

$ echo 'class C { consuming func bye() {} }; func HARNESS(_ c: C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -A20 ^T.HARNESS | grep -v .cfi_
T.HARNESS(T.C) -> T.C:
	stp	x20, x19, [sp, #-32]!
	stp	x29, x30, [sp, #16]
	add	x29, sp, #16
	mov	x20, x0
	ldr	x8, [x0]
	ldr	x19, [x8, #80]
	bl	_swift_retain // <--- retain before call to bye
	blr	x19  // <--- virtual bye call
	mov	x0, x20
	ldp	x29, x30, [sp, #16]
	ldp	x20, x19, [sp], #32
	b	_swift_retain // <--- another retain before return

So as you see, the caller doesn't strictly speaking lose access to c (aka self in bye) but it will need to increase the reference count an extra time.
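The same behaviour can be observed at the source level. This is only a sketch, assuming Swift 5.9 or later; the value property and the Int return types are additions for the sake of having something to check, they are not part of the program above.

```swift
final class C {
    var value = 7

    // Takes ownership of `self`; for a class that means an owned (+1) reference.
    consuming func bye() -> Int {
        return value
    }
}

func harness(_ c: C) -> Int {
    let first = c.bye()       // the compiler copies (retains) `c` to satisfy `consuming`
    let second = c.value      // `c` is still usable afterwards, since C is copyable
    return first + second
}

print(harness(C()))
```

Because C is copyable, the call to bye() does not take c away from harness; the cost is the extra retain seen in the assembly.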

Now, where this really becomes apparent is if we disallow the compiler from making copies/increasing the ref count. Let's consider this slightly modified program (note that C is now struct C: ~Copyable):

struct C: ~Copyable { @inline(never) func bye() {} }
func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }

this gives us

$ echo 'struct C: ~Copyable { @inline(never) func bye() {} }; func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -A3 ^T.HARNESS
T.HARNESS(__owned T.C) -> T.C:
	.cfi_startproc
	b	function signature specialization <Arg[0] = Dead> of T.C.bye() -> ()
	.cfi_endproc

Shiny! It compiles and the function does nothing but (tail) call bye.

But if we now make bye a consuming func we'll get:

$ echo 'struct C: ~Copyable { @inline(never) consuming func bye() {} }; func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -A3 ^T.HARNESS
<stdin>:1:80: error: 'c' consumed more than once
1 | struct C: ~Copyable { @inline(never) consuming func bye() {} }; func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }
  |                                                                                |                      |               `- note: consumed again here
  |                                                                                |                      `- note: consumed here
  |                                                                                `- error: 'c' consumed more than once
2 |

because the compiler is now unable to add a defensive copy.
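For contrast, a version that consumes the value exactly once is accepted. A minimal sketch, assuming Swift 5.9 or later; the Token type and its id property are hypothetical and only exist so the result can be checked.

```swift
// Hypothetical noncopyable type with a consuming method.
struct Token: ~Copyable {
    let id: Int

    consuming func bye() -> Int {
        return id             // reads a copyable field; `self` is destroyed on return
    }
}

func harness() -> Int {
    let t = Token(id: 42)
    let result = t.bye()      // ownership of `t` moves into bye()
    // _ = t.bye()            // uncommenting this should fail: `t` was already consumed
    return result
}

print(harness())
```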

I know this is all quite something but I hope it helps.

FWIW, to figure out the calling conventions (@owned vs @guaranteed etc) that the compiler picked, I'd recommend looking at the SIL instead of the assembly. E.g.

regular func bye takes self (implicit first parameter) as @guaranteed C

$ echo 'class C { func bye() {} }; func HARNESS(_ c: C) -> C { c.bye(); return c; }' | swiftc -O -emit-sil -module-name T - | swift demangle | grep '^sil.*bye'
sil hidden @T.C.bye() -> () : $@convention(method) (@guaranteed C) -> () {

consuming func bye() takes self as @owned C

$ echo 'class C { consuming func bye() {} }; func HARNESS(_ c: C) -> C { c.bye(); return c; }' | swiftc -O -emit-sil -module-name T - | swift demangle | grep '^sil.*bye'
sil hidden @T.C.bye() -> () : $@convention(method) (@owned C) -> () {

EngOmarElsayed · June 27, 2024, 12:06pm

So the reference count was increased the first time for the function bye() (because it's a consuming function, so it takes a copy of c to consume) and the second time for return c.

EngOmarElsayed · June 27, 2024, 12:10pm

So this made me think: a consuming function becomes really handy with ~Copyable types, but with Copyable types it's not that useful because the type is allowed to be copied, so it doesn't make a big difference, right?

EngOmarElsayed · June 27, 2024, 12:13pm

johannesweiss:

this gives us

$ echo 'struct C: ~Copyable { @inline(never) func bye() {} }; func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -A3 ^T.HARNESS
T.HARNESS(__owned T.C) -> T.C:
	.cfi_startproc
	b	function signature specialization <Arg[0] = Dead> of T.C.bye() -> ()
	.cfi_endproc

Shiny! It compiles and the function does nothing but (tail) call bye.

You mean it doesn't return c? If yes, why?

johannesweiss · June 27, 2024, 12:24pm

It does return c but it's hard to see. The compiler was smart enough to see that C is of zero size (i.e. to return it, there's nothing to do).

If you add an Int to C it'll look like this

$ echo 'struct C: ~Copyable { var x = 1; @inline(never) func bye() { } }; func HARNESS(_ c: consuming C) -> C { c.bye(); return c; }' | swiftc -O -emit-assembly -module-name T - | swift demangle | grep -v .cfi_ | grep -A10 ^T.HARNESS
T.HARNESS(__owned T.C) -> T.C:
	stp	x20, x19, [sp, #-32]!
	stp	x29, x30, [sp, #16]
	add	x29, sp, #16
	mov	x19, x0
	bl	function signature specialization <Arg[0] = Dead> of T.C.bye() -> ()
	mov	x0, x19
	ldp	x29, x30, [sp, #16]
	ldp	x20, x19, [sp], #32
	ret
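One way to sanity-check the size argument without reading assembly is MemoryLayout. This is a sketch using ordinary copyable structs as stand-ins (the names Empty and WithInt are made up), since the point here is only about stored properties.

```swift
// No stored properties: nothing to move when returning a value of this type.
struct Empty {}

// One stored Int: returning it means passing that Int back (x0 in the assembly above).
struct WithInt {
    var x = 1
}

let emptySize = MemoryLayout<Empty>.size
let withIntSize = MemoryLayout<WithInt>.size
print(emptySize, withIntSize)
```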