Analysis of an example of Code Generation of a property

Michael_Gottesman · January 4, 2022, 7:25pm

As promised to @Karl, in this post I am going to analyze the ARC traffic from Karl's post. I am also going to work out how one can always remove the ARC traffic from this example.

func test(_ array: Array<Int>) {
  someFunction(array.someSlice)
}

func someFunction(_ c: ArraySlice<Int>) {
  blackHole(c)
}

extension Collection {
  @inline(never) // Simulate a function the compiler decides not to inline.
  var someSlice: SubSequence {
    self[startIndex..<index(startIndex, offsetBy: 5)]
  }
}

@_silgen_name("blackHole")
func blackHole<T>(_: T)

So I just took a look at this example with -Xfrontend -enable-ossa-modules. Before we lower ownership, we get the following SIL:

// test(_:)
sil hidden [ossa] @$s6output4testyySaySiGF : $@convention(thin) (@guaranteed Array<Int>) -> () {
// %0 "array"                                     // users: %3, %1
bb0(%0 : @guaranteed $Array<Int>):
  debug_value %0 : $Array<Int>, let, name "array", argno 1 // id: %1
  %2 = alloc_stack $Array<Int>                    // users: %22, %4, %3
  %3 = store_borrow %0 to %2 : $*Array<Int>
  %4 = load_borrow %2 : $*Array<Int>              // users: %21, %6
  %5 = alloc_stack $Array<Int>                    // users: %8, %11, %6
  %6 = store_borrow %4 to %5 : $*Array<Int>
  // function_ref specialized Collection.someSlice.getter
  %7 = function_ref @$sSl6outputE9someSlice11SubSequenceQzvgSaySiG_Tg5 : $@convention(method) (@guaranteed Array<Int>) -> @owned ArraySlice<Int> // user: %9
  %8 = load_borrow %5 : $*Array<Int>              // users: %9, %10
  %9 = apply %7(%8) : $@convention(method) (@guaranteed Array<Int>) -> @owned ArraySlice<Int> // users: %20, %12
  end_borrow %8 : $Array<Int>                     // id: %10
  dealloc_stack %5 : $*Array<Int>                 // id: %11
  %12 = begin_borrow %9 : $ArraySlice<Int>        // users: %15, %13, %19
  debug_value %12 : $ArraySlice<Int>, let, name "c", argno 1 // id: %13
  %14 = alloc_stack $ArraySlice<Int>              // users: %18, %17, %15
  %15 = store_borrow %12 to %14 : $*ArraySlice<Int>
  // function_ref blackHole
  %16 = function_ref @blackHole : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> () // user: %17
  %17 = apply %16<ArraySlice<Int>>(%14) : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> ()
  dealloc_stack %14 : $*ArraySlice<Int>           // id: %18
  end_borrow %12 : $ArraySlice<Int>               // id: %19
  destroy_value %9 : $ArraySlice<Int>             // id: %20
  end_borrow %4 : $Array<Int>                     // id: %21
  dealloc_stack %2 : $*Array<Int>                 // id: %22
  %23 = tuple ()                                  // user: %24
  return %23 : $()                                // id: %24
} // end sil function '$s6output4testyySaySiGF'

What is interesting is that we have a bunch of stack traffic that the early OSSA optimizations are not handling (I am looking at mem2reg here). The other interesting part is that the getter is not inlined because of the @inline(never), so it returns its value at +1. If one waits until the end of the SIL pipeline, one can see that the optimizer is actually able to move the retain out of the getter through function signature specialization. This is a late optimization that, in its current form, is unsafe to run on Ownership SSA. I can tell that the getter has been function signature optimized from the mangling (note the Tf4n_g suffix on the callee):

sil hidden @$s6output4testyySaySiGF : $@convention(thin) (@guaranteed Array<Int>) -> () {
bb0(%0 : $Array<Int>):
  debug_value %0 : $Array<Int>, let, name "array", argno 1, loc "/app/example.swift":1:13, scope 2 // id: %1
  %2 = function_ref @$sSl6outputE9someSlice11SubSequenceQzvgSaySiG_Tg5Tf4n_g : $@convention(method) (@guaranteed Array<Int>) -> ArraySlice<Int>, loc "<compiler-generated>":0:0, scope 5 // user: %3
  %3 = apply %2(%0) : $@convention(method) (@guaranteed Array<Int>) -> ArraySlice<Int>, loc "<compiler-generated>":0:0, scope 5 // users: %10, %8, %4, %6
  debug_value %3 : $ArraySlice<Int>, let, name "c", argno 1, loc "/app/example.swift":5:21, scope 7 // id: %4
  %5 = alloc_stack $ArraySlice<Int>, loc "/app/example.swift":6:13, scope 8 // users: %11, %9, %6
  store %3 to %5 : $*ArraySlice<Int>, loc "/app/example.swift":6:13, scope 8 // id: %6
  %7 = function_ref @blackHole : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> (), loc "/app/example.swift":6:3, scope 8 // user: %9
  retain_value %3 : $ArraySlice<Int>, loc "<compiler-generated>":0:0, scope 2 // id: %8
  %9 = apply %7<ArraySlice<Int>>(%5) : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> (), loc "/app/example.swift":6:3, scope 8
  release_value %3 : $ArraySlice<Int>, loc "<compiler-generated>":0:0, scope 2 // id: %10
  dealloc_stack %5 : $*ArraySlice<Int>, loc "/app/example.swift":6:14, scope 8 // id: %11
  %12 = tuple (), loc "/app/example.swift":3:1, scope 3 // user: %13
  return %12 : $(), loc "/app/example.swift":3:1, scope 3 // id: %13
} // end sil function '$s6output4testyySaySiGF'

In this example, since we are using the unsafe objc return convention, we need to retain the value to take ownership of it before we can pass it to something that takes it as guaranteed. That is why this specific optimization is unsafe to do in Ownership SSA. Given the constraints of the problem, I do not think we can do more with the return-value-based semantics that Swift currently has, since in low-level SIL we cannot eliminate a release like the one above that /could/ be a last release.

That all being said, let's look at what this example looks like using a _read accessor instead of a getter:

func test(_ array: Array<Int>) {
    someFunction(array.someSlice)
}

func someFunction(_ c: ArraySlice<Int>) {
    blackHole(c)
}

extension Collection {
    @inline(never) // Simulate a function the compiler decides not to inline.
    var someSlice: SubSequence {
        _read {
            yield self[startIndex..<index(startIndex, offsetBy: 5)]
        }
    }
}

@_silgen_name("blackHole")
func blackHole<T>(_: T)

In this case, we get different SIL right before eliminating Ownership SSA (again using -enable-ossa-modules):

  *** SIL function before  #2427, stage Serialize, pass 1: OwnershipModelEliminator (ownership-model-eliminator)
// test(_:)
sil hidden [ossa] @$s4testAAyySaySiGF : $@convention(thin) (@guaranteed Array<Int>) -> () {
// %0 "array"                                     // users: %3, %1
bb0(%0 : @guaranteed $Array<Int>):
  debug_value %0 : $Array<Int>, let, name "array", argno 1 // id: %1
  %2 = alloc_stack $Array<Int>                    // users: %25, %4, %3
  %3 = store_borrow %0 to %2 : $*Array<Int>
  %4 = load_borrow %2 : $*Array<Int>              // users: %24, %6
  %5 = alloc_stack $Array<Int>                    // users: %8, %23, %6
  %6 = store_borrow %4 to %5 : $*Array<Int>
  // function_ref specialized Collection.someSlice.read
  %7 = function_ref @$sSl4testE9someSlice11SubSequenceQzvrSaySiG_Tg5 : $@yield_once @convention(method) (@guaranteed Array<Int>) -> @yields @in_guaranteed ArraySlice<Int> // user: %9
  %8 = load_borrow %5 : $*Array<Int>              // users: %9, %13
  (%9, %10) = begin_apply %7(%8) : $@yield_once @convention(method) (@guaranteed Array<Int>) -> @yields @in_guaranteed ArraySlice<Int> // users: %11, %12
  %11 = load [copy] %9 : $*ArraySlice<Int>        // users: %14, %22
  end_apply %10                                   // id: %12
  end_borrow %8 : $Array<Int>                     // id: %13
  %14 = begin_borrow %11 : $ArraySlice<Int>       // users: %17, %15, %21
  debug_value %14 : $ArraySlice<Int>, let, name "c", argno 1 // id: %15
  %16 = alloc_stack $ArraySlice<Int>              // users: %20, %19, %17
  %17 = store_borrow %14 to %16 : $*ArraySlice<Int>
  // function_ref blackHole
  %18 = function_ref @blackHole : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> () // user: %19
  %19 = apply %18<ArraySlice<Int>>(%16) : $@convention(thin) <τ_0_0> (@in_guaranteed τ_0_0) -> ()
  dealloc_stack %16 : $*ArraySlice<Int>           // id: %20
  end_borrow %14 : $ArraySlice<Int>               // id: %21
  destroy_value %11 : $ArraySlice<Int>            // id: %22
  dealloc_stack %5 : $*Array<Int>                 // id: %23
  end_borrow %4 : $Array<Int>                     // id: %24
  dealloc_stack %2 : $*Array<Int>                 // id: %25
  %26 = tuple ()                                  // user: %27
  return %26 : $()                                // id: %27
} // end sil function '$s4testAAyySaySiGF'

So beyond the extra stack allocations, which show that we need to handle store_borrow in mem2reg (something that @meg-gupta has been looking at), the real issue here is what I call the "tight coroutine scope" problem. The access to the coroutine goes through lvalue logic, so the access is as tight as possible, and we then copy the value to push it into the outer scope. You can see this in the SIL above: the load [copy] of the yielded value is immediately followed by end_apply, and the call to blackHole only happens afterwards. To handle such a case, SILGen needs to be able to emit a wider coroutine scope so that we can eliminate the ARC traffic. I tried playing around with this and was unable to come up with an example that lifetime-extends the read coroutine, but I did not try too hard.
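To make the scope question observable from plain Swift, here is a minimal sketch that logs when a _read coroutine starts and finishes relative to the callee. All the names in it (ReadLog, ReadHolder, firstFive, consume, runReadDemo) are hypothetical and not part of Karl's example, and it relies on the underscored _read accessor. If the access is as tight as in the SIL above, "read end" is logged before "consume"; the relative order of those two entries is the thing to look at, and it may differ between compiler versions.

```swift
// Hypothetical sketch: every name here is made up for illustration.
final class ReadLog {
    var events: [String] = []
}

struct ReadHolder {
    var values = [1, 2, 3, 4, 5, 6]
    let log: ReadLog

    var firstFive: ArraySlice<Int> {
        _read {
            log.events.append("read begin")
            yield values[0..<5]
            // Runs when the caller ends the coroutine (end_apply in SIL).
            log.events.append("read end")
        }
    }
}

// Takes its argument as a (guaranteed) value, like someFunction above.
func consume(_ c: ArraySlice<Int>, _ log: ReadLog) {
    log.events.append("consume")
}

func runReadDemo() -> [String] {
    let log = ReadLog()
    let holder = ReadHolder(log: log)
    consume(holder.firstFive, log)
    return log.events
}

print(runReadDemo())
```

If "read end" shows up before "consume", the slice had to be copied out of the coroutine before the call, which is the copy that shows up as ARC traffic.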

That being said, one /can/ do this today with inout since we can pass an argument as inout to create a "wide" exclusivity scope. As an example, consider the following Swift code:

func test(_ array: inout ArraySlice<Int>) {
    someFunction(&array.someSlice)
}

func someFunction(_ c: inout ArraySlice<Int>) {
    blackHole(&c)
}

extension ArraySlice {
    @inline(never) // Simulate a function the compiler decides not to inline.
    var someSlice: SubSequence {
        _read {
          fatalError()
        }
        _modify {
            yield &self[startIndex..<index(startIndex, offsetBy: 5)]
        }
    }
}

@_silgen_name("blackHole")
func blackHole<T>(_: inout T)

Importantly, notice how in the SIL below the call to blackHole is actually within the coroutine scope, i.e. between the begin_apply and the end_apply:

sil hidden [ossa] @$s4testAAyys10ArraySliceVySiGzF : $@convention(thin) (@inout ArraySlice<Int>) -> () {
// %0 "array"                                     // users: %3, %1
bb0(%0 : $*ArraySlice<Int>):
  debug_value %0 : $*ArraySlice<Int>, var, name "array", argno 1, expr op_deref // id: %1
  // function_ref specialized ArraySlice.someSlice.modify
  %2 = function_ref @$ss10ArraySliceV4testE04someB0AByxGvMSi_Tg5 : $@yield_once @convention(method) (@inout ArraySlice<Int>) -> @yields @inout ArraySlice<Int> // user: %3
  (%3, %4) = begin_apply %2(%0) : $@yield_once @convention(method) (@inout ArraySlice<Int>) -> @yields @inout ArraySlice<Int> // users: %7, %5, %8
  debug_value %3 : $*ArraySlice<Int>, var, name "c", argno 1, expr op_deref // id: %5
  // function_ref blackHole
  %6 = function_ref @blackHole : $@convention(thin) <τ_0_0> (@inout τ_0_0) -> () // user: %7
  %7 = apply %6<ArraySlice<Int>>(%3) : $@convention(thin) <τ_0_0> (@inout τ_0_0) -> ()
  end_apply %4                                    // id: %8
  %9 = tuple ()                                   // user: %10
  return %9 : $()                                 // id: %10
} // end sil function '$s4testAAyys10ArraySliceVySiGzF'

So we can express patterns like this in SIL; we just don't have the facilities in SILGen right now, AFAICT, to express this. I think adding something like a "shared" or "ref" variable would, in a pinch, allow us to express this behavior in a wider scope.
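The wide scope that inout gives you can also be observed from Swift. The following is a minimal sketch under the same caveats as before: ModifyLog, ModifyHolder, firstFive, mutateFirst and runModifyDemo are hypothetical names, and it uses the underscored _modify accessor. Because the inout access has to stay open for the whole call, the callee runs between the two halves of the coroutine.

```swift
// Hypothetical sketch: every name here is made up for illustration.
final class ModifyLog {
    var events: [String] = []
}

struct ModifyHolder {
    var values = [1, 2, 3, 4, 5, 6]
    let log: ModifyLog

    var firstFive: ArraySlice<Int> {
        get { values[0..<5] }
        _modify {
            log.events.append("modify begin")
            yield &values[0..<5]
            // Runs only after the caller is done with the inout access.
            log.events.append("modify end")
        }
    }
}

// Takes its argument inout, so the access spans the entire call.
func mutateFirst(_ c: inout ArraySlice<Int>, _ log: ModifyLog) {
    log.events.append("callee")
    c[c.startIndex] = 42
}

func runModifyDemo() -> (events: [String], values: [Int]) {
    let log = ModifyLog()
    var holder = ModifyHolder(log: log)
    mutateFirst(&holder.firstFive, log)
    return (log.events, holder.values)
}

let modifyDemo = runModifyDemo()
print(modifyDemo.events) // ["modify begin", "callee", "modify end"]
print(modifyDemo.values) // [42, 2, 3, 4, 5, 6]
```

Here the callee works directly on the yielded storage and nothing is copied out to an outer scope, which is the shape one would want a read access to have as well.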

Karl · January 4, 2022, 8:46pm

Oh, this is fantastic analysis! Very enlightening! Thanks so much for looking into it.

So this seems like the ideal solution, as it doesn't need to introduce new language concepts (beyond read accessors, which are already part of the performance manifesto). And as you say, it is possible with a _modify coroutine due to exclusivity, but _read coroutines are just given a narrower scope.

That's actually a bit surprising to me, since exclusivity is about banning overlapping read/write accesses, but overlapping read/read accesses are allowed. That would indicate that _read coroutines should be able to have at least as wide a scope as _modify when there are no mutations in play.

Yeah, this is why I brought it up in the manifesto thread; I was wondering if we might need something like this. If we could tackle this at the SIL level, that would be great, but otherwise it might be worth thinking about extending ref variables to computed properties.

Thanks again for looking into it!