Advice dealing with SILGen for switch statements w/ multiple binding patterns


(Greg Titus) #1

Hey All,

I’m not sure what to try next here, and could use some advice. For some time — off and on as I have time to get back to it — I’ve been working on switch statements with multiple binding patterns, e.g.:

enum A<T> {
  case left(a: T, b: T)
  case right(a: T, b: T)
  
  var foo: T {
    switch self {
    case let .left(a, _), let .right(_, a):
      return a
    }
  }
}

At some point I’d love to also get to allowing fall through between case blocks with bound variables as well, but first things first. My initial approach was to construct the body of these case blocks as a single basic block with incoming arguments for each bound variable, e.g.:

bb0:
  switch_enum : $A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_extract $(a: T, b: T), 0
  br bb3(%a1)
bb2:
  %a2 = tuple_extract $(a: T, b: T), 1
  br bb3(%a2)
bb3(%a: $T):
  // use %a

This is great unless T is an address-only type (like an unknown generic, as it is in this example code). A stack allocation happens, the actual value is copied in, and then (in bb1 and bb2) that stack value was deallocated, which means passing the value to bb3 breaks the memory management with use-after-free, ick. This was bug SR-2555.

So my next attempt was to pass the address of the value for address-only types:

bb0:
  switch_enum_address : $*A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_element_addr $(a: T, b: T), 0
  %s1 = alloc_stack $T
  copy_addr [take] %a1 to [initialization] %s1
  br bb3(%s1)
bb2:
  %a2 = tuple_element_addr $(a: T, b: T), 1
  %s2 = alloc_stack $T
  copy_addr [take] %a2 to [initialization] %s2
  br bb3(%s2)
bb3(%s: $*T):
  // use %s as address
  destroy_addr %s
  dealloc_stack %s

This requires careful ordering of the stack allocations in each pattern block so that the stack FILO invariant is maintained (in this example there is only one variable “a” but if we needed “b” instead of “_” then we’d have to be sure that each of bb1, bb2 alloc’d stack space for a and b in the same order even though they are in a different order in their respective tuples). But is doable.

Unfortunately, although the result looks like it might be valid SIL, this won’t work. I extended the SILVerifier to deal with stack allocations that were aliased due to being passed as parameters between basic blocks (so that it would know that %s1 and %s were the same and match the allocation of the former to the deallocation of the latter), and that makes SILGen happy.

But this causes IRGen to complain that in the bb3 phi “address has no container” while it is trying to generate IR for the dealloc_stack instruction. If anyone wants to see this the changes are here <https://github.com/gregomni/swift/tree/switch>.

So I backed off, for the moment, to some code which emits the case contents statement(s) multiple times, once for each pattern, as if the programmer had written:

  var foo: T {
    switch self {
    case let .left(a, _):
      return a
    case let .right(_, a):
      return a
    }
  }

This is okay for this example, but obviously isn’t ideal with larger block contents. That got merged to trunk in pull #5094 to fix the bug for now, and that’s where I am today.

Should I:

- Continue down the aliased alloc_stack path and figure out how to make IRGen happy with stack addresses passed as args to BBs? I am - thus far - unfamiliar with IRGen. Any guesses how difficult this would be?

- Hoist stack allocations out of the individual pattern BBs (promising looking in this example, but super messy with lots of cases with differing tuple contents)? This makes the setup easier - alloc_stack the superset of needed things at the top, but the control flow for cleanup potentially hard. Although it seems like SILGen’s Cleanups machinery might actually work for me instead of my having to fight it, as has unfortunately seemed the case so far.

- Try changing over to allocing a box for each address-only arg that I want to pass between BBs? Seems unfortunate to make heap allocations for this (could they be optimized out?), and I’m worried about whether I’d run into further problems continuing to try to ‘join’ multiple allocs to single deallocs this way.

- Use separate strategies for switches containing address-only pattern variables and those which don’t (so that the body block can be shared in the simpler cases but leave the address-only case alone)?

- I keep having the nagging feeling that I’m overlooking some easier solution here.

Any thoughts greatly appreciated!

Thanks,
  - Greg


(Joe Groff) #2

Hey All,

I’m not sure what to try next here, and could use some advice. For some time — off and on as I have time to get back to it — I’ve been working on switch statements with multiple binding patterns, e.g.:

enum A<T> {
  case left(a: T, b: T)
  case right(a: T, b: T)
  
  var foo: T {
    switch self {
    case let .left(a, _), let .right(_, a):
      return a
    }
  }
}

At some point I’d love to also get to allowing fall through between case blocks with bound variables as well, but first things first. My initial approach was to construct the body of these case blocks as a single basic block with incoming arguments for each bound variable, e.g.:

bb0:
  switch_enum : $A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_extract $(a: T, b: T), 0
  br bb3(%a1)
bb2:
  %a2 = tuple_extract $(a: T, b: T), 1
  br bb3(%a2)
bb3(%a: $T):
  // use %a

This is great unless T is an address-only type (like an unknown generic, as it is in this example code). A stack allocation happens, the actual value is copied in, and then (in bb1 and bb2) that stack value was deallocated, which means passing the value to bb3 breaks the memory management with use-after-free, ick. This was bug SR-2555.

So my next attempt was to pass the address of the value for address-only types:

bb0:
  switch_enum_address : $*A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_element_addr $(a: T, b: T), 0
  %s1 = alloc_stack $T
  copy_addr [take] %a1 to [initialization] %s1
  br bb3(%s1)
bb2:
  %a2 = tuple_element_addr $(a: T, b: T), 1
  %s2 = alloc_stack $T
  copy_addr [take] %a2 to [initialization] %s2
  br bb3(%s2)
bb3(%s: $*T):
  // use %s as address
  destroy_addr %s
  dealloc_stack %s

This requires careful ordering of the stack allocations in each pattern block so that the stack FILO invariant is maintained (in this example there is only one variable “a” but if we needed “b” instead of “_” then we’d have to be sure that each of bb1, bb2 alloc’d stack space for a and b in the same order even though they are in a different order in their respective tuples). But is doable.

Unfortunately, although the result looks like it might be valid SIL, this won’t work. I extended the SILVerifier to deal with stack allocations that were aliased due to being passed as parameters between basic blocks (so that it would know that %s1 and %s were the same and match the allocation of the former to the deallocation of the latter), and that makes SILGen happy.

But this causes IRGen to complain that in the bb3 phi “address has no container” while it is trying to generate IR for the dealloc_stack instruction. If anyone wants to see this the changes are here <https://github.com/gregomni/swift/tree/switch>.

So I backed off, for the moment, to some code which emits the case contents statement(s) multiple times, once for each pattern, as if the programmer had written:

  var foo: T {
    switch self {
    case let .left(a, _):
      return a
    case let .right(_, a):
      return a
    }
  }

This is okay for this example, but obviously isn’t ideal with larger block contents. That got merged to trunk in pull #5094 to fix the bug for now, and that’s where I am today.

Should I:

- Continue down the aliased alloc_stack path and figure out how to make IRGen happy with stack addresses passed as args to BBs? I am - thus far - unfamiliar with IRGen. Any guesses how difficult this would be?

This seems like the right thing to do. It shouldn't be too hard to set up IRGen to support this. In the same way you needed to extend SIL's verifier to handle joined stack allocations, you should be able to change BB arg emission in IRGen to plumb the "container" part of stack allocations through as another phi node when we see that all of the predecessor values of a bb arg are alloc_stacks.

-Joe

···

On Oct 22, 2016, at 10:32 AM, Greg Titus via swift-dev <swift-dev@swift.org> wrote:

- Hoist stack allocations out of the individual pattern BBs (promising looking in this example, but super messy with lots of cases with differing tuple contents)? This makes the setup easier - alloc_stack the superset of needed things at the top, but the control flow for cleanup potentially hard. Although it seems like SILGen’s Cleanups machinery might actually work for me instead of my having to fight it, as has unfortunately seemed the case so far.

- Try changing over to allocing a box for each address-only arg that I want to pass between BBs? Seems unfortunate to make heap allocations for this (could they be optimized out?), and I’m worried about whether I’d run into further problems continuing to try to ‘join’ multiple allocs to single deallocs this way.

- Use separate strategies for switches containing address-only pattern variables and those which don’t (so that the body block can be shared in the simpler cases but leave the address-only case alone)?

- I keep having the nagging feeling that I’m overlooking some easier solution here.

Any thoughts greatly appreciated!

Thanks,
  - Greg

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev


(Greg Titus) #3

Excellent, that was what I needed - some indication that I wasn’t completely off-base and doing something that ought not be done. :slight_smile:

Thanks!
  - Greg

···

On Oct 24, 2016, at 11:17 AM, Joe Groff <jgroff@apple.com> wrote:

On Oct 22, 2016, at 10:32 AM, Greg Titus via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:

Hey All,

I’m not sure what to try next here, and could use some advice. For some time — off and on as I have time to get back to it — I’ve been working on switch statements with multiple binding patterns, e.g.:

enum A<T> {
  case left(a: T, b: T)
  case right(a: T, b: T)
  
  var foo: T {
    switch self {
    case let .left(a, _), let .right(_, a):
      return a
    }
  }
}

At some point I’d love to also get to allowing fall through between case blocks with bound variables as well, but first things first. My initial approach was to construct the body of these case blocks as a single basic block with incoming arguments for each bound variable, e.g.:

bb0:
  switch_enum : $A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_extract $(a: T, b: T), 0
  br bb3(%a1)
bb2:
  %a2 = tuple_extract $(a: T, b: T), 1
  br bb3(%a2)
bb3(%a: $T):
  // use %a

This is great unless T is an address-only type (like an unknown generic, as it is in this example code). A stack allocation happens, the actual value is copied in, and then (in bb1 and bb2) that stack value was deallocated, which means passing the value to bb3 breaks the memory management with use-after-free, ick. This was bug SR-2555.

So my next attempt was to pass the address of the value for address-only types:

bb0:
  switch_enum_address : $*A, case .left: bb1, case .right: bb2
bb1:
  %a1 = tuple_element_addr $(a: T, b: T), 0
  %s1 = alloc_stack $T
  copy_addr [take] %a1 to [initialization] %s1
  br bb3(%s1)
bb2:
  %a2 = tuple_element_addr $(a: T, b: T), 1
  %s2 = alloc_stack $T
  copy_addr [take] %a2 to [initialization] %s2
  br bb3(%s2)
bb3(%s: $*T):
  // use %s as address
  destroy_addr %s
  dealloc_stack %s

This requires careful ordering of the stack allocations in each pattern block so that the stack FILO invariant is maintained (in this example there is only one variable “a” but if we needed “b” instead of “_” then we’d have to be sure that each of bb1, bb2 alloc’d stack space for a and b in the same order even though they are in a different order in their respective tuples). But is doable.

Unfortunately, although the result looks like it might be valid SIL, this won’t work. I extended the SILVerifier to deal with stack allocations that were aliased due to being passed as parameters between basic blocks (so that it would know that %s1 and %s were the same and match the allocation of the former to the deallocation of the latter), and that makes SILGen happy.

But this causes IRGen to complain that in the bb3 phi “address has no container” while it is trying to generate IR for the dealloc_stack instruction. If anyone wants to see this the changes are here <https://github.com/gregomni/swift/tree/switch>.

So I backed off, for the moment, to some code which emits the case contents statement(s) multiple times, once for each pattern, as if the programmer had written:

  var foo: T {
    switch self {
    case let .left(a, _):
      return a
    case let .right(_, a):
      return a
    }
  }

This is okay for this example, but obviously isn’t ideal with larger block contents. That got merged to trunk in pull #5094 to fix the bug for now, and that’s where I am today.

Should I:

- Continue down the aliased alloc_stack path and figure out how to make IRGen happy with stack addresses passed as args to BBs? I am - thus far - unfamiliar with IRGen. Any guesses how difficult this would be?

This seems like the right thing to do. It shouldn't be too hard to set up IRGen to support this. In the same way you needed to extend SIL's verifier to handle joined stack allocations, you should be able to change BB arg emission in IRGen to plumb the "container" part of stack allocations through as another phi node when we see that all of the predecessor values of a bb arg are alloc_stacks.

-Joe

- Hoist stack allocations out of the individual pattern BBs (promising looking in this example, but super messy with lots of cases with differing tuple contents)? This makes the setup easier - alloc_stack the superset of needed things at the top, but the control flow for cleanup potentially hard. Although it seems like SILGen’s Cleanups machinery might actually work for me instead of my having to fight it, as has unfortunately seemed the case so far.

- Try changing over to allocing a box for each address-only arg that I want to pass between BBs? Seems unfortunate to make heap allocations for this (could they be optimized out?), and I’m worried about whether I’d run into further problems continuing to try to ‘join’ multiple allocs to single deallocs this way.

- Use separate strategies for switches containing address-only pattern variables and those which don’t (so that the body block can be shared in the simpler cases but leave the address-only case alone)?

- I keep having the nagging feeling that I’m overlooking some easier solution here.

Any thoughts greatly appreciated!

Thanks,
  - Greg

_______________________________________________
swift-dev mailing list
swift-dev@swift.org <mailto:swift-dev@swift.org>
https://lists.swift.org/mailman/listinfo/swift-dev