[SIL] Deconstructing partial_apply

Last week, I put some effort into trying to reduce the power of SIL's infamous partial_apply instruction. The instruction was intended to make representation and optimization of closures easy by avoiding concretization of the closure type and layout, but in practice, the complexity of dealing with ARC and the differing ownership needs for closure objects vs normal function arguments makes the abstraction unwieldy to understand and work with. As we look into adding new features that rely on static analysis and optimization of SIL, such as the borrow model, partial_apply is going to be a continuing drag on development. I have a work-in-progress branch to make SILGen generate at least some closures by explicit box allocation and initialization:

https://github.com/apple/swift/pull/4565

My idea is to get to a point where all partial_apply does is bind the thin invocation function to its context object to build the "thick" function value. I'm using the existing infrastructure for @box types in SIL to represent a closure allocation, using tuples to aggregate multiple captures. This is enough to get a start on the project, but there a few limitations and inefficiencies to this approach, and I wanted some feedback on how to tackle these problems:

(A) Aggregating contexts as tuples imposes some unnecessary abstraction limitations on capture layout, and would potentially force otherwise unnecessary metadata instantiation for tuple types. Tuples need to tolerate potentially being accessed as fully generic (T, U, V) types (and possibly variadic (T...) types, should we add variadics in the future), which limits the layout optimizations we can potentially do with them. Closure contexts don't need to support generic abstraction, since the details of layout only need to be known by the closure's invocation function and its enclosing scope, so we should have full freedom for layout optimization like we have with structs and classes.
(B) Boxed tuples can't represent captured type information for forming closures in generic contexts. Closure contexts need to be able to provide type metadata to the invocation function for execution of generic code and layout of the context object itself.

One idea I'm entertaining is, instead of relying on structural box types, to introduce "nominal" box layouts, which can give an identity to a particular closure layout, thereby addressing problem (A), and can potentially carry a generic signature to represent captured type metadata, addressing (B). For example, in code like this:

  var escaper: () -> ()
  func foo(x: Bool, y: Bool) {
    escaper = { _ = x; _ = y }
  }

We'd emit something like this SIL:

  // Closure layout for closure inside foo
  sil_box_layout foo.1 {
    let x: Bool
    let y: Bool
  }

  sil @foo : $(Bool, Bool) -> () {
  entry(%x : $Bool, %y : $Bool):
    // Allocate a box with the layout for this closure
    %context = alloc_box foo.1
    // Initialize the members
    %context.x = project_box %context, #x
    store %x to %context.x
    %context.y = project_box %context, #y
    store %y to %closure.y
    // Bind to the invocation function
    %closure = partial_apply @foo.1(%closure)
    assign %closure to @escaper
    return
  }

  // Invocation function for the closure
  sil @foo.1 : $(@box foo.1) -> () {
  entry(%context : $@box foo.1):
    // Load captured locals out of context
    %context.x = project_box %context, #x
    %x = load %context.x
    %context.y = project_box %context, #y
    %y = load %context.y
    // do nothing
    release %context
    return
  }

The unique box layout gives us license to pack those bools into a single word, or a tagged pointer, or do other fancy layout optimizations. For generic closures, a box layout could be generic, so that something like this:

  func bar<T>(x: T, y: T) {
    escaper = { _ = x; _ = y }
  }

lowers to:

  // Closure layout for closure inside bar, which captures type info
  sil_closure_layout bar.1<A> {
    let x: T
    let y: T
  }

  sil @bar : $<B> (B, B) -> () {
  entry(%x : $*B, %y : $*B):
    // Allocate a box with the layout for this closure, bound to this generic context
    %context = alloc_box bar.1<B>
    // Initialize the members
    %context.x = project_box %context, #x
    copy_addr %x to [init] %context.x
    %context.y = project_box %context, #y
    copy_addr %y to [init] %closure.y
    // Bind to the invocation function
    %closure = partial_apply @bar.1<B>(%closure)
    assign %closure to @escaper
    return
  }
  
  sil @bar.1 : $<C> (@box bar.1<C>) { ... }

As a hack, we can guarantee in the polymorphic convention that a generic box context parameter acts as a type metadata source for its generic parameters, similar to what we do with 'self' in class and protocol method signatures. Having an opening instruction might be a better long-term solution, though, and could make generic boxes also useful for representing copy-on-write existentials and GADT enum payloads. This is just a rough sketch of an idea; I'm open to other approaches. We've talked about tracking the SIL-level layout of structs, enums, and classes in SIL separate from the Swift level, so representing box layouts concretely feels consistent with that potential direction. OTOH, at this point, we'd be constructing a lot of SIL data structures for every closure.

-Joe

Last week, I put some effort into trying to reduce the power of SIL's infamous partial_apply instruction. The instruction was intended to make representation and optimization of closures easy by avoiding concretization of the closure type and layout, but in practice, the complexity of dealing with ARC and the differing ownership needs for closure objects vs normal function arguments makes the abstraction unwieldy to understand and work with. As we look into adding new features that rely on static analysis and optimization of SIL, such as the borrow model, partial_apply is going to be a continuing drag on development. I have a work-in-progress branch to make SILGen generate at least some closures by explicit box allocation and initialization:

(WIP) Curtail the power of partial_apply by jckarter · Pull Request #4565 · apple/swift · GitHub

My idea is to get to a point where all partial_apply does is bind the thin invocation function to its context object to build the "thick" function value.

Hooray!

I'm using the existing infrastructure for @box types in SIL to represent a closure allocation, using tuples to aggregate multiple captures. This is enough to get a start on the project, but there a few limitations and inefficiencies to this approach, and I wanted some feedback on how to tackle these problems:

(A) Aggregating contexts as tuples imposes some unnecessary abstraction limitations on capture layout, and would potentially force otherwise unnecessary metadata instantiation for tuple types. Tuples need to tolerate potentially being accessed as fully generic (T, U, V) types (and possibly variadic (T...) types, should we add variadics in the future), which limits the layout optimizations we can potentially do with them. Closure contexts don't need to support generic abstraction, since the details of layout only need to be known by the closure's invocation function and its enclosing scope, so we should have full freedom for layout optimization like we have with structs and classes.
(B) Boxed tuples can't represent captured type information for forming closures in generic contexts. Closure contexts need to be able to provide type metadata to the invocation function for execution of generic code and layout of the context object itself.

One idea I'm entertaining is, instead of relying on structural box types, to introduce "nominal" box layouts, which can give an identity to a particular closure layout, thereby addressing problem (A), and can potentially carry a generic signature to represent captured type metadata, addressing (B). For example, in code like this:

This makes a lot of sense to me.

One goal here should be to allow the box layout to be easily cloned and modified, e.g. to promote a capture to a value capture, or to eliminate a capture that's provably unused.

  var escaper: () -> ()
  func foo(x: Bool, y: Bool) {
    escaper = { _ = x; _ = y }
  }

We'd emit something like this SIL:

  // Closure layout for closure inside foo
  sil_box_layout foo.1 {
    let x: Bool
    let y: Bool
  }

  sil @foo : $(Bool, Bool) -> () {
  entry(%x : $Bool, %y : $Bool):
    // Allocate a box with the layout for this closure
    %context = alloc_box foo.1
    // Initialize the members
    %context.x = project_box %context, #x
    store %x to %context.x
    %context.y = project_box %context, #y
    store %y to %closure.y
    // Bind to the invocation function
    %closure = partial_apply @foo.1(%closure)
    assign %closure to @escaper
    return
  }

  // Invocation function for the closure
  sil @foo.1 : $(@box foo.1) -> () {
  entry(%context : $@box foo.1):
    // Load captured locals out of context
    %context.x = project_box %context, #x
    %x = load %context.x
    %context.y = project_box %context, #y
    %y = load %context.y
    // do nothing
    release %context
    return
  }

The unique box layout gives us license to pack those bools into a single word, or a tagged pointer, or do other fancy layout optimizations. For generic closures, a box layout could be generic, so that something like this:

  func bar<T>(x: T, y: T) {
    escaper = { _ = x; _ = y }
  }

lowers to:

  // Closure layout for closure inside bar, which captures type info
  sil_closure_layout bar.1<A> {
    let x: T
    let y: T
  }

  sil @bar : $<B> (B, B) -> () {
  entry(%x : $*B, %y : $*B):
    // Allocate a box with the layout for this closure, bound to this generic context
    %context = alloc_box bar.1<B>
    // Initialize the members
    %context.x = project_box %context, #x
    copy_addr %x to [init] %context.x
    %context.y = project_box %context, #y
    copy_addr %y to [init] %closure.y
    // Bind to the invocation function
    %closure = partial_apply @bar.1<B>(%closure)
    assign %closure to @escaper
    return
  }
  
  sil @bar.1 : $<C> (@box bar.1<C>) { ... }

As a hack, we can guarantee in the polymorphic convention that a generic box context parameter acts as a type metadata source for its generic parameters, similar to what we do with 'self' in class and protocol method signatures. Having an opening instruction might be a better long-term solution, though, and could make generic boxes also useful for representing copy-on-write existentials and GADT enum payloads.

I suspect we'd want both. We can require the generic signature of the context to match the applied function, just to make it easy.

John.

···

On Sep 6, 2016, at 3:36 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:

This is just a rough sketch of an idea; I'm open to other approaches. We've talked about tracking the SIL-level layout of structs, enums, and classes in SIL separate from the Swift level, so representing box layouts concretely feels consistent with that potential direction. OTOH, at this point, we'd be constructing a lot of SIL data structures for every closure.

-Joe
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev