Accessor coroutines: poor children?

asl · August 31, 2023, 7:38pm

We are currently working on implementing autodiff for _modify / _read accessors, and we have run into some issues that seem generic enough to bring here and ask for guidance or ideas.

Currently, the lack of autodiff support for subscript _modify / _read accessors is a major UX and performance blocker for any autodiff user code that uses e.g. Array or Dictionary from the Swift standard library. Various workarounds are possible (see e.g. [SR-14113] Support `_read` and `_modify` accessor differentiation · Issue #54401 · apple/swift · GitHub), but they incur a lot of overhead: instead of accessing an Array directly, we have to avoid subscripts altogether (think of a hard requirement to change e.g. a[i] = x to something like a.updated(at: i, with: x) everywhere in user code).
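To illustrate the cost of that workaround, here is a minimal sketch in plain Swift. The name updated(at:with:) is the one mentioned above, but the body is only an assumption of what such a helper might look like. A real workaround would also need to be made differentiable, which is left out here so the snippet stays self-contained.

```swift
extension Array {
    /// Hypothetical helper (body assumed for illustration): returns a copy
    /// of the array with the element at `index` replaced by `newValue`.
    func updated(at index: Int, with newValue: Element) -> [Element] {
        var copy = self
        copy[index] = newValue
        return copy
    }
}

var a: [Float] = [1, 2, 3]

// What users want to write: an in-place update through the subscript.
a[1] = 20

// What the workaround forces instead: build a new array and reassign it.
a = a.updated(at: 2, with: 30)

print(a)
```

Every mutation site has to be rewritten in the second style, and each call creates a new array value instead of updating the existing one in place.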

These accessors are implemented as @yield_once SIL coroutines, but this seems to be more of an implementation detail than a higher-level concept. Let me describe the issues we're currently having:

1. There is no proper AST type corresponding to SIL coroutines. Modify accessors look like ordinary methods returning Void: () -> (inout S) -> (). The yielded values are added during SIL generation directly from the AccessorDecl. Why do we need this? We need to be able to store intermediate pullbacks in the tuple that records the execution flow (branch trace enums and pullbacks for callees in the corresponding BB), and we need an AST type for this. Currently we just unsafely cast things to and from void functions, but I'm not 100% sure this will work if some reabstraction is needed along the way.
2. We need to be able to partial_apply a closure. Why? Because we need to create a pullback closure that captures the pullback tuple (which records the execution flow and the intermediate values in the form of intermediate pullbacks). It seems possible to do this at the SIL level, but there is no LLVM IR codegen support for such an unusual case.
3. Both the VJP function and the pullback seem to be required to be @yield_once closures themselves. However, there is one subtle but important detail: the VJP function should return both a value and a pullback function. This is where the fun starts, as the VJP should actually yield a value but really return a pullback (as it should represent the backward execution flow from the return, via the yield, back to the original arguments; we can ignore the unwind part, as these things can never be aborted). Currently coroutines are not allowed to return anything. For now we have just hacked the VJPs so that the pullback is returned indirectly as an @inout argument. I believe someone downstream might be confused to see this.
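To make the two-phase control flow from the last point concrete, here is a small plain-Swift sketch with no autodiff involved. The Box type and its trace property are hypothetical names used only for illustration. It records what runs before and after the yield of a _modify accessor.

```swift
// Hypothetical type for illustration only.
struct Box {
    private var storage: Float = 1
    // Records the order in which the two halves of the accessor run.
    private(set) var trace: [String] = []

    init() {}

    var value: Float {
        get { storage }
        set { storage = newValue }
        _modify {
            trace.append("before yield")  // runs before the caller's mutation
            yield &storage                // the caller mutates storage in place here
            trace.append("after yield")   // runs once the caller is done
        }
    }
}

var box = Box()
box.value *= 3  // in-place mutation, goes through _modify

print(box.value, box.trace)
```

The accessor body is split around the yield, so a derivative of it has to describe both halves. That is why the VJP would want to yield a value and still hand back a pullback at the end.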

We currently have a prototype that can generate seemingly correct SIL for simple cases like this:

import _Differentiation

struct S: Differentiable {
    private var _x: Float

    var x: Float {
        get { _x }
        set(newValue) { _x = newValue }
        _modify { yield &_x }
    }

    init(_ x : Float) {
        self._x = x
    }
}

Fixing issue 2 above would allow us to get working LLVM IR, and this seems to be enough to support subscript modify accessors in Dictionary, but we are not 100% sure about Array.

Here is the main problem: many things might be special enough to require custom derivatives. Consider the following example:

struct Struct<T> {
  var x : T
  var computedProperty: T {
    get { x }
    set { x = newValue }
    _modify { yield &x }
  }
}

Normally we are able to register custom derivatives like this:

extension Struct where T: Differentiable & AdditiveArithmetic {
  @derivative(of: computedProperty.set)
  mutating func vjpPropertySetter(_ newValue: T) -> (
    value: (), pullback: (inout TangentVector) -> T.TangentVector
  ) {
    // do something here
  }
}

Note that the VJP function returns a pair consisting of the function value and a pullback. Here are the open questions:

The VJP itself should be a coroutine. How would we spell this? It seems to require allowing yield in ordinary functions, plus some additional syntax to show that something can be yielded as @inout.
The VJP should return a pullback. How would we specify this? What would its type be?
Usually, pullbacks returned from user VJP functions are closures. For example, here is the standard VJP for the multiply operation:

  @derivative(of: *)
  static func _vjpMultiply(
    lhs: Float, rhs: Float
  ) -> (value: Float, pullback: (Float) -> (Float, Float)) {
    return (lhs * rhs, { v in (rhs * v, lhs * v) })
  }
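For readers less familiar with this shape, here is the same function as a plain-Swift stand-in, called by hand. The free function vjpMultiply is a hypothetical name, and the @derivative attribute is dropped so that the snippet does not depend on differentiation support in the toolchain.

```swift
// Plain-Swift stand-in for the VJP above. Without @derivative, nothing is
// registered with the compiler, so we call it explicitly.
func vjpMultiply(
    _ lhs: Float, _ rhs: Float
) -> (value: Float, pullback: (Float) -> (Float, Float)) {
    return (lhs * rhs, { v in (rhs * v, lhs * v) })
}

// Forward pass: runs to completion and returns the value plus a closure.
let (value, pullback) = vjpMultiply(3, 4)

// Backward pass: feed in the incoming tangent to get one result per argument.
let (dLhs, dRhs) = pullback(1)

print(value, dLhs, dRhs)
```

The forward pass finishes before the pullback even exists, and the pullback is an ordinary closure. There is no point in this shape at which the function could suspend and hand control back to the caller partway through.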

How can we allow yields here?

Sorry for the long writeup :) I hope the problems are more or less clear. What would be the best way to deal with them from the language perspective? Are there perhaps other approaches, workarounds, or alternative solutions?

Tagging @Ben_Cohen @Brad_Larson