While working on an interesting application of differentiable programming, SwiftFusion (borglab/SwiftFusion on GitHub), I have run into some cases where it would be very useful to define custom derivatives of stored properties.
Therefore, I'd like to add the ability to do this.
Before doing so, I'd like to share the motivations and ask if anyone has any questions or comments.
Problem
Today, the differentiable programming implementation allows you to specify custom derivatives for computed properties:
struct Foo: Differentiable {
  var x: Float
  var xComputed: Float { x }

  @derivative(of: xComputed)
  func dXComputed() -> (value: Float, pullback: (Float) -> TangentVector) {
    return (x, { TangentVector(x: 2 * $0) })
  }
}
print(gradient(at: Foo(x: 0)) { $0.x })
// => TangentVector(x: 1.0)
print(gradient(at: Foo(x: 0)) { $0.xComputed })
// => TangentVector(x: 2.0)
but not stored properties:
struct Foo: Differentiable {
  var x: Float

  @derivative(of: x)
  func dX() -> (value: Float, pullback: (Float) -> TangentVector) {
    return (x, { TangentVector(x: 2 * $0) })
  }
}
// error: <Cell 5>:4:19: error: cannot register derivative for stored property 'x'
// @derivative(of: x)
// ^
Motivation: Consistency
Most language features treat stored properties indistinguishably from computed properties. The current implementation of the custom derivative feature is inconsistent with this.
Motivation: Real world use case
Consider this geometry calculation on the 2D plane:
/// A 2D rotation.
struct Rot2 {
  /// The cosine and sine of the rotation.
  var c, s: Float

  /// Returns `rhs`, rotated by `lhs`.
  static func * (_ lhs: Rot2, _ rhs: Vector2) -> Vector2 { ... }
}

/// Returns the distance between `target` and the end of a robot arm with two joints.
///
/// `r1`, `r2` are the joint angles.
func f(_ r1: Rot2, _ r2: Rot2, _ target: Vector2) -> Float {
  let armSegment = Vector2(1, 0)
  let handPosition = r1 * armSegment + r1 * r2 * armSegment
  return (handPosition - target).magnitude()
}
We might want the gradient of f with respect to r1 and r2 to do inverse kinematics. We can ask for that, and we'll get:
// Make `Rot2` differentiable with the default synthesized `TangentVector`.
extension Rot2: Differentiable {}
gradient(at: Rot2(c: 1, s: 0), Rot2(c: 1, s: 0)) { f($0, $1, Vector2(1, 1)) }
// => Rot2.TangentVector(c: ..., s: ...), Rot2.TangentVector(c: ..., s: ...)
Notice that Rot2.TangentVector has two fields corresponding to the c and s fields in Rot2. This is because the default synthesized TangentVector has one property per differentiable property in the original type.
However, there is another sensible TangentVector for 2D rotations. We can define it in Swift AD as follows:
extension Rot2: Differentiable {
  struct TangentVector: Differentiable, AdditiveArithmetic {
    /// Change in angle.
    var theta: Float
  }
}
This TangentVector is better for some applications because:
- it is more efficient to do computations with it (fewer numbers), and
- it has a nicer interpretation (it's the change in the angle of the rotation).
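To make the "change in angle" interpretation concrete, here is a minimal sketch in plain Swift that does not use the differentiation module. The type Rot2Sketch and the helper moved(byAngle:) are hypothetical names invented for this illustration, and Double is used instead of Float purely for convenience. It shows one way a single angle offset can be applied to a rotation stored as cosine and sine:

```swift
import Foundation

/// Stand-in for the post's `Rot2` (an assumption for illustration only).
struct Rot2Sketch {
  /// The cosine and sine of the rotation.
  var c, s: Double

  /// Hypothetical helper: the rotation reached after changing the angle by `dTheta`.
  func moved(byAngle dTheta: Double) -> Rot2Sketch {
    let dc = cos(dTheta), ds = sin(dTheta)
    // Angle-addition formulas for cos(a + b) and sin(a + b).
    return Rot2Sketch(c: c * dc - s * ds, s: s * dc + c * ds)
  }
}

let identity = Rot2Sketch(c: 1, s: 0)
let quarterTurn = identity.moved(byAngle: Double.pi / 2)

// The squared norm c^2 + s^2 stays (numerically) at 1 after the move.
print(quarterTurn.c * quarterTurn.c + quarterTurn.s * quarterTurn.s)
```

With a one-number tangent, every update is itself a rotation, so c and s remain a valid cosine and sine pair. Adding separate offsets to c and s would not guarantee that in general.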
However, Swift AD does not know how to compute derivatives using that TangentVector. To teach it, we must specify the derivatives of all the "primitive Rot2 operations", that is, the operations that all other functions on Rot2 are built from. The stored property accessors are the primitive operations of Rot2, so we would like to be able to specify:
extension Rot2 {
  @derivative(of: s)
  func dS() -> (value: Float, pullback: (Float) -> TangentVector) {
    return (s, { TangentVector(theta: c * $0) })
  }

  @derivative(of: c)
  func dC() -> (value: Float, pullback: (Float) -> TangentVector) {
    return (c, { TangentVector(theta: -s * $0) })
  }
}
To do this, we need to allow custom derivatives of the stored properties s and c.
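The factors c and -s in those pullbacks come from treating the rotation as a function of its angle: if c = cos(theta) and s = sin(theta), then ds/dtheta = c and dc/dtheta = -s. The following sketch checks this numerically with central finite differences. It is plain Swift without the differentiation module, and the function cosSin, the sample angle, and the step size are assumptions chosen for this illustration:

```swift
import Foundation

/// Cosine and sine of a rotation by `theta` radians
/// (hypothetical stand-ins for `Rot2.c` and `Rot2.s`).
func cosSin(_ theta: Double) -> (c: Double, s: Double) {
  return (c: cos(theta), s: sin(theta))
}

let theta = 0.7   // arbitrary sample angle
let h = 1e-6      // finite-difference step
let (c, s) = cosSin(theta)
let plus = cosSin(theta + h)
let minus = cosSin(theta - h)

// Central finite differences approximate the derivative with respect to theta.
let dS = (plus.s - minus.s) / (2 * h)
let dC = (plus.c - minus.c) / (2 * h)

print(abs(dS - c) < 1e-8)     // checks that ds/dtheta is close to c
print(abs(dC - (-s)) < 1e-8)  // checks that dc/dtheta is close to -s
```

A pullback multiplies an incoming scalar by these derivatives, which is why the accessor for s scales by c and the accessor for c scales by -s.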
In summary: when users define a custom TangentVector, they need to define custom derivatives for all "primitive operations" on the original type. Stored property accessors are common primitive operations. Therefore, users need a way to define custom derivatives of stored property accessors.
Here are more examples of types in SwiftFusion that require custom TangentVectors:
- Pose2 (2D translation and rotation, aka 2D rigid transformation)
- Rot3 (3D rotation) (not quite yet implemented)
- Pose3 (3D translation and rotation, aka 3D rigid transformation) (not quite yet implemented)
Proposed solution
Allow custom derivatives for stored properties.
Custom derivatives already work for computed properties, and stored properties are indistinguishable from computed properties to clients. Therefore, custom derivatives for stored properties can work exactly like custom derivatives for computed properties.
Future work
It's a bit inconvenient to define separate custom derivatives for each of your stored properties. It's also hard to explain to users what this means and why it is necessary.
It would be nice to make something that is easier to use and easier to explain. I'm trying out some ideas, and I might have a detailed forum post soon. I'm interested if anyone else has any ideas.