Differentiable programming for gradient-based machine learning

But every argument you've given against move(by:) applies equally to offset(by:). I never loved “move,” but there's nothing manifold-specific about it: it's an ordinary, everyday English word. Combined with the fact that it's shorter than offset and has an unambiguous part of speech, it seems clearly better than offset to me (this is exactly the thought process I went through when making my first post, BTW).

The point here was not to reconsider the protocol name, which is fine on its own, but to create a protocol that allows common math types to be used with different manifolds. As @scanon has pointed out, for many of the types we'd like to differentiate there is no single implied manifold, and this “move” operation we're trying to name would have to be implemented differently for each manifold. As soon as someone makes a conformance of Matrix2x2&lt;Float&gt; to Differentiable and uses += to implement move, you can't use that matrix type to represent rotations of R², because the meaning of move has been locked to the type; instead you'd need to create a new wrapper type around it.
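To make the lock-in concrete, here is a minimal sketch. All of these definitions are illustrative stand-ins, not the actual Differentiable design: a cut-down protocol, a toy Matrix2x2, and a hypothetical Rotation2D wrapper.

```swift
// Illustrative stand-ins only; not the real stdlib API.
struct Matrix2x2<Scalar: FloatingPoint> {
    var a, b, c, d: Scalar
    static func += (lhs: inout Self, rhs: Self) {
        lhs.a += rhs.a; lhs.b += rhs.b; lhs.c += rhs.c; lhs.d += rhs.d
    }
}

// Simplified stand-in for the Differentiable protocol under discussion.
protocol Differentiable {
    associatedtype TangentVector
    mutating func move(by offset: TangentVector)
}

// Once this conformance exists, `move` on Matrix2x2 *means* flat,
// element-wise translation in R⁴ everywhere the type appears...
extension Matrix2x2: Differentiable {
    typealias TangentVector = Matrix2x2
    mutating func move(by offset: Matrix2x2) { self += offset }
}

// ...so to treat 2x2 matrices as rotations (moving along SO(2) rather
// than translating in R⁴), you'd have to introduce a wrapper type:
struct Rotation2D<Scalar: FloatingPoint> {
    var matrix: Matrix2x2<Scalar>
    // move(by:) here would apply a rotation update instead of +=.
}
```

The point is that the one conformance per type limit is what forces the wrapper: the value type, not the use site, picks the manifold.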

But that is all to the good when you consider problems like the confusion of seeing move in code completion for Float, or of it appearing in code as (3.0).move(by: 2).

IMO you're giving up a bit too quickly on the idea of separating the manifold from the differentiable type. It may turn out to be the wrong choice, but it needs some serious thinking through in the context of real use cases.
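One shape that separation could take (a sketch under assumed names, not a proposal) is to put the operation on a manifold type rather than on the value type, so the same point representation can participate in several manifolds:

```swift
// Hypothetical: the manifold, not the value type, owns the operation.
protocol Manifold {
    associatedtype Point
    associatedtype TangentVector
    static func move(_ point: inout Point, by offset: TangentVector)
}

// A plain vector-space manifold over Double.
enum EuclideanLine: Manifold {
    static func move(_ point: inout Double, by offset: Double) {
        point += offset
    }
}

// The same Double can also be an angle on the unit circle, where
// moving wraps around rather than translating along R.
enum Circle: Manifold {
    static func move(_ point: inout Double, by offset: Double) {
        point = (point + offset).truncatingRemainder(dividingBy: 2 * .pi)
    }
}
```

No wrapper type is needed here: Double serves as the point type of both manifolds, and the choice of geometry is made where move is called.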

I suppose it's also worth asking whether manifolds should be dynamically parameterized, so you'd create a manifold instance containing the parameters, and use regular methods on it:
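Something like the following, perhaps (a purely hypothetical sketch; Sphere and its members are invented for illustration). The dynamic parameter here is the sphere's radius, which lives on the manifold instance rather than inside every point:

```swift
import Foundation  // for cos

// Hypothetical: manifold parameters are runtime values on an instance,
// so they don't have to be stored in every point or tangent vector.
struct Sphere {
    var radius: Double  // the dynamic parameter

    // A point on the sphere, as (latitude, longitude) in radians.
    struct Point { var lat, lon: Double }

    // Move a point by a tangent offset given as surface distances.
    func move(_ point: inout Point, by offset: (dLat: Double, dLon: Double)) {
        point.lat += offset.dLat / radius
        point.lon += offset.dLon / (radius * cos(point.lat))
    }
}
```

Because the radius is read from the Sphere instance at each call, Point stays a bare pair of Doubles; the parameters are paid for once per manifold, not once per value.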

This approach would of course further increase the burden of use, but we should ask whether there are important use cases that need this capability, because the space cost of carrying the dynamic parameters around inside each instance of the differentiable or tangent-vector types might otherwise be prohibitive. IMO differentiable programming will always be something of an expert-level feature, so it can bear a slightly higher ergonomic cost if that enables important applications.
