[AutoDiff] Need assistance with debugging the compiler

philipturner · January 16, 2022, 1:11am

Please reserve this thread for conversation between me and those with expertise on low-level aspects of the Swift compiler. I'm making a new Swift Forums post because I reached the limit of three consecutive replies and can't post further comments on Differentiable programming for gradient-based machine learning - #146. This thread is also about a series of bugs that appeared in December 2021, which would diverge from the main conversation there.

@dan-zheng @rxwei @Brad_Larson Is autodiff disabled for anything besides floating-point types? I'm at a loss for why gradient(at:of:) requires the closure's output to conform to FloatingPoint, as reported by Xcode autocomplete. The Swift Standard Library source code (link) makes no mention of that requirement. This restriction makes it difficult to reproduce the conditions from S4TF that cause compiler crashes in the newest toolchains.

Attempt at simultaneous control flow and mutation

import _Differentiation

struct DiffData: Differentiable {
    var x: Float
    var y: Float
    var z: Float
    
    @differentiable(reverse)
    func fooed(_ input: Float) -> DiffData {
        var copy = self
        copy.x *= input // Mutation is here
        copy.y *= input
        copy.z *= input
        return copy
    }
}

@differentiable(reverse)
func foowrapper(_ input: DiffData, argument: Float) -> DiffData {
    if argument > 0.5 { // Control flow is here
        return input
    } else {
        return input.fooed(argument)
    }
}

let myData: DiffData = .init(x: 6, y: 7, z: 8)
/// compile error: Global function 'gradient(at:of:)' requires that 'DiffData' conform to 'FloatingPoint'
let grad = gradient(at: myData) { myData in 
    foowrapper(myData, argument: 3)
}

How does S4TF even compile? Tensor doesn't conform to FloatingPoint, yet it is used as the output of differentiable functions. I faced this problem when making the iOS differentiation demo, where it wouldn't let me differentiate SIMD3<Float> because that doesn't "conform to FloatingPoint". I had to pass in just one component of it, which is why the sample code is getLocationY and not getLocation. It's almost impossible to use autodiff when the only things I can differentiate are Float and Double and not custom data types. The 5.5 branch (what differentiation-ios-demo used) diverged from main months ago, so this isn't new and I probably can't get around it by switching toolchains. Yet, S4TF compiled on a toolchain created after 5.5.

Also, I'm at a loss for why this won't compile in Xcode (January 9, 2022 toolchain). I'm not even modifying the input, which is the only active value:

import _Differentiation

@differentiable(reverse)
func myFunc<T: Differentiable & AdditiveArithmetic>(_ input: T) -> T {
    var inputCopy = input
    inputCopy += input /// error: Expression is not differentiable
    return input
}

It's effectively the same as this, which does compile:

@differentiable(reverse)
func myFunc<T: Differentiable & AdditiveArithmetic>(_ input: T) -> T {
    return input
}

The pull request "[AutoDiff] Fix crasher when type-checking mismatched derivative." (apple/swift #40347, by rxwei) caused S4TF to break in the 2021-12-02 toolchain. The PR broke so many functions in S4TF that I'm surprised it passed the tests you used to validate it. Workarounds are no longer viable the way they were for BatchNorm, which is why I'm now deciding to fix the compiler myself. Shouldn't building S4TF successfully be a prerequisite before [AutoDiff] commits are merged?

philipturner · January 16, 2022, 12:20pm

I found the requirement that the generic parameter R (the closure's output) must be floating-point. It sits in a where clause and is easy to miss.

@inlinable
public func gradient<T, R>(
  at x: T, of f: @differentiable(reverse) (T) -> R
) -> T.TangentVector
  where R : FloatingPoint, R.TangentVector == R { /// in this line
  return pullback(at: x, of: f)(R(1))
}
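
Given that constraint, one way to keep using gradient(at:of:) with a custom type as the input is to make the closure return a scalar. The following is a minimal sketch, assuming a toolchain that ships the _Differentiation module. The struct is redeclared so the snippet stands alone, and componentSum is a hypothetical helper that does not appear in the original post.

```swift
import _Differentiation

struct DiffData: Differentiable {
    var x: Float
    var y: Float
    var z: Float
}

// Hypothetical helper: collapses the struct to a single Float so that the
// closure's result type satisfies `R : FloatingPoint`.
@differentiable(reverse)
func componentSum(_ data: DiffData) -> Float {
    data.x + data.y + data.z
}

let myData = DiffData(x: 6, y: 7, z: 8)

// The result type is Float, so this overload of gradient(at:of:) applies.
// `grad` has type DiffData.TangentVector.
let grad = gradient(at: myData) { data in
    componentSum(data)
}
```

Each component contributes to the sum with a coefficient of 1, so every field of grad should be 1. This only sidesteps the constraint. It does not make a DiffData-valued closure acceptable to gradient(at:of:).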

I also found the differential operators that allow Tensor to be differentiated:

@inlinable
public func gradient<T, R>(
  at x: T,
  in f: @differentiable (T) -> Tensor<R>
) -> T.TangentVector where T: Differentiable, R: TensorFlowFloatingPoint {
  return valueWithGradient(at: x, in: f).1
}
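
The same pattern can be sketched for an arbitrary Differentiable output type by taking the seed as an explicit argument instead of hard-coding R(1). The helper below is hypothetical (it is not part of _Differentiation or S4TF), and the seed: label is an assumption chosen to avoid clashing with the existing overloads. Pair and makePair exist only for illustration.

```swift
import _Differentiation

// Hypothetical helper: like gradient(at:of:), but the caller supplies the
// cotangent ("seed") that is fed to the pullback.
func gradient<T: Differentiable, R: Differentiable>(
    at x: T,
    seed: R.TangentVector,
    of f: @differentiable(reverse) (T) -> R
) -> T.TangentVector {
    pullback(at: x, of: f)(seed)
}

// Illustration-only output type that does not conform to FloatingPoint.
struct Pair: Differentiable {
    var a: Float
    var b: Float
}

@differentiable(reverse)
func makePair(_ t: Float) -> Pair {
    Pair(a: 2 * t, b: t * t)
}

// A seed of all ones sums the derivatives of both components:
// d(2t)/dt + d(t*t)/dt = 2 + 2t.
let g = gradient(at: Float(3), seed: Pair.TangentVector(a: 1, b: 1), of: makePair)
```

At t = 3 this evaluates to 2 + 6 = 8. Making the seed explicit keeps it clear that a "gradient" of a non-scalar output is always taken relative to some chosen cotangent.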

porterchild · January 19, 2022, 3:02am

You can call pullback(at:of:) and pass the resulting function a "unit" vector (all ones) of the function's output TangentVector type. That reproduces what gradient(at:of:) does automatically for a function whose output conforms to FloatingPoint:

let pullbackFunction = pullback(at: myData) { myData in
    foowrapper(myData, argument: 3)
}
let unitVector = DiffData.TangentVector.init(x: 1, y: 1, z: 1)
let grad = pullbackFunction(unitVector)

Here's some more context. See also the autodiff tutorials, specifically part 3, or this section of the Differentiable Programming Manifesto.
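
To illustrate how the choice of seed vector affects the result, here is a minimal sketch using valueWithPullback(at:of:), assuming a toolchain that ships the _Differentiation module. The struct is redeclared so the snippet stands alone, and scaled is a hypothetical stand-in for the functions discussed above.

```swift
import _Differentiation

struct DiffData: Differentiable {
    var x: Float
    var y: Float
    var z: Float
}

// Hypothetical stand-in: multiplies every component by `factor`.
@differentiable(reverse)
func scaled(_ data: DiffData, by factor: Float) -> DiffData {
    DiffData(x: data.x * factor, y: data.y * factor, z: data.z * factor)
}

let myData = DiffData(x: 6, y: 7, z: 8)

// Returns the function's value together with its pullback.
let (value, pullbackFunction) = valueWithPullback(at: myData) { data in
    scaled(data, by: 3)
}

// A seed of all ones accumulates the contribution of every output component.
let allOnes = pullbackFunction(DiffData.TangentVector(x: 1, y: 1, z: 1))

// A seed with a single 1 isolates the derivative of one output component.
let onlyX = pullbackFunction(DiffData.TangentVector(x: 1, y: 0, z: 0))
```

Since each output component is 3 times the matching input component, allOnes should have 3 in every field, while onlyX should have 3 in x and 0 elsewhere.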

philipturner · January 19, 2022, 4:13am

I realized that after I made this post, but thanks for the autodiff tutorials. If I make them into Colab notebooks, will you and @Troy_Harvey make a GitHub repository for them? I haven’t seen anything from your group on GitHub yet, which is an area of concern.