mtsrodrigues (Mateus Rodrigues) · post 1
The Differentiable Programming Manifesto is outdated and I can't find a way to make the Perceptron example compile. Can anyone help me? Thanks!
import _Differentiation

struct Perceptron: Differentiable {
    var weight: SIMD2<Float> = .random(in: -1..<1)
    var bias: Float = 0

    @differentiable(reverse)
    func callAsFunction(_ input: SIMD2<Float>) -> Float {
        (weight * input).sum() + bias // ❌ Expression is not differentiable
    }
}

var model = Perceptron()

let andGateData: [(x: SIMD2<Float>, y: Float)] = [
    (x: [0, 0], y: 0),
    (x: [0, 1], y: 0),
    (x: [1, 0], y: 0),
    (x: [1, 1], y: 1),
]

for _ in 0..<100 {
    let (loss, modelGradient) = valueWithGradient(at: model) { model -> Float in
        var loss: Float = 0
        for (x, y) in andGateData {
            let prediction = model(x)
            let error = y - prediction
            loss = loss + error * error / 2
        }
        return loss
    }
    print(loss)
    model.weight -= modelGradient.weight * 0.02
    model.bias -= modelGradient.bias * 0.02
}
mtsrodrigues (Mateus Rodrigues) · post 2
@Brad_Larson Could you help me here?
I think the problem here is that SIMD's sum() lacks a registered derivative, because we currently cannot register derivatives for @_alwaysEmitIntoClient functions. To work around that, you can define a wrapper function for .sum() and register a derivative for that, like temporarySum() in the following:
import _Differentiation

public extension SIMD2
where
    Self: Differentiable,
    Scalar: BinaryFloatingPoint & Differentiable,
    Scalar.TangentVector: BinaryFloatingPoint,
    TangentVector == Self
{
    // Wrapper around the standard library's sum(), which a derivative can be attached to.
    @inlinable
    func temporarySum() -> Scalar {
        return self.sum()
    }

    // Custom reverse-mode derivative (VJP): the pullback of a sum broadcasts the
    // incoming scalar cotangent back across every lane of the vector.
    @inlinable
    @derivative(of: temporarySum)
    func _vjpTemporarySum() -> (
        value: Scalar, pullback: (Scalar.TangentVector) -> TangentVector
    ) {
        return (temporarySum(), { v in Self(repeating: Scalar(v)) })
    }
}
struct Perceptron: Differentiable {
    var weight: SIMD2<Float> = .random(in: -1..<1)
    var bias: Float = 0

    @differentiable(reverse)
    func callAsFunction(_ input: SIMD2<Float>) -> Float {
        (weight * input).temporarySum() + bias
    }
}

var model = Perceptron()

let andGateData: [(x: SIMD2<Float>, y: Float)] = [
    (x: [0, 0], y: 0),
    (x: [0, 1], y: 0),
    (x: [1, 0], y: 0),
    (x: [1, 1], y: 1),
]

for _ in 0..<100 {
    let (loss, modelGradient) = valueWithGradient(at: model) { model -> Float in
        var loss: Float = 0
        for (x, y) in andGateData {
            let prediction = model(x)
            let error = y - prediction
            loss = loss + error * error / 2
        }
        return loss
    }
    print(loss)
    model.weight -= modelGradient.weight * 0.02
    model.bias -= modelGradient.bias * 0.02
}
The above builds and runs for me in current nightly toolchain snapshots.
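For a quick sanity check after the training loop (a minimal sketch, not part of the original post; the exact numbers depend on the random initial weights), you can print the trained model's predictions for the four AND-gate inputs and confirm they have moved toward the targets:

// Illustrative check appended after the training loop above: predictions should
// end up near 0 for the first three inputs and near 1 for (1, 1).
for (x, y) in andGateData {
    print("input: \(x)  target: \(y)  prediction: \(model(x))")
}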
Have you tried matmul with _Differentiation to implement the training loop of a deep learning model?
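In case it helps, here is a minimal sketch of extending the same idea to a matrix-vector product: register a manual reverse-mode derivative for the whole operation instead of asking the compiler to differentiate through sum(). Matrix2x2, matvec, and _vjpMatvec are illustrative names made up for this sketch, not code from this thread or from the standard library:

import _Differentiation

// A tiny fixed-size matrix built from two SIMD2 rows (hypothetical type for illustration).
struct Matrix2x2: Differentiable {
    var row0: SIMD2<Float>
    var row1: SIMD2<Float>
}

// Matrix-vector product. The body uses sum(), so we register a manual derivative
// below rather than relying on the compiler to differentiate it.
func matvec(_ m: Matrix2x2, _ x: SIMD2<Float>) -> SIMD2<Float> {
    SIMD2((m.row0 * x).sum(), (m.row1 * x).sum())
}

// Reverse-mode derivative (VJP) for matvec: for y = M x and cotangent v,
// the gradient w.r.t. M is the outer product of v and x, and the gradient
// w.r.t. x is Mᵀ v.
@derivative(of: matvec)
func _vjpMatvec(_ m: Matrix2x2, _ x: SIMD2<Float>) -> (
    value: SIMD2<Float>,
    pullback: (SIMD2<Float>) -> (Matrix2x2.TangentVector, SIMD2<Float>)
) {
    (matvec(m, x), { v in
        let dm = Matrix2x2.TangentVector(row0: x * v[0], row1: x * v[1])
        let dx = m.row0 * v[0] + m.row1 * v[1]
        return (dm, dx)
    })
}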