Ongoing work on differentiable Swift

Sorry for the slow reply; I was on the road for a bit. Glad to hear about the interest.

There are somewhat different concerns when talking about the performance and robustness of differentiable Swift as a language feature itself, versus higher-level frameworks built upon it. A higher-level framework oriented around a Tensor-like type that dispatches to an accelerator, or that has optimized vector operations, might have different performance bottlenecks and may only use a subset of the language features differentiable Swift provides. The overhead we're trying to eliminate at the language level, in the backwards pass of compiled differentiable Swift functions, might be trivial compared to the overhead of preparing and dispatching big parallel operations.
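To make the language-level part concrete, here's a minimal sketch of my own (not tied to any framework) using the standard `_Differentiation` module: the compiler synthesizes the backwards pass for an ordinary Swift function, and that synthesized code is where the overhead I mentioned lives.

```swift
import _Differentiation

// An ordinary Swift function, marked as differentiable.
@differentiable(reverse)
func loss(_ x: Double) -> Double {
    x * x + 2.0 * x
}

// The compiler generates the reverse-mode backwards pass;
// no tensor framework or runtime graph is involved.
let dLossDx = gradient(at: 3.0, of: loss)
print(dLossDx) // 2 * 3 + 2 = 8
```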

For traditional CNNs, your performance will probably be determined by how well data can be batched for parallel operations, how well the available hardware can be utilized, whether operations can be fused, whether memory can be re-used, and so on. Established frameworks have put a lot of time and effort into those parts of the process, and they can be hard to beat in their sweet spot of traditional neural networks.

Differentiable Swift shines in situations that need a fast host language, tremendous flexibility, and/or the ability to blur the lines between application logic and machine learning code. If a custom operation would let you perform a calculation much faster, you can simply write one in Swift (see the sketch below) instead of waiting for a framework's authors to build and deploy it. We've seen instances on mobile where that was a huge win over existing solutions. At PassiveLogic, differentiable Swift is absolutely vital to getting great performance at the edge for our physics-based simulations: in one internal comparison on a representative simulation, it was roughly 117X faster than PyTorch and 189X faster than TensorFlow. Our code seamlessly mixes application logic in Swift with optimization processes.
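As an illustration of what writing your own operation looks like, here's a sketch: `fastSigmoid` is a hypothetical custom op I made up for this example, and `@derivative(of:)` is the standard registration mechanism that lets it compose with any other differentiable code.

```swift
import Foundation
import _Differentiation

// A hypothetical custom operation, written directly in Swift
// instead of waiting for a framework to ship an equivalent.
func fastSigmoid(_ x: Double) -> Double {
    1.0 / (1.0 + exp(-x))
}

// Registering a custom derivative makes the function usable
// anywhere differentiable Swift expects a differentiable op.
@derivative(of: fastSigmoid)
func fastSigmoidVJP(_ x: Double)
    -> (value: Double, pullback: (Double) -> Double)
{
    let y = fastSigmoid(x)
    return (value: y, pullback: { v in v * y * (1.0 - y) })
}

let g = gradient(at: 0.0, of: fastSigmoid)
print(g) // sigmoid'(0) = 0.25
```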

Also, while differentiable Swift can currently be used on iOS, Android, and SwiftWASM (I've deployed to all three), support on those platforms is what I would call "highly experimental" and may not pass your comfort threshold for everyday production use. It's certainly fun to play with, though.

That's a bit of a roundabout way of saying that I'm not sure whether replacing your existing TensorFlow or TFLite CNN model with something built on a framework layered on differentiable Swift would justify the rewrite work today. There are certainly cases where we've seen large performance advantages from building something in Swift that is outside the comfort zone of existing ML frameworks, but from what you describe, that may not be your situation.
