Status update on the Differentiable Swift language feature

It's been a while since our last update regarding Differentiable Swift! A lot has happened since then, and I'd like to give a quick overview of what we've been up to. Since our last update by Brad Larson, almost 70 pull requests related to Differentiation have been merged into swiftlang/swift, with another 9 PRs currently awaiting review!
These include performance improvements, bug fixes, and standard library additions, as well as new function signatures for which we can now register or generate derivatives.

For an approximation of the full list of PRs, take a look here

Notable improvements since last update

To summarize that list of bug fixes and performance improvements, here are some of the notable changes we've made over the last year:

Release toolchain stability

Previously, we relied on nightly snapshots to use the latest Differentiation features in the compiler. @clackary has done some great work tracking the quality and performance of Swift Differentiation across both main and release branches. Over the last few release cycles, we have been able to identify and resolve issues on release branches, making open source releases viable, and in fact the most stable option, for using Differentiation. We have adopted the latest open source releases for development and production.

Differentiable functions with multiple results

Thanks to this PR, we added a large collection of function signatures we can now define derivatives for. This enables differentiating:

  • functions with an inout parameter that also return a result
  • functions with multiple inout parameters
  • mutating functions that return a result
  • functions that return a tuple of results
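As a minimal sketch of the last bullet (the function names here are invented for illustration, and this requires a toolchain recent enough to include the changes above), a tuple-returning function can now be used inside a larger differentiable computation:

```swift
import _Differentiation

// Illustrative only: a differentiable function returning a tuple of results.
@differentiable(reverse)
func productAndSum(_ x: Double, _ y: Double) -> (Double, Double) {
    (x * y, x + y)
}

// Consume the tuple inside another differentiable function.
@differentiable(reverse)
func combined(_ x: Double, _ y: Double) -> Double {
    let (p, s) = productAndSum(x, y)
    return p + 2 * s
}

let g = gradient(at: 3.0, 4.0, of: combined)
// g == (6.0, 5.0): d/dx (xy + 2(x + y)) = y + 2, d/dy = x + 2
```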

Work towards differentiable coroutines

A series of PRs works towards adding Differentiation support to coroutines, with the ultimate goal of supporting _modify accessors. This will improve call sites that read from and write into types like Array.
Currently, the compiler can differentiate _modify accessors, but there is no way to register custom derivatives for them. This means code like Array.subscript._modify cannot be made differentiable yet, since supporting it relies on custom derivatives being present in the Differentiation module.
A final proof-of-concept PR that completes the last pieces of the puzzle is currently under review and discussion here.

Improved performance when differentiating through Array types

The compiler is now able to optimize differentiation with Array types much more aggressively, thanks to this PR. There are still more improvements to be discovered on this front, however!
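For context, here is a small sketch (with invented names) of the kind of Array code this optimization work targets, using only the public _Differentiation API:

```swift
import _Differentiation

// Differentiating a loop that reads elements through Array's subscript.
@differentiable(reverse)
func sumOfSquares(_ values: [Double]) -> Double {
    var total = 0.0
    for i in 0..<values.count {
        total += values[i] * values[i]
    }
    return total
}

let grad = gradient(at: [1.0, 2.0, 3.0], of: sumOfSquares)
// grad.base == [2.0, 4.0, 6.0]  (the derivative of Σ vᵢ² w.r.t. vᵢ is 2·vᵢ)
```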

Improved usability of Optional<T>

A lot of work has gone into improving differentiation of optionals, and Optional support is now much more complete.
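As a small sketch of what this enables (the names are invented, and exact coverage depends on your toolchain), an active value can now be routed through an Optional inside a differentiable function:

```swift
import _Differentiation

@differentiable(reverse)
func squareViaOptional(_ x: Double) -> Double {
    let boxed: Double? = x * x   // the active value passes through an Optional
    if let y = boxed {
        return y + 1
    }
    return 0
}

let g = gradient(at: 3.0, of: squareViaOptional)
// g == 6.0  (d/dx (x² + 1) = 2x at x = 3)
```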

Ongoing performance work

A lot of performance improvements have been made over the last year. This is an ongoing effort: we are currently working on another set of optimizations tailored to differentiable code.

Upcoming features (currently open PRs)

Along with bug fixes and other improvements, here are two PRs we are working on that we are very excited about:

Support for _modify accessors

There's a proof-of-concept PR by Anton Korobeynikov to add support for first-class coroutines in user code. The PR enables registration of custom derivatives for _modify accessors (since the derivative itself must be a coroutine). This will improve the user experience when using Differentiation with Array types, where we currently have to use a workaround:

@differentiable(reverse)
func setValue(array: inout [Double], value: Double) {
    array[2] = 6.0 // not differentiable yet, will be with the above mentioned patch!
}

// instead we have to use
@differentiable(reverse)
func setValue(array: inout [Double], value: Double) {
    array.updated(at: 2, with: 6.0) // is differentiable (defined in our swift-differentiation package)
}

The workaround is both less performant and less readable/familiar.

Custom derivatives for functions marked with @_alwaysEmitIntoClient

This PR by Daniil Kovalev introduces support for adding custom derivatives to functions marked with @_alwaysEmitIntoClient. This enables, for example, differentiation of more methods on SIMD types from the standard library.

Open source packages

We've started to collect and open source some useful extensions to the Differentiable Swift API (as currently available in Swift 6.1) under the differentiable-swift organization. The goal is for these packages to be a slightly more flexible way to battle-test certain implementations before upstreaming them to the standard library or their respective repositories.

We currently have three packages that each extend a preexisting library in their focused direction:

swift-differentiation (extends the standard library)

A collection of useful extensions to Swift Differentiation. This repository extends the current implementation of Differentiation to give users access to more differentiable methods than the Swift standard library currently provides. It also contains workarounds for some methods that are not currently differentiable due to missing support in the language itself.

swift-numerics-differentiable (extends swift-numerics)

This package exports the swift-numerics products while also adding derivatives to many of the methods provided by the original package. This mainly extends Float, Double, and some of the SIMD types with derivatives for the functions provided by the ElementaryFunctions protocol and RealFunctions protocol.
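If you want to try it, the dependency can be declared in a package manifest roughly like this. Note that the repository URL, version, and product name below are assumptions based on the organization name mentioned above, not verified against the package:

```swift
// swift-tools-version:5.9
import PackageDescription

// Sketch only: URL, version, and product name are assumptions.
let package = Package(
    name: "MyProject",
    dependencies: [
        .package(
            url: "https://github.com/differentiable-swift/swift-numerics-differentiable",
            from: "1.2.0"
        ),
    ],
    targets: [
        .executableTarget(
            name: "MyProject",
            dependencies: [
                .product(name: "NumericsDifferentiable", package: "swift-numerics-differentiable"),
            ]
        ),
    ]
)
```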

swift-collections-differentiable (extends swift-collections)

This package exports the swift-collections products while also making room to add Differentiable support to some of the types the original provides. Currently, this mainly provides Differentiable support for OrderedDictionary.

Focus going forward

Differentiable Swift remains an incredibly important feature for our team at PassiveLogic.
Going forward, we will continue to maintain the feature, resolving issues as they arise. We will also direct more effort towards solidifying the architectural foundations of the language feature.

It is an ultimate goal of ours to advance this language feature through the Swift Evolution
process, and to that end we've identified some core architectural improvements that need to be made for us to move in that direction:

Protocol default derivatives

Currently, we cannot provide default implementations for derivatives of protocol methods that live in a different module. This limits how much generic code we can make differentiable, as well as our ability to add derivatives to protocol-based APIs in preexisting libraries and the standard library itself. Enabling this would allow us to define derivatives for many more APIs.

Improved closure memory allocations

The performance of computing the pullback/derivative of a function is memory bound: every intermediate value eventually needed to compute the pullback has to be allocated on the heap during the forward pass. We're currently exploring smarter allocation strategies to minimize the number of malloc calls and improve data locality.
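A small sketch of why this is memory bound, using only the public _Differentiation API (the example values are invented):

```swift
import _Differentiation

// Every intermediate the pullback will need is captured during the forward
// pass; the pullback closure then consumes them in reverse order.
let (value, pullback) = valueWithPullback(at: 3.0) { (x: Double) -> Double in
    let a = x * x   // `x` is saved for the pullback of this multiplication
    let b = a * x   // `a` and `x` are saved again
    return b
}
// value == 27.0; pullback(1.0) == 27.0  (d/dx x³ = 3x² at x = 3)
```

Each of those captured values is heap storage the forward pass must allocate, which is exactly what the planned allocation strategies aim to reduce.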

Concluding

In the long term, moving the feature past experimental status will require wider adoption and support from more than just us at PassiveLogic. We'd love to be a resource for anyone wanting to explore the potential applications of Differentiable Swift.
Most software frameworks around machine learning are built for the current needs of the most popular deep learning techniques, but there is a world of applications for gradient descent optimization and differentiability outside of that spotlight. It's our hope that the flexibility of a language feature like this could enable underserved areas of machine learning and general optimization techniques.
If you have feedback or questions, or just want to show off some interesting uses of Differentiable Swift, please reach out! Either here on the Swift forums to me (@JaapWijnen), @tmcdonell, @clackary, or @_ck512, or on our recently opened Discord channel.

Thanks

A lot of thanks goes out to @asl for the continuing large effort put into the above improvements.

I'd like to thank the following people for their current and previous work on the Differentiation language feature:

36 Likes

I think it would be really great if Differentiable Swift Tutorials had more tutorials. Maybe something like Iris flower classification using Differentiable Swift for beginners, and other more advanced real-world tutorials.

3 Likes

It's really interesting to see the continued work on this. I wonder if the feature could attract more attention if the performance benefits (assuming they exist) were made clear somewhere.

For now I must say I have treated Differentiation as an interesting curiosity, but one that has felt almost unusably buggy even in simple use cases (disclaimer: this is based on trying out the feature many years ago) and with unclear performance over alternatives (like e.g. JAX).

In short: is Swift the ideal place to do this kind of work for those not directly invested in the Differentiable project? I know it's not anyone's job or responsibility to answer that, but just wanted to share my curiosity but also somewhat shaky experiences so far. And I would be very happy to hear from those who have experienced otherwise :)

3 Likes

I wonder more about how the design might shift to accommodate things like InlineArray, integer generic parameters, noncopyable and nonescapable types, macros, and concurrency. These new features are directly targeted at supporting the sort of numeric operations that autodiff really needs, and I wonder if much of autodiff support would be better implemented as macros over these features.

1 Like

Maybe it would be useful to introduce something like a Tensor type, not just Array or InlineArray or Matrix?

1 Like

Yeah more tutorials would be great! We'd love to have more examples that show off the strengths of autodiff and examples that demonstrate what the current limitations are. The Iris classification could be an interesting example.

We've been working hard to improve the situation since the last time you tried it, so it should definitely feel a lot less buggy! We've also added support for a lot more language features since then. Our priorities have really been on improving the user experience and language feature support.
Now that a lot of the bugs have been squashed, we're starting to have another look at performance. We would like to do fair comparisons with other solutions/approaches to autodiff to really be able to pinpoint the relative differences between them. Any suggestions here are more than welcome!

1 Like

Regarding macros, we've definitely thought about this. We currently don't see a way to fully support the language feature using the current macro system, unfortunately. We're definitely excited about the upcoming (and now available) language features you mention, and we are keeping an eye on them or adding support for them in Differentiation.

1 Like

Yes a specialized ML library would be interesting.
However! I think the real power of Swift differentiation is that we don't need to define special "blessed" types. Users can define their own types, and the compiler will be able to differentiate those types for you. There's no need to restrict yourself to special built-in types that swift-differentiation provides. This, to me, is pretty powerful!
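For instance (a hedged sketch: the type and property names are invented), any user-defined struct whose stored properties are differentiable can conform to Differentiable, and the compiler synthesizes its tangent type:

```swift
import _Differentiation

// A user-defined type: the compiler synthesizes `TangentVector` for us.
struct Spring: Differentiable {
    var stiffness: Double
    var restLength: Double
}

@differentiable(reverse)
func potentialEnergy(_ s: Spring, displacement x: Double) -> Double {
    let stretch = x - s.restLength
    return 0.5 * s.stiffness * stretch * stretch
}

let g = gradient(at: Spring(stiffness: 10, restLength: 1)) {
    potentialEnergy($0, displacement: 3.0)
}
// g.stiffness == 2.0 (= 0.5 · stretch²), g.restLength == -20.0 (= -k · stretch)
```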

2 Likes

Could you elaborate on this? Because I'd be fascinated to know more. I've dabbled at doing zero-copy combinators for ML layers, and my comment was along the lines I've been pursuing.

One of the main issues is that we currently can't infer whether a type conforms to a protocol inside a macro. So we wouldn't be able to know whether a type being used in a differentiable function is differentiable or not.

If you have ideas here I would love to chat though. If we can simplify and/or improve certain aspects of the language feature by leveraging macros that would be very interesting to look into.

1 Like

Another update from our side!
Quite a few developments since last May!
We've been able to get a list of improvements merged, among which:

There's also been work on some exciting new improvements:

There are still some great improvements waiting for review

Aside from the direct Swift compiler work, there's also been a new swift-numerics release:
swift-numerics drops Complex's conditional conformance to Differentiable

  • Complex no longer conditionally conforms to Differentiable. Given that the Differentiation module has not formally stabilized, support for it will be moved onto a feature branch for now.

The conditional conformance has been moved to our swift-numerics-differentiable package
and is available as of version 1.2.0.

We're really happy with the progress we've made the last months and especially excited about the support for _modify accessors and throwing functions. This will unlock brand new ways to write differentiable Swift code!
This unlocks lots of improvements for our simulation and control applications. We'd love to talk about how this could help you!

14 Likes

Hey, have you guys considered putting together some kind of Swift ML SDK, to make it easy for Swift devs to use Differentiation, perhaps in concert with toolchain vendors like @marcprux at Skip and @gmn at SCADE, if not Apple too? This is obviously a very hot field right now, but I feel that awareness and use of this effort is low in the Swift community.

A public SDK that makes it easy to try these features might change that. Just a thought, as someone who has never used a single ML tool.

Btw, congrats on your recent Series C. :smiling_face_with_sunglasses:

2 Likes

This post really motivated me to do something in that direction. Feedback is more than welcome!

5 Likes

I’m not entirely sure how this would combine with toolchain vendors. Could you elaborate on that?
I think a more classic Swift ML SDK would be interesting for sure. But there are already great tools out there, so I think the best approach would indeed be to wrap existing solutions and see how those would best interface with Swift in terms of API.
Like @pedronahum has done for PyTorch. It seems very interesting. I’m hoping to make some time to take a good look at it, and how it uses swift-differentiation, soon!

1 Like

I only meant that the three toolchain vendors I listed all have experience shipping their own Swift SDKs for various markets, so they may be interested in collaborating to build this ML SDK, particularly since they could then provide this SDK to their users too. Since you guys at PassiveLogic are not in the tools business, I figured they may be able to help you turn this compiler feature into a larger SDK.

I have no idea what complementary pieces would be needed alongside this Differentiation feature to make it competitive with existing ML SDKs in other programming languages, but am simply suggesting that such a Swift ML SDK would help spur adoption of your work, similar to the current static Musl linux or Wasm SDKs for Swift.

1 Like

I’d like to point you to our repo for test examples. These were carefully run on MacOS and Linux on Jetson Orin.

This benchmark is a simple thermodynamics simulation. If that interests you, great; otherwise, you can think of it as a loop of functions computing a grab-bag of math operators, which is pretty representative of most code problems.

In this test, Swift shows:

  • ~759–5000x faster execution than PyTorch and TensorFlow
  • ~10–100x less memory usage
  • ~1000x less energy usage

While the goal was to have the code match, we ended up optimizing the settings for PyTorch and TensorFlow to give them the best advantage. JAX was tested but not published because it fared so poorly; we sent the results to the JAX team.

We plan on providing other examples (we benchmark our own code internally; we need to invest in open source examples), and we're happy to add any community benchmarks to the list.

4 Likes

Swift is still faster, prelim results with Jax and TF compiled: JAX Benchmark to Building Simulation Comparison by pedronahum · Pull Request #33 · PassiveLogic/differentiable-swift-examples · GitHub

2 Likes

I assume this reply is aimed at me, not your colleague Jaap. :wink:

Those sound like great results, but unfortunately I doubt anyone is aware of them. I reached out to a contact at a very large company doing a lot with AI a couple years ago and pointed out your great benchmark results, and was told that while the tech team found these results intriguing and impressive, they would be sticking with the tech stack they had.

My impression is that these compiler features are not enough: you need a whole Swift ML SDK to get people to try out this work, e.g. the great success of the three current official SDKs at swift.org.

Also, have you considered writing about your work and showcasing this example on the Swift blog? I think you would get a lot more eyes on these results that way.

2 Likes

All fair points.

We are solving for our business needs first, of course, while also contributing back to the community. We're trying to encourage more engagement in open source, but we understand that creates dual expectations (so thanks for the efforts, guys!).

On compiler features: it depends on what you want to do. If you want to make a distributed network traffic optimizer, for example, just write your normal Swift code plus a 20-line atom implementation. There are likely 100 such everyday code problems to be solved for every deep learning model.

As for what we do in AI, we plan on open sourcing more of our toolkits. But these aren't focused on traditional homogeneous deep-learning model training; rather, on broader compute use cases: heterogeneous compute graphs, composability, shaped edge learning, physics-defined world models, ontological graphs, graph RAG, and formal inferencing.

3 Likes