Pre-pitch: Swift Differentiable Programming Design Overview

Differentiable programming is a language feature that we've been incubating for (the official) Swift as part of the Swift for TensorFlow project. After over a year's evolution and experimentation with real-world differentiable programming problems such as machine learning, the feature is getting closer to being ready for a Swift Evolution pitch. As such, we put together a design overview for this feature with an open roadmap. Your comments and suggestions are welcome!

Note: Opening this Google Doc does not require a Google account.

This document will be kept up to date, and the incomplete sections such as higher-order differentiation, infinite differentiability and control flow will be expanded soon. By the time we are ready to run a pitch (in late 2019), we will prepare detailed tutorials so that the community will get a better grasp of the problems this language feature is going to solve and help identify interesting use cases for it. Technical documentation about the implementation will also be written up in depth.

To see this language feature in action, you can watch our recent demo at Google I/O 2019 and play with the custom differentiation tutorial on Google Colab.


If you have any questions or concerns regarding having differentiable programming (or any new features introduced in the document) as a part of the official Swift language, I'd love to hear what you think! As we complete the implementation and the first evolution pitch/proposal, we want to make sure to address any obvious issues and make the rationale clear.

Are there any applications of these features to other areas not related to AI/ML? For example, could you add an example for some physics calculations maybe? Any way to break it up for easier review? Thanks for the great work.

Absolutely, I will add examples of applications in mathematical optimization, physics, robotics, and potentially computer graphics in the proposal.

Currently the idea is to introduce first-order differentiable programming in the first proposal, which includes changes to the following:

  • syntax
  • type system
  • conformance synthesis
  • standard library
  • compiler transform
  • ABI

I'm not sure if the top-level feature (first-order differentiable programming) can be broken down into multiple proposals, since the components rely on each other and are not independently justifiable. Technically, conformance synthesis can be pitched independently, but it is core to customizable differentiable programming and probably should not be separated.


I haven’t fully read the proposal and basked in all its glory yet, but I feel that this structure could benefit operations other than derivatives, such as transforms (Fourier, Laplace, Z, etc.).

Is it possible to separate the features out enough so that others can create their own function transforms?


I like the Swift for TensorFlow project.

But I don't know whether it makes sense to introduce automatic differentiation into the official Swift language.

I know it's useful for machine learning. But even if it is introduced, people who work on machine learning will use Swift for TensorFlow anyway, because S4TF has many other big features.

So examples of other use cases are important for discussing whether to introduce this feature.

I am looking forward to it.

Also, if there is a grand plan to merge Swift for TensorFlow completely into the official Swift language, I would like to know the overall picture.


Yes, I absolutely agree with you on this point.

To be clear, the Swift for TensorFlow library will not be proposed for inclusion in the official Swift language, because TensorFlow is one library built on Swift in a world of diverse libraries. Instead, we propose technologies, especially ones that solve general real-world problems and ones that have a big impact on programming.

Some technologies that were incubated as part of the Swift for TensorFlow engineering effort are going to be proposed through Swift Evolution, but not the Swift for TensorFlow library itself. The ones that have been accepted into Swift include dynamic callables, "static" callables, and AdditiveArithmetic.

Differentiable programming is a new era of programming and is the foundation of modern-day AI algorithms. It is absolutely not only applicable to machine learning libraries.

This is not exactly what we envisioned for our open source contributions back to the Swift community (not locking Swift ML users to one library). With differentiable programming, we want to make Swift a go-to language for numeric programming and scientific computing, and to be the world's first general-purpose language that is capable of supporting this new programming paradigm. We want to encourage more developers to develop powerful libraries with it, not just Swift for TensorFlow.


There are definitely a ton of real-world problems that can be solved with function transforms, and custom function transforms will definitely be easier when the Swift compiler gets bootstrapped and when the compiler becomes more hackable.

While going for generality is a good thing, I think that the pursuit of generality should not block features that solve a slightly more domain-specific but very impactful problem. When a general system comes along, the less general feature can definitely be replaced with the more general one. Differentiable programming also helps set a bar for future AI-capable programming languages and for any generic function transform features proposed for the Swift language.

A very, very promising and timely proposal, and not only for Swift. It will surely affect the whole general-purpose software development industry, because it looks like a kind of «silver bullet» for a lot of existing domains. I believe that we should get it right in the first public version of the implementation. Otherwise «someone» will beat us with a better solution. We can already see that the ML/data science «race» has begun.

In the proposal I see a very logical approach for the ML/data science domains. But I agree with colleagues that more examples from «neighboring» domains would be very helpful. We should choose a form of differentiable programming that can become the de facto standard for the whole industry, become part of our «everyday» programming, and lower the barrier to entry for beginners.

Thank you for actively promoting this. Great work. I am very inspired.


I was thinking that it could take the form of an annotated Swift library, but it seems you’re using a different approach and editing the compiler itself here. (Which makes sense, now that I remember S4TF also aims to support a lot of compile-time checking.)

Yes, it is a combination of standard library (Differentiable protocol) + compiler (@differentiable function types) + runtime (@differentiable function representation).

Differentiation of computable functions is not computable [1] [2], so differentiable programming can never be implemented as a library—it can either be in a restricted DSL, or be fully language-integrated as a first-class feature.

[1] Marian Boykan Pour-El and Ian Richards. Differentiability properties of computable functions — a summary. 1978.
[2] Marian Boykan Pour-El and Ian Richards. Computability and noncomputability in classical analysis. 1983.
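
To make the "restricted DSL" option a bit more concrete, here is a minimal sketch in plain Swift. It is not taken from the design overview: `Dual` and `derivative(of:at:)` are hypothetical names chosen for illustration. It shows forward-mode differentiation via operator overloading, which only works for functions written against the special `Dual` type and only for the operators the library author has chosen to overload.

```swift
// Hypothetical library-level sketch (not the proposed language feature).
// A dual number carries a value together with its derivative.
struct Dual {
    var value: Double    // f(x)
    var tangent: Double  // f'(x)
}

extension Dual {
    // Sum rule: (f + g)' = f' + g'
    static func + (lhs: Dual, rhs: Dual) -> Dual {
        return Dual(value: lhs.value + rhs.value,
                    tangent: lhs.tangent + rhs.tangent)
    }

    // Product rule: (f * g)' = f' * g + f * g'
    static func * (lhs: Dual, rhs: Dual) -> Dual {
        return Dual(value: lhs.value * rhs.value,
                    tangent: lhs.tangent * rhs.value + lhs.value * rhs.tangent)
    }
}

// Seeds the input with tangent 1 and reads the derivative off the result.
func derivative(of f: (Dual) -> Dual, at x: Double) -> Double {
    return f(Dual(value: x, tangent: 1)).tangent
}

// The function must be written in terms of `Dual`, not `Double`.
func f(_ x: Dual) -> Dual {
    return x * x + x
}

let slope = derivative(of: f, at: 3)  // d/dx (x² + x) at x = 3
```

An existing `(Double) -> Double` function cannot be passed to `derivative(of:at:)` here, and any operation without an overload (division, for instance, in this sketch) is simply unavailable. That is the sense in which an approach like this is a restricted DSL rather than differentiation of ordinary Swift code.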


I’m not sure how non-computability fits in, since we seem to be dealing (almost?) exclusively with functions whose derivatives are computable.

IIUC (correct me if I’m wrong), this feature annotates basic functions with @transposing and @differentiating, and uses a composition rule (in this case, the chain rule) to break down a function at each level until it reaches the input variables. The elemental function annotations and the composition rule are something I think could be exposed.
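
The idea described above (primitives that carry their own derivative rule, glued together by the chain rule) can be sketched in plain Swift. This is a hedged, library-level illustration only: `Primitive`, `compose`, and `gradient(of:at:)` are hypothetical names, and nothing here reflects how the compiler transform or the attributes are actually implemented.

```swift
// Hypothetical sketch: each primitive returns its value together with a
// pullback that maps an output sensitivity v to v * f'(x).
struct Primitive {
    let valueWithPullback: (Double) -> (value: Double, pullback: (Double) -> Double)
}

// "Annotated" elemental functions: the derivative rule is supplied by hand.
let square = Primitive { x in
    (value: x * x, pullback: { v in v * 2 * x })
}
let reciprocal = Primitive { x in
    (value: 1 / x, pullback: { v in -v / (x * x) })
}

// Composition rule (chain rule): the pullback of outer(inner(x)) applies the
// outer pullback first, then the inner pullback.
func compose(_ outer: Primitive, _ inner: Primitive) -> Primitive {
    return Primitive { x in
        let (y, innerPullback) = inner.valueWithPullback(x)
        let (z, outerPullback) = outer.valueWithPullback(y)
        return (value: z, pullback: { v in innerPullback(outerPullback(v)) })
    }
}

// Derivative of a scalar function: pull back a sensitivity of 1.
func gradient(of f: Primitive, at x: Double) -> Double {
    return f.valueWithPullback(x).pullback(1)
}

let h = compose(reciprocal, square)  // h(x) = 1 / x²
let dh = gradient(of: h, at: 2)      // h'(x) = -2 / x³, evaluated at x = 2
```

In this form only functions built out of `Primitive` values and `compose` can be differentiated. Ordinary Swift functions, control flow, and user-defined types are out of reach without compiler support.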

This is really fantastic work, Richard. Thanks to everyone on the S4TF team for moving this along. Very impactful.
