Formalizing a Numerical/ML Working Group

I've been thinking about the post-#S4TF chatter, and about how we might form stronger community and inter-corporate support for ML and numerical Swift features and libraries.

A couple points of note:
S4TF. The concepts around differentiable and ML Swift in the broader industry were too closely associated with the work at Google. Obviously it originated there. But after they pulled the plug (ahem... no surprise), the broader community over-associated the shutdown with Swift itself, even though true success meant that there would be no TensorFlow in S4TF at the end of the day. And that mostly held up, with the milestones of differentiability and introspection getting mainlined.
I'm super thankful to S4TF for getting these features rolling. That said, today our internal Swift team building libraries in numerical, introspection, and differentiable Swift is over 10 people and growing. I suspect we are not the only company building features that should be rolled into canonical libraries.

Numerical Computing. I'm a big believer that Python's success was built around the "1-true-library approach". If you are doing numerical computing, you won't spend a week trying out packages on GitHub... you just use NumPy.

We've seen the power of "blessed libraries" over in the Server group. SwiftNIO became a standard the moment it was announced. But there is a speed-of-development problem with the over-reliance on the core Apple team driving all these libraries too. For example, the Swift Numerics library has had a bunch of interesting pull requests sitting on it for a year. I assume there are other internal priorities. But externally it's not clear what the roadmap is, what it should be, or how to contribute.

Proposal

  • Establish a ML/Numerical Working Group
  • Have that group outline a roadmap of modules, features, evolutions
  • Coordinate efforts on a set of core Swift Libraries
  • Have a review process that can move pull requests that are on roadmap, without bottlenecking behind a single contributor.

I'm curious to hear from the rest of the community and the core team.
-Troy

54 Likes

@dabrahams is The Loft · GitHub still a thing?

2 Likes

Small correction on language that was pointed out to me. Most of the features waiting on the Numerics library aren't pull requests per se, but are waiting in the issues list.

Features in this list include DSP functions, BigInt, Numerical Analysis, ShapedArrays, Angles, etc. All great ideas, but largely stalled behind what seems like an unclear community process to get them accepted, or segmented into a suite of libraries.

This isn't to pick on Numerics per se, but it's a good study in unrealized potential. There are many libraries around this we could certainly roadmap.

6 Likes

I would be very interested in participating in such an effort

1 Like

Sounds very interesting! I've been trying to make some progress on a ShapedArray implementation for a while now. Would be nice to exchange ideas around the best way to approach it.

3 Likes

This is a great idea! I’ve been working for a while now on my soon-to-be-published SwiftML library for training neural networks in Swift with a Keras-like API using result builders on top of ML Compute and Metal Performance Shaders, but it would be great to have a language-wide standard that isn’t platform-dependent. I understand that this is only one domain within the larger field of machine learning, but the best way right now to train a neural network in Swift with hardware acceleration is to use Metal Performance Shaders, which is obviously not very portable. That’s why I think that this proposed working group should make sure not to ignore hardware-ecosystem concerns. Swift for TensorFlow was great, but it didn’t address any of the hardware-compatibility issues because it just sat on top of the existing TensorFlow core, which only supports CUDA.

The counterpoint to this would be that hardware compatibility shouldn’t be the concern of people who are just working on higher-level frameworks and language features, and while I understand that point of view, this seems like a great opportunity to take a stab at fixing the ecosystem issues, and it would be a shame to pass it up.
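For readers unfamiliar with the "Keras-like API using result builders" idea mentioned above, here is a minimal sketch of what such an API could look like in portable Swift. Every name in it (`Layer`, `Scale`, `ReLU`, `LayerBuilder`, `Sequential`) is hypothetical and invented for illustration; it is not the API of SwiftML or of any existing library, and it runs on plain arrays rather than on ML Compute or Metal.

```swift
// Hypothetical sketch: none of these names come from a real library.

/// A layer is modeled here as a pure function on a flat vector of values.
protocol Layer {
    func forward(_ input: [Double]) -> [Double]
}

/// Multiplies every element by a constant factor.
struct Scale: Layer {
    var factor: Double
    func forward(_ input: [Double]) -> [Double] {
        input.map { $0 * factor }
    }
}

/// Rectified linear unit: clamps negative values to zero.
struct ReLU: Layer {
    func forward(_ input: [Double]) -> [Double] {
        input.map { max(0, $0) }
    }
}

/// Collects the layers listed in a builder closure into an array.
@resultBuilder
enum LayerBuilder {
    static func buildBlock(_ layers: any Layer...) -> [any Layer] {
        layers
    }
}

/// Runs its layers in order, feeding each output into the next layer.
struct Sequential: Layer {
    var layers: [any Layer]

    init(@LayerBuilder _ content: () -> [any Layer]) {
        layers = content()
    }

    func forward(_ input: [Double]) -> [Double] {
        layers.reduce(input) { values, layer in layer.forward(values) }
    }
}

// Declarative model definition, in the spirit of Keras' Sequential.
let model = Sequential {
    Scale(factor: 2)
    ReLU()
}
let output = model.forward([-1, 0.5, 3])
```

The builder only fixes the surface syntax. A hardware-specific backend would still have to sit behind `forward`, which is exactly the portability question raised in this post.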

5 Likes

It is if someone else wants to take it over.

1 Like

Indeed. There are many individual efforts that could be combined around core libraries in numerical computing, DSP, neural nets, introspection, auto-diff, distributed computing, and heterogeneous compute. We alone have efforts in each of these.

In our case, sitting on top of conventional deep learning libraries is too constraining. So we are continuing with the S4TF thesis that Swift as a high-performance, generalized differentiable language is critical to unlocking the next generation of solutions. I'd like to see a more lively community around the unique features of Swift in machine learning, given that today there are few other serious language options that can compete on being fast, embeddable, generalizable, and having first-class auto-diff.

3 Likes

cc'ing a few relevant folks to make sure there is an accurate gauge of interest
@scanon @rxwei @tkremenek @Chris_Lattner3

1 Like

I'd love to see this happen!

11 Likes

I am still very interested in doing factor graphs with Swift, so please count me in :)

2 Likes

I've been working on Swift for NNC Reference; it is a fun exercise with Swift the language.

Taking lessons learned from Python, it would be awesome to build on top of the mainlined autodiff feature and be platform-agnostic. However, to be practical, it is still important to be able to run state-of-the-art models competitively on the most popular hardware (GPUs and TPUs) with whatever Swift provides. What does that look like without S4TF? Still XLA-based? libtorch? Something new?

On a side note, several languages are serious about taking the numerical computing throne. Besides the obvious Julia, Nx from Elixir would be an interesting one to look at: Nx (Numerical Elixir) is now publicly available - Dashbit Blog

2 Likes

@machineko I think we could move it to another forum like Discord if we can't get buy-in from the maintainers here. However, if we want to engage with the core community, I think it belongs here.

@liuliu X10 was nice, but limited to tensors, TensorFlow, and (mostly) CUDA targets. I'd like to see heterogeneous compute support for embedded targets and for a wide range of problem shapes - not just tensors. There are many ways to address this. I believe there is some work in MLIR-Swift (early). I'm not the guy to give updates in this realm though. Other paths via LLVM also exist.

Our efforts to date have focused on getting solution completeness on CPU targets. Our team will dive deeper on GPU/TPU/NPU targets later this year. If anyone wants to provide updates in the area of heterogeneous compute, I'd love to hear progress.

2 Likes

Definitely! Swift for NNC has very different goals (libnnc aims to be a cross-language runtime with complete pipelines including graph optimization and autodiff). It is a very personal project.

For Swift on Numerical / ML, we would probably like the majority of the pipeline to be done in Swift, either at the library level (graph transformations?) or at the language level (autodiff). External dependencies such as XLA or libtorch would just be a simple computing runtime that handles tensor computing on heterogeneous platforms.

1 Like

I'd love a workgroup in this direction! Swift as a C-type language is ideal for high-performance stuff, and its reasonable type system, value types, and many other language features would make Swift an ideal candidate to replace Python one day as the go-to language for numerics and ML.

Regarding portability: I've done some experiments with protocol-based numerical APIs. The result wasn't satisfactory because when trying to create generic Kalman filters or dense layers or whatever you may be interested in, you first have to compose useful protocols and use them as generic abstractions. But I think with a community-based conversation, this could actually work pretty well.

The idea was: 1. have all the important numerical operations of matrices/vectors/(tensors?) as protocol requirements in a single MatrixContracts framework. Also, declare important types like Matrix<T> or Vector<T>. 2. Leave providing an implementation for those protocols to other platform-dependent frameworks. 3. Write cool frameworks based only on MatrixContracts.

A nice idea I came up with in order to hide nasty implementation details:

public protocol BLASScalar where BLAS.Scalar == Self {
    associatedtype BLAS: BasicBLAS
}

public protocol BasicBLAS where Scalar.BLAS == Self {
    associatedtype Scalar: BLASScalar
    // ?gemm, i?amax, ... go here as static functions
}
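To make the three steps above concrete, here is one way the pairing could be used. This is only a sketch under assumptions: the `dot` requirement stands in for the elided `?gemm`, `i?amax`, ... functions, and `NaiveDoubleBLAS` and `squaredNorm` are hypothetical names. The backend is plain Swift so the example stays portable; a platform-dependent framework could supply a conformance that forwards to a native BLAS instead.

```swift
// Hypothetical sketch: `dot`, `NaiveDoubleBLAS`, and `squaredNorm` are
// illustrative names, not part of any existing library.

public protocol BLASScalar where BLAS.Scalar == Self {
    associatedtype BLAS: BasicBLAS
}

public protocol BasicBLAS where Scalar.BLAS == Self {
    associatedtype Scalar: BLASScalar
    /// Dot product of two equally long vectors (a stand-in for ?dot).
    static func dot(_ x: [Scalar], _ y: [Scalar]) -> Scalar
}

/// Step 2: a backend supplies the implementation. This one is pure Swift.
public enum NaiveDoubleBLAS: BasicBLAS {
    public typealias Scalar = Double

    public static func dot(_ x: [Double], _ y: [Double]) -> Double {
        precondition(x.count == y.count, "vectors must have equal length")
        return zip(x, y).reduce(0) { sum, pair in sum + pair.0 * pair.1 }
    }
}

/// Ties the scalar type to its backend, completing the two-way link.
extension Double: BLASScalar {
    public typealias BLAS = NaiveDoubleBLAS
}

/// Step 3: generic code names only the scalar type. The backend is reached
/// through `T.BLAS`, so callers never see the implementation.
public func squaredNorm<T: BLASScalar>(_ v: [T]) -> T {
    T.BLAS.dot(v, v)
}

let n = squaredNorm([3.0, 4.0])
```

The `where` clauses are what let `squaredNorm` return `T`: the compiler knows that `T.BLAS.Scalar` is the same type as `T`.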

What I would love to see on the ML front:

  1. Type safety. I should know ahead of time if I can compose two layers or not. Possibly this is an incentive to finally get dependent types into Swift. Type safety may also enable the compiler to be smart about your code - especially in -Ounchecked. Further potential "composition problems" that cannot be encapsulated in a type should be caught at runtime before training starts by making layer composition throwing.
  2. A rigorous, functional and stateless design, maybe based on this: [1804.00746] The simple essence of automatic differentiation and this [1711.10455] Backprop as Functor: A compositional perspective on supervised learning. The goal should be to create an Elm-moment for machine learning (Elm was the language that inspired the Redux-architecture and thereby indirectly React.js and even SwiftUI). Ultimately, we should aim for "if it compiles, it works".
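
A rough sketch of how both kinds of check from point 1 could look with today's Swift, without dependent types. All names here (`Layer`, `Composed`, `Dense`, `Sum`, `ShapeMismatch`, `then`) are hypothetical. Element types are matched at compile time through associated types, while array sizes, which the type system cannot express, are validated by a throwing composition before any training would start. The layers are stateless values, in the spirit of point 2.

```swift
// Hypothetical sketch; all names are illustrative.

protocol Layer {
    associatedtype Input
    associatedtype Output
    func forward(_ input: Input) -> Output
}

/// Compile-time check: only exists when First's output type is Second's input type.
struct Composed<First: Layer, Second: Layer>: Layer
where First.Output == Second.Input {
    typealias Input = First.Input
    typealias Output = Second.Output

    var first: First
    var second: Second

    func forward(_ input: First.Input) -> Second.Output {
        second.forward(first.forward(input))
    }
}

/// Thrown when two layers have matching types but incompatible sizes.
struct ShapeMismatch: Error {
    var produced: Int
    var expected: Int
}

/// A dense layer on plain arrays. Its sizes are runtime values,
/// so they cannot be checked by the type system.
struct Dense: Layer {
    var weights: [[Double]]  // outputSize rows, each with inputSize columns

    var inputSize: Int { weights.first?.count ?? 0 }
    var outputSize: Int { weights.count }

    func forward(_ input: [Double]) -> [Double] {
        weights.map { row in
            zip(row, input).reduce(0) { sum, pair in sum + pair.0 * pair.1 }
        }
    }

    /// Runtime check: composition fails early if the sizes do not line up.
    func then(_ next: Dense) throws -> Composed<Dense, Dense> {
        guard outputSize == next.inputSize else {
            throw ShapeMismatch(produced: outputSize, expected: next.inputSize)
        }
        return Composed(first: self, second: next)
    }
}

/// Reduces a vector to a single number.
struct Sum: Layer {
    func forward(_ input: [Double]) -> Double { input.reduce(0, +) }
}

let a = Dense(weights: [[1, 0], [0, 1], [1, 1]])  // 2 inputs -> 3 outputs
let b = Dense(weights: [[1, 1, 1]])               // 3 inputs -> 1 output

let ok = try? a.then(b)   // sizes line up: 3 == 3
let bad = try? b.then(b)  // 1 != 3, so the thrown error becomes nil here
let result = ok.map { Composed(first: $0, second: Sum()).forward([2, 5]) }
// Composed(first: Sum(), second: a) would not compile: Double is not [Double].
```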
6 Likes

Are there any updates on this?

Also curious. I've also got a relevant numerics project about ready to go and have no idea how to go about working with the community to best shepherd it. Did a Discord group start up?

2 Likes

Not yet AFAIK, but I am thinking of starting one...

4 Likes

We could start a Discord and discuss which features, mostly libraries, are needed for Swift to be a great fit for numerical problems and to get closer to parity with Python, Julia, and R. We are probably going to need a lot of libraries to cover this in a useful way. I want to help with this!

3 Likes

I very much like this idea and think Swift Numerics could use some perspective from the scientific programming community as well. Happy to contribute if someone wants to spearhead the effort.

2 Likes