Ongoing work on differentiable Swift

Troy_Harvey · June 5, 2022, 7:47pm

Hi Chris,

Just to augment Brad's response: our implementation is pure Swift. We are taking a new approach, quite different from deep learning, where models are monolithic. We use runtime-composable model fragments (as you would want in any code base) that use typed interfaces to connect the pieces into an application model. Because these model fragments are pre-trained, the user ends up with a largely pre-trained application model after composing their application, so our final training costs are very low. Secondarily, we continue to learn as we run inference in the application.
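
To make the idea of typed interfaces between fragments concrete, here is a minimal sketch in plain Swift. It is not our actual API: `ModelFragment`, `Chain`, `Thermostat`, `Heater`, and the parameter values are all hypothetical names and numbers chosen for illustration. It only shows how the type system can reject a composition whose interfaces do not line up.

```swift
// Hypothetical sketch: none of these names come from the real stack.

/// A model fragment with a typed input and a typed output.
protocol ModelFragment {
    associatedtype Input
    associatedtype Output
    func callAsFunction(_ input: Input) -> Output
}

/// Composes two fragments. The `where` clause makes the compiler
/// reject any pair whose interfaces do not match.
struct Chain<First: ModelFragment, Second: ModelFragment>: ModelFragment
where First.Output == Second.Input {
    var first: First
    var second: Second

    func callAsFunction(_ input: First.Input) -> Second.Output {
        second(first(input))
    }
}

struct Celsius { var value: Double }
struct Watts { var value: Double }

/// Stand-in for a pre-trained fragment: its parameter is assumed
/// to have been fitted before composition.
struct Thermostat: ModelFragment {
    var setpoint: Double = 21.0

    func callAsFunction(_ input: Celsius) -> Double {
        max(0, setpoint - input.value)  // heating demand in degrees
    }
}

/// A second stand-in fragment that turns demand into power.
struct Heater: ModelFragment {
    var wattsPerDegree: Double = 500.0

    func callAsFunction(_ demand: Double) -> Watts {
        Watts(value: demand * wattsPerDegree)
    }
}

// Composition type-checks because Thermostat.Output == Heater.Input.
let model = Chain(first: Thermostat(), second: Heater())
let power = model(Celsius(value: 19.0))
print(power.value)  // 1000.0
```

Swapping the order, `Chain(first: Heater(), second: Thermostat())`, would fail to compile, which is the property the typed interfaces are meant to provide.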

These composable pre-trained kernels define the existential behavior of "things". Behaviors define the underlying requirements of actors (not the Swift kind), and actors can be composed into multiple higher level types: components, equipment, assemblies, sub-systems, systems, etc.

We have an in-house stack that handles the different layers in this process, and several different user applications for building, exploring, and composing the models. At runtime we compile these kernel functions, link them into a runtime, and run inference. We've been focused on CPUs first, but now that we have the compiler and frameworks integrating, we will be starting on accelerators. A few notable things on that front:

Dispatch. Our hierarchical typed networks give the dispatcher multiple levels of granularity to pick from: flat graphs, behavior graphs, component graphs, system graphs, and so on. One of the challenges of a TensorFlow-like approach is that you have flat tensor graphs and must heuristically aggregate operators to make dispatch more efficient, and of course tuning the heuristics for coalescing operators is an endless pursuit. We can instead pick the level of dispatch graph directly from our recursively hierarchical format.
Accelerators. While tiling these graphs on GPUs will see some large gains, we see an eventual need for developing silicon that is built for graph processing and MIMD operations to get the most out of the work we are doing.
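
As a rough illustration of picking a dispatch level from a recursively hierarchical graph, the sketch below cuts a toy hierarchy at a chosen depth. Everything in it is an assumption made for the example: `GraphNode`, `dispatchUnits(of:atDepth:)`, `label`, and the node names are hypothetical, and a real dispatcher would weigh far more than depth.

```swift
// Hypothetical sketch: illustrative types only, not the real graph format.

/// A recursively hierarchical graph: a node is either a single kernel
/// or a named group of subgraphs.
enum GraphNode {
    case kernel(String)
    case group(name: String, children: [GraphNode])
}

/// Returns the dispatch units obtained by cutting the hierarchy at `depth`.
/// Depth 0 dispatches the whole graph as one unit; larger depths yield
/// finer-grained units, down to individual kernels (the flat graph).
func dispatchUnits(of node: GraphNode, atDepth depth: Int) -> [GraphNode] {
    switch node {
    case .kernel:
        return [node]
    case .group(_, let children):
        if depth == 0 { return [node] }
        return children.flatMap { dispatchUnits(of: $0, atDepth: depth - 1) }
    }
}

/// Human-readable name of a node, used for printing.
func label(_ node: GraphNode) -> String {
    switch node {
    case .kernel(let name): return name
    case .group(let name, _): return name
    }
}

let system = GraphNode.group(name: "system", children: [
    .group(name: "airHandler", children: [
        .group(name: "fan", children: [.kernel("fanCurve"), .kernel("motorLoss")]),
        .group(name: "coil", children: [.kernel("heatTransfer")]),
    ]),
    .group(name: "zone", children: [.kernel("thermalMass")]),
])

for depth in 0...3 {
    print(depth, dispatchUnits(of: system, atDepth: depth).map(label))
}
// 0 ["system"]
// 1 ["airHandler", "zone"]
// 2 ["fan", "coil", "thermalMass"]
// 3 ["fanCurve", "motorLoss", "heatTransfer", "thermalMass"]
```

The point of the sketch is that the grouping already exists in the structure, so no coalescing heuristic is needed to recover coarser units from a flat graph.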

As far as current focus goes, the team is working on a bunch of ambitious goals:

AI Frameworks. We are working on Swift differentiability together with a group of AI frameworks that build out "4 legs of the stool": Navigation (deductive/inductive graph inferencing), Introspection (reflection, mutation, lensing, meta-graphing), MetaInferencing (abduction, runtime latent inferencing, constraints, etc.), and Solving (chaining, competitions, generative learning, graph dispatch, distributed cluster management, multi-graph tie-ups). We have several assets we are targeting for open source here.
Digital Twins. We've been developing a computable digital twin language called Quantum. This is being open sourced, with many industry partners. It is a physics-based digital twin graph encoding that underpins the AI frameworks above. It is broad in its ambitions: to describe and compute real-world things and how they interact in a generalized way.
Autonomous Platform. Our platform team is building on top of these frameworks for real-time autonomy, automation, sensor fusion, and I/O.
Edge Hardware. Our hardware team is building the edge compute platforms that run the whole stack.
User Software. Our user software team is building tools that make AI accessible to real people (not just developers). These tools enable engineers to make digital twins, and enable end customers to build their own custom autonomous systems and AI-as-a-service queries.