Swift as syntactic sugar for MLIR

burmako · August 6, 2019, 9:33pm

Hello! My name's Eugene Burmako. I work at Google on the Swift for TensorFlow team. I'd like to share something that we've been working on recently.

Swift works great as an infinitely hackable syntactic interface to semantics that are defined by the compiler underneath it. The two options today are LLVM (there's a running joke that Swift is just syntactic sugar for LLVM) and TensorFlow graphs (which is the contribution of early versions of Swift for TensorFlow).

Multi-Level Intermediate Representation (MLIR) is a generalization of both the LLVM IR and TensorFlow graphs to represent arbitrary computations at multiple levels of abstraction to enable domain-specific optimizations and code generation (e.g. for CPUs, GPUs, TPUs, and other hardware targets).

In "Swift as syntactic sugar for MLIR" (Google Docs), we've written down some thoughts on several ways to metaprogram MLIR in Swift, starting from treating MLIR programs as strings and then gradually increasing the level of language integration with Swift.

To evaluate the designs explored in the document experimentally, we've prototyped quasiquotes, a language feature that allows "quoting" snippets of code which are then transformed into data structures available to Swift programs. These data structures can then be used for all sorts of purposes, including translation to MLIR. The prototype is available as "Experimental prototype of Swift quasiquotes" by burmako (Pull Request #26518, apple/swift, GitHub).

This code doesn't fully implement the theorized design yet, but we believe that it is already useful for experimentation. We are evaluating available approaches and garnering community feedback. Please let us know what you think!

Joe_Groff · August 7, 2019, 2:14am

This is an interesting approach to metaprogramming! For your stated use case of generating MLIR from Swift, though, this seems like a bit of an oblique way of achieving that goal. Type-checking Swift code and lowering it into SSA is what half of the compiler already does, and it seems like if you're taking this approach to add what amounts to a new backend to the compiler, you're imposing the need to reimplement that first half of the compiler as a quasiquote interpreter in Swift, with all the well-known problems of compile-time performance, diagnostic quality, and behavior differences between baseline Swift and the metalanguage that this sort of approach has. Could we instead lower MLIR from SIL, using availability or another existing mechanism in the language to constrain things like your matmul example to operations supported by your target, and provide an API for further manipulating that MLIR in compiler passes written in Swift?

burmako · August 7, 2019, 2:54am

Going from SIL to MLIR is an interesting alternative that we're actually exploring right now.

Quasiquotes were my first pick to get things going, since I'm more familiar with quasiquotes from my previous work on Scala, but there is a plan to explore several alternatives to make an informed decision.

My initial impressions are as follows:

Trees are easier to get started with (even simple code snippets like matmul generate impressive amounts of SIL).
Trees will need to be lowered (lowering all sorts of control flow in trees into SSA is going to be hard, although it's not clear how much lowering will be needed, since it's not clear what language features we will want to allow in MLIR kernels).
Trees capture syntactic structure of the code in a straightforward way (reconstructing affine loop nests from SIL will probably be an intense experience).
SIL surfaces some implementation details, e.g. ARC, which aren't directly relevant to MLIR kernels and probably will have to be stripped off.
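To make the lowering point concrete, here is a minimal sketch of flattening an expression tree into single-assignment instructions. The `Expr` and `Lowering` types are hypothetical and are not taken from the prototype, and the emitted text is an ad hoc notation rather than real MLIR syntax. Straight-line expressions like this are the easy case; the control flow mentioned above is where lowering becomes hard.

```swift
// Hypothetical expression tree; control flow is deliberately left out.
indirect enum Expr {
    case variable(String)
    case constant(Int)
    case binary(op: String, lhs: Expr, rhs: Expr)
}

// Flattens a tree into a list of single-assignment instructions.
struct Lowering {
    private(set) var instructions: [String] = []
    private var nextID = 0

    // Returns the name of the SSA value that holds the result of `expr`.
    mutating func lower(_ expr: Expr) -> String {
        switch expr {
        case .variable(let name):
            // Variables are assumed to be already-defined values.
            return "%" + name
        case .constant(let value):
            return emit("const \(value)")
        case .binary(let op, let lhs, let rhs):
            // Operands are lowered first so their values exist before use.
            let l = lower(lhs)
            let r = lower(rhs)
            return emit("\(op) \(l), \(r)")
        }
    }

    // Appends one instruction and returns the fresh value it defines.
    private mutating func emit(_ body: String) -> String {
        let result = "%\(nextID)"
        nextID += 1
        instructions.append("\(result) = \(body)")
        return result
    }
}

// Lower the tree for `x + 2 * y`.
var lowering = Lowering()
let result = lowering.lower(
    .binary(op: "add",
            lhs: .variable("x"),
            rhs: .binary(op: "mul", lhs: .constant(2), rhs: .variable("y"))))
print(lowering.instructions.joined(separator: "\n"))
```

Each nested subexpression gets its own numbered value, so the nesting of the tree is replaced by an explicit instruction order.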

One thing that would be good to mention here is that our quasiquotes are created after typechecking, so quasiquote-based approaches don't have to reimplement the typechecker. Lowerings to SIL have to be redone (perhaps for a subset of the language), but I'm not sure whether this counts as half of the compiler.

In a nutshell, I think that both approaches have their merits, but this doesn't have to be an either-or situation. I want to try all of them.

Joe_Groff · August 7, 2019, 2:56am

Definitely, having both approaches together seems like it could also reduce the complexity and improve the factoring of the DSL code you have to build. Your quasiquoter/function builder/macro whatever could focus on the interesting "frontend" business logic of the DSL, mapping the interesting part of the DSL semantics to the baseline Swift semantics, allowing the standard compiler backend to manage the lowering to SSA. Any interesting "backend" business logic you want to subsequently do with the resulting CFG can focus on the MLIR-level transformations.

Torust · August 7, 2019, 6:36am

This is very exciting to me, although not directly because of MLIR (even though MLIR is interesting in its own right). I've been toying with the idea of a SPIR-V backend for Swift, enabling graphics/compute shaders to be written directly as Swift code. Quite apart from the existing shading languages being largely quite old and/or clunky, language-level integration could simplify things like resource binding. For example, I currently generate Swift code via SPIR-V reflection to match the GPU resource slots; if the GPU code were written in Swift in the first place the compiler could generate a lot of the glue code.

I'm far out of my depth here, but I had imagined the conversion would take place after SIL, with the reason being that generic specialisation could occur. As the paper/dissertation on the Swift/Rust-inspired shading language Slang points out, generics and protocols are very useful for graphics shaders, and could be compiled down to specialisation constants since the full set of possible types can be known at compile time. It might also be possible to do this with a quasiquote-based approach, however – I'm not familiar enough with it to know.

Additionally, being able to run and debug the DSL Swift code on the CPU would be a very useful feature; I would imagine this would fall fairly naturally out of an SIL-based approach, but might be quite difficult using string manipulation and quasiquotes.

I mention all of this because I would think that an MLIR and an SPIR-V backend would be architecturally similar and could share a lot of machinery. I haven't looked into MLIR enough to know whether it's possible that a SPIR-V backend could use MLIR as an intermediary (and therefore sit outside of Swift) or would be best suited as a companion project. Either way, I think it's worth considering graphics programming and non-ML shader compilation as a use case when designing this.

I would also imagine there would be similar concerns at the SIL level – for example, it would need to be possible for both many MLIR use-cases and for shaders that all allocations can be statically eliminated. Knowing that no allocations lie on a critical path would also be useful as a general-purpose Swift feature.

saeta · August 7, 2019, 4:25pm

Excellent thoughts @Torust! I think getting Swift to play nice with SPIR-V is going to be very important.

Note: I believe the work that @burmako is doing and a SPIR-V integration are actually one-and-the-same. Concretely, there already is an MLIR dialect for SPIR-V (https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/SPIR-V.md) that is under active development. :-)

Full disclosure: I'm @burmako's manager, and the MLIR team is a sister team to my team under @Chris_Lattner3 at Google, and has engineers working on the SPIR-V dialect of MLIR.

Joe_Groff · August 7, 2019, 5:02pm

Yeah, it seems like maybe there are a couple of different ideas to tease out here: more powerful features for designing and implementing DSLs, and infrastructure for generating different IRs from Swift code. If you want to be able to write Swift code (or at least a subset of Swift) and compile it down to SPIR-V, MLIR, or some other domain-specific IR, I suspect it'd lead to an overall better user experience to treat that as a backend problem and lower from SIL, since as you noted, that seems like it makes it easier to share functions and libraries across different heterogeneous targets, and it makes it easier to ensure that regular Swift code targeting one of these backends behaves the same as Swift code today. To reduce the barrier to entry of building those sorts of backends, maybe we could factor the optimizer pipeline to let passes plug into the compiler using high-level Swift interfaces; I think @Michael_Gottesman was working on something to allow SIL passes to be written in Swift recently, for instance.

(However, particularly for very specialized targets or domains, I can see the desire to also use quasiquoting to build a specialized DSL for what those targets specialize in.)

burmako · August 7, 2019, 8:50pm

I think that the barrier to entry aspect is super important here.

With quotes, we hit the ground running immediately, since libQuote is just another Swift library. We only had to define Swift data structures to model trees, and then the work on lowering and quoting was happening in parallel. Swift is a great language, and compile times are pretty good, so we've been able to keep up the momentum and productively iterate on the Swift-to-MLIR lowering.
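As a rough illustration of what "Swift data structures to model trees" can mean, the sketch below defines a tiny tree type and a function that walks it. The names `Tree` and `render` are hypothetical and do not reflect the actual libQuote data model; the tree is also built by hand here, whereas a quasiquote would produce it from source code.

```swift
// Hypothetical tree type; not the data model of the actual prototype.
indirect enum Tree {
    case name(String)
    case intLiteral(Int)
    case binary(op: String, lhs: Tree, rhs: Tree)
}

// Renders a tree back to source-like text, showing that quoted code
// is ordinary data that a Swift program can traverse.
func render(_ tree: Tree) -> String {
    switch tree {
    case .name(let identifier):
        return identifier
    case .intLiteral(let value):
        return String(value)
    case .binary(let op, let lhs, let rhs):
        return "(\(render(lhs)) \(op) \(render(rhs)))"
    }
}

// What quoting `x + 1 * y` might yield, constructed manually.
let quoted = Tree.binary(
    op: "+",
    lhs: .name("x"),
    rhs: .binary(op: "*", lhs: .intLiteral(1), rhs: .name("y")))
print(render(quoted))  // prints "(x + (1 * y))"
```

Because the tree is a plain Swift value, a lowering to another representation is just another function over the same enum.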

If we were to develop a similar technology inside the Swift compiler, we'd have to deal with writing C++, interfacing with a significant API surface, and likely waiting longer for things to compile (overall, ninja-based workflow is not bad, but sometimes you do have to wait for a while). That would be suboptimal for iteration speed.

As mentioned above, there are downsides to using trees (although we'll need an experimental evaluation to see how significant they are), but iteration speed is critical as we explore the right way to model heterogeneous hardware in Swift.

Now, if there was a technology that:

Provides Swift data structures that model SIL (Data model and parser for SIL by burmako · Pull Request #227 · tensorflow/swift · GitHub would be one example of something along these lines).
Allows developing compiler passes as SwiftPM projects.
Exposes an easy way to hook these compiler passes into Swiftc (e.g. via something like -experimental-compiler-pass libMyCompilerPass.dylib that works across platforms).

That would significantly improve the iteration speed of a SIL-based approach. It's great to know that there is work being done in that direction.
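A sketch of how such a pass might look from the Swift side, purely as an assumption: the `Instruction`, `Function`, and `FunctionPass` types below are toy stand-ins invented for illustration, not an existing compiler API. The pass naively drops reference-counting instructions, echoing the earlier point that ARC details are not directly relevant to MLIR kernels.

```swift
// Toy stand-ins for a Swift-side SIL data model; all names are hypothetical.
struct Instruction: Equatable {
    var opcode: String
    var operands: [String]
}

struct Function {
    var name: String
    var body: [Instruction]
}

// A pass rewrites one function at a time.
protocol FunctionPass {
    func run(on function: inout Function)
}

// Naively drops reference-counting instructions. A real pass would have to
// prove that removing them is safe; this sketch does not.
struct StripARC: FunctionPass {
    let arcOpcodes: Set<String> = ["strong_retain", "strong_release"]

    func run(on function: inout Function) {
        function.body.removeAll { arcOpcodes.contains($0.opcode) }
    }
}

var kernel = Function(name: "matmul", body: [
    Instruction(opcode: "strong_retain", operands: ["%0"]),
    Instruction(opcode: "apply", operands: ["%1", "%0"]),
    Instruction(opcode: "strong_release", operands: ["%0"]),
])
StripARC().run(on: &kernel)
print(kernel.body.map { $0.opcode })  // prints ["apply"]
```

Keeping the pass behind a small protocol is what would let it live in an ordinary SwiftPM package, independent of how the compiler eventually loads it.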

Joe_Groff · August 7, 2019, 9:12pm

Definitely. I can see how the approach you're taking allows you to prototype and iterate more quickly within the existing compiler infrastructure. I'd be concerned though that when it comes time to implement things "for real" in a polished product, it'll get much harder to faithfully replicate the behavior of plain Swift, figure out how declarations and modules can be shared, and other such details.

Karl · August 19, 2019, 9:17am

Hi Eugene!

I've only recently discovered MLIR, and the whole 'multi-level' and pluggable aspects (via dialects) certainly look interesting. The thing I wonder about though is why we would "lower" MLIR from SIL, since it seems that MLIR is able to represent even higher levels of abstraction than SIL can (e.g. library-level optimisations, such as for tensorflow); wouldn't it make more sense to implement SIL as an MLIR dialect?

Obviously that would be a massive amount of work; I'm just curious about the theory. What would be the benefits/drawbacks to performing SILGen and SIL-level optimisations (like today) and generating MLIR, against implementing SIL as an MLIR dialect? (Besides of course that MLIR is quite new and still in active development)

Joe_Groff · August 19, 2019, 4:59pm

If MLIR had existed at the time Swift was originally implemented, then SIL could probably have been embedded as an MLIR dialect (MLIR seems like it might've learned a thing or two from the SIL experience.) Even if it were, though, SIL is by design an unstable implementation detail of the compiler, which we change freely when necessary to improve the compiler implementation or support new features, and that would be the case even if it were a dialect in a more general IR framework. If we were going to expose a code generation mechanism for user extension, we'd want to either stabilize SIL to some degree, or come up with a more stable abstraction of SIL that'd be suitable for custom code generation. Maybe we could do that by having a pass that normalizes SIL to a standard format that can be visited by a stable API.

Michael_Gottesman · August 20, 2019, 5:21pm

I would like to add something to the public discussion from our own conversations. Namely, part of the problem with doing Swift -> MLIR is that many of the semantics of Swift the language are actually defined by compiler passes in SIL. This means that one cannot truly claim to have a "Swift" -> MLIR translator, since one would have to guarantee that everything lowers the same semantically (quirks included). That sounds like a moving target that would be difficult to simulate exactly. In contrast, if you had something that could lower from canonical SIL, this problem goes away.

Chris_Lattner3 · August 26, 2019, 11:45pm

This is all very much a theoretical concern, but if Swift uses MLIR as the implementation mechanism for SIL, then the definition of the SIL dialect and the SIL-specific passes would be implemented in the Swift codebase. Swift needs to control its own ops, own semantics and provide its own specific passes. MLIR totally supports that, and in fact has direct support to model things like "raw vs canonical" SIL.

That said, exposing 'sil' to users definitely has the stability issues that Joe mentions regardless of the representation. Joe is also right that MLIR learned a lot from the experiences building SIL and LLVM of course :-)

-Chris

Peteris · September 27, 2019, 9:32pm

This proposal is exciting. I'm embarking on a project that will use MLIR and Swift and hoping to use both together.

Is there an experimental interface that I could already lean on or are the above isolated experiments and it is best to rely on C++ at this stage?

I'm not too worried that a better interface could come along later, my key question is whether the current paths provide a reasonably "comprehensive" interface where one could fill in the missing pieces without too much trouble.

If the answer is "you have to rely on C++", is there some pathway of creating a comprehensive set of bindings that you would suggest if I'd like to call MLIR from Swift?
Happy to become one of the guinea pigs whenever you need one!

burmako · October 4, 2019, 10:16pm

We're currently using C++ to build MLIR programs, although at some point in the future we may look into defining idiomatic Swift APIs (but we don't have a concrete roadmap for that).

Thank you for your kind offer! I'll post an update here once we have something to share publicly. Let's stay in touch.

dibyendu.das0708 · April 30, 2020, 7:02am

Hi-

Is there a working Swift -> MLIR converter that can be used?

-Thx
Dibyendu

saeta · April 30, 2020, 8:26pm

Thanks for asking! Unfortunately, we've put this development on pause for the time being. Could you describe your potential applications & what you might need?

IOOI · May 1, 2020, 1:10am

Is it because @Chris_Lattner3 went into the woods, err to SiFive, where he vanished?

dibyendu.das0708 · May 1, 2020, 4:19pm

Hi-

We are looking for a reasonably powerful high level language that generates MLIR.

-Thx

dd

Chris_Lattner3 · May 1, 2020, 4:51pm