How to become an expert in compilers the long way?

I am an eager student who wants to learn everything about compilers. Currently I do a lot of computer vision and NLP, but I am really interested in compilers, and I have also just started with Swift.

I thought the experts who actually work on compilers could guide me by pointing to resources, from beginner level onward.

My background: I know assembly language and am comfortable with C. I have also learned about state machines (FSA, NFA) and have a fair knowledge of graph algorithms.


@jrose has a great blog post about exactly this!

I'd personally like to vouch extra for LLVM's Kaleidoscope Tutorial, which is a great high-level overview of most of the internals of an LLVM-backed compiler.

I'd also highly recommend just trying to contribute to Swift itself. We have a lot of bugs marked as "Starter Bug" that you can pick from. If something seems interesting, assign it to yourself, send me a message, and I'll help you get started building the project.


It's worth pointing out that "compilers" is a very broad subject area, and it's easy to feel overwhelmed at first. There are actually multiple subtopics that have little to do with each other, and together they comprise "compiler development". Examples:

  • Parsing (you mentioned you're already familiar with formalisms such as NFAs)
  • Semantic analysis (something we refer to as "declaration checking" in Swift; I think there's a lot of subtlety here)
  • Type systems (this includes constraint solving, proof theory, logic...)
  • Optimization (graph algorithms come in handy here!)
  • Tooling (more about practical engineering than theory; this is the design of debuggers, linkers, code completion, etc. It is itself a very broad topic, with many subtopics one could study in depth on their own)
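To make the separation between these subtopics concrete, here is a minimal sketch in Python of a toy expression language. It is not how Swift or LLVM is structured; every name in it (`tokenize`, `parse`, `undeclared`, `fold`, and the node classes) is made up for illustration. It shows parsing, a very small semantic check (are all variables declared?), and one optimization (constant folding) as independent steps over the same tree. Type checking and tooling are left out entirely.

```python
import re
from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def tokenize(src):
    # Lexing: numbers, identifiers, and the symbols + * ( ).
    # Anything else is silently skipped in this sketch.
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", src)

def parse(tokens):
    # Parsing: recursive descent for
    #   expr := term ('+' term)*     term := atom ('*' atom)*
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return Num(int(tok))
        if tok.isidentifier():
            return Var(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = atom()
        while peek() == "*":
            advance()
            node = BinOp("*", node, atom())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = BinOp("+", node, term())
        return node

    return expr()

def undeclared(node, declared):
    # Semantic analysis: collect variables that are used but not declared.
    if isinstance(node, Var):
        return [] if node.name in declared else [node.name]
    if isinstance(node, BinOp):
        return undeclared(node.left, declared) + undeclared(node.right, declared)
    return []

def fold(node):
    # Optimization: constant folding, evaluated bottom-up.
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            if node.op == "+":
                return Num(left.value + right.value)
            return Num(left.value * right.value)
        return BinOp(node.op, left, right)
    return node

tree = parse(tokenize("x * (2 + 3)"))
print(undeclared(tree, {"x"}))  # no undeclared names here
print(fold(tree))               # the (2 + 3) subtree is folded into a constant
```

Each function only needs the tree, not the other stages, which is roughly why people can specialize in one area without being experts in the rest.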

Once you learn enough to gain a broad overview of the field, you should pick an area to specialize in. Even those of us that work on compilers for a living can't really hope to learn everything, so that's why on a typical team of compiler engineers you'll find experts on type theory, module systems, optimizations, debuggers, and so on.

One thing I've found is that textbooks about compilers heavily focus on parsing and optimization, but discussion of type checking is often rather minimal and the other areas are barely mentioned at all. It would be nice if someone wrote a "modern compiler frontend design" book that skipped parsing and code generation and did a deep dive on everything in between instead!


This would be very nice indeed!

This. Going from a compilers textbook to a project like LLVM/Clang/Swift is like building a paper airplane and then working on A320s. Nothing in my undergraduate or graduate compilers courses really prepared me for anything more than toy work. (Those courses predated Clang, so I hope modern courses try to do it a little better, but 15 weeks isn't nearly enough time.)

Compared to other projects of similar size and scope, LLVM and Swift are some of the cleanest, best-documented ones that I've experienced. Grab a debugger, sprinkle some breakpoints around, explore, and just start calling the dump() method on literally everything. There's a ton of useful information baked in there to figure out what's going on, and you'll probably be productive faster than you initially expected.


When you say contribute to Swift itself, do I need some background in compilers first? I have only just started with the Swift language and plan to go through the language docs this week.

There were some recommendations for books related to compilers, including:

  1. Compilers: Principles, Techniques, and Tools
  2. Compiler Design in C
  3. Modern Compiler Implementation in C
  4. And one for lex and yacc

Can someone comment on these books from experience, comparing what they teach with the technologies used in 2019? The concepts mostly remain the same, but what technologies are used to write compilers in 2019?

And @harlanhaskins, the post you mentioned has references for learning LISP, Scheme, or Racket. So to work on compilers, is there a specific set of languages that one needs?

There isn’t a specific language set one needs. Those are recommended because LISP languages are frequently used to teach language concepts: their syntax is minimal and they are homoiconic, meaning that you can manipulate the program itself in the language itself. As such, LISP-derived languages usually have strong macro systems that allow for metaprogramming and DSL design. Experimenting with a LISP will teach you that code is just data, and force you to think about program structure as the data structures underlying the source code, rather than purely the syntactic constructs.
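As a rough illustration of "code is just data", here is a small Python sketch. Python is not homoiconic, so this only imitates the idea by writing a program as nested lists, the way a LISP program is written as nested lists in its own syntax. The names `evaluate`, `expand_square`, and the `square` form are invented for this example.

```python
import operator

# Only two operators, to keep the sketch short.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    # A number evaluates to itself; a list is (operator, arg, arg, ...).
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result

def expand_square(expr):
    # A macro-like rewrite: the program is an ordinary list, so ordinary
    # list code can turn ["square", x] into ["*", x, x] before evaluation.
    if not isinstance(expr, list):
        return expr
    expr = [expand_square(part) for part in expr]
    if expr and expr[0] == "square":
        return ["*", expr[1], expr[1]]
    return expr

program = ["+", 1, ["square", ["+", 2, 3]]]
expanded = expand_square(program)
print(expanded)            # the "square" form has been rewritten away
print(evaluate(expanded))  # 1 + (2 + 3) * (2 + 3)
```

The evaluator never learns about `square`; the rewrite happens on the data structure first. In a real LISP that rewrite step is what the macro system gives you.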

I’d say that, no, you wouldn’t need a compiler background to contribute to Swift. I didn’t have any compiler background before I started contributing; the important thing is that you’re willing to learn the concepts as you go. But don’t feel like you have to start by contributing to Swift if you would prefer to read up on language theory first.


In terms of type theory specifically, how useful do you think books like Software Foundations, Practical Foundations for Programming Languages or even (gasp) Homotopy Type Theory would be? I see those being thrown around a lot, but are they practical or are they too "out there"? (For the record, the only one I've started to look at so far is Software Foundations, and while I enjoy it immensely, it also doesn't immediately seem to be applicable to anything.)

This. 100% this. I was in the exact same boat where I knew practically nothing about compilers in general. Even now, I still have a lot to learn, but at least it’s coming together little by little.

For me, implementing my random functions in the stdlib immediately made me want to learn other parts of the project. Just being exposed to writing code in the compiler, or in any project really, can give you the motivation to dive deeper into how it works and how you can contribute. Little by little, as I wrote new PRs, whether they were NFC, bug fixes, or new features, I got familiar with the project and its vast API. Another huge reason why contributing to the compiler is a lot less intimidating is that the Swift team at Apple has some fantastic and super smart people. They are very helpful and will give you guidance on a problem, or suggest alternatives that maybe you hadn’t thought of yet. I would say that contributing to Swift is unique: not only can you learn about compilers, the stdlib, or any related project that affects Swift, but you also get helpful advice and learn a lot from the engineers behind it.


And if you’re still a student, applications are open for Google Summer of Code!


@Graydon_Hoare put up slides from a recent talk of his, giving a great survey of many different compiler architectures and the history of compiler and language design:


Thank you to everyone who contributed these awesome resources. I have made a plan for studying the topic, with equal focus on theory and practice. It is such an awesome topic to work on.

And @harlanhaskins, I will work hard to get into Google Summer of Code.

EDIT: I will update the main post with my study plan based on all the recommendations.

I’m a student of life, does that count? :grin: