Why Swift AST partially use llvm::PointerUnion instead of class hierarchy?

I'm reading the source code of AST, it uses both llvm::PointerUnion and subtyping to represent AST nodes at different levels. For example:

// (1) union
struct ASTNode : public llvm::PointerUnion<Expr *, Stmt *, Decl *, Pattern *,
                                           TypeRepr *> {
    ...
}

// (2) subtyping
class InOutExpr : public Expr {
    ...
}

This approach make the visitor pattern more complicated to implement, so I guess there may be some performance optimizations that I don’t know about. Can someone explain this in more detail?

1 Like

I started writing an answer, then I found this: llvm - What can make C++ RTTI undesirable to use? - Stack Overflow, where @Chris_Lattner3 answers your exact question. :smiley:

5 Likes

Thanks! Now I understand the difference between C++ RTTI and LLVM's style RTTI, but I'm still a bit unsure: There does not seem to be much performance gap between llvm::PointerUnion and LLVM's style RTTI, is it just to save a few bytes of memory through tagged pointers?

There are multiple aspects to it:

  1. It's saving some space on individual types. Since the compiler creates lots of types, that adds up. Apart from having lower memory footprint being a generally good idea for cache hit rate, this matters for situations like Swift Playgrounds (the iOS app) which has less RAM available than a Macbook or some desktop-class device.

  2. Using dynamic_cast is slower than a few integer comparisons or a range check, which is what dyn_cast boils down to (you can see this manifest in the various classof methods).

      static bool classof(const Decl *D) { // on AccessorDecl
          return D->getKind() == DeclKind::Func ||
                 D->getKind() == DeclKind::Accessor;
      }
    

    Since we want to do pattern matching all the time (on types, on expressions etc.), being able to pattern match quickly is important. C++ dynamic_cast Performance « A howl on the wind…

    • member variable + reinterpret_cast is the fastest reliable way to
      determine type; however, that has a lot higher maintenance overhead
      when coding
  3. Generally, using the C++ style subclassing mechanism is coupled with using virtual functions, but dynamic dispatch is typically more difficult for an optimizer to optimize compared to static dispatch. We do use virtual functions in parts of the compiler where this doesn't really affect us. For example, ModuleFile has many virtual functions.

It's generally a "how many of this thing are we making" and "how often are we using this operation". If we're making tons of something, or using an operation lots of times, it's worthwhile to optimize. Another example is hashtables, where the code is quite complicated (IMO), but it is worthwhile to optimize because a meaningful fraction of the compiler's running time is spent doing hash table lookups.

2 Likes

Why do you think the visitor pattern would be simpler to implement if there was a common type that everything inherited from?

I ask because almost all of the complexity of the visitor pattern is due to the following:

  1. The need to have separate return types per AST node hierarchy.
  2. Instantiating various default and forwarding methods via macro metaprogramming.
1 Like

Thank you for pointing that out. I hadn't seen union type and subtying used together before, so there was some misunderstanding. I get it now.

1 Like