I'm reading the AST source code, and it uses both llvm::PointerUnion and subtyping to represent AST nodes at different levels. For example:
// (1) union
struct ASTNode : public llvm::PointerUnion<Expr *, Stmt *, Decl *, Pattern *,
                                           TypeRepr *> {
  ...
};
// (2) subtyping
class InOutExpr : public Expr {
  ...
};
This approach makes the visitor pattern more complicated to implement, so I suspect there are performance optimizations behind it that I don't know about. Can someone explain this in more detail?
Thanks! Now I understand the difference between C++ RTTI and LLVM-style RTTI, but I'm still a bit unsure: there doesn't seem to be much of a performance gap between llvm::PointerUnion and LLVM-style RTTI, so is it just to save a few bytes of memory through tagged pointers?
It's saving some space on individual types. Since the compiler creates lots of types, that adds up. Apart from a lower memory footprint being generally good for cache hit rate, this matters in situations like Swift Playgrounds (the iOS app), which has less RAM available than a MacBook or some desktop-class device.
Using dynamic_cast is slower than a few integer comparisons or a range check, which is what dyn_cast boils down to (you can see this manifest in the various classof methods).
Member variable + reinterpret_cast is the fastest reliable way to determine a type; however, it has a much higher maintenance overhead when coding.
Generally, C++-style subclassing is coupled with virtual functions, and dynamic dispatch is typically harder for an optimizer to optimize than static dispatch. We do use virtual functions in parts of the compiler where this doesn't really affect us; for example, ModuleFile has many virtual functions.
It generally comes down to "how many of this thing are we making" and "how often are we using this operation". If we're making tons of something, or performing an operation many times, it's worthwhile to optimize. Another example is hash tables, where the code is quite complicated (IMO), but the optimization is worthwhile because a meaningful fraction of the compiler's running time is spent doing hash table lookups.