Redesigning parts of the Clang Importer

As I've been fixing bugs to, mostly, prevent C++ interop from crashing when importing various APIs, I've noticed a common theme: these bugs often occur when we do something differently than Clang does it. This makes sense because we're using Clang and importing C++. Here are a few examples of such bugs:

https://github.com/apple/swift/pull/36553
https://github.com/apple/swift/pull/35963
https://github.com/apple/swift/pull/35962
https://github.com/apple/swift/pull/35819

In fact, all these bugs occur because Clang is lazy and we are not. What I mean by that is, Clang often won't evaluate types or members until it really needs to (i.e., until they are actually used). For example, Clang will lazily evaluate inline static data members. So, if there's an error, it will only be emitted when the member is used. Because we try to evaluate every member, (before that patch) we would sometimes find invalid members that Clang hadn't yet instantiated and therefore didn't mark as invalid.

Another example of this can be demonstrated with the following snippet:

template<class T> struct X { X<X<T>> test(); } ;
void test(X<int>);

Clang has no problem compiling this program, but we fail to import it (luckily, we no longer hang/crash). Why is this? Because Clang won't evaluate the function type until it's actually called, whereas we will evaluate and instantiate the function type as we're importing it.

So, what's the solution? We could redesign some parts of the Clang importer to be lazier (especially when dealing with C++ templates). We have already started to do this with function templates (see #35819), but I think we could integrate laziness into the Clang importer in a much more fundamental way.

This would not only fix some of the issues above but also allow us to import more C++ APIs, more closely match what Clang does, and be vastly more performant, especially when importing templates.

In this post, I'm not proposing we do anything. I'm just bringing up a "problem" (or, more accurately, an observation about related problems) and brainstorming some possible solutions. I'm creating this post in response to the discussion @typesanitizer and I had here: cxx-interop Improve performance of deep template instantiations. by zoecarver · Pull Request #36553 · apple/swift · GitHub.

I'd love to hear what people think about making the Clang importer lazy and if there are any suggestions as to the best way to design this (I know there are some other parts of the compiler that rely on lazy evaluation, it would be helpful to know how those are implemented and if the model works well). As Varun said, taking a step back and thinking about the best implementation for the Clang importer might be extremely beneficial in the long run.

3 Likes

The Swift compiler became lazy with the RequestEvaluator:

Maybe this is a future direction for Clang too. You could ask Clang to type-check something even so it wouldn't have done so itself.

1 Like

It's interesting that @schuett mentions the request evaluator, because ClangImporter is a large chunk of the compiler that has not yet been requestified, and changing that might help. I know that @codafi has talked about this in the past; he might have something to contribute to the discussion.

Some random observations:

  • ClangImporter is highly re-entrant and does a lot of ad-hoc cycle breaking. For instance, there are places where we partially initialize a parent declaration, insert it into a lookup table, import related declarations which will need to reference it, and only finish initializing the parent once those are imported. It would be nice to formalize this kind of thing.

  • However, there aren't actually that many entry points to the importer (or at least to SwiftDeclConverter). About six months ago, I experimented with instrumenting the ClangImporter to get it to explain its importing logic, and managed to narrow it to five entry points. That makes me think you won't necessarily need a million different requests to make this work.

  • Some of ClangImporter's logic is easy to implement when you start from the Clang declaration, but quite difficult when you start from the Swift reference to that declaration. For instance, if you're starting from a qualified lookup of init(contentsOf:) in NSData, how do you map that to the Objective-C selector initWithContentsOfURL:error:? The answer we went with is that we build a lookup table of context + Swift name to ObjC declaration by examining all declarations ahead of time (in a separate compiler job that builds a PCH with the table, even!), but that's inherently an eager operation. Is it possible to make that lazy instead?

  • Some of ClangImporter's logic is much easier to implement when you examine and evaluate many sibling declarations together than it would be if you were evaluating them all separately. For instance, when SwiftDeclConverter::VisitEnumDecl() sees several enumerators with the same raw value, it imports the earliest (non-unavailable) enumerator as a real case and imports the others as static lets (actually, static computed variables). If you evaluate all of the enumerators together, it's pretty easy to use a little temporary state to track which enumerators should be imported as cases and which should be imported as static lets; if you try to make this fully lazy, then suddenly you need to factor this state into a separate computation, and you're probably going to end up at least partially processing all of the enumerators anyway.

4 Likes

By the way, I remember the implementation before the lookup table, and it was miserable. Members were imported as you referenced them, but that led to weird ordering issues because importing isn't quite idempotent. (Inheritance of requirements from protocols is a particularly nasty case in Objective-C.) Additionally, lots of that work was "lazy" at class granularity but eager once you got to members, i.e. importing one member of a particular category pulled them all in. There was no good way to support the swift_name attribute, of course. And finally, the lookup table actually sped up compilation quite a bit, because the importer didn't have to crawl through the Clang AST from scratch every time.

I'm not saying there's no good lazy solution here; we have plenty of parts of the compiler that improved on what was there before, and yet still can be improved themselves (potentially by replacement). But even so…thank you @Douglas_Gregor for designing and implementing the lookup table. :-)

EDIT and @Michael_Ilseman for separating the name lookup / name mapping from the rest of the importer. (I hope I've remembered correctly which of you were responsible for what.)

3 Likes