tl;dr: Providing direct access to the parser makes SwiftSyntax 8x faster, and about 2x faster than the legacy sourcekitd syntactic request.
Currently there are 2 ways to create a SwiftSyntax tree from a source file:
- Parse the JSON output from the compiler executable (slow)
- Use sourcekitd + binary serialization format (fast)
Compared with the legacy request, a major benefit of SwiftSyntax is that it allows for incremental re-parsing. It is still critical, though, for SwiftSyntax to have very fast, real-time performance when parsing a lot of code (e.g. the first time a file is opened, when processing multiple files, etc.).
In that respect, even the fastest available approach for SwiftSyntax is several times slower than the legacy sourcekitd syntactic request.
To see how much faster SwiftSyntax can be, I prototyped a C library that exposes a couple of APIs that use the Swift parser to create the raw syntax-tree nodes that SwiftSyntax needs and pass them to the client. I then modified SwiftSyntax to call these C APIs and get back the raw syntax nodes directly from the parser.
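To make the idea concrete, here is a minimal sketch of what such an interface could look like. Everything in it is hypothetical: the `demo_` names, the node layout, and the use of a callback are assumptions made for illustration, not the prototype's actual API, and a toy whitespace tokenizer stands in for the real Swift parser. The point is only the shape: the library walks the source and hands each raw node straight to the client in-process, with no serialization step in between.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical raw node: a kind tag plus a byte range in the source buffer. */
typedef struct {
    int kind;       /* 0 = identifier-like token, 1 = anything else */
    size_t offset;  /* start of the token in the source buffer */
    size_t length;  /* token length in bytes */
} demo_raw_node;

/* Client-supplied callback that receives each node as it is produced. */
typedef void (*demo_node_handler)(const demo_raw_node *node, void *context);

/* Toy stand-in for a parser entry point: splits the buffer on whitespace
 * and reports every token to the handler. Returns the number of nodes. */
size_t demo_parse(const char *source, size_t length,
                  demo_node_handler handler, void *context) {
    size_t i = 0;
    size_t count = 0;
    while (i < length) {
        if (isspace((unsigned char)source[i])) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < length && !isspace((unsigned char)source[i])) {
            i++;
        }
        demo_raw_node node;
        unsigned char first = (unsigned char)source[start];
        node.kind = (isalpha(first) || first == '_') ? 0 : 1;
        node.offset = start;
        node.length = i - start;
        handler(&node, context);
        count++;
    }
    return count;
}

/* Example client: counts the nodes it is handed. */
void demo_count_nodes(const demo_raw_node *node, void *context) {
    (void)node;
    *(size_t *)context += 1;
}
```

A client would call `demo_parse(buffer, size, demo_count_nodes, &counter)` and build its own tree inside the callback. A callback is just one plausible way to pass nodes across a C boundary; returning an opaque tree handle would be another.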
The performance gains of this approach are significant; see the timings below. As the test case, I used the largest file in the firefox-ios repo and copied it 2x, for a total of 4,120 LOC.
| | compiler side | SwiftSyntax side | total |
|---|---|---|---|
| legacy syntactic request | 25 ms | | 25 ms |
| SwiftSyntax/sourcekitd | 44 ms | 60 ms | 104 ms |
| SwiftSyntax/direct parse | 11 ms | 2 ms | 13 ms |
As you can see, with the C library SwiftSyntax became 8x faster, and even about 2x faster than the legacy sourcekitd syntactic request!
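For anyone who wants to check the ratios, they follow directly from the totals in the table: 104 ms / 13 ms = 8, and 25 ms / 13 ms ≈ 1.9. A small helper (the name `speedup` is just for illustration) makes the arithmetic explicit:

```c
/* Speedup of a new timing relative to a baseline, both in the same unit.
 * A result of 8.0 means the new approach takes one eighth of the time. */
double speedup(double baseline_ms, double new_ms) {
    return baseline_ms / new_ms;
}
```

Here `speedup(104.0, 13.0)` gives the factor over SwiftSyntax/sourcekitd, and `speedup(25.0, 13.0)` gives the factor over the legacy syntactic request.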
The only potential downside of this approach is that we are losing the crash protection that sourcekitd provided, but:
- The parser is a much less complicated system than the rest of the compiler pipeline, and given the additional quality assurance tools that we have, like the amazing swift stress tester, I believe we can be confident that we will protect SwiftSyntax against crashing bugs.
- Moving SwiftSyntax off sourcekitd has an additional upside beyond performance. A long-standing desire of ours has been to isolate the syntactic functionality from sourcekitd crashes (a typechecker crash should have no effect on syntactic functionality), and this approach allows us to do that.
Given how critical the performance of SwiftSyntax is for allowing it to replace all uses of the legacy syntactic request, I believe this approach is the way forward in order to achieve this goal.