Changes to the libSyntax API

Hey all,

I've been doing some work on libSyntax recently <https://github.com/apple/swift/pull/10926> to fill out the API, and I want to let swift-dev know what we're doing and the rationale behind it.

API

libSyntax is a new representation of the Swift syntax tree that preserves all source-level information and provides easy ways to transform and re-print the tree. It adds a notion of Trivia onto tokens, keeping track of things like spaces, newlines, and comments in the tree instead of throwing them out at lex-time.
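To make the idea of trivia concrete, here is a minimal sketch of tokens that carry their own leading and trailing trivia, so that printing the tokens reproduces the source text exactly. The types `TriviaKind`, `TriviaPiece`, `Token`, and the `print` function are hypothetical simplifications for illustration, not the real libSyntax classes.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of trivia; not the real libSyntax types.
enum class TriviaKind { Space, Newline, LineComment };

struct TriviaPiece {
  TriviaKind kind;
  unsigned count;   // number of repetitions, used by Space and Newline
  std::string text; // comment text, used by LineComment

  std::string str() const {
    switch (kind) {
    case TriviaKind::Space:       return std::string(count, ' ');
    case TriviaKind::Newline:     return std::string(count, '\n');
    case TriviaKind::LineComment: return text;
    }
    return "";
  }
};

// A token owns the trivia around it instead of the lexer discarding it.
struct Token {
  std::vector<TriviaPiece> leading;
  std::string text;
  std::vector<TriviaPiece> trailing;

  std::string str() const {
    std::string out;
    for (const auto &piece : leading) out += piece.str();
    out += text;
    for (const auto &piece : trailing) out += piece.str();
    return out;
  }
};

// Printing every token in order round-trips the original source.
std::string print(const std::vector<Token> &tokens) {
  std::string out;
  for (const auto &token : tokens) out += token.str();
  return out;
}
```

Because whitespace and comments live on the tokens, a tool can re-print a tree without a separate formatting pass.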

The libSyntax tree is structured as a tree of nodes with tokens at the leaves. For example, defer {} is structured as:

DeferStmt
  DeferKeyword: Token(kw_defer)
    leading trivia: none
    trailing trivia: Spaces(1)
  Body: CodeBlock
    LeftBrace: Token(l_brace)
      leading trivia: none
      trailing trivia: none
    Stmts: StmtList (empty; no children)
    RightBrace: Token(r_brace)
      leading trivia: none
      trailing trivia: none
  Semicolon (missing): Token(semi)
    leading trivia: none
    trailing trivia: none

libSyntax preserves this structure and provides typed accessors for each child. Say you want to transform this defer statement into this one:

defer {
}

To do this, you can apply transformations to the children of the defer:

auto newRightBrace = defer.getRightBrace()
                          .withLeadingTrivia(Trivia::newlines(1));
auto newDefer = defer.withRightBrace(newRightBrace);
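
The "with" style used above can be sketched as follows: each setter returns a modified copy and leaves the original node untouched. `MiniToken` and `MiniDeferStmt` are hypothetical stand-ins that flatten the tree and store trivia as plain strings. The real nodes are richer, so treat this as an illustration of the pattern only.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a token; trivia is stored as plain text here.
struct MiniToken {
  std::string leadingTrivia;
  std::string text;
  std::string trailingTrivia;

  // Returns a copy with new leading trivia; the receiver is not modified.
  MiniToken withLeadingTrivia(std::string trivia) const {
    MiniToken copy = *this;
    copy.leadingTrivia = std::move(trivia);
    return copy;
  }

  std::string str() const { return leadingTrivia + text + trailingTrivia; }
};

// Hypothetical, flattened stand-in for a defer statement with an empty body.
struct MiniDeferStmt {
  MiniToken deferKeyword;
  MiniToken leftBrace;
  MiniToken rightBrace;

  const MiniToken &getRightBrace() const { return rightBrace; }

  // Returns a copy with the right brace replaced; the shape stays the same.
  MiniDeferStmt withRightBrace(MiniToken brace) const {
    MiniDeferStmt copy = *this;
    copy.rightBrace = std::move(brace);
    return copy;
  }

  std::string str() const {
    return deferKeyword.str() + leftBrace.str() + rightBrace.str();
  }
};
```

Since every "setter" produces a new value, the original statement can still be printed or compared after the transformation.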

API Generation

As libSyntax's API is intended to provide a set of safe operations on a Syntax node, we've structured it such that all Syntax nodes are arrays of children, and each node provides safe getters and "setters" for its children. libSyntax just needs to make sure the public accessors for these children set up nodes with the correct structure and don't provide entry points that change the structure. This ends up being repetitive and bug-prone if implemented manually.

Instead, we've reworked libSyntax to be generated automatically from a declarative definition of Syntax nodes and their children. This definition is fed to gyb to implement all the libSyntax APIs, including the Syntax nodes themselves, the SyntaxFactory implementation, each of the SyntaxBuilders, and the SyntaxKind enum.


We're using gyb for this because we'd like to represent the Syntax tree declaratively. This API necessitates more expressivity than is possible with macros (or, if it is possible, it would require a soup of macro metaprogramming that I personally think would be inscrutable). We originally started implementing this using TableGen, but it quickly became messy and hard to reason about. Using gyb would also make it easy to generate a Swift representation of the Syntax tree.

For example, the SyntaxBuilder APIs need to set up a missing structure of each child in each node:

class DeferStmtBuilder {
  RawSyntax::LayoutList layout = {
    RawSyntax::missingToken(tok::kw_defer),
    RawSyntax::missing(SyntaxKind::StmtList),
    RawSyntax::missingToken(tok::semi),
  };

and then also generate useFoo (or addFoo, if the child represents a collection like a StmtList) methods for each child:

public:
  DeferStmtBuilder &useDeferKeyword(TokenSyntax kw);
  DeferStmtBuilder &addStmt(StmtSyntax stmt);
  DeferStmtBuilder &useSemicolon(TokenSyntax semi);

  DeferStmtSyntax build();
};

I cannot figure out a way to do this with macros, because it requires multiple iterations over all children of a node in the middle of an expansion of the node macro.
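
To illustrate the "multiple iterations" point, here is a rough sketch of a generator that walks one declarative child list twice: once to emit the missing layout, and once to emit the use/add methods. It is written in C++ only to match the other snippets in this mail. gyb templates are driven by Python, and `ChildDef`, `NodeDef`, and `generateBuilder` are hypothetical names rather than part of libSyntax or gyb.

```cpp
#include <string>
#include <vector>

// Hypothetical declarative description of one child of a Syntax node.
struct ChildDef {
  std::string name;        // e.g. "DeferKeyword"; for collections, the element name
  std::string type;        // e.g. "TokenSyntax"
  std::string missingExpr; // expression that creates the "missing" raw child
  bool isCollection;       // collections get addFoo instead of useFoo
};

// Hypothetical declarative description of a whole node.
struct NodeDef {
  std::string name; // e.g. "DeferStmt"
  std::vector<ChildDef> children;
};

// Emits the source text of a builder class for the given node.
std::string generateBuilder(const NodeDef &node) {
  const std::string builder = node.name + "Builder";
  std::string out = "class " + builder + " {\n";

  // First pass over the children: the initial all-missing layout.
  out += "  RawSyntax::LayoutList layout = {\n";
  for (const auto &child : node.children)
    out += "    " + child.missingExpr + ",\n";
  out += "  };\n\npublic:\n";

  // Second pass over the same children: one use/add method per child.
  for (const auto &child : node.children) {
    const std::string verb = child.isCollection ? "add" : "use";
    out += "  " + builder + " &" + verb + child.name + "(" + child.type +
           " value);\n";
  }

  out += "\n  " + node.name + "Syntax build();\n};\n";
  return out;
}
```

With a loop-capable generator, the two passes are just two `for` loops over the same data, which is the part that is awkward to express inside a single macro expansion.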

Additionally, it would be (in my opinion) a bad idea to use macros to generate the Swift API as well, given Swift doesn't have a native macro system.

With that, I think gyb is the right tool for the job.


libSyntax's API, moving forward, will be generated from a declarative node structure using gyb. In the future, we hope to add a Swift API for libSyntax using the same declarative structure. We want libSyntax to be a great asset for writing Swift tools.

Best,

Harlan Haskins
