SwiftSyntax.SyntaxRewriter API suggestion

Now that SwiftSyntax provides mutable properties for nearly all important node and token properties (with the recent addition of TokenSyntax's leadingTrivia, trailingTrivia, and tokenKind Give SyntaxToken setters similar to Syntax nodes has for children by hartbit · Pull Request #88 · apple/swift-syntax · GitHub) it seems like SyntaxRewriter's API could be tweaked to fully benefit from Swift's unique mutable value system.

For example, the following rewriter:

struct IfRewriter: SyntaxRewriter {
  override func visit(_ node: IfStmtSyntax) -> IfStmtSyntax {
    // Making mutable
    var node = node

    if node.ifKeyword.trailingTrivia != .spaces(1) {
      node = node.withIfKeyword(
        node.ifKeyword.withTrailingTrivia(.spaces(1))
    }

    if node.body.leftBrace.trailingTrivia != [] {
      node = node.withBody(
        node.body.withLeftBrace(
          node.body.leftBrace.withoutTrailingTrivia()))
    }

    return node
  }
}

Could be greatly simplified if SyntaxRewriter's visit functions took inout nodes:

struct IfRewriter: SyntaxRewriter {
  override func visit(_ node: inout IfStmtSyntax) {
    if node.ifKeyword.trailingTrivia != .spaces(1) {
      node.ifKeyword.trailingTrivia = .spaces(1)
    }

    if node.body.leftBrace.trailingTrivia != [] {
      node.body.leftBrace.trailingTrivia = []
    }
  }
}

Are there any reasons this is a bad idea? Or should I go ahead and open a PR to propose this new API?

I'm CCing the people who might be directly interested in this topic. Please let me know if you prefer me not to do that.

CC @harlanhaskins @Xi_Ge @akyrtzi

I have 2 pet peeves with SyntaxRewriter, both of them related to its efficiency:

  • It rewrites the whole tree, even for the nodes you are not interested in, creating new raw nodes for everything.
  • The with... methods create new parent nodes all the way to the root of the tree that the node belongs to, but SyntaxRewriter doesn't need the parents, it treats the node you give it as a leaf node to add in its own tree. That means all the created new parent nodes are wasted. In the existing model, SyntaxRewriter should probably give you parent-less leaf nodes to rewrite at least. But then as a client you may need to check the parents.

I don't have specific suggestions right now but I'm taking the opportunity of this thread to suggest rethinking how SyntaxRewriter should work so it is more efficient to use. That may involve significantly changing how the API looks right now (e.g should it become a protocol like SyntaxVisitor, etc. ).

This is interesting. A few comments/questions:

This looks like an implement detail, independent of the API, does it not? Or is there something with the current API which makes this creation of new raw nodes mandatory?

Can you point me to where that is happening?

Clients would definitely want to check the parents IMHO. This seems like the wrong solution IMHO.

When you mention SyntaxVisitor, do you mean that it should use something similar to SyntaxVisitorContinueKind? Any other ideas on how the API should evolve?

Are you referring to what happens in replacingSelf? But isn't that necessary for keeping value semantics? COW :cow: to the rescue?

The API doesn't offer any way to indicate that you want to reuse the existing node, the default implementation assumes that every node it gets is a new node to fit in a brand new tree. Perhaps the API is fine and the implementation should be checking the node identity to see if it got the same node back, not sure.

Not sure how COW is related, this is about modifying the raw nodes which are immutable class references. The with... methods return a new node back and that node should have parents, and those parents need to be new raw nodes all the way to the root because their children layout needs to contain to the new raw child.