Swift Syntax Structured Editing Library


(David Farler) #1

Good day swift-dev!

A truly modern compiler has to have excellent IDE and tools integration, as we have done through the SourceKit layer, providing code completion, documentation comments, syntax highlighting, and more. There's no reason to stop there, so I'd like to announce work on the Swift Syntax and Structured Editing library, which will be growing in lib/Syntax in the coming weeks. The overall goal of the library is to provide a representation and a body of APIs for structured editing on Swift syntax.

The immediate goals of the library are to provide infrastructure for the Swift Migrator and for a first-class formatting tool, which we'll be bringing to the open source tree and developing publicly.

Eventually, I also want to use this as the main representation for syntax, which will allow us to incrementally re-parse after edits. This will improve responsiveness in editors when working on large files, where an edit can cost time linear in the size of the file because the whole file is re-parsed and re-type-checked. The current AST doesn't make a clear distinction between syntactic and semantic information. By separating these concerns, we have a better hope of incrementally or lazily type-checking only the parts of a syntax tree that need it after an edit.

Here are some key attributes of the new Syntax nodes. In summary, they:

  • back all of the purely syntactic information for a piece of source
  • are free of semantic information
  • have "full fidelity" access to all of the "trivia", like whitespace and comments
  • are immutable, thread-safe, and can share substructure
  • have straightforward structured editing APIs
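To make the "full fidelity" and "share substructure" points concrete, here is a minimal sketch of how such nodes could behave. It is a toy model only: `Trivia`, `Node`, `sourceText`, and `replacingChild` are hypothetical names chosen for illustration and are not taken from lib/Syntax.

```swift
// Toy model for illustration only; none of these names come from lib/Syntax.

/// Trivia is source text with no syntactic meaning. Keeping it on the tokens
/// is what lets the tree print back the exact original source.
enum Trivia {
    case spaces(Int)
    case newlines(Int)
    case lineComment(String)

    var text: String {
        switch self {
        case .spaces(let count): return String(repeating: " ", count: count)
        case .newlines(let count): return String(repeating: "\n", count: count)
        case .lineComment(let comment): return comment
        }
    }
}

/// An immutable node: every stored property is a `let`, so a node can be
/// shared between trees without copying.
final class Node {
    let kind: String
    let text: String            // token text; empty for interior nodes
    let leading: [Trivia]
    let trailing: [Trivia]
    let children: [Node]

    init(kind: String, text: String = "", leading: [Trivia] = [],
         trailing: [Trivia] = [], children: [Node] = []) {
        self.kind = kind
        self.text = text
        self.leading = leading
        self.trailing = trailing
        self.children = children
    }

    /// Source text of this subtree, trivia included.
    var sourceText: String {
        let before = leading.map { $0.text }.joined()
        let inner = children.map { $0.sourceText }.joined()
        let after = trailing.map { $0.text }.joined()
        return before + text + inner + after
    }

    /// Returns a new node with one child replaced. The other children are
    /// the same objects as in `self`, not copies.
    func replacingChild(at index: Int, with newChild: Node) -> Node {
        var newChildren = children
        newChildren[index] = newChild
        return Node(kind: kind, text: text, leading: leading,
                    trailing: trailing, children: newChildren)
    }
}

// Models the line `let x = 1 // one` followed by a newline.
let original = Node(kind: "VariableDecl", children: [
    Node(kind: "Keyword", text: "let", trailing: [.spaces(1)]),
    Node(kind: "Identifier", text: "x", trailing: [.spaces(1)]),
    Node(kind: "Equal", text: "=", trailing: [.spaces(1)]),
    Node(kind: "IntegerLiteral", text: "1",
         trailing: [.spaces(1), .lineComment("// one"), .newlines(1)]),
])

// Editing produces a new root; `original` is left untouched.
let renamed = original.replacingChild(
    at: 1, with: Node(kind: "Identifier", text: "count", trailing: [.spaces(1)]))
```

In this sketch the comment and whitespace survive the edit because they live on the tokens, and `renamed.children[3] === original.children[3]` holds because the untouched literal is shared rather than copied.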

Syntax transformation passes can be combined to form a pipeline. This foundation has served us well in the Migrator: composition makes it easy to build up complex batch operations, and it also makes smaller transformations easier to reason about, test, and maintain. We consider this superior to Clang migrators, where textual edits are based on buffer ranges. Those edits do not compose well: the way to pipeline transformation passes there is to rebuild a new AST from scratch.
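As a rough sketch of why tree-to-tree passes compose, assume a pass is just a function from one immutable tree to another. The `Tree`, `Pass`, `renamingToken`, and `pipeline` names below are hypothetical and exist only for this example; the real passes in the Migrator are more involved.

```swift
// Toy model for illustration only; not the Migrator's or lib/Syntax's API.

enum Tree {
    case token(String)
    case node([Tree])

    /// Source text of the subtree (this toy tree carries no trivia).
    var text: String {
        switch self {
        case .token(let text): return text
        case .node(let children): return children.map { $0.text }.joined()
        }
    }

    /// Returns a new tree with `transform` applied to every token.
    func mappingTokens(_ transform: (String) -> String) -> Tree {
        switch self {
        case .token(let text):
            return .token(transform(text))
        case .node(let children):
            return .node(children.map { $0.mappingTokens(transform) })
        }
    }
}

/// A pass takes a tree and returns a new tree.
typealias Pass = (Tree) -> Tree

/// A small pass: rename every token spelled `old` to `new`.
func renamingToken(_ old: String, to new: String) -> Pass {
    return { tree in tree.mappingTokens { $0 == old ? new : $0 } }
}

/// Chains passes left to right; each pass sees the previous pass's output.
func pipeline(_ passes: [Pass]) -> Pass {
    return { tree in passes.reduce(tree) { current, pass in pass(current) } }
}

let call = Tree.node([.token("f"), .token("("), .token("x"), .token(")")])
let migrate = pipeline([
    renamingToken("f", to: "compute"),
    renamingToken("x", to: "input"),
])
let migrated = migrate(call)   // migrated.text is "compute(input)"
```

The first rename changes the length of the text, which would shift every later buffer offset in a range-based scheme. Here the second pass never deals with offsets, because it receives a tree rather than a buffer.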

The properties of the Syntax tree, and the information that it captures, make it a good match for other functionality such as indentation and code formatting.

In order to minimize disruption to the compiler pipeline in the short term, we propose building this Syntax tree outside of the normal compile workflow, on demand, gathering the source trivia as it goes. This is similar to how the Migrator works today. As the library develops further, it will be integrated into other areas.

Structured Editing APIs

So, what do we mean by "structured editing"? Here are a few examples:

  • Add an argument label to a function declaration
  • Change a global function call to a method call based on import attributes, such as we did for CoreGraphics "import-as-member"
  • Change a generic parameter name in a generic struct declaration
  • Add a requirement to a generic where clause
  • Indent or reformat any syntax according to declarative formatting rules
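The first example in the list, adding an argument label, might look like this when expressed as an edit on structure rather than on text. This is only a sketch: `Parameter`, `FunctionDecl`, and `addingLabel` are hypothetical stand-ins, and a real declaration node would carry far more than three fields.

```swift
// Toy model for illustration only; not the lib/Syntax node types.

struct Parameter {
    let label: String?      // external argument label, if written explicitly
    let name: String        // local parameter name
    let typeName: String

    var text: String {
        let prefix = label.map { $0 + " " } ?? ""
        return prefix + name + ": " + typeName
    }
}

struct FunctionDecl {
    let name: String
    let parameters: [Parameter]

    var text: String {
        let list = parameters.map { $0.text }.joined(separator: ", ")
        return "func \(name)(\(list))"
    }

    /// Returns a copy of the declaration with an explicit label on one parameter.
    func addingLabel(_ label: String, toParameterAt index: Int) -> FunctionDecl {
        var updated = parameters
        let old = updated[index]
        updated[index] = Parameter(label: label, name: old.name, typeName: old.typeName)
        return FunctionDecl(name: name, parameters: updated)
    }
}

let decl = FunctionDecl(name: "move", parameters: [
    Parameter(label: nil, name: "distance", typeName: "Int"),
    Parameter(label: nil, name: "speed", typeName: "Int"),
])
let edited = decl.addingLabel("by", toParameterAt: 0)
// decl.text   is "func move(distance: Int, speed: Int)"
// edited.text is "func move(by distance: Int, speed: Int)"
```

The edit names the thing being changed (the first parameter) instead of a character range, so it stays valid no matter how the surrounding source is laid out.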

These kinds of diverse operations are critical to the Migrator, which needs to be able to accurately make changes at all granularities of the Syntax tree.

There are a few major use patterns that will come up first:

  • Creating new Syntax nodes and tokens. These will be exposed in a single place, the SyntaxBuilder, for better discoverability and code completion.
  • Building up Syntax nodes in a single call. I call these "Make APIs".
  • Building up Syntax nodes gradually as their syntax appears, finalizing the process at any point and marking expected syntax as 'missing'. The parser is a typical manifestation of this use case. I call these "Builder APIs".
  • Modifying Syntax nodes. Effectively setters but, since Syntax nodes are immutable, these are called "With APIs".
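The three API styles named above can be sketched side by side on a toy return statement. Everything here is an assumption made for illustration: `Token`, `ReturnStmt`, `ReturnStmtBuilder`, and their methods are hypothetical, and the prototype in the pull request should be treated as the reference for the actual shapes.

```swift
// Toy types for illustration only; these are not the lib/Syntax APIs.

struct Token: Equatable {
    let text: String
    let isMissing: Bool

    init(_ text: String, isMissing: Bool = false) {
        self.text = text
        self.isMissing = isMissing
    }

    /// A placeholder for a token that was expected but not present in source.
    static func missing(_ text: String) -> Token {
        return Token(text, isMissing: true)
    }
}

struct ReturnStmt: Equatable {
    let returnKeyword: Token
    let expression: Token

    /// "Make" style: every child is supplied in a single call.
    static func make(returnKeyword: Token, expression: Token) -> ReturnStmt {
        return ReturnStmt(returnKeyword: returnKeyword, expression: expression)
    }

    /// "With" style: a setter that returns a modified copy.
    func withExpression(_ newExpression: Token) -> ReturnStmt {
        return ReturnStmt(returnKeyword: returnKeyword, expression: newExpression)
    }

    /// Source text; missing tokens contribute nothing.
    var text: String {
        return [returnKeyword, expression]
            .filter { !$0.isMissing }
            .map { $0.text }
            .joined(separator: " ")
    }
}

/// "Builder" style: children arrive one at a time, and `build()` can be
/// called at any point. Children not seen yet are marked as missing.
struct ReturnStmtBuilder {
    private var returnKeyword: Token?
    private var expression: Token?

    init() {}

    mutating func useReturnKeyword(_ token: Token) { returnKeyword = token }
    mutating func useExpression(_ token: Token) { expression = token }

    func build() -> ReturnStmt {
        return ReturnStmt(
            returnKeyword: returnKeyword ?? Token.missing("return"),
            expression: expression ?? Token.missing("<expression>"))
    }
}

// Make: all at once.
let made = ReturnStmt.make(returnKeyword: Token("return"), expression: Token("x"))

// Builder: finalize after seeing only the keyword, as a parser might.
var builder = ReturnStmtBuilder()
builder.useReturnKeyword(Token("return"))
let partial = builder.build()   // partial.expression.isMissing is true

// With: `made` is unchanged; `changed` is a new value.
let changed = made.withExpression(Token("y"))
```

Marking absent children as missing, rather than failing, lets the builder return a well-formed node for incomplete source, which is the situation a parser or an editor is in most of the time.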

Higher level APIs and syntax rewriter abstractions will be built on top of these as more of the grammar is built up in the library.

There is a pull request with a prototype of the current iteration of the APIs. This is still a work in progress and will likely be molten for a few weeks.

https://github.com/apple/swift/pull/7393

Looking forward to your comments and suggestions!

Regards,
David


(David Farler) #2

Hi all,

I've now merged this to master. Just want to say I appreciate the enthusiasm - I'm really excited about all of the things this work will enable.

You can track progress in the following places:

https://github.com/apple/swift/tree/master/lib/Syntax/README.md
For an overview of the APIs, implementation, and testing information.

https://github.com/apple/swift/tree/master/lib/Syntax/Status.md
For implementation status and referenced SRs.

https://bugs.swift.org/browse/SR-3968?filter=10739
I'll be using the 'Syntax', 'Format', and 'Migrator' labels to track tasks and bugs. I'll be working quickly but I'll try to peel off some starter bugs and give a shout out when I can.

Stay Swifty-
DF

On Feb 10, 2017, at 2:16 PM, David Farler via swift-dev <swift-dev@swift.org> wrote:
