[GSoC 2022] Use SwiftSyntax itself to generate SwiftSyntax’s source code instead of GYB

Hi everyone!

I'm very excited to finally be participating in the Swift forums. I've been using Swift since its release in 2014, and now it's time to become a part of the community!

I'm Mostfa Essam. I'm very interested in working on the GSoC project Use SwiftSyntax itself to generate SwiftSyntax using SwiftSyntaxBuilder mentored by @ahoppen.

I've been reading and familiarizing myself with the codebase for the past two weeks.

Recently, I tried to generate some parts of SyntaxRewriter.swift.gyb using SwiftSyntaxBuilder. Here’s my work:

I've imported some Python classes, such as Node and Kind, into Swift so that the output is very close to the one generated with GYB.

I would like to know what you think about my work, and to discuss some questions with you.

Questions and Challenges

  • Comments Generation
    As far as I can see, SwiftSyntax/SwiftSyntaxBuilder doesn’t support comments. Is this something we should prepare to implement?
    I know that a comment is always part of the parse tree but not of the AST. How can we support this?

  • Code Generation Integration
    How can we integrate code generation into the build process, and how can we place SwiftSyntaxBuilder code inside the current source files? Currently, I’m using the SourceFile result builder to check my results. I'm thinking about something like this:


/// SyntaxBuilder behaves like SourceFile, but when swift is built using build-script, it will be replaced with the code generated from the builder
SyntaxBuilder { 

/* valid Builder statements */
}
/* 
which can be a replacement for GYB Lines like the following ones

% for node in SYNTAX_NODES:
%   if is_visitable(node):
  open func visit(_ node: ${node.name}) -> ${node.base_type} {
    return ${node.base_type}(visitChildren(node))
  }

%   end
% end
*/
  • Finally, I can see some differences in the spacing of the generated code. Is this something we should handle?
// GYB-generated function signature
) -> TypeSyntax {
// SwiftSyntaxBuilder-generated signature
)-> StmtSyntax{

// While the generated code is alike, the space before the arrow
// and the space before the opening curly brace aren't generated.
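
For comparison with the GYB loop quoted above, here is a minimal sketch of the same iteration in plain Swift. It uses string interpolation rather than SwiftSyntaxBuilder, so it only illustrates the shape of the loop. NodeSpec, isVisitable, renderVisit, and the sample entries are hypothetical stand-ins for the Python SYNTAX_NODES data, not the real definitions.

```swift
/// Hypothetical stand-in for one entry of the Python SYNTAX_NODES list.
struct NodeSpec {
    let name: String
    let baseType: String
    let isVisitable: Bool
}

/// Renders one `visit` override, mirroring the body of the GYB template.
func renderVisit(for node: NodeSpec) -> String {
    return """
      open func visit(_ node: \(node.name)) -> \(node.baseType) {
        return \(node.baseType)(visitChildren(node))
      }
    """
}

// Illustrative sample data only.
let nodes = [
    NodeSpec(name: "FunctionDeclSyntax", baseType: "DeclSyntax", isVisitable: true),
    NodeSpec(name: "ReturnStmtSyntax", baseType: "StmtSyntax", isVisitable: true),
    NodeSpec(name: "ExampleHiddenSyntax", baseType: "Syntax", isVisitable: false),
]

// Equivalent of `% for node in SYNTAX_NODES:` plus `% if is_visitable(node):`.
let generated = nodes
    .filter { $0.isVisitable }
    .map(renderVisit(for:))
    .joined(separator: "\n\n")
print(generated)
```

The filter and map calls play the role of the two GYB control lines; in the actual project the string template would be replaced by builder nodes.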

About me

I’m a fresh computer engineer. I’ve been learning about programming since I was 12; I started with Objective-C and now use Swift and Python.
I’m generally curious about 3D computer graphics, metaheuristic optimization, and machine learning.

I’ve made a couple of open-source contributions, here are some of them

I can't wait to start contributing to Swift!

Thank you for reading up to this point!

Greetings, Mostfa.


Update 1: I've tried to implement a proof-of-concept CommentStmt, which could be used as follows:

bodyBuilder: {
    // CommentStmt conforms to StmtBuildable and ExpressibleAsCommentStmt,
    // with TokenSyntax = .identifier("// \(comment)").
    // There must be a better and cleaner way to do it; I will dig more!
    CommentStmt("This's a comment")
    ReturnStmt {
        FunctionCallExpr(calledExpression: IdentifierExpr("base")) {
            TupleExprElement(
                expression: FunctionCallExpr(calledExpression: IdentifierExpr("visitChildren")) {
                    TupleExprElement(expression: IdentifierExpr("node"))
                }
            )
        }
    }
}

which adds the comment to the final result:

public func visit(_ node: input)-> output{
    // This's a comment
    return base(visitChildren(node))
}

I just wanted to implement it as a proof of concept, as I'm still not really sure whether SwiftSyntaxBuilder should support it or not.

Update 2: About code generation and replacement, I can see that the generation work starts from:
1- handle_gyb_source_single in SwiftHandleGybSources, if building Swift.
2- generate_single_gyb_file in build-script.py, if building swift-syntax only.

So I think I can start investigating from here to understand how the new code should work.

I think we will need a call to the Swift compiler so that we can generate the code of SourceFile { /* Valid code */ }.
@ahoppen, please let me know what you think about this direction and whether there's a better approach.


Update 3: Regarding code generation (Update 2), here's what I've tried so far:

1- This is how a *.swift.gyb file might look:

A special comment, //#!, is placed before the SourceFile and at the end of it to indicate special handling.

2- When a *.swift.gyb file is parsed, the lines between the //#! markers are extracted and placed in SourceFileHolder.swift, which is a temporary holder for the SourceFile.

3- Then a special Swift command-line tool runs SourceFileHolder.swift, parses its output, and places that output (the generated code) in the newly generated *.swift file.

I've tried to implement this flow using Swift's ArgumentParser, and it seems to be working.

But I do believe there might be a better way to do it. Also, reading and writing SourceFileHolder.swift causes unneeded overhead, IMO.

I'm looking at TextOutputStream to understand how I can use it to generate and return the code in a better manner.
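
The marker-scanning step of this flow could be sketched roughly as follows. This is only an illustration under assumptions: extractMarkedRegions is a hypothetical helper name, it treats a line consisting solely of //#! as a marker, and it silently drops a region whose closing marker is missing.

```swift
import Foundation

/// Returns the lines found between pairs of marker lines.
/// The marker text `//#!` follows the description above; the function name
/// and the handling of unterminated regions are assumptions of this sketch.
func extractMarkedRegions(from source: String, marker: String = "//#!") -> [[String]] {
    var regions: [[String]] = []
    var current: [String]? = nil   // non-nil while inside a marked region

    for line in source.split(separator: "\n", omittingEmptySubsequences: false) {
        if line.trimmingCharacters(in: .whitespaces) == marker {
            if let open = current {
                regions.append(open)   // closing marker: finish the region
                current = nil
            } else {
                current = []           // opening marker: start a new region
            }
        } else {
            current?.append(String(line))   // no-op outside a region
        }
    }
    return regions
}

let template = """
import Foundation
//#!
SourceFile {
  ImportDecl(path: "UIKit")
}
//#!
struct Handwritten {}
"""

let regions = extractMarkedRegions(from: template)
print(regions)
```

Each extracted region would then be what gets written to SourceFileHolder.swift in step 2.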

Looking forward to hearing your feedback.

Thank you.

Hi @iMostfa, great to hear that you are interested in the SwiftSyntaxBuilder GSoC project and that you have already made some progress. Below are a few thoughts on your questions. I hope they answer them. If not, please let me know.

SwiftSyntax does support comments by using trivia. We consider everything that is not really relevant as far as compilation is concerned, but that is still part of the source file, like whitespace and comments, as trivia. The idea is that every token in the source file might contain leading and trailing trivia.

As far as SwiftSyntaxBuilder is concerned, you are right, the support for adding comments is very scarce. I think it’s possible to add a comment before the first token of the node you are creating by passing it as leadingTrivia to the buildSyntax call, but that support definitely needs to be improved.

I would, however, refrain from implementing support for comments as a CommentStmt because comments really aren’t statements (where a statement is something that can be executed at runtime). I would add them to the SyntaxBuildable nodes and make sure to add the comments as leading trivia when we generate a SwiftSyntax node from a SyntaxBuildable node.
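
To make the trivia idea concrete, here is a toy model in plain Swift. It is not the SwiftSyntax API: TriviaPiece and Token below are simplified, hypothetical types that only show how a comment can travel as leading trivia of the first token instead of being a statement of its own.

```swift
/// Toy model: a piece of trivia is whitespace or a comment.
enum TriviaPiece {
    case spaces(Int)
    case newlines(Int)
    case lineComment(String)   // text includes the leading "//"

    var text: String {
        switch self {
        case .spaces(let count): return String(repeating: " ", count: count)
        case .newlines(let count): return String(repeating: "\n", count: count)
        case .lineComment(let comment): return comment
        }
    }
}

/// Toy model: every token carries the trivia before and after it.
struct Token {
    var leading: [TriviaPiece] = []
    var text: String
    var trailing: [TriviaPiece] = []

    /// The token exactly as it would appear in the source file.
    var source: String {
        leading.map(\.text).joined() + text + trailing.map(\.text).joined()
    }
}

// The comment is attached to `return` as leading trivia.
let tokens = [
    Token(leading: [.lineComment("// This's a comment"), .newlines(1)],
          text: "return",
          trailing: [.spaces(1)]),
    Token(text: "base"),
]
let source = tokens.map(\.source).joined()
print(source)
```

Because the comment lives on the token, removing or replacing the statement keeps the tree shape unchanged; only the trivia differs.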

The idea is that we have some kind of Python script / executable that generates the source code and writes it back into the package. The way I would go about doing this, is to have a .gyb file which basically translates the Python files in gyb_syntax_support to corresponding Swift files (which only contain the abstract definitions, no generated source code yet) and writes those to a GenerateSwiftSyntaxBuilder module. That module could then be compiled using SwiftPM into an executable generate-swift-syntax-builder that, when run, generates source code and writes it into the SwiftSyntaxBuilder module. That generated source code can then be committed into the repo and used by clients of the library.
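
As a rough sketch of the "executable that writes generated source into the package" idea, the following plain Swift program maps output file names to generator closures and writes each result to a directory. The generators table, the writeGeneratedFiles name, and the temporary output directory are all assumptions made for illustration; the real tool would target the SwiftSyntaxBuilder module and build its contents with SwiftSyntaxBuilder.

```swift
import Foundation

/// Hypothetical table: output file name -> closure producing the file contents.
let generators: [String: () -> String] = [
    "Tokens.swift": { "// Generated file, do not edit.\n" },
]

/// Runs every generator and writes its output into `directory`.
func writeGeneratedFiles(to directory: URL) throws {
    try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
    for (fileName, generate) in generators {
        let destination = directory.appendingPathComponent(fileName)
        try generate().write(to: destination, atomically: true, encoding: .utf8)
    }
}

// A temporary directory stands in for the module's source folder here.
let outputDirectory = FileManager.default.temporaryDirectory
    .appendingPathComponent("GeneratedSketch")
try writeGeneratedFiles(to: outputDirectory)
```

The written files are plain source files, so they can be committed and consumed by clients without running the generator.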

Yes, that’s expected. Spacing is a non-trivial issue and I wouldn’t worry about it too much for now. Maybe we can later include swift-format into the generation process to make the generated files pretty.


Thank you @ahoppen for your answers and support.

1- About comment support: thank you for pointing me to Trivia. My intention with CommentStmt was to make it DSL-style, but I didn't think much about the Stmt naming. If I understood you correctly,

we may implement a modifier (like SwiftUI modifiers) like this:

  FunctionDecl(identifier: .func, signature: signature)
 .comment("This's a comment") // <- behaves like .withLeadingTrivia(.blockComment(""))

and this modifier should be attachable/accessible from any SyntaxBuildable.
Is this a better approach?

2- About code generation: looks good. I will play around with your ideas and see where things go.

I would like to know if you have any suggestions for me, and whether there are other things I should take a look at or focus on.

Greetings.


Yes, that’s exactly what I thought as well. Technically, it would probably make sense to have multiple methods for comment, block comment, doc line comment and doc block comment, but the idea is the same.
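
One possible shape for the four comment methods, sketched with toy types in plain Swift. CommentAttachable and ToyFunctionDecl are hypothetical and store comments as strings; they are not the real SyntaxBuildable protocol or Trivia type, and the method names are only suggestions.

```swift
/// Toy stand-in for a buildable node; not the real SyntaxBuildable protocol.
protocol CommentAttachable {
    var leadingComments: [String] { get set }
}

extension CommentAttachable {
    /// Returns a copy with one more leading comment, like a SwiftUI modifier.
    private func appending(_ comment: String) -> Self {
        var copy = self
        copy.leadingComments.append(comment)
        return copy
    }

    func withLineComment(_ text: String) -> Self { appending("// \(text)") }
    func withBlockComment(_ text: String) -> Self { appending("/* \(text) */") }
    func withDocLineComment(_ text: String) -> Self { appending("/// \(text)") }
    func withDocBlockComment(_ text: String) -> Self { appending("/** \(text) */") }
}

struct ToyFunctionDecl: CommentAttachable {
    var leadingComments: [String] = []
    var signature: String

    /// Comments are emitted before the declaration, one per line.
    var source: String {
        (leadingComments + [signature]).joined(separator: "\n")
    }
}

let decl = ToyFunctionDecl(signature: "func visit() {}")
    .withDocLineComment("Visits a node.")
    .withLineComment("Generated.")
print(decl.source)
```

Putting the methods in a protocol extension means every conforming node gets them for free, which fits the requirement that the modifier be available on any buildable node.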

I don’t have anything specific in mind. If you think you’ve got something that’s ready to be merged, you are welcome to open a PR, but it’s obviously not required. I also wanted to mention that another contributor is already working on an initial implementation to generate Tokens.swift using SwiftSyntaxBuilder here, just to make sure your work doesn’t collide: Use SwiftSyntaxBuilder to generate SwiftSyntaxBuilder Tokens by kimdv · Pull Request #381 · apple/swift-syntax · GitHub.

Thank you @ahoppen for your clarification.

I've spent a couple of hours trying to figure out the right way to support the .comment("This's a comment") modifier.

I'm sharing with you and the community what I've figured out so far, and I'd like to know whether this is the right direction or whether there's something better we can do.

Taking FunctionDecl as an example (the same applies to the rest of the SyntaxBuildable types, since they are all generated):

1- Add the comment trivia as an optional property to FunctionDecl

public struct FunctionDecl: DeclBuildable, ExpressibleAsFunctionDecl {
  let attributes: AttributeList?
  let body: CodeBlock?
  /* Rest of the properties */
  let comment: Trivia? // <- Added

2- Pass the comment to the available initializers

 public init(
    funcKeyword: TokenSyntax = TokenSyntax.`func`,
    identifier: TokenSyntax,
    comment: Trivia? = nil, //<- Passed into init, with default value to be nil
    genericParameterClause: ExpressibleAsGenericParameterClause? = nil,
    signature: ExpressibleAsFunctionSignature,
    genericWhereClause: ExpressibleAsGenericWhereClause? = nil,
    body: ExpressibleAsCodeBlock? = nil,
    @AttributeListBuilder attributesBuilder: () -> ExpressibleAsAttributeList? = { nil },
    @ModifierListBuilder modifiersBuilder: () -> ExpressibleAsModifierList? = { nil }
  )

3- Then, to provide functional access, we will provide

extension FunctionDecl {
    func withBlockComment(_ text: String) -> Self {
        // A new copy of FunctionDecl is created
        return Self.init(funcKeyword: self.funcKeyword,
                         identifier: self.identifier,
                         comment: .blockComment(text), // <-- text is passed into init
                         genericParameterClause: self.genericParameterClause,
                         signature: self.signature,
                         genericWhereClause: self.genericWhereClause,
                         body: self.body,
                         attributesBuilder: self.attributesBuilder,
                         modifiersBuilder: self.modifiersBuilder)
    }
}

4- Then finally, in buildFunctionDecl(format:leadingTrivia:)


func buildFunctionDecl(format: Format, leadingTrivia: Trivia? = nil) -> FunctionDeclSyntax {
  let result: FunctionDeclSyntax = /* the result is initialized here */

  // TODO: Add the comment trivia to result here, after adding leadingTrivia if passed

I've also tried to make use of leadingTrivia in buildFunctionDecl, but I didn't manage to get it working.

So this is what I've figured out so far. Do you think that this would be enough for such an implementation?

In general, I think that the project (Use SwiftSyntax itself to generate SwiftSyntax’s source code instead of GYB) could be divided into 4 goals/milestones:

  • Implementing the features needed in SwiftSyntaxBuilder to be able to fully replace GYB, including adding tests so that it is production-ready.
  • Importing the GYB helper functions/classes into Swift (Node, Kind, CommonNodes, etc.).
  • Replacing all usages of GYB in SwiftSyntax with SwiftSyntaxBuilder.
  • Integrating the process of generating code using SwiftSyntaxBuilder, as we've discussed here.

Do these goals cover all aspects of this project? Do you have comments on any of them?
Please let me know so I can cover all your needs.

Also, if any readers have experience with GYB/SwiftSyntax, please let me know what you would like to see in SwiftSyntaxBuilder, so we can make code generation a better process.


Your proposed solution to add comments to decls looks good to me. comment: Trivia? should probably be named leadingTrivia: Trivia? and there should be methods to also add other comment kinds, but those are minor things.

That sounds about right to me. I doubt that you’ll be able to achieve these goals in strict order because we’ll likely only figure out what’s missing in SwiftSyntaxBuilder as we start using it – but that’s not an issue at all.

I don’t fully understand the distinction between your last two goals. As I understand it, both aim to replace gyb by SwiftSyntaxBuilder wherever possible. This will most likely be the bulk of the work.


Thanks @ahoppen for your comment regarding the proposed solution.

By "integrating the process of generating code" I meant editing build-script to support the newly added features (executing/running generate-swift-syntax-builder, etc.). Am I understanding it correctly?

Update: Comments implementation

I’ve started working on the comments implementation as discussed in my previous comments. Here’s what I’ve reached so far, along with my questions regarding the implementation.

(Should this discussion about the implementation be placed somewhere else? Please let me know.)

1- I started by editing BuildableNodes.swift.gyb to add

var leadingTrivia: Trivia? = nil

Initially, I found that the properties of each struct are based on its children, and AFAIK leading trivia can't be considered a child, so I added it manually inside the struct

public struct ${type.buildable()}: ${base_type.buildable()}, ${type.expressible_as()} {
%   children = node.children()
%   for child in children:
 let ${child.name()}: ${child.type().buildable()}
%   end
 var leadingTrivia: Trivia? = nil // <- Added here (var, so the initializer can assign it)

2- Because leadingTrivia is added manually, I needed to reconfigure the initializers

  public init(
    leadingTrivia: Trivia? = nil, // <- Can't add it as the last parameter
                                  // because trailing closure syntax is needed for the children
    ${',\n    '.join(['%s: %s%s' % (
      child.name(),
      child.type().expressible_as(),
      child.type().default_initialization()
    ) for child in children])}
  ) {
    self.leadingTrivia = leadingTrivia
/* children are initialized here */
%   end
  }

  /// A convenience initializer that allows:
  ///  - Initializing syntax collections using result builders
  ///  - Initializing tokens without default text using strings
  public init(
    leadingTrivia: Trivia? = nil, //< Added here,
    ${',\n    '.join(convenience_init_normal_parameters + convenience_init_result_builder_parameters)}
  ) {
    self.init(
      leadingTrivia: leadingTrivia,
      ${',\n      '.join(delegated_init_args)}
    )
  }
%   end

3- at buildFunctionDecl(format: leadingTrivia:)

  func buildFunctionDecl(format: Format, leadingTrivia: Trivia? = nil) -> FunctionDeclSyntax {
    var result = SyntaxFactory.makeFunctionDecl(
      attributes: attributes?.buildAttributeList(format: format, leadingTrivia: nil),
      modifiers: modifiers?.buildModifierList(format: format, leadingTrivia: nil),
      funcKeyword: funcKeyword,
      identifier: identifier,
      genericParameterClause: genericParameterClause?.buildGenericParameterClause(format: format, leadingTrivia: nil),
      signature: signature.buildFunctionSignature(format: format, leadingTrivia: nil),
      genericWhereClause: genericWhereClause?.buildGenericWhereClause(format: format, leadingTrivia: nil),
      body: body?.buildCodeBlock(format: format, leadingTrivia: nil)
    )
    if let leadingTrivia = self.leadingTrivia {
      result = result.withLeadingTrivia(leadingTrivia)
    }
    if let leadingTrivia = leadingTrivia {
      return result.withLeadingTrivia(leadingTrivia + (result.leadingTrivia ?? []))
    } else {
      return result
    }
  }

And finally, the modifier should be something like this:

extension FunctionDecl {
    /// Adds a line comment to the FunctionDecl.
    /// - Parameter text: The comment to be added as a lineComment.
    /// - Returns: A FunctionDecl with the comment as its leading trivia.
    public func withLineComment(_ text: String) -> FunctionDecl {
        return FunctionDecl(leadingTrivia: .lineComment("// \(text)\n"),
                            attributes: self.attributes,
                            modifiers: self.modifiers,
                            funcKeyword: self.funcKeyword,
                            identifier: self.identifier,
                            genericParameterClause: self.genericParameterClause,
                            signature: self.signature,
                            genericWhereClause: self.genericWhereClause,
                            body: self.body)
    }
}

While this is sufficient to make it work, I have comments and questions regarding this approach:

1- I don't think that leadingTrivia should be the first parameter in the initializer, so:

  • Adding leadingTrivia to the children would remove the headache of configuring the initializers, but I don't think it's correct to add it to the children, since it's not a child of a syntax node.
  • Adding the leadingTrivia initialization parameter after init_normal_parameters and before result_builder_parameters would fix the initialization issue.
  • Adding leadingTrivia as an optional with a default value of nil, and making it accessible/initialized only via a mutating function, should be enough. However, I think that clients would need to access it directly from the initializer, and this would also introduce mutating changes, which might affect performance.

I think that adding the leadingTrivia initialization parameter after init_normal_parameters would provide enough balance.
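
A small plain-Swift sketch of the second option: a defaulted leadingTrivia parameter placed after the normal parameters and before the closure parameter, so trailing closure syntax keeps working. ToyClassDecl is a hypothetical type that uses strings in place of Trivia and of the real result builders.

```swift
/// Toy declaration type; strings stand in for Trivia and for built members.
struct ToyClassDecl {
    let name: String
    let leadingTrivia: String?
    let members: [String]

    init(
        name: String,
        leadingTrivia: String? = nil,            // after the normal parameters
        membersBuilder: () -> [String] = { [] }  // closure parameters stay last
    ) {
        self.name = name
        self.leadingTrivia = leadingTrivia
        self.members = membersBuilder()
    }
}

// Callers that don't care about trivia are unaffected by the new parameter.
let plain = ToyClassDecl(name: "A") { ["let x = 1"] }

// Callers that do can pass it without giving up the trailing closure.
let commented = ToyClassDecl(name: "B", leadingTrivia: "// note\n") { ["let y = 2"] }
print(plain.members, commented.leadingTrivia ?? "")
```

Because the parameter has a default value, existing call sites keep compiling, which is the balance described above.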

2- Using the name self.leadingTrivia inside buildFunctionDecl isn't clear, IMO. Should I consider better naming?

Please let me know what you think about this approach and how I can make things better.

Update 1: I've written a very small brief about building generate-swift-syntax-builder using SwiftPM, based on the work of @kimdv.

I would like to hear opinions about that! :)
Thanks.


Hi,
I think we need to decide whether a *.gyb.swift file will consist of one giant SourceFile { } or whether we will allow it to contain normal expressions (like the current GYB files).
Take the following examples:

case 1/ File.gyb.swift

import Foundation // <- normal expression
SourceFile {
  ImportDecl(path: "UIKit")
  ClassDecl(classOrActorKeyword: .class, identifier: "SomeViewController", membersBuilder: {
    VariableDecl(.let, name: "tableView", type: "UITableView")
  })
}

Case 1 behaves like what's inside swift-syntax/SyntaxRewriter.swift.gyb at e3d3a8367431cc90d7ef58c640bfac394b71cf3b · apple/swift-syntax · GitHub,

where some functions are generated and other structs/functions are written explicitly, without generation.

If we end up selecting this case, we will need to parse the file first to extract the SourceFile calls, build them, and then place the generated code inside the newly generated file, along with the normal expressions written without generation.

case 2/ File.gyb.swift

SourceFile {
  ImportDecl(path: "Foundation") //<- notice how import is inserted Here
  ImportDecl(path: "UIKit")
  ClassDecl(classOrActorKeyword: .class, identifier: "SomeViewController", membersBuilder: {
    VariableDecl(.let, name: "tableView", type: "UITableView")
  })
}

In case 2, we make sure that everything related to the file being generated is controlled by SwiftSyntaxBuilder (even if it's just a normal import or a comment), and thus the compiler checks that we are generating valid and correct code.

It will also make generating the corresponding file easier. What I have in mind is something like this:

SourceFile {
  ImportDecl(path: "Foundation")
  ImportDecl(path: "UIKit")
  ClassDecl(classOrActorKeyword: .class, identifier: "SomeViewController", membersBuilder: {
    VariableDecl(.let, name: "tableView", type: "UITableView")
  })
}.writeableSourceFile() //<- inserted at the end of SourceFile

where writeableSourceFile is

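import Foundation // for URL and String.write(to:atomically:encoding:)

extension SourceFile {
  func writeableSourceFile(file: String = #file) {
    // 1- Get the file name from the passed path, e.g. Token.gyb.swift
    let inputURL = URL(fileURLWithPath: file)
    let outputName = inputURL.lastPathComponent.replacingOccurrences(of: ".gyb.swift", with: ".swift")

    // 2- Get the generated code from the SourceFile (self)
    let code = String(describing: self.buildSyntax(format: .init(), leadingTrivia: nil))

    // 3- Write the code to Token.swift. The generated-files folder is not
    //    decided yet, so this sketch assumes the folder of the input file.
    let outputURL = inputURL.deletingLastPathComponent().appendingPathComponent(outputName)
    try? code.write(to: outputURL, atomically: true, encoding: .utf8)
  }
}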

When Token.gyb.swift is compiled and run, writeableSourceFile will be executed and will generate the corresponding file.

To sum up: case 1 behaves much like GYB, where we could have multiple SourceFile { } blocks inside a single file; in case 2, everything is placed inside a single SourceFile { }.

I think that we will end up using case 1, but I would like to know your thoughts.

Thank you.

I see. I think that distinction makes sense. I don’t expect that huge changes to build-script will be necessary, so this phase should be fairly quick, I believe.


Your comments implementation looks good to me. Do you mind opening a PR on GitHub to discuss the changes at the code level? I find that easier than on the forums.

Regarding where to place the leadingTrivia argument in the initializer: What do you think about not having it in the initializer at all, but requiring the use of withLineComment? That would just define away the placement issue and kind of matches the way that SwiftUI applies modifiers to views.


I’m not sure if driving build-script.py from a Swift executable is a huge win. It means that we would need to redefine all arguments that build-script.py currently takes in that Swift executable just to forward them. I also don’t see any immediate benefits from this shift.


Maybe I’m misunderstanding something, but my expectation is that we need to put everything into one SourceFile (while we might be able to split certain parts of the SourceFile into helper functions). I also don’t see how the import Foundation statement could be emitted into the generated file in case 1 because we can’t tell whether it existed after File.gyb.swift has been compiled.

Thank you @ahoppen for your reply and your thoughts.

Since I don't have much time, I will update my proposal based on your comments and thoughts,
then continue the discussion here.

I see your point, and I agree with you.

Regarding file structure, I created some examples to help me explain more:

If we use case 1, we will use SourceFile only for the dynamic part (the generated code),
and everything else in the file (imports, struct name, sceneMember) will be re-emitted (re-inserted) in the newly created file.

If we use case 2, everything will be inserted inside SourceFile { }, even the static lines that don't have any generated/dynamic content.
I prefer this case because it lets us leverage SwiftSyntaxBuilder more.

Thank you again @ahoppen for your valuable discussion and notes!

Ah, I think I understand. You want to do a textual scan of the source file and re-emit everything that’s not inside a SourceFile declaration in case 1, correct? I don’t think that’s easily feasible and I think we should fully embrace SwiftSyntaxBuilder in this project, so case 2 is the option I would prefer.
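
To illustrate case 2 with parts of the file split into helper functions, here is a self-contained toy in plain Swift. ToySourceFileBuilder and ToySourceFile are hypothetical and collect lines of text; they are not SwiftSyntaxBuilder's SourceFile, and only show how a single builder block can still be composed from helpers.

```swift
/// Toy result builder that collects declarations as lines of text.
/// It is not SwiftSyntaxBuilder's SourceFile builder.
@resultBuilder
enum ToySourceFileBuilder {
    static func buildExpression(_ line: String) -> [String] { [line] }
    static func buildExpression(_ lines: [String]) -> [String] { lines }
    static func buildBlock(_ parts: [String]...) -> [String] { parts.flatMap { $0 } }
}

struct ToySourceFile {
    let lines: [String]

    init(@ToySourceFileBuilder _ content: () -> [String]) {
        lines = content()
    }

    var source: String { lines.joined(separator: "\n") }
}

/// Helper that produces one part of the file (here, the static imports).
func imports() -> [String] {
    ["import Foundation", "import UIKit"]
}

// Everything, static or generated, goes through the single builder block.
let file = ToySourceFile {
    imports()
    "class SomeViewController {"
    "  let tableView: UITableView"
    "}"
}
print(file.source)
```

Since the static lines also pass through the builder, nothing has to be recovered by scanning the original file after compilation.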


Amazing, I will revise my proposal now to discuss this point.