I'm a master's student at the University of São Paulo, and I'm very interested in pursuing this project for GSoC :)
I've played around a lot with the parser in the past, trying to get it to dump the AST in YAML instead of the current s-expression-like representation. In the end I couldn't see this idea through (though I got really close), but I always wanted to come back and try something new.
I'm currently writing a transpiler from Swift to Kotlin for my master's project, so learning about libSyntax and helping its development in a meaningful way would be a big help!
Is there someone specific I should talk to about this? I'd really like to know what the first steps towards this project would be, so I can dip my toes in and really get a sense of its scope. I know it's marked as hard, but that just makes it more fun!
I remember watching your talk on libSyntax at Try! Swift NYC 2017; it was really enlightening.
I was actually a bit surprised to see this project on GSoC, since I was under the impression that libSyntax wasn't yet ready to fully replace the parser. Glad to see I was wrong!
And so is mine
In addition to this, we want to feed a serialized libSyntax tree (JSON) directly into the compiler.
Since it's super ambitious, I don't think we can fully finish this work and merge it into the repo within the GSoC timeframe. But I believe we should eventually do this. In your proposal, I expect you to plan what you will finish in the timeframe, and what you will not.
Roughly, we should do:
Implement a parser that parses only into a libSyntax tree.
Modify AST nodes to hold libSyntax nodes for source information (getLoc() etc.)
Implement a libSyntax tree to AST translator. This should do:
Ok then, let's see if I understand this correctly. The current parser builds an AST that has both syntactic and semantic information (and also the libSyntax AST, which is separate). So perhaps we could:
Change the parser so that the current AST gets its syntactic information from the libSyntax AST.
Move the code in the parser that generates the semantic information into a separate "AST Translator" that can be run later. I suspect this will take most of the time.
If there's time, allow the compiler to read a libSyntax JSON (now that it can convert the libSyntax into a full AST).
Also, since this is a difficult project that might not be ready in time, do you think it would make sense for me to contact the other student you mentioned (DexinLi) and try to work out a way to separate the project in two, so we can both work on it?
Sounds great, but I'm not 100% sure this strategy works.
AST nodes should refer to SyntaxData (wrapped in a concrete ***Syntax type), because calculating AbsolutePosition needs the complete SourceFileSyntax. That means we cannot construct any AST nodes until we have finished parsing the libSyntax tree for the whole source file.
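To illustrate why an absolute position needs the whole tree, here is a minimal toy sketch. It assumes nodes that only know their own text and children, so an offset has to be derived by walking up to the root. `ToyNode`, `addChild`, and `absoluteOffset` are hypothetical names for this example, not libSyntax's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a position-independent syntax node (hypothetical, not
// the real libSyntax types): it stores text and children, but no offset.
struct ToyNode {
  std::string text;                                // non-empty only for tokens
  std::vector<std::unique_ptr<ToyNode>> children;  // empty for tokens
  ToyNode *parent = nullptr;                       // null for the root

  // Number of characters covered by this subtree.
  std::size_t width() const {
    std::size_t total = text.size();
    for (const auto &child : children)
      total += child->width();
    return total;
  }

  // Append a child (a token if tokenText is non-empty) and return it.
  ToyNode *addChild(std::string tokenText = "") {
    auto child = std::make_unique<ToyNode>();
    child->text = std::move(tokenText);
    child->parent = this;
    children.push_back(std::move(child));
    return children.back().get();
  }
};

// The offset of a node is the total width of everything before it, summed
// at every level up to the root. Without the root there is no answer.
std::size_t absoluteOffset(const ToyNode *node) {
  assert(node != nullptr);
  std::size_t offset = 0;
  for (; node->parent != nullptr; node = node->parent) {
    for (const auto &sibling : node->parent->children) {
      if (sibling.get() == node)
        break;
      offset += sibling->width();
    }
  }
  return offset;
}
```

In this toy model, a node detached from its root simply reports offset 0, which is the situation an AST node would be in if it were built before the source file's tree was complete.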
(Hmm, I currently have no idea what to do for incremental parsing. @Xi_Ge @harlanhaskins, WDYT?)
It makes sense to me, but I have to confirm that with other people. I'll discuss it with them later.
Hi @Vinicius_Vendramini
Actually, I'm thinking about how much time each part of this project would take and which parts I could finish. So it's OK with me to split the project in two, if that is permitted.
I think incremental parsing is beyond the scope of a summer project. Actually, I think incremental parsing depends on adopting libSyntax in the compiler pipeline, for several reasons.
We don't want to design incrementality for both AST and Syntax nodes.
The AST is not designed for mutation, so it's naturally not incremental, even if we had a workaround to make it so.
Most of our existing IDE features still use the AST, so making the syntax tree incremental alone won't provide much benefit.
Ok then. I'm going through the (long) process of cloning and building the compiler so that I can take a better look at the code. Once that's done, I'll be back to brainstorm some more.
I remember taking a look at libSyntax a while ago and noticing that it wasn't ready for this kind of integration... I think it wasn't yet able to parse a few language constructs (e.g. it couldn't parse enums or something). So I guess my question is, does it already support the full language?
Thanks @ksvsk, that's the file I was thinking about!
So, wouldn't we have to finish this list before libSyntax can really be integrated into the parser? I'm not sure how to do that though. Do we just have to find the places in the parser where it already generates the AST and make it also generate the libSyntax AST?
Unfortunately, we decided not to select 2 students for this project. Mainly because of the lack of mentoring bandwidth. Sorry for that.
Also, we think this task is overwhelming for a GSoC project. It is super high volume, non-additive, technically hard, etc. Even if we manage to finish the implementation, it's highly possible that it would take a long time to review, merge, and migrate, or that it could not be merged at all.
We discussed this today, and came up with an idea that might be possible to implement in the GSoC time frame:
"Implement libSyntax tree to AST translator"
The tasks are:
Accept libSyntax tree JSON as an input to the compiler.
Deserialize the JSON into a libSyntax tree.
Translate the libSyntax tree to an AST without modifying the current libAST.
(serialized JSON)
↓
[deserializer] @DexinLi already implemented this. Thanks!
↓
(libSyntax tree)
↓
[translator] to be implemented.
↓
(AST)
A snippet of the translator would look something like:
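// Translate an identifier expression from the libSyntax tree into an AST node.
Expr *Translator::visit(IdentifierExprSyntax node) {
  // Intern the identifier's text in the ASTContext.
  Identifier Name = ASTCtxt.getIdentifier(node.getIdentifierToken().getText());
  // Map the node's absolute position in the file back to a SourceLoc.
  SourceLoc Loc = getSourceLoc(node.getAbsolutePosition());
  // Name lookup happens later, so create an unresolved reference for now.
  Expr *E = new (ASTCtxt) UnresolvedDeclRefExpr(Name, DeclRefKind::Ordinary, Loc);
  return E;
}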
This is a much narrower version of the original idea, but it's still very valuable because with it 1) we can implement and test the libSyntax parser independently, and 2) we can incrementally implement libSyntax-backed AST nodes.
More importantly, this is a purely additive feature. I think we can merge it much more easily than the original idea.
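For anyone who wants to experiment with the translator idea outside the compiler, here is a self-contained toy sketch. It assumes a drastically simplified syntax tree and AST: `ToySyntax`, `ToyExpr`, `SyntaxKind`, `ExprKind`, and `translate` are all hypothetical stand-ins invented for this example, not the compiler's real types. It only shows the shape of the approach: dispatch on the syntax kind, carry the source location over, and recurse into children.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified syntax tree (not the real libSyntax types).
enum class SyntaxKind { IdentifierExpr, IntegerLiteralExpr, ArrayExpr };

struct ToySyntax {
  SyntaxKind kind;
  std::string text;                 // token text for leaf kinds
  std::size_t offset;               // absolute offset, assumed already known
  std::vector<ToySyntax> children;  // element expressions for ArrayExpr
};

// Hypothetical, simplified AST (not the real swift::Expr hierarchy).
enum class ExprKind { UnresolvedDeclRef, IntegerLiteral, Array };

struct ToyExpr {
  ExprKind kind;
  std::string name;                 // identifier name or literal digits
  std::size_t loc;                  // source location copied from the syntax node
  std::vector<std::unique_ptr<ToyExpr>> elements;
};

// Build an AST node for one syntax node, recursing into its children.
std::unique_ptr<ToyExpr> translate(const ToySyntax &node) {
  auto expr = std::make_unique<ToyExpr>();
  expr->loc = node.offset;
  switch (node.kind) {
  case SyntaxKind::IdentifierExpr:
    // Names are not resolved here; that is left to a later phase.
    expr->kind = ExprKind::UnresolvedDeclRef;
    expr->name = node.text;
    break;
  case SyntaxKind::IntegerLiteralExpr:
    expr->kind = ExprKind::IntegerLiteral;
    expr->name = node.text;
    break;
  case SyntaxKind::ArrayExpr:
    expr->kind = ExprKind::Array;
    for (const auto &child : node.children)
      expr->elements.push_back(translate(child));
    break;
  }
  return expr;
}
```

Because the translation only reads the syntax tree and produces new AST nodes, a sketch like this can be tested on hand-built trees without a parser, which mirrors the "purely additive" property described above.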
Hi guys! Sorry it took me so long to respond. I tracked down my supervising professor and we decided this was too out-of-scope for the project I’m doing. Thanks for the help though!