Macro Adoption Concerns around SwiftSyntax

jrose · September 2, 2023, 5:48pm

It's not ideal, and I hope there's a better solution in the works, but as a stopgap there's nothing that says different macros can't use different versions of swift-syntax by running in separate processes. I don't know if that's how it works today (it would make sense to build all the macros into one library if possible) but I think from a technical perspective there's nothing that makes it infeasible.

jjrscott · September 2, 2023, 6:48pm

Seems ok given Swift invokes the macro handling as a sandboxed process.

TBH I’d prefer for the communication to be JSON encoded ASTs rather than a raw string as it is now - at this point the compiler has already parsed it so it shouldn’t need doing twice (in both directions).

tgoyne · September 2, 2023, 7:59pm

There is something that says different macros can't use different versions of swift-syntax: SPM doesn't support multiple versions of a library in a single dependency graph, and SPM is the only officially endorsed way to build and use macros. Other than build times, SPM's limitations (or swift-syntax's unwillingness to work in a way that SPM supports) are the only problem here.

jrose · September 2, 2023, 8:04pm

Sorry, yes, this would require modifying SPM to treat macro targets as special*, root-like targets, fragmenting the package graph into subcomponents. This has implications for lockfiles, etc; it's not a small change. But it's simpler than "allow multiple versions of a package to coexist in a process", like NPM and Rust's Cargo allow, because that has all the SPM work plus some form of compile-time symbol remapping and possibly decisions about run-time dynamic-lookup-by-name operations.

*That is, more special than they already are.

tgoyne · September 2, 2023, 8:24pm

Each macro target is a separate executable, so multiple versions of swift-syntax in one process isn't something that would come up in the first place barring a shift away from out-of-process plugins (which would be a surprising direction to go).

Zollerboy1 · September 2, 2023, 9:34pm

Is it actually faster to parse JSON than it is to parse Swift code though? Because it would be a lot more JSON given that it has to contain all the info about whitespace and other trivia too.
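To make the size concern concrete, here is a rough sketch that encodes a one-line snippet as JSON tokens carrying their trivia and compares byte counts. The `Token` shape is purely hypothetical (it is not what the compiler or swift-syntax actually exchange), and it says nothing about parse speed, only about how much more data there is to parse.

```swift
import Foundation

// Hypothetical token shape for illustration only; the real
// compiler/plugin messages do not look like this.
struct Token: Codable {
    var leadingTrivia: String
    var text: String
    var trailingTrivia: String
}

let source = "let x = 1 // one\n"

// Hand-written tokenization of `source`, keeping whitespace and the
// comment as trivia so the original text can be reproduced exactly.
let tokens = [
    Token(leadingTrivia: "", text: "let", trailingTrivia: " "),
    Token(leadingTrivia: "", text: "x", trailingTrivia: " "),
    Token(leadingTrivia: "", text: "=", trailingTrivia: " "),
    Token(leadingTrivia: "", text: "1", trailingTrivia: " // one\n"),
]

// Encoding plain strings is not expected to fail, hence `try!` in this sketch.
let json = try! JSONEncoder().encode(tokens)

// Concatenating trivia and text recovers the original source.
let roundTripped = tokens
    .map { $0.leadingTrivia + $0.text + $0.trailingTrivia }
    .joined()

print("source bytes:", source.utf8.count)
print("JSON bytes:", json.count)
print("round trip matches:", roundTripped == source)
```

Even with this flat token list (no tree structure at all), the JSON is several times larger than the source text it describes.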

Jon_Shier · September 2, 2023, 11:36pm

In Swift? No, Codable sets a rather low maximum performance. Using a modern, vectorized JSON parser? Possibly.

Zollerboy1 · September 2, 2023, 11:57pm

I don’t want to derail the thread too much, as I care a lot about the problem at hand, but the performance situation with Codable seems really unfortunate. Couldn’t it be improved by utilizing macros now (I know that probably the whole protocol stack would have to be redone for that but maybe it would be worth it)?

dmt · September 3, 2023, 12:28am

I did some research into performance improvements in Codable. Basically, just making everything inlinable significantly improves the situation, although there are at least two implementation flaws that are impossible to repair. (the first one)
Given how much Codable code there is in the wild, I'd say it's still worth doing.
On the other hand, there are plenty of things that Codable just doesn't support (like custom converters, or custom encoding primitives). So it also makes sense to design a new system to replace Codable.
But that's really another topic.

jjrscott · September 3, 2023, 2:23am

My preference for JSON was less performance related and more about removing the need to align swiftc and swift-syntax. Using strings pretty much bakes in the need for swift-syntax as nothing else has the power to do the job (you can see the crappy hack I had to implement in my SwiftCompilerPlugin project).

I don’t see Codable being strictly required other than that it’s already used to handle messages between the compiler and macro program.

tgoyne · September 4, 2023, 5:56pm

Yeah, passing the AST in json form (or xml or whatever) to and from the plugin would make implementing a macro without swift-syntax merely difficult to get right rather than wildly impractical. It could also potentially enable a restructuring of swift-syntax so that macros only need to depend on a portion of the library and can skip building the parser, but I don't know how much actual benefit that'd have for the build time.

Helge_Hess1 · September 13, 2023, 8:25pm

Wouldn't even Foundation need SwiftSyntax, i.e. need it embedded in the toolchain? For example, for the #Predicate macro.

liamnichols · September 13, 2023, 9:02pm

I had a browse around some prior discussions and I didn't quite find an answer, but why was it decided that swift-syntax should be an ordinary package that must be declared as a package dependency vs using it through the shared libraries in the toolchain?

I feel like this is the kind of dependency that you might want to treat similarly to XCTest.

I know that there was a great effort involved in rewriting the parser to break away from the dependency on the toolchain/_InternalSwiftSyntaxParser, but if we will eventually find ourselves having to use #if canImport(SwiftSyntax510) to support different versions, would it not be less confusing to just have access to the version in the toolchain and then use #if compiler(>=5.10) checks to handle API changes instead? That would then have the potential to address the compile time/version conflict issues.
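A minimal sketch of the two styles of check, side by side. The function names and returned strings are made up for illustration; only the `#if canImport(...)` and `#if compiler(>=...)` conditions are real language features.

```swift
// Style 1: branch on whether a versioned marker module is importable.
// Outside a package that depends on a matching swift-syntax, this is false.
func hasSwiftSyntax510Module() -> Bool {
    #if canImport(SwiftSyntax510)
    return true
    #else
    return false
    #endif
}

// Style 2: branch on the compiler version, which is what a
// toolchain-provided swift-syntax would allow.
func syntaxAPIFlavor() -> String {
    #if compiler(>=5.10)
    return "5.10-or-later API"
    #else
    return "pre-5.10 API"
    #endif
}

print("SwiftSyntax510 importable:", hasSwiftSyntax510Module())
print("API flavor selected by compiler check:", syntaxAPIFlavor())
```

With the first style the answer depends on how the package graph resolved. With the second it depends only on which toolchain is doing the build.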

I guess that I might be missing some other use-cases where breaking away from the toolchain is necessary? But in that instance, perhaps there are ways to still use swift-syntax as an ordinary dependency?

NeoNacho · September 13, 2023, 9:09pm

This was previously discussed and the conclusion was that the versioning problems are even worse if we used a bundled copy from the toolchain. For example, it couples updating swift-syntax with updating your tools; right now you have the escape hatch of using an older version until all package dependencies have updated.

liamnichols · September 13, 2023, 9:39pm

That's a fair point. I guess at this point, though, I'm quite used to that.

anreitersimon · September 14, 2023, 9:49am

Would it theoretically be possible to support both the version bundled with the toolchain and a custom version?

I was thinking something along the lines of this:

Currently it adds this in my Package.swift

import CompilerPluginSupport

Would it be possible for this to provide an extension like this?

extension PackageDescription.Package.Dependency {
    static func toolchainSwiftSyntax() -> Self {
       // maybe this could just be a dependency with a local path under the hood
    }

    // maybe even allow stating a minimum swift version 
    static func toolchainSwiftSyntax(minVersion: SwiftVersion) -> Self { ... } 
}

This would allow me to add a dependency on the version of SwiftSyntax bundled with the toolchain.

This wouldn't prevent me from manually specifying a different swift-syntax version (like right now).

Maybe those dependencies provide modules/products with a different name to avoid conflicts with user specified ones.

Something like BundledSwiftSyntax, BundledSwiftSyntaxMacros, ...

That way I could choose to accept the tradeoff that comes with tying it to the tools version.

I for one think the added build-time from having to compile swift-syntax is quite significant.

jjrscott · September 14, 2023, 10:39am

I’d like to reiterate my desire and support for replacing the current String based compiler plugin API with a JSON AST based solution. swiftc literally has all the code to do it if we wish.

(In fact, I’d like the option to supply the AST to swiftc instead of having to reencode it as Swift, but that’s probably for another day)

allevato · September 14, 2023, 2:19pm

Having the compiler communicate ASTs directly to the plugin would effectively require the compiler to stabilize the AST structure forever going forward, and thus make language evolution harder, or require all macros to be compiled with the exact version of SwiftSyntax distributed with the compiler.

By having the compiler send and receive source text, macros aren't version-locked with the compiler in terms of the version of SwiftSyntax they use. A macro using SwiftSyntax from 5.9 will still compile and work with a Swift 5.10 compiler. It may still not parse newer syntax that the older version of the language didn't know about, but if existing syntactic constructs need to change their API representation in SwiftSyntax for some reason, the older macros can still use the older representation that they know about. If the compiler directly sent ASTs, those macros could fail completely until they were updated by their owners.

jjrscott · September 14, 2023, 4:03pm

I don't expect the AST structure to be stabilised any more than the raw source text. Also, using JSON would mean that any bundled SwiftSyntax would be a lot smaller (assuming it was bundled at all).

Even stacked tokenization (i.e. nested arrays of tokens) would significantly reduce the need to reimplement complex parsing in a library. A plugin or library could simply dump it back to a single string if it so chose.

Here's an excerpt from jjrscott/simple_ast on GitHub (Simple AST takes a simple set of rules and produces a simple abstract syntax tree) to show what I mean:

Input

6 * (4 + 3)

AST

[
  "6",
  "*",
  [
    "(",
    "4",
    "+",
    "3",
    ")"
  ]
]
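As a rough Swift sketch of what consuming such a nested token array could look like, assuming a hypothetical `TokenTree` type (nothing like it exists in swiftc or swift-syntax today), a plugin could walk the structure directly or flatten it back to a string:

```swift
// Hypothetical representation of "stacked" tokens: either a single
// token or a bracketed group of further trees.
enum TokenTree: Equatable {
    case token(String)
    case group([TokenTree])

    /// Flattens the tree back into one string, separating tokens with spaces.
    /// Original whitespace is not preserved because this sketch stores no trivia.
    var flattened: String {
        switch self {
        case .token(let text):
            return text
        case .group(let children):
            return children.map(\.flattened).joined(separator: " ")
        }
    }

    /// Counts the leaf tokens in the tree.
    var tokenCount: Int {
        switch self {
        case .token:
            return 1
        case .group(let children):
            return children.reduce(0) { $0 + $1.tokenCount }
        }
    }
}

// The input expression `6 * (4 + 3)` as nested tokens.
let tree: TokenTree = .group([
    .token("6"),
    .token("*"),
    .group([.token("("), .token("4"), .token("+"), .token("3"), .token(")")]),
])

print(tree.flattened)   // prints "6 * ( 4 + 3 )"
print(tree.tokenCount)  // prints "7"
```

The flattened string differs from the input only in spacing, which is the price of not carrying trivia along with the tokens.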

NeoNacho · September 14, 2023, 4:44pm

Yes, this is a good idea. We had a similar one where we were thinking that the toolchain could provide metadata on which swift-syntax it bundles and then SwiftPM could automatically use the pre-built copy if it is compatible.

I think we ran into some concerns though because the prebuilt copy in the toolchain is meant for consumption by the compiler, so it doesn't necessarily 1:1 correspond with the package in both versioning and content.