Look, ma! I broke operator precedence with a freestanding macro!

jgongo · August 31, 2023, 4:52pm

Ok, so I'm exploring the new macro functionality in order to propose an implementation plan for my team, and this is the first thing I did expecting a totally different result:

import SwiftSyntax
import SwiftSyntaxMacros
import SwiftSyntaxBuilder

public struct DoubleWithPlusMacro: ExpressionMacro {
    public static func expansion(
        of node: some FreestandingMacroExpansionSyntax,
        in context: some MacroExpansionContext
    ) -> ExprSyntax {
        guard let argument = node.argumentList.first?.expression else {
            fatalError("compiler bug: the macro does not have any arguments")
        }
        return "\(argument) + \(argument)"
    }
}

THE FOLLOWING UNIT TEST SUCCEEDS:

import XCTest
import SwiftSyntaxMacros
import SwiftSyntaxMacrosTestSupport

import MacrosShowcaseMacros

final class DoubleWithPlusMacroTests: XCTestCase {
    static let testMacros: [String: Macro.Type] = ["double": DoubleWithPlusMacro.self]

    func test_givenInt_whenExpanded_thenShouldProperlyWork() throws {
        let source = """
            let result = #double(5) * 2
        """
        let expectedExpansion = """
            let result = 5 + 5 * 2
        """

        assertMacroExpansion(source, expandedSource: expectedExpansion, macros: DoubleWithPlusMacroTests.testMacros)
    }
}

@freestanding(expression)
public macro double(_ value: Int) -> Int = #externalMacro(module: "MacrosShowcaseMacros", type: "DoubleWithPlusMacro")

let operation = #double(5) * 2
print("The result is \(operation)")

let operation2 = 5 + 5 * 2
print("The result is \(operation2)")

and the output is...

The result is 20
The result is 15

Of course, I was expecting to get 15 using the macro and wanted to confirm it, so that I could include in our documentation a recommendation to always enclose in parentheses the expressions returned by freestanding expression macros, to avoid operator precedence problems. It turns out that the code generated by the macro is somehow evaluated before the rest of the expression, completely breaking the "insertion in predictable ways" or "impenetrable magic" principles they claimed during WWDC.

Please tell me that I am missing a BIG, BIG thing here, because I don't understand this.

John_McCall · August 31, 2023, 5:16pm

It's just not a naive textual substitution. The replacement expression is dropped in where the macro invocation was, but we don't re-parse (or re-type-check) everything around it. As a result, the replacement expression is always grammatically self-contained, as if it were parenthesized. That's a good thing. It's not the simplest possible rule, but it is a more predictable rule because it means macros don't have dangerous, unexpected behavior if you happen to use them in the wrong grammatical context (exactly what you were gearing up to warn about).

jgongo · August 31, 2023, 5:27pm

I'm not arguing about whether this is a good or a bad thing. The problem here is that something is lying.

I had the impression after watching the videos and reading the documentation that a freestanding macro receives the syntax tree of the macro invocation (FreestandingMacroExpansionSyntax) and then the syntax tree produced by the macro (of type ExprSyntax) is injected in the call site of the macro. This doesn't seem to be what is really happening from what you tell me.

And the bigger problem is that I have a unit test that is telling me that the expanded code is 5 + 5 * 2, NOT (5 + 5) * 2. With this unit test I expect this code to return 15, not 20. So either the insertion/parsing is being done in a wrong way, or the unit test is lying, as they are producing inconsistent results. One of them should be fixed.

Joe_Groff · August 31, 2023, 5:36pm

It might make more sense to think of operator expressions as having already been broken down into syntax trees before the macro expansion occurs. So you have

           *
          / \
#double( )   2
     |
     5

and, after macro expansion, the #double( ) node is replaced, without changing the structure of the tree around it:

      *
     / \
    +   2
   / \
  5   5

That's what's meant when we say that the macro's syntax tree is injected in the call site of the macro. The liar in your case is most likely the code that prints the syntax tree back at you; I would guess that since there's no explicit ParenExpr node in the generated tree, it prints it as is without any parens. That definitely seems like a bug to me; it should probably print ASTs with parentheses when they would not parse back with the same precedence as the tree is currently constructed. You could write your macro to add the parens yourself, as you said, but we should fix it so that isn't necessary.
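The tree substitution described above can be modeled with a small sketch. The `Expr` enum and the `expand`, `evaluate`, and `naiveDescription` functions are hypothetical stand-ins invented for illustration; they are not SwiftSyntax types or anything the compiler uses. The sketch only shows how replacing a node in place keeps `+` as the left operand of `*`, while a printer that never emits parentheses produces text that reads differently.

```swift
// A toy expression tree (hypothetical; not SwiftSyntax's real node types).
indirect enum Expr {
    case literal(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
    case doubleMacro(Expr)   // stands in for `#double(x)`
}

// Replace every macro node with `x + x`, leaving the surrounding tree untouched.
func expand(_ expr: Expr) -> Expr {
    switch expr {
    case .literal:
        return expr
    case let .add(lhs, rhs):
        return .add(expand(lhs), expand(rhs))
    case let .multiply(lhs, rhs):
        return .multiply(expand(lhs), expand(rhs))
    case let .doubleMacro(argument):
        let expanded = expand(argument)
        return .add(expanded, expanded)
    }
}

// Evaluate the tree by its structure; no precedence rules are involved.
func evaluate(_ expr: Expr) -> Int {
    switch expr {
    case let .literal(value):
        return value
    case let .add(lhs, rhs):
        return evaluate(lhs) + evaluate(rhs)
    case let .multiply(lhs, rhs):
        return evaluate(lhs) * evaluate(rhs)
    case let .doubleMacro(argument):
        return evaluate(argument) + evaluate(argument)
    }
}

// A naive printer that never emits parentheses.
func naiveDescription(_ expr: Expr) -> String {
    switch expr {
    case let .literal(value):
        return String(value)
    case let .add(lhs, rhs):
        return "\(naiveDescription(lhs)) + \(naiveDescription(rhs))"
    case let .multiply(lhs, rhs):
        return "\(naiveDescription(lhs)) * \(naiveDescription(rhs))"
    case let .doubleMacro(argument):
        return "#double(\(naiveDescription(argument)))"
    }
}

// `#double(5) * 2` as a tree: the macro call is the left operand of `*`.
let source = Expr.multiply(.doubleMacro(.literal(5)), .literal(2))
let expanded = expand(source)

print(evaluate(expanded))          // 20: the `+` node is still the left operand of `*`
print(naiveDescription(expanded))  // 5 + 5 * 2: would evaluate to 15 if re-parsed as text
```

In this model the tree and its printed form disagree for the same reason as in the thread: the structure carries the grouping, and the printer drops it.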

jgongo · August 31, 2023, 6:20pm

Ok, I agree that doing it this way may seem more "natural" in order to keep the feeling that a freestanding expression macro is treated like a function which returns a value. The problem with this approach is the following:

According to the Expand on Swift macros WWDC video, "the end result of using a macro is the same as you would get writing the code yourself". I have debugged my test case and I have found the expanded code to produce the following syntax tree:

SourceFileSyntax
├─statements: CodeBlockItemListSyntax
│ ╰─[0]: CodeBlockItemSyntax
│   ╰─item: SequenceExprSyntax
│     ╰─elements: ExprListSyntax
│       ├─[0]: SequenceExprSyntax
│       │ ╰─elements: ExprListSyntax
│       │   ├─[0]: IntegerLiteralExprSyntax
│       │   │ ╰─literal: integerLiteral("5")
│       │   ├─[1]: BinaryOperatorExprSyntax
│       │   │ ╰─operator: binaryOperator("+")
│       │   ╰─[2]: IntegerLiteralExprSyntax
│       │     ╰─literal: integerLiteral("5")
│       ├─[1]: BinaryOperatorExprSyntax
│       │ ╰─operator: binaryOperator("*")
│       ╰─[2]: IntegerLiteralExprSyntax
│         ╰─literal: integerLiteral("2")
╰─endOfFileToken: endOfFile

My question is, can you show me what Swift code can produce that syntax tree?

Because part of the problem is that the assertMacroExpansion function is not comparing syntax trees, but the string output of those syntax trees, and the syntax tree reproduced above is printed as 5 + 5 * 2, which is clearly not what that syntax tree represents (and I have serious doubts about being able to get that syntax tree from plain Swift code without macros).

bnbarham · August 31, 2023, 7:55pm

Thanks for bringing this to our attention @jgongo! The macro is expanded in the compiler as John and Joe mentioned. As you noted, the assertMacroExpansion function does not compare syntax trees and there is no way to represent the syntax tree that the macro produces in source. We'll look at fixing this by inserting parentheses in the expanded source if the resulting expansion needs it, which would be the case if it is, e.g., a SequenceExpr, TryExpr, etc.
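As a rough illustration of the "insert parentheses when the expansion needs it" idea, here is a minimal sketch of a precedence-aware printer. The `Expr` enum, `precedence`, and `describe` are hypothetical names made up for this example; this is not how swift-syntax implements or plans to implement the fix, and it only handles `+` and `*`, which are associative, so it ignores associativity entirely.

```swift
// A toy expression tree (hypothetical; not SwiftSyntax's real node types).
indirect enum Expr {
    case literal(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
}

// Binding strength of the node's top-level operator; literals bind tightest.
func precedence(_ expr: Expr) -> Int {
    switch expr {
    case .literal: return 3
    case .multiply: return 2
    case .add: return 1
    }
}

// Print `expr`, wrapping it in parentheses when it binds more loosely
// than the operator it is an operand of.
func describe(_ expr: Expr, parentPrecedence: Int = 0) -> String {
    let own = precedence(expr)
    let text: String
    switch expr {
    case let .literal(value):
        text = String(value)
    case let .add(lhs, rhs):
        text = describe(lhs, parentPrecedence: own) + " + " + describe(rhs, parentPrecedence: own)
    case let .multiply(lhs, rhs):
        text = describe(lhs, parentPrecedence: own) + " * " + describe(rhs, parentPrecedence: own)
    }
    return own < parentPrecedence ? "(\(text))" : text
}

// The tree produced by expanding `#double(5) * 2`.
let expandedTree = Expr.multiply(.add(.literal(5), .literal(5)), .literal(2))
// The tree a parser would build from the text `5 + 5 * 2`.
let parsedTree = Expr.add(.literal(5), .multiply(.literal(5), .literal(2)))

print(describe(expandedTree))  // (5 + 5) * 2
print(describe(parsedTree))    // 5 + 5 * 2
```

With a printer along these lines, the two trees no longer share a textual form, so a string-based comparison would tell them apart.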

jrose · August 31, 2023, 10:39pm

"the end result of using a macro is the same as you would get writing the code yourself"

The syntax tree is not an “end result” being referred to; the “end result” is the compiled program. And you can write the code yourself…using parentheses.

I agree with Joe that it would probably make sense for the printed representation to include synthetic parentheses, but elevating this inconsistency to “someone is lying” is going a bit far, even hyperbolically. It’s not like the premise of macros has been betrayed because the default textual representation, intended for debugging and unit tests, doesn’t match up with regular source code.

farzadshbfn · September 1, 2023, 9:26am

If the types are easily separated, that'd be very helpful. But regardless, I would like to point out:

What if this check happens inside assertMacroExpansion? Instead of expanding the source_with_macro and comparing it to the expected_result, it can create syntax tree of both source_with_macro and expected_result and then compare the trees. This way, it will guarantee the "textual" representation of the macro matches its semantic behaviour as well and would avoid this ambiguity.

What you're saying is true, but for a developer trying to predict what the code does, the end results are different. This confusion comes mostly from the unit test, because it checks that the two pieces of generated code are equal when they are not. In this case, I think we can say that assertMacroExpansion is not fully testing the macro expansion, since the two are different in the eyes of the compiler, and I think that's what @jgongo is trying to highlight as well: assertMacroExpansion should not pass this test, because the generated syntax trees are different.
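The difference between comparing text and comparing trees can be sketched with a toy model. Everything here (`Expr`, `naiveDescription`, the two sample trees) is hypothetical and exists only for illustration; it does not reflect how assertMacroExpansion is actually implemented.

```swift
// A toy expression tree (hypothetical; not SwiftSyntax's real node types).
// Equatable conformance is synthesized and compares structure.
indirect enum Expr: Equatable {
    case literal(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
}

// Printer without parentheses, mimicking the output discussed in this thread.
func naiveDescription(_ expr: Expr) -> String {
    switch expr {
    case let .literal(value):
        return String(value)
    case let .add(lhs, rhs):
        return "\(naiveDescription(lhs)) + \(naiveDescription(rhs))"
    case let .multiply(lhs, rhs):
        return "\(naiveDescription(lhs)) * \(naiveDescription(rhs))"
    }
}

// Tree produced by the macro expansion: `+` is the left operand of `*`.
let fromMacro = Expr.multiply(.add(.literal(5), .literal(5)), .literal(2))
// Tree a parser would build from the text `5 + 5 * 2`: `*` is the right operand of `+`.
let fromSource = Expr.add(.literal(5), .multiply(.literal(5), .literal(2)))

// A string-based check passes, a structural check does not.
let sameText = naiveDescription(fromMacro) == naiveDescription(fromSource)
let sameTree = fromMacro == fromSource

print(sameText)  // true
print(sameTree)  // false
```

A structural comparison catches the mismatch, but it requires parsing the expected source into a tree first, and the expected text then has to contain the parentheses to produce the matching tree.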

jgongo · September 1, 2023, 10:04am

Absolutely not. I can't write some code myself that produces the same exact syntax tree produced by the macro use. And no, writing some code that produces the same result is not the same as writing some code that produces the same syntax tree. Following that reasoning you could ask me to write 10 instead of 5 + 5...

For me this has some serious implications:

We were promised that macro use wouldn't involve any magic and that we would get the exact same result as writing the code ourselves, but here we are with an extremely simple case where you can't write any Swift source code that produces the same syntax tree as the one produced by the macro.
This tells me that there are some syntax trees that seem to be accepted by the compiler but aren't backed by any valid piece of Swift code, so does the compiler accept a "superset" of the syntax trees that can be generated from valid Swift code? Are we aware of this "superSwift" language and what it includes?
What happens when some expansion fails in this superSwift language? Will the warning / error be meaningful? Can it be linked to the code "generated" by the macro, knowing that there may be no valid Swift code that represents what the macro really did?
How can I be confident that if this simple case produces this, more complex cases won't do similar or even worse things?

Correct me if I'm wrong, but the textual representation is also used to show the expansion of macros in the code editor, and to interact with that code using the debugger, setting breakpoints and doing step by step execution (I don't think you used the term debugging to refer to this), so I wouldn't lightly disregard this textual representation as something used for secondary purposes.

And then you seem to disregard unit testing as something that isn't that important... How am I supposed to test the correctness of a macro I'm writing if I can't even trust SwiftSyntax to give me the code that represents the syntax tree the macro generates?

Anyway, I wasn't the one who promised that "the end result of using a macro would be exactly the same as writing the code yourself" or that "macros shouldn't involve any impenetrable magic". Let's remember that macros in other languages are indeed based on textual substitution, so to a certain extent it is easy to reason about them. In Swift they are based on manipulating the syntax tree of the source code (ouch), so if the tools used to do that (SwiftSyntax and SwiftSyntaxBuilder) and the macro application do these kinds of things (generating syntax trees that can't be generated from valid Swift source code, generating textual representations of syntax trees that produce a totally different syntax tree when parsed), then for me this breaks those promises and makes Swift macros harder to use than plain text-based traditional macros. They definitely seem to involve some black magic if a user of a macro isn't aware of syntax trees (let's remember that a user of a macro doesn't have to know anything about all of this when using a macro developed by somebody else).

I personally find this very confusing for somebody starting to work with macros just after watching the WWDC videos:

[Image: Screenshot 2023-09-01 at 11.33.37]

Anyway, this is my opinion, and you know what they say about opinions.