SE-0382: Expression Macros

First of all, thanks for the detailed reply!

If there are good reasons to use an @-attribute, that is fine with me. Thanks for the explanation.

Would this mean that macros using node.parent risk inconsistent behavior with incremental builds? That would be a huge limitation in my opinion. For example, the #printArguments macro would not work without accessing the parent nodes in order to get the function name and signature.

It means that the implementation shouldn't give access to the parent node at all, and it is a big limitation. It's possible that we could find a more reasonable "cut" point for the syntax nodes that are passed to the macro, one that balances macro expressivity against still preserving incremental builds. What if, for example, you could access up to the nearest enclosing function / type / variable / etc? That would be enough for #printArguments, and the built-in #function, without exposing the rest of the source file. It aligns fairly well with the granularity at which we could do incremental compilation, too.

I'll give that a try!
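A rough sketch of what such a "cut" could look like, using a tiny stand-in tree rather than the real SwiftSyntax types. `SyntaxNode`, its `Kind` cases, and `nearestEnclosingDeclaration(of:)` are all hypothetical names invented for this illustration; the real implementation may look quite different.

```swift
// Hypothetical, simplified stand-in for a syntax tree; this is NOT the SwiftSyntax API.
final class SyntaxNode {
    enum Kind { case sourceFile, typeDecl, functionDecl, variableDecl, codeBlock, macroExpansion }

    let kind: Kind
    let name: String
    private(set) weak var parent: SyntaxNode?
    private(set) var children: [SyntaxNode] = []

    init(_ kind: Kind, _ name: String = "") {
        self.kind = kind
        self.name = name
    }

    /// Attaches `child` below this node and returns it, so a tree can be built step by step.
    @discardableResult
    func add(_ child: SyntaxNode) -> SyntaxNode {
        child.parent = self
        children.append(child)
        return child
    }
}

/// Walks up from `node` and stops at the first enclosing function, type, or variable.
/// Everything above that declaration stays invisible to the caller.
func nearestEnclosingDeclaration(of node: SyntaxNode) -> SyntaxNode? {
    var current = node.parent
    while let candidate = current {
        switch candidate.kind {
        case .typeDecl, .functionDecl, .variableDecl:
            return candidate
        case .sourceFile, .codeBlock, .macroExpansion:
            current = candidate.parent
        }
    }
    return nil
}

// A struct containing a method whose body contains a macro call.
let file = SyntaxNode(.sourceFile)
let typeDecl = file.add(SyntaxNode(.typeDecl, "Greeter"))
let function = typeDecl.add(SyntaxNode(.functionDecl, "greet(name:)"))
let body = function.add(SyntaxNode(.codeBlock))
let macroCall = body.add(SyntaxNode(.macroExpansion, "#printArguments"))

print(nearestEnclosingDeclaration(of: macroCall)?.name ?? "none")  // prints "greet(name:)"
```

With a cut like this, the macro would see the function's name and signature, but not the enclosing type or the rest of the file.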

Doug

Accessing the nearest definition would help. However, if #printArguments also wanted to print the type name of self in case of a method, this would not be enough. Some other examples would be:

  1. detecting if the macro call is inside of a method or a free function
  2. detecting if self is of a value or reference type
  3. detecting or enumerating other members of self inside a method (This could be emulated using protocols. However, this would mean runtime checks making the code less efficient and necessitate runtime errors instead of build errors.)

There are some solutions I can think of. I am not sure if any of them are feasible:

  • Macros could declare the need to access the parent nodes, disabling incremental builds.

  • Maybe the compiler could provide some sort of declaration hierarchy to the macro, such as name, signature and attributes of the enclosing function, then attributes, type (class, struct, enum, etc.), name and inheritance of the type containing the function, maybe even the type containing this type, etc. This would solve example 3 only in part.

  • Maybe the whole syntax tree could be cached and the relevant parts be updated during incremental compilation.
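To make the second idea a bit more concrete, here is a minimal sketch of what a "declaration hierarchy" value could contain. `EnclosingDeclaration` and `DeclarationHierarchy` are hypothetical types made up for this illustration; nothing like them exists in the proposal. The sketch would cover examples 1 and 2 above, but not the enumeration of members from example 3.

```swift
// Hypothetical summary of one enclosing declaration, as the compiler might hand it to a macro.
struct EnclosingDeclaration {
    enum Kind { case function, structType, classType, enumType, actorType }

    var kind: Kind
    var name: String
    var attributes: [String] = []
    var inheritedTypes: [String] = []
}

// Hypothetical chain of enclosing declarations, innermost first.
struct DeclarationHierarchy {
    var declarations: [EnclosingDeclaration]

    /// The nearest enclosing nominal type, if any.
    var enclosingType: EnclosingDeclaration? {
        declarations.first { $0.kind != .function }
    }

    /// Example 1: is the macro call inside a method rather than a free function?
    var isInsideMethod: Bool {
        declarations.first?.kind == .function && enclosingType != nil
    }

    /// Example 2: is `self` a reference type? `nil` when there is no enclosing type.
    var selfIsReferenceType: Bool? {
        guard let type = enclosingType else { return nil }
        return type.kind == .classType || type.kind == .actorType
    }
}

let hierarchy = DeclarationHierarchy(declarations: [
    EnclosingDeclaration(kind: .function, name: "greet(name:)"),
    EnclosingDeclaration(kind: .structType, name: "Greeter", inheritedTypes: ["Sendable"]),
])
let freeFunction = DeclarationHierarchy(declarations: [
    EnclosingDeclaration(kind: .function, name: "main()"),
])

print(hierarchy.isInsideMethod, freeFunction.isInsideMethod)
```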

I suspect that other macro types (propertyWrapper, requirement, conformance, memberDeclaration) would need access to all information about the type in question (members, attributes, name, etc.). Therefore, a solution to this must be found. It would be unfortunate if expression macros missed out on this information only because they were proposed and accepted before the other macro types.

Something that I’m noticing in all of the example macros is a pervasive use of (macro-)runtime verification of arguments. I feel like while doing all this checking explicitly in the macro body allows for maximum flexibility in macro interfaces, there could be some benefit to offering an alternative, more-strongly-typed API. This could be implemented in a similar way to how any Swift type can either define one or more callAsFunction methods with statically typed parameters, or be @dynamicCallable and define a dynamicallyCall method that does validation at runtime. Here’s one way this could work for macros, taking the example WarningMacro:

protocol StaticExpressionMacro: ExpressionMacro {}
extension StaticExpressionMacro {
  // compiler auto-generates this based on your `expandCall` implementations
  static func expansion(of macro: MacroExpansionExprSyntax, in context: inout MacroExpansionContext) throws -> ExprSyntax
}

struct StaticWarningMacro: StaticExpressionMacro {
  static func expandCall(
    // context (+ other args) provided by the compiler go here:
    to macro: MacroExpansionExprSyntax,
    in context: inout MacroExpansionContext,
    // arguments passed by the macro invocation go here
    _ stringLiteral: StringLiteralExprSyntax
  ) throws -> ExprSyntax {
    guard
      // these checks are still necessary once we have the `StringLiteralExprSyntax`
      stringLiteral.segments.count == 1,
      case let .stringSegment(messageString)? = stringLiteral.segments.first
    else {
      throw CustomError.message("#myWarning macro requires a string literal")
    }
    // [same macro implementation]
  }

  // version for value-like macros
  // static func expandReference(to macro: MacroExpansionExprSyntax, in context: inout MacroExpansionContext) throws -> ExprSyntax {}
}

// potentially allow more concise declaration since the valid
// overloads can be inferred from the macro struct declaration?
public macro myWarning: StaticWarningMacro

Does this seem like a reasonable approach? I’m not entirely sold on allowing the parameters of the expandCall method to be declared as anything other than ExprSyntax since I imagine most macros will do semantic verification of their arguments rather than plain syntactic verification (but maybe it’s worth it to make very simple macros easier to write?)

Another reason I’m proposing this is that most of the checks in these macros are somewhat untestable: if you only declare a macro as taking one parameter, there’s no way to test that it behaves correctly when passed zero arguments or more than one, so that code would not be covered by unit tests (if the unit tests use the #blah syntax rather than invoking the macro directly).
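For readers less familiar with the language features this idea borrows from, here is a small sketch of the two styles in ordinary Swift. `StaticWarning` and `DynamicWarning` are hypothetical names used only for illustration; they are not macros and not part of the proposal.

```swift
// Statically typed: the compiler rejects calls with the wrong number or type of arguments.
struct StaticWarning {
    func callAsFunction(_ message: String) -> String {
        "warning: \(message)"
    }
}

// Dynamically checked: any number of String arguments compiles,
// so the argument count has to be validated at runtime.
@dynamicCallable
struct DynamicWarning {
    func dynamicallyCall(withArguments arguments: [String]) -> String? {
        guard arguments.count == 1 else { return nil }
        return "warning: \(arguments[0])"
    }
}

let staticWarning = StaticWarning()
let dynamicWarning = DynamicWarning()

print(staticWarning("deprecated"))            // prints "warning: deprecated"
print(dynamicWarning("deprecated") ?? "nil")  // prints "warning: deprecated"
print(dynamicWarning("a", "b") ?? "nil")      // prints "nil"
// staticWarning("a", "b") would not compile at all.
```

The last line is the testability point in miniature: the failure path of the dynamic version can be exercised in a test, while the static version has no failure path to test.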

Absolutely! I agree that SwiftSyntax's documentation is lacking at the moment. I'm going to use this PR to collate a basic set of articles on working with SwiftSyntax. I'm hoping to also add an article on working with SyntaxVisitors, and another on writing a refactoring pass with SwiftRefactor.

I tried to implement a very rough proof of concept of the result builder transform using an expression macro. In theory, this should be possible: a result builder type and a closure to transform are passed to the macro, and a transformed closure is returned. When I test it, however, the compiler gives me an error with the following output:

Macro expansion of #apply(resultBuilder:to:) in /Users/kocki/Downloads/swift-macro-examples/MacroExamples/main.swift:67:21-72:3 as (() -> String)
------------------------------
{ () -> String in
let __macro_local_0 = StringAppender.buildPartialBlock(first: StringAppender.buildExpression("This"));
let __macro_local_1 = StringAppender.buildPartialBlock(accumulated: __macro_local_0, next: StringAppender.buildExpression("is"));
let __macro_local_2 = StringAppender.buildPartialBlock(accumulated: __macro_local_1, next: StringAppender.buildExpression("a"));
let __macro_local_3 = StringAppender.buildPartialBlock(accumulated: __macro_local_2, next: StringAppender.buildExpression("sentence."));
return StringAppender.buildFinalResult(__macro_local_3)
}
------------------------------
Macro expansion of #apply(resultBuilder:to:) in /Users/kocki/Downloads/swift-macro-examples/MacroExamples/main.swift:67:21-72:3:3:69: error: cannot find '__macro_local_0' in scope
let __macro_local_1 = StringAppender.buildPartialBlock(accumulated: __macro_local_0, next: StringAppender.buildExpression("is"));
                                                                    ^~~~~~~~~~~~~~~
Macro expansion of #apply(resultBuilder:to:) in /Users/kocki/Downloads/swift-macro-examples/MacroExamples/main.swift:67:21-72:3:4:69: error: cannot find '__macro_local_1' in scope
let __macro_local_2 = StringAppender.buildPartialBlock(accumulated: __macro_local_1, next: StringAppender.buildExpression("a"));
                                                                    ^~~~~~~~~~~~~~~
Macro expansion of #apply(resultBuilder:to:) in /Users/kocki/Downloads/swift-macro-examples/MacroExamples/main.swift:67:21-72:3:5:69: error: cannot find '__macro_local_2' in scope
let __macro_local_3 = StringAppender.buildPartialBlock(accumulated: __macro_local_2, next: StringAppender.buildExpression("sentence."));
                                                                    ^~~~~~~~~~~~~~~
Macro expansion of #apply(resultBuilder:to:) in /Users/kocki/Downloads/swift-macro-examples/MacroExamples/main.swift:67:21-72:3:6:40: error: cannot find '__macro_local_3' in scope
return StringAppender.buildFinalResult(__macro_local_3)
                                       ^~~~~~~~~~~~~~~

The error persists if I use other variable names.

Full code

In MacroExampleLib:

public macro apply<R: ResultBuilder>(resultBuilder: R.Type, to closure: () -> Void) -> (() -> String) = MacroExamplesPlugin.ResultBuilderMacro // If I use `(() -> R.FinalResult)` as the return type, I get another error

public protocol ResultBuilder {
    associatedtype Component
    associatedtype FinalResult
    
    static func buildPartialBlock(first: Component) -> Component
    
    static func buildPartialBlock(accumulated: Component, next: Component) -> Component
    
    static func buildFinalResult(_ component: Component) -> FinalResult
}

In MacroExamplesPlugin:

public struct ResultBuilderMacro: ExpressionMacro {
    public static func expansion(
        of node: MacroExpansionExprSyntax, in context: inout MacroExpansionContext
    ) throws -> ExprSyntax {
        guard
            let resultBuilderSelfExpr = node.argumentList.first?.expression.as(MemberAccessExprSyntax.self),
            let resultBuilderName = resultBuilderSelfExpr.base?.withoutTrivia().description,
            let originalClosure = node.argumentList.dropFirst().first?.expression.as(ClosureExprSyntax.self)
        else {
            throw SomeError()
        }
        
        let originalStatements: [CodeBlockItemSyntax] = Array(originalClosure.statements.map { $0.withoutTrivia() })
        guard let firstStatement = originalStatements.first else {
            throw SomeError()
        }
        
        var localName = context.createUniqueLocalName()
        var newStatements: [String] = []
        newStatements.append("let \(localName) = \(resultBuilderName).buildPartialBlock(first: \(resultBuilderName).buildExpression(\(firstStatement)));")
        
        for statement in originalStatements.dropFirst() {
            let newLocalName = context.createUniqueLocalName()
            newStatements.append("let \(newLocalName) = \(resultBuilderName).buildPartialBlock(accumulated: \(localName), next: \(resultBuilderName).buildExpression(\(statement)));")
            localName = newLocalName
        }
        
        newStatements.append("return \(resultBuilderName).buildFinalResult(\(localName))")
        
        let joinedStatements = newStatements.joined(separator: "\n")
        return "{ () -> String in\n\(raw: joinedStatements)\n}"
    }
}

struct SomeError: Error {}

Finally, in main.swift:

@resultBuilder
struct StringAppender: ResultBuilder {
    static func buildExpression(_ expression: String) -> String {
        expression
    }
    
    static func buildExpression<T>(_ expression: T) -> String {
        String(describing: expression)
    }
    
    static func buildPartialBlock(first: String) -> String {
        first
    }
    
    static func buildPartialBlock(accumulated: String, next: String) -> String {
        accumulated + " " + next
    }
    
    static func buildFinalResult(_ component: String) -> String {
        component
    }
}

@StringAppender
var string: String {
    "This"
    "is"
    "a"
    "sentence."
}

let stringClosure = #apply(resultBuilder: StringAppender.self, to: {
    "This"
    "is"
    "a"
    "sentence."
})

print(string)
print(stringClosure())
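For comparison, the expansion printed in the compiler output above can be written out by hand in ordinary Swift. This is only a sketch: it redefines a reduced `StringAppender` without the @resultBuilder attribute, and the local names `step0` to `step3` are placeholders for the generated `__macro_local_N` names.

```swift
// Reduced copy of the builder from main.swift, used here as a plain namespace.
struct StringAppender {
    static func buildExpression(_ expression: String) -> String { expression }
    static func buildPartialBlock(first: String) -> String { first }
    static func buildPartialBlock(accumulated: String, next: String) -> String {
        accumulated + " " + next
    }
    static func buildFinalResult(_ component: String) -> String { component }
}

// Hand-written equivalent of the closure the macro is meant to produce.
let handExpanded: () -> String = {
    let step0 = StringAppender.buildPartialBlock(first: StringAppender.buildExpression("This"))
    let step1 = StringAppender.buildPartialBlock(accumulated: step0, next: StringAppender.buildExpression("is"))
    let step2 = StringAppender.buildPartialBlock(accumulated: step1, next: StringAppender.buildExpression("a"))
    let step3 = StringAppender.buildPartialBlock(accumulated: step2, next: StringAppender.buildExpression("sentence."))
    return StringAppender.buildFinalResult(step3)
}

print(handExpanded())  // prints "This is a sentence."
```

Since the same sequence of statements is valid when written by hand, the "cannot find ... in scope" errors seem more likely to come from how the expansion is type-checked than from the generated code itself.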

Macro arguments are type-checked against the parameter types of the macro prior to instantiating the macro.

macro stringify<T>(_: T) -> (T, String)

But what about a scenario where I want to construct a type based on the input provided, e.g. building a type from a JSON string:

macro jsonObj<T>(_: String) -> T

let obj = #jsonObj("""
{
"one": 1,
"two": 2
}
""")

Or registering routes in a server framework:

app.get("hello/:name") { req in
    // req could be inferred from the provided string as
    // SomeGenericStruct<(name: String)>
    let name = req.parameters.name
    return "Hello, \(name)!"
}

Such a scenario can degrade compile-time performance, since the macro has to be expanded for type inference, but having this as an option would be really beneficial.
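As a sketch of the information such a macro would have to extract from its string argument before the result type could be known, here is the route example in plain Swift. `routeParameterNames(in:)` is a hypothetical helper, and the ":name" convention is taken from the example above.

```swift
/// Returns the names of all ":parameter" path components in a route pattern.
func routeParameterNames(in pattern: String) -> [String] {
    pattern.split(separator: "/").compactMap { component -> String? in
        guard component.hasPrefix(":") else { return nil }
        return String(component.dropFirst())
    }
}

print(routeParameterNames(in: "hello/:name"))  // prints ["name"]
```

A macro could turn that list into the labels of a tuple type such as (name: String), but only if the compiler were willing to expand the macro before it finishes type inference.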

Hello everyone!

The language workgroup discussed the initial review feedback for this proposal, and we felt that given its introduction over the previous holiday season, it'd be reasonable to extend the review for another two weeks (until January 15, 2023).

The workgroup (and proposal authors) would be particularly interested in hearing feedback based on hands-on user experience with the draft implementation. Are there sharp edges, unexpected limitations, or other ideas for improvement in the proposed design that arise based on attempts to make your own expression macros? This would all be very useful information that would help shape not just expression macros but, as the first in a series of these, potentially other macros as well.

As always, thank you for your participation in the Swift Evolution process!

Xiaodi Wu
Review Manager

@Douglas_Gregor I would like to experiment a bit more with the current implementation. For that, it would be very helpful if I could get some feedback on the error mentioned above. I would like to know if this is an inherent limitation of the current implementation when expanding closures or if I can somehow work around it.

Thanks in advance!

Would expression macros support trailing closure syntax? I have the feeling that passing closures could be a major use-case.

Big +1 to the proposal from me.

Here's one use case I'd like to highlight. With the example repository I've been able to implement a compile-time conversion from string literals to arrays of bytes. This seems like a reasonable solution for environments where the String type with its Unicode handling is too heavyweight. Here's the example code for MacroExamplesPlugin.

import SwiftSyntax
import _SwiftSyntaxMacros

public struct UTF8ArrayMacro: ExpressionMacro {
  public static func expansion(
    of macro: MacroExpansionExprSyntax, in context: inout MacroExpansionContext
  ) throws -> ExprSyntax {
      guard let firstElement = macro.argumentList.first,
        let stringLiteral = firstElement.expression
          .as(StringLiteralExprSyntax.self),
        stringLiteral.segments.count == 1,
        case let .stringSegment(messageString)? = stringLiteral.segments.first
      else {
        throw CustomError.message("#utf8 macro requires a string literal")
      }

      return "\(raw: messageString.syntaxTextBytes)"
  }
}

Then in MacroExamplesLib:

public macro utf8(_ string: String) -> [UInt8] = MacroExamplesPlugin.UTF8ArrayMacro

And at the place of use:

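_ = #utf8("Hello, world!").withUnsafeBytes {
    // The byte array is not null-terminated, so write an explicit byte count
    // with fwrite instead of passing the buffer to fputs.
    fwrite($0.baseAddress, 1, $0.count, stdout)
}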

When dead code elimination becomes advanced enough to strip out unused Unicode tables on platforms where everything is linked statically, this would allow producing binaries as small as those you can get with Rust or C. Of course, there are downsides to this approach, as you're losing interpolation and other niceties of String. But for applications where binary size is a top priority, this is still a big improvement.
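To illustrate what the expansion amounts to, the following sketch computes the same bytes at runtime and prints the array literal text that a macro like this would be expected to splice into the source. It uses only the standard library; the variable names are placeholders.

```swift
let message = "Hello, world!"

// The UTF-8 code units of the literal.
let bytes: [UInt8] = Array(message.utf8)

// Interpolating an array yields its literal-like description,
// which is the text the expansion would consist of.
let expansionSource = "\(bytes)"

print(expansionSource)
// prints "[72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]"
```

With the macro, this work happens once at compile time, and the program only ever contains the [UInt8] literal.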

Your macro implementation looks good, thank you for working on it! I (very recently) fixed a compiler bug that caused the incorrect error messages, and with my locally-built compiler (and adding a bunch of @expression utterances), your result-builder macro works. My fix hasn't made it into a snapshot yet, but here's a one-off toolchain that contains it.

Yes, expression macros do support trailing closure syntax! I updated your example macro to this:

let stringClosure = #apply(resultBuilder: StringAppender.self) {
    "This"
    "is"
    "a"
    "sentence."
}

and taught the macro implementation to look for a trailing closure as well:

        guard
            let resultBuilderSelfExpr = node.argumentList.first?.expression.as(MemberAccessExprSyntax.self),
            let resultBuilderName = resultBuilderSelfExpr.base?.withoutTrivia().description,
            let originalClosure = node.argumentList.dropFirst().first?.expression.as(ClosureExprSyntax.self) ??
              node.trailingClosure
        else {
            throw SomeError()
        }

and it works fine.

Doug

I finally had an opportunity to try writing some macros. Some thoughts:

JSON literal conversion

I was inspired by the JSON example above, so I wrote one that converts a JSON string literal to nested calls to JSON case initializers. I don't think this is precisely what the author of that post had in mind (it sounds like they want to construct specific types from the literal, more like Codable), but since we don't have the semantic info for that, I tried this simpler idea instead.

Implementation:

// In the macro module
public enum JSON {
  case null
  case string(String)
  case number(Double)
  case array([JSON])
  case object([String: JSON])
}

public macro jsonLiteral(_ string: String) -> JSON = MacroExamplesPlugin.JSONLiteralMacro

// Plug-in implementation
import Foundation
import SwiftSyntax
import SwiftSyntaxBuilder
import _SwiftSyntaxMacros

private func jsonExpr(for jsonValue: Any) throws -> ExprSyntax {
  switch jsonValue {
  case is NSNull:
    return "JSON.null"

  case let string as String:
    return "JSON.string(\(literal: string))"

  case let number as Double:
    return "JSON.number(\(literal: number))"

  case let array as [Any]:
    var elements = [ArrayElementSyntax]()
    for element in array {
      let elementExpr = try jsonExpr(for: element)
      elements.append(
        ArrayElementSyntax(
          expression: elementExpr,
          trailingComma: .commaToken()))
    }
    let arrayLiteral = ArrayExprSyntax(
      elements: ArrayElementListSyntax(elements))
    return "JSON.array(\(arrayLiteral))"

  case let dictionary as [String: Any]:
    guard !dictionary.isEmpty else {
      return "JSON.object([:])"
    }

    var elements = [DictionaryElementSyntax]()
    for (key, value) in dictionary {
      let keyExpr = StringLiteralExprSyntax(content: key)
      let valueExpr = try jsonExpr(for: value)
      elements.append(
        DictionaryElementSyntax(
          keyExpression: keyExpr,
          valueExpression: valueExpr,
          trailingComma: .comma))
    }
    let dictionaryLiteral = DictionaryExprSyntax(
      content: .elements(DictionaryElementListSyntax(elements)))
    return "JSON.object(\(dictionaryLiteral))"

  default:
    throw CustomError.message("Invalid type in deserialized JSON: \(type(of: jsonValue))")
  }
}

public struct JSONLiteralMacro: ExpressionMacro {
  public static func expansion(
    of macro: MacroExpansionExprSyntax,
    in context: inout MacroExpansionContext
  ) throws -> ExprSyntax {
    guard let firstElement = macro.argumentList.first,
      let stringLiteral = firstElement.expression
        .as(StringLiteralExprSyntax.self),
      stringLiteral.segments.count == 1,
      case let .stringSegment(jsonString)? = stringLiteral.segments.first
    else {
      throw CustomError.message("#jsonLiteral macro requires a string literal")
    }

    let json = try JSONSerialization.jsonObject(
      with: jsonString.content.text.data(using: .utf8)!,
      options: [.fragmentsAllowed])
    let jsonCaseExpr = try jsonExpr(for: json)
    if let leadingTrivia = macro.leadingTrivia {
      return jsonCaseExpr.withLeadingTrivia(leadingTrivia)
    }
    return jsonCaseExpr
  }
}

This worked really nicely (after stumbling on some SwiftSyntax corruption issues :grimacing:). The following invocation produced the source code I expected:

let json: JSON = #jsonLiteral("""
  {
    "name": "Bojack Horseman",
    "species": "horse",
    "age": 59,
    "friends": [
      "Diane Nguyen",
      "Mr. Peanutbutter",
      "Todd Chavez"
    ],
    "selfControl": null
  }
  """)
// JSON.object(["selfControl":JSON.null,"name":JSON.string("Bojack Horseman"),
// "species":JSON.string("horse"),"friends":JSON.array([
// JSON.string("Diane Nguyen"),JSON.string("Mr. Peanutbutter"),
// JSON.string("Todd Chavez"),]),"age":JSON.number(59.0),])

(Naturally there are some problems like key ordering being different due to JSONSerialization and NSDictionary but that's not relevant here.)
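One possible way around the key-ordering issue is to emit dictionary entries in sorted key order. The following is a minimal sketch of that idea which builds the expansion as a plain string instead of SwiftSyntax nodes. `swiftSource(for:)` is a hypothetical helper, and the `JSON` enum is repeated so the snippet stands alone.

```swift
enum JSON {
    case null
    case string(String)
    case number(Double)
    case array([JSON])
    case object([String: JSON])
}

/// Renders `value` as Swift source made of nested `JSON` case initializers.
/// Object keys are sorted so that the output is deterministic.
func swiftSource(for value: JSON) -> String {
    switch value {
    case .null:
        return "JSON.null"
    case .string(let string):
        // String(reflecting:) yields a quoted, escaped literal.
        return "JSON.string(\(String(reflecting: string)))"
    case .number(let number):
        return "JSON.number(\(number))"
    case .array(let elements):
        let items = elements.map { swiftSource(for: $0) }
        return "JSON.array([" + items.joined(separator: ", ") + "])"
    case .object(let members):
        guard !members.isEmpty else { return "JSON.object([:])" }
        let pairs = members.keys.sorted().map { key in
            "\(String(reflecting: key)): \(swiftSource(for: members[key]!))"
        }
        return "JSON.object([" + pairs.joined(separator: ", ") + "])"
    }
}

let sample = JSON.object([
    "species": .string("horse"),
    "age": .number(59),
    "selfControl": .null,
])
print(swiftSource(for: sample))
```

Joining with a separator also sidesteps the trailing-comma bookkeeping, at the cost of giving up the structured syntax nodes.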

My takeaways here are:

  • Using string interpolation to construct nodes from a combination of literal Swift code and substituted content is a joy to use.
  • When you have to drop down a level to raw initializers, SwiftSyntax/SwiftSyntaxBuilder still provides nice defaults in many places for fixed structural tokens. For example, when creating an ArrayExprSyntax, you don't have to provide the [ and ] tokens manually.
  • But, we should extend the builder functionality in SwiftSyntax and/or provide additional helpers to make common functionality much more approachable for users. The average macro author shouldn't have to deal with subtle things like including trailing commas in arrays/dictionaries/argument lists, nor have to be aware of the different syntax node representation used for the content of an empty dictionary vs. a dictionary with elements. If SwiftSyntaxBuilder already has some of this, I've missed it, in which case it's a documentation problem instead. My code is probably not the simplest form possible; since SwiftSyntax is such a large API surface, we should figure out how to strongly push users toward the simplest/cleanest APIs.

Trivia handling

I observed that in your FontLiteralMacro implementation, you write this:

if let leadingTrivia = macro.leadingTrivia {
    return initSyntax.withLeadingTrivia(leadingTrivia)
}
return initSyntax

What is the purpose of retaining the leading trivia—to preserve any comments that may precede it if the macro expansion is printed for debugging? I can see this being more important for declaration macros where you'd want to be able to scrape for documentation comments after expansion.

If trivia had to be manually preserved, I would have expected something like this to fail:

let x = #someMacro(...) + y

If the expansion didn't preserve the space in #someMacro(...)'s trailing trivia, you'd end up with the expansion let x = VALUE+ y, which would fail to parse because + is now a postfix operator instead of an infix operator. But I tested that, and it still appears to be parsed as though it were let x = VALUE + y, so I'm unclear on what the actual trivia behavior is here.

EDIT: I may be able to answer my own question here. Since the macro is applied to the already type-checked AST, it knows that #macro(...) + y must already be an infix operator even if the trivia for the expansion changes?

Can the macro infrastructure manage trivia automatically so that <leading trivia>#macro(...)<trailing trivia> is always transformed to <leading trivia><expanded node><trailing trivia>, so that the macro can never remove or replace trivia? I could see us wanting to merge the original trivia with the trivia attached to the expanded node, but not allow it to be completely replaced.
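A minimal sketch of the behavior being asked for, using a toy node type instead of SwiftSyntax. `Node` and `expandPreservingTrivia(_:with:)` are hypothetical names, and the merge policy shown (original trivia on the outside, macro-provided trivia on the inside) is just one possible choice.

```swift
// Toy model of a syntax node: some text with trivia on either side.
struct Node {
    var leadingTrivia = ""
    var text: String
    var trailingTrivia = ""

    var source: String { leadingTrivia + text + trailingTrivia }
}

/// Expands `call` with `macro`, which only ever sees a trivia-free copy of the node.
/// The original trivia is merged back around whatever the macro returns,
/// so the macro can add trivia but never remove or replace the original.
func expandPreservingTrivia(_ call: Node, with macro: (Node) -> Node) -> Node {
    var expanded = macro(Node(text: call.text))
    expanded.leadingTrivia = call.leadingTrivia + expanded.leadingTrivia
    expanded.trailingTrivia = expanded.trailingTrivia + call.trailingTrivia
    return expanded
}

let call = Node(leadingTrivia: "// keep me\n", text: "#someMacro(1)", trailingTrivia: " ")
let expanded = expandPreservingTrivia(call) { _ in Node(text: "VALUE") }

print(expanded.source + "+ y")
```

Here the space before + survives even though the macro returned a node without any trivia.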

In fact, I wonder if it's problematic for macro expansions to be able to see trivia at all. I wrote a really stupid macro:

// Returns `string` as an integer, but also adds the number in the preceding comment if there is one.
public macro theSameNumber(_ string: String) -> Int = MacroExamplesPlugin.TheSameNumberMacro

let n1 = #theSameNumber("123")
print(n1)  // 123

let n2 =
  // 5
  #theSameNumber("123")
print(n2)  // 128
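The plug-in side is not shown here, so the following is only a guess at the core logic, written as a plain function over strings. `theSameNumber(_:leadingTrivia:)` and its comment-parsing rules are assumptions made for illustration; the actual macro may inspect the trivia differently.

```swift
/// Parses `argument` as an integer and adds the number found in the first
/// `//` comment of `leadingTrivia`, if that comment contains only an integer.
func theSameNumber(_ argument: String, leadingTrivia: String) -> Int? {
    guard let base = Int(argument) else { return nil }

    for line in leadingTrivia.split(separator: "\n") {
        // Skip indentation, then look for a line comment.
        let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
        guard trimmed.hasPrefix("//") else { continue }

        let body = String(trimmed.dropFirst(2)).filter { !$0.isWhitespace }
        if let extra = Int(body) {
            return base + extra
        }
    }
    return base
}

print(theSameNumber("123", leadingTrivia: "") ?? -1)             // prints "123"
print(theSameNumber("123", leadingTrivia: "\n  // 5\n  ") ?? -1) // prints "128"
```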

Allowing folks to have semantically significant comments feels like a bad idea, but I can also see value in having access to trivia for other kinds of macros. For example, a declaration macro could treat a preceding doc comment as a template to splat out new doc comments for the declarations that it generates. I'm not sure what's the best way to square these goals; should we only provide trivia to certain types of macros (e.g., declarations but not expressions)? Should we require any kind of macro to explicitly opt-in if it wants the trivia? Or do we just accept that people can do bizarre things with it?

One curiosity I noticed here is that you can have macros which do something different if a closure is specified as an argument vs as a trailing closure. Not sure what conclusions to draw from this, just feels a bit off to me.

This made me wonder: how will we know which version of SwiftSyntax we are coding against?

Yes, that’s a good point. We should strip trivia when providing the node to the macro, because one should not be able to affect the surrounding trivia. I'll fix this in the implementation.

Personally, I'm of the opinion that we should accept that people can do bizarre things with comments, and be okay with that.

Doug

In the final implementation, the version of SwiftSyntax you're coding against is whatever your package depends on. In today's prototype (where the macro implementations are built against the shared libraries in the toolchain), there's no good answer.

Doug

We could go so far as to strip the trivia from all the tokens recursively under the macro expansion node, but perhaps that extra processing overhead isn't worth it.

Although, since the final implementation (using a standalone executable instead of a dylib) will need to do some kind of processing anyway to convert the in-memory tree to something that can be sent over IPC, maybe that would lessen the impact.

I don't have a strong opinion here though, and I'm leaning weakly toward keeping the implementation efficient even if the cost is letting people have weird dependencies on comments and whitespace.

I'm not sure where things like custom runtime metadata attributes end up in syntax... is that as trivia or something else?

I can definitely foresee needing to handle these in macros, comments not so much though.

"Trivia" encompasses whitespace, comments, and the occasional garbage/unexpected text (see Trivia.swift on the main branch of the apple/swift-syntax repository on GitHub).

Anything with syntactic/semantic significance will have its own syntax node; attributes for example are represented by AttributeSyntax (there's currently a distinction between those which are built-in attributes and CustomAttributeSyntax for property wrappers, but it looks like that distinction is being removed).
