[Macros] Accessing the "parent context" of a syntax node passed to a macro

Douglas_Gregor · April 17, 2023, 1:04pm

Hi all,

The expansion operation of a macro implementation only receives source code corresponding to how a macro is spelled (e.g., #stringify(x + y) or @AddCompletionHandler) and any syntax it directly operates on (e.g., the declaration node to which @AddCompletionHandler was attached). These macros are not provided with any "context" information, e.g., is the attached node a member of a struct? Is the expansion inside a function?

I suggest that we introduce an API on MacroExpansionContext that provides the "parent" context node for a given syntax node. The API could look like this:

protocol MacroExpansionContext {
  // ...
  
  /// Determine the parent context of the given syntax node.
  ///
  /// For a syntax node that is part of the syntax provided to a macro
  /// expansion, find the innermost enclosing context node. A context
  /// node is an entity such as a function declaration, type declaration,
  /// or extension that can have other entities nested inside it.
  /// The resulting context node will have any information about nested
  /// entities removed from it, to prevent macro expansion operations from
  /// seeing unrelated code within the program.
  func parentContext<Node: SyntaxProtocol>(of node: Node) -> Syntax?
}

As noted in the comment, the resulting parent node will have much of the syntax stripped, including the bodies of functions and the members of types and extensions. For example, consider this code:

extension A {
  struct B {
    func f(a: Int) { 
      if a != 0 {
        #printContext
      }
    }
    func g(b: Int) { code code code }
  }
  func h(i: Int) { code code code }
}

The parent context of the #printContext macro expansion would be the function f(a: Int), but with the body removed:

func f(a: Int)

The parent context of that node will be the enclosing struct B, again with no members:

struct B { }

and the same for its parent context:

extension A { }

Parent contexts therefore give the full syntactic nesting structure of the nodes provided to a macro, but without exposing any information that is outside of the macro expansion's "slice" of the program. We're looking for the sweet spot where we don't compromise our incremental build performance, but we can write useful macros.
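To make the stripping rule concrete, here is a minimal, self-contained sketch of the walk described above. It is a toy model only: `Node`, `Kind`, and `enclosingContexts(of:in:)` are hypothetical names invented for illustration and are not SwiftSyntax types or part of the proposed API. It reproduces the `extension A` example and collects the enclosing contexts innermost first, dropping function bodies and type members along the way.

```swift
/// Hypothetical stand-in for a syntax node; NOT a SwiftSyntax type.
struct Node {
    enum Kind { case function, type, other }
    var kind: Kind
    /// Header text for contexts (e.g. "func f(a: Int)"), source text otherwise.
    var text: String
    var children: [Node] = []
}

/// Returns the stripped contexts enclosing the node whose text equals `target`,
/// innermost first, or nil if `target` does not occur under `node`.
func enclosingContexts(of target: String, in node: Node) -> [String]? {
    if node.text == target { return [] }
    for child in node.children {
        if var found = enclosingContexts(of: target, in: child) {
            switch node.kind {
            case .function: found.append(node.text)          // body removed
            case .type:     found.append(node.text + " { }") // members removed
            case .other:    break                            // not a context node
            }
            return found
        }
    }
    return nil
}

// The example from this post; sibling declarations never appear in the result.
let tree = Node(kind: .type, text: "extension A", children: [
    Node(kind: .type, text: "struct B", children: [
        Node(kind: .function, text: "func f(a: Int)", children: [
            Node(kind: .other, text: "if a != 0", children: [
                Node(kind: .other, text: "#printContext"),
            ]),
        ]),
        Node(kind: .function, text: "func g(b: Int)"),
    ]),
    Node(kind: .function, text: "func h(i: Int)"),
])

let contexts = enclosingContexts(of: "#printContext", in: tree) ?? []
print(contexts)
```

Note that the `if` statement is skipped because it is not a context node, and neither `g(b:)` nor `h(i:)` is visible to the expansion.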

I have stubbed out and partially implemented this API in this pull request, and would love to hear everyone's thoughts on the API design and whether it's providing enough information to implement the macros you have in mind.

Doug

Jumhyn · April 17, 2023, 1:23pm

Could you elaborate a bit further on why this is useful?

bzamayo · April 17, 2023, 1:35pm

Here's a use case:

I have a theoretically-working prototype* of a macro that prints the enclosing context for logging. For instance, if you had

func myMethod(_ text: String, num: Int, flag: Bool) {
     #debugLog("message")
}

The macro would expand to printing myMethod("text", num: 4, flag: true): message to the console, or whatever the actual values of the parameters were.

To achieve this, you naturally need to be able to access the syntax of the parent declaration. This API provides that. I don't really have a critique on the proposed API — seems good to me.
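As a rough illustration of the intended result, the following is a hand-written stand-in for what the expansion might produce. It is only a sketch: the exact output format is an assumption based on the description above, and the function returns the logged line purely so the example can be checked (the real method would return nothing).

```swift
// Hand-written equivalent of a possible #debugLog("message") expansion.
// The format of the logged line is an assumption, not the macro's actual output.
func myMethod(_ text: String, num: Int, flag: Bool) -> String {
    // A macro that can see the enclosing declaration could generate this line
    // from the function name and its parameter list.
    let line = "myMethod(\(String(reflecting: text)), num: \(num), flag: \(flag)): message"
    print(line)
    return line
}

let logged = myMethod("text", num: 4, flag: true)
```

Without access to the enclosing function declaration, the macro has no way to learn the name `myMethod` or the parameter labels it needs to build that string.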

*As in, I have done the SwiftSyntax manipulation logic already ... I just haven't bothered downloading a toolchain to actually make the macro yet.

grynspan · April 17, 2023, 5:43pm

I've discussed something like this with Doug off-forum previously. This solution would be more than sufficient for my macros use case, and I think it is a reasonable approach.

I'm curious why we can't just make the parent property of the macro/target syntax nodes refer to it though?

allevato · April 17, 2023, 5:50pm

"Parent" would be a misnomer here, because you're not always getting back the parent node (the immediate parent in the syntax tree). In the #printContext example above, the parent would be either the CodeBlockItem or CodeBlock representing the body of the if statement.

So I also feel like parentContext is the wrong name for this method for the same reason. Maybe something like ancestorContext or containingContext would help to indicate that you're not just getting the "parent" back, but walking up multiple levels to some more meaningful hierarchical structure.

What would the behavior be if #printContext was invoked in top-level code? Would the "parent context" be the SourceFileSyntax, or nil?

grynspan · April 17, 2023, 5:59pm

Perhaps node(containing:)?

Douglas_Gregor · April 17, 2023, 6:40pm

I like "containingContext" a lot. We could even go all the way to "lexicalContext" so that it's precise and one doesn't (e.g.) expect the semantic context for something like:

struct A { }
extension A {
  struct B { }
}
typealias C = A.B

extension C {
  func f() {
    #printContext
  }
}

The context we'll provide is f and then C, not f and then B and then A.

As for #printContext in top-level code: the parent context is nil. We could conjure up a top-level SourceFileSyntax, but it would have to be empty, so there's really no benefit to doing so.

Doug

allevato · April 17, 2023, 7:47pm

I do like lexicalContext, and it sets us up nicely to use a dual naming like semanticContext in the future when we do offer that functionality in whatever form those APIs take.

wes1 · April 17, 2023, 8:43pm

Interesting! As a user, my main application for this is tracing code for validation purposes. That would always traverse the parents to create a lexical context path from which to build a global name.

Since the other information on the parent nodes is largely stripped, should the API instead just provide this entire path directly (based on node.allMacroParentContexts() sans first)?

And include a filename for top-level code (regardless of WMO)?

A per-module lexical-scope path is the global quasi-key I'd want for code audits and perhaps code generation.

(A path might also avoid excessive expectations about a context that only provides a name.)
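A minimal sketch of the kind of path I have in mind, assuming the contexts arrive innermost first. Everything here is hypothetical: `LexicalContext`, `lexicalPath(module:contexts:)`, and the module name are made up for illustration and are not part of the proposed API.

```swift
/// Hypothetical, simplified view of one enclosing context (name only).
struct LexicalContext {
    var name: String
}

/// Builds an outermost-to-innermost dotted path, prefixed with the module name.
/// `contexts` is assumed to be ordered innermost first.
func lexicalPath(module: String, contexts: [LexicalContext]) -> String {
    let names = contexts.reversed().map(\.name)
    return ([module] + names).joined(separator: ".")
}

// Contexts for the #printContext example: f, then B, then A.
let path = lexicalPath(
    module: "MyModule",  // placeholder module name
    contexts: [
        LexicalContext(name: "f(a:)"),
        LexicalContext(name: "B"),
        LexicalContext(name: "A"),
    ]
)
print(path)
```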

Douglas_Gregor · April 17, 2023, 10:05pm

We certainly could do that; it's actually easier to implement than the proposed API, because we don't have to track anything about the intermediate nodes we generated.

The filename is available already through the location API. Do we need it again here?

Doug

wes1 · April 18, 2023, 1:36am

No, sorry. And thanks for the quick reply!

ktoso · April 18, 2023, 2:35am

Looks good conceptually. I am curious whether this information is readily available in the macro process, or whether it will require a lookup through the "plugin" infrastructure that fetches it lazily on demand. I wasn't quite sure from just the swift-syntax PR which approach this would end up being.

s-k · April 18, 2023, 12:58pm

To me, there are currently three major categories of information not accessible by macros:

Lexical context
Type information about sub-expressions
Reflection of arbitrary types

It looks like the proposed addition would be sufficient to solve the first case, at least for the macros I have experimented with. I don't have a preference as to whether this info is made accessible through a function such as parentContext() or returned as a [Syntax] array, as suggested by @wes1.

Another point: I think it would be important for the Syntax instances to contain all attributes, inheritance clauses, etc. of the parent declarations, i.e.:

@MainActor class ViewModel<Content>: ObservableObject where Content: View { }

instead of just:

class ViewModel { }

Douglas_Gregor · April 18, 2023, 4:47pm

The swift-syntax PR doesn't yet hook things into the compiler, because I don't have an answer to the question of how the macro process should obtain this information. Right now, it isn't readily available in the macro process, so we either need to send it along as additional information in the expansion request (easy enough to do) or consider a different model that exposes more of the source file to the macro process (which would then filter it).

Yes, attributes, inheritance clauses, and the like are all retained on the parent declarations. It's the function bodies and members that we remove, because they shouldn't have an impact on how the macro is expanded.

Doug

ktoso · April 19, 2023, 3:13am

Thanks for the reply Doug, that makes sense. Not sure which way would be better here. Thanks for clarifying that's still being thought about.

erneestoc · June 14, 2023, 12:25am

Would this work for what I'm asking here?

Douglas_Gregor · June 17, 2023, 6:10am

No, it does not. It's only providing syntactic / lexical information, so you get the enclosing parent syntax, but you can't "reach across" to other nodes to ask questions about them.

Doug

MaximBazarov · October 13, 2023, 5:34pm

I think it's a great addition that would make macros very powerful. I have a case where several expansions rely on the same object as a dependency. I could create a new instance of this object in each expansion, but that would be a huge overhead. Knowing whether the parent context already conforms to the protocol that provides the object, so that I can refer to it in the expansion, would be a great improvement.

Pippin · October 13, 2023, 6:13pm

I was influenced by this post to try making (and finally learning about) a freestanding declaration macro that could get its surrounding type declaration. However, that wasn't possible without parent context information.

As discussed, it seems this would be very helpful for debugging purposes.

andrewtheis · December 15, 2023, 10:35pm

So from what I'm understanding here it's not possible to get information about the surrounding type? For example with:

struct MyType {
    @SomePeerMacro
    private let example = "Example"
}

In this example there would be no way currently for @SomePeerMacro to know it's within a struct with name MyType?