Function body macros

wadetregaskis · August 1, 2023, 3:35am

sspringer:

Shouldn't it be something like

func h(a: Int, b: Int) -> Int {
    var result: Int?
    withSpan("Doing complicated math") {
        let body: () -> Int = {
            return a + b
        }
        result = body()
    }
    return result!
}

Does the original function need to be inside the wrapper? That seems like it'd introduce the potential for unintended interactions (e.g. unintentional variable-in-the-middle conflicts). For example, for a hypothetical @PlusOne decorator:

let value = 42

@PlusOne
func findTheAnswer() -> Int {
    return value
}

…you could end up with the expansion:

let value = 42

func findTheAnswer() -> Int {
    func findTheAnswerInsidePlusOne() -> Int {
        return value // Compiler error in Swift 5.9 because it thinks you're
                     // trying to reference the local `value` declared in
                     // findTheAnswer(), below.
    }

    var value = findTheAnswerInsidePlusOne()

    value += 1

    return value
}

It might be safer (and cleaner at runtime) to just use peer functions, e.g.:

let value = 42

@inline(__always) // Optional?  To be added implicitly by the compiler.
func findTheAnswerInsidePlusOne() -> Int {
    return value
}

func findTheAnswer() -> Int { // Generated by the macro.
    // Compiler permits this shadowing without comment.
    var value = findTheAnswerInsidePlusOne()

    value += 1

    return value
}

sspringer · August 1, 2023, 3:54am

Aside from me being wrong about what constitutes withSpan (see above), this of course would be another variant.

s-k · August 4, 2023, 11:09am

I am looking forward to this being added to the language! I also don't mind if multiple macro types are provided for the same job (such as body, preamble or even bodyWrapper) because it makes the macro author's job easier.

I think it could be very valuable to perform arbitrary transformations on function bodies. To properly judge and/or mitigate the downsides, I find it important to have a good understanding of these drawbacks.

From my limited understanding, it seems that these downsides fall into two categories:

Performance: Type-checking twice can be expensive
Tooling: Type-checking after macro expansion may not give enough information to certain tools

I have several questions:

1. Does the solution of only type-checking after macro expansion have any negative impact on performance? If not, then I would not count performance as a downside.
2. What information exactly may be missing if the type-checking would only be done after macro expansion? I think it is crucial to have a mostly complete list of these.
3. Could the missing information be explicitly provided by the macro? Maybe the macro could (optionally) provide a list of variables including their types that are defined in the expanded body.
4. I assume that it could be a problem to display errors in the expanded code at the correct code locations in the original code. Would the macro need to explicitly help here by adding something like #sourceLocation annotations or could the compiler infer the correct location by looking at where the original syntax nodes have been inserted in the expanded code? Maybe the latter approach could even be used as an information source for other tools.

Michael_Ilseman · August 4, 2023, 1:26pm

I have not considered all the ramifications of this pitch, but I think something like this can be very helpful for complex Swift code.

For example, String is a complex type with manual bit layout and invariants across the bits. Our programming convention is that all initialization paths eventually funnel into a single raw-bit initializer which, in turn, calls an asserts-only _invariantCheck. The raw-bit funnel is also useful for testing/development. I could see a function body macro being useful for adding that call at the end of any given init, but would they be useful for ensuring that all code paths funnel to a common init? Or, alternatively, that all exits from any init call _invariantCheck()?

My understanding of the pitch is that this could help with mutation operations that wish to check invariants prior to and at the end of a mutation, such as when mutating a String.

System's FilePath follows similar patterns as it has invariants around (syntactic) normalization.

These macros would save only a single line in the source code per method/init and would still rely on programmer vigilance to ensure methods are properly annotated. I could see linter tools ensuring that any mutating or init methods have the macro attached, but is there a better way? E.g. a type-level annotation or macro that would put _invariantCheck() at the exit of any initializer or mutating method?

wadetregaskis · August 4, 2023, 3:46pm

I believe the concern here is not what might be 'missing' but that, as a design principle, Swift macros demand [already-]valid code as input. I think the reasoning is both that it makes it much easier for code readers to understand what they're seeing, and it makes it easier for macro authors since they can make at least some basic assumptions about their inputs.

The counter-example is well-established in preprocessor-based languages (e.g. C/C++), where what you see written as 'arguments' to a 'macro' is sometimes complete gibberish that is not even syntactically valid.

There are of course trade-offs, and I certainly see an appeal to being able to rely on a macro to "finish" a block of code that is otherwise incomplete and invalid, but I respect the Swift team's position on this.

dmt · August 4, 2023, 4:07pm

Can't disagree in case of C/C++ or m4, but sometimes macros are just conceptually beautiful.

Douglas_Gregor · August 4, 2023, 8:09pm

No, it does not have performance implications.

For the withSpan example, we'd need to know about the span parameter and its type information, but otherwise it leaves the body alone. There might be other subtleties when you start wrapping the body---for example, the body might be a Sendable closure, or a @MainActor closure, or an async closure that would change the type checking behavior from the original function's.

That's one way to deal with the span example, yes.

We don't have a way to do this today, but we want it so that code that comes directly from the user's source and gets embedded in the macro expansion can be reported in either (or both) places.

I think you can do all of these by analyzing the function body. Your macro might trend toward building a control flow graph if you want to handle arbitrarily-complicated functions, though.

Presumably, you could inject a defer that does this check.

You could use a member attribute macro to sprinkle a function body macro on mutating methods and initializers, and have that function body macro do the checking.
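As a rough illustration of the defer idea, here is a hand-written sketch of what such an expansion might look like. Nothing here is generated by a real macro: `SortedBag`, its `storage` property, and the "sorted" invariant are hypothetical, chosen only to stand in for a type with a real invariant such as String or FilePath.

```swift
// Hypothetical type, for illustration only: its invariant is that
// `storage` is always sorted in ascending order.
struct SortedBag {
    private(set) var storage: [Int] = []

    // Asserts-only check, in the spirit of the `_invariantCheck`
    // convention described earlier in the thread.
    func _invariantCheck() {
        assert(storage == storage.sorted(), "storage must remain sorted")
    }

    // What a function body macro might expand a mutating method into.
    // The `defer` is the only injected line; the rest is the user's body.
    mutating func insert(_ element: Int) {
        defer { _invariantCheck() } // runs on every exit path

        if storage.isEmpty {
            storage.append(element)
            return // early exit, still covered by the defer
        }
        let index = storage.firstIndex(where: { $0 > element }) ?? storage.endIndex
        storage.insert(element, at: index)
    }
}
```

Because the check sits in a defer, it covers early returns without the macro having to analyze control flow, which is what makes it attractive compared to appending a call at the end of the body.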

Right, there are definite trade-offs, and that's the crux of the argument here as to whether a function body macro can wholesale rewrite things vs. making more targeted changes that only augment an already-type-checked function body.

Doug

sighoya · August 6, 2023, 4:19pm

On the other hand, type-checking the function bodies before macro expansion has other issues. For one, the declarations to which an attached macro is attached are generally not type-checked at all prior to expansion, so type-checking the body prior to expansion would be a departure from that.

I think one would need to differentiate between attached macros that map a function declaration to a function definition and attached macros that map a function definition to another function definition.

For the latter case, attaching a macro to a function declaration (e.g. a protocol method declaration) would mean deferring its expansion until a body is provided by the user, e.g. by implementing the protocol.

I don't know, though, whether there are limitations that speak against it, e.g. limitations on importing macros from one library into another.
Further, I would say that function declarations should be type-checked before attached macros expand them. Why should we allow invalid function declarations to be filled in by attached macros?

sspringer · August 7, 2023, 12:42am

I could imagine that it might be a difference for what kind of error messages you get. If your unexpanded function body does not get the return value right, your error message when applying the macro might be that the type of some type parameter could not be inferred, when a more helpful error message would be that the function body that you start with does not return the expected value for all branches.

rdemarest · August 7, 2023, 5:18am

Douglas_Gregor:

Right, it forces us into a different design. The idea proposed there is that we put the "span" attributes into a closure that's passed to Traced, e.g.,
@Traced("Doing complicated math") { span in 
  span.attributes["operation"] = "addition"
}
func myMath(a: Int, b: Int) -> Int {
  return a + b
}

Could this be supported at the macro declaration level instead?

@attached(wrapping) macro Traced<ReturnType>(_ name: String? = nil) = #externalMacro(module: "SwiftDataMacros", type: "QueryMacro", wrapper: (span: SpanProvider) -> ReturnType)

ReturnType would be provided by the compiler based on what the function is attached to, and the Traced macro description provides the content of the wrapped function.

In addition, why not differentiate the macros that create a body out of whole cloth like @Remote and the ones that merely wrap the content like @Traced? @attached(body) would generate a body, and @attached(wrapping) would wrap whatever is in there.

ktoso · August 7, 2023, 10:26am

I just realized I forgot to comment on a potential alternative here (one that sadly is not so great):

Douglas_Gregor:

Right, it forces us into a different design. The idea proposed there is that we put the "span" attributes into a closure that's passed to Traced, e.g.,
@Traced("Doing complicated math") { span in 
  span.attributes["operation"] = "addition"
}
func myMath(a: Int, b: Int) -> Int {
  return a + b
}
They could also be passed to the macro via normal (non-closure) arguments, e.g., in a dictionary. Part of me likes this better, because it puts the "tracing stuff" in one place and the function logic in another, but I know my opinions are very much colored by not wanting the macro to be able to introduce new names into a function body that tools won't know about until after expansion.

So the problem with that is that attributes are almost never static. They're usually "the return code", "the error code" etc. So access to the span is necessary in combination with the context of the function, so like:

// @Traced("Doing complicated math") // ignoring macros for sake of clarity
func myMathPlus(request: HTTPRequest, a: Int, b: Int) -> Int {
  withSpan(...) { span in 
    span.attributes.client.address = ... // depends on client requesting
    span.attributes.path = "\(request.path)" // path is not hardcoded
    do {
      return ...
    } catch {
      span.recordError(error)
      throw error
    }
  }
}

So in any case, the problem is that we'd want both the span, and the function's context.
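To make that concrete, here is a minimal, self-contained sketch. The `Span` class and `withSpan` function below are simplified stand-ins written for this example, not the real tracing API, and `divide`, `DivisionByZero`, and the "path" attribute key are hypothetical names.

```swift
// Simplified stand-in for a tracing span (assumption, not the real API).
final class Span {
    let name: String
    var attributes: [String: String] = [:]
    private(set) var recordedErrors: [String] = []

    init(name: String) { self.name = name }

    func recordError(_ error: Error) {
        recordedErrors.append("\(error)")
    }
}

// Simplified stand-in for `withSpan`: creates a span and hands it to the body.
func withSpan<T>(_ name: String, _ body: (Span) throws -> T) rethrows -> T {
    let span = Span(name: name)
    return try body(span)
}

struct DivisionByZero: Error {}

// The body needs both the span and the function's own parameters:
// the attribute value comes from `path`, and the error is only known at runtime.
func divide(path: String, a: Int, b: Int) throws -> Int {
    return try withSpan("Doing complicated math") { span in
        span.attributes["path"] = path
        do {
            guard b != 0 else { throw DivisionByZero() }
            return a / b
        } catch {
            span.recordError(error)
            throw error
        }
    }
}
```

Neither the attribute nor the recorded error could be supplied as a static macro argument written above the function, which is why the span has to be visible inside the body.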

I do see the trickiness of introducing values by a macro, and maybe all those use-cases need to avoid macros... but it also feels very unfortunate at the same time.

Alex did mention that, sourcekit-lsp-wise, lookups should work as expected, which made me hopeful that we could introduce values after all.

I wonder if there's a middle ground somehow -- maybe forcing macros to introduce only dollar prefixed names? $span at least looks suspicious enough to maybe consider that a macro might have created it? We do use $ for similar "out of nowhere" values after all in other places -- with property wrappers -- WDYT about that idea @Douglas_Gregor ?

dmt · August 7, 2023, 6:54pm

An inconvenient aspect is that the proposed macro transforms the function's body, but is applied to the function's signature. If we were able to apply a macro to the function's body, we could pass arguments directly by name:

func myMathPlus(request: HTTPRequest, a: Int, b: Int) -> Int 
@Traced("Doing complicated math", "\(request.path)") {
  // ...
}

The span variable could be introduced to the body via some kind of explicit binding. So the macro defines only what it introduces to the body, but names to these things are assigned on the callsite. For example:

@attached(body) macro Traced(_ name: String, _ path: String) -> Span = ...

func myMathPlus(request: HTTPRequest, a: Int, b: Int) -> Int 
@Traced("Doing complicated math", "\(request.path)") as span {
  do {
    return ...
  } catch {
    span.recordError(error)
    throw error
  }
}

But this comes very close to the ability to define a function with a single expression (as in Kotlin):

func myMathPlus(request: HTTPRequest, a: Int, b: Int) -> Int = traced("Doing complicated math", "\(request.path)") { span in
  do {
    return ...
  } catch {
    span.recordError(error)
    throw error
  }
}

EDIT:
Or we could sacrifice the top-to-bottom sequentiality and just refer to the function's arguments from a line above the function's signature:

@Traced("Doing complicated math", "\(request.path)") as span
func myMathPlus(request: HTTPRequest, a: Int, b: Int) -> Int {
  do {
    return ...
  } catch {
    span.recordError(error)
    throw error
  }
}

joshw · August 9, 2023, 4:48pm

Mentioning this per recommendation from this thread - I encountered an issue with the latest Xcode 15 Beta 6 where no existing attached macros can be applied to function parameters.

@URLForm
@POST("/todos/:id/tags")
func createTag(@Path id: Int) async throws // 🛑 'accessor' macro cannot be attached to parameter

@attached(accessor)
public macro Path(_ key: String? = nil) = #externalMacro(module: "PapyrusPlugin", type: "DecoratorMacro")

While it seems like that makes sense for all the existing attached macros - I think there could be some interesting use cases for a macro that allows this; particularly since property wrappers aren't, and likely won't be, allowed at the protocol level.

Would something like this make sense as part of this pitch? For what it's worth, my no-longer-available use case was simply decorating a parameter so that another macro could generate code. I'm not certain what actual use cases a standalone macro for function parameters might have, though perhaps a generic "decoration" macro that generates nothing is a valid use case.

joshw · August 9, 2023, 5:00pm

A possible use case - while this isn't possible quite yet, it could be interesting to have a macro that expands a simple data type's parameters as a function's list of parameters.

struct Person {
    let id: Int
    let name: String
    let age: Int
    let height: Double
}

func createPerson(@Expand person: Person) { ... }

// "expands" to the following macro generated code

func createPerson(id: Int, name: String, age: Int, height: Double) { ... }

Would need to let macros see the syntax info of types they are on. Not sure if that's achievable or plausible down the road but could be a good use case for reducing boilerplate.

Douglas_Gregor · August 20, 2023, 5:17am

rdemarest:

Could this be supported at the macro declaration level instead?
@attached(wrapping) macro Traced<ReturnType>(_ name: String? = nil) = #externalMacro(module: "SwiftDataMacros", type: "QueryMacro", wrapper: (span: SpanProvider) -> ReturnType)
ReturnType would be provided by the compiler based on what the function is attached to, and the Traced macro description provides the content of the wrapped function.

Yes, that's certainly one way we could address this: put enough type information into the macro declaration itself to allow the compiler to type-check the function body prior to macro expansion. I like the use of the function type here to express the signature of the wrapping closure.

That's effectively what I was trying to do with body vs. preamble. It's the wrapping case that's proven most difficult.

Doug

Douglas_Gregor · August 20, 2023, 5:27am

I don't think it fits in my pitch specifically, which is focused on the bodies of functions, but I could certainly see a case for some kind of "marker" attached macro that's meant as a tool for marking up source code---but can be written on any declaration and is never expanded.

This isn't going to work in the macro system because a macro applied to a parameter will only be able to see the parameter, and probably the function declaration it's in---it won't be able to "look over" at the definition of the type named in the parameter.

Doug

tera · November 7, 2023, 10:36pm

One could also use a macro to introduce logging code on entry and exit to a function, expanding the following
@Logged
func g(a: Int, b: Int) -> Int {
  return a + b
}
into

func g(a: Int, b: Int) -> Int {
  log("Entering g(a: \(a), b: \(b))")
  defer {
    log("Exiting g")
  }
  return a + b
}

Speaking of logging / tracing specifically: are macros the best approach for this particular task?

1. Can it record the return value?
2. If I want to trace "everything", I'd have to put @Logged everywhere, which would be serious visual bloat.

IMHO, ideally this is a compiler switch in Xcode's diagnostic panel with no changes to the source code required. Understandably that would require quite serious changes to the prologue / epilogue code generation.

Douglas_Gregor · November 27, 2023, 7:47pm

To record the return value, I think we'd have to do some wrapping like withSpan rather than the defer approach with a preamble macro. That seems like a reasonable way to support the feature.
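A hand-written sketch of what such a wrapping expansion might look like follows. This is only an assumption about the shape of the generated code, not what any actual macro produces. The `Calculator` struct and its `logLines` storage are hypothetical and exist only so the example is self-contained and the log output can be inspected.

```swift
// Hypothetical container; `log` records lines instead of printing them.
struct Calculator {
    var logLines: [String] = []

    mutating func log(_ message: String) {
        logLines.append(message)
    }

    // Stand-in for a wrapping expansion of `@Logged func g(a:b:)`.
    mutating func g(a: Int, b: Int) -> Int {
        log("Entering g(a: \(a), b: \(b))")
        // The original body is moved into a closure, so its `return`
        // yields a value the wrapper can observe before passing it on.
        let result: Int = {
            return a + b
        }()
        log("Exiting g with result \(result)")
        return result
    }
}
```

With a plain defer, the exit log runs after the return expression has been evaluated but has no name for the value, which is why some form of wrapping is needed here.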

As I've noted before, macros are meant to be explicit, so I see (2) as a benefit. However, one could reduce the boilerplate for methods by, e.g., making @Logged also into a member-attribute macro, which propagates itself onto any methods/initializers within the corresponding type or extension, e.g.,

@Logged
extension MyType {
  func f() { ... }
  init() { ... }
}

could propagate @Logged down to the function and initializer, i.e.,

extension MyType {
  @Logged func f() { ... }
  @Logged init() { ... }
}

That reduces boilerplate somewhat, while still having the @Logged annotation clearly visible in the source.

Doug

Douglas_Gregor · November 27, 2023, 9:01pm

Hi all,

I got a little side-tracked, but I'm back to thinking about this. I've revised, and have a basic implementation of body macros working in the compiler (Function body macros by DougGregor · Pull Request #70034 · apple/swift · GitHub), but I think the proposal as originally written is still the right way to go.

Doug

allevato · November 28, 2023, 12:16am

Since the proposal calls out accessors as being supported, can you clarify how that attachment will look? Will the macro always be attached to the accessor specifier, like this?

var x: Int {
  @Traced get { ... }
  @Traced set { ... }
}

And so if someone was using the implicit getter syntax for a read-only computed property, would they be required to insert the get { ... } to attach the macro or could they attach it to the property instead?

// Without the macro...
var x: Int { someValue }

// ...Would this be required?
var x: Int { @Traced get { someValue } }

// ...Or could they do this?
@Traced var x: Int { someValue }

My guess is that this is like adding throws/async effects where you have to write the get, but an example or two in the proposal might be helpful to show users what to expect.
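For comparison, this is how the existing rule for effects looks today: a read-only computed property can only be marked throws or async on an explicit get, so the shorthand form has to be spelled out. The `Box` type and `ParseError` below are hypothetical names used only for this sketch; whether body macros follow the same rule is exactly the open question above.

```swift
struct ParseError: Error {}

struct Box {
    var raw: String

    // Shorthand getter: no place to attach an effect specifier.
    var isEmpty: Bool { raw.isEmpty }

    // Adding `throws` requires writing the `get` explicitly.
    var value: Int {
        get throws {
            guard let parsed = Int(raw) else { throw ParseError() }
            return parsed
        }
    }
}
```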