Function body macros

Douglas_Gregor · July 31, 2023, 6:14am

Hey all,

I've been playing around with a design for function body macros, i.e., attached macros that can introduce, augment, or change the body of a function. Motivating examples include things like:

@Remote // synthesizes a function body that does an RPC
func f(a: Int, b: String) async throws -> String

@Logged // adds logging on entry/exit
func g(a: Int, b: Int) -> Int {
  return a + b
}

@Traced("Doing complicated math") // wraps the body in a withSpan call for the swift-distributed-tracing library
func h(a: Int, b: Int) -> Int {
  return a + b
}

I've written up a proposal document, but I'm not happy with it. Specifically, the Traced example needs to expand to something like:

func h(a: Int, b: Int) -> Int {
  withSpan("Doing complicated math") { _ in
    return a + b
  }
}

but doing so brings up tricky questions about when to type-check the body of a function and how much latitude these macros have to change what the developer wrote in the source. I'd love feedback on the tradeoffs here... see the proposal linked below for details.

Read the pitch here...

Doug

mlienert · July 31, 2023, 6:52am

Hello,

Maybe I missed it in the pitch, but can you elaborate on the reasons why at most one body macro can be applied?

Thank you for your hard work on this feature.

ktoso · July 31, 2023, 8:45am

Thank you Doug for the awesome work here, we're very excited about this kind of macro for tracing systems!

Yeah I can see how body macros can get tricky and thus the "only one" limitation to avoid, what I assume to be, the complexity of having to type check the whole function body again and again if multiple body macros were to be combined? I think that limitation is probably fair...

Sadly it'll affect any macro that wants to make use of task-locals, but I think that is okay. At least for the tracing APIs these are convenience macros, so one can always go back to writing the withSpan by hand if some other body replacement macro would necessarily have to work on a function.

Thought experiment:
Is there some way we can get away without "wrapping the user-provided function body" for @Traced?

I attempted to see if we could get away without "body replacement" if we for example exposed "unsafe task local value push/pop" APIs, for purposes of implementation of such wrapper macros, however those also run into trouble.

Task locals are actually implemented by a "push" and "pop" pair for the value, applied as a push and defer { pop } pattern around the wrapped code. So we could consider exposing such unsafe APIs (getting the pop order wrong WILL immediately crash at runtime), and try to implement @Traced as:

func myMath(a: Int, b: Int) -> Int {
  var context: ServiceContext // get "current" from task-local
  // ... 
  _unsafeTaskLocalPush(ServiceContext.$current, value: context) // fictional

  let span = startSpan("Doing complicated math", context: context)

  defer { 
    span.end() 
    _unsafeTaskLocalPop(ServiceContext.$current) // fictional
  }

  // user code ~~~
  return a + b
}
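For comparison, here is a minimal sketch of the safe, scoped API that exists today: TaskLocal's withValue binds a value only for the duration of a closure, which is the push and defer { pop } pair described above wrapped up so the order cannot go wrong. DemoContext and currentName are made-up names for illustration; this does not use the real ServiceContext.

```swift
// Hypothetical stand-in for a context type; not the real ServiceContext.
enum DemoContext {
    @TaskLocal static var name: String = "none"
}

// Reads the task-local value from wherever it is called.
func currentName() -> String {
    DemoContext.name
}

// withValue binds the value, runs the closure, then restores the previous value on exit.
let inside = DemoContext.$name.withValue("math") {
    currentName()
}
// Outside the closure the default value is visible again.
let outside = currentName()

print(inside, outside)
```

Because the binding is tied to the closure's scope, user code has to live inside the closure, which is exactly why a macro built on it ends up wrapping the body.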

So the bare-bones version could work if we exposed such unsafe APIs. However, the tracing APIs require more from us; e.g., we're required to record errors in a span like this:

do {
  // user code
} catch {
  span.setError(error)
  span.setStatus ...
  throw error
}

So, we're unable to express this with a preamble anyway.
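The error-recording requirement can be sketched as a plain wrapper function, which shows why the user code has to sit inside a do/catch rather than after a preamble. DemoSpan, withDemoSpan and MathError are hypothetical names for illustration only; this is not the swift-distributed-tracing API.

```swift
// Hypothetical span type for illustration; not the swift-distributed-tracing API.
final class DemoSpan {
    let name: String
    private(set) var recordedError: Error?
    private(set) var ended = false

    init(name: String) { self.name = name }
    func setError(_ error: Error) { recordedError = error }
    func end() { ended = true }
}

struct MathError: Error {}

// Runs `body` with a fresh span, records any thrown error on the span,
// always ends the span, and rethrows the error to the caller.
func withDemoSpan<T>(_ name: String, _ body: (DemoSpan) throws -> T) rethrows -> T {
    let span = DemoSpan(name: name)
    defer { span.end() }
    do {
        return try body(span)
    } catch {
        span.setError(error)
        throw error
    }
}

// Keep a reference to the span so its state can be inspected afterwards.
var captured: DemoSpan?
do {
    _ = try withDemoSpan("Doing complicated math") { span -> Int in
        captured = span
        throw MathError()
    }
} catch {
    // The error still reaches the caller after being recorded.
}

print(captured?.ended ?? false, captured?.recordedError != nil)
```

A defer can end the span, but only the catch clause gets to see the error, so the wrapping cannot be reduced to statements placed before the user code.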

This leads me to conclude that @Traced will be forced into staying a body replacement macro. But on the other hand -- it is only a "convenience" and people can drop to writing withSpan by hand relatively easily, so if more body macros were to compete over being necessary for a function -- it is not a huge problem to remove the traced macro.

Back to questions posed by the proposal though.

I lean towards the same type-checking scheme as currently proposed... so that the macro is able to introduce a span member, though I see the problems it may cause to tooling.

I wonder if @ahoppen might be able to chime in how macros relate to lsp based tools? Would sourcekit-lsp and plugins based on it, like vscode, be able to query the compiler after the file has been compiled and realize the type of span if it were introduced by a body replacement macro? If yes, that's awesome news and let's definitely stick to this. If no... then we might want to think some more but it does seem like the mode we'd like to go for with whole body replacements I guess.

It is not a blocker for tracing per se, one can always write the withSpan manually. But it is very common to provide additional attributes to a span, so it is kind of nice to allow users access to it... If we don't, then I'm thinking we'll very likely be forced someday to stuff the Span also into a task local, which would be unfortunate, as we've so far managed to not do so.

Would it still be allowed to write @Body @Preamble func henlo() {}?

Thanks again Doug for the pitch, it's looking great!

ktoso · July 31, 2023, 8:54am

Meh, I forgot to write up thoughts about the alternatives considered, so here we go:

Type checking of functions involving function body macros

So the "Capturing the withSpan pattern in another macro role" alternative is actually pretty promising -- in that it is able to compose properly. And it looks the same as good-old "aspect oriented programming" patterns so it looks quite familiar and is well understood I think, in addition to the type-checking benefits.

It does lose the ability to introduce variables though, since the user defined body must type-check before the macro gets to wrap it... So that's an unfortunate tradeoff I have mixed feelings about. I'm very curious what the tooling situation looks like with sourcekit-lsp so I hope Alex can shed some light on that.

The final concerns about this alternative were (1) language complexity, and (2) overfitting to the withSpan question.

I don't think those are justified though to be honest:

(1) in my mind language complexity is at its worst when things don't compose well, and you end up with hacks on hacks to work around the lack of composition - so the alternative actually fares better in that respect: it does compose, and it causes less complexity and fewer weird edge cases for end users to remember (vs. making it a bit harder for macro implementers -- of which there should be far fewer than macro users).
(2) the discussion in "Type checking of functions involving function body macros" already identified the fact that with... patterns are a very common thing in Swift, and withSpan is just one instance of it.

The one that does make me stop and wonder if it is a good choice or not though is the lack of ability to introduce values into the scope... which the replacement as proposed does have.

No conclusive ideas here but I hope this gives some helpful input, thanks again for pitching these kinds of macros!

dmt · July 31, 2023, 9:44am

@Douglas_Gregor That would be a very cool feature. I would like to invite you to consider my suggestion for the result builders extension as a partial alternative. This could cover the "Logged" and "Traced" cases from your examples. Of course, in the general case, you would need to provide more information in the result builder, such as the signature of the called function and its arguments.

Even the "Remote" case could be covered if we assume that a result builder is capable of building the resulting function in the absence of the original one.

Certainly, the macro approach is a much more versatile mechanism, providing more control and allowing for additional checks at compile time. However, extending the result builders API might be easier to implement in the compiler and more user-friendly for programmers.

I can write roughly what the API should look like if you're interested.

mlienert · July 31, 2023, 10:14am

OK, this is what I missed. I didn't realise that the body would need to be type-checked before every macro expansion. As macros work at the source level and don't have access to type resolution, I was under the assumption that it was done before and after all macro expansions.

allevato · July 31, 2023, 1:58pm

From the pitch:

Because the span parameter isn't known prior to macro expansion, various tools are unlikely to work well with it: for example, code completion won't know the type of span and therefore can't provide code completion for the members of it after "span.". Other tools such as Jump-To-Definition are unlikely to work without some amount of heuristic guessing.

I might be missing a subtle difference in how attached macros are type-checked vs. what's being proposed here, but how is the scenario you're describing for body macros (for example, jumping to the definition of a symbol introduced inside a macro) different from other macro-introduced names?

For example, if I write this in Xcode 15b5:

@Observable public class Foo {
  public var name: String = ""
  func foo() { print(_name) }
}

Jumping to the definition of _name puts the cursor on the @Observable macro invocation, and code completion also correctly populates the list with String members if I write _name.<something>. So SourceKit must be doing some amount of macro expansion to know that that symbol exists and what its type is.

blangmuir · July 31, 2023, 4:36pm

If I understand correctly, one difference is that the attached macro only needs to be expanded when the declarations or attributes of Foo change, whereas the body macro needs to be expanded for every change to the function body. If you think about code-completion or jump-to-definition inside a method, the attached macro can basically be expanded once and cached, while the body macro would need to re-expand on every edit inside the body.

Joe_Groff · July 31, 2023, 4:39pm

Would it be possible to make the originally-written body for body macros be opaque, so that the macro expansion can't change the types of parameters or returns in the context of the original body, and changes to the originally-written body can't affect the macro expansion (except in how the original body gets referenced in the expansion)? That way we wouldn't have to re-type-check the originally-written body after each macro expansion when multiple macros are applied, and conversely changes to the body wouldn't require immediate re-execution of the macro to update its effect on IDE functionality.
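A hand-written sketch of what such an expansion could look like for the Traced example, assuming the originally-written body is kept as a local function that the expansion only calls. DemoSpan, withDemoSpan and originalBody are hypothetical names for illustration; the pitch does not define how an expansion would refer to the original body.

```swift
// Hypothetical stand-ins; not the swift-distributed-tracing API.
struct DemoSpan { let name: String }

func withDemoSpan<T>(_ name: String, _ body: (DemoSpan) throws -> T) rethrows -> T {
    try body(DemoSpan(name: name))
}

func h(a: Int, b: Int) -> Int {
    // The body as written, type-checked once against the declared signature.
    func originalBody() -> Int {
        return a + b
    }
    // The generated part only calls the original body; it cannot look inside it
    // or change the types the body sees.
    return withDemoSpan("Doing complicated math") { _ in originalBody() }
}

print(h(a: 1, b: 2))
```

In this shape the span is never visible to originalBody, which is the tradeoff discussed elsewhere in the thread: a single type check of the written code, at the cost of not being able to introduce new names into it.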

Paul_Cantrell · July 31, 2023, 4:41pm

When proposals for differentiable programming first circulated in these forums, I raised concerns (here and here) that the proposal was too domain-specific to be a language feature. I wished instead for “a system for AST transformation that could make ‘differentiable’ a library” feature.

I wonder: is this proposal robust enough to accomplish that? It seems on the surface that it might be: HasTrailingOptionalCodeBlock appears to give full access to the structure of the function body….

This is not a question about differentiable programming so much as a question about the scope of function body macros. I’m trying to get my head around just how robust this proposal is. Might be a use case for vetting it (and for mentioning in the proposal body).

bbrk24 · July 31, 2023, 4:43pm

Would this work if the withSpan function is rethrows? How would the macro know whether to insert try?

wes1 · July 31, 2023, 6:56pm

Proposal:

the [rewrite] macro itself can completely change the body of the function

Is this assumption necessary because of the Swift syntax implementation?

I can't imagine a well-behaved function macro that changes the type signature of a target function.

I assumed even the body rewriting macro could produce only something with the same type signature. That would mean the compiler doesn't need to worry about the actual type of the target body, and would work off the declared type (and API clients would not be affected by adding function macros).

i.e., not opaque but constant-type?

So, in order of evaluation:

1. Target function declares signature. Body and any macros assumed to comply.
2. Error in body if it doesn't comply.
3. Error in macro if generated code doesn't comply.

mfilonen2 · July 31, 2023, 7:16pm

Why are function body macros even needed?
If just for decorating the function, wouldn't it be easier to create something like property wrappers, but for functions, that are able to wrap the function into another one?

wes1 · July 31, 2023, 7:19pm

Macros that inject code at the beginning of an existing function body, such as the Logged macro that adds log calls at the beginning of the function along with a defer to trigger the log at the end of the function

If/since defer{} runs at the end of scope, it almost converts the "before" advice of preambles to "around" advice (to use AspectJ terminology[1]), with possibly confusing impacts on error-handling. ("Almost" because around advice (withSpan{}) can also catch errors, but defer does not.)
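A small hand-written sketch of that point, assuming a Logged-style preamble expands to an entry log plus a defer. DemoLog and DemoError are made-up names, and the log is passed as a parameter only to keep the example self-contained.

```swift
struct DemoError: Error {}

// Collects log lines so the order of events can be inspected.
final class DemoLog {
    private(set) var lines: [String] = []
    func add(_ line: String) { lines.append(line) }
}

// Hand-written version of what a Logged-style preamble could expand to.
func g(a: Int, b: Int, log: DemoLog) throws -> Int {
    // --- preamble inserted before the body ---
    log.add("enter g")
    defer { log.add("exit g") }  // runs on return and on throw, but never sees the error

    // --- body as written ---
    if b == 0 { throw DemoError() }
    return a / b
}

let log = DemoLog()
_ = try? g(a: 4, b: 2, log: log)  // normal return
_ = try? g(a: 4, b: 0, log: log)  // throwing exit
print(log.lines)
```

Both calls produce the same enter/exit pair, so the preamble does act on exit as well as entry, yet it has no way to tell the two outcomes apart.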

As a macro writer or reader, I'd (also) like to have a restricted form of preamble which guarantees e.g., that it only runs before the target body (and not after via defer), and that the target body will run (i.e., the preamble doesn't throw an error even if the target body is permitted to).

That would make the macro much easier to reason about. I can imagine development groups hesitant about macros might permit limited use of these safe preambles as a way to gain experience.

[1] The AspectJ(TM) Programming Guide

wes1 · July 31, 2023, 7:40pm

This is indeed the hard question -- how to communicate between the body and the macro:

Traced [macro] injects a span variable that can be used in the original function body despite it only being declared by the macro expansion:

@Traced("Doing complicated math", spanName: "span")
func myMath(a: Int, b: Int) -> Int {
  span.attributes["operation"] = "addition"   // note: would not type-check by itself
  return a + b
}

This particular example could be hoisted as a parameter:

@Traced("Doing complicated math", operation: "addition")
func myMath(a: Int, b: Int) -> Int {
  return a + b
}

But that's not true in general, e.g., the span id could be lazy, and created only if not already available.

Logging can be a more compelling example, where developers add to the diagnostic context on entry, and have it automatically removed on exit, or you might want access to a transaction or geometry, like in SwiftUI.

In Python and Java, the with-form creates a context variable of a given type for that limited scope. I wonder if a similar form of context macro which creates a variable with a bounding scope would be of interest when the body needs access to the macro context.

Douglas_Gregor · July 31, 2023, 7:43pm

Attached macro expansions on a given declaration are meant to be independent, and they always see the source code as-written rather than the code as modified by other macro expansions. There is some rationale for this in the attached macros proposal, but it essentially means that you don't get to apply a macro to the result of expanding another macro. They need to remain independent.

Even if there's only a single body macro applied, type-checking both the code as originally written and also the expanded code means double the work. If these macros are used often enough (and they might be!) that's a problem for the developer experience.

Right, it forces us into a different design. The idea proposed there is that we put the "span" attributes into a closure that's passed to Traced, e.g.,

@Traced("Doing complicated math") { span in 
  span.attributes["operation"] = "addition"
}
func myMath(a: Int, b: Int) -> Int {
  return a + b
}

They could also be passed to the macro via normal (non-closure) arguments, e.g., in a dictionary. Part of me likes this better, because it puts the "tracing stuff" in one place and the function logic in another, but I know my opinions are very much colored by not wanting the macro to be able to introduce new names into a function body that tools won't know about until after expansion.

It does compose, although I'm not certain if we want it to. I'm skittish around suggesting composition of this form after my experience with property wrapper composition. Folks really, really wanted property wrappers to compose, and we made them compose, and after a ton of implementation effort it's effectively useless in practice. Personally, I'm much more motivated by the alternative's ability to have a single type check that reflects what the user wrote.

I think that any extension to result builders is going to have very narrow benefits. The point of pursuing macros is that they are versatile enough that we don't need to do surgical extensions to other features. Indeed, most of result builders can be implemented with macros.

This sounds a bit like one of the alternatives in the pitch, which I'm starting to like more.

I suspect that peer macros would be the better match for this. A Differentiable peer macro applied to a function f could create a new function f_autodiff that replaces parameter/result types as appropriate and rewrites all calls to functions g with g_autodiff. I don't know how far you can get without actual type information, though.

It could look at whether the function itself is throws or, as a more advanced implementation, check for a try not enclosed in a catch.
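A sketch of the two cases such a macro would have to distinguish, written by hand with a rethrows wrapper. DemoSpan, withDemoSpan and ParseError are hypothetical stand-ins, not the swift-distributed-tracing API.

```swift
// Hypothetical stand-ins for illustration.
struct DemoSpan { let name: String }
struct ParseError: Error {}

func withDemoSpan<T>(_ name: String, _ body: (DemoSpan) throws -> T) rethrows -> T {
    try body(DemoSpan(name: name))
}

// Non-throwing function: the closure cannot throw, so the call needs no `try`.
func add(a: Int, b: Int) -> Int {
    withDemoSpan("add") { _ in a + b }
}

// Throwing function: the closure can throw, so the call must be marked `try`.
func parse(_ text: String) throws -> Int {
    try withDemoSpan("parse") { _ in
        guard let value = Int(text) else { throw ParseError() }
        return value
    }
}

print(add(a: 1, b: 2), (try? parse("42")) ?? -1)
```

Going by the function's own throws marker covers both of these; the harder case is a non-throwing function whose body handles its own errors, where the macro would need the more advanced check.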

Function body macros cannot change the type signature.

Just like accessor macros are more general than property wrappers (while effectively being a simpler feature), function body macros are expected to be more general than a potential "function wrappers" feature.

I think this should be handled via macro documentation, which can be verified by looking at the expanded source of the macro, not embedded in the macro feature.

Doug

dmt · July 31, 2023, 7:57pm

I agree with this if we want to make it possible to seriously change the behavior of the code inside the body of the function, for example, introduce new variables that would be available to the function code, or insert new statements in the middle of the code, or change the function signature.
But all the examples now listed in the proposal come down to the withSomething(...) { body() } pattern.

wadetregaskis · July 31, 2023, 7:58pm

Tangentially, there are similar use-cases which require being able to modify compile-time constants and/or execute at module load time, e.g. a @Registered macro which inserts a tagged function into a list of handlers for a server (a list that's either an implicitly-referenced global or passed as an argument to the macro).

Python is one language where I've seen this used a lot, and it can be very elegant in practice. Python decorators are executed at parse-time to do the wrapping, so they can manipulate program state arbitrarily - and efficiently, without having to do pthread_once (or equivalents) upon every invocation of the wrapped function.

dmt · July 31, 2023, 8:00pm

I think it's important to have the following invariants:

Function signature can't be changed
Implicit entities (variables/functions/types) can't be injected
The integrity of the original function code must not be broken

Douglas_Gregor · July 31, 2023, 8:00pm

This is the reason why SE-0385 was returned for revision.

Doug