[Pitch] Declaration macros

Douglas_Gregor · January 4, 2023, 12:28am

Hey all,

SE-0382 "Expression Macros" is under review now. Expression macros are one piece of the larger vision for macros. Declaration macros are another piece, and quite possibly the most important one, because they provide the ability for macros to produce new declarations as well as augment existing declarations.

Declaration macros support a number of use cases, including:

Creating trampoline or wrapper functions, such as automatically creating a completion-handler version of an async function, e.g.,

@addCompletionHandler
func fetchAvatar(_ username: String) async -> Image? { ... }

synthesizes

func fetchAvatar(_ username: String, completionHandler: @escaping (Image?) -> Void ) {
  Task.detached {
    completionHandler(await fetchAvatar(username))
  }
}

Creating wrapper types from another type, such as forming an OptionSet from an enum containing flags.
Creating accessors for a stored property or subscript, subsuming some of the behavior of SE-0258 "Property Wrappers".
Performing a non-trivial compile-time computation to produce an efficient implementation of a function, such as creating a perfect hash function for a fixed set of strings.
Subsuming the #warning and #error directives introduced in SE-0196 into macros.
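To make the OptionSet use case in the list above concrete, here is a hand-written sketch of what such a macro might expand to. The @optionSet attribute name, the ShippingOptions type, and its flags are illustrative assumptions, not something taken from the pitch text quoted in this thread.

```swift
// What a user might write (hypothetical macro name, so it is shown as a comment):
//
//   @optionSet
//   struct ShippingOptions {
//     private enum Options: Int { case nextDay, secondDay, priority, standard }
//   }

// What the expansion could look like, written out by hand.
struct ShippingOptions: OptionSet {
  // The user-written enum: each case's raw value is used as a bit position.
  private enum Options: Int {
    case nextDay, secondDay, priority, standard
  }

  // Members a macro would have to introduce to satisfy OptionSet.
  let rawValue: Int
  init(rawValue: Int) { self.rawValue = rawValue }

  // One static member per enum case.
  static let nextDay: ShippingOptions = ShippingOptions(rawValue: 1 << Options.nextDay.rawValue)
  static let secondDay: ShippingOptions = ShippingOptions(rawValue: 1 << Options.secondDay.rawValue)
  static let priority: ShippingOptions = ShippingOptions(rawValue: 1 << Options.priority.rawValue)
  static let standard: ShippingOptions = ShippingOptions(rawValue: 1 << Options.standard.rawValue)
}

let express: ShippingOptions = [.nextDay, .secondDay]
```

The names of the introduced members (rawValue, init(rawValue:), and one static per case) are the kind of thing a macro declaration would presumably have to announce up front.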

Read the full pitch here.

Doug

mayoff · January 4, 2023, 1:12am

Can a declaration macro be attached to a case declaration in an enum? For example:

enum E {
  @myMacro
  case c
}

Also, does an ‘outer’ macro see the expansion of an ‘inner’ macro? For example:

@outerMacro
struct S {
  @innerMacro
  var i: Int
}

Suppose innerMacro adds member _i as a peer to i in S. Does outerMacro see member _i?

mayoff · January 4, 2023, 1:16am

Could declaration macros subsume the need for type wrappers?

tgoyne · January 4, 2023, 2:33am

It does appear to be strictly more powerful than type wrappers.

"The actual implementation of this macro involves a lot of syntax manipulation, so we settle for a pseudo-code definition here" is... concerning. It certainly doesn't look very easy to write macros, to the extent that they may not actually fully subsume the need for simpler versions of things which could be done with macros.

Of course, that's not specific to this pitch. If Swift is to have macros based on this general design, then I think the functionality described in this pitch is something I'd want to be able to do with them.

The "Up-front declarations of newly-introduced macro names only" mentions the compiler benefits from this, but I think it's also incredibly valuable for documentation purposes. A documentation generator could potentially automatically list the declarations a macro can define, and even without that just being able to command-click a macro and see that information without having to read through the actual definition of the macro helps mitigate one of the major problems which can emerge from macros.

Being able to fully replace the body of a function but not being able to fully replace the annotated declaration feels like a funny mismatch, but I couldn't come up with a stronger argument than that and I can see why one is more of a problem for tooling than the other.

sspringer · January 4, 2023, 8:33am

When manipulating function bodies via @myMacro, the question is if the following things might be possible:

When using a declaration macro to manipulate a function body, you might want to use a function argument of a certain type in the inserted code. So such a macro should be able to find and use such an argument, and of course the macro should then only be applicable if an according argument exists. (Same for several arguments.)
Inserted code should be able to exit the function, even in the general case of the function returning nothing or any (!) optional value (in this combination, so the inserted code could just return in the first case or return nil in the second case).

Sorry for maybe speculating a bit here. I have quite a few places where fixed structures are built for configuration purposes and I wonder in which cases macros could help to spare some of this build time. In the case of a struct with “simple” computed values (no references) a macro could certainly insert the computed values into the construction code, but could this go any further, e.g. already producing the resulting memory layout for the struct? What about dictionaries?

Alejandro_Martinez · January 4, 2023, 4:29pm

This looks quite awesome, being able to generate code like this opens so many possibilities.

One thing I'm not clear on is the "order of execution" inside the compiler. Macros seem to run after type checking, but that means that if a macro wants to add the implementation for the requirements of a protocol, the compiler needs to type check the conforming type after macro expansion. Is that how it works, or does this use case not work? Or maybe this is where declaring the names beforehand comes into play too?

Also seeing how the peers parameter is separated from .attached in @declaration, does that mean it can be used with .freestanding too? I guess the compiler will give an error message if one tries to use it with freestanding? I know this is not really "swift code" but it feels a bit weird, if this was Swift I would expect the peers to be an associated value of .attached.

@declaration(.freestanding)
@declaration(.attached(peers: [.overloaded]))

exciting pitch!

ktoso · January 5, 2023, 3:27am

Super glad we got to decl macros, I've been longing for them for a long time

Very glad that both freestanding and attached macros are supported. Especially since these macros can't modify the attached to function, I guess this might be good enough for now...

I like the macro declarations defining what peers/members they are going to introduce, and the spellings of that look fine. I can see the benefits knowing what a macro will produce for the compiler, and it's not too hard to declare what you're going to be emitting.

I'm assuming when we declare a specific member the macro is expected to emit (like rawValue), and the macro fails to emit such, that'd be a compile error -- since the macro didn't do what it claimed. And if there's any "we may be able to make some member" then it should be using .arbitrary I suppose.

I wasn't entirely sure about the @declaration(.attached, members: [.accessors]) spelling when the macro is attached to a variable decl. It is "members" specifically because we're emitting accessors, right? And those are "inside":

  @dictionaryStorage
  var name: String // { get set } are "inside", thus members:

Should a freestanding macro have "peers" specified or doesn't that really make sense... Say, if it's a #nice_description that makes a nice var description: String {} would that declare @declaration(.freestanding, peers: [.named("description")]) macro?

The showcased addCompletionHandler is very close to a practical use case I have with interoperating between existing large/complex C++ codebases with their own concurrency mechanisms (futures). Using this type of macro we'll be able to eliminate tons of boilerplate and make Swift a viable and productive citizen alongside a large existing codebase.

~~I'm a bit sad about lack of ability to modify the attached to decl (though I do see the complications it'd cause). It means we can't implement a traced function:~~

We can actually, I missed the AttachedDeclarationExpansion.functionBody in the default design, thanks for pointing that out!

@traced func hello() {} // won't work; not allowed to modify the body.

Since a traced function needs to "startSpan()" in function prologue, and "end()" when the function returns (as well as setError when the function is actually throwing). In that sense then, we can't simulate this using an in-line freestanding macro like:

// won't work
func hello() {
  #traced("span name") // can't work, no way to set error
  // it could (probably?) emit:
  // - let span = startSpan
  // - defer { span.end() } // expression though...
  throw Boom() // bad, we can't intercept this to `span.setError` here
}

With plain task-locals we could help a bit, specifically to un-pyramid-of-doom the task-locals use case explained here: Un-pyramid-of-doom the `with` style methods - #8 by ktoso. We would be able to emit the push/defer{pop} expressions a task-local needs, which is neat (using expression macros), but a traced function remains impossible with either type of macro AFAICS.

I was also thinking if we could steal some ideas from aspect-oriented-programming which was a big trend some time ago in JVM land, where an aspect would do either "around", "before" or "after" inject code into functions... and the "around" would help us here, but it would probably be the same complexity as the modifying the entire function... or perhaps it might be simpler, if we didn't give the real body of the func to the macro somehow...? Either way, this sounds like a more difficult problem, that I would love to see solved, but perhaps that's a different proposal

Douglas_Gregor · January 5, 2023, 3:36am

I think that's fine, yes, although the form of AttachedDeclarationExpansion currently means you won't be able to change anything about the case.

As in the expression-macros proposal, macro expansion is outside-in, so @outerMacro would get expanded first. It would see the unexpanded @innerMacro on var i.

No, it will not.
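The outside-in order described in these two answers can be modelled with a small toy. This is only a sketch of the ordering, not of the real compiler or of swift-syntax: Decl, ToyMacro, and expand are made-up names, and the toy macros only return peer declarations.

```swift
// Toy model of a declaration: a name, attached macro names, and members.
struct Decl {
  var name: String
  var attributes: [String] = []
  var members: [Decl] = []
}

// A toy "macro" sees the declaration it is attached to, may record what it
// observed, and returns peer declarations to place next to that declaration.
typealias ToyMacro = (_ attachedTo: Decl, _ trace: inout [String]) -> [Decl]

let toyMacros: [String: ToyMacro] = [
  "outerMacro": { decl, trace in
    let names = decl.members.map { $0.name }.joined(separator: ",")
    trace.append("outerMacro saw members: " + names)
    return []
  },
  "innerMacro": { decl, _ in
    [Decl(name: "_" + decl.name)]
  },
]

func expand(_ decl: Decl, macros: [String: ToyMacro], trace: inout [String]) -> [Decl] {
  var result = decl
  var peers: [Decl] = []
  // Outside-in: macros on this declaration run first and see its members
  // exactly as written, with any inner macro attributes still unexpanded.
  for attribute in decl.attributes {
    if let macro = macros[attribute] {
      peers += macro(decl, &trace)
    }
  }
  result.attributes = []
  // Only afterwards are the members expanded; their peers are spliced in.
  var expandedMembers: [Decl] = []
  for member in decl.members {
    expandedMembers += expand(member, macros: macros, trace: &trace)
  }
  result.members = expandedMembers
  return [result] + peers
}

let s = Decl(name: "S", attributes: ["outerMacro"],
             members: [Decl(name: "i", attributes: ["innerMacro"])])
var trace: [String] = []
let expanded = expand(s, macros: toyMacros, trace: &trace)
```

In this model outerMacro records only i, while the final struct contains both i and _i, which matches the answer that the outer macro does not see the member introduced by the inner one.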

Maybe? Right now, we don't have an easy way for a macro applied to a type to add accessors to its members, but this seems like something that could fit into the model. @hborla has been thinking about this a bit and might want to weigh in.

You're absolutely right. I should incorporate this additional argument into the proposal text, thanks!

Yeah, the ability to replace a function body does feel a bit different than the rest of the proposal. It's a really useful capability, but perhaps it should be split out into a separate proposal. Perhaps we should see how many of our interesting use cases depend on it.

Sure, the macro could look into the function declaration's parameter list to find the parameter, if it's there, and use it. However, it might be hard to precisely determine that the function parameter has the precise type you want, without an extension to the macro expansion context that provides type information.

Inserted code will be type-checked in the same way that the originally-written code is type-checked, so you can return from a Void function, or return nil from an optional-returning function, and it's fine.
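As a rough illustration of that point, here is what two already-expanded functions might look like, written by hand. The function names and the isEnabled parameter are assumptions for the sketch; the only claim is that the inserted guard lines type-check like any hand-written code would.

```swift
// Void function: the inserted code exits with a plain `return`.
func record(_ value: Int, into sink: inout [Int], isEnabled: Bool) {
  guard isEnabled else { return }      // line a macro might insert
  sink.append(value)                   // original body
}

// Optional-returning function: the inserted code exits with `return nil`.
func parse(_ text: String, isEnabled: Bool) -> Int? {
  guard isEnabled else { return nil }  // line a macro might insert
  return Int(text)                     // original body
}

var sink: [Int] = []
record(1, into: &sink, isEnabled: false)
record(2, into: &sink, isEnabled: true)
let parsed = parse("7", isEnabled: true)
let skipped = parse("7", isEnabled: false)
```

A purely syntactic macro would presumably pick between the two forms by checking whether the function has a return clause and whether the written return type is spelled as an optional.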

This seems more like constant initialization than macros. Perhaps we could find a way to do this with macros if they were able to add binary blobs to the resulting object file, which is something we could permit by adding more API to MacroExpansionContext. This could also be useful for implementing something like the C23 or Rust #embed.

Right, this area is dicey, because macro expansion will occur as part of type checking. This is much of the (implementation) reason why declaration macros can only have specific, prescribed effects, and we declare them in the macro itself, so we can make sure that expanding a macro cannot invalidate a previous result.

Yes. Effectively, .freestanding can only have peers, since there is no way for it to have members.

We could do that. I'm not at all... attached... to the design of these attribute arguments.

I was thinking it would be okay to overpromise in a macro declaration. In the option-set example, we shouldn't require the macro to define a rawValue if the user already wrote one, but that's the only reason I'd thought of for allowing a macro not to introduce a declaration that it said it might.

Yeah, this feels a little sketchy. We could have accessors be a separate thing entirely, which might feel cleaner.

Yes, I should add an example here, because @Alejandro_Martinez had a similar question.

ktoso:

I'm a bit sad about lack of ability to modify the attached to decl (though I do see the complications it'd cause). It means we can't implement a traced function:
@traced func hello() {} // won't work; not allowed to modify the body.

Hmm. I was intending AttachedDeclarationExpansion.functionBody to be used for this purpose. Is that not enough, if it can replace the body?

Doug

ktoso · January 5, 2023, 3:45am

Ah, totally missed that, yes that's totally enough! Worth adding a small section with an example, I'll send in a PR
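For reference, a body-replacing @traced expansion could look roughly like the following hand-written sketch. Tracer, Span, and their methods are hypothetical stand-ins for a real tracing API, and the tracer is passed as a parameter only to keep the example self-contained.

```swift
// Hypothetical minimal tracing API that records events in order.
final class Tracer {
  private(set) var events: [String] = []

  func startSpan(_ name: String) -> Span {
    events.append("start " + name)
    return Span(name: name, tracer: self)
  }

  fileprivate func record(_ event: String) {
    events.append(event)
  }
}

struct Span {
  let name: String
  let tracer: Tracer
  func setError(_ error: Error) { tracer.record("error " + name) }
  func end() { tracer.record("end " + name) }
}

struct Boom: Error {}

// Original, as the user would write it:
//   @traced func hello(shouldFail: Bool, tracer: Tracer) throws -> String {
//     if shouldFail { throw Boom() }
//     return "hello"
//   }
//
// Possible replacement body: prologue, epilogue, and error interception.
func hello(shouldFail: Bool, tracer: Tracer) throws -> String {
  let span = tracer.startSpan("hello")
  defer { span.end() }                 // runs on both return and throw
  do {
    // original body, moved inside the do block
    if shouldFail { throw Boom() }
    return "hello"
  } catch {
    span.setError(error)               // intercept the error, then rethrow
    throw error
  }
}

let okTracer = Tracer()
_ = try? hello(shouldFail: false, tracer: okTracer)
let failingTracer = Tracer()
_ = try? hello(shouldFail: true, tracer: failingTracer)
```

Wrapping the original body in do/catch is what the freestanding #traced form cannot do, since it can only add statements at the point where it is written.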

hborla · January 5, 2023, 5:00am

I think that 2 enhancements to declaration macros (either in this proposal, or as follow-on proposals) would enable type wrappers to be expressed entirely as declaration macros.

The ability for declaration macros attached to types to attach declaration macros to properties inside that type + recursive macro expansion.

This is necessary because type wrappers turn all stored properties into computed properties that indirect access through a synthesized _storage property. This pitch already supports adding a new member called _storage using members: [.named("_storage")].

This pitch also already supports adding accessors by attaching a declaration macro to a stored property, so I think the cleanest way to allow a type declaration macro to turn stored properties into computed properties is to allow the macro to apply property declaration macros. Maybe a declaration macro could specify that it can annotate existing members with attributes:

@declaration(.attached, .annotatesMembers, addsMembers: [.named("_storage")]) macro makeAllPropertiesComputed

Now imagine I have another declaration macro that adds accessors to a single stored property:

@declaration(.attached, members: [.accessors]) macro makeComputed

The makeAllPropertiesComputed macro could simply attach @makeComputed to all stored properties:

@makeAllPropertiesComputed
struct S {
  var x: Int

  var y: Int
}

// expanded to

struct S {
  private var _storage: SomeWrapperType

  @makeComputed
  var x: Int

  @makeComputed
  var y: Int
}

When the @makeComputed macros are expanded, that will add the get and set accessors that indirect access to _storage.
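A hand-written sketch of where that second round of expansion could end up is shown below. SomeWrapperType is given a made-up dictionary-backed definition here purely so the example compiles; the real wrapper type and its storage layout are not specified in this thread.

```swift
// Stand-in for the wrapper's storage; the dictionary layout is an assumption.
struct SomeWrapperType {
  var values: [String: Int] = [:]
}

// Fully expanded form of S after @makeComputed has run on each property.
struct S {
  private var _storage = SomeWrapperType()

  init() {}

  var x: Int {
    get { _storage.values["x", default: 0] }
    set { _storage.values["x"] = newValue }
  }

  var y: Int {
    get { _storage.values["y", default: 0] }
    set { _storage.values["y"] = newValue }
  }
}

var wrapped = S()
wrapped.x = 3
wrapped.y = wrapped.x + 1
```

Callers still read and write x and y as ordinary properties; only the storage behind them has moved.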

I believe opting out through @typeWrapperIgnored can also be implemented as a macro that does nothing in its own expansion, but it instructs the type wrapper macro to leave that particular stored property alone. This is actually pretty neat, because it allows type wrapper macros to choose whether to support opting out. If the macro does choose to support opting out, the library author can pick a domain-specific attribute name instead of @typeWrapperIgnored, which nobody is very fond of.

A hook into definite initialization that allows declaration macros to customize initialization of stored properties.

Property wrappers and type wrappers both have special logic in definite initialization that rewrites assignment to a wrapped property to either an initialization of the backing wrapper type or a setter call depending on whether all of self is initialized. For example:

@propertyWrapper 
struct Wrapper<T> {
  var wrappedValue: T
  init(wrappedValue: T) { ... }
}

struct Me {
  @Wrapper var name: String
  var age: Int

  init() {
    self.name = "Holly" // re-written to self._name = Wrapper(wrappedValue: "Holly")
    self.age = 27

    // all of 'self' is initialized now

    self.name = "Holly Annesa" // this is a setter call through the computed 'name' property
  } 
}

The compiler calls these possibly-rewritten assignments "assign-by-wrapper". I think this operation might be generally useful for declaration macros.
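The difference between the two rewritten forms can be observed with today's property wrappers. In this sketch the wrapper counts setter calls and exposes the count through projectedValue; that counting is an addition for illustration and is not part of the example above.

```swift
@propertyWrapper
struct Wrapper<T> {
  private var value: T
  private(set) var setterCalls = 0

  var wrappedValue: T {
    get { value }
    set {
      value = newValue
      setterCalls += 1   // only a real setter call reaches this line
    }
  }

  // Exposes the number of setter calls as `$name`.
  var projectedValue: Int { setterCalls }

  init(wrappedValue: T) {
    value = wrappedValue
  }
}

struct Me {
  @Wrapper var name: String
  var age: Int

  init() {
    self.name = "Holly"         // rewritten to initialize the backing wrapper
    self.age = 27
    self.name = "Holly Annesa"  // a setter call through the computed property
  }
}

let me = Me()
```

Only the second assignment goes through the setter, so the counter ends at one.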

My first idea for supporting this was to introduce a builtin expression macro that hooks into assign-by-wrapper. #assignByWrapper would take 3 arguments: a key-path, a value, and an initialization expression. A type wrapper macro could then re-write an initializer to change all property assignments to #assignByWrapper and provide a custom wrapper initialization. However, this doesn't really work for property-wrapper-like macros, because they cannot reach into an initializer and re-write assignments. Similarly, this doesn't work for property-wrapper-like macros attached to local variables.

My second thought is that customizing initialization could be a fundamental capability of declaration macros that are attached to properties. Such macros could opt into custom initialization and provide the re-written initialization expression through AttachedDeclarationExpansion:

public struct AttachedDeclarationExpansion {
  /// The set of peer declarations introduced by this macro, which will be introduced alongside the use of the
  /// macro.
  public var peers: [DeclSyntax] = []
  
  /// The set of member declarations introduced by this macro, which are nested inside 
  public var members: [DeclSyntax] = []
  
  /// For a function, body for the function. If non-nil, this will replace any existing function body.
  public var functionBody: CodeBlockSyntax? = nil

  /// For a property, an expression for initializing that property given a value of the property type. 
  /// If non-nil, definite initialization will use assign-by-wrapper and re-write to initialization using
  /// this expression.
  public func initialization(from value: ExprSyntax) -> ExprSyntax?
  
  public init(peers: [DeclSyntax] = [], members: [DeclSyntax] = [], functionBody: CodeBlockSyntax? = nil, initialization: ExprSyntax? = nil)
}

Aside from those two things, I think everything else in the current type wrapper pitch (and more!) is already covered by this design for declaration macros. Very cool!

dlbuckley · January 5, 2023, 9:11am

Can I clarify one thing regarding the dictionary storage example; it shows there are 2 declared macros at one point, one with a key and one without.

@declaration(.attached, members: [.accessors]) macro dictionaryStorage
@declaration(.attached, members: [.accessors]) macro dictionaryStorage(key: String)

Is this illustrating that both need to be declared to support what appears to be an optional key? If that is the case can we not simply make the key optional and give it a default value of nil just like we can with standard functions so the key can be omitted if not required?

@declaration(.attached, members: [.accessors]) macro dictionaryStorage(key: String? = nil)

This feels a bit nicer instead of having to redeclare the macro twice and will allow greater flexibility for which arguments may need to be passed in to a macro or omitted.

I may have misunderstood, but if that is the case then maybe this can be cleared up in the pitch.

stevex · January 5, 2023, 12:05pm

Sorry if this question is answered in the proposals but I'm having some trouble seeing the forest for the trees, and it's probably because I haven't spent enough time with them.

My main interest for macros and reflection is declarative REST API generation. Being able to produce an OpenAPI specification at runtime from the implementation using reflection, for example.

A short example from Java:

@Path("/hello")
public class GreetingResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        return "Hello from RESTEasy Reactive";
    }
}

This is more about reflection than macros, but there's probably some interaction there where the macro has to leave something there for reflection to find and act on so at runtime the API routing can be built from what reflection discovers.

Once both reflection and macros are implemented, will this be possible?

s-k · January 5, 2023, 1:15pm

I think the reason to declare the macro twice is to allow it to be used without parentheses, as in

@dictionaryStorage var name: String

With an optional key, it would have to be written as

@dictionaryStorage() var name: String

s-k · January 5, 2023, 1:30pm

I have a question. Would a function body with an attached macro be type-checked before the macro expansion? An example would be generating API code via macro:

@API(.GET, endpoint: "https://example.com/users")
func getUsers() async throws -> [User] {}

// would expand to:

func getUsers() async throws -> [User] {
    // Fetch and decode users
    ...

    return users
}

In this case, would not having a return statement in the original code be an error?

And thinking about it, it would even be nicer if one could leave out the {} when declaring the function:

@API(.GET, endpoint: "https://example.com/users")
func getUsers() async throws -> [User]

Alejandro_Martinez · January 5, 2023, 5:01pm

Thanks for the answers Doug!

Sorry, I’m a bit slow today, but does that mean we won’t be able to use macros to implement protocol conformances?

Nice

Douglas_Gregor · January 5, 2023, 5:43pm

That's a great question. I think all of the arguments in favor of type checking the macro arguments before expanding the macro apply to the function body just as well: better diagnostics, macros only get well-formed inputs, easier to reason about what macros do, etc.

However, your question and @Alejandro_Martinez 's question about order-of-operation bring up a really important point. Right now, the compiler will only type-check the body of the function when it needs the body for something, i.e., to generate code for the function. We don't want to pay the cost of type checking the function body if we're only doing so to expand a macro for its peer declarations.

This is yet more evidence for...

i.e., we should separate out the ability to add or replace a function body from the ability to add peer declarations, attributes, etc., because they run at different conceptual times: we need peer declarations to do things like name lookup, we need attributes to understand the full signature of a type for type checking, and we need the function body to generate code. This might even mean that "peer" and "member" macro implementation entry points should be separate in the design.

Right, it should be fine for a function-body-producing macro to provide a function body for a function that doesn't have one.

I believe this will be achievable by having a macro generate custom metadata attributes, as pitched elsewhere.

If we can't implement at least some protocol conformances with macros, we've gotten macros wrong. The question is how best to do it. With this pitch, you could create an attached macro that you place on the type itself, and which generates member declarations that correspond to the requirements of the protocol. That macro can use syntactic information from the type, e.g., it can walk through and find the declared properties. However, it can't reason about (e.g.) the effects of macros or property wrappers on the declared properties, so it's going to have rough edges. Perhaps that's okay, or perhaps it means we need a different model for things that want access to stored properties (protocol conformances, memberwise initializers, etc.).
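A toy version of that purely syntactic approach, using plain strings instead of swift-syntax nodes, might look like this. StoredProperty, equalityExpression, and synthesizeEquatable are invented names, and the sketch assumes the list of declared properties has already been collected from the type's syntax.

```swift
// The only information available: names of properties as written in source.
struct StoredProperty {
  let name: String
}

// Builds the expression that compares two values property by property.
func equalityExpression(for properties: [StoredProperty]) -> String {
  if properties.isEmpty { return "true" }
  return properties
    .map { "lhs.\($0.name) == rhs.\($0.name)" }
    .joined(separator: " && ")
}

// Produces source text for a conformance, as a macro would produce syntax.
func synthesizeEquatable(typeName: String, properties: [StoredProperty]) -> String {
  let body = equalityExpression(for: properties)
  return """
    extension \(typeName): Equatable {
      static func == (lhs: \(typeName), rhs: \(typeName)) -> Bool {
        \(body)
      }
    }
    """
}

let pointProperties = [StoredProperty(name: "x"), StoredProperty(name: "y")]
let expression = equalityExpression(for: pointProperties)
print(synthesizeEquatable(typeName: "Point", properties: pointProperties))
```

The rough edge mentioned above shows up directly: nothing in this input says whether x is really stored, wrapped by a property wrapper, or rewritten by another macro.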

Doug

allevato · January 5, 2023, 6:19pm

The first feature that came to mind as a use case for declaration macros was "newtype" with automatic protocol forwarding. As a quick sketch, something like:

@newType(basedOn: Int)
public struct SomeIndex: Comparable, Hashable {}

would synthesize something like this:

public struct SomeIndex: Comparable, Hashable {
  private var rawValue: Int

  public init(rawValue: Int) { self.rawValue = rawValue }

  public static func == (lhs: SomeIndex, rhs: SomeIndex) -> Bool { lhs.rawValue == rhs.rawValue }
  public static func < (lhs: SomeIndex, rhs: SomeIndex) -> Bool { lhs.rawValue < rhs.rawValue }
  public func hash(into hasher: inout Hasher) { rawValue.hash(into: &hasher) }
}

The big blocker here is that AFAICT we don't have the capability in these macros yet to examine the members of the protocols we want to conform to; we only have the macro's syntactic context. So I can know that I want to do something with "things" named Comparable and Hashable, but nothing else about them (I don't know what module they came from, whether they're protocols or something else, etc.).

As a general feature, I really like where this is going, and I think that there are certain examples (like those highlighted in the proposal) where purely syntactic introspection can be powerful enough for what users need, but I wonder how many times users would hit a wall where we need some semantic data. Do you have any more insight here? (If you've got another pitch in your back pocket, just tell me to wait.)

s-k · January 6, 2023, 7:11am

While I fear that this may delay the ability to work with functions, I think this is a good decision if we get a nicer API this way.

I feel that it could be valuable to have a general idea how type information would be supplied to the macro. It may have consequences for the design of the basic feature (even if type information is added in a future proposal). While thinking about and implementing different possible macros, I have repeatedly seen the need to have type info. I think, there are (at least) two distinct needs:

Get the type of a (sub-)expression supplied to the macro. This can especially be useful for expression macros, but also for macros modifying function bodies. And since the compiler has already type-checked the code, this info should be readily available. A macro implementing something like function builders would definitely need this. It may be possible to somehow do some hacks involving type inference to make it work, but it wouldn't be nice. ( SE-0380 (if and switch expressions) could help, but I am not sure.)

I imagine this API such that the macro can supply a sub-expression of the expression passed to the macro to the MacroExpansionContext and get the type of the expression back (at least its name, possibly more).
Get information about any given type or protocol (conformances, protocol requirements, members, etc.). I see this as generally valuable for all sorts of macros. This would be some sort of reflection API. It would be very nice if it could mirror the runtime reflection API as mentioned already in that thread. However, the currently pitched reflection API would not be enough because it does not supply conformances, computed properties and methods.

dlbuckley · January 6, 2023, 8:40am

I guess that makes sense but it's still a bit of a rough edge. I wonder if there is an opportunity to smooth it out at this stage as I can imagine as this feature gains wider adoption it will become a common use case.

Douglas_Gregor · January 7, 2023, 7:19am

Challenge accepted, I guess? I went ahead and did the full implementation of AddCompletionHandler, as my test case for the pull request that starts implementing peer declaration macros at a syntactic level.

It's about a hundred lines of syntax manipulation. There are some obvious little utility functions we could build to make this easier (e.g., "build a forwarding call to this function"), and we could drop all of the trivia manipulation by having swift-format clean up after you at the end, but for the most part it's straightforward: form the completion-handler parameter, drop the result type, drop the attribute, form the call, etc. Iteration on this kind of syntax macro development is fast, because you're plucking the bits you care about from the existing syntax tree and interpolating them into strings to make more syntax nodes. The new parser catches any mistakes quickly and gives nice diagnostics.

Full implementation of `AddCompletionHandler`

public struct AddCompletionHandler: PeerDeclarationMacro {
   public static func expansion(
     of node: CustomAttributeSyntax,
     attachedTo declaration: DeclSyntax,
     in context: inout MacroExpansionContext
   ) throws -> [DeclSyntax] {
     // Only on functions at the moment. We could handle initializers as well
     // with a bit of work.
     guard let funcDecl = declaration.as(FunctionDeclSyntax.self) else {
       throw CustomError.message("@addCompletionHandler only works on functions")
     }

     // This only makes sense for async functions.
     if funcDecl.signature.asyncOrReasyncKeyword == nil {
       throw CustomError.message(
         "@addCompletionHandler requires an async function"
       )
     }

     // Form the completion handler parameter.
     let resultType: TypeSyntax? = funcDecl.signature.output?.returnType.withoutTrivia()

     let completionHandlerParam =
       FunctionParameterSyntax(
         firstName: .identifier("completionHandler"),
         colon: .colonToken(trailingTrivia: .space),
         type: "(\(resultType ?? "")) -> Void" as TypeSyntax
       )

     // Add the completion handler parameter to the parameter list.
     let parameterList = funcDecl.signature.input.parameterList
     let newParameterList: FunctionParameterListSyntax
     if let lastParam = parameterList.last {
       // We need to add a trailing comma to the preceding list.
       newParameterList = parameterList.removingLast()
         .appending(
           lastParam.withTrailingComma(
             .commaToken(trailingTrivia: .space)
           )
         )
         .appending(completionHandlerParam)
     } else {
       newParameterList = parameterList.appending(completionHandlerParam)
     }

     let callArguments: [String] = try parameterList.map { param in
       guard let argName = param.secondName ?? param.firstName else {
         throw CustomError.message(
           "@addCompletionHandler argument must have a name"
         )
       }

       if let paramName = param.firstName, paramName.text != "_" {
         return "\(paramName.withoutTrivia()): \(argName.withoutTrivia())"
       }

       return "\(argName.withoutTrivia())"
     }

     let call: ExprSyntax =
       "\(funcDecl.identifier)(\(raw: callArguments.joined(separator: ", ")))"

     // FIXME: We should make CodeBlockSyntax ExpressibleByStringInterpolation,
     // so that the full body could go here.
     let newBody: ExprSyntax =
       """

         Task {
           completionHandler(await \(call))
         }

       """

     // Drop the @addCompletionHandler attribute from the new declaration.
     let newAttributeList = AttributeListSyntax(
       funcDecl.attributes?.filter {
         guard case let .customAttribute(customAttr) = $0 else {
           return true
         }

         return customAttr != node
       } ?? []
     )

     let newFunc =
       funcDecl
       .withSignature(
         funcDecl.signature
           .withAsyncOrReasyncKeyword(nil)  // drop async
           .withOutput(nil)  // drop result type
           .withInput(  // add completion handler parameter
             funcDecl.signature.input.withParameterList(newParameterList)
               .withoutTrailingTrivia()
           )
       )
       .withBody(
         CodeBlockSyntax(
           leftBrace: .leftBraceToken(leadingTrivia: .space),
           statements: CodeBlockItemListSyntax(
             [CodeBlockItemSyntax(item: .expr(newBody))]
           ),
           rightBrace: .rightBraceToken(leadingTrivia: .newline)
         )
       )
       .withAttributes(newAttributeList)
       .withLeadingTrivia(.newlines(2))

     return [DeclSyntax(newFunc)]
   }
 }
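The argument-forwarding step in the middle of that implementation can be tried in isolation. The sketch below mirrors the callArguments logic with plain strings; ToyParameter, MissingNameError, and forwardingArguments are stand-ins invented here and do not use the swift-syntax types.

```swift
// Stand-in for a function parameter: `firstName` is the argument label
// (possibly "_"), `secondName` is the internal name if one was written.
struct ToyParameter {
  var firstName: String?
  var secondName: String?
}

struct MissingNameError: Error {}

// Builds the arguments for a call that forwards every parameter unchanged.
func forwardingArguments(for parameters: [ToyParameter]) throws -> [String] {
  try parameters.map { param -> String in
    // The value to pass is the internal name, falling back to the label.
    guard let argName = param.secondName ?? param.firstName else {
      throw MissingNameError()
    }
    // A real label (anything other than "_") must be repeated at the call site.
    if let label = param.firstName, label != "_" {
      return "\(label): \(argName)"
    }
    return argName
  }
}

// Models: func fetchAvatar(_ username: String, from server: String, size: Int)
let arguments = try! forwardingArguments(for: [
  ToyParameter(firstName: "_", secondName: "username"),
  ToyParameter(firstName: "from", secondName: "server"),
  ToyParameter(firstName: "size", secondName: nil),
])
let call = "fetchAvatar(" + arguments.joined(separator: ", ") + ")"
```

This is the kind of "build a forwarding call" utility mentioned above that could be factored out of individual macro implementations.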

Doug