A Possible Vision for Macros in Swift

gwendal.roue · October 18, 2022, 12:43pm

The last time I was hoping for macros, I was reading the documentation for SwiftUI EnvironmentValues. Here is a snippet of this documentation:

Create custom environment values by defining a type that conforms to the EnvironmentKey protocol, and then extending the environment values structure with a new property. Use your key to get and set the value, and provide a dedicated modifier for clients to use when setting the value:
private struct MyEnvironmentKey: EnvironmentKey {
    static let defaultValue: String = "Default value"
}

extension EnvironmentValues {
    var myCustomValue: String {
        get { self[MyEnvironmentKey.self] }
        set { self[MyEnvironmentKey.self] = newValue }
    }
}

extension View {
    func myCustomValue(_ myCustomValue: String) -> some View {
        environment(\.myCustomValue, myCustomValue)
    }
}

The above code snippet is what every developer is supposed to write in their application in order to use custom environment keys in SwiftUI.

This code has no intrinsic problem. Yet it is highly repetitive and contains very little information that relates to the user's actual needs.

What the user needs is:

myCustomValue: the identifier that makes it possible to set and read the environment key. The user wants to be able to write:
```
@Environment(\.myCustomValue) var value: String
myView.environment(\.myCustomValue, "Custom value")
```
String: the type of the environment value.
"Default value": the default value for the environment key
(Optional) myCustomValue: a convenience View extension method, so that the user can write:
```
myView.myCustomValue(...)
```

And yet, the user has to define a MyEnvironmentKey type that is never used directly (with the added difficulty of finding a unique name for this unused type), and has to write extensions that follow a fixed pattern. In this boilerplate, the relative amount of useful information, considering the use case, is very low.

To me, this sounds like one of the interesting use cases for a macro system.
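
To make the "very little information" point concrete, here is a minimal sketch that renders the whole boilerplate from just the three inputs listed above. The function `environmentBoilerplate` and the `__Key_` prefix are hypothetical names invented for this illustration; this is plain string templating, not a macro API from the vision document.

```swift
// Hypothetical helper (an assumption for illustration, not a real macro API):
// renders the SwiftUI boilerplate from the three pieces of information
// the user actually cares about.
func environmentBoilerplate(name: String, type: String, defaultValue: String) -> String {
    // Derive the otherwise-unused key type name from the property name.
    let keyName = "__Key_" + name
    return """
    private struct \(keyName): EnvironmentKey {
        static let defaultValue: \(type) = \(defaultValue)
    }

    extension EnvironmentValues {
        var \(name): \(type) {
            get { self[\(keyName).self] }
            set { self[\(keyName).self] = newValue }
        }
    }

    extension View {
        func \(name)(_ \(name): \(type)) -> some View {
            environment(\\.\(name), \(name))
        }
    }
    """
}

let source = environmentBoilerplate(
    name: "myCustomValue",
    type: "String",
    defaultValue: "\"Default value\""
)
print(source)
```

Everything in the output other than the three arguments is fixed text, which is the kind of pattern a macro could plausibly take over.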

Jon_Shier · October 18, 2022, 1:15pm

This is exactly the sort of API I hope we don't see with a macro system. I'd rather see the language enhanced to express type safety without the need for macros in the first place. That is not "expressible" code, it's inscrutable and hides additional magic, making it difficult to explore and debug.

stuchlej · October 18, 2022, 1:30pm

I'm confused. I can see how macros can be useful in a lot of ways.

At the same time, Swift is getting increasingly complicated with each release (imo), which is not necessarily a bad thing, but I think we're getting to the point where the language is getting hard to read and is leaving this behind:

Expressive. Swift benefits from decades of advancement in computer science to offer syntax that is a joy to use, with modern features developers expect.

Regular language additions like async/await or take/borrow may introduce additional complexity, but I think that macros are on a next level.

I'm afraid, that reading the code (especially unknown codebase) and debugging the code will get much harder. I also wonder, would I need to use something like swiftc -E to inspect that macros expand correctly?

taylorswift · October 18, 2022, 2:15pm

functionDecl.parameters.map { $0.description }.joined(separator: ", ") is not an AST manipulation, it is just gyb reincarnated and made slightly noisier.

davdroman · October 18, 2022, 2:29pm

Something that often comes to mind (from SE-0299):

public protocol ToggleStyle {
  associatedtype Body: View
  func makeBody(configuration: Configuration) -> Body
}

#staticMember(ToggleStyle, `default`)
public struct DefaultToggleStyle: ToggleStyle { ... }

#staticMember(ToggleStyle, `switch`)
public struct SwitchToggleStyle: ToggleStyle { ... }

#staticMember(ToggleStyle, checkbox)
public struct CheckboxToggleStyle: ToggleStyle { ... }

// synthesized:

extension ToggleStyle where Self == DefaultToggleStyle {
  public static var `default`: Self { .init() }
}

extension ToggleStyle where Self == SwitchToggleStyle {
  public static var `switch`: Self { .init() }
}

extension ToggleStyle where Self == CheckboxToggleStyle {
  public static var checkbox: Self { .init() }
}

filip-sakel · October 18, 2022, 4:11pm

This code saves only one line while adding the complexity of a macro. Are there really so many toggle, button, etc. styles that this is required? Additionally, macros will come with a compile-time cost, so I think this is an example of what macros shouldn't be used for. If there is a need to express this pattern more concisely we could consider something like public static let `default` = Self(), but I don't think any further syntactic sugar will be helpful.

davdroman · October 18, 2022, 4:15pm

Macro usefulness is mostly contextual and somewhat subjective. If you're Apple and maintain SwiftUI with dozens of these kinds of static members, then yes, it might be useful to have a macro like this internally to alleviate code repetition.

willft · October 18, 2022, 6:29pm

Top line:
This is awesome. My experience with macros comes mainly from two places: C-family (gross, as everyone seems to agree) and Lisp-family macros (excellent, IMO). I’m a big fan of macros in general and I really like this proposal.

Love: custom diagnostics; balance between AST-based manipulation and quoted code (“just spell the expanded code like swift code in a string literal.”)
Missing: gensym; function declaration macros. Also I think we should be able to pass code blocks into a macro which can have elements substituted by the macro.
It would be really nice to be able to view the expanded version of a macro. I would specifically recommend 3 tooling features: (1) command line option to print a file in its fully expanded form; (2) sourcekit refactor to replace the usage of a macro with its expansion; (3) sourcekit option to view the expanded version of a macro temporarily (like quick help or synthesized interface files or inline type hints)
(I’ll explain each complicated one in detail)

Questions:

Do macros really need to be in a separate package or can they be in a separate module in the same package? Seems like a separate module would be enough & that would be very helpful if you’re thinking of writing a library/framework which is closely intertwined with a macro.
What does this imply about the stability of SwiftSyntax? Will macro authors have to pin their macro libraries to specific versions of the Swift compiler, or will they be able to have one version of their library work with multiple versions of the Swift compiler?

—

Details:

Custom diagnostics are crucial to making usable DSLs. Macros are almost always making DSLs of some sort.
The ability to spell a new AST like you would spell Swift code (i.e. you can say "let \(myVar) = \(myThing)" instead of e.g. LetDecl(Identifier(myVar), myThing)) — this reminds me of quoting in lisp macros. It’s not necessary to write good macros, but boy does it make it more approachable.
Function declaration macros, as mentioned upthread.
Code blocks in macros. This is kind of hard to explain, so I’ll give an example: Imagine a macro which allows you to write an expression once for every member in a type:

#forEachMember(in: Self) { x in self.x.hash(into: hasher) }

I think we need a way to spell that x thing. Maybe it takes the form of #forEachMember(in: Self) { prop in #prop(self).hash(into: hasher) } or something.

Gensym: I’m thinking of something like “gensym” in Common Lisp or Clojure. As far as I can tell there’s no way to ask “please give me a fresh variable name that isn’t used anywhere else.” To make temporary variable declarations in a macro expansion, it’s very important to be able to generate variable names that cannot be used elsewhere. For example imagine a (slightly problematic but let’s not worry about that) #temporarySet macro which sets value of a property and then unsets it in a defer. You’d want #temporarySet(myObject.x, 1) to expand to this:

let temp = myObject.x
myObject.x = 1
defer { myObject.x = temp }

But what if temp is used in the scope where the macro is used:

func stuff() {
    let temp = "foobar"
    #temporarySet(myObject.x, 1)
    myObject.doThing(temp)
}

This would cause a compile error because temp is redefined. It would be much better if the macro could pick an otherwise unusable name for its temp variable. I quite like the ability of Clojure’s gensym to add a prefix, since I think it makes an expanded macro more readable.
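
A minimal sketch of the idea, assuming the macro implementation were somehow handed the set of names already visible at the expansion site. Nothing in the proposal says it would be; `NameGenerator`, `fresh(prefix:avoiding:)` and the `usedNames` input are all hypothetical.

```swift
// Hypothetical gensym-style helper; not part of any proposed macro API.
struct NameGenerator {
    var counter = 0

    /// Returns a name of the form "<prefix>_<n>" that is not in `usedNames`.
    mutating func fresh(prefix: String, avoiding usedNames: Set<String>) -> String {
        while true {
            counter += 1
            let candidate = "\(prefix)_\(counter)"
            if !usedNames.contains(candidate) {
                return candidate
            }
        }
    }
}

var generator = NameGenerator()
// Assumption: these are the names already in scope where the macro is used.
let tempName = generator.fresh(prefix: "temp", avoiding: ["temp", "temp_1"])

// The expansion of #temporarySet(myObject.x, 1) written with the fresh name.
let expansion = """
let \(tempName) = myObject.x
myObject.x = 1
defer { myObject.x = \(tempName) }
"""
print(expansion)
```

Keeping the prefix in the generated name is what keeps the expanded code readable. A real gensym would more likely produce a name that cannot be spelled in source at all, rather than checking against a list.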

willft · October 18, 2022, 6:43pm

And now some motivating use cases:

I think a use case that’s sort of missing here is that macros can be reusable, not just scoped to exactly one problem (e.g. not one macro for equatable, one for hashable, etc.). So for example it would be great if I could implement a macro like this:

struct Example: Hashable, Encodable {
    var a: Int
    var b: String

    func hash(into hasher: inout Hasher) {
        #forEachMember(in: Self) { x in self.x.hash(into: &hasher) }
        // Expands to a.hash(into: &hasher); b.hash(into: &hasher)
    }

    static func ==(lhs: Self, rhs: Self) -> Bool {
        #forEachMember(in: Self) { x in lhs.x == rhs.x } combine: { a, b in a && b }
        // Expands to lhs.a == rhs.a && lhs.b == rhs.b
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        #forEachMember(in: Self) { x in try container.encode(self.x, forKey: CodingKeys.x) }
        // Equivalent of try container.encode(a, forKey: .a); try container.encode(b, forKey: .b)
    }

    enum CodingKeys: String, CodingKey {
        #forEachMember(in: Example) { x in case x }
        // Equivalent of case a, case b
    }
}

(this is the proposed syntax from my post above; alternately something like #forEachMember(in: Self, named: x, self.x.hash(into: hasher)))
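
For comparison, the closest thing available today is a runtime approximation with Mirror. The sketch below is not the proposed macro: it walks the stored properties at run time instead of expanding code at compile time, and the helper names (`memberNames`, `hashEachMember`, `combinedHash`) are made up for this example.

```swift
struct Example {
    var a: Int
    var b: String
}

/// Labels of the stored properties, in declaration order.
func memberNames<T>(of value: T) -> [String] {
    Mirror(reflecting: value).children.compactMap { $0.label }
}

/// Feeds every stored property that is Hashable into the hasher.
/// Properties that are not Hashable are silently skipped, which a
/// compile-time macro could instead report as an error.
func hashEachMember<T>(of value: T, into hasher: inout Hasher) {
    for child in Mirror(reflecting: value).children {
        if let member = child.value as? AnyHashable {
            member.hash(into: &hasher)
        }
    }
}

/// Convenience wrapper so the result can be compared directly.
func combinedHash<T>(of value: T) -> Int {
    var hasher = Hasher()
    hashEachMember(of: value, into: &hasher)
    return hasher.finalize()
}

let first = Example(a: 1, b: "x")
let second = Example(a: 1, b: "x")
print(memberNames(of: first))
print(combinedHash(of: first) == combinedHash(of: second))
```

The runtime version pays for reflection and casting on every call and loses static type checking, which is part of why a compile-time expansion is attractive here.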

Another thing I’m always finding myself wishing I could do is mirror members of a type into another type with slight modifications. These are just a little outside the realm of KeyPath-based dynamic members because they depend on things like adding parameters to functions that belong to another type.

// Common with codable stuff:
enum CodableThing: Encodable {
     case a(A), b(B)

     // Replace this with a macro
     enum Kind: String, Encodable { case a, b }

     enum CodingKeys: String, CodingKey { case kind, value }
     func encode(to encoder: Encoder) throws {
          var container = encoder.container(keyedBy: CodingKeys.self)
          // Replace this with a macro
          switch self {
             case let .a(value):
                 try container.encode(value, forKey: .value)
                 try container.encode(Kind.a, forKey: .kind)
             case let .b(value):
                 try container.encode(value, forKey: .value)
                 try container.encode(Kind.b, forKey: .kind)
          }
     }
}

// Common with enums that have some different associated types but also
// some properties which every instance must have.
struct Token {
    var location: Location
    var value: Value
    enum Value { case semicolon, identifier(String), number(Int) }

    // Replace this with a macro
    static func semicolon(at: Location) -> Token { … }
    static func identifier(_: String, at: Location) -> Token { … }
    static func number(_: Int, at: Location) -> Token { … }
}

It would be so nice if I could explicitly make the auto-synthesized CodingKeys enum public; if this were spellable with a macro I imagine I could
Custom diagnostics. This could be done with a linter, but why not build it right in with a macro? If I’m doing something like defining my own CodingKeys (say my server has funny spelling conventions or something), I’d like to be able to write a macro that warns me about accidentally omitted cases:

struct Thing: Encodable {
    var foo: String
    var bar: Int
    var baz: [Int]
    enum CodingKeys: String, CodingKey {
       case foo = "fOoOo", bar = "BAR!!!"
    }
    #checkMissingCodingKeys(Self) // Warning: baz not included in CodingKeys
}

Or even:

    #checkMissingCodingKeys(Self, ignore: baz) // OK: baz explicitly ignored

Sajjon · October 18, 2022, 9:02pm

Enum case discriminators will be the first thing I would use macros for.

(Because writing hundreds of lines of boilerplate discriminators is not that fun, and inclusion of Sourcery comes with some friction due to added complexity, and gyb has even higher friction)
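
For readers unfamiliar with the pattern, this is roughly the hand-written boilerplate in question, assuming "discriminator" means a payload-free enum that mirrors the cases of an enum with associated values. The `Shape` type is a made-up example.

```swift
enum Shape {
    case circle(radius: Double)
    case rectangle(width: Double, height: Double)
    case point

    // Boilerplate a macro could generate: one payload-free case per case.
    enum Discriminator: String, CaseIterable {
        case circle, rectangle, point
    }

    // More boilerplate: maps each value to its payload-free counterpart.
    var discriminator: Discriminator {
        switch self {
        case .circle: return .circle
        case .rectangle: return .rectangle
        case .point: return .point
        }
    }
}

let shape = Shape.rectangle(width: 2, height: 3)
print(shape.discriminator.rawValue)
```

Both the nested enum and the switch have to be updated by hand whenever a case is added, which is where the hundreds of lines come from in large enums.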

allevato · October 18, 2022, 9:47pm

At a high-level, I really like what's presented here. Implementing Equatable/Hashable as an extension default in the standard library would have been a lot more enjoyable than poking at the C++ compiler bits!

I have a random assortment of questions/musings that came up as I read through it:

The question of debuggability brought up above is incredibly important. If a user is stepping through Swift code in the debugger and they land on a macro, how do you envision the debugger handling this situation? Does LLDB have/can it have facilities to step through macro-synthesized code that doesn't exist in the physical source?
Since the macros themselves are Swift code that can have complex control flow, what facilities would exist to debug the macro implementation itself? It would (?) be fairly awkward to attach the debugger to swift-frontend itself and be able to step through the macro implementation. Are diagnostics the only way to communicate arbitrary messages out of the macro, or would some other kind of logging be available? Will there be a compiler flag that can emit the macro-expanded source code?
- If diagnostics are the answer here, maybe instead of returning an array of Diagnostics, the macro should be passed a sort of DiagnosticEmitter object so that it can emit things imperatively during development?
Up until now, SwiftSyntax has not made an API stability promise. This design for macros makes the syntax node APIs a user-facing part of the Swift language, so does this imply that SwiftSyntax will start providing a source-compatibility guarantee so that macros continue to work with newer compilers?
Similarly, this design makes type-checked AST a user-facing feature (the reference to TypeDecl in the requirement, conformance, and memberDeclaration contexts). Since the type-checked AST has only been accessible in the compiler with C++ (and to a less refined degree, via SourceKit substructure), this would be exposing a very large new API surface to Swift, which would need the same stability guarantees as the syntactic AST. What do you envision that looking like, e.g., what kinds of questions can a macro author ask of the type-checked AST node it's given? There's a spectrum of possibilities here where we can either limit the APIs to only those that we think users care about (and risk it not being flexible enough for some important use case) or we surface the full power of the C++ type-checked AST where you can essentially walk the graph from any decl to any other decl in the same module or some other module (and risk it being too easy for users to abuse, or force us to indefinitely support a large fixed API). I'm not sure where the right place to draw the line here is.

stackotter · October 18, 2022, 9:53pm

We are on very different pages then. I think you want them to only be used in a way similar to source generation, which would benefit library developers a lot, but in my opinion not so much library users. I get that the magic can seem hard to debug, but it also leads to very safe code by default. In Vapor if someone was whipping together an API on a deadline, they would probably have force unwraps in quite a few places and a single typo could crash the code at runtime.

I definitely see where you’re coming from though and I think your view is one of the 3 main views I’ve seen on this proposal: just make inbuilt source generation, just add more features to address the issues, and yes add macros. Out of the three, ‘just add more features to Swift to address any shortcomings brought up in this thread’ is the one that I wouldn’t want to see coming out of this proposal, because imo new Swift features take quite a while to get into developers’ hands (compared to Rust) because toolchains are pretty linked to Xcode, and there’s no easy nightly toolchains system (so these changes can’t be used in actual projects).

patrickgoley · October 19, 2022, 4:18am

Thanks so much for the detailed write-up! Obviously a lot to consider and process, but I just had one or two questions up front.

Given that the procedural macros return a raw string which is then compiled, what happens if that string is not valid Swift code? Presumably this is due to a bug in the macro itself but we'd want to be smart about how that error is surfaced to the user. Ideally the error is traceable back to where the macro was invoked in the client code and not floating in the ether of some intermediate source files created during macro expansion.

As a side note, even if mainly utilized for Swift language features, macros would really democratize the ability to develop and contribute new ideas. Implementing something like CaseIterable would probably only take a few minutes with a procedural macro written in Swift. Much faster than what it takes most people to get the compiler dev toolchain up and running, much less implement something, assuming they are proficient in C++ to begin with. Seems like we could build and ship certain features much more quickly just because of the lower barrier to entry. Cool!

John_McCall · October 19, 2022, 6:58am

Doug and I have talked about these stability issues a fair amount, so I can speak to this some.

The initial macro implementations that Doug is describing would be external lexical macros, which is to say, they would be implemented as an external program which would take and produce a string of program text, passed in some sort of JSON-like envelope. The Swift project would provide a convenience package for defining these macros that would handle breaking down and parsing the input and formatting the output; a particular version of that package would have a particular version range of SwiftSyntax it supports. However, the communication between the compiler and the macro would fundamentally be text-based, and so there would be no tight version coupling between the compiler and either the convenience package or SwiftSyntax. We could even rev the design of the envelope over time — a macro would have metadata saying what versions of the input envelope it accepts, and the compiler would pass down what versions of the output envelope it accepts.
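
As a rough illustration of that text-in, text-out shape, here is a sketch of a versioned envelope round trip. The field names (`envelopeVersion`, `macroName`, `sourceText`, `expandedText`) and the `expand` function are assumptions made for this example; the actual envelope format is not specified in this thread.

```swift
import Foundation

// All names below are hypothetical; the envelope format is not specified.
struct MacroRequest: Codable {
    var envelopeVersion: Int
    var macroName: String
    var sourceText: String
}

struct MacroResponse: Codable {
    var envelopeVersion: Int
    var expandedText: String
}

/// A purely lexical "stringify"-style expansion: the macro only ever sees
/// text and returns text, so it has no tight coupling to compiler internals.
func expand(_ requestJSON: Data) throws -> Data {
    let request = try JSONDecoder().decode(MacroRequest.self, from: requestJSON)
    // Escape backslashes and quotes so the source can sit in a string literal.
    let escaped = request.sourceText
        .replacingOccurrences(of: "\\", with: "\\\\")
        .replacingOccurrences(of: "\"", with: "\\\"")
    let response = MacroResponse(
        envelopeVersion: request.envelopeVersion,
        expandedText: "(\(request.sourceText), \"\(escaped)\")"
    )
    return try JSONEncoder().encode(response)
}

let request = MacroRequest(envelopeVersion: 1, macroName: "stringify", sourceText: "x + y")
let responseData = try expand(JSONEncoder().encode(request))
let response = try JSONDecoder().decode(MacroResponse.self, from: responseData)
print(response.expandedText)
```

Because both sides only agree on the envelope, either one could change its SwiftSyntax version without breaking the other, as long as the envelope version is still accepted.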

At least initially, these external macros would have to be almost purely lexical in their operation, meaning that they would get very limited semantic information about the program. This doesn’t mean they can’t be surfaced as semantic macros that have well-typed interactions with the surrounding source, just that most of that information would then be thrown away by the macro. For example, if a macro was surfaced semantically with the type <T> (T) -> Bool, the compiler could pass down that the type T was inferred to be Float during type-checking, but the macro can’t really ask anything about the nature of that type (such as its stored properties). Similarly, if the operand is the expression rect.width, the compiler could pass down that this has type Float, but not what the declarations rect or width are.

We can figure out ways to allow this information to be queried by the macro. That would basically be a stable API to the AST, however, even if it’s invoked over a textual protocol; it would need to be designed very carefully. Making it usable to an external program communicating over a textual protocol would run into a number of technical problems specific to that setup. And there’s an inherent efficiency problem with external macros in that the compiler has to re-parse and thus re-type-check the entire result, even if most of it is unchanged from the original source, just in case something about the changes alters how it’s typed.

An alternative would be to allow macros to be implemented directly in source using compile-time interpretation. This, of course, still requires the creation of that stable API to the AST. It also comes with its own major challenges, like needing to design and implement an interpreter. But it would solve some of the otherwise-intractable efficiency problems of external macros.

Finagolfin · October 19, 2022, 2:37pm

I read most of the gist, interesting approach. As someone who generally avoids macros like these but would like compile-time meta-programming, let me give you the perspective of the macro-averse (largely because I avoid domains that require boilerplate to begin with) who are going to be forced to deal with it in others' Swift code, from my experience in other languages.

Let me echo @willft and others in asking for a way to see the resulting source files after the macros have been expanded. For all C/C++'s faults, I can always check how their macros were used in clang -many -flags -o foo.o by running clang -many -flags -E -o foo.i and looking through the pre-processed file. This is a big deficiency in many modern macro systems.
It is very important that all macro uses are carefully marked, i.e. there should always be a hash-prefixed symbol or something more glaring anywhere a macro is invoked. I don't think it's a good idea that the property wrapper example has none. Zig made a big mistake by reusing if() for statically known checks, as opposed to something more explicit like if constexpr (I'm aware that Zig can interpret the runtime if at compile-time, I'm talking about differentiating cases where you know the condition is statically known).
Macros are an advanced feature, and requiring that those using it know internal compiler terminology is going to keep it so. It's good to keep it somewhat exclusive, but I wonder if that's the particular barrier to erect.
I don't use Rust but I've seen issues when cross-compiling it for Android, where flags that were only meant for Android were getting misapplied to the host macros. Something to avoid if taking a similar approach of compiling Swift macros for the host.

allevato · October 19, 2022, 2:50pm

Thanks for all this detail, John. So these things would be treated almost like "compiler plug-ins" rather than "macros", with the distinction that they only run when requested by a sigil in the source code, vs. traditional plug-ins that can run over all sources passed to the compiler arbitrarily.

I presume this is where the external: "ModuleName.MacroName" syntax comes into play; it tells the compiler where on the module search path to look for the macro's implementation.

If there are any more detailed sketches or thoughts on the horizon about what the driver/frontend interface for these macros would look like, I'd be very interested in seeing those to get a better sense of how other build systems like Bazel would integrate them into the user's build graph.

John_McCall · October 19, 2022, 3:13pm

Right, external macros are a kind of more language-integrated compiler plugin, invoked on subsets of the source only when requested.

The sigil thing is interesting. We want macros that are by design lexical / syntactic to use a sigil so that the compiler (and other tools, and of course programmers) know immediately that the usual parsing / type-checking rules are disabled within them. Semantic macros, which fit into the language more like a magic function call, don’t need a sigil on the same technical level. However, I think that on a design level, for readability’s sake, we would want to insist on a sigil anyway for any macro that really deviated from that function-call idea. For example, I don’t think we’d want to allow something that just looked like a normal function call to expand to something that returned from the enclosing function. So maybe statement-level semantic macros would need a sigil, while expression-level semantic macros would not but would be restricted in what they could do to, well, typical expression sorts of things.

allevato · October 19, 2022, 3:53pm

Java annotation processors come to mind as a similar model to what we're discussing here (but with the possible addition of annotation-less macros in narrower situations that you described).

That makes me think of another strong potential use case: automatic dependency injection. A set of macros could synthesize the initializers for fields marked as being injected, simplify other infrastructure around components/modules, and so forth.

This would require some degree of breadth and depth in the APIs that the macro implementation gets to access the type-checked AST, since it would need to navigate and filter type members, may want to check their conformances, things like that.

On a different note: One of the inputs to the macro is the syntax node to which it is being applied, and in SwiftSyntax those nodes also contain the trivia (whitespace and comments) that are applied to those syntactic elements.

Would that trivia be carried along through to the macro implementation? If it was, then that would in effect give us semantically important whitespace and comments, because the macro implementation could operate differently based on those elements. Let's be gross for a moment:

public struct SumOfNumberAndLeadingSpaces: ExpressionMacro {
  public static func apply(
    arguments: (String?, ExprSyntax)..., in context: MacroEvaluationContext
  ) -> (ExprSyntax, [Diagnostic]) {
    guard let integerExpr = arguments[0].1 as? IntegerLiteralExprSyntax else {
      return (arguments[0].1, [ /* some diagnostic */ ])
    }
    // Start from the literal's value (0 if it does not parse) and add the leading spaces.
    let numSpaces = integerExpr.leadingTrivia.reduce(Int(integerExpr.digits.text) ?? 0) { result, trivia in
      switch trivia {
      case .spaces(let count): return result + count
      default: return result
      }
    }
    return ("\(numSpaces)", [])
  }
}

let x = #sumOfNumberAndLeadingSpaces(       5)  // x = 12

I assume this is something we'd never want anyone to do. At the same time, I don't think we can completely strip the trivia from the nodes, because the Stringify example in the vision document relies on the whitespace being the same as what was passed into the macro.

I wonder if there's a way we can thread this needle to prevent egregious use cases, or if we just need to assume that users won't do horrible things (as we do for many other language features).

John_McCall · October 19, 2022, 5:24pm

That's an interesting question. I suppose that syntactic/semantic macros could reformat inputs to an unformatted token stream with comments stripped and other whitespace canonicalized. That would complicate mapping diagnostic locations back to the source text, but that's a relatively minor implementation-level concern.
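
A deliberately naive sketch of the whitespace half of that idea, working on raw text rather than on tokens. The function name is hypothetical, and a real implementation would operate on the lexed token stream so that string literals and comments are handled correctly; this version would mangle both.

```swift
/// Naive canonicalization: collapses every run of whitespace (including
/// newlines) to a single space and drops leading and trailing whitespace.
/// Assumption: the input contains no string literals or comments.
func canonicalizeWhitespace(_ source: String) -> String {
    source
        .split(whereSeparator: { $0.isWhitespace })
        .joined(separator: " ")
}

print(canonicalizeWhitespace("       5"))
print(canonicalizeWhitespace("a  +\n    b"))
```

After this step the seven leading spaces in the example above are no longer observable, so a macro could not make its result depend on them.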

If we do that, then a macro that's intended to operate on a pre-parsing level should probably still be passed the original source text without interference. Technically we'd have to lex it in order to delimit the macro, but I don't think we need to make further restrictions.

patrickgoley · October 19, 2022, 6:20pm

Purely an anecdote but having worked on very large Android projects that used Dagger annotations for injecting and providing dependencies, I can say I really did not enjoy that experience. Tons of boilerplate and the most inscrutable errors all just so I can connect one object to another. Very often running into "can't find a dependency for this interface" when it was very clearly provided somewhere else but no ability to debug the whole setup because it was all constructed via annotations. Maybe it can be done in a better way, but I would really hate to see Swift go down the same path. Dependency injection is so simple at its core... make a constructor that accepts your dependencies and then pass in the right ones in a convenience init or factory method. Trivial to set up and debug unlike annotations.
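
A minimal sketch of that plain constructor-injection approach, with made-up types (`Logger`, `ConsoleLogger`, `RecordingLogger`, `CheckoutService`) standing in for real dependencies:

```swift
protocol Logger {
    func log(_ message: String)
}

// Production dependency.
struct ConsoleLogger: Logger {
    func log(_ message: String) { print(message) }
}

// Test double that records what was logged.
final class RecordingLogger: Logger {
    private(set) var messages: [String] = []
    func log(_ message: String) { messages.append(message) }
}

struct CheckoutService {
    let logger: Logger

    // The constructor accepts its dependencies explicitly.
    init(logger: Logger) {
        self.logger = logger
    }

    // Factory method that wires in the production dependency.
    static func makeDefault() -> CheckoutService {
        CheckoutService(logger: ConsoleLogger())
    }

    func checkout(total: Int) {
        logger.log("Charged \(total)")
    }
}

// In a test, pass a different dependency through the same initializer.
let recorder = RecordingLogger()
CheckoutService(logger: recorder).checkout(total: 42)
print(recorder.messages)
```

Every connection between objects is an ordinary call that can be stepped through in the debugger, which is the contrast with annotation-driven wiring being drawn here.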