A Possible Vision for Macros in Swift

Douglas_Gregor · October 17, 2022, 7:00pm

Hey all,

As Swift evolves, it gains new language features and capabilities. One of the ways in which it grows is via new syntactic sugar to eliminate common boilerplate, taking something that can be written out in long-form and making it more concise. Such features don't technically add any expressive power to the language, because you can always write the long-form version, but their effect can be transformational if it enables use cases that would otherwise have been unwieldy. Language features in this category can be hard to evaluate, because there is a fundamental question of whether the feature is "worth it": does the set of use cases made better by this feature outweigh the cost of making the language larger and more complicated?

Macros can offer a way out of this conundrum, because they allow one to introduce boilerplate-reducing facilities without requiring a bespoke language extension or a separate source-manipulating tool to do so. A good macro system could replace the need for a swath of new language features, but one must be careful not to degrade the development experience by making it too hard to build good tools (e.g., C macros are notorious for breaking tooling).

We propose to introduce a macro system into Swift. At a very high level, macros can be explicitly triggered with syntax such as #stringify(x + y), subsuming a number of existing expressions that use similar syntax already (#line, #colorLiteral(...), etc.) with a general feature. Macros can also be triggered more implicitly in response to type checking, e.g., applying a macro to the argument passed to a function. Macro arguments are type-checked so that tooling behaves similarly to today, but macro evaluation involves transforming the syntax so that it can serve a number of different use cases.
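
To make the #stringify(x + y) example concrete, here is a hand-written sketch of what such an expansion might produce, assuming the macro yields the evaluated value together with the source text of its argument. The function name `stringifiedSum` is hypothetical and no macro machinery is involved; it only illustrates the shape of the result.

```swift
// Hand-written stand-in for what `#stringify(x + y)` could expand to:
// the evaluated expression paired with the source text that produced it.
func stringifiedSum(_ x: Int, _ y: Int) -> (value: Int, source: String) {
    // A macro would build this tuple from the syntax tree of its argument;
    // here the source text is simply typed out by hand (an assumption).
    return (x + y, "x + y")
}

let stringified = stringifiedSum(1, 2)
print("\(stringified.source) = \(stringified.value)")
```

The point of the macro version is that the string can never drift out of sync with the expression, which the hand-written form cannot guarantee.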

A possible vision for macros in Swift talks about the design space and proposes a path for macros in Swift.

There is also an experimental implementation of a basic syntactic macro system that uses syntax-tree manipulation as the mechanism for macro evaluation. It's not the full compiler integration as proposed above, but allows us to experiment with the kinds of transformations that macros could do.

What kinds of things would you like to do with macros in Swift? Does the possible vision linked above cover those use cases, or is there something more or different needed?

Doug

sergiocampama · October 17, 2022, 7:42pm

Maybe this is one of the basic usages of the C macro system, but the one feature for which I still have to rely on C-based code (plus glue files like headers and modulemaps) is injecting string parameters into the build, for example, the commit hash a binary was built at, so that it can be presented in UI (command line tools are good examples). This may involve a bit more than what's described, since it would be interfacing with build tools, and not just in-code macro definition and usage.

MPLewis · October 17, 2022, 10:07pm

This can be accomplished today (without resorting to C) with package manager plugins that generate a Swift source file for inclusion in the build, either with something like Sourcery or just a hardcoded template.
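
As a rough sketch of the "hardcoded template" approach: the generator step only has to turn a value known at build time into Swift source text. The names `renderBuildInfo` and `BuildInfo` are made up for illustration, the hash is a placeholder, and wiring this into an actual package plugin is left out.

```swift
// Renders the Swift source for a generated file exposing the commit hash.
// A build tool would call this and write the result into the build directory.
func renderBuildInfo(commitHash: String) -> String {
    return """
    // Generated file. Do not edit.
    enum BuildInfo {
        static let commitHash = "\(commitHash)"
    }
    """
}

// "abc1234" is a placeholder; a real build step would pass the actual hash.
print(renderBuildInfo(commitHash: "abc1234"))
```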

I worry about making macros too powerful, as they're generally pretty difficult to read in most languages I've encountered them in (C, Rust, Verilog, etc.). I think there definitely is a need for some additional macro functionality*, but in my own opinion the combination of build plugins, some additional work on things like function wrappers, and build-time constants can solve the majority of metaprogramming needs without opening the can of worms that is a fully-featured macro system.

For instance, I am pretty strongly against things from the linked document like:

#localCleanup("file", File(opening: "hello.txt"))
#colorLiteral(red: 0.5, green: 0.5, blue: 0.25, alpha: 1.0)
func assert(#stringify _ result: Bool)

I get that such macros centralize some boilerplate, but they make the code that much harder to reason about for a pretty minor simplification - now you have to go find the underlying macro definition to understand each of those. Put another way, if we need to resort to macros to accomplish each of those examples that might instead be a sign we need additional language features/standard library functionality instead (in the case of the first example above, something like a Python-style context manager could provide similar functionality). I'd also be quite concerned with the effect these would have on code-completion and other developer tools.
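
For comparison, a context-manager-like helper can already be approximated in today's Swift with a closure and `defer`. This is only a sketch: `DemoFile` and `withFile` are hypothetical stand-ins, not an existing API, and no real I/O happens.

```swift
// Stand-in resource type; a real file handle would perform actual I/O.
final class DemoFile {
    let path: String
    private(set) var isOpen = true

    init(opening path: String) { self.path = path }
    func close() { isOpen = false }
}

// Scoped-access helper: the file is closed when `body` returns or throws.
func withFile<R>(
    opening path: String,
    _ body: (DemoFile) throws -> R
) rethrows -> R {
    let file = DemoFile(opening: path)
    defer { file.close() }
    return try body(file)
}

// The cleanup is visible in the helper's definition, not hidden in an expansion.
let pathLength = withFile(opening: "hello.txt") { file in file.path.count }
```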

Maybe the core features and style of Swift will lend itself well to encouraging programmers to be more disciplined with how they use macros, but I've seen one too many macro-filled spaghetti-code C/Verilog codebases to think that opening the door to such a powerful set of features won't result in a net decrease in code readability.

* Things that definitively fall into this category for me are memberwise initializer generation and generalizing automatic protocol conformances like for Codable/Equatable/etc. I don't know how you provide the tools to implement those without also opening the door to macro "abuse", but I'd be very much in favor of a more limited macro system that was basically only able to synthesize methods/computed properties from a type's stored properties.

Andrew · October 17, 2022, 10:37pm

Looks great!

Having used and made Rust macros a lot I’m glad that you’re moving in a procedural macro style approach.

Type checking and diagnostics are significantly better with the proc macro approach you’re suggesting, but the barrier to entry writing them is much higher (at the moment) than macro_rules. This is in knowledge, project setup, and code.

A common pattern for the macro_rules rewrite language you haven’t captured is mass synthesis of implementations within a library. This macro usage is typically constrained to the same file containing the macro definition, and often saves a lot of repetitive code.

Procedural macros are written a lot less frequently because (in my experience) there’s a decent overhead of setting up a package, and it’s often a lot more code. I don’t think you need to worry about it being more code (macro_rules style syntax could be implemented as a macro in Swift). Although perhaps you could have an optional inline-syntax (ie. trailing closure on the macro(contexts: […]) { … } call) which would export that code to an implicit package?

My second suggestion is that the context parameter seems to mirror the protocol conformances of the external macro implementation, could that just be implied from the conformance? I assume the user of the macro wouldn’t necessarily have the context (sorry) to understand what the macro needs, or the knowledge of macro implementations.

regexident · October 17, 2022, 11:35pm

My use case for macros in Swift is similar to that described by @Andrew. And similar to his example I would have used pattern-based macros à la Rust. Especially when writing unit tests having macros can be a godsend.

To give a motivational example:

As one of the maintainers of the popular Surge (a framework for high-performance computation, powered by Apple's Accelerate framework) and the main driver behind a major cleanup and refactor around 3 years ago, I found the redundancy of code in the project, and the need to keep all its redundant parts consistent and in sync, a major burden for maintenance and, let's face it, for the motivation to keep working on it. It's just no fun at all. Working on Surge has been everything but fun.

So at some point I decided to try to start from scratch (dubbed SurgeNeue), this time making excessive use of code-generation. I however had to quickly realize that Sourcery, the only promising tool I could find back then, was inadequate for the task. And .gyb would have violated the whole point of making working on Surge fun again.

Surge already exposed over 300 functions and SurgeNeue would further increase this number by introducing additional optimized variants that would allow for writing into pre-allocated buffers.

What I would have needed to make this feasible was proper macros (the good ones à la Rust, not the shitty ones from C). The quickly abandoned experimental project can be found here.

The idea behind SurgeNeue was to try to find common patterns in the functions provided by Accelerate and to write a template for each such function pattern and use that to generate the annoying and utterly redundant boilerplate.

As such one of the patterns in Accelerate is unary functions, which tend to be in the shape of:

// parallel sqrt(x)
func vDSP_vsqD(
    _ a: UnsafePointer<Double>,
    _ ia: vDSP_Stride,
    _ c: UnsafeMutablePointer<Double>,
    _ ic: vDSP_Stride,
    _ n: vDSP_Length
)

with a and ia being the source buffer and stride, c and ic being the mutable destination buffer and stride and n being the number of elements.
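
To spell out what the stride parameters mean, here is a plain-Swift reference loop with the same parameter layout. It is a sketch for illustration only: `stridedMap` is a hypothetical helper, it uses `Int` and arrays instead of `vDSP_Stride`, `vDSP_Length`, and pointers, and it takes the element-wise operation as a closure.

```swift
// Reference semantics of a strided unary operation:
// for each i in 0..<n, c[i * ic] = f(a[i * ia]).
func stridedMap(
    _ a: [Double], _ ia: Int,
    _ c: inout [Double], _ ic: Int,
    _ n: Int,
    _ f: (Double) -> Double
) {
    for i in 0..<n {
        c[i * ic] = f(a[i * ia])
    }
}

// Read every second element of the source, write densely into the destination.
var strided = [Double](repeating: 0, count: 3)
stridedMap([1, 0, 4, 0, 9, 0], 2, &strided, 1, 3) { $0.squareRoot() }
```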

I'll be calling this pattern an "external mutating unary" function. External since the destination buffer's type is provided by the caller. Mutating, since it writes into a pre-allocated buffer. And unary since sqrt(x) is a unary function.

Turns out once you have such an "external mutating unary" function (pseudo-code):

External mutating unary:

// Reads from `src` and writes into pre-allocated `dst` of possibly different type:
let src: Array<Double> = [1, 2, 3]
var dst: ContiguousArray<Double> = .init()
dst.reserveCapacity(src.count)
squareRoot(src, into: &dst)

func squareRoot<Src, Dst>(_ src: Src, into dst: inout Dst) {
    assert(
        src.stride == 1 && dst.stride == 1,
        "sqrt doesn't support stride values other than 1"
    )

    assert(
        dst.count == src.count,
        "destination must hold as many elements as the source"
    )

    let stride: vDSP_Stride = numericCast(src.stride)
    let count: vDSP_Length = numericCast(src.count)
    vDSP_vsqD(src.pointer, stride, dst.pointer, stride, count)
}

… you can easily provide convenience variants for it by just wrapping it (pseudo-code):

External unary:

// Reads from `src` and writes into newly allocated `dst`:
let src: Array<Double> = [1, 2, 3]
let dst = squareRoot(src, as: ContiguousArray<Double>.self)

func squareRoot<Src, Dst>(_ src: Src, as type: Dst.Type = Dst.self) -> Dst {
    var dst = Dst()
    // Delegate to the external mutating variant, then return the filled buffer.
    squareRoot(src, into: &dst)
    return dst
}

Internal mutating unary:

// Reads from `src` and writes into pre-allocated `dst` of same type:
var src: Array<Double> = [1, 2, 3]
squareRoot(&src)

func squareRoot<Src>(_ src: inout Src) {
    // Source and destination are the same buffer.
    squareRoot(src, into: &src)
}

Internal unary:

// Reads from `src` and writes into newly allocated `dst` of same type:
let src: Array<Double> = [1, 2, 3]
let dst = squareRoot(src)

func squareRoot<Src>(_ src: Src) -> Src {
    var dst = src
    // Delegate to the external mutating variant, then return the copy.
    squareRoot(src, into: &dst)
    return dst
}

The only parts that differ between the implemented arithmetic functions are the implementation of the "external mutating unary" variant, the functions' names and their corresponding documentation comments. Otherwise whether you're implementing parallel sqrt() or parallel abs() makes no difference: the code structure is always the same. Like a perfect stencil. Perfect for code-gen via macros.
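
For reference, the stencil can be approximated without macros by passing the one hand-written function as a closure. This is a simplified sketch under stated assumptions: plain `[Double]` arrays and `Double.squareRoot()` stand in for Accelerate, `UnaryVariants` is a hypothetical name, and because the container type is fixed, the "external" and "internal" non-mutating variants collapse into one.

```swift
// The one hand-written piece per function: read `src`, write into `dst`.
func squareRoot(_ src: [Double], into dst: inout [Double]) {
    dst.removeAll(keepingCapacity: true)
    dst.append(contentsOf: src.map { $0.squareRoot() })
}

// The "stencil": every other variant is derived from the function above.
struct UnaryVariants {
    let externalMutating: ([Double], inout [Double]) -> Void

    // Allocates a new destination and returns it.
    func external(_ src: [Double]) -> [Double] {
        var dst: [Double] = []
        dst.reserveCapacity(src.count)
        externalMutating(src, &dst)
        return dst
    }

    // Overwrites the source in place.
    func internalMutating(_ src: inout [Double]) {
        let input = src
        externalMutating(input, &src)
    }
}

let sqrtVariants = UnaryVariants(externalMutating: squareRoot(_:into:))
var samples: [Double] = [1, 4, 9]
sqrtVariants.internalMutating(&samples)
```

The trade-off is visible at the call site: callers write `sqrtVariants.external(x)` instead of a free function, and each variant cannot carry its own documentation comment, which is the gap the macro-based version is meant to close.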

(There are some differences between functions from vecLib and vDSP, but let's ignore that for sake of simplicity here.)

As such with macros at my disposal I could have just written this instead:

/// Calculates the element-wise square root of `src`, writing the results into `dst`.
func squareRoot<Src, Dst>(_ src: Src, into dst: inout Dst) {
    // actual implementation
}

/// Calculates the element-wise square root of `src`, returning the results as a `Dst`.
#generateExternalUnary(squareRoot)

/// Calculates the element-wise square root of `src`, writing the results into itself.
#generateInternalMutatingUnary(squareRoot)

/// Calculates the element-wise square root of `src`, returning the results as a modified copy.
#generateInternalUnary(squareRoot)

… and have Swift generate the repetitive code from above for me.

The benefit would be even greater for the test suite that I would want to write for SurgeNeue, if I could write this:

#testUnary(squareRoot)

… instead of:

func test_externalMutating_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    var actual = ContiguousArray<Double>()
    actual.reserveCapacity(values.count)
    squareRoot(values, into: &actual)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_external_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    let actual = squareRoot(values, as: ContiguousArray<Double>.self)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_internalMutating_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    var actual = values
    squareRoot(&actual)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_internal_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    let actual = squareRoot(values)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

Especially since such tests would commonly be much more involved than this example already is.
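
Without macros, one way to cut down some of this repetition today is a shared helper that runs every variant against the serial reference. The sketch below is an assumption-laden illustration, not SurgeNeue code: `failingVariants` is a hypothetical name, each variant is adapted to a common `[Double] -> [Double]` shape, and the two toy variants exist only to show the mechanism.

```swift
// Runs every variant against a serial reference implementation and
// returns the names of the variants whose output disagrees with it.
func failingVariants(
    for values: [Double],
    reference: (Double) -> Double,
    variants: [(name: String, run: ([Double]) -> [Double])]
) -> [String] {
    let expected = values.map(reference)
    return variants.filter { $0.run(values) != expected }.map { $0.name }
}

// Two toy variants of an element-wise square root, one wrong on purpose.
let failures = failingVariants(
    for: [1, 4, 9],
    reference: { $0.squareRoot() },
    variants: [
        (name: "external", run: { $0.map { $0.squareRoot() } }),
        (name: "broken", run: { $0.map { $0 * $0 } }),
    ]
)
```

This still reports everything through a single test function, so a failure is attributed by name string rather than by a dedicated test method, which is part of what a #testUnary macro could improve on.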

I might have managed to plow through the repetitive implementation somehow, but writing tests for it would have broken me for sure. Imagine writing 4 such highly repetitive test functions each, for each of over 100 arithmetic functions in SurgeNeue.

ktoso · October 18, 2022, 12:07am

Great stuff, I'm very happy that this is finally seeing the light of day

I've not dug deeper into semantics and expressive power yet, but from a quick skim it looks mostly fine. I have experience building things with Scala macros so I'll give this a review from that perspective; hopefully we can avoid a bunch of Scala 2 macros' pitfalls.

One omission it seems:

It seems the proposal missed the functions use-case which in my mind is rather hugely important, and one we've talked about before. Here's a paragraph about the use-case you can work into the proposal, if desired:

The actual type declarations used in the snippet below are all made up; this just showcases the use case.

functionDeclaration: A macro that can be used on a function declaration and replaces the function definition. As with a conformance macro, a function declaration macro would probably want access to the stored properties of the enclosing type, and potentially other information. As an example, let's create a macro that wraps a function's body in a tracing span:

// In the standard library
macro(contexts: [.functionDeclaration], external: "MyMacros.Traced")
func traced(...)

// In the macro definition library
struct Traced: FunctionDeclarationMacro {
  public static func apply(
    in context: MacroEvaluationContext
  ) -> (FunctionDeclSyntax, [Diagnostic]) {   
    return (
      #"""
      func \(functionDecl.name)(\(functionDecl.parameters.map { $0.description }.joined(separator: ", "))) {
        Tracer.current.withSpan(#function) { span in 
          \(functionDecl.body)
        }
      }
      """#,
      []
    )
  }
}

Using this macro within the context of a type, e.g.,

import Tracing // https://github.com/apple/swift-distributed-tracing

(distributed) actor Worker { 

  #traced
  func hello(name: String) { 
    print("Hello, \(name)!")
  } 

}

would produce code like the following:

  func hello(name: String) { 
    Tracer.current.withSpan(#function) { span in
      print("Hello, \(name)!")
    }
  }

and allow us to automatically start tracing spans for methods like these.
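
The expanded form can be tried out today by writing it by hand against a toy tracer. In this sketch `ToyTracer` is a hypothetical stand-in and is not the swift-distributed-tracing API; the tracer is passed as a parameter instead of being read from `Tracer.current`, and the function returns its greeting so the result can be inspected.

```swift
// Toy tracer that records span boundaries; NOT the swift-distributed-tracing
// API, just a stand-in for illustration.
final class ToyTracer {
    private(set) var events: [String] = []

    func withSpan<T>(_ name: String, _ body: () throws -> T) rethrows -> T {
        events.append("start \(name)")
        defer { events.append("end \(name)") }
        return try body()
    }
}

// Roughly what a `#traced func hello(name:)` could expand to, written by hand.
func hello(name: String, tracer: ToyTracer) -> String {
    return tracer.withSpan(#function) {
        "Hello, \(name)!"
    }
}

let demoTracer = ToyTracer()
let greeting = hello(name: "Swift", tracer: demoTracer)
```

Here `#function` is evaluated inside `hello`, so the span is named after the traced function itself, which is the behavior the macro would need to preserve.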

Jon_Shier · October 18, 2022, 12:17am

Personally, using macros in place of actually generated code sounds awful. It lacks all inspectability and debuggability. If that's what we want here an actual integrated code generator would be much better. These types of Rust-like macros make code essentially unintelligible and just as magic as having the compiler generate it. The only reason the status quo is really viable is that no one really needs to inspect the contents of a generated Equatable or Codable conformance given how rote and similar they are.

If we do want such capabilities, it should not be enabled by yet another new syntax. I should be able to make my macros look like normal capabilities, like @Traced or something like that.

taylorswift · October 18, 2022, 1:12am

my experience with swift-package-factory, a SwiftSyntax-based code generation plugin, has led me to believe this is the correct view. an attribute-based system made up of a small number of highly-composable rules is preferable to this C++-like macroed code.

when designing SPF, i found it is best to stick to AST-level manipulations (rather than arbitrary string substitutions). here is a sample:

extension Int
{
    @matrix(__ordinal__: [i, j, k], __value__: [0, 1, 2])
    @inlinable public 
    var __ordinal__:Int 
    {
        __value__
    }

    @basis 
    let cases:[Never] = [a, b]

    enum Cases:Int
    {
        @matrix(__case__: cases)
        case __case__
    }

    @matrix(__case__: cases)
    public static 
    var __case__:Self 
    {
        Cases.__case__.rawValue
    }
}

generates:

extension Int
{
    @inlinable public 
    var i:Int 
    {
        0
    }
    @inlinable public 
    var j:Int 
    {
        1
    }
    @inlinable public 
    var k:Int 
    {
        2
    }

    enum Cases:Int
    {
        case a
        case b
    }

    public static 
    var a:Self 
    {
        Cases.a.rawValue
    }

    public static 
    var b:Self 
    {
        Cases.b.rawValue
    }
}

Jon_Shier · October 18, 2022, 1:19am

I meant more that the end product of something like ktoso's #traced should be @Traced to properly bring it in line with existing syntax. I don't think the actual code generation should be triggered by attributes at all, though generating actual code is a plus.

mcfans · October 18, 2022, 1:50am

Refactor this

guard let self = self else { return }

to some shorter macro

Jon_Shier · October 18, 2022, 1:55am

Exactly what it shouldn’t be used for.

TizianoCoroneo · October 18, 2022, 7:38am

Small mistake in one example: the implementation of EquatableSynthesis does not return a Diagnostic array.

Great work! An easy use case could be to obfuscate/deobfuscate constants like API keys and such… Looking forward to this.

mackoj · October 18, 2022, 7:50am

This is great!

Thanks for sharing your vision about this subject. I was pleasantly surprised about its depth.

After reading it the first thing that comes to mind is the lack of function support as ktoso mentioned here.

The way ktoso proposes to do it would make things like these pretty easy to do: Compile time support for Aspect-oriented programming, Prepitch: function wrappers, [Proposal/Pitch] Function decorators.

This would be nice to have.

Zollerboy1 · October 18, 2022, 7:52am

But, AFAICS, AST-level manipulations are what’s being proposed here, no?

frzi · October 18, 2022, 10:07am

Read, (parse) and embed the contents of text files directly into my code. This could be anything like tokens, urls, localized strings or shader code. Content I'd like to modify outside of my code during development, but would rather have embedded directly into the binary, as opposed to shipping it as files alongside the binary and reading it at runtime.

arennow · October 18, 2022, 10:37am

With respect, I think this is probably a very simple form of the worst thing macros could be used for. If a macro is able to emit new symbols into a scope, then neither humans nor non-macro-executing computers will be able to understand the flow of data in that scope. Even worse if the new symbol is shadowing an existing one because then the type appears to suddenly change (from Self? to Self in this case), which is close enough to a normal situation to be a red herring.

I also get a bit tired of writing guard let self = self else { return } (though with Swift 5.7, it can now be guard let self else { return }), but I think the minor inconvenience of that repetition is paid for by its perfect readability.
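
For readers less familiar with the idiom, this is the shadowing being discussed, written out explicitly. The `Counter` type is a made-up example; the point is only that the same name `self` changes type across the guard.

```swift
final class Counter {
    var count = 0

    func makeIncrement() -> () -> Void {
        return { [weak self] in
            // Before the guard, `self` has type `Counter?`.
            guard let self = self else { return }
            // After the guard, the same name refers to a non-optional `Counter`.
            self.count += 1
        }
    }
}

let counter = Counter()
let increment = counter.makeIncrement()
increment()
```

Because the guard is spelled out, the point where the type changes is visible in the closure body; a macro emitting the same binding would move that point out of sight.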

stackotter · October 18, 2022, 12:12pm

Code generators have their place, but in their current form they just can't provide the same functionality as macros in my opinion, because code generation doesn't result in anything very reusable outside of a single project. For example the code generation used in SwiftUI to generate their variadic generic types doesn't make any difference to the end user. It's just a tool that the SwiftUI developers use to make the same APIs with less boilerplate on their end.

In comparison, macros in the Rust sense create a whole new opportunity to make highly expressive APIs that reduce not only internal library boilerplate but also boilerplate that consumers must write. One example of an amazing use of macros in Rust (imo) is the Rocket web framework. It's extremely type-safe and they use macros in some very nice/clever ways to make it harder for users to write incorrect/unsafe code. Compare the way it handles parameters and url segments to the way Vapor does so for an idea of how the Swift ecosystem could be improved by macros. As far as I can tell, there's not much Vapor can do to be more typesafe at the moment. Here's a small snippet from the Rocket homepage for reference:

#[macro_use] extern crate rocket;

#[get("/hello/<name>/<age>")]
fn hello(name: &str, age: u8) -> String {
    format!("Hello, {} year old named {}!", age, name)
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![hello])
}

Although I do love macros, I agree that there are certainly some terrible ways that macro systems can be used. Careful consideration will have to be put into this review to ensure that the easiest and most obvious way to use the macro system is also the correct one, but imo a macro system will always have good and bad ways of using it if it's capable enough to be useful, and that's something that I am perfectly ok to live with.

stackotter · October 18, 2022, 12:20pm

Just a quick off-topic aside from a security point of view: if an API key needs to be obfuscated (because it's something a user shouldn't be able to get) then it shouldn't be in your app, it should probably be on a backend server.

TizianoCoroneo · October 18, 2022, 12:31pm

There are many third party SDKs for iOS that require API keys/tokens to be available synchronously at app launch. Is there any way around that? Obfuscation is the last resort, of course...

stackotter · October 18, 2022, 12:38pm

Afaict, Firebase's API keys are safe for anyone to get a hold of; they purely act as an identifier for your app and don't give users access to information that they wouldn't have access to anyway. Anyway, I digress; we can start another thread if that doesn't answer your question.