Why aren't macros given type information?

Since macros were introduced, a lot of people have wanted more semantic information about the code that a macro expands, e.g.:

// the type of a variable
var x = 5 // what is the type of x?

// whether a type conforms to a protocol
var y: Int // does Int conform to Equatable? Hashable? Codable?

// the type of a function result
let z = doSomething() // what is the return type of doSomething?

This would be very useful. The macro vision document says that syntactic macros were chosen as a good balance between purely text-based and semantic macros, but I still think it would be very useful to have at least the option to receive additional type information. Is this a future direction, or is it ruled out?

// bikeshedding here
@attached(conformance)
@attached(member)
public semantic macro SynthesizeEquatable() -> ()

// or...
// bikeshedding here
@attached(memberAttribute)
public syntactic macro AllObjectiveC() -> ()

// also...
@freestanding(expression)
public textual macro ConvertCCode(code) -> ()
// in this case type information isn't necessary, since the macro is fed raw text which isn't parsed at all

There's no Swift API for a type-checked Swift AST akin to the unchecked syntax tree API that Swift Syntax provides. The type checker is written in C++ and has no stable public API.

Having such an API available to Swift code would be a necessary technical prerequisite for accessing type information in macros, assuming there are no other reasons not to expose it in the first place.


I thought SourceKit provided some API for these things; but if there is no API then yes, one should be created! Even if this feature is years away it would be nice to know if it's on the roadmap or if macros will have this limitation forever.

SourceKit also provides only a private C++ API.

I think it goes without saying that the capabilities of macros can and will be expanded, just as every other part of the language also evolves.

But yes, it has been mentioned specifically that we would, eventually, like to provide macros with more type information:

The macro expansion context could be extended with an operation to produce the type of a given syntax node, e.g.,

extension MacroExpansionContext { 
  func type(of node: ExprSyntax) -> Type? 
}

When given one of the expression syntax nodes that is part of the macro expansion expression, this operation would produce a representation of the type of that expression. The Type would need to be able to represent the breadth of the Swift type system, including structural types like tuple and function types, and nominal types like struct, enum, actor, and protocol names.

Additional information could be provided about the actual resolved declarations. For example, the syntax node for .red could be queried to produce a full declaration name Color.red, and the syntax node for == could resolve to the full name of the declaration of the == operator that compares two Color values. A macro could then distinguish between different == operator implementations.

The main complexity of this future direction is in defining the APIs to be used by macro implementations to describe the Swift type system and related information. It would likely be a simplified form of a type checker's internal representation of types, but would need to remain stable. Therefore, while we feel that the addition of type information is a highly valuable extension for expression macros, the scope of the addition means it would best be introduced as a follow-on proposal.

SE-0382: Expression macros
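
To make the "represent the breadth of the Swift type system" point a bit more concrete, here is a minimal sketch of what a simplified, stable type representation might look like. Every name in it (TypeSketch, NominalKind, spelling) is hypothetical and made up for illustration; none of it comes from SE-0382 or SwiftSyntax, and a real design would need far more cases (generics, optionals, existentials, and so on).

```swift
// Hypothetical model for illustration only; not an API from SE-0382 or SwiftSyntax.
enum NominalKind: Equatable {
    case structType, enumType, classType, actorType, protocolType
}

indirect enum TypeSketch: Equatable {
    // Nominal types are identified by the kind of declaration and its name.
    case nominal(kind: NominalKind, name: String)
    // Structural types are composed out of other types.
    case tuple([TypeSketch])
    case function(parameters: [TypeSketch], result: TypeSketch)
}

extension TypeSketch {
    // Renders the type roughly the way it would be written in source.
    var spelling: String {
        switch self {
        case .nominal(_, let name):
            return name
        case .tuple(let elements):
            return "(" + elements.map { $0.spelling }.joined(separator: ", ") + ")"
        case .function(let parameters, let result):
            let list = parameters.map { $0.spelling }.joined(separator: ", ")
            return "(" + list + ") -> " + result.spelling
        }
    }
}

// The == operator comparing two Color values, described as a function type.
let color = TypeSketch.nominal(kind: .enumType, name: "Color")
let bool = TypeSketch.nominal(kind: .structType, name: "Bool")
let equals = TypeSketch.function(parameters: [color, color], result: bool)
print(equals.spelling) // (Color, Color) -> Bool
```

Even this toy version hints at why the proposal calls the API surface the main complexity: whatever shape is chosen has to stay stable while the compiler's internal representation keeps changing.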

Moreover, it has been acknowledged since the very beginning that the syntactic macros we have are just one flavour of macro, and that there are other interesting flavours with their own strengths and limitations.

A program's source code goes through several different representations as it is compiled, and a macro system can choose at what point in this translation it operates. We consider three different possibilities:

  • Lexical: a macro could operate directly on the program text (as a string) or a stream of tokens, and produce a new stream of tokens. The inputs to such a macro would not even have to be valid Swift syntax, which might allow for arbitrary sub-languages to be embedded within a macro. C macros are lexical in nature, and most lexical approaches would inherit the familiar problems of C macros: tooling (such as code completion and syntax highlighting) cannot reason about the inputs to lexical macros, and it's easy for such a macro to produce ill-formed output that results in poor diagnostics.

  • Syntactic: a macro could operate on a syntax tree and produce a new syntax tree. The inputs to such a macro would be a parsed syntax tree, which is strictly less flexible than a lexical approach because it means the macros can only operate within the bounds of the existing Swift grammar. However, this restriction means that tooling based on the grammar (such as syntax highlighting) would apply to the inputs without having to expand the macro, and macro-using Swift code would follow the basic grammatical structure of Swift. The output of a macro should be a well-formed syntax tree, which will be type-checked by the compiler and integrated into the program.

  • Semantic: a macro could operate on a type-checked representation of the program, such as an Abstract Syntax Tree (AST) with annotations providing types for expressions, information about which specific declarations are referenced in a function call, any implicit conversions applied to expressions, and so on. Semantic macros have a wealth of additional information that is not provided to lexical or syntactic macros, but unlike lexical or syntactic macros, their inputs are restricted to well-typed Swift code. This limits the ability of macros to change the meaning of the code provided to them, which can be viewed both as a negative (less freedom to implement interesting macros) or as a positive (less chance of a macro doing something that confounds the expectations of a Swift programmer). A semantic macro could be required to itself produce a well-typed Abstract Syntax Tree that is incorporated into the program.

Whichever kind of translation we choose, we will need some kind of language or library that is suitable for working with the program at that level. A lexical translation needs to be able to work with program text, whereas a syntactic translation also needs a representation of the program's syntax tree. Semantic translation requires a much larger part of the compiler, including a representation of the type system and the detailed results of fully type-checked code.

Macros vision document
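
As a rough illustration of the three levels described above, here is a toy sketch over a made-up expression language. All of the names (Expr, ToyType, lexicalDouble, syntacticDouble, semanticDouble, typeCheck) are hypothetical and have nothing to do with the real macro system; the point is only to show what each level can and cannot guarantee about its input.

```swift
// Toy expression language; every name here is hypothetical.
indirect enum Expr: Equatable {
    case intLiteral(Int)
    case stringLiteral(String)
    case add(Expr, Expr)
}

enum ToyType { case int, string }

// Lexical: text in, text out. Nothing checks that the input is a valid expression.
func lexicalDouble(_ source: String) -> String {
    return "\(source) + \(source)"
}

// Syntactic: tree in, tree out. The result is well-formed, but may be ill-typed.
func syntacticDouble(_ expr: Expr) -> Expr {
    return .add(expr, expr)
}

// A tiny stand-in for the type checker: both sides of an addition must agree.
func typeCheck(_ expr: Expr) -> ToyType? {
    switch expr {
    case .intLiteral:
        return .int
    case .stringLiteral:
        return .string
    case .add(let lhs, let rhs):
        guard let left = typeCheck(lhs), left == typeCheck(rhs) else { return nil }
        return left
    }
}

// Semantic: the input must type-check before the transformation ever sees it.
func semanticDouble(_ expr: Expr) -> Expr? {
    guard typeCheck(expr) != nil else { return nil }
    return .add(expr, expr)
}

let illTyped = Expr.add(.intLiteral(1), .stringLiteral("a"))
print(lexicalDouble("1 +"))               // 1 + + 1 +
print(syntacticDouble(illTyped) == .add(illTyped, illTyped)) // true
print(semanticDouble(illTyped) == nil)    // true
```

The lexical version happily produces ill-formed text, the syntactic version accepts an ill-typed tree, and the semantic version refuses it, but only because it drags a type checker along with it.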
