[Pitch] Lazy Accessor Macros

Hello, Swift Evolution!

Initializer expressions on properties annotated with an accessor macro are subsumed from the original context. The macro is responsible for using the initializer expression in a new context, for example in an introduced get accessor. In many cases, this results in a lazily evaluated initializer expression.

Currently, the subsumed initializer is still type-checked in the original context. Using self in the initializer results in a compiler error, even if the initializer in its new context could have access to self thanks to lazy evaluation. I am pitching to lift this restriction.

Please let me know what you think about this idea.

Motivation

Swift allows for lazy property initialization with the lazy keyword, which has been available long before macros were introduced. This is useful for properties where

  • initialization is expensive and should be avoided unless actually needed
  • initialization depends on values that become available after initialization of the enclosing type, including access to self
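
Both points can be observed with the built-in lazy keyword. This is a minimal sketch; `Counter`, `Report`, and `makeHeader()` are hypothetical names that exist only for this illustration:

```swift
// Hypothetical helper that counts how often the initializer expression runs.
final class Counter {
    var count = 0
}

final class Report {
    let counter = Counter()

    // Only gets its real value after `Report` has been initialized.
    var title = "untitled"

    // Evaluated on first access, with access to `self`.
    lazy var header: String = self.makeHeader()

    private func makeHeader() -> String {
        counter.count += 1
        return "Report: \(title)"
    }
}

let report = Report()
report.title = "Mice"                        // set after initialization
let runsBeforeAccess = report.counter.count  // initializer has not run yet
let header = report.header                   // first access runs the initializer
let headerAgain = report.header              // later accesses reuse the stored value
let runsAfterAccess = report.counter.count
```

The initializer expression runs once, and only after `title` has been assigned.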

Here is a rough description of the transformations applied to a lazy var foo: T = initExpr() by the compiler:

  • a private backing var of type T? is added in the same scope, with an initial value of nil
  • the original property is converted into a computed property
  • in the get accessor, the initializer is used if the backing var is nil

The conversion looks approximately (not entirely, but close enough) like this:

private var __foo: T? = nil

var foo: T {
    get {
        if let value = __foo {
            return value
        }
        let newValue = initExpr()
        __foo = newValue
        return newValue
    }
    set {
        __foo = newValue
    }
}
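
To try the transformation out, here is a hand-written, runnable variant for a struct. It is a sketch and not what the compiler actually emits. The `evaluations` counter is a hypothetical addition to make the single evaluation visible, and the getter is marked `mutating` because it writes to storage of a value type:

```swift
struct Earth {
    let mice = 21

    // Hypothetical counter to observe how often the initializer expression runs.
    private(set) var evaluations = 0

    private var _theAnswer: Int? = nil

    var theAnswer: Int {
        // In a struct, the getter writes to storage, so it must be `mutating`.
        mutating get {
            if let value = _theAnswer {
                return value
            }
            evaluations += 1
            let newValue = mice * 2   // `self` is available inside the getter
            _theAnswer = newValue
            return newValue
        }
        set {
            _theAnswer = newValue
        }
    }
}

var earth = Earth()
let first = earth.theAnswer    // runs the initializer expression
let second = earth.theAnswer   // reuses the stored value
earth.theAnswer = 7            // the setter overwrites the backing storage
let third = earth.theAnswer
```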

The initializer expression is type-checked in the original context as if it were already moved into the getter. This feels natural and convenient:

struct Earth {
    let mice = 21
    
    let noGood = mice * 2
    //           ˆ ❌ cannot use instance member 'mice' within property initializer; property initializers run before 'self' is available
    
    // ✅ Nice, initializer expression has access to `self`!
    lazy var theAnswer = mice * 2
}

We can write an accessor macro that performs a similar transformation. Unfortunately, we have no way of asking the compiler to type-check the initializer expression in the same way as lazy properties do. If we need self access, we always get a compiler error:

struct Earth {
    let mice = 21
    
    @Lazy
    var theAnswer = mice * 2
    //              ˆ ❌ cannot use instance member 'mice' within property initializer; property initializers run before 'self' is available
}

Concrete Example

We work on a document-based SwiftUI app. The user can open multiple documents. We propagate a representation of the open document as an Environment value. We use a SwiftData ModelContext for this example, but it could be any object needed by a view branch; for example, an object representing a repository, a file, or a database.

Environment injection is an elegant approach because it allows different branches of the view tree to work with different modelContexts. It's great if we use @Query directly in our views, but sometimes our business logic is too involved, so we move it into a ViewModel. We extract the context from the environment and inject it into the view model. Since @State will be a lazy macro in the upcoming releases, this should be easy:

@Observable
final class ViewModel {
    let modelContext: ModelContext
    
    init(modelContext: ModelContext) { ... }
}

struct MyView: View {
    @Environment(\.modelContext) var modelContext
    
    // `@State` will be a lazy macro in the '27 releases
    // However, this init expression does not work because we need `self` access.
    @State var viewModel = ViewModel(modelContext: modelContext)
    //                     ˆ ❌ cannot use instance member 'modelContext' within property initializer; property initializers run before 'self' is available
    
    var body: some View { ... }
}

Workaround 1

We could extract the modelContext in the parent view and initializer-inject it into the child. This approach is not satisfying because it breaks encapsulation. The parent should not know about the internal implementation of its child.

@Observable
final class ViewModel {
    let modelContext: ModelContext
    
    init(modelContext: ModelContext) { ... }
}

struct ParentView: View {
    // ⚠️ potentially unnecessary dependency on modelContext
    //    if `ParentView` only needs it to inject into `MyView`
    @Environment(\.modelContext) var modelContext
    
    var body: some View {
        // ⚠️ broken encapsulation
        MyView(context: modelContext)
        
        // ⚠️ Also, what if `MyView` is a generic type?
    }
}

struct MyView: View {
    @State var viewModel: ViewModel
    
    init(context: ModelContext) {
        viewModel = ViewModel(modelContext: context)
    }
    
    var body: some View { ... }
}

Workaround 2

Make the modelContext in the viewModel optional and inject it later.

@Observable
final class ViewModel {

    // The `modelContext` property must be `var`, but should be `let`.
    //
    // ⚠️ Do we need to deal with "some co-worker" AKA "future self"
    //    accidentally replacing an existing modelContext with a
    //    different one?
    //
    // The value must be optional and will be `nil` initially, which is awkward:
    //
    // ❓ Deal with optional modelContext everywhere? 🤷
    //
    // ❗️ Or use force unwrapping, tolerating crashes if the
    //    developer forgets to add `.onAppear`? 💣
    //
    var modelContext: ModelContext?
    
    init() { }
}

struct MyView: View {
    @Environment(\.modelContext) var modelContext
    
    // Dependency on `modelContext` is not explicit in initializer:
    @State var viewModel = ViewModel()
    
    var body: some View {
        SomeBody()
            // ⚠️ Don't forget this:
            .onAppear {
                viewModel.modelContext = modelContext
            }
    }
}

Clearly, this approach is awkward to work with, and it risks bugs if the dependency is not injected correctly. The cleanest and most convenient solution would be to inject the modelContext in the view model's initializer expression. This pitch would enable that.
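
The desired shape can be sketched without SwiftUI. In this framework-free illustration, the built-in lazy keyword stands in for a lazy accessor macro, and `DocumentContext` and `Screen` are hypothetical stand-ins for the model context and the view:

```swift
// Stand-in for a per-document dependency such as a model context (hypothetical type).
final class DocumentContext {
    let name: String
    init(name: String) { self.name = name }
}

final class ViewModel {
    // Non-optional `let`: the dependency is required and cannot be swapped later.
    let context: DocumentContext
    init(context: DocumentContext) { self.context = context }
}

struct Screen {
    // Plays the role of the environment value in the SwiftUI example.
    let context: DocumentContext

    // Built-in `lazy` stands in for a lazy accessor macro here.
    lazy var viewModel = ViewModel(context: context)

    init(context: DocumentContext) { self.context = context }
}

let document = DocumentContext(name: "Untitled 1")
var screen = Screen(context: document)
let firstAccess = screen.viewModel    // created on first access, using `self.context`
let secondAccess = screen.viewModel   // the same instance is reused
```

Compared to Workaround 2, the dependency stays a non-optional `let`, and there is no separate injection step to forget.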

Proposed Solution

Add an optional initializer: parameter to the accessor macro role declaration. Possible values are lazy or eager. A macro author uses lazy to declare that the initializer is invoked lazily after macro expansion, which enables access to self in the initializer expression. eager is the default value, and resembles the current behavior of accessor macros.

// Declaration of the example `@Lazy` macro
//
// With `initializer: lazy`, we promise to use the initializer
// in a context where `self` is available.
@attached(accessor, initializer: lazy, names: named(get), named(set))
@attached(peer, names: prefixed(_))
public macro Lazy() = #externalMacro( ... )

// Usage:

struct Earth {
    let mice = 21
    
    // ✅ We can use `self` here:
    @Lazy var theAnswer = self.mice * 2
}

Detailed Design

The compiler type-checks a subsumed initializer expression in its original context before any macro is expanded. The initializer is type-checked again in its new context after an accessor macro is expanded. Type-checking in the initial context should not be disabled, because it enables type inference. The inferred type is available to the accessor macro and is often needed to form the expansion. Consequently, it would be impossible to change that order; macros cannot be expanded before the property type is inferred, hence the initializer must be type-checked first. The following example illustrates this using the aforementioned @Lazy macro. First, the property type is inferred by checking the initializer:

@Lazy var foo = 42
//              ˆ inferred type is `Int` after checking initializer expression

The macro is invoked with this information:

@Lazy var foo: Int = 42
//             ˆ inferred type the macro can work with

The macro uses the inferred type during expansion:

// The inferred initializer type is needed for the optional
// type of this property, which is generated by the macro:
private var _foo: Int?

var foo: Int {
    get {
        if let value = _foo {
            return value
        }
        let newValue = 42
        //             ˆThe macro has re-contextualized the initializer here.
        //              It will be checked again in this context to make sure
        //              the expansion is valid.
        _foo = newValue
        return newValue
    }
    set { _foo = newValue }
}

Since the initial initializer type-check happens before macro expansion, there is currently no information about how the initializer will be used by the macro. For example, it could be used in an init accessor. The initializer would be evaluated during initialization of the enclosing type, when there is no access to self. For such a situation, the current implementation is correct: self is not available during type-checking in the original context.
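
For comparison, here is a hand-written init accessor as an example of such an eager context. This is a sketch that assumes a toolchain with init accessors (Swift 5.9 or later); it is written by hand, not generated by a macro:

```swift
struct Earth {
    private var _theAnswer: Int

    var theAnswer: Int {
        // The init accessor runs while `Earth` is being initialized,
        // so it may only touch the storage it declares here, not `self`.
        @storageRestrictions(initializes: _theAnswer)
        init(initialValue) {
            _theAnswer = initialValue
        }
        get { _theAnswer }
        set { _theAnswer = newValue }
    }

    init(theAnswer: Int) {
        // Assigning through the property invokes the init accessor.
        self.theAnswer = theAnswer
    }
}

let earth = Earth(theAnswer: 42)
let answer = earth.theAnswer
```

An initializer expression that a macro moves into the `init` block is evaluated at this point, before the instance is fully formed.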

Other macros such as @Lazy in the examples above may result in an expansion where self access would be valid for the re-contextualized initializer. Currently, this fact is unknown during the initial check; therefore, self access is assumed to be illegal.

Role Declaration

This pitch proposes adding an optional initializer: parameter to the macro role declaration of accessor macros. This can have one of two possible arguments: either eager or lazy.

  • eager: The current behavior – no self access for the initializer expression
  • lazy: The macro promises to use the initializer in a context where self is available

If the initializer: parameter is omitted, eager will be used as the default value. This choice makes sure that existing macros behave the same as before.

Some example declarations:

// Declaration of the example `@Lazy` macro
//
// With `initializer: lazy`, we promise to use the initializer
// in a context where `self` is available.
@attached(accessor, initializer: lazy, names: named(get), named(set))
@attached(peer, names: prefixed(_))
public macro Lazy() = #externalMacro( ... )

// This macro uses the initializer in an `init` accessor, where `self` access
// would be invalid.
@attached(accessor, initializer: eager, names: named(init))
public macro SomeEagerMacro() = #externalMacro( ... )

// If `self` is not available for the initializer,
// we can just omit the `initializer:` parameter.
@attached(accessor, names: named(init))
public macro SomeEagerMacro() = #externalMacro( ... )

// `initializer:` is only available for accessor macros.
@attached(body, initializer: lazy)
//              ˆ ❌ Error: "initializer" is only available for accessor role

// Only `lazy` or `eager` are valid:
@attached(accessor, initializer: ridiculous, names: named(get), named(set))
//                               ˆ ❌ Error: Unknown initializer context 'ridiculous'. Possible values are 'eager' or 'lazy'.

// `initializer:` takes one argument:
@attached(accessor, initializer: eager, lazy, names: named(get), named(set))
//                               ˆ ❌ Error: multiple arguments unsupported.
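
The validation rules behind these declarations can be summarized as a toy model. This is not compiler code; `RoleError` and `initializerContext(role:arguments:)` are hypothetical names that only restate the rules listed above:

```swift
// Toy model of the proposed `initializer:` argument. Not compiler code.
enum InitializerContext: String {
    case eager, lazy
}

enum RoleError: Error, Equatable {
    case onlyForAccessorRole
    case unknownContext(String)
    case multipleArguments
}

/// Validates the values spelled after `initializer:` for a given role name.
/// An empty `arguments` array models an omitted `initializer:` parameter.
func initializerContext(role: String, arguments: [String]) throws -> InitializerContext {
    guard !arguments.isEmpty else { return .eager }   // omitted: default to eager
    guard role == "accessor" else { throw RoleError.onlyForAccessorRole }
    guard arguments.count == 1 else { throw RoleError.multipleArguments }
    guard let context = InitializerContext(rawValue: arguments[0]) else {
        throw RoleError.unknownContext(arguments[0])
    }
    return context
}

let omitted = try? initializerContext(role: "accessor", arguments: [])
let declaredLazy = try? initializerContext(role: "accessor", arguments: ["lazy"])
let wrongRole = try? initializerContext(role: "body", arguments: ["lazy"])
```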

Effect on Type Checking

When checking the initializer in its original context, the type-checker will look for an accessor macro with a lazy declaration attached. If such a macro is found, the same type-checking code as implemented for the existing lazy keyword is used, where self access is allowed. In all other cases, the behavior remains unchanged: self access is diagnosed as a compiler error.
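
As a toy model of that decision (again not compiler code; all type and property names are hypothetical):

```swift
// Toy model of the proposed check. Not compiler code.
enum InitializerContext { case eager, lazy }

struct AttachedAccessorMacro {
    let name: String
    var initializer: InitializerContext = .eager   // default when omitted
}

struct PropertyDeclaration {
    var isLazyKeyword = false
    var accessorMacros: [AttachedAccessorMacro] = []

    /// Whether `self` may appear in the initializer when it is
    /// type-checked in its original context.
    var allowsSelfInInitializer: Bool {
        isLazyKeyword || accessorMacros.contains { $0.initializer == .lazy }
    }
}

let plain = PropertyDeclaration()
let withLazyMacro = PropertyDeclaration(
    accessorMacros: [AttachedAccessorMacro(name: "Lazy", initializer: .lazy)]
)
let withEagerMacro = PropertyDeclaration(
    accessorMacros: [AttachedAccessorMacro(name: "SomeEagerMacro")]
)
```

Only `withLazyMacro` (and a property using the existing lazy keyword) would be checked with `self` access allowed.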

If the macro author promises lazy behavior in the role declaration, but re-contextualizes the initializer where self access is illegal, the type-checker will allow self access in the initializer's original context. But it will be checked again in the new context after macro expansion. The error will be diagnosed in the expanded code as expected.

// Macro declaration:
@attached(accessor, initializer: lazy, names: named(get), named(init))
@attached(peer, names: prefixed(_))
public macro NotLazy() = #externalMacro( ... )

// Macro usage:
struct Earth {
    let mice = 21
    @NotLazy var theAnswer = mice * 2
}

// expansion:
struct Earth {
    let mice = 21
    private var _theAnswer: Int
    var theAnswer: Int {
        @storageRestrictions(initializes: _theAnswer)
        init {
            _theAnswer = mice * 2
//                       ˆ ❌ cannot use instance member 'mice' within property initializer; property initializers run before 'self' is available
        }
        get { _theAnswer }
    }
}

Source Compatibility

Additive. Existing macro role declarations may remain unchanged. In this case, today's behavior remains unaffected.

Newly written accessor macros may elect to specify lazy initializer usage. This does not affect existing source code.

A macro author may elect to specify lazy initializer usage for an existing macro. This change may affect existing code and should be carefully considered by the macro author. Here is an example:

struct Earth {
    static let someValue: Double = 17
    let someValue = 42
    
    // ⚠️ `theAnswer` used to be "17.0" before the macro author declared `lazy`.
    //     Now the answer is "42". Also, the inferred type has changed: It used
    //     to be `Double`, now it's `Int`.
    @NewlyLazy var theAnswer = someValue
}
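
Code that qualifies the reference explicitly does not depend on the checking context. The following sketch uses the built-in lazy keyword in place of a lazy macro; the property names are hypothetical:

```swift
struct Earth {
    static let someValue: Double = 17
    let someValue = 42

    // Explicitly the static member, independent of the checking context.
    var staticAnswer = Earth.someValue

    // Explicitly the instance member. This needs a context where `self` is
    // available, so the built-in `lazy` stands in for a lazy macro here.
    lazy var instanceAnswer = self.someValue
}

var earth = Earth()
let staticAnswer = earth.staticAnswer       // inferred as `Double`
let instanceAnswer = earth.instanceAnswer   // inferred as `Int`
```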

ABI Compatibility

No effect on ABI when using existing macros as-is. Using an existing macro that newly adopted lazy may cause a different type to be inferred in some cases as shown in the example above.

Alternatives Considered

Use Introduced Names to Infer Initializer Context

The type checker has access to the introduced names of an accessor macro. Instead of having the macro author declare lazy or eager usage, the type-checker could assume a lazy context if the following conditions are met:

  • macro introduces non-observing accessors
  • macro does not introduce an init accessor, assuming the init expression will be used here

This approach could work in the general case, but we can construct situations where it breaks down. For example, a macro may introduce an init accessor that is unrelated to the initializer expression. Instead, the initializer is actually used lazily inside the getter. self access would have been OK for such a macro, but is forbidden by the type checker.

Another problem of this approach is that existing macros automatically adopt the new behavior if they fulfill the conditions. This may result in source-breaking changes or subtle bugs when an initializer suddenly uses self.foo instead of Self.foo.
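
The heuristic and its blind spot can be written down as a toy model. This is not compiler code; `IntroducedAccessor` and `heuristicAssumesLazy` are hypothetical names:

```swift
// Toy model of the rejected heuristic. Not compiler code.
enum IntroducedAccessor {
    case getter, setter, initAccessor, willSetObserver, didSetObserver
}

/// Assumes a lazy context if the macro introduces at least one
/// non-observing accessor and no `init` accessor.
func heuristicAssumesLazy(_ introduced: Set<IntroducedAccessor>) -> Bool {
    let observers: Set<IntroducedAccessor> = [.willSetObserver, .didSetObserver]
    let introducesNonObserving = !introduced.subtracting(observers).isEmpty
    return introducesNonObserving && !introduced.contains(.initAccessor)
}

// A macro like `@Lazy`: the heuristic guesses correctly.
let plainLazy = heuristicAssumesLazy([.getter, .setter])

// A macro that uses the initializer lazily in the getter but also
// introduces an unrelated `init` accessor: wrongly treated as eager.
let lazyGetterPlusInit = heuristicAssumesLazy([.getter, .initAccessor])
```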

Macro + Existing lazy Keyword

There was a discussion about attaching a macro to a lazy var. An effect of the lazy keyword is that self can be used in the initializer, which would achieve the desired behavior. However, the compiler diagnoses an error:

@MyMacro
lazy var value = self.compute()
// ❌ 'lazy' cannot be used on a computed property

This error is correct, because the compiler cannot check if the macro is actually lazy. If this combination were allowed, the programmer would need to know the internals of the macro implementation at the usage site and make sure that the lazy keyword is used correctly. This idea has been rejected previously.

Status

There is a draft implementation available here. I still have to add some tests and update the documentation.

Maybe fix-its would be nice for the error diagnostics...

// `initializer:` is only available for accessor macros.
@attached(body, initializer: lazy)
//              ˆ ❌ Error: "initializer" is only available for accessor role
//                🔧 Fix: Remove "initializer: lazy"

// Only `lazy` or `eager` are valid:
@attached(accessor, initializer: ridiculous, names: named(get), named(set))
//                               ˆ ❌ Error: Unknown initializer context 'ridiculous'. Possible values are 'eager' or 'lazy'.
//                                 🔧 Fix: Use "lazy"
//                                 🔧 Fix: Remove "initializer: ridiculous" for default eager initialization

// `initializer:` takes one argument:
@attached(accessor, initializer: eager, lazy, names: named(get), named(set))
//                               ˆ ❌ Error: multiple arguments unsupported.
//                                 🔧 Fix: Remove "eager"
//                                 🔧 Fix: Remove "lazy"

Of course, I'll have to look into how to implement these.
Do you think I should add fix-its to the proposal?

Are there any thoughts about the proposed naming?

For example, initialization: could also work. So instead of

@attached(accessor, initializer: lazy, names: named(get))

it could be

@attached(accessor, initialization: lazy, names: named(get))

Maybe it's slightly clearer because it's less likely that someone thinks this argument refers to the init(...) initializers in the enclosing type. On the other hand, the context is a macro role declaration; and macro authors are very much focusing on their macro, not on what's happening elsewhere.

Do you have any preference, or additional ideas? Does the proposed declaration API feel ergonomic?

Moving features out of the compiler into the library seems like a good idea. Keep up the good work!

I'd think fix-its could come later.

Question, though. Can a strong reference cycle form if the lazy initialiser refers to self?

It is a sad day when there are no opinions about naming (at least, yet)!

I do like the declarative tone of initialization more. lazy as a name is clear. An alternative to eager could be immediate, but that's really six of one, half a dozen of another.

Thanks for your thoughts and kind words! I'd like to expand my thoughts regarding using lazy in the motivation, and also address your question about retain cycles.

Replacing lazy is not in scope

For expectation management, this pitch does not propose to move the lazy feature out of the compiler. That could be handled separately later. It would need some special handling to avoid source-breaking changes because lazy is a keyword as opposed to a @lazy attribute. Some possible options:

  • Deprecation warnings
  • Migration tooling
  • Compiler auto-converts the lazy keyword into @lazy to keep existing code compatible

I'm sure it would be nice to remove most of the code related to lazy from the compiler. Just the function to synthesize the getter is more than 100 lines long[1]. Plus, there are lots of conditions checking for lazy sprinkled all over the place 🙂

Another benefit of a macro could be to make it easier to improve lazy properties. For example, currently we cannot use ~Copyable values. There is no useful diagnostic for this, and we cannot expand the synthesized getter:

struct NonCopyable: ~Copyable { }

struct Container: ~Copyable {
    // ❌ Compiler crash
    lazy var brokenValue = NonCopyable()
}

A macro produces normal code: The compiler would provide a diagnostic and we could inspect the expansion to find out more:

struct NonCopyable: ~Copyable { }

struct Container: ~Copyable {
    // expansion for `@lazy var value = NonCopyable()`:
    var value: NonCopyable {
        get {
            if let _value {
                return _value
            }
            let initialValue = NonCopyable()
            _value = initialValue
            //       ^ 🛑 Implicit conversion to 'NonCopyable?' is consuming
            return initialValue
        }
    }
    private var _value: NonCopyable?
}

Maybe a @lazy macro could produce a borrow accessor to fix this particular issue...

OK, I'm not super familiar with all the new ownership features, but at least this would compile:

struct NonCopyable: ~Copyable { }

struct Container: ~Copyable {
    @lazy var value: NonCopyable = NonCopyable()

    // Expansion:
    var value: NonCopyable! {
        mutating borrow {
            if _value == nil {
                _value = NonCopyable()
            }
            return _value
        }
    }
    private var _value: NonCopyable!
}

The IUOs are not great, but there are currently some limitations with borrowing the value of an Optional's .some case. Or it may be a skill issue 😉

On reference cycles

Usually not, but that depends on the expanded code. If the macro just moves the initializer expression into a get accessor as the examples in the pitch do, it is no different from writing the getter by hand. There is no closure involved where self is captured.

Of course, the macro author may introduce a reference cycle by capturing self somehow.

Contrived example
@Computed var value = V(foo: self.foo)

// expansion:
var value: V {
    get {
        if let make = _valueMaker {
            return make()
        }
        let make = { return V(foo: self.foo) }
        //                         ^ `self` captured here
        _valueMaker = make
        //            ^ `make` with captured `self` escapes here, creating a nice retain cycle
        return make()
    }
}
private var _valueMaker: (() -> V)?
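
The difference between the two expansions can be observed with hand-written classes. This is a sketch, not macro output; `DeinitLog` and both class names are hypothetical:

```swift
// Hypothetical helper that records deallocations.
final class DeinitLog { var count = 0 }

// Mirrors the pitch's `@Lazy` expansion: the initializer expression runs inside the getter.
final class InlineInitializer {
    let log: DeinitLog
    let foo = 21
    private var _value: Int?
    init(log: DeinitLog) { self.log = log }

    var value: Int {
        if let cached = _value { return cached }
        let newValue = self.foo * 2     // plain use of `self`, nothing is captured
        _value = newValue
        return newValue
    }
    deinit { log.count += 1 }
}

// Mirrors the contrived expansion: a stored closure captures `self` strongly.
final class StoredClosure {
    let log: DeinitLog
    let foo = 21
    private var _valueMaker: (() -> Int)?
    init(log: DeinitLog) { self.log = log }

    var value: Int {
        if let make = _valueMaker { return make() }
        let make = { self.foo * 2 }     // `self` captured here
        _valueMaker = make              // `self` -> closure -> `self`
        return make()
    }
    deinit { log.count += 1 }
}

func useInline(_ log: DeinitLog) { _ = InlineInitializer(log: log).value }
func useStored(_ log: DeinitLog) { _ = StoredClosure(log: log).value }

let deinitLog = DeinitLog()
useInline(deinitLog)
let afterInline = deinitLog.count   // the inline variant was deallocated
useStored(deinitLog)
let afterStored = deinitLog.count   // unchanged: the closure variant leaked
```

Only the variant that stores a closure keeps itself alive. Capturing `[unowned self]` or `[weak self]` in the closure would be the usual way for a macro author to avoid this.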

The pitch does not change how macros expand. What's new is that the initializer expression may use self. Macro authors explicitly opt in by adopting initialization: lazy. If they do, they have to ensure that their expansion does not introduce a cycle.


  1. synthesizeLazyGetterBody in TypeCheckStorage.cpp