Memoization of Swift properties

reuschj · July 26, 2020, 7:50pm

I wanted to make a pitch (my first) for the ability to memoize computed properties. What follows is a summary, but I also have a much more detailed version here.

The memoization of computed properties would help to speed up some Swift programs relying on computed values (especially expensive ones) by only re-calculating the results when one of the properties they depend on has changed.

Anyone familiar with React and React hooks will know one of the most useful hooks is useMemo, which provides this exact functionality for rendering UIs. While memoization would be broadly applicable to any Swift program, it may be especially helpful for use in SwiftUI, where preventing unnecessary re-renders can help optimize app performance.

In Swift, we can already achieve similar results, but it takes quite a bit of boilerplate.

Note: For these examples, imagine that getting the area is actually a much more expensive calculation. Getting the area is trivial, but for the sake of demonstration, consider it a proxy for something more complex.

Non-lazy Memoizing

First, let's look at a box with memoized area. It stores an area property with a private setter that's first calculated on initialization (not lazy). The didSet observers for height and width will actively recompute the area after they are set.

struct Box {
    var height: Double {
        didSet { // On setting height, the updated area is actively recalculated
            setArea()
        }
    }
    var width: Double {
        didSet { // On setting width, the updated area is actively recalculated
            setArea()
        }
    }
    
    private(set) var area: Double = 0
    private mutating func setArea() {
        area = width * height
    }

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
        setArea()
    }
}

Lazy Memoizing

Next, let's look at a box with memoized and lazy area. It stores a private area property that's only calculated on first use. The didSet observers for height and width don't actively recompute area. Rather, they invalidate the stored area by setting it to nil. When the getter for area finds a value, it uses it. When it finds nil, it recomputes the area and stores it in the private property for later use.

struct Box {
    var height: Double {
        didSet { // On setting height, the memoized area is invalidated
            _area = nil
        }
    }
    var width: Double {
        didSet { // On setting width, the memoized area is invalidated
            _area = nil
        }
    }

    // Private var to store memoized value
    private var _area: Double? = nil

    var area: Double {
        mutating get { // Area is calculated lazily, only as needed (though this implementation means it can't be used on `let` constants)
            guard let area = _area else {
                let newArea = width * height
                _area = newArea
                return newArea
            }
            return area
        }
    }

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
    }
}

This option is ideal for types that will always be var variables, but unsuitable for use with types that could be declared as let constants.

Proposed solution

The proposal is to create a simplified syntax to tell the compiler to synthesize the boilerplate outlined above. Let's take a look at what the structs above might look like with memo and lazy memo keywords (this syntax is just my initial proposal, but there are other options discussed here):

struct Box {
    var height: Double
    var width: Double
    memo var area: Double { |width, height| in width * height }
}

... and the lazy version:

struct Box {
    var height: Double
    var width: Double
    lazy memo var area: Double = { |width, height| in width * height }()
}

You can also read a much more detailed version of the above here.

stevapple · July 27, 2020, 6:55pm

Here are my humble opinions:

Providing a list of properties used for memoizing may lead to inconsistency. Consider the following case:

struct Box {
    var height: Double
    var width: Double
    memo var area: Double { |width| in width * height }
}
var box = Box(height: 1.0, width: 2.0)
var areas = [box.area]
box.height = 2.0
areas.append(box.area)
box.width = 1.0
areas.append(box.area)
print(areas)

Shall the compiler throw an error, or build it successfully? And what will the result be like?

[2.0, 2.0, 2.0] // Use the current value of box.height
[2.0, 2.0, 1.0] // Fix the value of box.height
[2.0, 4.0, 2.0] // The normal and reasonable result, but what does memo do?

Any possible result is somehow confusing. Therefore, the properties used for memoizing should (and must) be inferred by the compiler instead of code declarations.

I personally doubt the effect of memo, since in the ideal case, memo doesn’t cause any behavioral change. This means we can actually apply such a strategy to every computed property. If we manage to handle it properly, it won’t cause a performance decline. Then it’s more likely to become a general feature controlled by a compile flag instead of keywords in the code.
I prefer @memo to memo personally. But based on the two reasons above, I don’t think it’s worth a keyword.

In a nutshell, I suggest working on a way to enable memoizing on every computed property, which can largely speed up the calculations with a low potential performance loss in the worst case.
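To make the first of the three outcomes concrete, here is a minimal sketch that hand-writes what a lazy memo with a `|width|` dependency list might behave like. This is only one possible desugaring, not the pitch's defined semantics; the names `PartialBox` and `cachedArea` are hypothetical.

```swift
// Hypothetical hand-written equivalent of `memo var area { |width| in width * height }`.
struct PartialBox {
    // Deliberately NOT observed: changing height never invalidates the cache.
    var height: Double
    // Only width is treated as a dependency.
    var width: Double {
        didSet { cachedArea = nil }
    }

    private var cachedArea: Double? = nil

    var area: Double {
        mutating get {
            if let cached = cachedArea { return cached }
            let computed = width * height // reads the current height when recomputing
            cachedArea = computed
            return computed
        }
    }

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
    }
}

var partialBox = PartialBox(height: 1.0, width: 2.0)
var partialAreas = [partialBox.area]
partialBox.height = 2.0                  // ignored by the cache
partialAreas.append(partialBox.area)     // stale value is returned
partialBox.width = 1.0                   // invalidates the cache
partialAreas.append(partialBox.area)     // recomputed with the current height
print(partialAreas) // [2.0, 2.0, 2.0]
```

Under this reading the middle value is silently stale, which is the kind of inconsistency an inferred dependency list would avoid.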

Karl · July 27, 2020, 8:57pm

It seems to me that you want a resettable lazy. That’s a common request; there are some issues doing that with property wrappers today but hopefully we’ll be able to do it one day.
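For illustration, a resettable lazy can be roughly sketched as a property wrapper today. This is a minimal sketch under stated assumptions, not an existing standard-library feature: `ResettableLazy`, `Report`, `CallCounter` and `expensiveValue()` are all hypothetical names. It also shows some of the limits: the initial-value expression cannot refer to `self`, the reset is manual, and nothing here is thread-safe.

```swift
// Hypothetical wrapper: caches a value on first read and can be reset by hand.
@propertyWrapper
struct ResettableLazy<Value> {
    private let makeValue: () -> Value
    private var storage: Value?

    // The autoclosure defers the initial-value expression until first access.
    init(wrappedValue: @autoclosure @escaping () -> Value) {
        self.makeValue = wrappedValue
    }

    var wrappedValue: Value {
        mutating get {
            if let value = storage { return value }
            let value = makeValue()
            storage = value
            return value
        }
        set { storage = newValue }
    }

    // Drops the cached value so the next read recomputes it.
    mutating func reset() { storage = nil }
}

// Hypothetical counter used only to show how often the value is computed.
final class CallCounter { var count = 0 }
let calls = CallCounter()

func expensiveValue() -> Int {
    calls.count += 1
    return 42
}

final class Report {
    @ResettableLazy var total: Int = expensiveValue()

    // The synthesized backing storage `_total` is reachable inside the type.
    func invalidate() { _total.reset() }
}

let report = Report()
_ = report.total      // first access runs expensiveValue()
_ = report.total      // served from the cache
report.invalidate()
_ = report.total      // recomputed after the reset
print(calls.count)    // 2
```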

suyashsrijan · July 27, 2020, 9:25pm

There is a way to reset lazy var if you want. I think it's probably a bug (EDIT: now fixed), but you can access the underlying storage of a lazy var using $__lazy_storage_$_{property_name} and set it to nil. For example:

class A {
    lazy var foo: Int = {
        print("A")
        return 0
    }()
    
    func resetLazy() {
        $__lazy_storage_$_foo = nil
    }
}

let a = A()
print(a.foo)
a.resetLazy()
print(a.foo)

// Prints:
// A
// 0
// A
// 0

Saklad5 · July 27, 2020, 9:35pm

That seems like undefined behavior. I wouldn’t expect that to work across Swift versions, or even across platforms.

Karl · July 27, 2020, 9:47pm

IIRC, lazy initialisation is also thread-safe. Will this also reset the token for concurrent reads?

suyashsrijan · July 27, 2020, 9:47pm

Karl · July 27, 2020, 9:48pm

My mistake, it isn’t thread-safe.

I wonder if we could maybe remodel the existing lazy support as a built-in property wrapper. That would give you a cleaner way to access the underlying storage.

It would need to use magic, but maybe that’s okay.

reuschj · July 28, 2020, 3:44am

Providing a list of properties used for memoizing may lead to inconsistency. Consider the following case:

I somewhat agree. I think you might be right that by default (if programmer omits a dependency list), Swift can just assume it should recalculate based on any property or variable captured. This is probably the most error-proof default.

However, I think there are still cases where the developer may want the ability to maintain their own list for two reasons:

As pointed out here, there may be times your closure uses a property that doesn't affect the return value. If so, you may want to omit this value to prevent it triggering a recalculation.
There may be times where you want to omit some properties that do affect the return value, if you can still guarantee your logic covers this omission.

For example,

struct SquareBox {
    private(set) var height: Double
    private(set) var width: Double

    private var updateFlag: Bool = false

    mutating func setSides(to length: Double) {
        self.height = length
        self.width = length
        updateFlag = !updateFlag
    }
    
    memo var area: Double { |updateFlag| in height * width }
}

Ignore the fact that this might not be the best way to set up a square... just pay attention to the fact that height and width (the dependencies) are guaranteed to change together. I'm only letting you change side length via a method that sets both. In this (non-lazy) version, if I let Swift manage my dependencies, it would recalculate twice during my setSides call. So instead, I'm creating a private Bool to flip once I know both sides have updated. Then area recalculates only once at the end.
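As a rough sketch of what this could look like when written by hand today, the flag's didSet observer can own the recalculation. The type name `ManualSquareBox`, the `init(side:)` initializer and the `recalculations` counter are hypothetical and exist only for demonstration.

```swift
// Hypothetical hand-written equivalent of memoizing on `updateFlag` only.
struct ManualSquareBox {
    private(set) var height: Double
    private(set) var width: Double
    private(set) var area: Double

    // Demonstration-only counter: how many times area was recomputed.
    private(set) var recalculations = 0

    // Flipping this flag is the only thing that triggers a recalculation.
    private var updateFlag = false {
        didSet {
            area = height * width
            recalculations += 1
        }
    }

    init(side: Double) {
        height = side
        width = side
        area = side * side
    }

    mutating func setSides(to length: Double) {
        height = length      // no recalculation here
        width = length       // nor here
        updateFlag.toggle()  // exactly one recalculation, after both sides changed
    }
}

var square = ManualSquareBox(side: 2)
square.setSides(to: 3)
print(square.area, square.recalculations) // 9.0 1
```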

Again, this may be a dumb implementation of a square, but it highlights that cases like these will pop up when you might have to manage your dependency list for optimal efficiency.

However, in a majority of cases, I think we can assume that Swift could auto-generate the dependency list based on whatever was captured in the closure.

suyashsrijan · July 28, 2020, 3:46am

You mean something like this?

reuschj · July 28, 2020, 3:55am

Yup.. it would be nice if memo was just the default behavior. But there may be a few drawbacks to this (at least in the short term):

My solution doesn't really address reference types (yet... someone may have a good idea... if so, please chime in!). So until someone has a way to guarantee the memoized value will always update (even when the dependency is a reference), it may be detrimental to existing code to apply to everything. At the moment, it's a good solution to opt into some of the time, but it isn't really effective for all situations for now (even though that would be nice).
Sometimes you do need a bit more manual control.

I agree with this. The use of a keyword in my proposal is mostly to stay aligned with lazy, but in many ways an attribute may be preferable.

reuschj · July 28, 2020, 4:03am

Yes and no. There are two parts to the ask and one of them could be described as a resettable lazy (in part... I'd also like to not worry about manually resetting the lazy but rather let the lazy reset itself as input changes).

For the other half of the ask, there are times where the laziness is not important and you explicitly don't want the lazy part of it at all:

let box = Box(height: 2, width: 4)
print(box.area) // Error: Cannot use mutating getter on immutable value: 'box' is a 'let' constant

Here the lazy property has prevented someone who wanted to use Box as a let constant from accessing the area (which just feels wrong in this example). When you define any lazy property, you are making it clear that the type is only intended to be used as a mutable type... and that's fine sometimes.

In the non-lazy version, the type can still be used as a let constant... my memoized "computed property" has effectively become a fixed stored property. But it still has the benefits of memoization for users who declare it as a mutable var variable.

So, the answer isn't one or the other, but both for different scenarios.
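A compact sketch of the non-lazy variant shows both uses side by side. The name `EagerBox` is hypothetical; the structure mirrors the non-lazy Box from the original post.

```swift
// Hypothetical compact version of the non-lazy memoized box.
struct EagerBox {
    var height: Double { didSet { area = width * height } }
    var width: Double { didSet { area = width * height } }

    // Stored, so reading it never mutates the value.
    private(set) var area: Double

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
        self.area = width * height
    }
}

// Works as a let constant because the getter is not mutating.
let fixedBox = EagerBox(height: 2, width: 4)
print(fixedBox.area) // 8.0

// Still updates when declared as a var and a dependency changes.
var mutableBox = EagerBox(height: 2, width: 4)
mutableBox.height = 3
print(mutableBox.area) // 12.0
```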

reuschj · July 28, 2020, 4:12am

I like the looks of that... What "resettable lazy" wrappers can't really do is watch values for change. (In general, this is a limitation of property wrappers, which really can't, and probably shouldn't, have this type of access to external properties.) So, all very good and helpful stuff for improvements to lazy... but memoization based on dependency change is a slightly different can of worms.

reuschj · July 28, 2020, 4:20am

I'll highlight this section of the more detailed pitch as the biggest real challenge here: what to do about reference types. Since a type really only contains the reference pointer, it's quite hard to monitor for any changes. If anyone has ideas, it would be very helpful.
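The problem can be reproduced with the hand-written pattern. In this minimal sketch (the names `Dimensions` and `RefBox` are hypothetical), the dependency is a class instance, so mutating it through another reference never triggers the struct's didSet observer and the memoized value goes stale.

```swift
// Hypothetical reference-type dependency.
final class Dimensions {
    var height: Double
    var width: Double

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
    }
}

struct RefBox {
    // didSet only fires when the reference itself is replaced.
    var dimensions: Dimensions {
        didSet { cachedArea = nil }
    }

    private var cachedArea: Double? = nil

    var area: Double {
        mutating get {
            if let cached = cachedArea { return cached }
            let computed = dimensions.width * dimensions.height
            cachedArea = computed
            return computed
        }
    }

    init(dimensions: Dimensions) {
        self.dimensions = dimensions
    }
}

let sharedDimensions = Dimensions(height: 2, width: 4)
var refBox = RefBox(dimensions: sharedDimensions)
let areaBefore = refBox.area     // computed and cached
sharedDimensions.height = 10     // mutates the object; RefBox is never notified
let areaAfter = refBox.area      // stale: a fresh computation would give 40.0
print(areaBefore, areaAfter)     // 8.0 8.0
```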

rumnat · July 30, 2020, 9:44am

React Hooks use that approach to use functional components instead of class components. So does Jetpack Compose in the Android world because all UI parts are functions but they need to manage state and re-rendering. In SwiftUI it's about structures and @State/@StateObject/@ObservedObject property wrappers

idrougge · July 30, 2020, 11:19pm

That's not how a computed property is supposed to work. Instead of worrying about recalculating whenever one of the dependencies changes, just recalculate it at the next access.

Jon_Shier · July 31, 2020, 1:56am

This seems like something you should be able to do with a property wrapper, I'm just not sure how you would observe the dependent properties.

muukii · July 31, 2020, 5:45am

I thought so too, but the current property wrapper spec can't access the enclosing self in a struct. That is only available for class types.

In my view, to compute and memoize a value from values inside a struct, we need a new language-level feature.

reuschj · July 31, 2020, 6:01am

You don't really have to call it a "computed property". A memoized property is basically both computed and stored. As you said, the lazy version would compute on next access. But, since it has to store (mutate) on access, the lazy version can never be used with a let constant. So, for types that could be let constants (and are less concerned with laziness), you would need to recalculate on dependency change (or, if it's a let constant, just never recalculate).

CTMacUser · July 31, 2020, 6:31pm

Could we do a struct that wraps a class, like Array/Set/Dictionary do?
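One way this might look, as a minimal sketch rather than a worked-out design: the struct keeps its cache in a small class instance, so the getter no longer needs to be mutating and the lazy version works on let constants. The names `AreaCache` and `CachedBox` are hypothetical. Replacing the cache object on every change (instead of clearing the shared one) keeps copies independent, and the sketch is not thread-safe.

```swift
// Hypothetical reference-type storage for the memoized value.
final class AreaCache {
    var area: Double?
}

struct CachedBox {
    // A fresh cache object on each change, so copies never share stale state.
    var height: Double { didSet { cache = AreaCache() } }
    var width: Double { didSet { cache = AreaCache() } }

    private var cache = AreaCache()

    // Non-mutating getter: only the referenced object is written to.
    var area: Double {
        if let cached = cache.area { return cached }
        let computed = width * height
        cache.area = computed
        return computed
    }

    init(height: Double, width: Double) {
        self.height = height
        self.width = width
    }
}

let cachedBox = CachedBox(height: 2, width: 4)
print(cachedBox.area) // 8.0, computed lazily on a let constant

var cachedCopy = cachedBox
cachedCopy.height = 3
print(cachedCopy.area) // 12.0
print(cachedBox.area)  // 8.0, the original is unaffected
```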