Pattern/idiom for having a struct with a deferred/lazy/cached member?

I drank the struct/value Kool-Aid in Swift, and now I have an interesting problem I don't know how to solve. I have a struct that is a container, e.g.

struct Foo {
    var bars:[Bar]
}

As I make edits to this, I create copies so that I can keep an undo stack. So far so good. Just like the good tutorials showed. There are some derived attributes that I use with this guy though:

struct Foo {
    var bars:[Bar]

    var derivedValue:Int {
        ...
    }
}

In recent profiling, I noticed that computing derivedValue is a) kind of expensive and often redundant, and b) not even necessary in a variety of use cases.

In my classic OOP rut, I would make this a memoizing/lazy/caching variable (I've found that the term memoize can mean slightly different things to different people, so read it loosely, please). Basically, have it be nil until called upon, compute it once and store it, and return the stored result on future calls. Since I'm following a "make copies to edit" pattern, the invariant wouldn't be broken.

But I can't figure out how to apply this pattern if it is a struct. I can do this:

struct Foo {
    var bars:[Bar]
    lazy var derivedValue:Int = self.computeDerivation()
}

which works, until more struct behavior references that value, e.g.

struct Foo {
    var bars:[Bar]
    lazy var derivedValue:Int = self.computeDerivation()

    func anotherDerivedComputation() -> Int {
        return self.derivedValue / 2
    }
}

At this point, the compiler complains because anotherDerivedComputation causes a change to the receiver (reading a lazy var may write to it) and therefore needs to be marked mutating. It just feels wrong to mark an accessor as mutating. But for grins, I tried it, and that creates a new raft of problems. Now anywhere I have an expression like

XCTAssertEqual(foo.anotherDerivedComputation(), 20)

the compiler complains because a parameter is implicitly a non-mutating let value, not a var.

Is there a pattern I'm missing for having a struct with a deferred/lazy/cached member?

The normal way to handle this is to hide the internal mutations in a reference type. The following code is an oversimplification to illustrate the gist of the idea; in practice it is much harder to do correctly, because you have to avoid bugs caused by aliasing the reference. So it may not be worth the effort in your case. I have an internal cache micro-framework that helps with a correct implementation, but it is not documented well enough to be open-sourced.

struct Foo {
    
    class Cache {
        var derivedValue: Int?
    }
    
    let cache = Cache()
    var bars:[Bar] = [] { didSet { cache.derivedValue = nil } }
    var derivedValue: Int {
        if cache.derivedValue == nil { cache.derivedValue = computeDerivation() }
        return cache.derivedValue!
    }
}
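
To make the point concrete, here is a self-contained sketch of the same idea. The Bar type with a weight field and the summing computeDerivation are stand-ins I am assuming, since the thread never shows them. What it illustrates is that the accessor no longer needs to be mutating, so it can be called on a let value:

```swift
struct Bar { var weight: Int }   // hypothetical element type, not from the thread

struct Foo {
    final class Cache { var derivedValue: Int? }

    let cache = Cache()
    var bars: [Bar] = [] { didSet { cache.derivedValue = nil } }

    var derivedValue: Int {
        if let cached = cache.derivedValue { return cached }
        let value = computeDerivation()
        cache.derivedValue = value
        return value
    }

    // Hypothetical stand-in for the expensive derivation.
    func computeDerivation() -> Int {
        bars.reduce(0) { $0 + $1.weight }
    }

    // Non-mutating: only the referenced Cache object changes, never the struct.
    func anotherDerivedComputation() -> Int {
        derivedValue / 2
    }
}

let foo = Foo(bars: [Bar(weight: 15), Bar(weight: 25)])
print(foo.anotherDerivedComputation())   // prints "20"
```

Note that this version still shares one Cache object between copies of the struct, which is exactly the aliasing hazard mentioned above.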

I reduced the problem to a simpler one: an x,y Point struct that wants to lazily compute and cache the value for r(adius). Following your suggestion, I went with a "reference" wrapper around a closure and came up with the following. I think of it as a "Once" block, a pattern I implemented many moons ago in Smalltalk (though it was easier there because closures were first-class objects you could extend).

import Foundation

class Once<Input,Output> {
	let block:(Input)->Output
	private var cache:Output? = nil
	
	init(_ block:@escaping (Input)->Output) {
		self.block = block
	}
	
	func once(_ input:Input) -> Output {
		if self.cache == nil {
			self.cache = self.block(input)
		}
		return self.cache!
	}
}

struct Point {
	let x:Float
	let y:Float
	private let rOnce:Once<Point,Float> = Once {myself in myself.computeRadius()}
	
	init(x:Float, y:Float) {
		self.x = x
		self.y = y
	}
	
	var r:Float {
		return self.rOnce.once(self)
	}
	
	func computeRadius() -> Float {
		return sqrtf((self.x * self.x) + (self.y * self.y))
	}
}

let p = Point(x: 30, y: 40)

print("p.r \(p.r)")

I chose to have Once take an input because initializing it with a closure that references self is a pain: self doesn't exist yet when the stored property is initialized. It was easier to defer that linkage to the call site (the var r:Float accessor), which passes self in.

Can't you just do this?

import Foundation

final class Cache<Value> {
    lazy private(set) var value = generateValue()
    let generateValue: () -> Value
    init(_ generateValue: @escaping () -> Value) {
        self.generateValue = generateValue
    }
}

struct Point {
    let x: Float 
    let y: Float 
    private let cached: Cache<Float> 
    var radius: Float { 
        return cached.value 
    } 
    init(x: Float, y: Float) { 
        self.x = x 
        self.y = y 
        cached = .init {
            sqrtf((x * x) + (y * y)) 
        } 
    } 
} 

let p = Point(x: 30, y: 30)
print("p.radius = \(p.radius)")

If this is your use case, converting computeRadius to a memoized function of x and y would probably be your best bet. If you really need to maintain an internal cache, you need to address the aliasing issue. The following modification of my original response takes care of the aliasing issue, but still isn't thread-safe:

struct Foo {
    
    class Cache {
        var derivedValue: Int?
    }
    
    var cache = Cache()
    var bars:[Bar] = [] {
        didSet {
            // Invalidate cache if there is a change that affects cached value:
            if isKnownUniquelyReferenced(&cache) {
                cache.derivedValue = nil
            } else {
                // The value has been copied since the last mutation, we need a new
                // cache object to avoid aliasing: (This is a mini copy on write.)
                cache = Cache()
            }
        }
    }
    var derivedValue: Int {
        if cache.derivedValue == nil { cache.derivedValue = computeDerivation() }
        return cache.derivedValue!
    }
}
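
For the memoized-function route mentioned before the code above, a minimal sketch might look like the following. The memoize helper and the static radius property are names I am making up for illustration. The dictionary cache is unbounded and, like the struct version, not thread-safe:

```swift
// Hypothetical helper: wraps a pure function of a Hashable key in a dictionary cache.
func memoize<Key: Hashable, Value>(_ compute: @escaping (Key) -> Value) -> (Key) -> Value {
    var cache: [Key: Value] = [:]
    return { key in
        if let hit = cache[key] { return hit }
        let value = compute(key)
        cache[key] = value
        return value
    }
}

struct Point: Hashable {
    let x: Float
    let y: Float

    // One shared cache keyed by the point's value, so copies cannot alias anything.
    private static let radius: (Point) -> Float = memoize { p in
        (p.x * p.x + p.y * p.y).squareRoot()
    }

    var r: Float { Point.radius(self) }
}

let p = Point(x: 30, y: 40)
print("p.r \(p.r)")   // prints "p.r 50.0"
```

Because the cache lives outside the struct and is keyed by value, the struct stays a plain value type with no stored reference.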

@Nobody1707, that is even simpler. Nice. I assume the problem you were addressing was how I chose to defer the self issue? In this particular case it works nicely, because you have access to x and y in the init block without having to go through self, but a more complicated derivation might have a legitimate need to reference self.

@hooman. Can you explain "the aliasing issue"?

Aliasing can happen if you have a mutable struct that also stores references. I think the shortest and fastest way to illustrate the issue is with some sample code:

    class Bar { var name = "" }
    struct Foo { var name = "", bar = Bar() }

    var f1 = Foo()
    f1.name = "f1"
    f1.bar.name = "f1.bar"

    var f2 = f1 // Here, only reference to `bar` object is copied not the object itself.
    f2.name = "f2"
    f2.bar.name = "f2.bar" // This modifies the same bar instance now shared (aliased) between f1 & f2

    print(f1.name, f1.bar.name) // prints "f1 f2.bar" instead of the expected "f1 f1.bar"

Here is the Wikipedia definition:

In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated with all aliased names, which may not be expected by the programmer.
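
Applied to the cache case, the same effect makes one copy return a value computed from another copy's data. This is a sketch using the simple shared-cache Foo from earlier in the thread, with a made-up Bar that has a weight and a sum standing in for computeDerivation:

```swift
struct Bar { var weight: Int }   // hypothetical element type

struct Foo {
    final class Cache { var derivedValue: Int? }

    let cache = Cache()   // copied by reference whenever the struct is copied
    var bars: [Bar] = [] { didSet { cache.derivedValue = nil } }

    var derivedValue: Int {
        if let cached = cache.derivedValue { return cached }
        let value = bars.reduce(0) { $0 + $1.weight }   // stand-in for computeDerivation()
        cache.derivedValue = value
        return value
    }
}

let f1 = Foo(bars: [Bar(weight: 10)])
var f2 = f1                       // f1.cache and f2.cache are the same object
f2.bars.append(Bar(weight: 5))    // clears the shared cache
let second = f2.derivedValue      // 15, stored in the shared cache
let first = f1.derivedValue       // also 15, although f1.bars still sums to 10
print(first, second)              // prints "15 15"
```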

If you want to have cached computed properties in a struct, you need to make it a reference type with value semantics (struct backed by an object) with copy-on-write and full multithreaded access. Here is an example:
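
The sketch below is one possible reading of that description, not the poster's actual code. It assumes an NSLock-guarded backing object (the Storage name, the Bar type, and the summing derivation are all invented for illustration) and detaches the backing object on mutation when it is shared with another copy:

```swift
import Foundation

struct Bar { var weight: Int }   // hypothetical element type

struct Foo {
    // Backing object: every read or write of the cached value takes the lock.
    private final class Storage {
        private let lock = NSLock()
        private var derivedValue: Int?

        func value(orCompute compute: () -> Int) -> Int {
            lock.lock()
            defer { lock.unlock() }
            if let cached = derivedValue { return cached }
            let value = compute()   // runs under the lock, so it must not re-enter
            derivedValue = value
            return value
        }

        func invalidate() {
            lock.lock()
            defer { lock.unlock() }
            derivedValue = nil
        }
    }

    private var storage = Storage()

    var bars: [Bar] {
        didSet {
            if isKnownUniquelyReferenced(&storage) {
                storage.invalidate()   // no other copy can see this object
            } else {
                storage = Storage()    // shared with a copy: detach first
            }
        }
    }

    init(bars: [Bar] = []) { self.bars = bars }

    var derivedValue: Int {
        storage.value { bars.reduce(0) { $0 + $1.weight } }   // hypothetical derivation
    }
}

let original = Foo(bars: [Bar(weight: 10)])
var copy = original
copy.bars.append(Bar(weight: 5))
print(original.derivedValue, copy.derivedValue)   // prints "10 15"
```

The lock only protects concurrent reads of the cache through shared copies. Mutating the same var from several threads at once is still the caller's problem, as with any Swift value type.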
