I recently watched a great talk by @johannesweiss called "High-performance systems in Swift", in which he discussed the importance of copy-on-write for large value types in high-performance systems (it is already implemented for common Swift types like String, Array, and Dictionary).
However, implementing copy-on-write manually, although pretty straightforward, comes with a lot of boilerplate: you have to create a computed property for each property in the underlying storage, and each computed property's setter needs its own !isKnownUniquelyReferenced(_:) check.
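For reference, the manual pattern described above might look like the following minimal sketch. The type ManualLargeType and its name and count properties are hypothetical, chosen only to show how the uniqueness check gets repeated in every setter.

```swift
// Hypothetical example type; all names here are illustrative only.
struct ManualLargeType {
    // Reference-type backing storage, shared between copies of the struct.
    private final class Storage {
        var name: String
        var count: Int

        init(name: String, count: Int) {
            self.name = name
            self.count = count
        }

        func copy() -> Storage {
            Storage(name: name, count: count)
        }
    }

    private var storage: Storage

    init(name: String, count: Int) {
        storage = Storage(name: name, count: count)
    }

    // One computed property per stored property, each repeating the check.
    var name: String {
        get { storage.name }
        set {
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.copy()
            }
            storage.name = newValue
        }
    }

    var count: Int {
        get { storage.count }
        set {
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.copy()
            }
            storage.count = newValue
        }
    }
}

let a = ManualLargeType(name: "a", count: 1)
var b = a          // shares the same storage as `a`
b.count = 2        // storage is shared, so it is copied before the write
print(a.count, b.count) // 1 2
```

Every additional property would need another get/set pair with the same three-line check, which is the boilerplate in question.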
So I set out to reduce that boilerplate as much as possible while keeping property access and modification feeling natural, and came up with this:
@dynamicMemberLookup
protocol CopyOnWrite {
    associatedtype Storage: AnyObject & Copyable
    var _storage: Storage { get set }
}

extension CopyOnWrite {
    subscript<Value>(dynamicMember keypath: ReferenceWritableKeyPath<Storage, Value>) -> Value {
        get {
            return _storage[keyPath: keypath]
        }
        set {
            // Copy the storage only if another value is also referencing it.
            if !isKnownUniquelyReferenced(&_storage) {
                _storage = _storage.copy()
            }
            self._storage[keyPath: keypath] = newValue
        }
    }
}
Here, Copyable is a simple protocol:
protocol Copyable {
    func copy() -> Self
}
As you can see, instead of writing a computed property for each stored property, we can use key path member lookup to reach into the underlying storage, which is constrained to be a reference type. This means we only need to write the !isKnownUniquelyReferenced(_:) check once.
Now, implementing copy-on-write for a value type looks like this:
struct LargeType: CopyOnWrite {
    typealias Storage = _Storage

    var _storage: _Storage

    init(value: String) {
        self._storage = _Storage(value: value)
    }

    final class _Storage: Copyable { // requires the `copy()` method
        var value: String

        init(value: String) {
            self.value = value
        }

        func copy() -> _Storage {
            return .init(value: self.value)
        }
    }
}
To test this out:
var first = LargeType(value: "first")
print(first.value) // first
var second = first
print(second.value) // first
print(first._storage === second._storage) // true
second.value = "second"
print(first.value) // first
print(second.value) // second
print(first._storage === second._storage) // false
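The test above shows that shared storage is copied on mutation. A complementary check, sketched below, is whether uniquely referenced storage is mutated in place rather than copied every time. This sketch restates the protocol so it is self-contained, and it makes two assumptions: the copy protocol is renamed to the hypothetical CloneableStorage (newer Swift toolchains ship a standard-library marker protocol that is also called Copyable, so a different name avoids confusion), and Document is a made-up conforming type.

```swift
// Hypothetical stand-in for the Copyable protocol above, renamed for this sketch.
protocol CloneableStorage: AnyObject {
    func copy() -> Self
}

@dynamicMemberLookup
protocol CopyOnWrite {
    associatedtype Storage: CloneableStorage
    var _storage: Storage { get set }
}

extension CopyOnWrite {
    subscript<Value>(dynamicMember keyPath: ReferenceWritableKeyPath<Storage, Value>) -> Value {
        get { _storage[keyPath: keyPath] }
        set {
            // Copy only when another value shares this storage.
            if !isKnownUniquelyReferenced(&_storage) {
                _storage = _storage.copy()
            }
            _storage[keyPath: keyPath] = newValue
        }
    }
}

// Hypothetical conforming type, used only for this check.
struct Document: CopyOnWrite {
    final class Storage: CloneableStorage {
        var title: String
        init(title: String) { self.title = title }
        func copy() -> Storage { Storage(title: title) }
    }

    var _storage: Storage
    init(title: String) { _storage = Storage(title: title) }
}

// Case 1: a single owner. The storage identity should not change on write.
var unique = Document(title: "draft")
let idBefore = ObjectIdentifier(unique._storage)
unique.title = "final"
let idAfter = ObjectIdentifier(unique._storage)
let mutatedInPlace = (idBefore == idAfter)

// Case 2: two owners. The writer should get its own storage.
var shared = unique
shared.title = "copy"
let copiedOnWrite = (unique._storage !== shared._storage)

print(mutatedInPlace, copiedOnWrite) // expected: true true
```

If the first value printed were false, the setter would be copying on every write, which would defeat the purpose of the uniqueness check.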
Is this a valid approach? Are all the rules of copy-on-write respected?
I have seen a copy-on-write implementation built on a property wrapper. Although it involves even less boilerplate, modifying the storage through projectedValue feels awkward.
As Swift continues to move into the server space (especially with the introduction of actors and other concurrency features), performance will become increasingly important. As we wait for ownership semantics, it's important that users are able to use copy-on-write semantics for their large types succinctly.
Here's the full gist – I'd love to hear your thoughts!