Using an indirect enum to implement copy-on-write

(Ole Begemann) #1

Question 1: do indirect enums use copy-on-write under the hood?

Question 2: if so, could we use an indirect enum to effectively implement a universal generic wrapper type that can make any struct copy-on-write?

Background: If you want copy-on-write behavior for a struct, you have to implement it manually. It's not particularly difficult if you know the pattern, but it requires a fair bit of boilerplate, and you have to take care to perform the isKnownUniquelyReferenced check in every mutating method.

Details: Sample code for a struct implementing copy-on-write
struct COW: CustomStringConvertible {
    class Storage {
        var value: Int

        init(value: Int) {
            self.value = value
        }

        func copy() -> Storage {
            return Storage(value: value)
        }
    }

    private var storage: Storage
    private var _storageForWriting: Storage {
        mutating get {
            if !isKnownUniquelyReferenced(&storage) {
                print("Before making copy of storage: \(self)")
                storage = storage.copy()
                print("After making copy of storage: \(self)")
            }
            return storage
        }
    }

    var name: String
    var value: Int {
        get { return storage.value }
        set { _storageForWriting.value = newValue }
    }

    init(name: String, value: Int) {
        self.name = name
        self.storage = Storage(value: value)
    }

    var description: String {
        return "\(name) – value: \(value) – storage: \(ObjectIdentifier(storage))"
    }
}

var cow = COW(name: "cow", value: 0)
var copy = cow
copy.name = "copy"

print("""
    Before mutation
    \(cow)
    \(copy)
    ---
    """)

print("Mutation")
copy.value += 1

print("""
    ---
    After mutation
    \(cow)
    \(copy)
    """)

@Chris_Eidhof and I were wondering if you could use an indirect enum to effectively get copy-on-write behavior for free.

Suppose we want to add copy-on-write to this struct:

struct MyValue {
    var a: Int
    var b: Int
    var c: Int
}

MemoryLayout<MyValue>.size // 24

The idea is to write a generic single-case indirect enum whose purpose is to box any value in a reference:

// Idea: use an indirect enum to get automatic copy-on-write for a large struct
/// A generic enum to wrap any value type indirectly.
/// Provides copy-on-write behavior for `Wrapped`.
indirect enum CopyOnWrite<Wrapped>: CustomStringConvertible {
    case payload(Wrapped)

    init(_ value: Wrapped) {
        self = .payload(value)
    }

    var unbox: Wrapped {
        get {
            switch self {
            case .payload(let v): return v
            }
        }
        set {
            self = .payload(newValue)
        }
    }

    var description: String {
        return "CopyOnWrite: \(unbox)"
    }
}

As expected, the size of CopyOnWrite<MyValue> is 8 bytes because indirect boxes everything in an internal class (or something like a class):

MemoryLayout<CopyOnWrite<MyValue>>.size // 8

To test this out, I created an instance of CopyOnWrite<MyValue> and made a copy via assignment. I then used unsafeBitCast to verify that both variables contain the same address (I'm not 100% sure this is the best way to do this check):

// Test it out
var myValue = CopyOnWrite(MyValue(a: 1, b: 2, c: 3))
var copy = myValue

// Before mutation, myValue and copy should have the same address
let originalAddressBeforeMutation = unsafeBitCast(myValue, to: OpaquePointer.self)
let copyAddressBeforeMutation = unsafeBitCast(copy, to: OpaquePointer.self)
assert(originalAddressBeforeMutation == copyAddressBeforeMutation)

The assertion passes, i.e. both addresses are 0x00007fa7db536e00.

Then I mutate the value wrapped in the copy and check the addresses again:

// Mutate the copy (or the original, doesn't matter)
copy.unbox.a = 23
assert(copy.unbox.a == 23)
assert(myValue.unbox.a == 1) // is still 1

let originalAddressAfterMutation = unsafeBitCast(myValue, to: OpaquePointer.self)
let copyAddressAfterMutation = unsafeBitCast(copy, to: OpaquePointer.self)
assert(originalAddressAfterMutation == originalAddressBeforeMutation, "original address should be unchanged")
assert(copyAddressAfterMutation != originalAddressAfterMutation, "copy address should have changed")

These assertions pass as well, i.e. the address of copy has now changed to 0x00007fa7db5377b0 while myValue still has its old address.

This sure looks like copy-on-write to me.

Would this be a valid approach? Can we expect the same performance characteristics as with a manual CoW implementation?

Here's the full code for pasting into a playground:

// The struct we want to make copy-on-write
struct MyValue {
    var a: Int
    var b: Int
    var c: Int
}

MemoryLayout<MyValue>.size // 24

// Idea: use an indirect enum to get automatic copy-on-write for a large struct
/// A generic enum to wrap any value type indirectly.
/// Provides copy-on-write behavior for `Wrapped`.
indirect enum CopyOnWrite<Wrapped>: CustomStringConvertible {
    case payload(Wrapped)

    init(_ value: Wrapped) {
        self = .payload(value)
    }

    var unbox: Wrapped {
        get {
            switch self {
            case .payload(let v): return v
            }
        }
        set {
            self = .payload(newValue)
        }
    }

    var description: String {
        return "CopyOnWrite: \(unbox)"
    }
}

MemoryLayout<CopyOnWrite<MyValue>>.size // 8

// Test it out
var myValue = CopyOnWrite(MyValue(a: 1, b: 2, c: 3))
var copy = myValue

// Before mutation, myValue and copy should have the same address
let originalAddressBeforeMutation = unsafeBitCast(myValue, to: OpaquePointer.self)
let copyAddressBeforeMutation = unsafeBitCast(copy, to: OpaquePointer.self)
assert(originalAddressBeforeMutation == copyAddressBeforeMutation)

// Mutate the copy (or the original, doesn't matter)
copy.unbox.a = 23
assert(copy.unbox.a == 23)
assert(myValue.unbox.a == 1) // is still 1

let originalAddressAfterMutation = unsafeBitCast(myValue, to: OpaquePointer.self)
let copyAddressAfterMutation = unsafeBitCast(copy, to: OpaquePointer.self)
assert(originalAddressAfterMutation == originalAddressBeforeMutation, "original address should be unchanged")
assert(copyAddressAfterMutation != originalAddressAfterMutation, "copy address should have changed")

// Convenience APIs to hide the wrapper type. Entirely optional.
extension CopyOnWrite where Wrapped == MyValue {
    var a: Int {
        get { return unbox.a }
        set { unbox.a = newValue }
    }
    // var b: Int { ... }
    // var c: Int { ... }
}

// Allows you to access properties directly, without going through unbox.
copy.a = 42

(Joe Groff) #2

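Ole Begemann wrote: "Question 1: do indirect enums use copy-on-write under the hood?"

Enum values are currently immutable. There's no way to get a handle to the payload of an enum value and modify it in place. If we added this functionality in the future, we would probably make indirect payloads copy-on-write, though.

Ole Begemann wrote: "Question 2: if so, could we use an indirect enum to effectively implement a universal generic wrapper type that can make any struct copy-on-write?"

Not today, because you can't mutate the inside of the indirect box. With @Doug_Gregor's property delegates feature, you could define a COW<T> wrapper struct that handles the copy-on-write to a boxed value, and delegate a property's implementation to it.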
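For readers coming to this later: property delegates eventually shipped as property wrappers in Swift 5.1 (SE-0258). Below is a minimal sketch of what such a wrapper might look like. The names `COW`, `Box`, and `Container` are illustrative assumptions, not an API from this thread or the standard library, and the projected value that exposes the box identity is there only so the sharing can be observed.

```swift
// Hypothetical reference box; `Box` and `COW` are illustrative names.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

@propertyWrapper
struct COW<T> {
    private var box: Box<T>

    init(wrappedValue: T) {
        box = Box(wrappedValue)
    }

    var wrappedValue: T {
        get { return box.value }
        set {
            if isKnownUniquelyReferenced(&box) {
                // Sole owner: overwrite the value inside the existing box.
                box.value = newValue
            } else {
                // Box is shared: allocate a fresh one for this owner.
                box = Box(newValue)
            }
        }
    }

    /// Identity of the current box, exposed as `$property` for inspection only.
    var projectedValue: ObjectIdentifier {
        return ObjectIdentifier(box)
    }
}

struct MyValue {
    var a: Int
    var b: Int
    var c: Int
}

// Hypothetical struct that delegates storage of `payload` to the wrapper.
struct Container {
    @COW var payload: MyValue
}

var original = Container(payload: MyValue(a: 1, b: 2, c: 3))
var duplicate = original            // both values share one box
duplicate.payload.a = 23            // box is shared, so the setter allocates a new one
let boxAfterFirstWrite = duplicate.$payload
duplicate.payload.a = 42            // box is now unique, so it is reused
```

Unlike the indirect enum, the setter here only allocates when the box is shared. Each write through a plain get/set pair still copies the wrapped value out and back in, so this sketch avoids reallocating the box but not copying the value itself.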

(Ole Begemann) #3

Thanks. I see the error in my thinking now. Mutating the indirect enum always creates a new box instance (i.e. the pointer changes), regardless of whether there is a unique owner or not. Copy-on-write, by contrast, eliminates the copy as long as the owner is unique.
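To make that difference concrete, here is a small sketch that reuses the enum and the unsafeBitCast check from the first post, but with only a single owner. `boxAddress` is a hypothetical helper, and as with the original check, whatever it prints reflects current implementation behavior rather than anything guaranteed.

```swift
struct MyValue {
    var a: Int
    var b: Int
    var c: Int
}

indirect enum CopyOnWrite<Wrapped> {
    case payload(Wrapped)

    var unbox: Wrapped {
        get {
            switch self {
            case .payload(let v): return v
            }
        }
        set {
            // Always builds a brand-new enum value, and with it a new box.
            self = .payload(newValue)
        }
    }
}

// Hypothetical helper: the same unsafe check as in the first post.
// It relies on the single-case indirect enum being one pointer wide.
func boxAddress(_ value: CopyOnWrite<MyValue>) -> OpaquePointer {
    return unsafeBitCast(value, to: OpaquePointer.self)
}

var soleOwner = CopyOnWrite.payload(MyValue(a: 1, b: 2, c: 3))
let addressBefore = boxAddress(soleOwner)
soleOwner.unbox.a = 23              // no other variable shares this box
let addressAfter = boxAddress(soleOwner)

print("Box replaced despite unique owner:", addressBefore != addressAfter)
```

A real copy-on-write type would keep its storage in this situation because the owner is unique. If the program prints `true`, the enum allocated a new box anyway.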

Joe Groff wrote: "With @Doug_Gregor's property delegates feature, you could define a COW<T> wrapper struct that handles the copy-on-write to a boxed value, and delegate a property's implementation to it."

That sounds exciting. I haven't read the property delegates threads yet.

(Jordan Rose) #4

I don't think this is guaranteed by the ABI. Mutating an indirect enum always creates a new value; whether or not it does a copy-on-write is an implementation detail at this point, because you can't actually get at the enum's address without doing unsafe operations.

(Matthew Johnson) #5

What a coincidence that this comes up. I was just thinking about mutating switch and inout pattern bindings last night.