Does Data copy-on-write?

I'm just wondering with regard to the performance characteristics of Data in Swift: is Data always copied like a simple struct? Data values can be quite large in terms of memory, so it seems like it would make sense for it to be copy-on-write but I'm not sure if this is the case.

Here is a quick simple test:

import Foundation

var data_1 = Data([0])
let data_2 = data_1
data_1.append(contentsOf: [1])
// prints "2 1 false"
print(data_1.count, data_2.count, data_1 == data_2)

Data is copy-on-write for exactly the reasons you suspect. Proving that is a challenge because copy-on-write is supposed to be an implementation detail. However, consider this:

var d1 = Data("Hello Cruel World!".utf8)
let d2 = d1
d1.withUnsafeBytes { buf in
    print(buf.baseAddress!)
}
d2.withUnsafeBytes { buf in
    print(buf.baseAddress!)
}
d1[0] = UInt8(ascii: "h")
d1.withUnsafeBytes { buf in
    print(buf.baseAddress!)
}
d2.withUnsafeBytes { buf in
    print(buf.baseAddress!)
}

On my machine (Xcode 11.5, running a Debug configuration of a command-line tool target on macOS 10.15.4) it prints:

0x0000000103881ee0
0x0000000103881ee0
0x00000001038822d0
0x0000000103881ee0

As you can see, the third printed value is different because the modification of d1 on line 9 (the d1[0] assignment) triggered a copy of the underlying buffer.

IMPORTANT: This code is meant to expose an implementation detail. Do not use this technique in a shipping product to depend on that implementation detail. The implementation of Data has changed numerous times in the past and it’s not hard to imagine it changing again in the future.
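The same experiment can be condensed with a small helper. This is only a sketch: storageAddress(of:) is a hypothetical name, not a Foundation API, and the 1024-byte payload is an assumption chosen to be comfortably larger than any inline representation. The two address comparisons depend on the implementation detail described above, so their results are not guaranteed; only the final line reflects documented value semantics.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): returns the address of the
// first byte of `data`'s storage as an integer, so it can be compared safely
// after the closure has returned.
func storageAddress(of data: Data) -> UInt? {
    return data.withUnsafeBytes { (buf: UnsafeRawBufferPointer) -> UInt? in
        buf.baseAddress.map { UInt(bitPattern: $0) }
    }
}

var original = Data(repeating: 0xAB, count: 1024)
let copy = original

// Before any mutation the two values are expected to share one buffer.
let addressBeforeWrite = storageAddress(of: original)
print("shared before write:", addressBeforeWrite == storageAddress(of: copy))

// Mutating `original` while `copy` still exists should force a buffer copy.
original[0] = 0x00
print("moved after write:", storageAddress(of: original) != addressBeforeWrite)

// Value semantics hold regardless of how the storage is managed.
print("copy unchanged:", copy[0] == 0xAB)
```

Comparing integers rather than pointers keeps the helper from handing out a pointer that is only valid inside the withUnsafeBytes closure.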

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Even though we can observe the open source implementation of Data I still have one question. How is it handled when we use Data on Apple platforms which would bridge to NSData if I'm not mistaken? Is there an overlay that handles the COW for NSData? Is that part also open sourced?

Is there an overlay that handles the COW for NSData? Is that part also open sourced?

Data on Darwin is defined in swift/stdlib/public/Darwin/Foundation/Data.swift

Searching that file for "ensureUniqueReference" may be a good starting point to see how copy-on-write is implemented.

public struct Data: … {
    …
    internal struct LargeSlice {
        …
        mutating func ensureUniqueReference() {
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.mutableCopy(range)
            }
            if !isKnownUniquelyReferenced(&slice) {
                slice = RangeReference(range)
            }
        }
        …

Oh, so the actual bridge is defined in that file as well? I thought it was the implementation for Data on non-Darwin platforms. Good to know. Thank you for the hint.

I have edited my post to make it clear that it's the definition of the Data struct, not so much the overlay, sorry. And I can't say I have fully grasped how it all works, but I think you can see the COW implementation in that file.

Okay I see, thank you anyways. 🙂

Bridging to NSData (and hence dispatch_data) is complex, and not something that I’m intimately familiar with. Let’s see if @Philippe_Hausler wants to wade in (-:

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Data is a CoW type. If a mutation is made while there are other references (i.e. the storage is not uniquely referenced), the underlying buffer is copied first and the mutation is applied to the copy. If it is uniquely referenced, no copy is made.

The CoW part is open sourced; look for the isKnownUniquelyReferenced call in Data.swift (both in swift-corelibs-foundation and in the Foundation overlay).
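To make the pattern concrete, here is a minimal sketch of a CoW value type built on isKnownUniquelyReferenced. It is not how Data is implemented: ByteStorage and CowBytes are hypothetical names invented for illustration, and the storage is a plain array rather than a raw buffer.

```swift
// Hypothetical reference-type storage. Data's real storage is more involved;
// an array is used here only to keep the sketch short.
final class ByteStorage {
    var bytes: [UInt8]
    init(_ bytes: [UInt8]) { self.bytes = bytes }
}

// Hypothetical value type that shares its storage until the first mutation.
struct CowBytes {
    private var storage: ByteStorage

    init(_ bytes: [UInt8]) { storage = ByteStorage(bytes) }

    var count: Int { storage.bytes.count }

    subscript(index: Int) -> UInt8 {
        get { storage.bytes[index] }
        set {
            // Copy the storage only if another value still references it.
            ensureUniqueReference()
            storage.bytes[index] = newValue
        }
    }

    private mutating func ensureUniqueReference() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = ByteStorage(storage.bytes)
        }
    }
}

var a = CowBytes([1, 2, 3])
let b = a          // a and b share one ByteStorage instance
a[0] = 9           // storage is not unique, so a gets its own copy first
print(a[0], b[0])  // prints "9 1"
```

If a were mutated again with no other copies alive, the uniqueness check would pass and the write would go straight into the existing storage.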

There is one caveat to all of this: small Data payloads are always copied. On each platform, a payload smaller than a specified size is stored entirely in stack allocation space. So, for example, a Data of two bytes is always on the stack and is therefore always copied when a mutation occurs, because the stack memory itself is copied (similar to how smol Strings work).
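From the caller's side the small case is indistinguishable from the large one, as this short sketch suggests. The two-byte payload mirrors the example above; whether it is really stored inline is an implementation detail that the code does not, and cannot reliably, verify.

```swift
import Foundation

// A two-byte payload, small enough to fall into the small-payload case
// described above (an assumption about the current implementation).
var small = Data([0x01, 0x02])
let smallCopy = small

// Mutating one value never affects the other, however the bytes are stored.
small[0] = 0xFF
print(small[0], smallCopy[0])  // prints "255 1"
```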

The tl;dr: yes, it is a CoW type, very much in the same spirit as String.
