I like the idea of Foundation's `Data` type a lot (the equivalent of a RawPointer for managed data storage), but every time I try to use it, it ends up making everything really slow.

The benchmark used is available here; make sure to compile it with `-O`. Most of the printouts are there to make sure the optimizer doesn't remove the work being measured; they can generally be ignored and were removed from the output posted here.

First, simply allocating a `Data` takes 2.5 times as long as it does for an array. Here's a comparison of the time it takes to allocate 2^20 collections (`[UInt8]` or `Data`), each holding a 32-byte payload:
> time ./DataSpeedTest alloc array 20
0.23 real 0.19 user 0.03 sys
> time ./DataSpeedTest alloc data 20
0.62 real 0.54 user 0.07 sys
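Roughly, the allocation test is shaped like this (a simplified sketch rather than the exact benchmark code linked above; the real program takes the count as a command-line argument):

```swift
import Foundation

// Sketch of the allocation test: build 2^20 separate 32-byte collections.
let count = 1 << 20

var arrays: [[UInt8]] = []
arrays.reserveCapacity(count)
for _ in 0..<count {
    arrays.append([UInt8](repeating: 0x61, count: 32))  // fresh Array storage each time
}

var datas: [Data] = []
datas.reserveCapacity(count)
for _ in 0..<count {
    datas.append(Data(repeating: 0x61, count: 32))      // fresh Data storage each time
}

// Print something derived from the results so the optimizer can't discard the work.
print(arrays.count, datas.count)
```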
If you look at the memory allocations made using Instruments, you'll see that one `[UInt8]` storing 32 bytes allocates 8 bytes on the stack and 64 bytes on the heap, while a `Data` allocates 24 bytes on the stack, plus both a 96-byte `Foundation._DataStorage` and a 48-byte payload.
Personally, I find this the most reasonable of the issues, since `Data` does support custom deallocators and the like, but it does mean that if you want a container for an object that's just a few bytes of raw data, `Data` is not the container for you.

The rest of the tests are performed on a collection holding a 2^26-byte (64 MB) payload of repeated ASCII 'a's. If you want to run with a different size, supply the log2 of the count you want as the final argument to the program.
Simply looping over a `Data` is much slower than looping over an array (the test does a for loop that counts and sums the collection's contents):
> time ./DataSpeedTest for array
0.08 real 0.05 user 0.02 sys
> time ./DataSpeedTest for data
0.51 real 0.48 user 0.03 sys
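The loop test is essentially the following (a sketch with my own names, not the exact benchmark code; the same generic function is handed both containers):

```swift
import Foundation

let n = 1 << 26  // 64 MB of ASCII 'a'
let arrayPayload = [UInt8](repeating: UInt8(ascii: "a"), count: n)
let dataPayload = Data(repeating: UInt8(ascii: "a"), count: n)

// Count and sum the bytes with a plain for-in loop; identical code for both types.
func countAndSum<C: Collection>(_ bytes: C) -> (count: Int, sum: UInt64) where C.Element == UInt8 {
    var count = 0
    var sum: UInt64 = 0
    for byte in bytes {
        count += 1
        sum &+= UInt64(byte)
    }
    return (count, sum)
}

print(countAndSum(arrayPayload), countAndSum(dataPayload))  // keep the optimizer honest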
This carries over to generic functions like `reduce`:
> time ./DataSpeedTest reduce array
0.08 real 0.05 user 0.03 sys
> time ./DataSpeedTest reduce data
0.48 real 0.44 user 0.03 sys
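The `reduce` version is the same payload fed through `reduce` instead of an explicit loop (again, just a sketch):

```swift
import Foundation

let n = 1 << 26  // same 64 MB payload as above
let arrayPayload = [UInt8](repeating: UInt8(ascii: "a"), count: n)
let dataPayload = Data(repeating: UInt8(ascii: "a"), count: n)

// Sum via reduce; the closure is identical for both containers.
let arraySum = arrayPayload.reduce(into: UInt64(0)) { $0 &+= UInt64($1) }
let dataSum = dataPayload.reduce(into: UInt64(0)) { $0 &+= UInt64($1) }
print(arraySum, dataSum)
```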
...and to `String.init(decoding:as:)`:
> time ./DataSpeedTest string array
0.26 real 0.20 user 0.05 sys
> time ./DataSpeedTest string data
1.05 real 0.99 user 0.05 sys
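The string test decodes the same bytes as UTF-8 from each container (sketch):

```swift
import Foundation

let n = 1 << 26  // same 64 MB payload as above
let arrayPayload = [UInt8](repeating: UInt8(ascii: "a"), count: n)
let dataPayload = Data(repeating: UInt8(ascii: "a"), count: n)

// Decode the 64 MB of ASCII into a String from each container.
let fromArray = String(decoding: arrayPayload, as: UTF8.self)
let fromData = String(decoding: dataPayload, as: UTF8.self)
print(fromArray.count, fromData.count)
```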
Finally, the worst issue I found was with slicing `Data`, which for some reason causes a memory allocation. The test code loops over the collection by repeatedly shrinking a slice and reading from its beginning, a pattern I've found useful for algorithms that want to take variably sized pieces off the front of a collection.
> time ./DataSpeedTest slice array
0.10 real 0.06 user 0.03 sys
> time ./DataSpeedTest slice data
10.62 real 10.56 user 0.04 sys
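The slicing test follows the pattern below (a sketch with my own names; the real benchmark is linked above). The Array variant operates on an `ArraySlice` so that re-slicing stays within one type, just as `Data`'s `SubSequence` is `Data`:

```swift
import Foundation

let n = 1 << 26  // same 64 MB payload as above
let arrayPayload = [UInt8](repeating: UInt8(ascii: "a"), count: n)
let dataPayload = Data(repeating: UInt8(ascii: "a"), count: n)

// Read from the front, then shrink the slice, until nothing is left;
// the shape of a parser peeling variable-sized pieces off a collection.
func sliceSum<C: Collection>(_ bytes: C) -> UInt64 where C.Element == UInt8, C.SubSequence == C {
    var rest = bytes
    var sum: UInt64 = 0
    while let first = rest.first {
        sum &+= UInt64(first)
        rest = rest.dropFirst()  // the re-slicing step that seems to trigger an allocation for Data
    }
    return sum
}

print(sliceSum(arrayPayload[...]), sliceSum(dataPayload))
```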
Am I using `Data` wrong, or misunderstanding what it's meant for? Is it only supposed to be used for large allocations that you access via `withUnsafeBytes`?