Hi everyone,
For Swift 6.4, I’ve spent a fair amount of time working on bag of bytes types (including Data in particular). Based on community feedback and some performance investigations, Data has now gained some fairly significant performance improvements, including:
- Eliminating unnecessary exclusivity checking (which previously incurred significant runtime costs)
- Adopting an entirely new ABI on platforms without ABI stability (such as Linux, Windows, Android, and WASM) that drastically reduces client code size and improves throughput performance of common operations
- Improving specializations and fast paths in common operations to reduce runtime overhead and eliminate some common pitfalls
These performance improvements have led to some significant wins in benchmarking:
- Data.bytes is 787% faster and produces significantly smaller client binaries with the new ABI, and 147% faster with the existing ABI
- Data.count is 363% faster with the new ABI
- Data.== is 74% faster with one-third of the client binary size with the new ABI, and 47% faster with the existing ABI
- Appending a byte to a Data is 34% faster with the new ABI
- Iterating a Data is up to 24% faster, and produces significantly smaller client binaries with the new ABI
In common cases, I’ve found that the throughput performance of Data is now comparable to the performance of Array<UInt8> for equivalent operations when using the new ABI. I’d encourage you all to give it a try and let me know how it impacts your apps!
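If you want a quick way to try this on your own toolchain, here is a minimal sketch that runs a few of the operations listed above (appending a byte, iterating, and ==) on both Data and Array<UInt8>. It is not the benchmark suite behind the numbers in this post: the helper names secondsTaken(by:) and compareDataAndArray(byteCount:) are hypothetical, the byte count is arbitrary, and Date-based timing is coarse. Results will vary with platform, toolchain, and optimization level, so build with optimizations before reading anything into them.

```swift
import Foundation

// Hypothetical helper (not from the post): wall-clock seconds taken by `body`.
// Date-based timing is coarse; a real benchmark harness would be more precise.
func secondsTaken(by body: () -> Void) -> Double {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

// Hypothetical comparison routine; names and structure are illustrative only.
func compareDataAndArray(
    byteCount: Int
) -> (dataCount: Int, dataSum: Int, arraySum: Int, contentsEqual: Bool) {
    var data = Data()
    var array = [UInt8]()

    // Append one byte at a time to each container.
    let dataAppend = secondsTaken {
        for i in 0..<byteCount { data.append(UInt8(truncatingIfNeeded: i)) }
    }
    let arrayAppend = secondsTaken {
        for i in 0..<byteCount { array.append(UInt8(truncatingIfNeeded: i)) }
    }

    // Iterate each container and sum its bytes.
    var dataSum = 0
    var arraySum = 0
    let dataIterate = secondsTaken { for byte in data { dataSum += Int(byte) } }
    let arrayIterate = secondsTaken { for byte in array { arraySum += Int(byte) } }

    // Compare against separately allocated copies with identical contents.
    let otherData = Data(array)
    let otherArray = Array(data)
    var dataEqual = false
    var arrayEqual = false
    let dataCompare = secondsTaken { dataEqual = (data == otherData) }
    let arrayCompare = secondsTaken { arrayEqual = (array == otherArray) }

    print("append:  Data \(dataAppend)s, Array<UInt8> \(arrayAppend)s")
    print("iterate: Data \(dataIterate)s, Array<UInt8> \(arrayIterate)s")
    print("==:      Data \(dataCompare)s, Array<UInt8> \(arrayCompare)s")

    return (data.count, dataSum, arraySum, dataEqual && arrayEqual)
}

let result = compareDataAndArray(byteCount: 1_000_000)
print("bytes: \(result.dataCount), contents equal: \(result.contentsEqual)")
```

The sums and the equality flag are returned mainly so the optimizer cannot discard the loops, and so you can confirm that both containers hold the same bytes.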
Improving the performance of Data was a key first step, but we still have more work to do (especially with the introduction of new non-copyable and non-escapable types and language features). As these features develop, I’m working on putting together a vision document to guide the direction for bag of bytes types in Swift. I plan to pitch this vision document as a tool to help us establish concrete recommendations for developers and craft future evolution proposals in this space.
As part of this process, I’d first like to kick off a discussion around your existing use of bag of bytes types in your projects to gather information about experiences today. This is not a pitch for new API or an evolution review yet, but rather a request for information on how you use bag of bytes types and what you feel are important aspects of these types. In particular, I’m interested in hearing from you on:
- What bag of bytes types do you use in your code? Do you use different types for different purposes?
  - Examples include Data, Array<UInt8>, Unsafe(Mutable)(Raw)BufferPointer, ByteBuffer, DispatchData, span types, etc.
- Why did you choose to use these types / what aspects led to these decisions?
  - e.g. providing module, API surface/functionality, behavioral guarantees, performance characteristics, etc.
- In general, what are important characteristics of bag of bytes types for your use cases?
  - e.g. alignment guarantees, allocation or lifetime guarantees, categories of APIs available, interoperability requirements with other languages or SDK/toolchain APIs, etc.
- What does the flow of your bytes look like through your application or API?
  - What APIs do you use to receive bytes, what APIs do you provide bytes to, what does the lifetime of your bytes look like?
Thanks in advance for sharing your experience. I hope we can identify some key themes that I can incorporate into a future evolution discussion around a long-term vision for bytes.