is there a way to safely convert a [UInt8] array to a tuple of (UnsafeRawBufferPointer, AnyObject?), where the object pointer keeps the buffer allocation alive? Array<UInt8> itself is not a class type, so i want to retain a reference to its underlying storage, if it has any. (not a box that retains an array value)
well, that's a problem because without a neutral ABI for Array and ByteBuffer, we need to use some RandomAccessCollection<UInt8>, and that means the whole BSON library has to be sprayed with @inlinable. which worked for a while, but it has gotten to the point where it is seriously impacting compilation speed.
i see few alternatives to falling back to totally unsafe raw buffer pointers.
As with all underscored APIs, it is not overly generous when it comes to making documented guarantees. In practice, I believe retaining the returned owner will be detected as sharing by COW (so it will still protect you against overlapping writes and enforce value semantics on the Swift side), but Array does lots of magic that isn't available to other Swift types, so it's hard to know for sure.
I think it would be cool to have a (pointer, owner) abstraction for sharing contiguous buffers without copying. Ideally generics would make things agnostic to specific data types, but in practice lots of things are still written in terms of Array, or String, and it would be nice to pass data to them without copying.
Nothing stops Array from having a representation that is neither an object nor static memory (except perhaps on Apple platforms, where its overall representation is partially locked-down); perhaps a compact inline representation, like String. Even the interface Karl found is used only for & in practice, and may in fact do an allocation when used.
NSArray would work for this, and even be cheap…if your elements were already objects. No good here.
That said, I think it’s very unlikely that Array, carefully squished to fit multiple representations in a single machine word, will have a representation like the one I described above. So you could indeed propose the addition of this API instead, along with the implicit limitations about Array that it requires, as an evolution proposal.
For now, there’s a chance you can get pretty far with withContiguousStorageIfAvailable and storing offsets instead of pointers in your intermediate types, but I haven’t looked specifically at your code, so I don’t know for sure.
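As a rough illustration of the "offsets instead of pointers" idea, here is a minimal sketch. `ByteRange`, `leadingCString(in:)` and `decodeString(_:in:)` are hypothetical names made up for this example, not part of the BSON library; the only standard library API it relies on is `withContiguousStorageIfAvailable`. The pointer never leaves the closure, only integer offsets do.

```swift
/// Hypothetical intermediate type: locates bytes by offset, not by pointer.
struct ByteRange {
    let lower: Int
    let upper: Int
}

/// Returns the range of bytes before the first zero byte, if there is one.
func leadingCString<Bytes>(in bytes: Bytes) -> ByteRange?
    where Bytes: Sequence, Bytes.Element == UInt8
{
    // Fast path: scan contiguous storage directly. The buffer pointer is only
    // valid inside the closure, so nothing but offsets is returned from it.
    let fast: ByteRange?? = bytes.withContiguousStorageIfAvailable { buffer in
        buffer.firstIndex(of: 0).map { ByteRange(lower: 0, upper: $0) }
    }
    if let fast = fast {
        return fast
    }
    // Slow path: the sequence has no contiguous storage to offer.
    var offset = 0
    for byte in bytes {
        if byte == 0 {
            return ByteRange(lower: 0, upper: offset)
        }
        offset += 1
    }
    return nil
}

/// Re-resolves the stored offsets against the array when the bytes are needed.
func decodeString(_ range: ByteRange, in bytes: [UInt8]) -> String {
    String(decoding: bytes[range.lower ..< range.upper], as: UTF8.self)
}

let exampleBytes: [UInt8] = Array("name".utf8) + [0, 0x2a]
let exampleRange = leadingCString(in: exampleBytes)
```

The trade-off is that every consumer has to be handed the original collection again to turn the offsets back into bytes, which is exactly the coupling a (pointer, owner) pair would avoid.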
the BSON code is open-sourced, the types look like:
extension BSON
{
    /// A BSON document. The backing storage of this type is opaque,
    /// permitting lazy parsing of its inline content.
    @frozen public
    struct DocumentView<Bytes> where Bytes:RandomAccessCollection<UInt8>
    {
        /// The raw data backing this document. This collection *does not*
        /// include the trailing null byte that typically appears after its
        /// inline field list.
        public
        let slice:Bytes
        /// Stores the argument in ``slice`` unchanged.
        ///
        /// > Complexity: O(1)
        @inlinable public
        init(slice:Bytes)
        {
            self.slice = slice
        }
    }
}
it is common when decoding BSON to escape an unparsed DocumentView<some RandomAccessCollection<UInt8>>. in fact, that is pretty much the point of using BSON instead of JSON: you can skip decoding things you don’t care about, or more commonly, don’t know how to decode because you haven’t modeled the schema, or want to delegate the decoding to some component that does know how to decode it.
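To make the "escape an unparsed document" pattern concrete, here is a simplified sketch that works on a plain ArraySlice<UInt8> rather than the library's DocumentView. `embeddedDocument(named:in:)` is a hypothetical helper written for this example; it assumes the input is a bare field list (no length header, no trailing null byte) and only understands two BSON element types, int32 (0x10) and embedded document (0x03). It walks the fields, skips values by size, and hands back the matching subdocument's bytes without parsing them.

```swift
/// Returns the unparsed field list of the embedded document stored under
/// `key`, or nil if it is missing or an unsupported element type is met.
func embeddedDocument(named key: String,
    in fields: ArraySlice<UInt8>) -> ArraySlice<UInt8>?
{
    var rest = fields
    while let type = rest.first {
        rest = rest.dropFirst()
        // Field keys are null-terminated strings.
        guard let end = rest.firstIndex(of: 0) else { return nil }
        let name = String(decoding: rest[..<end], as: UTF8.self)
        rest = rest[(end + 1)...]

        let size: Int
        switch type {
        case 0x10:
            // int32 value: always four bytes.
            size = 4
        case 0x03:
            // Embedded document: a little-endian int32 length that counts
            // the length header itself and the trailing null byte.
            guard rest.count >= 4 else { return nil }
            size = rest.prefix(4).reversed().reduce(0) { $0 << 8 | Int($1) }
            guard size >= 5 else { return nil }
        default:
            // Other element types are out of scope for this sketch.
            return nil
        }
        guard rest.count >= size else { return nil }

        if type == 0x03, name == key {
            // Strip the 4-byte header and the trailing null; parse nothing.
            return rest.dropFirst(4).prefix(size - 5)
        }
        rest = rest.dropFirst(size)
    }
    return nil
}

// Field list equivalent to { a: 1, d: { x: 7 } }.
let outerFields: [UInt8] = [
    0x10, 0x61, 0x00, 1, 0, 0, 0,
    0x03, 0x64, 0x00, 12, 0, 0, 0,
        0x10, 0x78, 0x00, 7, 0, 0, 0,
    0x00,
]
let inner = embeddedDocument(named: "d", in: outerFields[...])
```

The returned slice still shares storage with `outerFields`, which is why the type of the backing storage ends up in the signature of everything that touches it.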
when the BSON types are specialized, the Bytes parameter is almost always one of:
ArraySlice<UInt8>, which is four pointers long
ByteBuffer, which is three pointers long
so it is really motivating to me to get rid of the generics entirely and store a raw buffer pointer (2 pointers long) + an object reference (1 pointer long).
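A minimal sketch of what such a non-generic layout could look like follows. `RawDocumentView` and `ByteStorage` are hypothetical names, not types from the BSON library. To stay safe and self-contained, the sketch copies the bytes into an allocation owned by a class, so it does not answer the original question of borrowing Array's own storage without a copy; it only shows the three-word (pointer, owner) shape.

```swift
/// Hypothetical owner: holds a heap allocation and frees it on deinit.
final class ByteStorage {
    let bytes: UnsafeMutableRawBufferPointer

    init(copying source: [UInt8]) {
        bytes = .allocate(byteCount: source.count, alignment: 1)
        bytes.copyBytes(from: source)
    }

    deinit {
        bytes.deallocate()
    }
}

/// Hypothetical non-generic view: raw buffer pointer plus optional owner.
struct RawDocumentView {
    /// Two words: start and end of the viewed bytes.
    let slice: UnsafeRawBufferPointer
    /// One word: keeps the allocation alive; nil would mean static memory.
    let owner: AnyObject?
}

extension RawDocumentView {
    /// Copies `source` into owned storage (an assumption of this sketch).
    init(copying source: [UInt8]) {
        let storage = ByteStorage(copying: source)
        self.init(slice: UnsafeRawBufferPointer(storage.bytes), owner: storage)
    }

    /// A narrower view that shares the same owner, without copying.
    func subview(_ range: Range<Int>) -> RawDocumentView {
        RawDocumentView(
            slice: UnsafeRawBufferPointer(rebasing: slice[range]),
            owner: owner)
    }
}

let rawView = RawDocumentView(copying: [1, 2, 3, 4])
let rawSubview = rawView.subview(1 ..< 3)
let rawViewWords = MemoryLayout<RawDocumentView>.size / MemoryLayout<Int>.size
```

Because nothing in `RawDocumentView` is generic, code that consumes it would not need to be @inlinable to avoid unspecialized generics, at the cost of giving up compiler-checked memory safety inside the view.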