Efficiently converting from Data to UnsafePointer<CChar>?

Is there an efficient way to get an UnsafePointer<CChar> from Swift's Data type?

In RocksDB, keys and values are represented as std::string. In the C-API that RocksDB offers, the function signature for putting a value becomes:

// c.h
void rocksdb_put(
  rocksdb_t* db, 
  const rocksdb_writeoptions_t* options, 
  const char* key,
  size_t keylen, 
  const char* val, 
  size_t vallen, 
  char** errptr
);

This gets imported into Swift as:

func rocksdb_put(
  _ db: OpaquePointer!,
  _ options: OpaquePointer!,
  _ key: UnsafePointer<CChar>!,
  _ keylen: Int,
  _ val: UnsafePointer<CChar>!,
  _ vallen: Int,
  _ errptr: UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>!
)

More often than not, the value in Swift is going to be a Data, which would be very common if you were storing JSON objects. So far, I've come up with this code snippet, but is there a more efficient way?

let valueData = try JSONEncoder().encode(record)

valueData.withUnsafeBytes { (valueBytes: UnsafeRawBufferPointer) in

  let valuePtr = valueBytes.bindMemory(to: CChar.self).baseAddress!

  rocksdb_put(db, writeOptions, keyData, keyData.count, valuePtr, valueBytes.count, &error)
}

How long does rocksdb_put require that its argument pointers be valid for?

It's my understanding that when rocksdb_put returns, then the caller is free to deallocate any bytes used for the key or value field.

Internally, RocksDB uses a non-owning structure called a Slice to store char *data_ and size_t size_ members. If you trace through a Put call, the contents of a Slice are eventually copied into an std::string using .append. At that point, I assume RocksDB has a copy by the time the initial _put call returns.

Assuming this is true, then there is no more efficient pattern. The call to bindMemory(to:) is technically unsafe: Swift does not allow you to arbitrarily rebind memory that may already be bound to another type. However, as a practical matter, as used here, you are extremely unlikely to violate an aliasing rule.
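For illustration, here is a minimal sketch of that pattern with both the key and the value held as Data. FakeStore and its put method are hypothetical stand-ins that only mimic the pointer-plus-length shape of rocksdb_put and copy the bytes before returning (the behaviour assumed above); they are not part of RocksDB. The sketch also passes baseAddress through as an optional rather than force-unwrapping it, since it can be nil for empty Data.

```swift
import Foundation

// Hypothetical stand-in with the same pointer/length shape as rocksdb_put.
// It copies the bytes before returning, which is the behaviour assumed above.
final class FakeStore {
    private(set) var entries: [[UInt8]: [UInt8]] = [:]

    func put(_ key: UnsafePointer<CChar>?, _ keylen: Int,
             _ val: UnsafePointer<CChar>?, _ vallen: Int) {
        let k = Array(UnsafeRawBufferPointer(UnsafeBufferPointer(start: key, count: keylen)))
        let v = Array(UnsafeRawBufferPointer(UnsafeBufferPointer(start: val, count: vallen)))
        entries[k] = v
    }
}

// Both pointers are only valid inside the closures, so the call happens there.
func put(_ store: FakeStore, key: Data, value: Data) {
    key.withUnsafeBytes { (keyBytes: UnsafeRawBufferPointer) in
        value.withUnsafeBytes { (valueBytes: UnsafeRawBufferPointer) in
            // baseAddress may be nil for empty Data, so keep it optional.
            let keyPtr = keyBytes.bindMemory(to: CChar.self).baseAddress
            let valuePtr = valueBytes.bindMemory(to: CChar.self).baseAddress
            store.put(keyPtr, keyBytes.count, valuePtr, valueBytes.count)
        }
    }
}

let store = FakeStore()
put(store, key: Data("user:1".utf8), value: Data("{\"name\":\"a\"}".utf8))
```

With the real C function, the inner call would be rocksdb_put with the same four pointer and length arguments plus the db, options, and error parameters.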

However, in Swift 5.6 you should be able to avoid the pointer binding and pass the UnsafeRawPointer from bytes.baseAddress directly to the C call, thanks to SE-0324.
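A small sketch of what that looks like, assuming a Swift 5.6 or later toolchain. strnlen from the C library is used here only as a convenient stand-in for any imported C function that takes a const char* and a length; it is not part of the original question.

```swift
import Foundation

let payload = Data("hello".utf8)

let length: Int = payload.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
    guard let base = raw.baseAddress else { return 0 }
    // No bindMemory(to:) here: with SE-0324 the raw pointer is accepted
    // directly where the imported C function expects const char*.
    return strnlen(base, raw.count)
}
```

The same shape should apply to rocksdb_put: pass the raw base address for key and val and the buffer counts for keylen and vallen.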

Oh, nice. That's a rather timely update. Presumably the same assumptions apply regarding the required lifetime of the passed in pointer.

Initially I was curious if withUnsafeBytes or bindMemory might have been causing an extra allocation and copy to occur.

Thanks.

Correct.

bindMemory never will: memory binding is a metadata operation and doesn't have any effect on the pointer itself.

withUnsafeBytes on Data will never allocate either.


Doesn't this depend on where the Data instance comes from?

Nope. The only cases that have to allocate for that method are DataProtocol types that are not contiguous bytes. All Data instances have contiguous storage and can at least temporarily expose it via that family of methods.
