Wrapping a C++ std::map to appear as a collection in Swift?

We have a C++ API where a core feature is an std::map<string, SomeObject>. (For the purposes of this discussion, the type of SomeObject doesn't matter. It could be as simple as int.)

We'd like to write a Swift API that gives the developers access to this library; it would be great if our "bridge" to this std::map looked like it was exposing a collection similar to a Swift dictionary.

The first thing I notice about writing something that conforms to collection is that you be able to do certain operations in O(1) time. In particular, given an index, you have to be able to jump to that location quickly. This is fundementally at odds with std::map where you can't compute distances between elements, or jump any place really. Is a class which wraps std::map with something which satisfies the collection protocol just a non-starter? I'm thinking about how to do this and just hitting a blank wall.

The other question is that a collection doesn't say anything about looking things up by a specific key. Is there a protocol layer i'm missing that talks about using a key type to get to a value? Or is that just built into Dictionary, and there's not a protocol which has this property i could conform to?

O(1) performance is only expected for types conforming to RandomAccessCollection. It is not a requirement for types only conforming to the Collection protocol.

You can define your own Index type for your collection. Indices don't need to be integers. They can be of any type. The only requirement is that they are comparable.

1 Like

Thank you. It appears I misread a few things.

In the documentation on swiftdoc.org it says: Types that conform to Collection are expected to provide the startIndex and endIndex properties and subscript access to elements as O(1) operations. Types that are not able to guarantee this performance must document the departure, because many collection operations depend on O(1) subscripting performance for their own performance guarantees.

So I should assume that departing from this and providing O(log n) access won't be the worst thing in the world for looking up items?

I was further confused by "a random access collection, which can measure the distance between two indices in O(1) time, can calculate its count property in O(1) time" because I thought a Dictionary was a random access collection. Clearly, they're talking about an array. So there's no need to even provide a measure of distance.

Great, things are looking up then!

startIndex and endIndex are roughly equivalent to std::begin and std::end, and the Collection subscript is roughly equivalent to dereferencing an iterator after checking to make sure that the iterator isn't past-the-end or before-begin. That bounds check is why indices are Comparable and not just Equatable.

2 Likes

FWIW the documentation on swiftdoc.org is taken from the inline source documentation. Here is the section in the code: https://github.com/apple/swift/blob/694b153992b2cc2f8aaaa7a13134419a03812694/stdlib/public/core/Collection.swift#L322

Perhaps the documentation here is a little bit weak as to what 'expectation' really means…

Types that conform to Collection are expected to provide the startIndex and endIndex properties and subscript access to elements as O(1) operations. Types that are not able to guarantee this performance must document the departure, because many collection operations depend on O(1) subscripting performance for their own performance guarantees.

I think you are right that the docs apply to Collections in general, not just RandomAccessCollection. Apple docs are identical.

However, I read this as it's fine to have slower performance than O(1), you just need to indicate it to your users in docs, because otherwise people assume that it meets the typical Collection performance guarantees. But you can still conform so that users can take advantage of Collection methods and rely on methods they know Dictionaries etc have.

You may be able to override the default implementations of some Collection methods to improve their performance.

Do note that subscript access uses an index (so it's where you give the collection an index in exchange for the item at that index). As long as you make your associated index type carry enough information, you should be able to supply subscript access in O(1). As a worst case scenario for a really hard to index collection, you could pre-calculate the element and store it in the index, and just take it out in the subscript operator. I think String may actually do this for its indices.

For example, Swift's Dictionary's index is AFAIK actually the index into the underlying array that backs the dictionary. Advancing the index just skips any empty buckets until it arrives at the next in-use one. image

2 Likes

Oh, I see. I’ve been thinking of the index as the string key into the map, but I see what you are saying. Yes, given that an index is really the c++ iterator into the std::map, access is O(1) in that case. Taking a string and getting an index back is still log n though.

Apples to oranges.

1 Like

Yeah, note that Dictionary indexes in swift will really only be used for for (key, value) in myDictionary and such, since it's a Index -> (Key, Value) mapping rather than a Key -> Value mapping. The second subscript that Dictionary has, the one that takes a Key and returns a Value, isn't actually a part of its Collection conformance at all.

Why isn’t there a protocol which captures/expresses lookup by the key? That’s what confused me, I was at the wrong level. I would expect, like in python, to find a “map” protocol. But there is not one.

I've wondered that too before. But I have yet to think of a reason I would want a function or data structure that accepts any type of key-value store, so I haven't made any evolution proposals yet.

It's only worth making a protocol if (1) there are multiple possible conforming types, and (2) there are operations that make sense across any of those types. You can come up with examples of those for key/value mappings, but it's pretty sparse for both lists.

Terms of Service

Privacy Policy

Cookie Policy