Ticket map?

tyoung · April 8, 2021, 2:37pm

Any interest in an implementation of a ticket map (the "Russian Coat Check Algorithm")? Conceptually, this is a map from a monotonically increasing ID to some associated value, which supports clients passing around the ID as a handle to the registered object. The naive implementation uses a Dictionary<ID, Data>, but that leaves a lot of performance on the table due to the low cache locality. A smarter implementation uses an Array of tuples of (ID, Data) as its storage and exploits the fact that the ticket IDs are monotonically increasing, so we can do a quick binary search over a relatively densely-packed array to do a lookup.

This is something I've used in Swift for tracking things like observer registrations—you register an escaping closure as an observer, and you get back a token representing that registration. Later, you can pass that token to cancel your registration.

You can find some more background in Sean Parent's CppCon 2019 keynote—he describes the "Registry" first with a std::unordered_map implementation starting at 48:17, followed by the array-backed ticket map starting at 51:08. There's also a nice writeup of a comprehensive C++ implementation. I'd be happy to put together a proposal for what this might look like in Swift.

lukasa · April 8, 2021, 3:40pm

This seems like an interesting data structure to add. For prior art I can point out NIOHTTP2's StreamMap data structure, which is a composition of a pair of ticket maps. The data structure in question is implemented directly on top of NIO's CircularBuffer, which arguably leaves a minor amount of performance on the table, but is otherwise a very nice performance win over Dictionary in that use-case.

lukasa · April 8, 2021, 3:42pm

As a note, this implementation eagerly clears itself rather than being sparse. This is considered safe because overwhelmingly the streams that finish first are the youngest or the oldest, but a sparse strategy would be valuable if it existed.

Michael_Ilseman · April 8, 2021, 6:33pm

If you have a finite key universe and don't mind using space linear in the key universe size, then a sparse set (really a map) or sparse multi-set (really multi-map) such as LLVM's has very nice properties. It's far faster than a hash table and values that would be sparse in key-space are densely stored in contiguous memory. They are great for when you have a fixed number of them in your process.

(The paper influencing LLVM's data structure is An efficient representation for sparse sets).

It also serves as a good example of where Swift could improve with more support for compilation-time processing and values in type parameter position.

kylemacomber · April 14, 2021, 2:10pm

@tyoung we'd love to see a proposal for this addition to the Swift Collections package! It sounds like this data structure is not only independently useful but also a potential building block for higher-level data structures we also want to vend.

As a first cut, it seems plenty challenging and useful to build an implementation that restricts the ID type to Int. From there, as a second step, we could explore generalizing it to support a generic ID type.

lorentey · April 15, 2021, 7:11pm

I agree there is room for this!

Some points to think about:

Instead of modeling the data structure with an array of optionals, it might be better to use a bitmap + an unsafe mutable buffer instead. This could considerably reduce memory overhead in case the payload type doesn't provide an unused bit pattern for the nil value, although it could interfere with binary search / iteration performance. (Or not -- that is to be verified. A small hit might be worth it if it means we cut memory use in ~half sometimes.)
As I see it, the primary API design challenge here is in figuring out how to model the identifier type. The presentation sidestepped this problem by simply hardwiring it to Int; but is that really all we need?
Would TicketMap be a collection, or just a sequence? If it's a collection, what would be its index type? (It seems to me an opaque struct wrapper around a zero-based offset into storage would work fine.)
Would we call this TicketMap? It does introduce the term Map which would be new to Swift. Also, what's a "ticket"?
Experimenting with the minimum load factor to set the ideal memory use / performance tradeoff will be a fun exercise.

dnadoba · April 15, 2021, 8:21pm

In RSocket, we currently use a Dictionary and want to replace it with a more efficient implementation and it sounds like this could be the perfect solution.

Stream IDs in RSocket are very similar to HTTP 2 Stream IDs and follow the same rules with respect to odd/even values. They are always monotonic increasing.
They main difference to HTTP 2 is that in RSocket it is possible and actually common to have some streams open for a very long time (or forever) and some can be closed almost immediately.
An implementation can also overflow if it runs out of Stream IDs and start from the beginning but this part of the specification is still not final.

tyoung · April 16, 2021, 2:52pm

Lots of good stuff to think about here.

@lorentey I don't think I follow the bitmap + unsafe mutable buffer suggestion—could you point me to existing code that does something similar? I admit the only experience I have with with unsafe Swift has been in using media transcoding APIs that wrap C and such. (You're right that my naive plan was to just use [Ticket, Value?] as the underlying storage.)

I'm inclined to agree that limiting the ticket identifier to Int is not ideal... it'd prevent it from being used in existing places like the networking stream IDs. My thinking was the container could be generic on the ticket type, and maybe just be constructible with a next(_: Ticket) closure whose default is just { $0 + 1 } which would work with anything that conforms to AdditiveArithmetic, so long as your app doesn't have particular semantics associated with the ID values. (In C++ you'd make the next function a template parameter so it's embedded as part of the type, but I don't think that's an option without value-type template parameters.)

Re: collection or sequence, I'm not really sure. There's certainly a lot of syntactic value to supporting a subscript of the ticket ID type... something like:

subscript(_ ticket: Ticket) -> Value? { get }

... but it would probably be a bad idea to have client code storing an offset into the underlying storage, since any removal could trigger a compacting of the array. At the same time, though, you'd certainly want to support traversing the container multiple times and such.

Re: naming, I see the issue with the use of "map"... I'd only heard the data structure discussed in the context of C++, where of course they call the map function transform(). It's funny that NIOHTTP2 calls it a StreamMap... maybe @lukasa can chime in on why he picked "map" rather than, say, "dictionary."

Re: "what's a ticket," there's a little bit of prior art for using ticket as the term for a monotonically increasing ID that gets handed out to clients—I'm thinking of a ticket lock, where the ticket is an analogy to a deli queuing system where you take a ticket and wait for your ticket number to be called.

@dnadoba The overflow case does scare me a little... by my back-of-the-envelope calculations, though, if you had a 64-bit int as your ticket type, you could construct 100,000 tickets/second for more than 2 million years before overflowing.

60 * 60 * 24 * 365.25 ~= 32 million seconds/year
Int64.max / 100_000 / 32_000_000 = 2,882,303 years

(Obviously if your ticket type is an Int32, all bets are off at that level of usage!)

lukasa · April 16, 2021, 3:08pm

I didn't spend as much time considering naming as you should, because the type is purely internal to HTTP/2. For my part, I chose StreamMap for two reasons, neither of them compelling:

Dictionary is more letters than Map.
StreamDictionary would potentially imply that the data structure is implemented on top of Dictionary, which it is not.

To put point (2) another way, I consider the Dictionary type to be squatting the name Dictionary, so I wanted a term for the API contract that was a different word. That really left only Map or AssociativeArray.

This is not a problem that exists for swift-collections. It has a privileged place in the Swift ecosystem regarding type names and can establish whatever convention it wants. What I will say is this: if we conceptualise of a protocol that both TicketMap and Dictionary would conform to that codifies their unique behaviour, what name would that have? That should inform the naming of a hypothetical TicketMap.

Dante-Broggi · April 16, 2021, 5:12pm

I recently had cause to consider the data structure sketched below,
which I called an ArrayMap for lack of a better name.
It seems like a TicketMap which stores its keys implicitly as their raw values.

/// can only store keys with raw values >= 0.
/// prefers key raw values to be densely packed near 0.
public struct ArrayMap<K: RawRepresentable, V> where K.RawValue == Int {
    var data: [V?]

    public subscript(_ k: K) -> V? {
        get { data[at: k.rawValue] }
        set { /*…*/ }
    }
}

dnadoba · April 16, 2021, 6:00pm

@dnadoba The overflow case does scare me a little... by my back-of-the-envelope calculations, though, if you had a 64-bit int as your ticket type, you could construct 100,000 tickets/second for more than 2 million years before overflowing.

60 * 60 * 24 * 365.25 ~= 32 million seconds/year

Int64.max / 100_000 / 32_000_000 = 2,882,303 years

(Obviously if your ticket type is an Int32 , all bets are off at that level of usage!)

Me too because stream ID is an Int32 but negative values are not allowed and odd numbers are for stream IDs from the server and even for the client, so only 2^30 valid numbers for one side.
With 100,000 stream IDs per seconds, it would only take ~3 hours to overflow.
To make things worse, RSocket can resume a connection. In that case, stream id generation will continue with the last stream ID which makes overflow more likely.

lorentey · April 20, 2021, 8:13pm

Oh, it's that if the type T doesn't have an unused bit pattern that the compiler can choose to represent the nil value, then Optional<T> needs to add an extra byte to its layout, which often results in padding bytes getting added to accommodate the type's alignment.

MemoryLayout<Int>.stride      // ⟹ 8
MemoryLayout<Int?>.stride     // ⟹ 16

By replacing the buffer of optionals with a (partially initialized) unsafe buffer, and a separate bitmap indicating which slots are initialized, we can considerably reduce allocation sizes, often without significantly impacting performance. The bitmap is very compact, so it doesn't take up much space in the CPU cache. (Of course, cold lookups suffer two memory accesses rather than just one.)

For example, a 1024-item buffer of Optional<Int>s needs 16,384 bytes, while an unsafe buffer of 1024 Ints plus a same-sized bitmap only takes 8,192 + 128 = 8,320 bytes -- a mere 50.78% of the space. (Of course, we may need a constant amount of additional metadata to keep track of the bitmap's position etc.)

An example where this is currently used is in the Set implementation -- it uses this exact mechanism to represent its hash table, with both the bitmap and the unsafe elements buffer located in tail-allocated storage of an internal class instance.

lorentey · April 20, 2021, 8:56pm

The map vs Map thing may not be as big a deal as it seems -- the two meanings are very closely related. (In a sense, a Dictionary is a data structure that represents the result of a map, and vice versa.) Map may even be a useful name to consider for the eventual dictionary protocol.

We also have the option not to emphasize the mapiness of the new type -- it's not really a true dictionary anyway, because it probably wouldn't support arbitrarily inserting custom tickets. So something like TicketStore may be a viable option, too.

That's a very good point -- using the storage offset for the index would unnecessarily complicate invalidation rules. (It would need to work like Set/Dictionary indices, where removals invalidate all indices. It's annoying to have to learn about the custom rules when we're working with concrete collection types, and we cannot make use of them in generic code anyway -- no collection protocol promises the same invalidation behavior.)

The best overall option seems to be to use the identifier as the index. This rules out Collection conformance, since subscript(_: Ticket) is an O(log(count)) operation. That is fine! We can still conform to Sequence, while also providing the subscript and perhaps an equivalent to the indices property.

(If there is a use case for it (such as to enable use of Collection algorithms), we can still provide an elements view that implements Collection with the fancy, ephemeral storage-based Index -- a little like how OrderedDictionary works. However, I suspect this may be unnecessary in this case.)

Ticket does paint a nice picture!

Do we need to be consistent with Identifiable, which calls a similar type Identifier? (I'm not sure.)

Using a closure may be a good idea in this case! I would prefer not to have a new protocol just for this use case, and my usual reasons for disliking closure-based data structures don't really apply here. (E.g., not being able to optimize closure calls, or to run equality checks etc.) I think the identifier type needs to be Comparable, but that may be the only thing we need.

One potential gotcha with making the identifier type generic is that I expect most people would default to Int, which will often overflow on 32-bit systems. Perhaps highlighting this fact in the docs would be an acceptable workaround.

(Selecting a concrete identifier type instead of leaving it generic would probably mean using UInt64 instead of Int, which would be unfortunate, but it's an option.)

tyoung · April 21, 2021, 12:46am

@lorentey Thanks for the bitmap explanation—yeah, that makes a lot of sense!

I do like TicketStore... evocative, without saying anything unnecessary about the underlying implementation. IdentifierStore could work as well.

Agreed re: the identifier type needing to be Comparable and not much else so long as there's a way to specify how to generate new IDs.

I'm gonna see about putting together a first draft of an API this weekend.

kylemacomber · April 23, 2021, 3:57pm

For Identifiable more generally, while I think IDs are often Ints mapping to SQL ROWIDs, they're just as likely to be UUIDs. So I think it could make sense to use a more specific term here, like Ticket, to indicate the extra restriction of the data structure and identifier.

tyoung · April 24, 2021, 12:58am

Not sure what the best way to post a draft interface for review is—I pushed a file to a GitHub fork, but let me know if there's a better place for it.

There's... kind of a lot of noise here, mostly from the kinds of transformations a person might conceivably want to do on a store. We'd want to make it clear in the docs that like 95% of the usage of a TicketStore is gonna be subscript, insert, and removing by Ticket.

Anyway, I'd appreciate any comments/criticism y'all can provide. Once it gets to a point where folks are happy with it, I can get to work on an implementation.

lorentey · May 4, 2021, 5:54am

This looks like a good start! Random comments below:

I think we can leave off Collection conformance.
I prefer not to provide a capacity property unless absolutely necessary. It exposes internal implementation details (as well as arbitrary platform-specific allocation behavior) that would be better left hidden. Additionally, it can potentially prevent changing the underlying data structure to a different one later. (capacity only makes sense for data structures that use a single contiguous buffer for storage.)
It is probably not a good idea to represent storage as a buffer of (Ticket, Value) pairs -- if the two types have different alignments, then this wastes some memory on padding bytes. (It's a variant of the Optional<T> issue.)
_UnsafeBitset does not own its storage, so it cannot be used as a stored property. The bitmap needs to be in managed storage.
The stdliby way to resolve these last two points is to use a single heterogeneous buffer, with its storage sliced into three different regions, one each for the bitmap, all the tickets, and all the values, respectively. (Like Dictionary's native representation.)

Doing this with ManagedBuffer is tricky, as that type does not directly support such use -- we'd need to manually ensure correct alignment for the three separate components, and we need to temporarily rebind memory on every access.

An alternative is to use three separate ManagedBuffers instead. This increases allocation overhead (which I expect will be very measurable for small stores), and potentially also slow down copying store instances: it'll take three retain/release pairs instead of one. (Unless we collect all references into a fourth allocation.)
I'm not sure we need to provide first/last; but if we do, I think it's fine if they take O(n) time -- it's not that slow to search a bitmap. (If there is a good reason to make these fast, then we can also choose maintain their indices within the store instance. This feels overkill to me.)
Thinking about it a little, I think the merge operation looks weirdly out of place. It raises uncomfortable questions about the nature of the tickets. Does it have a use case? (What is it?)
We do not need custom foreach implementations. If there is a good reason to allow iterating through only tickets or only values, then we can provide tickets/values views with Sequence conformances. (Although that would probably make more sense if this also conformed to Collection, which I don't think is necessary.) Otherwise, the existing Sequence implementation already provides a great way to iterate over the contents of a store.
Similarly, I think it may make sense to only provide a single in-place filter method that supplies both tickets and values to its predicate.
At first glance, I also don't think we'll need custom allSatisfy/first(where:)/firstTicket(where:) methods -- it seems to me that the Sequence algorithms will do just as well.
What is popFirst going to be used for? I'm worried that it would become a potential performance pitfall.

tyoung · May 21, 2021, 2:07am

Sorry for the slow reply... life's been chaos around here, as I moved across town and am changing jobs in the span of a couple weeks.

First, thanks so much @lorentey for the detailed, insightful feedback. As you can probably tell, I totally misunderstood Sequence conformance... I see now that just by implementing the iterator stuff, we get filter(), first(where:), etc. Lovely!

I'm happy to drop the implementations of all the stuff you suggested! It's certainly easier to add it later than take it away. I was mostly just looking at the interfaces for Dictionary and Set and such and seeing which pieces I could possibly imagine being useful. The core use case really is just insert, update, & remove... if someone needs to do stuff beyond that, I'm totally comfortable saying just to extend the class.

I see your point re: the single-ManagedBuffer-sliced-into-three-regions. I'll see if I can't put together a draft implementation of the storage stuff this coming week.

tyoung · June 2, 2021, 8:10pm

I've pushed a starting place implementation. This probably isn't worth reviewing yet... I've pushed it mostly to make it clear that I haven't forgotten about this. You can see the commit message for the numerous reasons I'm aware of that would prevent it from being ready to go, but it at least works and is reasonably tested.