[API Review for 1.1] Hash-Array Mapped Prefix Trees

Maybe "HashTrieSet"? The term "trie" is used for similar structures for some reason.

Shareable seems like a very reasonable choice to me.

Any name with "trie" or "prefix tree" in it seems like something you should steer well clear of. Those are specialized data structures that are occasionally extremely useful because of the specialized APIs they support around e.g. finding entries that share a maximal prefix with a given lookup key. We wouldn't want a programmer looking for them to think these data structures could be used that way.

3 Likes

I just saw the term HAMT used above which according to wikipedia is "Hash array mapped trie".

HAMTSet / HAMTDictionary looks good as well.

Shareable seems like a very reasonable choice to me.

Let me beat this dead horse one more time…

The vagueness of “shareable” worries me, especially since any set is shareable by the assumed definition until you edit it. And it seems to imply some sort of locking mechanism to allow cross thread sharing, which isn’t the case.

Why not SharedResourceSet? Says more succinctly what is shared, with no implication that it is threadsafe.

PS my personal favorite is PerpetuatingSet. Says what it does, and frankly, the name kicks butt

3 Likes

The compromise we ended up with is to use the structural type names TreeSet and TreeDictionary. The name for the new module is HashTreeCollections.

I rejected the prefixes Persistent and Shareable due to very persistent (sic) feedback that people consider them misleading.

TreeSet and TreeDictionary are structural rather than functional names, but they avoid any and all such confusion, which makes them a reasonable compromise.

In other programming languages, the names TreeSet and TreeDictionary are more typically used for sorted collections. However, we will use the proper, functional names SortedSet and SortedDictionary for those, leaving the Tree prefix free to reserve for persistent data structures.

When this package gains persistent variants of Array and String, we will therefore have the option to call them TreeArray and TreeString. PersistentSet, PersistentDictionary, PersistentArray and PersistentString would be the most technically correct names, but the association with long term storage seems too strong to accept them at this time. (We can revisit this if/when these types get proposed for inclusion in the Standard Library. It is very much possible that the confusion will dissipate as people become familiar with the concept.)

One more thing: Offline feedback indicated that the current TreeDictionary features for diffing/merging two dictionaries (keys.subtracting/keys.intersection) aren't enough to properly support immediate use cases: we will need a more elaborate interface for structurally combining two dictionary instances. This will require some non-trivial API additions -- I'll open a separate thread once I have a workable implementation.

This review thread is now closed. Thank you all for contributing to this discussion!

18 Likes