Add Accumulator/TinyArray to the standard library?

in a twist of convergent evolution, i noticed _TinyArray getting added in a PR to the swift-certificates package.

this was interesting to me because i have an assortment of similarly-shaped types in my code as well, such as:

@frozen public
enum Accumulator<Element>
{
    case none
    case one  (Element)
    case many([Element])
}

seems like a good candidate for a standard library type?

5 Likes

I think it would be great if we could lift this or a similar type into a common library. The right place for this, at least in the beginning, would be swift-collections though.
Ideally we would also have a more general "SmallArray" type that stores more than one element inline. LLVM comes with class template called SmallVector that is a good reference implementation. The number of inline elements can be configured or left unspecified to use some heuristic i.e. use ~64 bytes of inline storage. However, Swifts generics system today doesn't really allow us to customise the number of inline stored elements. We could pre-generate different sizes with macros but ideally this would be part of the generic type system.

In addition, swift-certificate would benefit from a data structure which I would call SmallDictionary. We have an ad-hoc implementation in swift-certificates today for Certificate.Extensions and ExtendedKeyUsage. Instead of using a Dictionary we use an Array and use linear search to find, remove, modify items and find duplicates. This is faster than hashing in our use case, usually for up to 64 elements. One needs to be careful though because for larger collections this can slow down your program exponentially. We just limit the number of elements to 32 to be safe as we never expect more than that but we could also switch the underlying storage to a real Dictionary as well. This type could also store some elements inline to further improve performance.

cc: @lorentey who probably has some thoughts on this as well.

4 Likes

Do we really need new types for that, or can we improve our existing Array and Dictionary types?
For example, String already stores the content inline if it is small enough. We could do the same for collections, too.

1 Like

Array and Dictionary are frozen, so changing them represents an ABI break.

2 Likes

When you get to this level of optimisation, the number of potential designs explodes based on usage patterns.

For example, binary search is algorithmically faster than a linear search, and depending on the size of the collection and elements, cost of comparison, and predictability of user requests, it can make lookups much faster. But it requires that the elements are stored in sorted order, so insertions and removals are more expensive.

I feel that there is room for a library which offers these kinds of collections tuned to specific use-cases.

2 Likes

perhaps. i just hope if it gets prototyped in an SPM package that the type will actually “graduate” to the standard library, we have too many types (OrderedDictionary<Key, Value>, Deque<T>, etc) that seem to be stuck in permanent adolescence in that package :slight_smile:

not quite the same thing, but i did just dig up this snippet i needed to repurpose today:

extension _
{
    enum AccumulatorDictionary
    {
        case one ((Key, Value))
        case many([Key: Value])
    }
}

extension _.AccumulatorDictionary? 
{
    mutating 
    func insert(_ key:Key, value:Value)
    {
        switch _move self
        {
        case nil: 
            self = .one((key, value))
        case .one((key, _))?: 
            self = .one((key, value))
        case .one(let other)?: 
            self = .many([other.0: other.1, key: value])
        case .many(var items)?:
            items[key] = value
            self = .many(items)
        }
    }
}

of course i need to rewrite this so that it doesn’t use an extension on Optional<T> so that i can make AccumulatorDictionary generic. (and also stop using experimental _move)

but it does show that i’ve needed a “TinyDictionary” counterpart to “TinyArray”, so maybe they go together.


here’s what i’ve got so far:


@frozen public
enum InlineDictionary<Key, Value> where Key:Hashable
{
    case one ((Key, Value))
    case some([Key: Value])
}
extension InlineDictionary:Sendable where Key:Sendable, Value:Sendable
{
}
extension InlineDictionary
{
    @inlinable public
    subscript(key:Key) -> Value?
    {
        _read
        {
            switch self
            {
            case .one((key, let value)):    yield value
            case .one:                      yield nil
            case .some(let items):          yield items[key]
            }
        }
        _modify
        {
            switch self
            {
            case .one((key, let value)):
                self = .some([:])
                var value:Value? = value
                defer
                {
                    if  let value
                    {
                        self = .one((key, value))
                    }
                }
                yield &value

            case .one(let item):
                var value:Value? = nil
                defer
                {
                    if  let value
                    {
                        self = .some([item.0: item.1, key: value])
                    }
                }
                yield &value

            case .some(var items):
                self = .some([:])
                defer
                {
                    if  let item:(Key, Value) = items.first, items.count == 1
                    {
                        self = .one(item)
                    }
                    else
                    {
                        self = .some(items)
                    }
                }
                yield &items[key]
            }
        }
    }
    @inlinable public
    subscript(key:Key, default value:@autoclosure () -> Value) -> Value
    {
        _read
        {
            switch self
            {
            case .one((key, let value)):    yield value
            case .one:                      yield value()
            case .some(let items):          yield items[key, default: value()]
            }
        }
        _modify
        {
            switch self
            {
            case .one((key, var value)):
                self = .some([:])
                defer { self = .one((key, value)) }
                yield &value

            case .one(let item):
                var value:Value = value()
                defer { self = .some([item.0: item.1, key: value]) }
                yield &value

            case .some(var items):
                self = .some([:])
                if  items.isEmpty
                {
                    var value:Value = value()
                    defer { self = .one((key, value)) }
                    yield &value
                }
                else
                {
                    defer { self = .some(items) }
                    yield &items[key, default: value()]
                }
            }
        }
    }
}

thoughts welcome :slight_smile:

1 Like