Exercise: abstracting over a single-element and multi-element Collection

here’s a swift exercise:

given a union type such as the following:

enum InlineSet<Element>:Hashable where Element:Hashable 
{
    case one        (Element)
    case many   (Set<Element>)
}

how would you implement a Collection interface over it?

extension InlineSet:Collection 
{
}

I guess this would be my implementation. It traps in some cases, but I think it's better than having undefined behavior. It also allows for valid comparisons between indices in the one range which is probably unnecessary, but why not.

extension InlineSet: Collection {
    enum Index: Comparable {
        // Only 0 is valid, but other indices are provided
        // for valid comparisons after incrementing the index.
        case one(Int = 0)
        
        case many(Set<Element>.Index)
        
        static func > (a: Self, b: Self) -> Bool {
            switch (a, b) {
            case let (.one(indexA), .one(indexB)): 
                return indexA > indexB
                
            case let (.many(indexA), .many(indexB)):
                return indexA > indexB
                
            case (.one, .many), (.many, .one):
                fatalError("Cannot compare indices of different inline sets.")
            }
        }
    }
    
    var startIndex: Index {
        switch self {
        case .one: return .one()
        case let .many(set): return .many(set.startIndex)
        }
    }
    
    var endIndex: Index {
        switch self {
        case .one: return .one()
        case let .many(set): return .many(set.endIndex)
        }
    }
    
    func index(after i: Index) -> Index {
        switch (self, i) {
        case let (.one, .one(index)): 
            return .one(index + 1)
            
        case let (.many(set), .many(index)): 
            return .many(set.index(after: index))
            
        case (.one, .many), (.many, .one):
            fatalError("Cannot increment indices of a different inline set.")
        }
    }
    
    subscript(position: Index) -> Element {
        switch (self, position) {
        case let (.one(element), .one(index)):
            guard index == 0 else {
                fatalError("Index out of bounds.")
            }
            
            return element
            
        case let (.many(set), .many(index)):
            return set[index]
            
        case (.one, .many), (.many, .one):
            fatalError("Cannot access an element with an index of a different inline set.")
        }
    }
}
1 Like

Why are you doing this?

i need to store a great many Sets, where the most common element count is (by far) 1. rather than creating tons and tons of single-element Sets in the heap, i would rather burn a few extra bytes of stride to store these single-element collections directly in the main allocation.

won’t this generate an empty sequence for the .one(_:) case?

Got you. We have the "short string optimization", perhaps we may consider a similar "short set / dictionary / array optimization" (if we don't have that already) so that all users of those container types take advantage of the optimization without the need to reimplement it manually.

4 Likes

well… it’s not always a win. Set has an alignment of 8, so trying to form any kind of union with it will double its memory footprint. i’m only doing it here because my data is distributed in a way such that over 98% of the Sets would be single-element Sets without the inline storage optimization.

1 Like

Right, yeah it should be Index.one(1)

How were you thinking of implementing the Collection conformance?