Edit: See the third post below for a refined version of this idea.
Discussion in another thread lead me to the following idea, but I don't want to sidetrack that thread any further, especially given this idea won't give C / NS_Options compatible layout interesting in the specific context of OptionSets.
Could we store small sets as tagged pointers?
This idea could work for enumerations whose domain is small:
enum E { case a, b, c }
Set<E> ✅
Enumerations in this case could have rawValues but should have no associated values.
This idea could could also work for a set made from a short range defined on integers:
Set<6...42> ✅
Set<100...150> ✅
However this idea won't work for arbitrary sets †, for example:
[0, 1, 2] as Set<Int> ❌
(† Edit: in fact the idea could still work in this case at the price of some runtime overhead. E.g. if the old set was not tagged and it got a change, and the changed set size is less than 52 we can check that all elements are within a specific range (e.g. the logical choice of 0..<52) - if so this can be converted into a tagged set. and if the set is tagged and we are adding an element that's outside 0..<52 range - we convert it into a non tagged set. The difference here is that with large domains the set could be switched between being tagged and not tagged while in small domains it would always be represented as a tagged pointer.)
AFAIU all 7 main tag values of tagged pointers are already used (BTW, what's "OBJC_TAG_1"?) and we'd have to use an extended tag with a smaller 52-bit payload, which is still good enough for small sets with up to 52 elements.
For this exemplar set:
enum E: String {
case foo, bar, baz, .qux, .quux = "Hello, World"
}
var e: Set<E> = [.foo, .baz, .qux]
we may have the following layout in memory when using tagged pointers:
// Intel:
// 0000 <10 nibbles skipped> 0000 1101 xxxx xxxx 1111
│ ││││ ││││ ││││ ││││
│ ││││ ││││ ││││ │││└─ tagged pointer
│ ││││ ││││ ││││ └┴┴── use extended tag
│ ││││ ││││ ││││
│ ││││ └┴┴┴─┴┴┴┴────── tagged set tag (TBD)
│ ││││
│ │││└──────────────── foo = on
│ ││└───────────────── bar = off
│ │└────────────────── baz = on
│ └─────────────────── qux = on
│
└───────────────────── quux = off
// ARM format old:
// 1111 xxxx xxxx <10 nibbles skipped> 0000 0000 1101
// ARM new format:
// 1xxx xxxx x000 <10 nibbles skipped> 0000 0110 1111
Curiously, Hashability (and even Equatability!) of elements won't be needed in the implementation of tagged set variant.
Thoughts?