In the documentation on GitHub there is an interesting example of using extra inhabitants for memory optimization of multi-payload enums:
enum TerminalChar { => LLVM i32
case Plain(UnicodeScalar) => i32 (zext i21 %Plain to i32)
case Bold(UnicodeScalar) => i32 (or (zext i21 %Bold to i32), 0x0020_0000)
case Underline(UnicodeScalar) => i32 (or (zext i21 %Underline to i32), 0x0040_0000)
case Blink(UnicodeScalar) => i32 (or (zext i21 %Blink to i32), 0x0060_0000)
case Empty => i32 0x0080_0000
case Cursor => i32 0x0080_0001
}
I tried to reproduce it, but got a different result, with an additional byte used as a distinct tag discriminator. Is this tricky technique with extra inhabitants no longer used?
enum TerminalChar { // With UnicodeScalar(0) as payload
case Plain(UnicodeScalar) => 0x0000_0000_00
case Bold(UnicodeScalar) => 0x0000_0000_01
case Underline(UnicodeScalar) => 0x0000_0000_02
case Blink(UnicodeScalar) => 0x0000_0000_03
case Empty => 0x0000_0000_04
case Cursor => 0x1000_0000_04
}
This optimization is still widely used in the compiler. The particular problem with this example is that Unicode.Scalar is no longer modeled as a UInt21 but instead as a UInt32, which doesn't have any extra inhabitants. You can see an example of this optimization with Bool, which is a Builtin.Int1:
enum TerminalChar { // With false as payload
case plain(Bool) // 0x00
case bold(Bool) // 0x20
case underline(Bool) // 0x40
case blink(Bool) // 0x60
case empty // 0x80
case cursor // 0x81
}
MemoryLayout<Bool>.size == MemoryLayout<TerminalChar>.size // true
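For contrast, here is a quick way to see the difference between the two payload types (a sketch; the exact numbers assume a current toolchain and match the layouts shown above):
// Unicode.Scalar payload: a UInt32 with no spare bits, so the tag needs its own byte.
enum ScalarChar {
    case plain(Unicode.Scalar), bold(Unicode.Scalar)
    case underline(Unicode.Scalar), blink(Unicode.Scalar)
    case empty, cursor
}

// Bool payload: a Builtin.Int1, so the tag fits in the 7 spare bits.
enum BoolChar {
    case plain(Bool), bold(Bool), underline(Bool), blink(Bool)
    case empty, cursor
}

print(MemoryLayout<ScalarChar>.size) // 5: payload plus a separate tag byte
print(MemoryLayout<BoolChar>.size)   // 1: tag packed into the payload's spare bits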
I have no idea, but I did manage to find the commit that changed the representation, and it seems it was all the way back in 2014 (which is way before my time) here: Adjust test cases. · apple/swift@fad8747 · GitHub
Yeah, after a little thinking, it is really not a rare case. Some reference-type payloads use this technique to store the tag discriminator in spare bits.
Optional<T> is not rare. In particular, Optional<UnicodeScalar> would take one byte more than UnicodeScalar, and if you have a vector of those it'd take twice as much space due to alignment.
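To make that concrete (a sketch; the sizes assume a 64-bit platform and a current toolchain):
// Unicode.Scalar is a 4-byte UInt32 with no extra inhabitants, so Optional
// needs an extra tag byte, and alignment then pushes the stride up to 8.
print(MemoryLayout<Unicode.Scalar>.size)    // 4
print(MemoryLayout<Unicode.Scalar?>.size)   // 5
print(MemoryLayout<Unicode.Scalar>.stride)  // 4
print(MemoryLayout<Unicode.Scalar?>.stride) // 8, twice the space per element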
"Range" types could also have benefited from this optimization (say, a type which can hold an integer from 10 to 1000).
If you only need 10 bits to store a value, you'd have to allocate 2 bytes of storage, leaving 6 bits unused; those could be repurposed to store enum tags.
Even if you don't have "whole" unused bits, you'd be able to repurpose the unused bit patterns to represent other things:
enum E {
case v000 // 0b00000000
case v001 // 0b00000001
...
case v252 // 0b11111100
case v253 // 0b11111101
}
Note that these 2 bit patterns are unused: 0b11111110 and 0b11111111, so we can use one of them to represent Optional.none.
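You can watch the compiler use such spare bit patterns with any no-payload enum that has fewer than 256 cases (a quick sketch; sizes checked on a current toolchain):
// Only 3 of the 256 bit patterns are used, so the leftover patterns act as
// extra inhabitants and Optional (even nested) doesn't grow the size.
enum Three { case a, b, c }

print(MemoryLayout<Three>.size)   // 1
print(MemoryLayout<Three?>.size)  // 1
print(MemoryLayout<Three??>.size) // 1

// UInt8 uses all 256 patterns, so Optional needs a separate tag byte.
print(MemoryLayout<UInt8?>.size)  // 2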
In the case of a range, say 10 ... 1000 as I used above, the particular bit allocation can vary. In this case it could simply be:
10 -> 0b_0000_0000_0000_1010
11 -> 0b_0000_0000_0000_1011
....
1000 -> 0b_0000_0011_1110_1000
// all other bit patterns are unused and can be repurposed
// to represent something else, e.g. Optional.none
or it could be a "translated" representation:
10 -> 0b_0000_0000_0000_0000
11 -> 0b_0000_0000_0000_0001
....
1000 -> 0b_0000_0011_1101_1110
// all other bit patterns are unused and can be repurposed
// to represent something else, e.g. Optional.none
which is particularly useful in the case of ranges that could benefit size-wise (in this case one byte instead of two):
// 900 ... 1000
900 -> 0b_0000_0000
901 -> 0b_0000_0001
....
1000 -> 0b_0110_0100
// all other bit patterns are unused and can be repurposed
// to represent something else, e.g. Optional.none
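As far as I know the compiler won't do any of this automatically for a hand-written range type today, but the "translated" encoding itself is easy to sketch by hand (RangedValue is a hypothetical type, just to illustrate the bit allocation; the spare patterns here are repurposed manually rather than by the compiler):
// Stores a value in 900...1000 biased by 900 in a single byte, so only the
// patterns 0...100 are valid and 101...255 are free; we repurpose 255 as a
// hand-rolled "none", the same trick the compiler plays with extra inhabitants.
struct RangedValue {
    private var bits: UInt8 // value - 900, or 255 meaning "empty"

    init?(_ value: Int) {
        guard (900...1000).contains(value) else { return nil }
        bits = UInt8(value - 900)
    }

    private init(raw: UInt8) { bits = raw }
    static let none = RangedValue(raw: 255)

    var value: Int? { bits <= 100 ? Int(bits) + 900 : nil }
}

print(MemoryLayout<RangedValue>.size) // 1: the whole thing fits in one byte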
Maybe I put it wrong. I understand these optimisations with extra inhabitants and spare bits, but I don't understand how Range can be related to them. Can you provide some example? Something like
enum CustomEnum {
case range(Range<Int>)
case anotherOne
}
looks meaningless. Range is a struct with two Ints inside.