Note that we already have this, in a sense, for unsigned integers: you can use the sign bit for the 'none' value, e.g. an Int64 where negative values mean nil, making it essentially an Optional<UInt63>.
Of course, Swift won't do this for you, which makes it relatively tedious and error-prone.
There is some appeal to being able to express that to Swift; to give it permission to use a bit rather than as much as 64 more bits (due to alignment requirements).
But, that's tangential to this pitch; I don't think it obviates the need for nor merit of [U]Int128.
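The sign-bit trick described above can be sketched as a small wrapper. This is a hypothetical type (the name `OptionalUInt63` and its API are made up for illustration), not anything the standard library provides:

```swift
// Sketch: an Int64 whose negative values all mean "none",
// effectively an Optional<UInt63> packed into 8 bytes.
struct OptionalUInt63 {
    private var rawValue: Int64   // < 0 means nil

    init(_ value: UInt64?) {
        if let value {
            precondition(value <= UInt64(Int64.max), "value needs 63 bits or fewer")
            rawValue = Int64(value)
        } else {
            rawValue = -1
        }
    }

    var value: UInt64? {
        rawValue < 0 ? nil : UInt64(rawValue)
    }
}
```

For comparison, `MemoryLayout<OptionalUInt63>.size` is 8, while a plain `MemoryLayout<UInt64?>.size` is 9: the compiler has to add a tag byte because UInt64 has no spare bit patterns.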
My take on limited-range integer values in general is that it's better to keep the limited values, and corresponding smaller representations, confined to the storage where those smaller representations are necessary, while still providing a common big-enough Int type as your API. A property macro might be able to take a Pascal-ish declaration like
@LimitedRange(0...9000) var powerLevel: Int
and produce a storage property of smaller type, with bounds checks on get/set when converting to Int at the API level. If the field is Optional typed it could also use an out-of-range value to represent the nil case.
I think the main issue with that is that it turns trivial assignment into something that can crash, e.g. if you write powerLevel = 9001.
You can try to work around it with bespoke setters that throw instead (not great ergonomics without additional language support), or you can silently clamp the value (but that doesn't make sense in many cases).
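Pending such a macro, a property wrapper can approximate the idea. Everything here is a hypothetical sketch (the `LimitedRange` wrapper and `Reactor` type are made up): Int-typed API, smaller Int16 storage, and a trapping bounds check that illustrates the crash-on-assignment concern:

```swift
// Hypothetical approximation of @LimitedRange as a property wrapper:
// Int at the API level, Int16 in storage, bounds-checked on get/set.
@propertyWrapper
struct LimitedRange {
    private var storage: Int16
    private let range: ClosedRange<Int>

    init(wrappedValue: Int, _ range: ClosedRange<Int>) {
        precondition(range.contains(wrappedValue), "out of range")
        self.range = range
        self.storage = Int16(wrappedValue)
    }

    var wrappedValue: Int {
        get { Int(storage) }
        set {
            // This is the trap: `powerLevel = 9001` crashes here.
            precondition(range.contains(newValue), "out of range")
            storage = Int16(newValue)
        }
    }
}

struct Reactor {
    @LimitedRange(0...9000) var powerLevel: Int = 0
}
```

Note the wrapper has to carry the range at runtime (an extra ClosedRange<Int>), which defeats the space saving; a macro could instead bake the bounds into generated accessors and keep only the Int16.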
How can I make a 1 byte struct like Int8, and optional of it that doesn't add another byte?
struct Int7 { // one byte, values in -64 ... +63 range
    private var rawValue: Int8
    ...
}
struct Int8x { // one byte, values in -127 ... +127 range
    private var rawValue: Int8
    ...
}
var x: Optional<Int7> = .some(42) // still one byte wanted!
var y: Optional<Int8x> = .some(42) // still one byte wanted!
If you're forcing the type conversion onto your clients, then a crash is still there when they write powerLevel = UInt16(someInteger) or something like that. Maybe a developer could use one of the other non-trapping constructors but I doubt they would in practice. Trapping when preconditions are unmet is usually the right thing to do.
The supported way would be to define an enum with fewer than 128 cases. The less-supported way would be to -enable-experiment-feature BuiltinModule and wrap the Builtin.Int7 type in a struct.
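The enum route can be checked directly: a payload-free enum with fewer than 256 cases occupies one byte and leaves spare bit patterns ("extra inhabitants") that Optional can reuse for .none, so wrapping it costs no extra byte. (`Digit` is an illustrative example, not a standard type.)

```swift
// Ten cases fit in one byte and leave 246 unused bit patterns.
enum Digit: Int8 {
    case zero, one, two, three, four, five, six, seven, eight, nine
}

MemoryLayout<Digit>.size      // 1
MemoryLayout<Digit?>.size     // 1 — nil occupies a spare bit pattern

// Int8 uses all 256 patterns, so Optional needs an extra tag byte:
MemoryLayout<Int8?>.size      // 2
```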
Aligning to the size of the type does have one significant advantage; it stops reads or writes of that type from splitting across cache line and page boundaries. It's hard to be certain, but I would expect many uses of these large types to want to align to the size of the type even if the underlying implementation supported a smaller alignment boundary, to avoid cross-boundary reads and writes.
I also note that malloc() on many platforms already aligns to 16 byte boundaries because of the alignment requirements of vector instructions, and further that sensible choice of ordering within a struct or class would allow the compiler to pack data into the space that would otherwise be used as padding. Given that some platforms will be 16-byte aligned, it seems likely that structs that contain Int128s would be ordered appropriately for that situation anyway, so the memory saving is probably more limited in practice.
Maybe the fix here is to make Array<Optional<SomeType>> keep the optional flags in a side table instead of with the data, unless SomeType has extra inhabitants. That would be more efficient in the general case anyway; the only problem is if we've somehow guaranteed the memory layout of Array<Optional<SomeType>>.
The Array type actually doesn't guarantee contiguous storage, not least because it can be bridged from NSArray, which also doesn't guarantee contiguous storage. ContiguousArray does, of course.
When working with this type I need to ensure that the "nil" pattern (which, experimentally, happens to be 00 00 ... 00 FF) doesn't arise during normal operations on the type (like +, -, etc.). For example, with Int16x: the valid range is -32767 ... +32767, and when converting between user-facing values and the internal representation I can offset that range so that the prohibited value 0x8000 corresponds to the nil pattern 0x00FF (taking endianness into account).
I can do a similar trick with Optional<struct holding an UnsafeRawPointer>, which is also 8 bytes, but that only works for 8-byte-sized types.
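The pointer variant is easy to verify: UnsafeRawPointer reserves the null bit pattern, so Optional reuses it and a wrapping struct stays at 8 bytes on a 64-bit platform. (`PointerBox` is an illustrative name.)

```swift
// A pointer has a reserved null pattern, so Optional is free.
struct PointerBox {
    var pointer: UnsafeRawPointer
}

MemoryLayout<PointerBox>.size    // 8
MemoryLayout<PointerBox?>.size   // 8 — nil is the null pattern

// A plain UInt64 has no spare patterns, so Optional needs a tag byte:
MemoryLayout<UInt64?>.size       // 9
```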
PS. For good or bad such types won't be aligned the same way as their normal integer counterparts.
PPS. I'm missing C unions.
It might be interesting to expand our notion of type layout to have a "preferred alignment" that can be more than the ABI-required alignment, which we would use when allocating or emplacing values of the type without any other constraints, while generating code that is tolerant of values being only at the required alignment. Since most instructions on contemporary CPUs can tolerate less-than-ideally-aligned memory accesses, you can get the benefit of well aligned data "for free" without making that good alignment a global structural requirement.
We generally want people to think of it as guaranteeing contiguous storage, since even if it's backed by an NSArray, you can force it into a linear array using withUnsafeBufferPointer, and developers use that as if it's a "free" operation so they'll notice if it starts getting more expensive. I agree with you and @ksluder that, in most cases where you're consciously storing an array of sparsely-present values, a better data structure is appropriate, but [Int128?] could nonetheless arise from generic standard library or common utility packages.
If the NSArray grows beyond a certain size, that will be expensive already (because large NSArrays are backed by a 2-3 tree, not by contiguous memory). But yes, that is the trip hazard here if we pulled the optional flags out into a separate block of memory.
We could definitely build such a thing, but I think we'd make it a different type than Array, because it violates a bunch of basic expectations people have for Array.
The existing trip hazard is only there for code that has to interoperate with Objective-C, where the Objective-C side uses NSArray subclasses that don't themselves implement a contiguous storage SPI. So it's not a concern for "pure" Swift code or code on non-Apple platforms. It would be a new thing if we started making Array produce not-necessarily-contiguous storage on behalf of native Swift code.