[Pre-proposal/Discussion] Padding and Smaller Integer Types?


(Haravikk) #1

So we have a pretty good set of integer types right now, but one issue I run into is when they don't quite fit within the padding of a type. It's not something I've ever really thought about before, since most of my background is in OOP with reference types (Java for example), which don't really encourage you to think about it, but with value types I find myself doing so quite a lot.

For example, an Int32? has space for a four-byte integer value, plus one extra bit to indicate whether it is a .None or a .Some, which means that it takes up 5 bytes. If I create a struct containing an optional Int32, I have seven bits of free space before the type grows in size again (more if I target the type's stride width), so I could add up to seven Bool values if I wanted in this case.
But what if I want to add another small integer? If I add a UInt8 this will cause the type's size to grow (as it'll still have one bit of extra overhead), but I may not require the full range of the UInt8.
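To make the numbers above easier to check, here is a minimal sketch using the standard library's MemoryLayout. The struct names and the describe helper are made up for illustration, and the exact figures are a detail of the compiler and platform rather than a language guarantee.

```swift
// Hypothetical structs, used only to inspect layout.
struct WithByte {
    var value: Int32?
    var small: UInt8
}

struct WithFlag {
    var value: Int32?
    var flag: Bool
}

// Formats the size, stride and alignment reported for a type.
func describe<T>(_ type: T.Type) -> String {
    return "\(type): size \(MemoryLayout<T>.size), "
        + "stride \(MemoryLayout<T>.stride), "
        + "alignment \(MemoryLayout<T>.alignment)"
}

print(describe(Optional<Int32>.self))
print(describe(WithByte.self))
print(describe(WithFlag.self))
```

On the 64-bit toolchains I'm aware of, Int32? reports a size of 5 and a stride of 8, and adding the UInt8 grows the size to 6 while the stride stays at 8. As far as I can tell the same happens with a Bool: each stored property of a struct gets at least a whole byte, so the spare bits next to the optional's tag are not reused automatically.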

The simple solution to this might be to introduce a nibble type (Int4/UInt4), but that may actually be smaller than I need; what if what I would really like is a UInt7 type?

Now I'm not very familiar with how other languages solve problems like this; as I say, I've mostly worked with reference-based OOP languages in the past, and only fleetingly with C (I can modify code, but haven't written anything new in it in a very long time). So I guess I'm mostly looking for those with more knowledge than me on the subject, to see if there are some good solutions to this. Swift's rich support for optionals, and enums in general, makes small overheads like this more common, so a neat way to work with them would be nice.

In my mind the ideal option would be some kind of integer type for which I can specify the minimum range I need, but which is scaled in size to fill, without growing, my type's size or stride. For example, if my type is intended for storage in arrays exclusively, then I might like to use up all of the free space in my type's stride width, only growing if it is necessary to achieve the minimum range of values I require. So I might require a minimum range of 0-127 (7 bits unsigned) but end up with a wider type that I can make use of generically by querying the min/max values. That is, if the remaining space is 7 bits then my integer will fill that perfectly; if the remaining space is 13 bits I will end up with a 13-bit integer; if the remaining space is 3 bits the type might grow by 16 bits, giving me a 19-bit integer.
Is such a thing possible? If so, how is it achieved in other languages?
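For comparison, C handles this with bit-fields, and the same effect can be had by hand with masks and shifts. Below is a rough sketch of that workaround in Swift, not an answer to the automatic sizing question. The PackedRecord type and its property names are hypothetical; it stores the optional's flag and a 7-bit integer together in a single byte next to the Int32 payload.

```swift
// Hypothetical type: an optional Int32 plus a 7-bit unsigned integer,
// packed by hand into 4 + 1 bytes of storage.
struct PackedRecord {
    private var payload: Int32 = 0  // storage for the optional's value
    private var bits: UInt8 = 0     // bit 7: payload present, bits 0-6: small integer

    init() {}

    // The optional Int32, rebuilt from the payload and the flag bit.
    var value: Int32? {
        get { return (bits & 0x80) != 0 ? payload : nil }
        set {
            if let newValue = newValue {
                payload = newValue
                bits |= 0x80        // set the "present" flag
            } else {
                payload = 0
                bits &= 0x7F        // clear the flag, keep the small integer
            }
        }
    }

    // A 7-bit unsigned integer in the range 0...127.
    var small: UInt8 {
        get { return bits & 0x7F }
        set {
            precondition(newValue <= 0x7F, "small must fit in 7 bits")
            bits = (bits & 0x80) | newValue
        }
    }
}

var record = PackedRecord()
record.value = 42
record.small = 100
print(record.value as Any, record.small, MemoryLayout<PackedRecord>.size)
```

The cost is that the packing is fixed at 7 bits by the masks, so it does not widen on its own when more padding happens to be available, which is the part of the idea that would need language support.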

Sorry about the lack of knowledge on my part on this subject; like I say, it's nothing I've really thought about before, but now that I'm working with a value type that I want to pack as efficiently as possible into an array (while remaining useful) it's something that is suddenly of interest to me =D

All ideas and thoughts welcome!


(Ben Rimmington) #2

UnicodeScalar previously stored its _value as Builtin.Int21 (instead of the current UInt32).

<https://github.com/apple/swift/blob/b86ebcc042ae87b3de64dcaa7100454291446da2/stdlib/core/UnicodeScalar.swift>

There are some out-of-date examples, where enums with UnicodeScalar payloads used the spare bits (i32 \ i21) for the tag.

<https://github.com/apple/swift/blob/master/docs/ABI.rst#single-payload-enums>
<https://github.com/apple/swift/blob/master/docs/ABI.rst#multi-payload-enums>
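The effect those ABI sections describe can still be observed with types that don't use every bit pattern of their storage. A small sketch, assuming a current toolchain; the Direction enum is hypothetical and the layout shown is an implementation detail rather than something the language promises.

```swift
// Hypothetical enum with four cases: it needs only two bits of its byte.
enum Direction {
    case north, east, south, west
}

// Optional is a single-payload enum. When the payload has unused bit
// patterns, the compiler can encode .none in one of them instead of
// adding a separate tag byte.
print(MemoryLayout<Bool>.size, MemoryLayout<Bool?>.size)
print(MemoryLayout<Direction>.size, MemoryLayout<Direction?>.size)

// Int32 uses all of its bit patterns, so its optional needs extra storage.
print(MemoryLayout<Int32>.size, MemoryLayout<Int32?>.size)
```

On the toolchains I'm aware of, the first two lines report the same size with and without the optional, while Int32? grows from 4 to 5 bytes.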

Maybe these strategies can't be used for libraries with binary compatibility?

<http://jrose-apple.github.io/swift-library-evolution/#enums>

-- Ben


On 3 Aug 2016, at 13:15, Haravikk wrote:
