Fixing enums on big-endian systems

mundaym · March 18, 2019, 12:24pm

Hi,

We are currently working on the Swift port to the big-endian platform IBM Z (the 64-bit evolution of the S/390 architecture also commonly referred to as s390x). While the port is mostly functional there are some fundamental issues with the way discriminated unions (enums in Swift) work on big-endian platforms. This has resulted in a steady stream of bugs being found.

Below I've written out a long-winded explanation of how enums are currently represented in memory to help me better understand the problems we have faced. If you spot a mistake I'd be really grateful if you could let me know.

To fix the core issue I think we need to strictly define the spare bit vector (whether represented as a SpareBitVector, directly as a ClusteredBitVector or as an APInt) as a little-endian integer representation of the memory layout. This means that on big-endian systems we'll need to byte swap 'chunks' when adding to - and reading from - the spare bit vectors, probably making use of helper functions or maybe methods on SpareBitVector` (if possible without ending up with circular dependencies).

If you have any ideas on ways to break up this task a bit I'd be really interested to hear them. Currently it looks like I'm going to have to change a lot of code all at once.

If you have any other thoughts on how we can approach fixing this it would be great to discuss them too. I'd like to make this fix as robust as possible.

I hope to get this fixed in the next few weeks.

Thanks very much for your time,
Michael

Background

The issues this post is concerned with affect enums with multiple cases, one or more of which has a payload. In the Swift compiler enums fitting this description are implemented either as a single-payload enum or a multi-payload enum. A single-payload enum consists of exactly one case with a payload as well as one or more empty cases. Multi-payload enums are any other enum with multiple cases and one or more payloads. In particular, multi-payload enums are used whenever there are multiple cases with payloads regardless of whether the payloads have the same type.

All enums with payloads and multiple cases are laid out the following way in memory:

0                   N                  M+N
|      payload      |  extra tag bytes  |

Where N is the payload storage size in bytes (unlimited) and M is the number of extra tag bytes used (0, 1, 2, or 4). Swift only allows 2³² (?) possible case values.

There are two primary ways to pack case values into payloads without using extra tag bytes that are used in Swift: extra inhabitants and spare bits.

Extra inhabitants are integer values that do not represent a valid instance of a payload type. One example of a type with extra inhabitants is a pointer: values in the range [0:4095] are not valid and can be used to represent case values. Another example is enums without a payload that have fewer case values than their integer-representation can hold. For example, a enum with two cases (0 or 1) still takes up one byte in storage, so the extra inhabitants (in the unsigned domain) are values in the range [2:255]. Extra inhabitant information is available at runtime (we can get and set the extra inhabitant values through the value witness table).

Spare bits are individual bits in a type that when set result in an invalid value of that type, regardless of what other bits are set. For example, a boolean value can be represented using just one bit. When that bit is stored as the least significant bit of a byte all the other bits of that byte are spare since if they are set the value won't be 0 or 1. Spare bit information isn't available at runtime so we can only use it if the payload types are known and fixed everywhere.

The two concepts are (extra inhabitants and spare bits) are related in that any type with spare bits will always have extra inhabitants, as any value with a spare bit set is an extra inhabitant.

Single-Payload Enum Layout

Single payload enums use extra inhabitants in the payload type to pack empty (i.e. no-payload) case values.

If there are enough extra inhabitants in the payload type for all empty case values then we store all the enum cases as extra inhabitant values in the payload and do not need any extra tag bytes. If the payload is a valid instance of the payload type then we know the case is the element with a payload.

For example Optional<Bool> is one byte in size and the cases are mapped to the value of that byte as follows:

Value	Memory Layout
`Optional<Bool>.some(false)`	`[0]`
`Optional<Bool>.some(true)`	`[1]`
`Optional<Bool>.none`	`[2]`

If the payload does not contain enough extra inhabitants to represent all the empty case values then we need one or more extra tag bytes. Extra inhabitants are still used to represent as many empty case values as possible. The extra tag bytes will be zero for both the case with a payload and the empty cases represented using extra inhabitants. Extra tag bytes with a value of one or more indicate the payload contains an integer value representing other empty cases. We represent up to 2^N empty case values in the payload, utilising the full width of the payload, per tag value. For payload types containing enough bytes to represent all empty cases we only need two tag values: 0 when the payload is valid or is an extra inhabitant, and 1 when the payload represents an empty case value.

For example, the type Optional<Int8> has two cases and the payload type has no extra inhabitants. It is laid out in memory as:

Value	Memory Layout
`Optional<Int8>.some(x)`	`[x, 0]`
`Optional<Int8>.none`	`[0, 1]`

BigEnum has 257 cases and the payload type has 254 extra inhabitants:

enum BigEnum {
    case Case0(Bool)
    case Case1
    ...
    case Case254
    case Case255
    case Case256
}

It is laid out in memory as:

Value	Memory Layout
`BigEnum.Case0(x)`	`[ x, 0]`
`BigEnum.Case1`	`[ 2, 0]`
`BigEnum.Case2`	`[ 3, 0]`
...
`BigEnum.Case254`	`[255, 0]`
`BigEnum.Case255`	`[ 0, 1]`
`BigEnum.Case256`	`[ 1, 1]`

Multi-Payload Enum Layout

Multi-payload enums use spare bits in the payload area to store case values. Any spare bits used in the payload must be spare in all the possible payload types. These are referred to in the code base as 'common spare bits'. The size of the payload is the storage size of the largest payload type.

The disciminator for a multi-payload enum is split into two: the tag and the tag index. There is one tag value for each case with a payload, regardless of whether it shares the same payload type as another case. There are also one or more tag values that represent no-payload cases. Where there are fewer tag values for no-payload cases than there are no-payload cases we use the tag index to discriminate between them. The tag index, if it exists, is stored in the bits of the payload that aren't used as spare bits, known in the code base as 'used bits'.

The decision on how to lay out multi-payload enums is taken as follows:

Firstly we need to calculate the number of tag values and the number of tag index values. The number of possible tag index values is 2^U, where U is the number of 'used bits' in the payload. We need one tag value per case with payload. We also need ceil(E / 2^U) tag values for empty cases, where E is the number of empty cases.

If there are fewer than 2^C tag values, where C is the number of common spare bits, then we pack all the tag values into the payload's common spare bits. If there are more than 2^C tag values then we need enough extra tag bytes to represent floor(T / 2^C) values, where T is the number of tag values.

The number of tag index values is less than or equal to 2^U, where U is again the number of 'used bits' in the payload. If there are more no-payload cases than tag index values then we add extra tag values.

For example, if we add an extra payload case to BigEnum then we have a multi-payload enum:

enum BigEnumMulti {
    case Case0(Bool)
    case Case1(Bool)
    case Case2
    case Case3
    ...
    case Case254
    case Case255
    case Case256
}

The spare bit mask for Bool is 0b1111_1110. The used bits mask is therefore 0b0000_0001.

Value	Memory Layout	Tag	Tag Index
`BigEnumMulti.Case0(false)`	`[ 0, 0]`	0	N/A
`BigEnumMulti.Case0(true)`	`[ 1, 0]`	0	N/A
`BigEnumMulti.Case1(false)`	`[ 2, 0]`	1	N/A
`BigEnumMulti.Case1(true)`	`[ 3, 0]`	1	N/A
`BigEnumMulti.Case2`	`[ 4, 0]`	2	0
`BigEnumMulti.Case3`	`[ 5, 0]`	2	1
`BigEnumMulti.Case4`	`[ 6, 0]`	3	0
`BigEnumMulti.Case5`	`[ 7, 0]`	3	1
...
`BigEnumMulti.Case253`	`[255, 0]`	127	1
`BigEnumMulti.Case254`	`[ 0, 1]`	128	0
`BigEnumMulti.Case255`	`[ 1, 1]`	128	1

Bugs

All of the above examples for enum layout use a single byte for each component. This works well on both big- and little-endian systems and in fact the memory layouts are identical. The problems we have seen arise when the payload is larger than 1 byte.

For example, following the rules for single-payload enums the optional tuple type Optional<(Bool, UInt8)> should be laid out in storage like this:

Value	Memory Layout
`Optional<(Bool,UInt8)>.some((false, 0))`	`[0, 0]`
`Optional<(Bool,UInt8)>.some((false, 1))`	`[0, 1]`
`Optional<(Bool,UInt8)>.some((true, 0))`	`[1, 0]`
`Optional<(Bool,UInt8)>.some((true, 0))`	`[1, 1]`
`Optional<(Bool,UInt8)>.some((false, 2))`	`[0, 2]`
`Optional<(Bool,UInt8)>.none`	`[2, 0]`

The empty case (none) uses an extra inhabitant in the Bool part of the tuple to represent itself.

Currently on big-endian systems, however, Optional<(Bool,UInt8)>.none gets (incorrectly) laid out like this:

Value	Memory Layout
`Optional<(Bool,UInt8)>.none`	`[0, 2]`

In fact there is a collision here and both print(Optional<(Bool,UInt8)>.some(false, 2)) and print(Optional<(Bool,UInt8)>.none) print Optional((false, 2)).

To understand the reasons why this is issue occurs we need to look at how spare bits are tracked in IRGen. Spare bit masks are used in multi-payload enums directly and, for single-payload enums, indirectly via the code that generates extra inhabitant information. These spare bit masks are tracked using the ClusteredBitVector type.

The ClusteredBitVector type is essentially an arbitrarily long mask. The Clustered part of the type name refers to how it is implemented and we will ignore that here. The bits in the mask are ordered from 0 to N-1, where N is the length of the mask. Unfortunately the way in which the order of the bits in the mask correspond to bytes in storage is currently poorly defined on big-endian platforms. On little-endian platforms the mapping from the bit mask to bytes in storage is unambiguous. Bit 0 is the least significant bit of the first byte in storage, bit 8 is the least significant bit of the second byte in storage, and so on.

For example, the tuple contained in Optional<(Bool,UInt8)> will have the spare bits mask 0xe0. When interpreted as a 16-bit integer it will be laid out as [0, 14] on little-endian systems and [14, 0] on big-endian systems. This is essentially the cause of the Optional<(Bool,UInt8)> layout issues. We are assuming that the most-significant bit in the spare bits mask is in the last byte of the payload in memory, but by treating the spare bits mask as an integer value on big-endian systems this assumption no longer holds.

Note that this problem doesn't exist when the spare bits in the payload type are in word-sized integer values (e.g. pointers). This is because spare bit mask is used in chunks up to a word in size and so the spare bit positions are correct. This is also why this issue isn't currently completely catastrophic - the spare bits in most enum payloads are in pointers or other non-composite types.

carlos42421 · March 18, 2019, 1:53pm

Can I just say thank you as an innocent bystander, for the first good explanation I've seen of how enums work under the hood (maybe because I'm not very good at finding documentation!)

Aside from the bugs you've raised, I was wondering if someone from the core team could possibly confirm if your explanation of how enums work is accurate and, if so, maybe this could become the basis of a quick document in the docs directory, something like EnumImplementationDetails.md?

Joe_Groff · March 18, 2019, 3:41pm

Yeah, this was always the intent, that the compiler representations be strictly little-endian-bit-order, though we at Apple didn't have time or immediate need to handle the big-endian cases, alas. There is some cleanup in this area that I think would be good to do. There's no good reason to use both ClusteredBitVector and APInt together; it'd be good to standardize on APInt, which is the more fleshed-out and capable interface from LLVM. There are also some slightly higher-level abstractions you can work with that should hopefully make the task easier; EnumPayload in particular is intended to be the primary interface for working with enum payloads as LLVM values. Its interface provides mechanisms for initializing constant values from an APInt, as well as application of constant APInt masks and so on. The implementation handles "chunking" the abstract little-endian bit vector into word-sized components that LLVM can handle. Hopefully, most of your job dealing with endianness can be concentrated in the EnumPayload interface, doing the necessary transformation during chunking and unchunking into LLVM values. EnumPayload itself also incompletely adopted, though, and it would be good cleanup to make it used more pervasively, which would hopefully also reduce the amount of endian-specific code you all have to deal with.

Joe_Groff · September 3, 2019, 6:10pm

By the way, would we be able to take your writeup of how enum layout works and put it in our ABI docs? It's well-written, and could be very useful to other compiler developers.

mundaym · September 5, 2019, 7:07am

Of course, no problem at all.