[Pitch] Pointer bit width compile time conditional

This is also true on LP64 targets like iOS, macOS, and Linux. The only thing that really changes on LLP64 targets is that CLong is Int32.
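
As a quick illustration of the data models mentioned above, this minimal sketch prints the sizes involved on the current target. It uses only standard library types. The values in the comments are what the LP64 and LLP64 models imply, not captured output.

```swift
// Sizes in bytes of Int, CLong, and a raw pointer on the current target.
// Implied by LP64 (macOS, iOS, 64-bit Linux): 8, 8, 8.
// Implied by LLP64 (64-bit Windows): 8, 4, 8.
let intSize = MemoryLayout<Int>.size
let longSize = MemoryLayout<CLong>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size
print("Int: \(intSize), CLong: \(longSize), pointer: \(pointerSize)")
```

Only CLong differs between the two models. Int and the pointer types stay the same width.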


Right. :man_facepalming:

My confusion about this is a manifestation of why I think #if LP64 and friends aren’t a good substitute for #if intBitWidth(). :slight_smile:

Int is meant to be intptr_t-width, so pointerSize or pointerBitWidth seems better to me.

I understand this logic, but unless you’ve used intptr_t and/or know about Int’s relationship to it, are you really going to think pointerBitWidth answers the question of “how wide is Int?”

Also, I believe there are historical architectures with different sizes for different types of pointers, particularly function pointers. Are there any contemporary architectures like this which Swift might be ported to?


"How wide is Int?" is a very proximal question. What are you doing with that information? Can it be expressed some other way?

…Maybe we do need to see some more realistic use cases to name this thing. I could go search where it's used in the standard library, but I'm not sure the standard library is very representative of normal Swift code, even the subset of that code that cares about architectures.


My use case is a SIMD vector of words, with a specific bit width (128, 256, or 512).

I can use the GNU vector_size attribute:

#include <stdint.h>
#if __has_attribute(__vector_size__)
typedef uintptr_t UInt128Vector __attribute__((__vector_size__(16)));
typedef uintptr_t UInt256Vector __attribute__((__vector_size__(32)));
typedef uintptr_t UInt512Vector __attribute__((__vector_size__(64)));
#endif

The generated Swift interface (for a 64-bit arch) is:

import Darwin.C.stdint
public typealias UInt128Vector = SIMD2<UInt>
public typealias UInt256Vector = SIMD4<UInt>
public typealias UInt512Vector = SIMD8<UInt>
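
One way to keep such aliases at a fixed total width on both 32-bit and 64-bit targets today is to switch the lane count on architecture. This is only a sketch: the alias names reuse the ones above, and the list of 32-bit architectures is an assumption that has to be maintained by hand. It is the kind of list the pitched conditional would replace.

```swift
// Lane counts are chosen so each vector has the stated total width.
#if arch(i386) || arch(arm) || arch(arm64_32) || arch(wasm32)
// Assumed 32-bit targets: UInt is 4 bytes wide.
typealias UInt128Vector = SIMD4<UInt>
typealias UInt256Vector = SIMD8<UInt>
typealias UInt512Vector = SIMD16<UInt>
#else
// Assumed 64-bit targets: UInt is 8 bytes wide.
typealias UInt128Vector = SIMD2<UInt>
typealias UInt256Vector = SIMD4<UInt>
typealias UInt512Vector = SIMD8<UInt>
#endif

// Total width in bits of each alias on the current target.
let vectorBitWidths = [
    MemoryLayout<UInt128Vector>.size * 8,
    MemoryLayout<UInt256Vector>.size * 8,
    MemoryLayout<UInt512Vector>.size * 8,
]
print(vectorBitWidths)
```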

One example in the stdlib would be in String. The number of bytes available for small string storage directly correlates with the pointer size on the given platform.

On 32-bit platforms we have the Variant enum, which contains one of: a pointer to an immortal string, represented as UInt (which also shows that pointer size and Int size should be the same); a reference to a native Swift object; or a bridged _CocoaString. In addition to that, there is the count as an Int, a UInt8 discriminator, and UInt16 flags. That gives us an aligned size of 12 bytes, of which 10 are usable for small string storage.

On 64-bit platforms we instead pack the count and flags into a single UInt64 and store the pointer to the storage in a Builtin.BridgeObject. Both are 64 bits wide, so total storage is 16 bytes, of which 15 are usable for small string storage.

In _StringGuts we have an invariant check that verifies the size:

I think this is a good example of how the proposed conditional would make this code simpler and more robust and also shows the correlation between pointer and integer size.
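
To make the shape of such a size check concrete, here is a rough sketch. It is not the standard library's source: the architecture list and the name expectedStringSize are assumptions for illustration, and the commented-out line uses a hypothetical spelling of the conditional under discussion.

```swift
// Expected size of String in bytes, per the layouts described above.
#if arch(i386) || arch(arm) || arch(arm64_32) || arch(wasm32)
let expectedStringSize = 12   // 32-bit: 10 bytes usable for small strings
#else
let expectedStringSize = 16   // 64-bit: 15 bytes usable for small strings
#endif

// With the pitched conditional, the list above might shrink to something like
//   #if pointerBitWidth(32)   // hypothetical spelling, not a shipping feature
// so that new architectures would not require touching this code.
precondition(MemoryLayout<String>.size == expectedStringSize,
             "String layout does not match the expected size")
```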


Yeah, on the one hand that’s a pretty dang good example and on the other we’d probably pick a new dedicated representation for CHERI, heh.


My two cents (and mere speculation as we can’t experiment with CHERI on Swift yet) would be that we would want “pointer bit size” to mean “virtual address/offset bit size”. AFAIK, the only valid integer-to-capability operation in CHERI is offsetting an existing capability, be it a capability representing an array or the DDC capability. The latter would enable legacy uses of casting pointers to integers and back, with DDC pointing to a region of memory intended for “unsafe”/“legacy” uses.

It’s a good point to think about though. We shouldn’t introduce new keywords referring to pointers without at least considering the possible future direction of supporting capabilities. :slight_smile:


I think allowing this (with whatever spelling) would be a significant improvement for maintainers of low-level Swift libraries.

The thing I think we specifically need to be able to check for is the bit width of the standard Int type, and the conditional's name ought to reflect this.

I'd personally prefer to call this intBitWidth, but I'd be happy to accept whatever spelling as long as the feature ships. :stuck_out_tongue_winking_eye:

For what it's worth, our gybbed code is parameterized over ptr_size (either 4 or 8), and the stdlib utility module includes this gem:

# Number of bits in the Builtin.Word type
word_bits = int(CMAKE_SIZEOF_VOID_P) * 8

So we currently call the thing this conditional would test "pointer size", "word size" and "Int size", depending on the background/mood of the code author, sometimes mixing multiple names within the same line of code. It's been fine so far.

FWIW, in its public API and implementation, the Swift Stdlib currently assumes that Int, UInt, Optional<Unsafe[Mutable][Raw]Pointer>, Optional<AnyObject> and Optional<Unmanaged> (as well as their non-optional variants) all have the same size (or at least that conversions between them are lossless). Deviating from this would require quite a large amount of (probably source-breaking) work; I don't think having to review usages of a new compile-time conditional would make this work meaningfully more (or less) difficult. (After all, the conditional would simply replace the horrid list of architecture tests that we are currently forced to live with.)
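
The size assumption described above can be spot-checked on any current target with a few MemoryLayout queries. This is a sketch rather than a stdlib test. Box is a hypothetical class that exists only so Unmanaged can be instantiated, and the selection of types is a sample, not the full list.

```swift
final class Box {}  // hypothetical class, used only to instantiate Unmanaged

let wordSize = MemoryLayout<Int>.size
let sizes = [
    MemoryLayout<UInt>.size,
    MemoryLayout<UnsafeRawPointer>.size,
    MemoryLayout<UnsafeRawPointer?>.size,
    MemoryLayout<UnsafeMutablePointer<UInt8>?>.size,
    MemoryLayout<AnyObject?>.size,
    MemoryLayout<Unmanaged<Box>?>.size,
]
// True when every sampled type is exactly as wide as Int.
let allWordSized = sizes.allSatisfy { $0 == wordSize }

// A pointer converted to Int and back compares equal to the original.
let buffer = UnsafeMutableRawPointer.allocate(byteCount: 8, alignment: 8)
let roundTripped = UnsafeMutableRawPointer(bitPattern: Int(bitPattern: buffer))
let lossless = (roundTripped == buffer)
buffer.deallocate()
print(allWordSized, lossless)
```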


This is very true -- it may well be that very little code outside the stdlib would need to use this conditional. However, I'd argue that the stdlib's use case is still strong enough to be worth having this (even if underscored, like _endian, _runtime or _ptrauth). Choosing between alternative type definitions based on the size of Int (like the stdlib does for String) is rarely necessary, but it does legitimately come up from time to time.

We could also resolve the maintenance problems by defining a custom build-time conditional like INT_BITWIDTH_64, but each project would need to do that manually, leading to the same maintenance issues. Having to ship a new release of a package only to add a new architecture to an #if incantation seems silly.

Sorry for dropping the ball here. Aside from the different opinions on naming, it seems that most people agree this would be a useful addition.

@jrose I would love to hear more about the CHERI case and what we can do to make this feature work for it. If I understand correctly the relevant difference is that pointers are generally bigger than integers, because of the additional information encoded in them. Would adding a separate conditional for integer bit width be sufficient?


That is my impression from their C/C++ programming guide (here). Of course it's a couple of years old and it's not obvious every implementation decision has been nailed down, for C or hardware.


Laurence Tratt's "Making Rust a Better Fit for CHERI and Other Platforms" seems a reasonable summary given what (little) I know, in addition to the doc Guillaume linked. A CHERI-Swift (not in "hybrid mode") would conceivably want to have 64-bit Int and 128-bit stored pointers, which would (likely) mean similar changes to what Rust is doing.

What might those changes look like?
  • Deprecating Pointer.init(bitPattern: Int), and defining Pointer.bitPattern to only return the address part of the pointer. Or deprecating bitPattern too and replacing it with addressBitPattern or something. (Tratt talks about extensions that may be needed to access the "capability" part of the pointer as bits, but the Rust folks haven't added them yet, so maybe we could defer that.)

  • Adding an API to make a pointer using the "capabilities" of an existing pointer a but the address b. In the Rust experiment this is a.with_addr(b).

  • (optional) Adding convenience APIs for tagged pointer things, so that people don't have to manually write p.withAddress(p.bitPattern | 1)

  • Adding an initializer for "an invalid pointer with a given address bit pattern", so that people can still have their pointer / integer unions.

  • (optional) Making a standard pointer/int union type, so people stop implementing that on top of Int (now too small) instead of one of the Pointer types.

  • Importing intptr_t and uintptr_t as something other than Int on CHERI. This kind of stinks because intptr_t and uintptr_t were up until now the most consistent representations of Int and UInt in C. (ptrdiff_t is an okay Int, but size_t is also treated as Int by Swift because the Int/UInt conversion is more painful in Swift, so there's no great UInt. long and unsigned long are pretty good everywhere but Windows.)

  • Changing the runtime data structures to be more careful about what's an integer and what's a pointer.

  • Accepting that not every package will automatically work right away, or possibly at all. That's fine. Really. It's a new platform.
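
To give a feel for the address-only APIs in the list above, here is a sketch of how they could be approximated on today's platforms. Both addressBitPattern and withAddress are hypothetical names taken from the discussion, not existing standard library API. On a capability platform the real implementation would presumably be a compiler or runtime primitive, not pointer arithmetic.

```swift
extension UnsafeRawPointer {
    /// Hypothetical: only the address part of the pointer, as an integer.
    var addressBitPattern: UInt { UInt(bitPattern: self) }

    /// Hypothetical analogue of Rust's `with_addr`: keep whatever this pointer
    /// carries besides its address, but point at `address` instead.
    func withAddress(_ address: UInt) -> UnsafeRawPointer {
        // Derive the result by offsetting the existing pointer rather than by
        // constructing a pointer from a bare integer.
        let delta = Int(bitPattern: address &- addressBitPattern)
        return self.advanced(by: delta)
    }
}

// Usage: set and clear a tag in the low bit of an 8-byte-aligned pointer.
let storage = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
let p = UnsafeRawPointer(storage)
let tagged = p.withAddress(p.addressBitPattern | 1)
let untagged = tagged.withAddress(tagged.addressBitPattern & ~1)
print(tagged.addressBitPattern & 1, untagged == p)
storage.deallocate()
```

The tagged pointer is never dereferenced here. It exists only to carry the extra bit.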

I don't think this is actually so bad! If we really wanted to make it a priority, we could. That said, I don't think we do need to make it a priority at this time. Rust (along with C, of course) is used for many experimental platforms; Swift is not.

I think ultimately this comes out to saying that if the conditional is called intBitWidth or similar, it will be 64 on CHERI, whereas if it's called pointerBitWidth, it will be 128 on CHERI. (Or 129. It isn't clear to me whether the validity bit is stored in memory.) I can treat this as a mild argument for the former, because any package testing for intBitWidth(64) will then compile for CHERI and may or may not behave as expected, while a package testing for pointerBitWidth(64) will not compile for CHERI whether or not it would behave as expected.


This sounds like adding both - intBitWidth and pointerBitWidth - would be the reasonable thing to do. In general I prefer code that does not compile over code that misbehaves on platforms that are not explicitly supported.
