Usability of pointers in Swift

Void in Swift is a real type that you can have a value of, unlike void in C.

How would you create a pointer to an empty tuple if Pointer<Void> is the equivalent of void*?
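To make the "Void is a real type" point concrete, here is a minimal sketch. The function names makeUnit and voidDemo are made up for illustration. It only shows that () is an ordinary value you can store and count; it does not touch pointers (C's void * is imported as a raw pointer type rather than as a pointer to Void).

```swift
// Void is a typealias for the empty tuple type (), so it has exactly one value: ().
func makeUnit() -> Void {
    return ()
}

// Hypothetical demo function: stores Void values like any other value.
func voidDemo() -> (size: Int, count: Int) {
    let unit: Void = makeUnit()
    let units: [Void] = [unit, (), ()]
    // A Void value occupies zero bytes, but the array still tracks three elements.
    return (MemoryLayout<Void>.size, units.count)
}
```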

Pointer<Never> sounds cool.

So to update my proposal:

  • Primary pointer types are UnsafePointer<T> and UnsafeMutablePointer<T>.
  • Both pointer types have a count and behave like BufferPointer. Pointers coming from C have a count of 1.
  • Both pointer types include the Raw API.
  • void* coming from C is UnsafePointer<Never>.

AFAICT, this preserves the level of safety, reduces various pointer conversions (which feel unnecessary) and reduces the confusion of which withUnsafe function to choose.
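For context on the "which withUnsafe function" confusion, here is a rough sketch of the status quo as I understand it: three different entry points that hand you three different pointer types. The function name statusQuoDemo is just for illustration.

```swift
// Hypothetical demo of three existing entry points and the pointer type each one yields.
func statusQuoDemo() -> (pointee: Int32, elementCount: Int, byteCount: Int) {
    var value: Int32 = 7
    let numbers: [Int32] = [1, 2, 3]

    // UnsafePointer<Int32>: a typed pointer with no count attached.
    let pointee = withUnsafePointer(to: &value) { pointer in pointer.pointee }

    // UnsafeBufferPointer<Int32>: a typed pointer plus an element count.
    let elementCount = numbers.withUnsafeBufferPointer { buffer in buffer.count }

    // UnsafeRawBufferPointer: untyped bytes plus a byte count.
    let byteCount = numbers.withUnsafeBytes { rawBuffer in rawBuffer.count }

    return (pointee, elementCount, byteCount)
}
```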

Obviously I'm no Swift expert. Happy to get some feedback.

I'm quite afraid of an implicit count of one. Many interop pointers will end up in this bucket. It would prevent things like pointer[2], even when that is valid memory, because of the bounds check. And without bounds checking, there'd be no point in having a count.
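A small sketch of that concern using today's types. The function name countOfOneDemo and the four-element allocation are made up for illustration. The plain pointer can reach index 2 because it has no count, while a buffer built with a count of 1 only advertises index 0 as valid.

```swift
// Hypothetical demo: the same memory seen through a bare pointer and a count-of-one buffer.
func countOfOneDemo() -> (viaPointer: Int, bufferCount: Int, validIndices: Range<Int>) {
    // Four valid Ints, standing in for memory handed over by C.
    let storage = UnsafeMutablePointer<Int>.allocate(capacity: 4)
    for offset in 0..<4 {
        (storage + offset).initialize(to: (offset + 1) * 10)
    }
    defer {
        storage.deinitialize(count: 4)
        storage.deallocate()
    }

    let pointer = UnsafePointer(storage)
    // No count, so no bounds check: this reads the third element (30).
    let viaPointer = pointer[2]

    // With a count of 1, only index 0 is in bounds, even though the
    // underlying memory holds four elements.
    let single = UnsafeBufferPointer(start: pointer, count: 1)
    return (viaPointer, single.count, single.indices)
}
```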

To have a count at all, we'd need a way to reassign it. That more or less puts us back where we started, but now the subtle distinction between a specified and an unspecified count doesn't appear in the type information.

Also, it could point to an array of zero elements, which is quite a scary scenario to apply this rule to.

But in the current design, if you returned a pointer to an array of zero elements, you could get undefined behavior just the same by getting the pointee, no?

Perhaps? But you dropped the bounds check voluntarily there.

Your model does pretend to do bounds checking, and I suspect users will believe that it does.

Well, if a C function returns a non-null pointer that can't be dereferenced without undefined behavior, isn't that a bug in the C function? Can't we safely assume that any non-null pointer returned from C can be dereferenced?

Slicing an array in the middle can do just that. Even malloc can return a non-null pointer for a zero-byte request, and that pointer must not be dereferenced.
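Here is a rough Swift illustration of a non-null pointer with nothing behind it. The function name emptyTailDemo is made up. It builds the empty range that starts one past the last element of an array: the address is non-null, but the count is zero, so reading through it would be out of bounds.

```swift
// Hypothetical demo: an empty range at the end of an array still has a non-null address.
func emptyTailDemo() -> (Int, Bool) {
    let values = [1, 2, 3, 4]
    return values.withUnsafeBufferPointer { buffer -> (Int, Bool) in
        // The array is non-empty, so baseAddress is non-nil here.
        let onePastEnd = buffer.baseAddress! + buffer.count
        // Zero elements, starting one past the last valid element.
        let tail = UnsafeBufferPointer(start: onePastEnd, count: 0)
        // We only inspect the count and the address; we never read through it.
        return (tail.count, tail.baseAddress != nil)
    }
}
```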

I mean, not really. Any attempt to define reasonableness in programming tends to fall apart in the face of the reality of programming. For example, consider this hypothetical C data structure library:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t count;
    size_t capacity;
    char bytes[];
} MyTailAllocArray;

MyTailAllocArray *MakeArray(size_t size) {
    MyTailAllocArray *array = malloc(sizeof(MyTailAllocArray) + size);
    if (array == NULL) { return NULL; }

    array->count = 0;
    array->capacity = size;
    memset(array->bytes, 0, size);
    return array;
}

size_t ArrayElements(MyTailAllocArray *array, char **elements) {
    if (array == NULL) { return 0; }

    /* Write the interior pointer through the out-parameter. */
    *elements = array->bytes;
    return array->count;
}

This defines a simple tail-allocated data structure. Here it's an array of char, but it could be nearly anything; an array of char is just simple. Imagine here that MyTailAllocArray is an opaque type (that is, the structure definition is not in the public headers), to allow the library to evolve the type going forward.

We then have a simple function that passes out an interior pointer to the elements. The question is, is it safe to assume that if the elements pointer is non-null it's safe to dereference? The answer is no. There are two cases where it is not safe to dereference: the first is if you passed a null pointer for array, the second is if array->capacity is 0 (that is, it was originally allocated as a zero-sized array). In that case, the pointer is pointing off the "back" of the data structure, and is therefore out of bounds and is not safe to dereference.

Now, it isn't unreasonable to say that this C library is not defensive enough against this kind of situation, but it's an unfortunate reality that C code tends not to be. It's really common to find functions that require you to check the returned length before you go accessing a pointer.
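On the Swift side, the "check the returned length first" pattern could look roughly like the sketch below. The helper name copyElements is hypothetical, and its two parameters stand in for the out-pointer and the return value of a function like ArrayElements; it does not call the C library itself.

```swift
// Hypothetical helper: only touches the pointer after checking the length.
func copyElements(from elements: UnsafePointer<CChar>?, count: Int) -> [CChar] {
    // A non-null pointer alone proves nothing; the count decides whether
    // any element may be read.
    guard let elements = elements, count > 0 else {
        return []
    }
    // Copy the elements out so the result does not depend on the C memory.
    return Array(UnsafeBufferPointer(start: elements, count: count))
}
```

With a count of 0 the pointer is never read, whatever its value.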

Right, OK, sure. Bit of a gross C API, but fair enough.

However, for ArrayElements, UnsafePointer won't do any better in terms of safety than a buffer pointer with length one.

That's true, but here's a mind bender: why not a buffer pointer with length zero as the default? This actually definitionally does follow the collection semantics, in that it doesn't promise any valid elements. As @Lantua notes, the biggest downside of any of these ideas is that they awkwardly overload subscripting and make that harder. But if we had to make all pointers buffer pointers with default sizes, it seems like the right default size is 0.
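A quick sketch of what a zero-count default would mean in collection terms, using today's UnsafeBufferPointer. The function name zeroCountDemo and the one-element allocation are illustrative assumptions. The pointer is non-null, but the buffer promises no elements, so nothing is ever read.

```swift
// Hypothetical demo: a non-null pointer wrapped with a count of zero.
func zeroCountDemo() -> (isEmpty: Bool, first: Int?, visited: Int) {
    // Stand-in for a pointer handed over by C.
    let storage = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    storage.initialize(to: 42)
    defer {
        storage.deinitialize(count: 1)
        storage.deallocate()
    }

    // Count of zero: the collection promises no valid elements.
    let fromC = UnsafeBufferPointer(start: UnsafePointer(storage), count: 0)

    var visited = 0
    for _ in fromC {
        visited += 1  // never runs, because there are no elements
    }
    return (fromC.isEmpty, fromC.first, visited)
}
```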

Ok I agree about 0.

And yes that makes subscripting pointers coming from C a little more annoying:

ReturnsPointer().resize(n)[i] or something.

I still kinda like it. I think you'd have to look at various examples of how real world code would change.

What if we had UnsafePointer.set(count: Int) -> UnsafeBufferPointer? It would be more compatible with the status quo and achieve a similar effect, at least if we also deprecated UnsafePointer.pointee.
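As a rough sketch, that idea can be prototyped today as an extension. To be clear, set(count:) is hypothetical and not part of the standard library; the body just forwards to the existing UnsafeBufferPointer(start:count:) initializer, and setCountDemo is a made-up caller.

```swift
extension UnsafePointer {
    /// Hypothetical API from this thread, not a standard library method:
    /// attaches an explicit element count to a bare pointer.
    func set(count: Int) -> UnsafeBufferPointer<Pointee> {
        return UnsafeBufferPointer(start: self, count: count)
    }
}

// Hypothetical caller: a bare pointer gets a count before it is subscripted.
func setCountDemo() -> Int {
    let values = [10, 20, 30, 40]
    return values.withUnsafeBufferPointer { buffer -> Int in
        // Stand-in for a bare pointer coming from C.
        let pointer = buffer.baseAddress!
        // The caller states how many elements it believes are valid.
        let sized = pointer.set(count: 3)
        return sized[2]  // 30
    }
}
```

The count is still only as trustworthy as the caller who supplies it, but at least it is stated at the call site.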

PS
While checking the unsafe APIs, I checked this

turns out it has them, both in-place and out-of-place versions, even both minus variations: pointer distance, and shift back.
