Usability of pointers in Swift

I'd think that the presence of the UnsafePointer would be a good signal of doing something risky.

Anyway, I was just trying to contribute some idea based on your love of the buffer pointers rather than just complaining about the API.

There are definitely a lot of people who feel that way. I agree that it should always be possible to at least explain its use when present. If one is going to do that, it's often a good idea to put the explanation directly in a fatalError message (in which case you no longer have a !).

Actually @lukasa, I have a better idea than my earlier attempt.

If you had pointers coming from C as buffer pointers with a count of 1, then you could avoid the crash you mentioned (countTheZeroes). As a collection, it would just iterate over the one thing, rather than crash. If the pointer coming from C in fact refers to an array, the programmer could explicitly set the count based on their knowledge of the particular C API.

That eliminates the annoying distinction between UnsafePointer and UnsafeBufferPointer.

What do you think of that? Appreciate you indulging my curiosity.

If you had to pick one, I would definitely agree that the "buffer" variations that carry a size, with 1 as the size for referencing a single scalar, would be the more useful one. To me, it's not so much about safety; it's that doing anything with a pointer to a buffer requires that somebody ultimately knows how big it is. So it's useful to have a type that lumps the pointer to the beginning of a buffer and its size together, even if you're living dangerously.
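To make the count-of-1 idea concrete, here is a minimal sketch using only today's standard library. The helper `countZeroes(in:)` is hypothetical, loosely modeled on the countTheZeroes function mentioned above (its original body isn't shown in this part of the thread), and the explicit `count: 1` stands in for what the proposal would make implicit.

```swift
// Hypothetical helper: counts the zero elements in a buffer.
// Because it takes a buffer, it can never walk past the stated count.
func countZeroes(in buffer: UnsafeBufferPointer<Int32>) -> Int {
    var zeroes = 0
    for value in buffer where value == 0 {
        zeroes += 1
    }
    return zeroes
}

var scalar: Int32 = 0
let zeroes = withUnsafePointer(to: &scalar) { pointer in
    // Wrap the single-value pointer as a one-element buffer.
    // Under the proposal this count of 1 would be the default.
    countZeroes(in: UnsafeBufferPointer(start: pointer, count: 1))
}
```

As a collection, the buffer iterates over exactly the one element and then stops, which is the behavior the proposal is after.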

Ok, so then proceeding from there, I would add the Raw API to my buffer pointer. You can create an UnsafeRawPointer from an UnsafePointer<T> easily anyway, so what's the harm? UnsafeBufferPointer<Int8> would then replace UnsafeRawBufferPointer.

Then you'd be down to just UnsafeBufferPointer and UnsafeMutableBufferPointer (though personally I would rename them for brevity).

Where does that break down?

What if a C API vends void*? Asserting that it has 1 byte doesn't seem that useful.

Now that we have a Never type, maybe you could model void* as Pointer<Never> to indicate you can't dereference it without casting.

While I do like this suggestion, why not go with Pointer<Void>, since that's literally what it is anyway? That would probably make the most sense to anyone new to the language learning about C interop.

Void in Swift is a real type that you can have a value of, unlike void in C.

How would you create a pointer to an empty tuple, if Pointer<Void> is an equivalent of void* ?

Pointer<Never> sounds cool.

So to update my proposal:

  • Primary pointer types are UnsafePointer<T> and UnsafeMutablePointer<T>.
  • Both pointer types have a count and behave like BufferPointer. Pointers coming from C have a count of 1.
  • Both pointer types include the Raw API.
  • void* coming from C is UnsafePointer<Never>

AFAICT, this preserves the level of safety, reduces various pointer conversions (which feel unnecessary) and reduces the confusion of which withUnsafe function to choose.

Obviously I'm no Swift expert. Happy to get some feedback.

I'm quite afraid of an implicit count of one. Many interop pointers will end up in this bucket. Bounds checking would then prevent things like pointer[2] even when that memory is valid. And without bounds checking, there'd be no point in having a count.

To have a count at all, we'd need a way to reassign it. That more or less puts us back where we started, but now the subtle distinction between a specified and an unspecified count doesn't appear in the type information.

Also, it could point to an array of zero elements, which is quite a scary scenario in which to apply this rule.

But in the current design, if you returned a pointer to an array of zero elements, you could get undefined behavior just the same by getting the pointee, no?

Perhaps? You dropped the bounds check voluntarily there.

Your model does pretend to do bounds checking, and I suspect users will believe that it does.

Well, if a C function returns a non-null pointer that can't be dereferenced without undefined behavior, isn't that a bug in the C function? Can't we safely assume that any non-null pointer returned from C can be dereferenced?

Slicing an array in the middle can do just that. Even malloc can return a non-null pointer that can't be dereferenced (for a zero-size request, for example).

I mean, not really. Any attempt to define reasonableness in programming tends to fall apart in the face of the reality of programming. For example, consider this hypothetical C data structure library:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t count;
    size_t capacity;
    char bytes[];
} MyTailAllocArray;

MyTailAllocArray *MakeArray(size_t size) {
    MyTailAllocArray *array = malloc(sizeof(MyTailAllocArray) + size);
    if (array == NULL) { return NULL; }

    array->count = 0;
    array->capacity = size;
    memset(array->bytes, 0, size);
    return array;
}

/* Passes out an interior pointer through `elements` and returns the count. */
size_t ArrayElements(MyTailAllocArray *array, char **elements) {
    if (array == NULL) { return 0; }

    *elements = array->bytes;
    return array->count;
}

This defines a simple tail-allocated data structure. Here it's an array of char but it could be nearly anything, it's just that array of char is simple. Imagine here that MyTailAllocArray is an opaque type (that is, the structure definition is not in the public headers), to allow the library to evolve the type going forward.

We then have a simple function that passes out an interior pointer to the elements. The question is, is it safe to assume that if the elements pointer is non-null it's safe to dereference? The answer is no. There are two cases where it is not safe to dereference: the first is if you passed a null pointer for array, the second is if array->capacity is 0 (that is, it was originally allocated as a zero-sized array). In that case, the pointer is pointing off the "back" of the data structure, and is therefore out of bounds and is not safe to dereference.

Now, it isn't unreasonable to say that this C library is not defensive enough against this kind of situation, but it's an unfortunate reality that C code tends not to be. It's really common to find functions that require you to check the returned length before you go accessing a pointer.
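On the Swift side, the "check the returned length first" pattern can be sketched roughly as below. This is an assumption-laden illustration, not a binding to the C library above: `firstByte(of:count:)` is a hypothetical helper, and the manually allocated storage merely stands in for whatever pointer and length a function like ArrayElements would hand back.

```swift
// Hypothetical helper: only touches memory after the length has been checked.
func firstByte(of elements: UnsafePointer<CChar>?, count: Int) -> CChar? {
    // A non-null pointer alone proves nothing; the count gates the access.
    guard let elements = elements, count > 0 else { return nil }
    return UnsafeBufferPointer(start: elements, count: count).first
}

// Stand-in for memory owned by a C library (assumption for this sketch).
let storage = UnsafeMutablePointer<CChar>.allocate(capacity: 4)
storage.initialize(repeating: 7, count: 4)
let readOnly = UnsafePointer<CChar>(storage)

// Same non-null pointer, two different reported lengths.
let withZeroCount = firstByte(of: readOnly, count: 0)
let withFourCount = firstByte(of: readOnly, count: 4)

storage.deinitialize(count: 4)
storage.deallocate()
```

With a reported length of 0 the helper never reads through the pointer, even though the pointer itself is non-null.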

Right ok, sure. Bit of a gross C api, but fair enough.

However, for ArrayElements, UnsafePointer won't do any better in terms of safety than a buffer pointer with length one.

That's true, but here's a mind bender: why not a buffer pointer with length zero as the default? This actually definitionally does follow the collection semantics, in that it doesn't promise any valid elements. As @Lantua notes, the biggest downside of any of these ideas is that they awkwardly overload subscripting and make that harder. But if we had to make all pointers buffer pointers with default sizes, it seems like the right default size is 0.
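A quick sketch of what a zero default would mean in practice, using the existing UnsafeBufferPointer initializer with an explicit `count: 0` to stand in for the proposed default (the variable names are illustrative only):

```swift
var value: Int32 = 42
let visited = withUnsafePointer(to: &value) { pointer -> Int in
    // A valid pointer, but a buffer that promises no elements.
    let empty = UnsafeBufferPointer(start: pointer, count: 0)
    var elementsSeen = 0
    for _ in empty {
        elementsSeen += 1
    }
    return elementsSeen
}
```

The loop body never runs, so nothing is read through the pointer until the programmer supplies a real count.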

Ok I agree about 0.

And yes that makes subscripting pointers coming from C a little more annoying:

ReturnsPointer().resize(n)[i] or something.

I still kinda like it. I think you'd have to look at various examples of how real world code would change.

What if we had UnsafePointer.set(count: Int) -> UnsafeBufferPointer? It would be more compatible with the status quo and achieve a similar effect, at least if we also deprecated UnsafePointer.pointee.
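For illustration, such a method could be sketched as a thin extension over the existing initializer. To be clear, `set(count:)` is the hypothetical API from the post above and is not part of the standard library; the array here just provides some valid memory to point at.

```swift
extension UnsafePointer {
    // Hypothetical API: attach an explicit element count to a bare pointer.
    func set(count: Int) -> UnsafeBufferPointer<Pointee> {
        UnsafeBufferPointer(start: self, count: count)
    }
}

let numbers: [Int32] = [3, 0, 5]
let total = numbers.withUnsafeBufferPointer { buffer -> Int32 in
    guard let base = buffer.baseAddress else { return 0 }
    // Pretend `base` came from C without a length; the caller supplies it.
    let view = base.set(count: buffer.count)
    return view.reduce(0, +)
}
```

The caller still has to know the right count, but the knowledge is stated once, at the point where the pointer becomes a collection.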

PS
While checking the unsafe apis, I checked this

turns out it has them, both in-place and out-of-place versions, even both minus variations: pointer distance, and shift back.
