Usability of pointers in Swift

audulus · March 31, 2020, 3:33pm

You're presenting this as something that has to be a particular way, but I think this is a subjective matter of design. For example, as we touched on above, C doesn't make a distinction: int* could refer to a single int or an array of ints. In general the compiler doesn't know.

It seems way more friendly than the current state of affairs, to me. You'd get cleaner, easier-to-read code, I would think. The price would be some runtime checks.

As to lifetime analysis, I don't know why it would be really different. Maybe you could explain more?

Avi · March 31, 2020, 3:41pm

It seems like your complaints boil down to "it's not how C does it". Not being like C with respect to memory management is one of Swift's design goals. You're not going to get much traction arguing in this vein.

audulus · March 31, 2020, 3:42pm

I think that's pretty flippant and dismissive of some of the details I've been expressing, but hey if you want to be like that.

audulus · March 31, 2020, 4:07pm

AFAIK, in C++ there's no attempt to prevent the aliasing issue mentioned by @Joe_Groff above, and there isn't a bounds-checked pointer type like UnsafeBufferPointer. I think you could build both things if you wanted to. I think it's telling that nothing like that ever caught on in C++.

Also, in C/C++, it's way easier to convert between pointer types.

But anyway, it does seem like folks have some ideas for making this stuff more user-friendly, so I should probably just wait and see what they come up with.

Lantua · March 31, 2020, 4:14pm

Wouldn't the strict aliasing rule be their attempt?

Joe_Groff · March 31, 2020, 4:17pm

Handling aliasing problems is an ongoing research problem in C and C++, and there are still-unsolved semantic problems with the memory model. C++17 adds magic functions to try to paper over some classes of aliasing problems. That you haven't run into them personally is some combination of luck and the ongoing negotiation between the standards bodies, compiler implementers, and real codebases trying to keep the whole mess working. And although standard C++ has not historically included a "buffer pointer" type, nearly every codebase I've worked with has had one of its own, and C++17 and C++20 finally standardize this by adding string_view and span, respectively. Swift's API definitely needs improvement, but the memory model was designed by folks deeply familiar with C and C++'s design, and I think it does a decent job of avoiding many of the fundamental problems you end up with if you stare too deeply into C's model. Hopefully the API will catch up with the model someday.

lukasa · March 31, 2020, 5:29pm

I'm making a statement about Swift. In Swift, the question of whether type T conforms to Collection is a static one: it either does or does not.

That's fair enough, and I think we just have a difference of opinion here.

What I'm getting at is that, in your world where all pointers are collections, if I write code like this:

func countTheZeroes(_ ptr: UnsafePointer<UInt8>) -> Int {
    return ptr.lazy.filter { $0 == 0 }.count
}

I don't know if this code will crash or not (for reasons other than SIGSEGV). The only way to know is to follow every pointer in the program that is ever passed into this code and find out where it came from. This is what static analysers do in other languages.

However, if the parameter is an UnsafeBufferPointer<UInt8> instead, I am confident that this will not crash. Furthermore, if I ever get a SIGSEGV out of the code, I know that somewhere in my code I construct an UnsafeBufferPointer with an invalid length, and so I can audit only those call sites, rather than everywhere a pointer may have entered my program from C.
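For comparison, here is a minimal sketch of the buffer-pointer variant that compiles against today's standard library. The sample array `bytes` is made up for illustration, and the predicate counts zero bytes, as the function name suggests.

```swift
// Buffer-pointer variant: the element count travels with the pointer,
// so iteration stops at the declared length.
func countTheZeroes(_ buffer: UnsafeBufferPointer<UInt8>) -> Int {
    return buffer.lazy.filter { $0 == 0 }.count
}

// Illustrative data (an assumption, not from the thread).
let bytes: [UInt8] = [0, 7, 0, 42, 0]

// The array hands out a buffer pointer whose count matches its storage.
let zeroes = bytes.withUnsafeBufferPointer { countTheZeroes($0) }
```

The only place an incorrect length could be introduced is where an UnsafeBufferPointer is constructed by hand; here the array does it for us.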

audulus · March 31, 2020, 6:43pm

Thanks for the example. It strikes me that using ! on an optional can crash in a similar way, and would require the same sort of static analysis, but doesn't seem to cause the same sort of concerns. Is that true? Can you help me understand why that wouldn't be analogous?

Avi · March 31, 2020, 7:04pm

It does cause the same kind of concern. That's why you must write !, instead of it always being implicit. Also, dereferencing a nil optional always traps. With pointers, anything can happen, including nothing at all. That means pointers are far more unsafe than Optionals and their "unsafe" affordances.

lukasa · March 31, 2020, 7:09pm

As @avi says, it absolutely is analogous, which is why you must state “I know I am doing something risky”. The SSWG’s guidance (intended as an example of a policy, not necessarily as an endorsement) on ! is that either it should be replaced with a safe alternative that handles the risk of being nil, or it should be possible to describe in a code comment why the ! is either impossible to trigger or the crash is acceptable.

As to “it doesn’t share the same concerns”, you’ll find many people on this forum who consider the appearance of ! in a codebase to be entirely unacceptable in all circumstances.

audulus · March 31, 2020, 7:20pm

I'd think that the presence of the UnsafePointer would be a good signal of doing something risky.

Anyway, I was just trying to contribute some ideas based on your love of the buffer pointers rather than just complaining about the API.

anandabits · March 31, 2020, 8:41pm

There are definitely a lot of people who feel that way. I agree that it should always be possible to at least explain its use when present. If one is going to do that, it's often a good idea to put the explanation directly in a fatalError message (in which case you no longer have a !).
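A small sketch of that pattern, assuming a made-up lookup table `ports` and a hypothetical helper `registeredPort(for:)`; neither comes from the thread.

```swift
// Hypothetical lookup table, used only for illustration.
let ports: [String: Int] = ["http": 80, "https": 443]

func registeredPort(for scheme: String) -> Int {
    // Instead of `ports[scheme]!`, spell out why a missing entry is a
    // programmer error; the message shows up in the crash log.
    guard let value = ports[scheme] else {
        fatalError("No port registered for scheme '\(scheme)'; the table must cover every supported scheme")
    }
    return value
}

let httpsPort = registeredPort(for: "https")
```

The behaviour on nil is still a trap, but the reason is recorded where the failure happens rather than in a comment next to a !.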

audulus · March 31, 2020, 9:53pm

Actually @lukasa, I have a better idea than my earlier attempt.

If you had pointers coming from C as buffer pointers with a count of 1, then you could avoid the crash you mentioned (countTheZeroes). As a collection, it would just iterate over the one thing, rather than crash. If the pointer coming from C in fact refers to an array, the programmer could explicitly set the count based on their knowledge of the particular C API.

That eliminates the annoying distinction between UnsafePointer and UnsafeBufferPointer.

What do you think of that? Appreciate you indulging my curiosity.

Joe_Groff · March 31, 2020, 10:12pm

If you had to pick one, I would definitely agree that the "buffer" variations that carry a size, with 1 as the size for referencing a single scalar, would be the more useful one. To me, it's not so much about safety; it's that doing anything with a pointer to a buffer requires that somebody ultimately knows how big it is, so it's useful to have a type that lumps the pointer to the beginning of a buffer and its size together, even if you're living dangerously.
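The count-of-1 idea can be approximated by hand with the existing types. This is only a sketch: the variable `value` is illustrative, and nothing here reflects how pointers imported from C are actually bridged.

```swift
var value: UInt8 = 0

let zeroCount = withUnsafePointer(to: &value) { pointer -> Int in
    // Wrap the single-element pointer as a one-element buffer so it can be
    // used as a collection; iteration visits exactly one element.
    let buffer = UnsafeBufferPointer(start: pointer, count: 1)
    return buffer.lazy.filter { $0 == 0 }.count
}
```

If the pointer really referred to an array, the caller would pass the real element count instead of 1, based on what the C API documents.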

audulus · March 31, 2020, 11:45pm

Ok, so then proceeding from there, I would add the Raw API to my buffer pointer. You can create an UnsafeRawPointer from an UnsafePointer<T> easily anyway, so what's the harm? UnsafeBufferPointer<Int8> would then replace UnsafeRawBufferPointer.
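For reference, a sketch of how the typed-to-raw conversion looks with the current API. The array `numbers` is an assumption for illustration.

```swift
let numbers: [UInt32] = [1, 2, 3]

let (byteCount, firstValue) = numbers.withUnsafeBufferPointer { typed -> (Int, UInt32) in
    // A typed buffer converts to a raw, byte-oriented view with an initializer.
    let raw = UnsafeRawBufferPointer(typed)
    // The raw view counts bytes rather than elements, and values are read
    // by stating the type to load at a byte offset.
    let first = raw.load(fromByteOffset: 0, as: UInt32.self)
    return (raw.count, first)
}
```

Note that the raw view reports 12 bytes for 3 elements, so a merged type would need to decide which of the two counts it exposes.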

Then you'd be down to just UnsafeBufferPointer and UnsafeMutableBufferPointer (though personally I would rename them for brevity).

Where does that break down?

Lantua · March 31, 2020, 11:51pm

What if a C API vends void*? Asserting that it has 1 byte doesn't seem that useful.

Joe_Groff · March 31, 2020, 11:53pm

Now that we have a Never type, maybe you could model void* as Pointer<Never> to indicate you can't dereference it without casting.

Ponyboy47 · April 1, 2020, 2:46pm

While I do like this suggestion, why not go with Pointer<Void>, since that’s literally what it is anyway? That would probably make the most sense to anyone new to the language learning about C interop.

cukr · April 1, 2020, 3:09pm

Void in Swift is a real type that you can have a value of, unlike the void in C.

How would you create a pointer to an empty tuple if Pointer<Void> is the equivalent of void*?
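A short sketch of what "a real type" means here; the constant names are illustrative.

```swift
// Void is a typealias for the empty tuple, so it has exactly one value: ().
let unit: Void = ()

// The value occupies no storage.
let voidSize = MemoryLayout<Void>.size

// A typed pointer to that value can be formed and read through like any other.
let readBack: Void = withUnsafePointer(to: unit) { $0.pointee }
```

Never, by contrast, has no values at all, so there is nothing a pointer to Never could validly point to without a cast.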

audulus · April 1, 2020, 5:56pm

Pointer<Never> sounds cool.

So to update my proposal:

Primary pointer types are UnsafePointer<T> and UnsafeMutablePointer<T>.
Both pointer types have a count and behave like BufferPointer. Pointers coming from C have a count of 1.
Both pointer types include the Raw API.
void* coming from C is UnsafePointer<Never>

AFAICT, this preserves the level of safety, reduces various pointer conversions (which feel unnecessary) and reduces the confusion of which withUnsafe function to choose.
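To make the first two points concrete, here is a rough sketch of a counted pointer built on top of today's UnsafePointer. `CountedPointer` is a hypothetical name, not a standard library type, and the sketch leaves out the Raw API and the Never idea entirely.

```swift
// Hypothetical type for illustration only.
struct CountedPointer<Element>: RandomAccessCollection {
    let base: UnsafePointer<Element>
    let count: Int

    // Under the proposal, a pointer arriving from C would get a count of 1
    // unless the programmer supplies the real count.
    init(_ base: UnsafePointer<Element>, count: Int = 1) {
        precondition(count >= 0, "count must be non-negative")
        self.base = base
        self.count = count
    }

    var startIndex: Int { 0 }
    var endIndex: Int { count }

    subscript(position: Int) -> Element {
        // This is the runtime check mentioned earlier in the thread.
        precondition(position >= 0 && position < count, "index out of bounds")
        return base[position]
    }
}

let samples: [Int32] = [4, 5, 6]
let (single, total) = samples.withUnsafeBufferPointer { buffer -> (Int, Int32) in
    // The array is non-empty, so baseAddress is non-nil here.
    let base = buffer.baseAddress!
    let one = CountedPointer(base)                       // default count of 1
    let all = CountedPointer(base, count: buffer.count)  // caller-supplied count
    return (one.count, all.reduce(0, +))
}
```

The sketch only checks bounds against whatever count it is given; a wrong count supplied by the caller is exactly as unsafe as a wrong count passed to UnsafeBufferPointer.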

Obviously I'm no Swift expert. Happy to get some feedback.