A Wednesday puzzler

Though I should say that even including the two preconditions it's not worth worrying about. Both will likely be branch-predicted away to nothingness, and again the difference in operations is trivial alongside the cost of doing the copy.

In C, there is no syntactic difference between a pointer to a single (non-array) value and to the first element in an array of values, and we're discussing behavior that needs to handle C interop. So, let's break things down into cases.

  1. If an unsafe buffer pointer is derived from a value in a Swift type, there's no problem, because we know whether it's a collection or not. That tells us how to set the buffer pointer's associated count.

  2. If an unsafe buffer pointer is derived from a C pointer that's nil, the count is 0.

  3. If an unsafe buffer pointer is derived from a non-nil C pointer that was accompanied by an element or byte count, the buffer pointer count can be set appropriately, in explicit code written at the point of interop.

  4. If a non-nil C pointer arrives without an explicit count, we have to make a somewhat arbitrary ruling. In C, nothing prevents you from writing code that dereferences a non-nil pointer. In fact, it's common in C to pass around a non-nil pointer with no associated count (in other words, a non-nil pointer to a non-array value), so you usually have to dereference pointers blindly.
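To make cases 2 and 3 concrete, here is a minimal sketch. `makeBuffer` is a hypothetical helper written for illustration, not a standard library API, and the array stands in for memory that would really arrive from C. Case 4 is deliberately left out, since that is the one still under debate.

```swift
// Hypothetical helper: wraps a C-style (pointer, count) pair in a buffer pointer.
func makeBuffer(_ pointer: UnsafePointer<Int32>?, count: Int) -> UnsafeBufferPointer<Int32> {
    // Case 2: a nil pointer always yields an empty buffer.
    guard let pointer = pointer else {
        return UnsafeBufferPointer(start: nil, count: 0)
    }
    // Case 3: the count arrived alongside the pointer, so we attach it explicitly.
    return UnsafeBufferPointer(start: pointer, count: count)
}

// Stand-in for memory handed over by C together with its element count.
let values: [Int32] = [10, 20, 30]
let sum = values.withUnsafeBufferPointer { source -> Int32 in
    let buffer = makeBuffer(source.baseAddress, count: source.count)
    return buffer.reduce(0, +)
}

// A nil pointer produces a buffer that is safe to iterate: it is simply empty.
let emptyCount = makeBuffer(nil, count: 0).count
```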

It's case #4 that we're suggesting different Swift solutions for. Yes, it would be safe to set the count to 0, but that would make it more awkward to actually use the value, since the collection would appear to be empty.

It would be as safe as C to set the count to 1 (in case #4), and I'm suggesting this would actually be safe enough for Swift (in case #4), because the purpose of a non-nil pointer is almost certainly to be dereferenced to get a value. (Even if the count defaulted to 0 as you suggest, some code would end up trying to dereference the pointer anyway.)

There would be no baseAddress. That's the point. Every function in the Swift universe that currently needs an unsafe pointer would take an unsafe buffer pointer instead. A huge chunk of API surface that's duplicative across pointer and buffer pointer types would simply disappear from common use.

There would of course be interop unsafe pointer types (things like CUnsafeRawPointer, perhaps) that the compiler would use at @convention(c) sites, just like the other C interop types, but these would not be used explicitly in Swift source code, except under conditions of extreme need.

It would be as safe as C to set the count to Int.max in the trivial case. The reason that's a bad idea is that C does not let you write a for loop directly on a pointer, whereas Swift does let you write one against a buffer pointer. That attractive nuisance is really bad, especially when you don't know the provenance of a buffer pointer.

In Swift today if you have a buffer pointer you have reason to believe a programmer thought about what the count was. Now, they may get it wrong, but at least you know that in principle some thought went into the idea of the count. In your proposal this ceases to be true. This markedly reduces the safety of iterating a buffer pointer.
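As a small illustration of why the count matters, this sketch iterates a buffer pointer whose count is known to be right because it comes from an array. The names are made up for the example. The loop trusts `count` completely, so a count invented by default rather than chosen by a programmer would be trusted just the same.

```swift
let samples: [UInt8] = [1, 2, 3, 4]
var total = 0

samples.withUnsafeBufferPointer { buffer in
    // The loop runs exactly buffer.count times. Nothing here checks that
    // count against the size of the underlying allocation.
    for byte in buffer {
        total += Int(byte)
    }
}
```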

You go on to say this:

This is wrong. It is extremely common in C to have non-null pointers that may not be dereferenced. Examples include any uninitialised pointer allocated on the stack, any pointer that has previously been passed to free, and in general any pointer that was intended to point to something when there was nothing for it to point to. I covered this here: Usability of pointers in Swift - #89 by lukasa.

Don't be distracted by dereferencing the pointer. I accept the danger of dereferencing a buffer pointer's base pointer. The danger is iterating the pointer.

In what way is this different than what you have today? Almost all pointers vended by Swift structures are buffer pointers, not simple pointers. Simple pointers are today vended almost entirely from C code.


Let me be clear. I think that using a buffer element count of 0 would be a pretty natural choice in these circumstances, and I tried to say as much earlier. It wouldn't upset me if we chose this approach.

However, I thought earlier, and still pretty much think now, that in actual use it would be approximately the same experience if the buffer count was 1 instead of 0, but might model what's going on more closely.

Probably one choice would turn out to be preferable over the other. I'm not sure we have to arbitrate that in the current discussion. I was presenting an idea (of which this question was one specific detail), not making a pitch.

Which brings this discussion full circle. The simple pointer in this thread's example is the destination of the copy, not the source, and it in fact is a simple pointer because it arrived over a C interop interface.

The consequence of that is that the copyMemory or fillMemory function, being a method of an Unsafe…Pointer type, has to be told the number of elements or bytes to move.

In what way would this be different from today? Eliminating the Unsafe…Pointer type in favor of an Unsafe…BufferPointer type would allow the "count" parameter to be omitted from the call site, because the counts of both the source and destination would be encapsulated in their respective buffer pointers, where their counts properly belong.
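Roughly, the call-site difference looks like this. The sketch uses the raw pointer and raw buffer pointer `copyMemory` methods that exist in the standard library today, with arrays standing in for the real source and destination memory.

```swift
let source: [UInt8] = [1, 2, 3, 4]
var viaPointer = [UInt8](repeating: 0, count: 4)
var viaBuffer = [UInt8](repeating: 0, count: 4)

// Simple destination pointer: the byte count has to be passed at the call site.
source.withUnsafeBytes { src in
    viaPointer.withUnsafeMutableBytes { dst in
        dst.baseAddress!.copyMemory(from: src.baseAddress!, byteCount: src.count)
    }
}

// Buffer pointers on both sides: each carries its own count, so none is passed.
source.withUnsafeBytes { src in
    viaBuffer.withUnsafeMutableBytes { dst in
        dst.copyMemory(from: src)
    }
}
```

The second form only helps if the destination buffer's count was right to begin with, which is the part still in dispute.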

I don’t see this as being as substantial as you appear to think it is. If the simple pointer was vended from C, the only case where you could not have produced a buffer pointer from it is the case where you don’t know the size of the buffer, and in that case the only automatic buffer sizes are zero or one, neither of which is going to lead to the behaviour you want (passing the correct count parameter).

The idea that having only buffer pointers magically makes this better only works in cases where Swift can tell what the size of the buffer is supposed to be. In all other cases, a programmer still has to write the code to say what the size of the buffer is, and so we’re back to status quo.

But regardless, you didn’t address my question. I’ll quote you again:

It seems to me that what you have said here is CUnsafeRawPointer will exist at @convention(c) call sites. This is 100% of C call sites, right? So far as I can see this amounts to typealias CUnsafeRawPointer = UnsafePointer and typealias UnsafePointer = UnsafeBufferPointer.

My point is that you have outlined the current state of affairs with this proposal.

You’ve outlined a concrete proposal elsewhere: there are no simple pointers, all pointers are buffer pointers. The way this would have to work is that Swift has an automatic translation: wherever a C API expects a pointer, Swift will just silently pass the base address of its buffer pointer type, and whenever it receives one it’ll choose a length for it.

This is pretty weird, IMO: we’re saying that all pointers are collections, with a default size of 0 or 1 depending on context, and then relying on programmers to override Swift’s default logic. I see no benefit to this. All buffers coming from C still require that the programmer provide the length: no lines of code are saved by this proposal, no thought can be removed.

What I’m missing is the ergonomic improvement here. Moving to and from buffer pointers is simply not difficult: a buffer pointer can become a simple pointer in one line of code, and vice-versa. What’s the motivation for removing this (IMO quite valuable) distinction?
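For reference, this is the kind of one-line conversion I mean, as a minimal sketch using today's APIs (the variable names are arbitrary):

```swift
let numbers: [Int] = [3, 1, 4]

let roundTripped = numbers.withUnsafeBufferPointer { buffer -> [Int] in
    // Buffer pointer -> simple pointer: one line. baseAddress can be nil
    // only when the buffer is empty.
    guard let pointer: UnsafePointer<Int> = buffer.baseAddress else { return [] }

    // Simple pointer -> buffer pointer: one line, with the count supplied
    // explicitly by the programmer.
    let rebuilt = UnsafeBufferPointer(start: pointer, count: buffer.count)
    return Array(rebuilt)
}
```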

AFAICT, the argument you're having is, "Let's make use of the buffer pointer concept to improve the handling of naked pointers in Swift."

I believe the argument I'm having is, "Let's remove the duplicative concept of naked pointers from Swift completely [*almost: see below]."

These seem to be different arguments, and I wouldn't expect the reasoning in my argument to make any sense in your argument. I'm not really ducking your objections, just trying to stay on the track I started along.

With that in mind, let me try to address what you said a bit more directly:

Apologies if I'm twisting your words, but I read that as saying: if we could decide what to use for the buffer count in the difficult case (your previous recommendation: use 0), then the naked pointer and the buffer pointer in Swift code are more or less interchangeable. The only choice you make is when to code the transition between the types, if you choose to do that at all.

That's basically my point too, except I'm going one step further to suggest that the language doesn't need more than one form of an isomorphism. We can eliminate one of the current two forms, and in so doing remove an entire type from the standard library [*almost], along with half of their parallel APIs and the confusion resulting from the inconsistencies in API naming and organization. That seems like a pretty big win to me.

re [*almost]:
Yes, renaming the type isn't intended to change it. It's intended to hide it. The pointer passed across a C interface takes 8 bytes (in most architectures we deal with), but Unsafe…BufferPointer takes more than 8 bytes because it needs to hold the count. We're forced to use an 8-byte type at the C interface, but that doesn't mean we should advertise it.
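The size difference is easy to check with `MemoryLayout`. This is only a quick sketch, and the exact numbers depend on the platform, so it prints them rather than assuming them:

```swift
let pointerSize = MemoryLayout<UnsafeRawPointer>.size
let bufferSize = MemoryLayout<UnsafeRawBufferPointer>.size

// The buffer pointer has to store a bound as well as an address,
// so it is larger than the bare pointer.
print("pointer: \(pointerSize) bytes, buffer pointer: \(bufferSize) bytes")
```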

This is almost exactly analogous to the difference between CInt and Int. We have a tiny piece of compiler magic that does the type conversion invisibly almost all of the time. For most Swift programmers, we don't expect or wish that CInt will escape into the rest of their code, nor do we expect them to write a sliver of code at each imported interface to implement the trivial conversion.

Cool, good to know.

That’s what I’m saying, but you’re missing one extra theme. I’m saying that they are more-or-less interchangeable, and that the buffer pointer has become less useful. It carries less meaning, less information.

Yup, this is the crux of the last argument too, and once again I’ll say that I don’t understand why we think types are bad.

I fundamentally disagree. In my eyes BufferPointer is a very clear evolution of a natural API. There must be a type for the word-width pointer because we need to pass it to some APIs, as you state. But it’s common in C to have a pointer to a buffer of things, and it would be nice for Swift to get a way to use such a pointer as an iterator. Hence: buffer pointer, a pointer-based iterator that we can use whenever we know the size of the buffer.

This seems to me to be a meaningful distinction worthy of the type system: we don’t just have a pointer here, we have a pointer, and we know the size of the memory it’s pointing to.

Perhaps the problem here is simply that some folks believe that having two types where one could do the job is bad. I understand this argument, but I simply don’t agree with it. I’m the kind of person who creates new enums for two-case arguments rather than using Bool. So if you wanted to actually convince me that removing buffer pointer is a good idea, you’ll have to make some case for why having two types is bad other than simply the fact that the second type exists.

This is not the crux of your argument, but for future reference: we do expect users to write that sliver of code. CInt does not map to Int on 64-bit platforms; it maps to Int32. We do in fact have to transform it at every call site. Let’s assume, however, that we actually did implement the compiler magic here, because we could.
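The sliver in question looks something like this. `cStyleLength` is a hypothetical stand-in for an imported C function returning `int`, not a real API.

```swift
// Hypothetical stand-in for an imported C function that returns `int`.
func cStyleLength() -> CInt {
    return 42
}

// The conversion to Int is written out explicitly at the call site.
let length: Int = Int(cStyleLength())

// CInt is a typealias for Int32, so it is 4 bytes wide.
let cIntSize = MemoryLayout<CInt>.size
```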
