[Review] SE-0184: Unsafe[Mutable][Raw][Buffer]Pointer: add missing methods, adjust existing labels for clarity, and remove deallocation size

If I may suggest one more incremental improvement in clarity, it would be
to move `at[ByteOffset]` to be the first argument; this eliminates the
possible reading that we are offsetting the source buffer:

UnsafeRawBufferPointer.initializeMemory(atByteOffset: position *
MemoryLayout<T>.stride, as:T.self, from: bufferOfT)
···

On Sat, Sep 2, 2017 at 4:41 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 2, 2017, at 2:06 PM, Taylor Swift via swift-evolution < > swift-evolution@swift.org> wrote:

the subscript doesn’t know about the stride that you want to use. If you

want to get rid of the `at:` ambiguity, you have to get rid of it
everywhere, or users will just wind up having to remember two ways (one
ambiguous and one less ambiguous) of doing the same thing, instead of one
(ambiguous) way of doing it.

Certainly, that's a good point. On rethink and another re-reading of the
proposal, it's unclear to me that the addition of `at` arguments to
UnsafeMutablePointer is sufficiently justified by the proposal text. Is it
merely that it's shorter than writing `foo + MemoryLayout<T>.stride *
offset`? With the ambiguity of `at`, it seems that the current way of
writing it, though longer, is certainly less ambiguous.

Please reread it; UnsafeMutablePointer’s methods do *not* use `at:`.

Regarding the typed buffer pointers, I think it is clear by convention,
and will be well documented, that the `at` label refers to a position in
`self`.

The raw buffer pointer API isn’t so obvious. Since the `at` refers to
`self` it might more logically be a byte offset. Note that `at` as a label
name always refers to a strided index.

This would be a bit more explicit:
UnsafeRawBufferPointer.initializeMemory(as:T.self, atByteOffset: position
* MemoryLayout<T>.stride, from: bufferOfT)

But possibly less convenient… Since that `at` label currently on
UnsafeRawPointer.initializeMemory is almost never used, I don’t think we
need to worry too much about convenience.

That existing `at` label on UnsafeRawPointer.initializeMemory, would also
need to be renamed, which is fine.

Ok good. For growable buffers, we also want the OS to give us malloc_size which may be larger than requested capacity.

The semantics of buffer.deallocate() needs to be: free `buffer.count` bytes of memory at `buffer.baseAddress`. So, that will always be the fast path!
Kelvin, do you agree with that?

Any future safe API for manual buffer allocation/deallocation will also track the buffer size, so that will also be the fast path.

In the future, pointer.deallocate() without capacity might become a slower path. But I think that the UnsafePointer allocation/deallocation paths are Swift's equivalent of malloc/free. If the client code knows the buffer size, it shouldn't be using those. In fact, we don't want to force the client to track capacity if deallocation is *not* performance critical.

I suppose an optional `allocatedCapacity` argument would be ok, since we should be asserting in the runtime anyway. It just adds complexity to the API and I don't buy the backward deployment argument.

-Andy

···

On Sep 6, 2017, at 3:42 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 6, 2017, at 3:26 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 2:31 PM, Joe Groff <jgroff@apple.com> wrote:

The fact that we’re using malloc and free is already part of the ABI because old libraries need to be able to deallocate memory allocated by newer libraries.

The compiler doesn't ever generate calls directly to malloc and free, and the runtime entry points we do use already take size and alignment on both allocation and deallocation.

True, we’ve never said that UnsafePointer deallocation is compatible with C, and I we don't currently expose malloc_size functionality in any API AFAIK.

Within the standard library we could make use of some new deallocation fast path in the future without worrying about backward deployment.

Outside of the standard library, clients will get the benefits of whatever allocator is available on their deployed platform because we now encourage them to use UnsafeBufferPointer.deallocate(). We can change the implementation inside UnsafeBufferPointer all we want, as long as it’s still malloc-compatible.

I’m sure we’ll want to provide a better allocation/deallocation API in the future for systems programmers based on move-only types. That will already be deployment-limited.

Absolute worst case, we introduce a sized UnsafePointer.deallocate in the future. Any new code outside the stdlib that wants the performance advantage would either need to
- trivially wrap deallocation using UnsafeBufferPointer
- create a trivial UnsafePointer.deallocate thunk under an availability flag

Since we already have sized deallocate, why would we take it away? If the name is misleading, we can change the name.

I don't think leaving it as an optional argument is worthwhile, as I explained above. Deallocation capacity is either required or we drop it completely. If we keep it, then `allocatedCapacity` is the right name.

The reason for taking it away, beside being misleading, is that it exposes another level of unsafety.

My thinking has been that this is not the allocation fast path of the future, and the stdlib itself could track the size of unsafely-allocated blocks if it ever used a different underlying allocator.

Now I realize this isn't really about fast/slow deallocation paths. Removing `capacity` or even making it optional forces all future Swift implementations to provide malloc_size functionality for any piece of memory that is compatible with the Unsafe API.

I'm actually ok with that, because I think it's generally desirable for application memory to reside in either in malloc-compatible blocks or fixed size pools. i.e. I think malloc_size is something the platform needs. However, you seem to think this would be too restrictive in the future. How is this a known problem for C and what's your confidence level it will be a problem for Swift?

No, I agree that being malloc-compatible is a reasonable goal; on Apple platforms, being able to register any custom allocator we come up with as a malloc zone would mean that the platform's existing memory profiling and debugging tools can still work. Even if we have a scheme where the allocator directly reaches into per-thread fixed-sized pools, it seems to me like it'd be hard to make malloc_size impossible to implement, though it might be slow, asking each pool for each thread whether an address is inside it. Strongly encouraging, if not requiring, user code to provide deallocation sizes seems to me like a prerequisite to making that sort of design a net win over plain malloc/free.

-Joe

I think we agree that the runtime should assert when it is passed an invalid deallocation size. The problem you’re describing could still occur with the UnsafeBufferPointer.deallocate() method if the user happened to slice and rebase the buffer.

-Andy

···

On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

A sized implementation may fail more obviously when you violate the contract in the future. Not having sized deallocation is a known deficiency of the C model we've been fairly diligent about avoiding in Swift's allocation interfaces, and it would be extremely unfortunate for us to backpedal from it.

-Joe

Which is worse, an active gaping hole in Swift’s memory system, or potentially discouraging users from using a hypothetical future allocation API?

Making sure the existing allocation API is working properly is a prerequisite to introducing a future more advanced allocation API.

the proposal specifically defines UnsafeMutableBufferPointer.deallocate() behavior in terms of calling deallocate() on its baseAddress if non-nil. this operation doesn’t have any relation to the length of the buffer itself. that being said, it *should*, and i’d be down for redefining it in terms of deallocating self.count elements, if this was technically possible but it’s not

···

On Sep 6, 2017, at 6:56 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

A sized implementation may fail more obviously when you violate the contract in the future. Not having sized deallocation is a known deficiency of the C model we've been fairly diligent about avoiding in Swift's allocation interfaces, and it would be extremely unfortunate for us to backpedal from it.

-Joe

Which is worse, an active gaping hole in Swift’s memory system, or potentially discouraging users from using a hypothetical future allocation API?

Making sure the existing allocation API is working properly is a prerequisite to introducing a future more advanced allocation API.

I think we agree that the runtime should assert when it is passed an invalid deallocation size. The problem you’re describing could still occur with the UnsafeBufferPointer.deallocate() method if the user happened to slice and rebase the buffer.

-Andy

Currently, memory is deallocated by an instance method on UnsafeMutablePointer, deallocate(count:). Like much of the Swift pointer API, performing this operation on a buffer pointer requires extracting baseAddress! and count. It is very common for the allocation code above to be immediately followed by:

defer

{
buffer.
baseAddress?.deallocate(capacity: buffer.count
)
}

This method is extremely problematic because nearly all users, on first seeing the signature of deallocate(capacity:), will naturally conclude from the capacity label that deallocate(capacity:) is equivalent to some kind of realloc()that can only shrink the buffer. However this is not the actual behavior — deallocate(capacity:) actually ignores the capacity argument and just calls free() on self. The current API is not only awkward and suboptimal, it is misleading. You can write perfectly legal Swift code that shouldn’t segfault, but still can, for example

var ptr = UnsafeMutablePointer<UInt8>.allocate(capacity: 1000000
)
ptr.
initialize(to: 13, count: 1000000
)
ptr.
deallocate(capacity: 500000) // deallocate the second half of the memory block
ptr[0] // segmentation fault
where the first 500000 addresses should still be valid if the documentation is to be read literally.

The fact that the Swift runtime currently uses malloc/free is an implementation detail. Tracking deallocation size is a very useful optimization for better allocator backends, and C++ underwent an ABI break to make it so that sized delete can be supported. Maybe we can change the name to `deallocate(allocatedCapacity:)` to make it clear that it isn't resizing the memory, and/or make the capacity argument optional so that you can pay for the overhead of the allocator deriving the size if it's inconvenient for the calling code to carry the size around, but we shouldn't remove the functionality altogether.

-Joe
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

The idea is to get the house in order by removing all parameters from deallocate(), since that’s what it really does right now. Then, in the future, if Swift gets a more sophisticated allocator backend, a new method like deallocate(capacity:) or reallocate(toCapacity:) could be added without conflicting with the currently named deallocate(capacity:). However, using the function signature to pretend that it does something it can’t actually do right now is extremely dangerous.

I don't think that's a good idea in this case, because it's not unlikely we would explore an optimized allocator soon after ABI stability, and retrofitting these interfaces in a future version of Swift would put a deployment target limit on when they can be used, and mean that a lot of user code would need to be retrofitted to carry allocated capacities where needed to see any benefit.

-Joe

The fact that we’re using malloc and free is already part of the ABI because old libraries need to be able to deallocate memory allocated by newer libraries.

The compiler doesn't ever generate calls directly to malloc and free, and the runtime entry points we do use already take size and alignment on both allocation and deallocation.

Within the standard library we could make use of some new deallocation fast path in the future without worrying about backward deployment.

Outside of the standard library, clients will get the benefits of whatever allocator is available on their deployed platform because we now encourage them to use UnsafeBufferPointer.deallocate(). We can change the implementation inside UnsafeBufferPointer all we want, as long as it’s still malloc-compatible.

I’m sure we’ll want to provide a better allocation/deallocation API in the future for systems programmers based on move-only types. That will already be deployment-limited.

Absolute worst case, we introduce a sized UnsafePointer.deallocate in the future. Any new code outside the stdlib that wants the performance advantage would either need to
- trivially wrap deallocation using UnsafeBufferPointer
- create a trivial UnsafePointer.deallocate thunk under an availability flag

Since we already have sized deallocate, why would we take it away? If the name is misleading, we can change the name.

-Joe

Exactly, we are changing the name to `deallocate()`. As for the old `deallocate(capacity:)` method that *needs* to be removed because it is actively harmful. As I’ve explained it’s not enough to just drop in a sized backend later as an “implementation detail” because it’s not an implementation detail, you’re changing the fundamental behavior of that method.

It would remove a mode of "holding it wrong", but the specified behavior will not change; it has and will always fully deallocate the object being referenced in the cases where the call has defined behavior.

idk what you mean by it holding it wrong, but while the specified behavior won’t change the actual behavior *will* change, because right now the two don’t agree. We can’t just treat it as a bug to fix because it’s been around since the beginning of Swift and some people treat it as part of expected behavior, using the function like `free()`. This isn’t using it incorrectly, in fact in terms of behavior, passing a meaningful capacity argument to `deallocate(capacity:)` is *incorrect usage*, as demonstrated in the segfaulting example in the proposal.

The segfaulting example is an incorrect usage. The only valid parameters to deallocate(capacity:) are the base address of an allocation, and the original capacity passed into allocate(); it has never been intended to support partial deallocation of allocated blocks. It seems to me like this proposal is based on a misunderstanding of how the API works. The documentation and/or name should be clarified.

-Joe

···

On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Sep 6, 2017, at 5:12 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 6, 2017, at 3:07 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Sep 6, 2017, at 4:31 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 6, 2017, at 2:28 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 1:12 PM, Joe Groff via swift-evolution <swift-evolution@swift.org> wrote:
On Sep 6, 2017, at 1:06 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Wed, Sep 6, 2017 at 12:41 PM, Joe Groff via swift-evolution <swift-evolution@swift.org> wrote:

“fixing” this bug will cause programs that once operated on previously valid assumptions of “free()” semantics to behave differently, without any warnings ever being generated. Conversely incorrect code will suddenly become “correct” though this is less of a problem.

A sized implementation may fail more obviously when you violate the contract in the future. Not having sized deallocation is a known deficiency of the C model we've been fairly diligent about avoiding in Swift's allocation interfaces, and it would be extremely unfortunate for us to backpedal from it.

-Joe

Which is worse, an active gaping hole in Swift’s memory system, or potentially discouraging users from using a hypothetical future allocation API?

Making sure the existing allocation API is working properly is a prerequisite to introducing a future more advanced allocation API.

Sure, that probably makes sense if we decide to go with a byte offset vs. stride index.
-Andy

···

On Sep 2, 2017, at 5:34 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Sep 2, 2017 at 4:41 PM, Andrew Trick <atrick@apple.com <mailto:atrick@apple.com>> wrote:

On Sep 2, 2017, at 2:06 PM, Taylor Swift via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

the subscript doesn’t know about the stride that you want to use. If you want to get rid of the `at:` ambiguity, you have to get rid of it everywhere, or users will just wind up having to remember two ways (one ambiguous and one less ambiguous) of doing the same thing, instead of one (ambiguous) way of doing it.

Certainly, that's a good point. On rethink and another re-reading of the proposal, it's unclear to me that the addition of `at` arguments to UnsafeMutablePointer is sufficiently justified by the proposal text. Is it merely that it's shorter than writing `foo + MemoryLayout<T>.stride * offset`? With the ambiguity of `at`, it seems that the current way of writing it, though longer, is certainly less ambiguous.

Please reread it; UnsafeMutablePointer’s methods do not use `at:`.

Regarding the typed buffer pointers, I think it is clear by convention, and will be well documented, that the `at` label refers to a position in `self`.

The raw buffer pointer API isn’t so obvious. Since the `at` refers to `self` it might more logically be a byte offset. Note that `at` as a label name always refers to a strided index.

This would be a bit more explicit:
UnsafeRawBufferPointer.initializeMemory(as:T.self, atByteOffset: position * MemoryLayout<T>.stride, from: bufferOfT)

But possibly less convenient… Since that `at` label currently on UnsafeRawPointer.initializeMemory is almost never used, I don’t think we need to worry too much about convenience.

That existing `at` label on UnsafeRawPointer.initializeMemory, would also need to be renamed, which is fine.

If I may suggest one more incremental improvement in clarity, it would be to move `at[ByteOffset]` to be the first argument; this eliminates the possible reading that we are offsetting the source buffer:

UnsafeRawBufferPointer.initializeMemory(atByteOffset: position * MemoryLayout<T>.stride, as:T.self, from: bufferOfT)

If we use byte offset, then the at parameter in UnsafeMutableRawPointer
should be removed, since pointer arithmetic can be used instead (just like
with UnsafeMutablePointer). Not convinced moving the at: argument to come
before the as: argument is worth it in terms of source breakage.

···

On Sat, Sep 2, 2017 at 7:53 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 2, 2017, at 5:34 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Sep 2, 2017 at 4:41 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 2, 2017, at 2:06 PM, Taylor Swift via swift-evolution < >> swift-evolution@swift.org> wrote:

the subscript doesn’t know about the stride that you want to use. If you

want to get rid of the `at:` ambiguity, you have to get rid of it
everywhere, or users will just wind up having to remember two ways (one
ambiguous and one less ambiguous) of doing the same thing, instead of one
(ambiguous) way of doing it.

Certainly, that's a good point. On rethink and another re-reading of the
proposal, it's unclear to me that the addition of `at` arguments to
UnsafeMutablePointer is sufficiently justified by the proposal text. Is it
merely that it's shorter than writing `foo + MemoryLayout<T>.stride *
offset`? With the ambiguity of `at`, it seems that the current way of
writing it, though longer, is certainly less ambiguous.

Please reread it; UnsafeMutablePointer’s methods do *not* use `at:`.

Regarding the typed buffer pointers, I think it is clear by convention,
and will be well documented, that the `at` label refers to a position in
`self`.

The raw buffer pointer API isn’t so obvious. Since the `at` refers to
`self` it might more logically be a byte offset. Note that `at` as a label
name always refers to a strided index.

This would be a bit more explicit:
UnsafeRawBufferPointer.initializeMemory(as:T.self, atByteOffset:
position * MemoryLayout<T>.stride, from: bufferOfT)

But possibly less convenient… Since that `at` label currently on
UnsafeRawPointer.initializeMemory is almost never used, I don’t think we
need to worry too much about convenience.

That existing `at` label on UnsafeRawPointer.initializeMemory, would
also need to be renamed, which is fine.

If I may suggest one more incremental improvement in clarity, it would be
to move `at[ByteOffset]` to be the first argument; this eliminates the
possible reading that we are offsetting the source buffer:

UnsafeRawBufferPointer.initializeMemory(atByteOffset: position *
MemoryLayout<T>.stride, as:T.self, from: bufferOfT)

Sure, that probably makes sense if we decide to go with a byte offset vs.
stride index.
-Andy

The fact that we’re using malloc and free is already part of the ABI because old libraries need to be able to deallocate memory allocated by newer libraries.

The compiler doesn't ever generate calls directly to malloc and free, and the runtime entry points we do use already take size and alignment on both allocation and deallocation.

True, we’ve never said that UnsafePointer deallocation is compatible with C, and I we don't currently expose malloc_size functionality in any API AFAIK.

Within the standard library we could make use of some new deallocation fast path in the future without worrying about backward deployment.

Outside of the standard library, clients will get the benefits of whatever allocator is available on their deployed platform because we now encourage them to use UnsafeBufferPointer.deallocate(). We can change the implementation inside UnsafeBufferPointer all we want, as long as it’s still malloc-compatible.

I’m sure we’ll want to provide a better allocation/deallocation API in the future for systems programmers based on move-only types. That will already be deployment-limited.

Absolute worst case, we introduce a sized UnsafePointer.deallocate in the future. Any new code outside the stdlib that wants the performance advantage would either need to
- trivially wrap deallocation using UnsafeBufferPointer
- create a trivial UnsafePointer.deallocate thunk under an availability flag

Since we already have sized deallocate, why would we take it away? If the name is misleading, we can change the name.

I don't think leaving it as an optional argument is worthwhile, as I explained above. Deallocation capacity is either required or we drop it completely. If we keep it, then `allocatedCapacity` is the right name.

The reason for taking it away, beside being misleading, is that it exposes another level of unsafety.

My thinking has been that this is not the allocation fast path of the future, and the stdlib itself could track the size of unsafely-allocated blocks if it ever used a different underlying allocator.

Now I realize this isn't really about fast/slow deallocation paths. Removing `capacity` or even making it optional forces all future Swift implementations to provide malloc_size functionality for any piece of memory that is compatible with the Unsafe API.

I'm actually ok with that, because I think it's generally desirable for application memory to reside in either in malloc-compatible blocks or fixed size pools. i.e. I think malloc_size is something the platform needs. However, you seem to think this would be too restrictive in the future. How is this a known problem for C and what's your confidence level it will be a problem for Swift?

No, I agree that being malloc-compatible is a reasonable goal; on Apple platforms, being able to register any custom allocator we come up with as a malloc zone would mean that the platform's existing memory profiling and debugging tools can still work. Even if we have a scheme where the allocator directly reaches into per-thread fixed-sized pools, it seems to me like it'd be hard to make malloc_size impossible to implement, though it might be slow, asking each pool for each thread whether an address is inside it. Strongly encouraging, if not requiring, user code to provide deallocation sizes seems to me like a prerequisite to making that sort of design a net win over plain malloc/free.

-Joe

Ok good. For growable buffers, we also want the OS to give us malloc_size which may be larger than requested capacity.

The semantics of buffer.deallocate() needs to be: free `buffer.count` bytes of memory at `buffer.baseAddress`. So, that will always be the fast path!
Kelvin, do you agree with that?

this could be problematic if you have multiple contiguous buffers carved out of the same heap block. i agree that this is the best semantics for buffer pointers but we need the sized backend in Swift before this is possible else we will end up in the same boat we’re in right now with `deallocate(capacity:)` where we would have to make buffer deallocate heap block-based for now and then pull the rug out from underneath users later in order to switch to the improved semantics

Any future safe API for manual buffer allocation/deallocation will also track the buffer size, so that will also be the fast path.

In the future, pointer.deallocate() without capacity might become a slower path. But I think that the UnsafePointer allocation/deallocation paths are Swift's equivalent of malloc/free. If the client code knows the buffer size, it shouldn't be using those. In fact, we don't want to force the client to track capacity if deallocation is *not* performance critical.

agreed

···

On Sep 6, 2017, at 6:27 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 3:42 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 6, 2017, at 3:26 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 2:31 PM, Joe Groff <jgroff@apple.com> wrote:

I suppose an optional `allocatedCapacity` argument would be ok, since we should be asserting in the runtime anyway. It just adds complexity to the API and I don't buy the backward deployment argument.

-Andy

I don’t see any source for this claim in the documentation
<https://developer.apple.com/documentation/swift/unsafemutablepointer/2295090-deallocate>,
or the source code
<https://github.com/apple/swift/blob/master/stdlib/public/core/UnsafePointer.swift.gyb#L432>.
As far as I can tell the expected behavior is that partial deallocation
“should” work.

···

On Thu, Sep 7, 2017 at 8:59 AM, Joe Groff <jgroff@apple.com> wrote:

> On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:
>
>
>
>> On Sep 6, 2017, at 5:12 PM, Joe Groff <jgroff@apple.com> wrote:
>>
>>
>>> On Sep 6, 2017, at 3:07 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:
>>>
>>>
>>>
>>>> On Sep 6, 2017, at 4:31 PM, Joe Groff <jgroff@apple.com> wrote:
>>>>
>>>>
>>>>> On Sep 6, 2017, at 2:28 PM, Andrew Trick <atrick@apple.com> wrote:
>>>>>
>>>>>
>>>>>>> On Sep 6, 2017, at 1:12 PM, Joe Groff via swift-evolution < > swift-evolution@swift.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Sep 6, 2017, at 1:06 PM, Taylor Swift <kelvin13ma@gmail.com> > wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, Sep 6, 2017 at 12:41 PM, Joe Groff via swift-evolution < > swift-evolution@swift.org> wrote:
>>>>>>>> Currently, memory is deallocated by an instance method on
UnsafeMutablePointer, deallocate(count:). Like much of the Swift pointer
API, performing this operation on a buffer pointer requires extracting
baseAddress! and count. It is very common for the allocation code above to
be immediately followed by:
>>>>>>>>
>>>>>>>> defer
>>>>>>>>
>>>>>>>> {
>>>>>>>> buffer.
>>>>>>>> baseAddress?.deallocate(capacity: buffer.count
>>>>>>>> )
>>>>>>>> }
>>>>>>>>
>>>>>>>> This method is extremely problematic because nearly all users, on
first seeing the signature of deallocate(capacity:), will naturally
conclude from the capacity label that deallocate(capacity:) is equivalent
to some kind of realloc()that can only shrink the buffer. However this is
not the actual behavior — deallocate(capacity:) actually ignores the
capacity argument and just calls free() on self. The current API is not
only awkward and suboptimal, it is misleading. You can write perfectly
legal Swift code that shouldn’t segfault, but still can, for example
>>>>>>>>
>>>>>>>> var ptr = UnsafeMutablePointer<UInt8>.allocate(capacity: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> initialize(to: 13, count: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> deallocate(capacity: 500000) // deallocate the second half of the
memory block
>>>>>>>> ptr[0] // segmentation fault
>>>>>>>> where the first 500000 addresses should still be valid if the
documentation is to be read literally.
>>>>>>>
>>>>>>> The fact that the Swift runtime currently uses malloc/free is an
implementation detail. Tracking deallocation size is a very useful
optimization for better allocator backends, and C++ underwent an ABI break
to make it so that sized delete can be supported. Maybe we can change the
name to `deallocate(allocatedCapacity:)` to make it clear that it isn't
resizing the memory, and/or make the capacity argument optional so that you
can pay for the overhead of the allocator deriving the size if it's
inconvenient for the calling code to carry the size around, but we
shouldn't remove the functionality altogether.
>>>>>>>
>>>>>>> -Joe
>>>>>>> _______________________________________________
>>>>>>> swift-evolution mailing list
>>>>>>> swift-evolution@swift.org
>>>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>>>
>>>>>>> The idea is to get the house in order by removing all parameters
from deallocate(), since that’s what it really does right now. Then, in the
future, if Swift gets a more sophisticated allocator backend, a new method
like deallocate(capacity:) or reallocate(toCapacity:) could be added
without conflicting with the currently named deallocate(capacity:).
However, using the function signature to pretend that it does something it
can’t actually do right now is extremely dangerous.
>>>>>>
>>>>>> I don't think that's a good idea in this case, because it's not
unlikely we would explore an optimized allocator soon after ABI stability,
and retrofitting these interfaces in a future version of Swift would put a
deployment target limit on when they can be used, and mean that a lot of
user code would need to be retrofitted to carry allocated capacities where
needed to see any benefit.
>>>>>>
>>>>>> -Joe
>>>>>
>>>>> The fact that we’re using malloc and free is already part of the ABI
because old libraries need to be able to deallocate memory allocated by
newer libraries.
>>>>
>>>> The compiler doesn't ever generate calls directly to malloc and free,
and the runtime entry points we do use already take size and alignment on
both allocation and deallocation.
>>>>
>>>>> Within the standard library we could make use of some new
deallocation fast path in the future without worrying about backward
deployment.
>>>>>
>>>>> Outside of the standard library, clients will get the benefits of
whatever allocator is available on their deployed platform because we now
encourage them to use UnsafeBufferPointer.deallocate(). We can change the
implementation inside UnsafeBufferPointer all we want, as long as it’s
still malloc-compatible.
>>>>>
>>>>> I’m sure we’ll want to provide a better allocation/deallocation API
in the future for systems programmers based on move-only types. That will
already be deployment-limited.
>>>>>
>>>>> Absolute worst case, we introduce a sized UnsafePointer.deallocate
in the future. Any new code outside the stdlib that wants the performance
advantage would either need to
>>>>> - trivially wrap deallocation using UnsafeBufferPointer
>>>>> - create a trivial UnsafePointer.deallocate thunk under an
availability flag
>>>>
>>>> Since we already have sized deallocate, why would we take it away? If
the name is misleading, we can change the name.
>>>>
>>>> -Joe
>>>
>>> Exactly, we are changing the name to `deallocate()`. As for the old
`deallocate(capacity:)` method that *needs* to be removed because it is
actively harmful. As I’ve explained it’s not enough to just drop in a sized
backend later as an “implementation detail” because it’s not an
implementation detail, you’re changing the fundamental behavior of that
method.
>>
>> It would remove a mode of "holding it wrong", but the specified
behavior will not change; it has and will always fully deallocate the
object being referenced in the cases where the call has defined behavior.
>
> idk what you mean by it holding it wrong, but while the specified
behavior won’t change the actual behavior *will* change, because right now
the two don’t agree. We can’t just treat it as a bug to fix because it’s
been around since the beginning of Swift and some people treat it as part
of expected behavior, using the function like `free()`. This isn’t using it
incorrectly, in fact in terms of behavior, passing a meaningful capacity
argument to `deallocate(capacity:)` is *incorrect usage*, as demonstrated
in the segfaulting example in the proposal.

The segfaulting example is an incorrect usage. The only valid parameters
to deallocate(capacity:) are the base address of an allocation, and the
original capacity passed into allocate(); it has never been intended to
support partial deallocation of allocated blocks. It seems to me like this
proposal is based on a misunderstanding of how the API works. The
documentation and/or name should be clarified.

-Joe

> “fixing” this bug will cause programs that once operated on previously
valid assumptions of “free()” semantics to behave differently, without any
warnings ever being generated. Conversely incorrect code will suddenly
become “correct” though this is less of a problem.
>
>> A sized implementation may fail more obviously when you violate the
contract in the future. Not having sized deallocation is a known deficiency
of the C model we've been fairly diligent about avoiding in Swift's
allocation interfaces, and it would be extremely unfortunate for us to
backpedal from it.
>>
>> -Joe
>
> Which is worse, an active gaping hole in Swift’s memory system, or
potentially discouraging users from using a hypothetical future allocation
API?
>
> Making sure the existing allocation API is working properly is a
prerequisite to introducing a future more advanced allocation API.

the subscript doesn’t know about the stride that you want to use. If you

want to get rid of the `at:` ambiguity, you have to get rid of it
everywhere, or users will just wind up having to remember two ways (one
ambiguous and one less ambiguous) of doing the same thing, instead of one
(ambiguous) way of doing it.

Certainly, that's a good point. On rethink and another re-reading of
the proposal, it's unclear to me that the addition of `at` arguments to
UnsafeMutablePointer is sufficiently justified by the proposal text. Is it
merely that it's shorter than writing `foo + MemoryLayout<T>.stride *
offset`? With the ambiguity of `at`, it seems that the current way of
writing it, though longer, is certainly less ambiguous.

Please reread it; UnsafeMutablePointer’s methods do *not* use `at:`.

Regarding the typed buffer pointers, I think it is clear by convention,
and will be well documented, that the `at` label refers to a position in
`self`.

The raw buffer pointer API isn’t so obvious. Since the `at` refers to
`self` it might more logically be a byte offset. Note that `at` as a label
name always refers to a strided index.

This would be a bit more explicit:
UnsafeRawBufferPointer.initializeMemory(as:T.self, atByteOffset:
position * MemoryLayout<T>.stride, from: bufferOfT)

But possibly less convenient… Since that `at` label currently on
UnsafeRawPointer.initializeMemory is almost never used, I don’t think we
need to worry too much about convenience.

That existing `at` label on UnsafeRawPointer.initializeMemory, would
also need to be renamed, which is fine.

If I may suggest one more incremental improvement in clarity, it would be
to move `at[ByteOffset]` to be the first argument; this eliminates the
possible reading that we are offsetting the source buffer:

UnsafeRawBufferPointer.initializeMemory(atByteOffset: position *
MemoryLayout<T>.stride, as:T.self, from: bufferOfT)

Sure, that probably makes sense if we decide to go with a byte offset vs.
stride index.
-Andy

If we use byte offset, then the at parameter in UnsafeMutableRawPointer
should be removed, since pointer arithmetic can be used instead (just like
with UnsafeMutablePointer).

I agree that it seems quite sensible to remove the ‘at’ parameter
altogether from the UMRP method.

Not convinced moving the at: argument to come before the as: argument is

worth it in terms of source breakage.

Since much of this proposal involves shuffling and relabeling arguments,
I’d argue it’s better to break slight more source in one go for the optimal
API than to break slightly less for a slightly less optimal API, no? (This
is assuming there is agreement that ‘at:as:’ is less prone to
misinterpretation than ‘as:at:’.)

···

On Sun, Sep 3, 2017 at 21:43 Taylor Swift <kelvin13ma@gmail.com> wrote:

On Sat, Sep 2, 2017 at 7:53 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 2, 2017, at 5:34 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Sep 2, 2017 at 4:41 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 2, 2017, at 2:06 PM, Taylor Swift via swift-evolution < >>> swift-evolution@swift.org> wrote:

If I understand your proposal, it’s only valid to deallocate a buffer that was allocated with the same capacity. Anything else should assert.
-Andy

···

On Sep 6, 2017, at 4:54 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

The semantics of buffer.deallocate() needs to be: free `buffer.count` bytes of memory at `buffer.baseAddress`. So, that will always be the fast path!
Kelvin, do you agree with that?

this could be problematic if you have multiple contiguous buffers carved out of the same heap block. i agree that this is the best semantics for buffer pointers but we need the sized backend in Swift before this is possible else we will end up in the same boat we’re in right now with `deallocate(capacity:)` where we would have to make buffer deallocate heap block-based for now and then pull the rug out from underneath users later in order to switch to the improved semantics

the proposal isn’t specific enough there and that’s my fault but this seems like a good solution. in the future if we get a sized backend we can loosen the assertions and make the partial heap block buffer case defined behavior.

···

On Sep 6, 2017, at 7:01 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 6, 2017, at 4:54 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

The semantics of buffer.deallocate() needs to be: free `buffer.count` bytes of memory at `buffer.baseAddress`. So, that will always be the fast path!
Kelvin, do you agree with that?

this could be problematic if you have multiple contiguous buffers carved out of the same heap block. i agree that this is the best semantics for buffer pointers but we need the sized backend in Swift before this is possible else we will end up in the same boat we’re in right now with `deallocate(capacity:)` where we would have to make buffer deallocate heap block-based for now and then pull the rug out from underneath users later in order to switch to the improved semantics

If I understand your proposal, it’s only valid to deallocate a buffer that was allocated with the same capacity. Anything else should assert.
-Andy

I don’t see any source for this claim in the documentation <https://developer.apple.com/documentation/swift/unsafemutablepointer/2295090-deallocate>, or the source code <https://github.com/apple/swift/blob/master/stdlib/public/core/UnsafePointer.swift.gyb#L432>. As far as I can tell the expected behavior is that partial deallocation “should” work.

We should fix the documentation bug then.

Another way we could make these APIs easier to use correctly is to leverage the type system. allocate could return the allocation size wrapped up in an opaque AllocationToken type, and deallocate could take an AllocationToken:

static func allocate(capacity: Int) -> (UnsafeMutableBufferPointer<T>, AllocationToken)

func deallocate(token: AllocationToken)

That would make it harder for user code to pass an arbitrary size in, and make the relationship between the allocate and deallocate calls more explicit in their signatures. If we get an ABI break and a bold new idea for an allocator design with different context needs in the future, it'd also insulate source code from the exact contents of the information that needs to be carried from allocation to deallocation.

-Joe

···

On Sep 7, 2017, at 8:06 AM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Thu, Sep 7, 2017 at 8:59 AM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:

> On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com <mailto:kelvin13ma@gmail.com>> wrote:
>
>
>
>> On Sep 6, 2017, at 5:12 PM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:
>>
>>
>>> On Sep 6, 2017, at 3:07 PM, Taylor Swift <kelvin13ma@gmail.com <mailto:kelvin13ma@gmail.com>> wrote:
>>>
>>>
>>>
>>>> On Sep 6, 2017, at 4:31 PM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:
>>>>
>>>>
>>>>> On Sep 6, 2017, at 2:28 PM, Andrew Trick <atrick@apple.com <mailto:atrick@apple.com>> wrote:
>>>>>
>>>>>
>>>>>>> On Sep 6, 2017, at 1:12 PM, Joe Groff via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Sep 6, 2017, at 1:06 PM, Taylor Swift <kelvin13ma@gmail.com <mailto:kelvin13ma@gmail.com>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, Sep 6, 2017 at 12:41 PM, Joe Groff via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>>>>>>>> Currently, memory is deallocated by an instance method on UnsafeMutablePointer, deallocate(count:). Like much of the Swift pointer API, performing this operation on a buffer pointer requires extracting baseAddress! and count. It is very common for the allocation code above to be immediately followed by:
>>>>>>>>
>>>>>>>> defer
>>>>>>>>
>>>>>>>> {
>>>>>>>> buffer.
>>>>>>>> baseAddress?.deallocate(capacity: buffer.count
>>>>>>>> )
>>>>>>>> }
>>>>>>>>
>>>>>>>> This method is extremely problematic because nearly all users, on first seeing the signature of deallocate(capacity:), will naturally conclude from the capacity label that deallocate(capacity:) is equivalent to some kind of realloc()that can only shrink the buffer. However this is not the actual behavior — deallocate(capacity:) actually ignores the capacity argument and just calls free() on self. The current API is not only awkward and suboptimal, it is misleading. You can write perfectly legal Swift code that shouldn’t segfault, but still can, for example
>>>>>>>>
>>>>>>>> var ptr = UnsafeMutablePointer<UInt8>.allocate(capacity: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> initialize(to: 13, count: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> deallocate(capacity: 500000) // deallocate the second half of the memory block
>>>>>>>> ptr[0] // segmentation fault
>>>>>>>> where the first 500000 addresses should still be valid if the documentation is to be read literally.
>>>>>>>
>>>>>>> The fact that the Swift runtime currently uses malloc/free is an implementation detail. Tracking deallocation size is a very useful optimization for better allocator backends, and C++ underwent an ABI break to make it so that sized delete can be supported. Maybe we can change the name to `deallocate(allocatedCapacity:)` to make it clear that it isn't resizing the memory, and/or make the capacity argument optional so that you can pay for the overhead of the allocator deriving the size if it's inconvenient for the calling code to carry the size around, but we shouldn't remove the functionality altogether.
>>>>>>>
>>>>>>> -Joe
>>>>>>> _______________________________________________
>>>>>>> swift-evolution mailing list
>>>>>>> swift-evolution@swift.org <mailto:swift-evolution@swift.org>
>>>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>>>
>>>>>>> The idea is to get the house in order by removing all parameters from deallocate(), since that’s what it really does right now. Then, in the future, if Swift gets a more sophisticated allocator backend, a new method like deallocate(capacity:) or reallocate(toCapacity:) could be added without conflicting with the currently named deallocate(capacity:). However, using the function signature to pretend that it does something it can’t actually do right now is extremely dangerous.
>>>>>>
>>>>>> I don't think that's a good idea in this case, because it's not unlikely we would explore an optimized allocator soon after ABI stability, and retrofitting these interfaces in a future version of Swift would put a deployment target limit on when they can be used, and mean that a lot of user code would need to be retrofitted to carry allocated capacities where needed to see any benefit.
>>>>>>
>>>>>> -Joe
>>>>>
>>>>> The fact that we’re using malloc and free is already part of the ABI because old libraries need to be able to deallocate memory allocated by newer libraries.
>>>>
>>>> The compiler doesn't ever generate calls directly to malloc and free, and the runtime entry points we do use already take size and alignment on both allocation and deallocation.
>>>>
>>>>> Within the standard library we could make use of some new deallocation fast path in the future without worrying about backward deployment.
>>>>>
>>>>> Outside of the standard library, clients will get the benefits of whatever allocator is available on their deployed platform because we now encourage them to use UnsafeBufferPointer.deallocate(). We can change the implementation inside UnsafeBufferPointer all we want, as long as it’s still malloc-compatible.
>>>>>
>>>>> I’m sure we’ll want to provide a better allocation/deallocation API in the future for systems programmers based on move-only types. That will already be deployment-limited.
>>>>>
>>>>> Absolute worst case, we introduce a sized UnsafePointer.deallocate in the future. Any new code outside the stdlib that wants the performance advantage would either need to
>>>>> - trivially wrap deallocation using UnsafeBufferPointer
>>>>> - create a trivial UnsafePointer.deallocate thunk under an availability flag
>>>>
>>>> Since we already have sized deallocate, why would we take it away? If the name is misleading, we can change the name.
>>>>
>>>> -Joe
>>>
>>> Exactly, we are changing the name to `deallocate()`. As for the old `deallocate(capacity:)` method that *needs* to be removed because it is actively harmful. As I’ve explained it’s not enough to just drop in a sized backend later as an “implementation detail” because it’s not an implementation detail, you’re changing the fundamental behavior of that method.
>>
>> It would remove a mode of "holding it wrong", but the specified behavior will not change; it has and will always fully deallocate the object being referenced in the cases where the call has defined behavior.
>
> idk what you mean by it holding it wrong, but while the specified behavior won’t change the actual behavior *will* change, because right now the two don’t agree. We can’t just treat it as a bug to fix because it’s been around since the beginning of Swift and some people treat it as part of expected behavior, using the function like `free()`. This isn’t using it incorrectly, in fact in terms of behavior, passing a meaningful capacity argument to `deallocate(capacity:)` is *incorrect usage*, as demonstrated in the segfaulting example in the proposal.

The segfaulting example is an incorrect usage. The only valid parameters to deallocate(capacity:) are the base address of an allocation, and the original capacity passed into allocate(); it has never been intended to support partial deallocation of allocated blocks. It seems to me like this proposal is based on a misunderstanding of how the API works. The documentation and/or name should be clarified.

-Joe

> “fixing” this bug will cause programs that once operated on previously valid assumptions of “free()” semantics to behave differently, without any warnings ever being generated. Conversely incorrect code will suddenly become “correct” though this is less of a problem.
>
>> A sized implementation may fail more obviously when you violate the contract in the future. Not having sized deallocation is a known deficiency of the C model we've been fairly diligent about avoiding in Swift's allocation interfaces, and it would be extremely unfortunate for us to backpedal from it.
>>
>> -Joe
>
> Which is worse, an active gaping hole in Swift’s memory system, or potentially discouraging users from using a hypothetical future allocation API?
>
> Making sure the existing allocation API is working properly is a prerequisite to introducing a future more advanced allocation API.

This discussion needs to be grounded by reiterating role of the API. UnsafePointer specifies the memory model without extraneous functionality or convenience.

The UnsafePointer.deallocate() API *is not*:

- a common, expected, or encouraged way to deallocate

- the simplest, safest, or most convenient way to deallocate

- necessarilly the most optimal path for deallocation

There is only one decision that needs to be made here. Does the Swift runtime track allocation size for manually allocated blocks? I think the answer should be "yes", or at least haven't heard a strong argument against it. UnsafePointer.deallocate() needs to direcly reflect that model by making `allocatedCapacity` an *optional* argument.

Discussion about whether this API is unsafe, misleading, suboptimal or incorrectly implemented are secondary. Those are all deficiencies in the current documentation, current implementation, and availability of higher-level APIs.

Note that yesterday I argued that an optional argument wasn't worth the potential for confusion. That's true from a practical perspective, but I had lost sight of need to clearly specify the memory model. We want the Swift runtime to both have the functionality for tracking block size and also allow user code to track it more efficiently. Both those intentions need to be reflected in this API.

-Andy

···

On Sep 7, 2017, at 8:06 AM, Taylor Swift <kelvin13ma@gmail.com> wrote:

I don’t see any source for this claim in the documentation <https://developer.apple.com/documentation/swift/unsafemutablepointer/2295090-deallocate>, or the source code <https://github.com/apple/swift/blob/master/stdlib/public/core/UnsafePointer.swift.gyb#L432>. As far as I can tell the expected behavior is that partial deallocation “should” work.

On Thu, Sep 7, 2017 at 8:59 AM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:

The segfaulting example is an incorrect usage. The only valid parameters to deallocate(capacity:) are the base address of an allocation, and the original capacity passed into allocate(); it has never been intended to support partial deallocation of allocated blocks. It seems to me like this proposal is based on a misunderstanding of how the API works. The documentation and/or name should be clarified.

-Joe

> “fixing” this bug will cause programs that once operated on previously valid assumptions of “free()” semantics to behave differently, without any warnings ever being generated. Conversely incorrect code will suddenly become “correct” though this is less of a problem.
>
>> A sized implementation may fail more obviously when you violate the contract in the future. Not having sized deallocation is a known deficiency of the C model we've been fairly diligent about avoiding in Swift's allocation interfaces, and it would be extremely unfortunate for us to backpedal from it.
>>
>> -Joe

not a fan of this. A lot of data structures including buffer pointers
already store their capacity in a property, storing an allocation token
would be redundant and waste space. It also forces allocate(capacity:) to
return a tuple which is a big turn off for me.

···

On Thu, Sep 7, 2017 at 10:56 AM, Joe Groff <jgroff@apple.com> wrote:

On Sep 7, 2017, at 8:06 AM, Taylor Swift <kelvin13ma@gmail.com> wrote:

I don’t see any source for this claim in the documentation
<https://developer.apple.com/documentation/swift/unsafemutablepointer/2295090-deallocate>,
or the source code
<https://github.com/apple/swift/blob/master/stdlib/public/core/UnsafePointer.swift.gyb#L432>.
As far as I can tell the expected behavior is that partial deallocation
“should” work.

We should fix the documentation bug then.

Another way we could make these APIs easier to use correctly is to
leverage the type system. allocate could return the allocation size wrapped
up in an opaque AllocationToken type, and deallocate could take an
AllocationToken:

static func allocate(capacity: Int) -> (UnsafeMutableBufferPointer<T>,
AllocationToken)

func deallocate(token: AllocationToken)

That would make it harder for user code to pass an arbitrary size in, and
make the relationship between the allocate and deallocate calls more
explicit in their signatures. If we get an ABI break and a bold new idea
for an allocator design with different context needs in the future, it'd
also insulate source code from the exact contents of the information that
needs to be carried from allocation to deallocation.

-Joe

On Thu, Sep 7, 2017 at 8:59 AM, Joe Groff <jgroff@apple.com> wrote:

> On Sep 6, 2017, at 4:46 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:
>
>
>
>> On Sep 6, 2017, at 5:12 PM, Joe Groff <jgroff@apple.com> wrote:
>>
>>
>>> On Sep 6, 2017, at 3:07 PM, Taylor Swift <kelvin13ma@gmail.com> >> wrote:
>>>
>>>
>>>
>>>> On Sep 6, 2017, at 4:31 PM, Joe Groff <jgroff@apple.com> wrote:
>>>>
>>>>
>>>>> On Sep 6, 2017, at 2:28 PM, Andrew Trick <atrick@apple.com> wrote:
>>>>>
>>>>>
>>>>>>> On Sep 6, 2017, at 1:12 PM, Joe Groff via swift-evolution < >> swift-evolution@swift.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Sep 6, 2017, at 1:06 PM, Taylor Swift <kelvin13ma@gmail.com> >> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, Sep 6, 2017 at 12:41 PM, Joe Groff via swift-evolution < >> swift-evolution@swift.org> wrote:
>>>>>>>> Currently, memory is deallocated by an instance method on
UnsafeMutablePointer, deallocate(count:). Like much of the Swift pointer
API, performing this operation on a buffer pointer requires extracting
baseAddress! and count. It is very common for the allocation code above to
be immediately followed by:
>>>>>>>>
>>>>>>>> defer
>>>>>>>>
>>>>>>>> {
>>>>>>>> buffer.
>>>>>>>> baseAddress?.deallocate(capacity: buffer.count
>>>>>>>> )
>>>>>>>> }
>>>>>>>>
>>>>>>>> This method is extremely problematic because nearly all users,
on first seeing the signature of deallocate(capacity:), will naturally
conclude from the capacity label that deallocate(capacity:) is equivalent
to some kind of realloc()that can only shrink the buffer. However this is
not the actual behavior — deallocate(capacity:) actually ignores the
capacity argument and just calls free() on self. The current API is not
only awkward and suboptimal, it is misleading. You can write perfectly
legal Swift code that shouldn’t segfault, but still can, for example
>>>>>>>>
>>>>>>>> var ptr = UnsafeMutablePointer<UInt8>.allocate(capacity: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> initialize(to: 13, count: 1000000
>>>>>>>> )
>>>>>>>> ptr.
>>>>>>>> deallocate(capacity: 500000) // deallocate the second half of
the memory block
>>>>>>>> ptr[0] // segmentation fault
>>>>>>>> where the first 500000 addresses should still be valid if the
documentation is to be read literally.
>>>>>>>
>>>>>>> The fact that the Swift runtime currently uses malloc/free is an
implementation detail. Tracking deallocation size is a very useful
optimization for better allocator backends, and C++ underwent an ABI break
to make it so that sized delete can be supported. Maybe we can change the
name to `deallocate(allocatedCapacity:)` to make it clear that it isn't
resizing the memory, and/or make the capacity argument optional so that you
can pay for the overhead of the allocator deriving the size if it's
inconvenient for the calling code to carry the size around, but we
shouldn't remove the functionality altogether.
>>>>>>>
>>>>>>> -Joe
>>>>>>> _______________________________________________
>>>>>>> swift-evolution mailing list
>>>>>>> swift-evolution@swift.org
>>>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>>>
>>>>>>> The idea is to get the house in order by removing all parameters
from deallocate(), since that’s what it really does right now. Then, in the
future, if Swift gets a more sophisticated allocator backend, a new method
like deallocate(capacity:) or reallocate(toCapacity:) could be added
without conflicting with the currently named deallocate(capacity:).
However, using the function signature to pretend that it does something it
can’t actually do right now is extremely dangerous.
>>>>>>
>>>>>> I don't think that's a good idea in this case, because it's not
unlikely we would explore an optimized allocator soon after ABI stability,
and retrofitting these interfaces in a future version of Swift would put a
deployment target limit on when they can be used, and mean that a lot of
user code would need to be retrofitted to carry allocated capacities where
needed to see any benefit.
>>>>>>
>>>>>> -Joe
>>>>>
>>>>> The fact that we’re using malloc and free is already part of the
ABI because old libraries need to be able to deallocate memory allocated by
newer libraries.
>>>>
>>>> The compiler doesn't ever generate calls directly to malloc and
free, and the runtime entry points we do use already take size and
alignment on both allocation and deallocation.
>>>>
>>>>> Within the standard library we could make use of some new
deallocation fast path in the future without worrying about backward
deployment.
>>>>>
>>>>> Outside of the standard library, clients will get the benefits of
whatever allocator is available on their deployed platform because we now
encourage them to use UnsafeBufferPointer.deallocate(). We can change
the implementation inside UnsafeBufferPointer all we want, as long as it’s
still malloc-compatible.
>>>>>
>>>>> I’m sure we’ll want to provide a better allocation/deallocation API
in the future for systems programmers based on move-only types. That will
already be deployment-limited.
>>>>>
>>>>> Absolute worst case, we introduce a sized UnsafePointer.deallocate
in the future. Any new code outside the stdlib that wants the performance
advantage would either need to
>>>>> - trivially wrap deallocation using UnsafeBufferPointer
>>>>> - create a trivial UnsafePointer.deallocate thunk under an
availability flag
>>>>
>>>> Since we already have sized deallocate, why would we take it away?
If the name is misleading, we can change the name.
>>>>
>>>> -Joe
>>>
>>> Exactly, we are changing the name to `deallocate()`. As for the old
`deallocate(capacity:)` method that *needs* to be removed because it is
actively harmful. As I’ve explained it’s not enough to just drop in a sized
backend later as an “implementation detail” because it’s not an
implementation detail, you’re changing the fundamental behavior of that
method.
>>
>> It would remove a mode of "holding it wrong", but the specified
behavior will not change; it has and will always fully deallocate the
object being referenced in the cases where the call has defined behavior.
>
> idk what you mean by it holding it wrong, but while the specified
behavior won’t change the actual behavior *will* change, because right now
the two don’t agree. We can’t just treat it as a bug to fix because it’s
been around since the beginning of Swift and some people treat it as part
of expected behavior, using the function like `free()`. This isn’t using it
incorrectly, in fact in terms of behavior, passing a meaningful capacity
argument to `deallocate(capacity:)` is *incorrect usage*, as demonstrated
in the segfaulting example in the proposal.

The segfaulting example is an incorrect usage. The only valid parameters
to deallocate(capacity:) are the base address of an allocation, and the
original capacity passed into allocate(); it has never been intended to
support partial deallocation of allocated blocks. It seems to me like this
proposal is based on a misunderstanding of how the API works. The
documentation and/or name should be clarified.

-Joe

> “fixing” this bug will cause programs that once operated on previously
valid assumptions of “free()” semantics to behave differently, without any
warnings ever being generated. Conversely incorrect code will suddenly
become “correct” though this is less of a problem.
>
>> A sized implementation may fail more obviously when you violate the
contract in the future. Not having sized deallocation is a known deficiency
of the C model we've been fairly diligent about avoiding in Swift's
allocation interfaces, and it would be extremely unfortunate for us to
backpedal from it.
>>
>> -Joe
>
> Which is worse, an active gaping hole in Swift’s memory system, or
potentially discouraging users from using a hypothetical future allocation
API?
>
> Making sure the existing allocation API is working properly is a
prerequisite to introducing a future more advanced allocation API.

If we use byte offset, then the at parameter in UnsafeMutableRawPointer should be removed, since pointer arithmetic can be used instead (just like with UnsafeMutablePointer).

I agree that it seems quite sensible to remove the ‘at’ parameter altogether from the UMRP method.

No code in tree or on github is using the `at` argument. I think it can be removed. A fixit should still be possible.

Not convinced moving the at: argument to come before the as: argument is worth it in terms of source breakage.

Since much of this proposal involves shuffling and relabeling arguments, I’d argue it’s better to break slight more source in one go for the optimal API than to break slightly less for a slightly less optimal API, no? (This is assuming there is agreement that ‘at:as:’ is less prone to misinterpretation than ‘as:at:’.)

To be clear, we’re just talking about UnsafeMutableRawBufferPointer.initializeMemory now, so this is purely additive.
I think the label needs to be `atByteOffset`, and placing it before `as` makes a lot of sense because it no longer depends on the type’s stride.

-Andy

···

On Sep 3, 2017, at 8:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

what was the reasoning for making raw at: offset in strides and not bytes?

So, you’re talking about UnsafeMutableRawPointer(as:at:count:to:)…

The thinking was that it was typical for users to continue filling in the memory using the same type, and that manually performing address arithmetic is error prone.

Now that we're considering consistency with the raw buffer pointer API, which didn’t exist at the time, the thinking is that `at` can be ambiguous and that manual address arithmetic is actually less error prone and increases clarity at the call site.

Does that adequately sum of the last few messages in this thread?

-Andy

···

On Sep 3, 2017, at 8:38 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Sep 3, 2017, at 10:22 PM, Andrew Trick <atrick@apple.com <mailto:atrick@apple.com>> wrote:

On Sep 3, 2017, at 8:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:

If we use byte offset, then the at parameter in UnsafeMutableRawPointer should be removed, since pointer arithmetic can be used instead (just like with UnsafeMutablePointer).

I agree that it seems quite sensible to remove the ‘at’ parameter altogether from the UMRP method.

No code in tree or on github is using the `at` argument. I think it can be removed. A fixit should still be possible.

Not convinced moving the at: argument to come before the as: argument is worth it in terms of source breakage.

Since much of this proposal involves shuffling and relabeling arguments, I’d argue it’s better to break slight more source in one go for the optimal API than to break slightly less for a slightly less optimal API, no? (This is assuming there is agreement that ‘at:as:’ is less prone to misinterpretation than ‘as:at:’.)

To be clear, we’re just talking about UnsafeMutableRawBufferPointer.initializeMemory now, so this is purely additive.
I think the label needs to be `atByteOffset`, and placing it before `as` makes a lot of sense because it no longer depends on the type’s stride.

-Andy

what was the reasoning for making raw at: offset in strides and not bytes?

···

On Sep 3, 2017, at 10:22 PM, Andrew Trick <atrick@apple.com> wrote:

On Sep 3, 2017, at 8:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

If we use byte offset, then the at parameter in UnsafeMutableRawPointer should be removed, since pointer arithmetic can be used instead (just like with UnsafeMutablePointer).

I agree that it seems quite sensible to remove the ‘at’ parameter altogether from the UMRP method.

No code in tree or on github is using the `at` argument. I think it can be removed. A fixit should still be possible.

Not convinced moving the at: argument to come before the as: argument is worth it in terms of source breakage.

Since much of this proposal involves shuffling and relabeling arguments, I’d argue it’s better to break slight more source in one go for the optimal API than to break slightly less for a slightly less optimal API, no? (This is assuming there is agreement that ‘at:as:’ is less prone to misinterpretation than ‘as:at:’.)

To be clear, we’re just talking about UnsafeMutableRawBufferPointer.initializeMemory now, so this is purely additive.
I think the label needs to be `atByteOffset`, and placing it before `as` makes a lot of sense because it no longer depends on the type’s stride.

-Andy

I think we’ve agreed to a few minor updates to this proposal. Since there hasn’t been any other feedback on the thread it may be worth posting an amended proposal so we all know what we’ve agreed on.

-Andy

···

On Sep 3, 2017, at 8:23 PM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

On Sep 3, 2017, at 8:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:

If we use byte offset, then the at parameter in UnsafeMutableRawPointer should be removed, since pointer arithmetic can be used instead (just like with UnsafeMutablePointer).

I agree that it seems quite sensible to remove the ‘at’ parameter altogether from the UMRP method.

No code in tree or on github is using the `at` argument. I think it can be removed. A fixit should still be possible.

Not convinced moving the at: argument to come before the as: argument is worth it in terms of source breakage.

Since much of this proposal involves shuffling and relabeling arguments, I’d argue it’s better to break slight more source in one go for the optimal API than to break slightly less for a slightly less optimal API, no? (This is assuming there is agreement that ‘at:as:’ is less prone to misinterpretation than ‘as:at:’.)

To be clear, we’re just talking about UnsafeMutableRawBufferPointer.initializeMemory now, so this is purely additive.
I think the label needs to be `atByteOffset`, and placing it before `as` makes a lot of sense because it no longer depends on the type’s stride.

-Andy
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

The important thing is that the UnsafeBufferPointer API is clearly documented. We do not want users to think it’s ok to deallocate a smaller buffer than they allocated.

Unfortunately, there’s actually no way to assert this in the runtime because malloc_size could be larger than the allocated capacity. Incorrect code could happen to work and we can live with that.

This is really the same issue that we punted on earlier… there’s no way to indicate that a buffer owns its memory. So we need to rely on clear documentation and discourage buffer “rebasing”.

In the future, a safe buffer will always own its memory and buffer slices will refer back to it.

-Andy

···

On Sep 6, 2017, at 5:17 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

On Sep 6, 2017, at 7:01 PM, Andrew Trick <atrick@apple.com <mailto:atrick@apple.com>> wrote:

On Sep 6, 2017, at 4:54 PM, Taylor Swift <kelvin13ma@gmail.com <mailto:kelvin13ma@gmail.com>> wrote:

The semantics of buffer.deallocate() needs to be: free `buffer.count` bytes of memory at `buffer.baseAddress`. So, that will always be the fast path!
Kelvin, do you agree with that?

this could be problematic if you have multiple contiguous buffers carved out of the same heap block. i agree that this is the best semantics for buffer pointers but we need the sized backend in Swift before this is possible else we will end up in the same boat we’re in right now with `deallocate(capacity:)` where we would have to make buffer deallocate heap block-based for now and then pull the rug out from underneath users later in order to switch to the improved semantics

If I understand your proposal, it’s only valid to deallocate a buffer that was allocated with the same capacity. Anything else should assert.
-Andy

the proposal isn’t specific enough there and that’s my fault but this seems like a good solution. in the future if we get a sized backend we can loosen the assertions and make the partial heap block buffer case defined behavior.