[Pitch] Normalize Slice Types for Unsafe Buffers

Sorry, let me clarify. Can you, Dave, elaborate as to why the issue you
stated as "this" ("untyped memory without bounds checks" conforming to
Collection) seems, as you say, "suspect"?

···

On Thu, Dec 8, 2016 at 18:32 Dave Abrahams <dabrahams@apple.com> wrote:

on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

Can you elaborate on this? Why aren't you sure this is a wise
idea?

Whom are you asking? What is “this?”

--
-Dave

Yes. But I don't see why the raw buffer should be a collection in the first place. Why not

  byteBuffer += rawBuffer.bytes[payloadIndex..<endIndex]

?

···

on Thu Dec 08 2016, Jordan Rose <jordan_rose-AT-apple.com> wrote:

On Dec 8, 2016, at 16:22, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

In practice, it needs to be able to interoperate with [UInt8] and be interchangeable in the same generic context.

e.g. `byteBuffer += rawBuffer[payloadIndex..<endIndex]` is typical.

I think Sequence is sufficient for that purpose.

Um, Sequence doesn’t have a subscript (or indexes). Sequences are single-pass. So if this is
important, it needs to stay a Collection.

--
-Dave

Ah, right, thank you. Retracted.

···

On Dec 8, 2016, at 16:53, Ben Cohen <ben_cohen@apple.com> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

Um, Sequence doesn’t have a subscript (or indexes). Sequences are single-pass. So if this is important, it needs to stay a Collection.

Just because something fulfills one of the requirements of a Collection does not mean it should be one. It needs to tick all the boxes before it's allowed to be elevated.

But it’s still allowed to have subscripts (UnsafePointer has subscripting but isn’t a collection) or be multi-pass (strides are multi-pass but are only sequences). That’s OK.

In this case, yes it’s multi-pass, yes it has a subscript, but no it isn’t a collection because it doesn’t meet the requirements for slicing, i.e. that indices of the slice be indices of the parent.
(relatedly… it appears this requirement is documented on the concrete Slice type rather than on Collection… which is a documentation bug we should fix).
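For readers following along, here is a minimal sketch of the slicing requirement under discussion: a slice's indices are indices of its parent. The array contents and variable names are made up for illustration, and the raw-buffer half assumes a current Swift toolchain, where slicing a raw buffer yields `Slice<UnsafeRawBufferPointer>`.

```swift
// Array slices share indices with the collection they were sliced from.
let bytes: [UInt8] = [10, 20, 30, 40, 50]
let slice = bytes[2..<4]

// The slice starts at index 2, not 0, and slice[2] is the same element as bytes[2].
let firstIndex = slice.startIndex
let sameElement = slice[2] == bytes[2]

// In current Swift, a sliced raw buffer keeps the parent's indices in the same way.
let rawStart = bytes.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
    let rawSlice = raw[2..<4]
    return rawSlice.startIndex
}
```

A SubSequence that restarted its indices at zero would break generic code that carries an index from the parent into the slice, which is the inconsistency this thread is about.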


If this is indeed a requirement for Collection, then my vote would be for
Nate's option #1 and Andy's option #2, to give UnsafeRawBufferPointer a
Slice type that fulfills the requirement. It's the smallest change,
preserves the use of integer indices, and preserves what Andy stated as the
desired use case of making it easy for users to switch out code written for
[UInt8].

I'm not sure I fully understand yet why Dave finds the idea of Collection
conformance fishy, but I'm comfortable with a type that's clearly labeled
as unsafe not being fully footgun-proof.

···

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Um, Sequence doesn’t have a subscript (or indexes). Sequences are single-pass. So if this is
important, it needs to stay a Collection.

Just because something fulfills one of the requirements of a Collection does not mean it should be
one.

Um, it sorta does.

It needs to tick all the boxes before it's allowed to be elevated.

But it’s still allowed to have subscripts (UnsafePointer has subscripting but isn’t a collection) or
be multi-pass (strides are multi-pass but are only sequences). That’s OK.

A sequence that is multipass can always be a collection, and there's no
good reason to limit its expressivity by keeping it a sequence.

In this case, yes it’s multi-pass, yes it has a subscript, but no it
isn’t a collection because it doesn’t meet the requirements for
slicing i.e. that indices of the slice be indices of the parent.

But it could. It's just a bug that we failed to get that right.

···

on Thu Dec 08 2016, Ben Cohen <ben_cohen-AT-apple.com> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

(relatedly… it appears this requirement is documented on the concrete
Slice type rather than on Collection… which is a documentation bug we
should fix).

--
-Dave

`RawBytes : Collection` could be a view over the raw buffer with `SubSequence : RandomAccessSlice<RawBytes>` if that's what you mean.

I hadn’t considered that just because it’s yet another type that needs to be introduced. Now interface authors need to decide whether they want to take a collection of bytes or a buffer. (In a non-generic context, the correct answer is to pass the buffer, not the collection--the collection would be just a convenient temporary view).

We still need an initializer or extension to handle the expected use case of nested buffers:

extension RandomAccessSlice where Base == RawBytes {
    var rebased: UnsafeRawBufferPointer {
        // Offset by the slice's startIndex so the result covers only the slice.
        return UnsafeRawBufferPointer(start: base.baseAddress.map { $0 + startIndex }, count: count)
    }
}

I would even be willing to eliminate raw buffer subscripting altogether, which I think is what you're getting at, since this isn't too awful:

`buffer.bytes[i]`

Pro: More explicit division between raw memory semantics and Collection semantics.

Con: Raw buffer users need to juggle two different types and know how to convert between them.

-Andy

···

On Dec 9, 2016, at 10:22 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Thu Dec 08 2016, Jordan Rose <jordan_rose-AT-apple.com> wrote:

On Dec 8, 2016, at 16:22, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

In practice, it needs to be able to interoperate with [UInt8] and be interchangeable in the same generic context.

e.g. `byteBuffer += rawBuffer[payloadIndex..<endIndex]` is typical.

I think Sequence is sufficient for that purpose.

Um, Sequence doesn’t have a subscript (or indexes). Sequences are single-pass. So if this is
important, it needs to stay a Collection.

Yes. But I don't see why the raw buffer should be a collection in the first place. Why not

byteBuffer += rawBuffer.bytes[payloadIndex..<endIndex]

?

--
-Dave

Let me restate, because I think Jordan's question was valid given my statement.

It would be *nice* for raw buffers to be Collection<UInt8> because they’re meant to be a replacement for code that is typically written for [UInt8], and anything you can do with an array that applies to raw buffers is covered by Collection<UInt8>.

However, I don’t expect the raw buffer to be used in a generic context except being passed to utilities that copy the bytes out. That will either be done by directly iterating over the collection or invoking some other API that could take a Sequence. The most important is probably Array.append(contentsOf:), which is moving over to Sequence. However, we would also need to change UnsafeRawBufferPointer(copyBytes:), NSData(replaceSubrange:), and whatever else I haven't thought of. That's a small disadvantage to this solution.
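As a rough illustration of the "copy the bytes out" use case, the sketch below appends a subrange of a raw buffer to a `[UInt8]` through `append(contentsOf:)`, which accepts any Sequence of `UInt8`. The payload value and the index names are placeholders chosen for the example, not anything from the proposal.

```swift
var byteBuffer: [UInt8] = [0xFF]
let payload: [UInt32] = [0x01020304]   // hypothetical source of raw bytes

payload.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) in
    let payloadIndex = 1
    let endIndex = 3
    // The slice is a sequence of UInt8, so it can be appended directly.
    byteBuffer.append(contentsOf: rawBuffer[payloadIndex..<endIndex])
}

// byteBuffer now holds the original byte plus two bytes copied out of the payload.
let appendedCount = byteBuffer.count
```

Which two payload bytes land in the array depends on the platform's byte order, so the example only relies on the count.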

I'm also a little concerned that Sequence is immutable, so generic code has no way to copy bytes into the buffer.

My bigger concern is still that the range subscript’s inconsistent behavior may still lead to bugs in practice in nongeneric code.

-Andy

···

On Dec 8, 2016, at 4:54 PM, Jordan Rose <jordan_rose@apple.com> wrote:

On Dec 8, 2016, at 16:53, Ben Cohen <ben_cohen@apple.com> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

Ah, right, thank you. Retracted.

Ok, but there needs to be an easy way in a nongeneric context to convert from a Slice<URBP> into an URBP (with normalized byte offsets).

Does anyone object to adding an initializer for this? Any suggestions on naming? Do we need an argument label? etc?

UnsafeRawBufferPointer(_ : Slice<UnsafeRawBufferPointer>)

as in:

let region = UnsafeRawBufferPointer(buffer[i..<j])

-Andy

···

On Dec 8, 2016, at 5:44 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

Probably needs an argument label since it's performing an explicit purpose, not just a vanilla conversion initializer. So maybe UnsafeRawBufferPointer.init(rebasing:)

Or since we have same-type constrained extensions now on master maybe you could do it as a property:

extension RandomAccessSlice where Base == UnsafeRawBufferPointer {
    var rebased: UnsafeRawBufferPointer {
        // Offset by the slice's startIndex so the result covers only the slice.
        return UnsafeRawBufferPointer(start: base.baseAddress.map { $0 + startIndex }, count: count)
    }
}

(written by hand without a compiler so unlikely to be correct :)
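For anyone who wants to try the property idea on a current toolchain, here is a sketch under a few assumptions: today's standard library spells the slice type `Slice` rather than `RandomAccessSlice`, the `rebased` name is hypothetical, and the standard library now ships the labeled `init(rebasing:)` discussed above, which the last lines use for comparison.

```swift
extension Slice where Base == UnsafeRawBufferPointer {
    /// Hypothetical property: a buffer covering just this slice, re-indexed from zero.
    var rebased: UnsafeRawBufferPointer {
        // Advance the base address by the slice's lower bound.
        let start = base.baseAddress.map { $0 + startIndex }
        return UnsafeRawBufferPointer(start: start, count: count)
    }
}

let storage: [UInt8] = [0, 1, 2, 3, 4, 5]

// Copy the rebased region out so it can be inspected after the closure returns.
let viaProperty = storage.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> [UInt8] in
    Array(buffer[2..<5].rebased)
}

// The same region obtained through the standard library initializer.
let viaInitializer = storage.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> [UInt8] in
    Array(UnsafeRawBufferPointer(rebasing: buffer[2..<5]))
}
```

Both forms should yield the bytes at offsets 2 through 4 of the original storage, with the result indexed from zero.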

···

On Dec 8, 2016, at 18:07, Andrew Trick <atrick@apple.com> wrote:

On Dec 8, 2016, at 5:44 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

I'm not sure I fully understand yet why Dave finds the idea of Collection
conformance fishy,

Because the memory can easily be already bound to another type than
UInt8, and there's no obvious reason why UInt8 should be privileged as a
type you can get out of a raw buffer without binding the memory.

···

on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

but I'm comfortable with a type that's clearly labeled as unsafe not
being fully footgun-proof.

--
-Dave

So now that I look at it, it appears UnsafeRawBufferPointer(copyBytes:) has the same problems we are trying to solve on initialize(as:from:), i.e. it is completely at the mercy of the passed-in collection's count being accurate, and if it isn't it'll scribble over memory. We should probably apply similar fixes and change it to take a sequence.

I realize it's a sticking plaster on this particular issue though, so still doesn't answer whether it's better for UnsafeRawBufferPointer to be a collection, just created more work...
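To make the concern concrete, here is one possible shape for a copy that does not trust the source's reported count: it walks any Sequence of `UInt8` and stops when the destination is full. The `boundedCopy(from:)` helper is purely hypothetical and is not the fix that was adopted; it only sketches the idea.

```swift
extension UnsafeMutableRawBufferPointer {
    /// Hypothetical helper: copies bytes from `source` until either the source
    /// is exhausted or this buffer is full. Returns the number of bytes written.
    func boundedCopy<S: Sequence>(from source: S) -> Int where S.Element == UInt8 {
        var written = 0
        for byte in source {
            // Never write past the end of the buffer, whatever the source claims.
            guard written < count else { break }
            self[written] = byte
            written += 1
        }
        return written
    }
}

let destination = UnsafeMutableRawBufferPointer.allocate(byteCount: 4, alignment: 1)
let copied = destination.boundedCopy(from: [1, 2, 3, 4, 5, 6] as [UInt8])
let contents = Array(destination)   // snapshot the bytes before freeing the memory
destination.deallocate()
```

Here the source has six bytes but only four are written, so an inaccurate count can truncate the copy but cannot overrun the buffer.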

···

On Dec 8, 2016, at 17:06, Andrew Trick <atrick@apple.com> wrote:

On Dec 8, 2016, at 4:54 PM, Jordan Rose <jordan_rose@apple.com> wrote:

On Dec 8, 2016, at 16:53, Ben Cohen <ben_cohen@apple.com> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

I'm not sure I fully understand yet why Dave finds the idea of Collection
conformance fishy,

Because the memory can easily be already bound to another type than
UInt8, and there's no obvious reason why UInt8 should be privileged as a
type you can get out of a raw buffer without binding the memory.

I strongly disagree with that statement. The overwhelmingly common use
case for raw buffers is to view them as a sequence of UInt8 *without*
binding the type. Generally, at the point that you're dealing with a
raw buffer it's impossible to (re)bind memory because you don't know
what type it holds. The reason it's so important to have an
UnsafeRawBufferPointer data type is precisely so that users don't need
to mess about with binding memory. It's easy to get that wrong even when
it's possible.
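A small sketch of that use case: looking at the bytes of a value through a raw buffer, with no call that binds or rebinds the memory. The `UInt32` value is arbitrary and the variable names are illustrative only.

```swift
let value: UInt32 = 0x11223344

// View the value's storage as a sequence of UInt8 and copy the bytes out.
let viewed = withUnsafeBytes(of: value) { (raw: UnsafeRawBufferPointer) -> [UInt8] in
    Array(raw)
}

// Raw memory can also be loaded back as a typed value without binding it.
let roundTrip = withUnsafeBytes(of: value) { (raw: UnsafeRawBufferPointer) -> UInt32 in
    raw.load(as: UInt32.self)
}
```

The order of the four bytes in `viewed` depends on the platform's byte order, but the set of bytes and the round-tripped value do not.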

The only reason that UInt8 is special is that when users create
temporary typed buffers for bytes (e.g. they sometimes want a growable
array or just don't want to bother with manual allocation) they always
use UInt8 as the element type.

That said, we could easily divide these concerns into two types as
you suggested. A raw buffer, which doesn't have any special UInt8
features, and a RawBytes collection that handles both buffer slicing
and UInt8 interoperability.

-Andy

···

On Dec 9, 2016, at 10:27 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

but I'm comfortable with a type that's clearly labeled as unsafe not
being fully footgun-proof.

--
-Dave

Probably needs an argument label since it's performing an explicit purpose, not just a vanilla
conversion initializer. So maybe
UnsafeRawBufferPointer.init(rebasing:)

I disagree that we need a label here. It's a value-preserving
conversion; the result even means the same thing as the argument.

···

on Thu Dec 08 2016, Ben Cohen <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 18:07, Andrew Trick <atrick@apple.com> wrote:

On Dec 8, 2016, at 5:44 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave

I'm not sure I fully understand yet why Dave finds the idea of Collection
conformance fishy,

Because the memory can easily be already bound to another type than
UInt8, and there's no obvious reason why UInt8 should be privileged as a
type you can get out of a raw buffer without binding the memory.

I strongly disagree with that statement. The overwhelmingly common use
case for raw buffers is to view them as a sequence of UInt8 *without*
binding the type. Generally, at the point that you're dealing with a
raw buffer it's impossible to (re)bind memory because you don't know
what type it holds.

Oh, you can't just rebind to UInt8 because that's not defined as
universally compatible with all data. OK, sorry.

The reason it's so important to have an UnsafeRawBufferPointer data
type is precisely so that users don't need to mess about with binding
memory. It's easy to get that wrong even when it's possible.

The only reason that UInt8 is special is that when users create
temporary typed buffers for bytes (e.g. they sometimes want a growable
array or just don't want to bother with manual allocation) they always
use UInt8 as the element type.

That said, we could easily divide these concerns into two types as
you suggested. A raw buffer, which doesn't have any special UInt8
features, and a RawBytes collection that handles both buffer slicing
and UInt8 interoperability.

But, now that I think of it, that wouldn't really solve any problems,
would it?

···

on Fri Dec 09 2016, Andrew Trick <swift-evolution@swift.org> wrote:

On Dec 9, 2016, at 10:27 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave

A new Collection type doesn't solve any practical problems. It does solve a
conceptual problem if you think that a raw buffer is not *inherently* a
collection of bytes. There is an elegance in separating the raw buffer
semantics from the byte collection semantics, but that elegance does
not simplify anything for users--it's just more abstraction to figure
out. Certainly, the most straightforward way to fix this is to simply
change raw buffer's slice type, so I'm inclined to favor that
approach. Creating a new collection type would involve
rethinking/redesigning some of the related APIs.

Also note that I'm leaning toward slice -> buffer conversion via an
unlabeled initializer because I think it's the most obvious with least
API surface.

I don't think we absolutely need a new proposal for this easy fix,
since it's not really introducing a new API. The additional
initializer merely allows code that used to work to be migrated via a
fixit.

Here's a quick summary. If there aren't any strong objections, I'll
post an amendment to the original proposal along with a PR for more
formal review.

Proposed amendment to SE-0138:
<https://github.com/apple/swift-evolution/blob/master/proposals/0138-unsaferawbufferpointer.md>

Fix: Change Unsafe${Mutable}RawBufferPointer's SubSequence type

Original: Unsafe${Mutable}RawBufferPointer.SubSequence = Unsafe${Mutable}RawBufferPointer

Fixed: Unsafe${Mutable}RawBufferPointer.SubSequence = ${Mutable}RandomAccessSlice<Unsafe${Mutable}RawBufferPointer>

This is a source-breaking bug fix that only applies to
post-3.0.1. It's extremely unlikely that any Swift 3 code would rely
on the SubSequence type, except for the simple use case of passing a
raw buffer subrange to another raw buffer argument:

`takesRawBuffer(buffer[i..<j])`

A trivial fixit would insert an explicit conversion:

`takesRawBuffer(UnsafeRawBufferPointer(buffer[i..<j]))`

Add unlabeled initializers:

struct Unsafe${Mutable}RawBufferPointer {
  init(_ bytes: SubSequence)
}

struct UnsafeRawBufferPointer {
  init(_ bytes: RandomAccessSlice<UnsafeMutableRawBufferPointer>)
}
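As a sketch of what the migrated call site could look like, the snippet below passes a subrange to a function that takes a raw buffer. It assumes a current toolchain, where the conversion ended up with the `rebasing:` label rather than the unlabeled form proposed here. The body of `takesRawBuffer` and the data are invented for the example.

```swift
// Stand-in for any API that accepts a raw buffer; here it just reports the size.
func takesRawBuffer(_ bytes: UnsafeRawBufferPointer) -> Int {
    return bytes.count
}

let data: [UInt8] = Array(0..<16)

let passedCount = data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Int in
    let i = 4
    let j = 12
    // buffer[i..<j] is a Slice<UnsafeRawBufferPointer>, so it is converted
    // back to a buffer explicitly before being passed along.
    return takesRawBuffer(UnsafeRawBufferPointer(rebasing: buffer[i..<j]))
}
```

The callee sees an eight-byte buffer whose indices start at zero, which is what code written against the old SubSequence type would have expected.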

-Andy

PS: Thanks to Nate Cook and Kevin Ballard for raising this issue.

···

On Dec 9, 2016, at 11:50 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Fri Dec 09 2016, Andrew Trick <swift-evolution@swift.org> wrote:

On Dec 9, 2016, at 10:27 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote:

Um, Sequence doesn’t have a subscript (or indexes). Sequences are
single-pass. So if this is important, it needs to stay a Collection.

Just because something fulfills one of the requirements of a Collection
does not mean it should be one. It needs to tick all the boxes before its
allowed to be elevated.

But it’s still allowed to have subscripts (UnsafePointer has subscripting
but isn’t a collection) or be multi-pass (strides are multiples but are
only sequences). That’s OK

In this case, yes it’s multi-pass, yes it has a subscript, but no it isn’t
a collection because it doesn’t meet the requirements for slicing i.e. that
indices of the slice be indices of the parent.
(relatedly… it appears this requirement is documented on the concrete
Slice type rather than on Collection… which is a documentation bug we
should fix).

If this is indeed a requirement for Collection, then my vote would be for
Nate's option #1 and Andy's option #2, to give UnsafeRawBufferPointer a
Slice type that fulfills the requirement. It's the smallest change,
preserves the use of integer indices, and preserves what Andy stated as the
desired use case of making it easy for users to switch out code written for
[UInt8].

I'm not sure I fully understand yet why Dave finds the idea of Collection
conformance fishy,

Because the memory can easily be already bound to another type than
UInt8, and there's no obvious reason why UInt8 should be privileged as a
type you can get out of a raw buffer without binding the memory.

I strongly disagree with that statement. The overwhelmingly common use
case for raw buffers is to view them as a sequence of UInt8 *without*
binding the type. Generally, at the point that you're dealing with a
raw buffer it's impossible to (re)bind memory because you don't know
what type it holds.

Oh, you can't just rebind to UInt8 because that's not defined as
universally compatible with all data. OK, sorry.

The reason it's so important to have an UnsafeRawBufferPointer data
type is precisely so that users don't need to mess about with binding
memory. It's easy to get that wrong even when it's possible.

The only reason that UInt8 is special is that when users create
temporary typed buffers for bytes (e.g. they sometimes want a growable
array or just don't want to bother with manual allocation) they always
use UInt8 as the element type.
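
To make the use case concrete, here is a rough sketch of viewing a value's storage as bytes without binding memory and appending those bytes to a growable `[UInt8]`. It assumes a current Swift toolchain, where `withUnsafeBytes(of:_:)` hands the closure an `UnsafeRawBufferPointer` whose elements are `UInt8`. The names `value` and `byteBuffer` are illustrative only.

```swift
// Fix the in-memory byte order so the result does not depend on the host.
let value = UInt32(0x01020304).littleEndian

// A growable, typed byte buffer of the kind users tend to create.
var byteBuffer: [UInt8] = []

withUnsafeBytes(of: value) { raw in
    // `raw` is an UnsafeRawBufferPointer; no memory binding is involved.
    byteBuffer += raw        // append all four bytes
    byteBuffer += raw[1..<3] // append a sub-range of the raw bytes
}
```

Both appends go through the same `+=` that accepts any sequence of `UInt8`, which is the interoperability with `[UInt8]` being argued for.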

That said, we could easily divide these concerns into two types as
you suggested. A raw buffer, which doesn't have any special UInt8
features, and a RawBytes collection that handles both buffer slicing
and UInt8 interoperability.

But, now that I think of it, that wouldn't really solve any problems,
would it?

>>>> Um, Sequence doesn’t have a subscript (or indexes). Sequences are
>>>> single-pass. So if this is important, it needs to stay a Collection.
>>>>
>>>>
>>>> Just because something fulfills one of the requirements of a Collection
>>>> does not mean it should be one. It needs to tick all the boxes before it's
>>>> allowed to be elevated.
>>>>
>>>> But it’s still allowed to have subscripts (UnsafePointer has subscripting
>>>> but isn’t a collection) or be multi-pass (strides are multi-pass but are
>>>> only sequences). That’s OK.
>>>>
>>>> In this case, yes it’s multi-pass, yes it has a subscript, but no it isn’t
>>>> a collection because it doesn’t meet the requirements for slicing i.e. that
>>>> indices of the slice be indices of the parent.
>>>> (relatedly… it appears this requirement is documented on the concrete
>>>> Slice type rather than on Collection… which is a documentation bug we
>>>> should fix).
>>>>
>>>
>>> If this is indeed a requirement for Collection, then my vote would be for
>>> Nate's option #1 and Andy's option #2, to give UnsafeRawBufferPointer a
>>> Slice type that fulfills the requirement. It's the smallest change,
>>> preserves the use of integer indices, and preserves what Andy stated as the
>>> desired use case of making it easy for users to switch out code written for
>>> [UInt8].
>>>
>>> I'm not sure I fully understand yet why Dave finds the idea of Collection
>>> conformance fishy,
>>
>> Because the memory can easily be already bound to another type than
>> UInt8, and there's no obvious reason why UInt8 should be privileged as a
>> type you can get out of a raw buffer without binding the memory.
>
> I strongly disagree with that statement. The overwhelmingly common use
> case for raw buffers is to view them as a sequence of UInt8 *without*
> binding the type. Generally, at the point that you're dealing with a
> raw buffer it's impossible to (re)bind memory because you don't know
> what type it holds.

Oh, you can't just rebind to UInt8 because that's not defined as
universally compatible with all data. OK, sorry.

If you're dealing with raw bytes to begin with then I'd hope you're working with types that can be safely expressed as a collection of bytes. And since this is an unsafe type, I don't think there's any problem with having the possibility of trying to express e.g. [NSObject] as a raw byte buffer.

> The reason it's so important to have an UnsafeRawBufferPointer data
> type is precisely so that users don't need to mess about with binding
> memory. It's easy to get that wrong even when it's possible.
>
> The only reason that UInt8 is special is that when users create
> temporary typed buffers for bytes (e.g. they sometimes want a growable
> array or just don't want to bother with manual allocation) they always
> use UInt8 as the element type.
>
> That said, we could easily divide these concerns into two types as
> you suggested. A raw buffer, which doesn't have any special UInt8
> features, and a RawBytes collection that handles both buffer slicing
> and UInt8 interoperability.

But, now that I think of it, that wouldn't really solve any problems,
would it?

Agreed. If we have a separate RawBytes type, then I'm not really sure what the point of UnsafeRawBufferPointer is anymore. The whole point of that type is it's a collection of bytes rather than just being a tuple `(baseAddress: UnsafeRawPointer, count: Int)`.
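
A small sketch of what being "a collection of bytes" buys over a bare `(baseAddress, count)` pair: generic code written against `Collection` of `UInt8` accepts both `[UInt8]` and a raw buffer unchanged. The `checksum` function is a hypothetical helper invented for this example, and the sketch assumes a current Swift toolchain.

```swift
// Hypothetical helper: a wrapping sum over any collection of bytes.
func checksum<C: Collection>(_ bytes: C) -> UInt8 where C.Element == UInt8 {
    return bytes.reduce(0) { $0 &+ $1 }
}

let array: [UInt8] = [1, 2, 3, 4]

// Called with an ordinary array...
let fromArray = checksum(array)

// ...and with the same storage viewed as an UnsafeRawBufferPointer.
let fromRaw = array.withUnsafeBytes { raw in checksum(raw) }
```

A tuple of pointer and count could not be passed to `checksum` without first being wrapped in some collection type.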

-Kevin Ballard

···

On Fri, Dec 9, 2016, at 11:50 AM, Dave Abrahams via swift-evolution wrote:

on Fri Dec 09 2016, Andrew Trick <swift-evolution@swift.org> wrote:
>> On Dec 9, 2016, at 10:27 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
>> on Thu Dec 08 2016, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>>> On Thu, Dec 8, 2016 at 6:53 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:
>>>> On Dec 8, 2016, at 4:35 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org> wrote: