[late pitch] UnsafeBytes proposal


(Andrew Trick) #1

Hi swift-evolutionaries,

I'm sorry to bring a proposal late to the table, but this could make a big difference to the Swift 3 migration experience.

This proposal adds basic usability for working with raw memory without breaking source. The need to provide a higher-level API for working with raw memory buffers has always been evident, but making improvements in this area depended on first introducing `UnsafeRawPointer`. It was not clear until the final week of source-breaking changes whether SE-0107 would make it into Swift 3. Now that it has, we should use the little remaining time to improve the migration experience and encourage correct use of the memory model by introducing a low-risk additive API.

Proposal:

https://github.com/atrick/swift-evolution/blob/unsafebytes/proposals/NNNN-UnsafeBytes.md

Intro:

[SE-0107: UnsafeRawPointer](https://github.com/apple/swift-evolution/blob/master/proposals/0107-unsaferawpointer.md) formalized Swift's memory model with respect to strict aliasing and prevented arbitrary conversion between `UnsafePointer` types. When moving to Swift 3, users will need to migrate much of their code dealing with `UnsafePointer`s. The new `UnsafeRawPointer` makes that possible. It provides a legal means to operate on raw memory (independent of the type of values in memory), and it provides an API for binding memory to a type for subsequent normal typed access. However, migrating to these new APIs is not always straightforward. It has become customary to use `[UInt8]` in APIs that deal with a buffer of bytes and are agnostic to the type of values held by the buffer. However, converting between `UInt8` and the client's element type at every API transition is difficult to do safely. See the [WIP UnsafeRawPointer Migration Guide]().

-Andy



(Brent Royal-Gordon) #2

I've only read a little bit so far, but: Is the difference between this and UnsafeBufferPointer that it's built around a raw pointer rather than a bound pointer? If so, would UnsafeRawBufferPointer be a better name?

···

On Aug 12, 2016, at 6:12 PM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #3

Only very recently, I remember running into the very issue identified in
this write-up regarding the Data API. I'm glad that this proposal is aiming
to address some of that.

Question, though: In what sense is UnsafeBytes unsafe?

···

On Fri, Aug 12, 2016 at 20:12 Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Magnus Ahltorp) #4

This proposal adds basic usability for working with raw memory without breaking source. The need to provide higher level API for working with raw memory buffers has always been evident, but making improvements in this area depended on first introducing `UnsafeRawPointer`. It was not clear until the final week of source-breaking changes whether SE-0107 would make it into Swift 3. Now that it has, we should use the little remaining time to improve the migration experience and encourage correct use of the memory model by introducing a low-risk additive API.

As I wrote during the UnsafeRawPointer review, which seems to apply to this proposal as well (forgive me if I have totally misunderstood the proposal):

When glancing at the examples, they strike me as mostly being marshalling, which in my opinion would be better served by a safe marshalling API followed by unsafe handling of the resulting buffer, and vice versa for unmarshalling. I think it is very important (in the long run) that code that doesn't interact with C directly has safe ways of doing inherently safe operations, and not take the unsafe route just because that is the only API available.

My question is, how does this API fit into the bigger picture of marshalling, and what are the benefits of using this API instead of marshalling with safe buffers?

/Magnus


(Andrew Trick) #5

Only very recently, I remember running into the very issue identified in this write-up regarding the Data API. I'm glad that this proposal is aiming to address some of that.

Question, though: In what sense is UnsafeBytes unsafe?

It’s not reference counted. UnsafeBytes is really a slice into raw memory that someone else is managing.

It might be nice to have a reference counted wrapper for this, but that’s *much* lower priority and it’s not nearly as clear how that should be done.
All the use cases I’ve looked at so far want to use manual allocation/deallocation (for a simple temp buffer) or `Data` or [UInt8] to persist the memory.
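A minimal sketch of the manual allocation pattern mentioned above, to make the ownership point concrete. It uses `UnsafeMutableRawBufferPointer`, the spelling available in current Swift toolchains, as a stand-in for the mutable form of `UnsafeBytes`; treating the two as equivalent is an assumption, and the 16-byte size and fill pattern are purely illustrative.

```swift
// Manually allocate a temporary raw buffer. Nothing reference-counts this
// memory: the code that allocates it is responsible for freeing it.
let scratch = UnsafeMutableRawBufferPointer.allocate(byteCount: 16, alignment: 8)

// Fill the buffer through the view; these are untyped byte writes.
for offset in scratch.indices {
    scratch[offset] = UInt8(offset)
}

// Persist the contents by copying them into an owning, value-based array.
let persisted = [UInt8](scratch)

// After this call `scratch` dangles; only `persisted` remains valid.
scratch.deallocate()
```

The view itself never frees anything, which is the sense in which it is unsafe: using `scratch` after `deallocate()` would be a bug that the type cannot catch.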

Note that [UInt8] can work well now as a temporary buffer as long as you’re using UnsafeBytes to copy data in:

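// A plain byte array serves as the temporary buffer.
var buffer = [UInt8]()

struct S {
  var x: Int
}

var s = S(x: 3)

// Append the raw bytes of `s` to the array; the bytes are copied out of memory.
withUnsafeBytes(of: &s) {
  buffer += $0
}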
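A self-contained variant of the snippet above, with a check on what lands in the array. This is a sketch, not the proposal's own test; it assumes the shipping `withUnsafeBytes(of:_:)` free function behaves like the one proposed here. The byte sum is used for the check because it does not depend on the host's byte order.

```swift
struct S {
    var x: Int
}

var buffer = [UInt8]()
var s = S(x: 3)

// Append the raw bytes of `s` twice to show that the array accumulates copies.
withUnsafeBytes(of: &s) { buffer += $0 }
s.x = 4
withUnsafeBytes(of: &s) { buffer += $0 }

// Each append contributes MemoryLayout<S>.size bytes.
let bytesPerValue = MemoryLayout<S>.size

// Each value occupies one nonzero byte, so the sum of all bytes is 3 + 4.
let byteSum = buffer.reduce(0) { $0 + Int($1) }
```

Because the array owns its storage, `buffer` stays valid after each closure returns, unlike the view passed into the closure.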

-Andy

···

On Aug 12, 2016, at 7:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:



(Andrew Trick) #6

Yes that’s exactly right. Semantically, reads or writes on `UnsafeBytes` are untyped memory accesses. So you can get bytes into or out of it without binding memory.
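A small sketch of what "untyped" access looks like in practice. It relies on `withUnsafeBytes(of:_:)` and `withUnsafeMutableBytes(of:_:)` as found in current Swift toolchains, which is an assumption about how the proposed API maps onto the shipping one. The value is stored in little-endian order so the byte values are the same on any host.

```swift
// Store the value in a fixed (little-endian) byte order so the byte values
// below are the same on every host.
var value = UInt32(0x01020304).littleEndian

// Read the bytes out. No memory is bound to UInt8 here; the closure sees
// the storage of `value` as raw bytes.
let bytesOut: [UInt8] = withUnsafeBytes(of: &value) { Array($0) }

// Write a byte in. This is an untyped store into the same storage.
withUnsafeMutableBytes(of: &value) { $0[0] = 0xFF }

// Interpret the storage as little-endian again to see the effect.
let updated = UInt32(littleEndian: value)
```

Neither closure calls `bindMemory` or rebinds the storage of `value`; the storage remains a `UInt32` as far as the type system is concerned.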

But as you’ll see from the interfaces and examples I’ve shown, `UnsafeMutableRawBufferPointer` would be a terrible name. From the user’s point of view there’s no reason to link this type to `UnsafeBufferPointer`, and that name conveys no additional useful information. The type is often viewed as just a collection of bytes, and it is potentially an important type for a number of public interfaces.

-Andy

···

On Aug 12, 2016, at 7:32 PM, Brent Royal-Gordon <brent@architechies.com> wrote:

I've only read a little bit so far, but: Is the difference between this and UnsafeBufferPointer that it's built around a raw pointer rather than a bound pointer? If so, would UnsafeRawBufferPointer be a better name?


(Andrew Trick) #7

I agree. This proposal has `Unsafe` in its name because it’s not the final solution. It fills an important hole in these use cases—all of which currently interoperate with C—but leaves lifetime management of the memory on the table. It leaves the design of the marshalling feature up to the user intentionally because the primary goal is to migrate existing code. What you’re proposing is more of a new feature and less of a Swift 3 migration usability bug.

-Andy

···

On Aug 13, 2016, at 2:56 AM, Magnus Ahltorp <map@kth.se> wrote:

As I wrote during the UnsafeRawPointer review, which seems to apply to this proposal as well (forgive me if I have totally misunderstood the proposal):

When glancing at the examples, they strike me as mostly being marshalling, which in my opinion would be better served by a safe marshalling API followed by unsafe handling of the resulting buffer, and vice versa for unmarshalling. I think it is very important (in the long run) that code that doesn't interact with C directly has safe ways of doing inherently safe operations, and not take the unsafe route just because that is the only API available.

My question is, how does this API fit into the bigger picture of marshalling, and what are the benefits of using this API instead of marshalling with safe buffers?

/Magnus


(Andrew Trick) #8

I can give you a less dismissive answer to this, because there is potential for confusion given that the value is actually a view over the bytes, not the bytes themselves. I just don't think a longer name will clarify the semantics:

- `Bytes` sufficiently conveys a region of raw memory.

- The `Unsafe` already hints that this is only a "view" into memory,
  not a copy of the Bytes, and suggests that the developer needs to
  consult the doc comment if using it in a non-idiomatic way.

- In the expected idioms (see examples) this simple name will be far
  more meaningful and less distracting.

- Adding `Raw` to the name is purely redundant, and adding `Buffer`
  and `Pointer` to the name would cause confusion.

I think both your and Xiaodi's comments can be addressed by clarifying
the doc comments:

/// A non-owning view of raw memory as a collection of `UInt8` bytes.
///
/// Reads and writes on memory via `UnsafeBytes` are untyped operations that
/// do not require binding the memory to `UInt8`.
///
/// In addition to the `Collection` interface, this provides a bounds-checked
/// version of `UnsafeMutableRawPointer`'s interface to raw memory:
/// `load(fromByteOffset:as:)`, `storeBytes(of:toByteOffset:as:)`, and
/// `copyBytes(from:count:)`.
///
/// Because this is only a view into memory, and does not own the memory,
/// copying a value of type `UnsafeMutableBytes` does not copy the underlying
/// memory. However, assigning into `UnsafeMutableBytes` via a subscript copies
/// bytes into the memory, and assigning an `UnsafeMutableBytes` into a
/// value-based collection, such as `[UInt8]`, copies bytes out of memory.
///
/// Example:
///
/// // View a slice of memory at someBytes.
/// var destBytes = someBytes[0..<n]
///
/// // Copy `n` bytes of data from sourceBytes into that view.
/// destBytes[0..<n] = sourceBytes
///
/// // View a different slice of memory at someBytes.
/// destBytes = someBytes[n..<m]
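
The view-versus-copy behaviour described in the doc comment can be sketched with today's `UnsafeMutableRawBufferPointer`, used here as a stand-in for `UnsafeMutableBytes`; that correspondence is an assumption. The slicing syntax in particular differs from the doc comment: in current toolchains a sub-view is formed with `init(rebasing:)`. Sizes and values are illustrative only.

```swift
// Eight bytes of raw memory that this code owns and must free itself.
let storage = UnsafeMutableRawBufferPointer.allocate(byteCount: 8, alignment: 8)
storage.copyBytes(from: [UInt8](repeating: 0, count: 8))   // zero-fill

// Typed store and load through the raw view; the memory is not bound.
storage.storeBytes(of: UInt32(0xDEADBEEF), toByteOffset: 0, as: UInt32.self)
let loaded = storage.load(fromByteOffset: 0, as: UInt32.self)

// Copying the view copies only the pointer and count, not the memory.
let alias = storage
alias[4] = 0x2A
let seenThroughOriginal = storage[4]

// Copying into a value-based collection copies the bytes out.
let snapshot = [UInt8](storage)
storage[4] = 0
let snapshotByte = snapshot[4]

// A view of the first four bytes of the same memory.
let head = UnsafeMutableRawBufferPointer(rebasing: storage[0..<4])
let headCount = head.count

storage.deallocate()
```

The write through `alias` is visible through `storage`, while `snapshot` keeps its own copy after `storage` changes.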

-Andy

···

On Aug 12, 2016, at 7:47 PM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:



(Félix Cloutier) #9

Can we do a quick recap? Please correct me if I'm wrong.

UnsafePointer: pointer to memory that the compiler may assume to be typed. Bounds unknown.
UnsafeBufferPointer: pointer to several objects that the compiler may assume to be typed. Bounds known.
UnsafeRawPointer: pointer to memory that the compiler cannot assume to be typed. Bounds unknown.

I think that I'm coming to the same conclusion as Brent, that UnsafeBytes is to UnsafeRawPointer what UnsafeBufferPointer is to UnsafePointer.

It seems to me that this could be neatly laid out in a matrix, and to me that kind of justifies giving similar names.

            One logical object    Many logical objects
Typed       UnsafePointer         UnsafeBufferPointer
Untyped     UnsafeRawPointer      UnsafeBytes

One thing that I don't really like about UnsafeBytes is that it poorly conveys that the memory is not typed. In fact, it looks like it's typed to be UInt8. (From what I recall, C++'s strict aliasing makes an exception for arrays of chars, but that's not a reason to import that notion into Swift.)
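
The four cells of the matrix can be seen side by side in a short sketch. The lower-right cell uses `UnsafeRawBufferPointer`, the spelling in current Swift toolchains, in place of `UnsafeBytes`; whether the two match in every detail is an assumption. The array contents are illustrative.

```swift
let values: [Int32] = [10, 20, 30]

// Typed row: the compiler may assume the memory holds Int32 values.
let typed = values.withUnsafeBufferPointer { (many: UnsafeBufferPointer<Int32>) -> (Int32, Int) in
    // Many logical objects, bounds known: `many.count` elements.
    // One logical object, bounds unknown: the base address alone.
    let one: UnsafePointer<Int32> = many.baseAddress!
    return (one.pointee, many.count)
}

// Untyped row: the same storage seen as raw bytes.
let untyped = values.withUnsafeBytes { (many: UnsafeRawBufferPointer) -> (Int32, Int) in
    // Bounds known, but counted in bytes rather than elements.
    // A raw pointer needs an explicit type at each load.
    let one: UnsafeRawPointer = many.baseAddress!
    return (one.load(as: Int32.self), many.count)
}
```

The typed view reports three elements and the untyped view reports twelve bytes, which is the difference between the two rows.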

Félix

···

On Aug 12, 2016, at 7:47:36 PM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:



(Andrew Trick) #10

Can we do a quick recap? Please correct me if I'm wrong.

UnsafePointer: pointer to memory that the compiler may assume to be typed. Bounds unknown.
UnsafeBufferPointer: pointer to several objects that the compiler may assume to be typed. Bounds known.
UnsafeRawPointer: pointer to memory that the compiler cannot assume to be typed. Bounds unknown.

I think that I'm coming to the same conclusion as Brent, that UnsafeBytes is to UnsafeRawPointer what UnsafeBufferPointer is to UnsafePointer.

It seems to me that this could be neatly laid out in a matrix, and to me that kind of justifies giving similar names.

            One logical object    Many logical objects
Typed       UnsafePointer         UnsafeBufferPointer
Untyped     UnsafeRawPointer      UnsafeBytes

One thing that I don't really like about UnsafeBytes is that it poorly conveys that the memory is not typed. In fact, it looks like it's typed to be UInt8. (From what I recall, C++'s strict aliasing makes an exception for arrays of chars, but that's not a reason to import that notion into Swift.)

Félix

That matrix is the correct starting point. UnsafeRawBufferPointer would be in the lower right. But it would be nothing more than a raw pointer with length. It wouldn’t be a collection of anything. UnsafeBytes is a powerful abstraction on top of what we just called UnsafeRawBufferPointer. It is a collection of typed elements `UInt8`. But the memory access used to read or write those elements is untyped. It’s precisely for code that needs to stream bytes into or out of an object without thinking about binding memory to a type.

`bytes` is already our commonly used label for either untyped memory of UInt8 sized values, or for `UnsafeBufferPointer<UInt8>`. But these two things are frustratingly incompatible. `UnsafeBytes` is much more important than filling in that square in your matrix. It also does away with the now common, but incorrect use of `UnsafeBufferPointer<UInt8>` all over the place.
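
As a sketch of "streaming bytes into an object" without binding memory, the following fills a struct from a byte array. `Header` and `incoming` are hypothetical names, and the code assumes the struct is laid out in declaration order with no padding, which holds for this two-field example in practice but is an assumption. It uses `withUnsafeMutableBytes(of:_:)` and `copyBytes(from:)` from current toolchains in place of the proposed `UnsafeMutableBytes` API.

```swift
// A hypothetical wire header with two 16-bit fields.
struct Header {
    var tag: UInt16 = 0
    var length: UInt16 = 0
}

// Four bytes as they might arrive from a file or socket, little-endian.
let incoming: [UInt8] = [0x02, 0x01, 0x04, 0x03]
precondition(incoming.count == MemoryLayout<Header>.size)

// Copy the bytes straight into the struct's storage. No memory is rebound
// to UInt8, and no UnsafeBufferPointer<UInt8> is involved.
var header = Header()
withUnsafeMutableBytes(of: &header) { destination in
    destination.copyBytes(from: incoming)
}

// Convert from the wire's byte order to the host's.
let tag = UInt16(littleEndian: header.tag)
let length = UInt16(littleEndian: header.length)
```

The caller only ever deals in bytes and a struct; the raw pointer machinery stays inside the closure.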

Look at the examples, imagine you’ve never heard of an UnsafeRawPointer, and see if you can come up with a better API. This all makes perfect sense when you approach the problem from the common use cases, rather than from a standard library implementer’s point of view.

-Andy

···

On Aug 12, 2016, at 9:13 PM, Félix Cloutier <felixcca@yahoo.ca> wrote:



(Brent Royal-Gordon) #11

But how is that different from UnsafeBufferPointer? Put another way, what is it about the UnsafeRawPointer -> UnsafeBytes relationship that isn't true about UnsafePointer -> UnsafeBufferPointer, and that therefore justifies the different name?

···

On Aug 12, 2016, at 9:34 PM, Andrew Trick <atrick@apple.com> wrote:

That matrix is the correct starting point. UnsafeRawBufferPointer would be in the lower right. But it would be nothing more than a raw pointer with length. It wouldn’t be a collection of anything. UnsafeBytes is a powerful abstraction on top of what we just called UnsafeRawBufferPointer. It is a collection of typed elements `UInt8`.

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #12

I have to agree here that UnsafeBytes reads much better and makes more
intuitive sense at call sites than any of the alternatives.

···

On Fri, Aug 12, 2016 at 23:34 Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:



(Andrew Trick) #13

Giving UnsafeRawPointer a memory size doesn’t imply a collection of any specific type. You’re supposed to use bindMemory(to:capacity:) to get a collection out of it. Giving UnsafeBytes a name analogous to UnsafeBufferPointer only exposes that subtle difference, which is actually irrelevant. In the common case, users don’t need to know how UnsafeRawPointer works, so why start with that analogy?
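
For contrast, here is a minimal sketch of the route through `bindMemory(to:capacity:)` that a raw pointer plus a size requires before it yields a typed collection. The element type, count, and fill value are illustrative assumptions.

```swift
let count = 4

// Raw memory with a known size; at this point it is a collection of nothing.
let raw = UnsafeMutableRawPointer.allocate(
    byteCount: count * MemoryLayout<Int32>.stride,
    alignment: MemoryLayout<Int32>.alignment)

// Binding tells the compiler which type the memory holds from now on.
let typedBase = raw.bindMemory(to: Int32.self, capacity: count)
typedBase.initialize(repeating: 7, count: count)

// A typed pointer plus a count gives a typed collection.
let typedView = UnsafeBufferPointer(start: typedBase, count: count)
let total = typedView.reduce(0, +)

// Tear down in reverse order: deinitialize the elements, then free the memory.
typedBase.deinitialize(count: count)
raw.deallocate()
```

A byte-oriented view skips the binding step entirely, which is why the analogy to `UnsafeBufferPointer` exposes a detail most callers never need.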

The use cases justify the name. `UnsafeBytes` is what developers have been trying to get all along with `UnsafeBufferPointer<UInt8>`. The concept already exists to developers, but we have failed to give them a distinct, simple, and intuitive name for it, not to mention a correct implementation.

-Andy

···

On Aug 13, 2016, at 12:17 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Aug 12, 2016, at 9:34 PM, Andrew Trick <atrick@apple.com> wrote:

That matrix is the correct starting point. UnsafeRawBufferPointer would be in the lower right. But it would be nothing more than a raw pointer with length. It wouldn’t be a collection of anything. UnsafeBytes is a powerful abstraction on top of what we just called UnsafeRawBufferPointer. It is a collection of typed elements `UInt8`.

But how is that different from UnsafeBufferPointer? Put another way, what is it about the UnsafeRawPointer -> UnsafeBytes relationship that isn't true about UnsafePointer -> UnsafeBufferPointer, and that therefore justifies the different name?


(Félix Cloutier) #14

And then, we can't really use UnsafeBufferPointer<UInt8> for the purpose of UnsafeBytes because we want to expose a different API. Is that right?

Félix

···

On Aug 13, 2016, at 1:44:28 AM, Andrew Trick <atrick@apple.com> wrote:

Giving UnsafeRawPointer a memory size doesn’t imply a collection of any specific type. You’re supposed to use bindMemory(to:capacity:) to get a collection out of it. Giving UnsafeBytes a name analogous to UnsafeBufferPointer only exposes that subtle difference, which is actually irrelevant. In the common case, users don’t need to know how UnsafeRawPointer works, so why start with that analogy?

The use cases justify the name. `UnsafeBytes` is what developers have been trying to get all along with `UnsafeBufferPointer<UInt8>`. Developers already have the concept, but we have failed to give them a distinct, simple, and intuitive name for it, not to mention a correct implementation.

-Andy
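
For readers unfamiliar with the binding step mentioned above, here is a minimal sketch of what "raw memory with a size" versus "a collection of a specific type" might look like. It assumes the API spellings of current Swift (`allocate(byteCount:alignment:)`, `initialize(repeating:count:)`), which differ from the Swift 3 spellings in use when this thread was written. The function and variable names are hypothetical.

```swift
// Sketch only: names are illustrative; API spellings are from current Swift.
func sumAfterBinding() -> Int32 {
    let count = 4
    // Raw memory with a size, but no element type yet.
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: count * MemoryLayout<Int32>.stride,
        alignment: MemoryLayout<Int32>.alignment)
    defer { raw.deallocate() }

    // Binding the memory to Int32 yields a typed pointer.
    let typed = raw.bindMemory(to: Int32.self, capacity: count)
    typed.initialize(repeating: 7, count: count)

    // Only now is there a collection of a specific element type.
    let ints = UnsafeBufferPointer(start: typed, count: count)
    return ints.reduce(0, +)
}

let total = sumAfterBinding()
```

The point of the sketch is that the size alone says nothing about the elements; the typed collection appears only after the explicit bind.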


(Andrew Trick) #15

And then, we can't really use UnsafeBufferPointer<UInt8> for the purpose of UnsafeBytes because we want to expose a different API. Is that right?

UnsafeBufferPointer<UInt8> should be used in the same situation that UnsafeBufferPointer<T> is used for any T: as a view over an array of UInt8 that can bypass bounds checks in release builds and can interoperate with C.

UnsafeBufferPointer<UInt8> should not be used to erase the memory’s pointee type.

UnsafeBytes erases the pointee type and gives algorithms a collection of bytes to work with. It turns out to be an important use case that I very much want to distinguish from the UnsafeBufferPointer use case. I don’t want to present users with a false analogy to UnsafeBufferPointer.

-Andy
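
A small sketch may make the two use cases easier to tell apart. It uses `UnsafeRawBufferPointer` and `Array.withUnsafeBytes` from the current standard library as a stand-in for the `UnsafeBytes` type under discussion; that substitution is an assumption of this example, not the pitched API, and the variable names are illustrative.

```swift
// Sketch only: `UnsafeRawBufferPointer` stands in for the pitched `UnsafeBytes`.
let values: [UInt32] = [1, 2]

// Typed view: elements are UInt32, the type the memory is bound to.
let typedSum = values.withUnsafeBufferPointer { buffer in
    buffer.reduce(0, +)
}

// Type-erased view: the same memory seen as a collection of UInt8 values.
let byteCount = values.withUnsafeBytes { bytes in
    bytes.count
}
let byteSum = values.withUnsafeBytes { bytes in
    bytes.reduce(0) { $0 + Int($1) }
}
```

The typed view has two elements; the byte view has eight, and an algorithm working on it never needs to know that the memory holds `UInt32` values.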

···

On Aug 13, 2016, at 7:12 AM, Félix Cloutier <felixcca@yahoo.ca> wrote:


(Michael Ilseman) #16

It seems like there’s a potential for confusion here, in that people may see “UInt8” and assume there is some kind of typed-ness, even though the whole point is that this is untyped. Adjusting the header comments slightly might help:

/// A non-owning view of raw memory as a collection of bytes.
///
/// Reads and writes on memory via `UnsafeBytes` are untyped operations that
/// do not require binding the memory to a type. These operations are expressed
/// in terms of `UInt8`, though the underlying memory is untyped.

You could go even further towards hinting this fact with a `typealias Byte = UInt8`, and use Byte throughout. But, I don’t know if that’s getting too excessive.

···

On Aug 13, 2016, at 9:34 AM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Andrew Trick) #17

Thanks. I'll try to clarify in the comments that the memory is read out as UInt8, but the memory itself is untyped.

Andy

···

On Aug 15, 2016, at 11:55 AM, Michael Ilseman <milseman@apple.com> wrote:


(David Sweeris) #18

I don't think that's too excessive at all. I might even go further and say that we should call it "Untyped" instead of "Byte", to really drive home the point (many people see "byte" and think "8-bit int", which is merely a side effect of CPUs generally not having support for types *other* than ints and floats, rather than a reflection of the true "type" of the data).

- Dave Sweeris

···

On Aug 15, 2016, at 13:55, Michael Ilseman via swift-evolution <swift-evolution@swift.org> wrote:


(Karl) #19

‘Byte’ is sufficient, I think.

In some sense, it is typed as bytes. It reflects the fact that anything that is representable to the computer must be expressible as a sequence of bits (the same way we have string de/serialisation — which of course is not to say that the byte representation is good for serialisation purposes). “withUnsafeBytes” can be seen as doing a reversible type conversion the same way LosslessStringConvertible does; only in this case the conversion is free.

Karl
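
The "reversible conversion" reading can be sketched as a round trip. This is only an illustration using the free functions `withUnsafeBytes(of:)` and `withUnsafeMutableBytes(of:)` from current Swift, assumed here to behave like the pitched API; the variable names are hypothetical.

```swift
// Sketch only: round-trips a value through its byte representation.
let original: Double = 3.5

// Copy out the value's in-memory representation as UInt8 values.
let bytes: [UInt8] = withUnsafeBytes(of: original) { Array($0) }

// Write the same bytes back into a Double-typed variable.
var restored: Double = 0
withUnsafeMutableBytes(of: &restored) { destination in
    destination.copyBytes(from: bytes)
}
```

Copying into a `[UInt8]` is done here only to make the round trip visible; looking at the bytes in place involves no conversion work at all.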

···

On 16 Aug 2016, at 01:14, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:


(Andrew Trick) #20

Yes. Byte clearly refers to a value's in-memory representation. But typealias Byte = UInt8 would imply the opposite of what needs to be conveyed. The name Byte refers to raw memory being accessed, not the value being returned by the collection. The in-memory value's bytes are loaded from memory and reinterpreted as UInt8 values. UInt8 is the correct type for the value after it is loaded. Calling the collection’s element type Byte sends the wrong message. e.g. [Byte] or UnsafePointer<Byte> would be nonsense.

Keep in mind the important use case is code that needs to work with a collection of UInt8 values without knowing the type of the values in memory. This makes it intuitive and convenient to implement correctly without needing to reason about the Swift-specific notions of raw vs. typed pointers and binding memory to a type.

The documentation should be fixed to clarify that the in-memory value is not the same as the loaded value.

-Andy
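
A short sketch of the difference between the in-memory value and the loaded value, again using `withUnsafeBytes(of:)` from current Swift as an assumed stand-in for the pitched API. The `Byte` alias is the hypothetical one suggested earlier in the thread.

```swift
// Sketch only: `withUnsafeBytes(of:)` stands in for the pitched `UnsafeBytes` API.
let stored: Int8 = -1

// The memory holds an Int8; the collection hands back UInt8 values.
let loaded: [UInt8] = withUnsafeBytes(of: stored) { Array($0) }

// A typealias adds a second name, not a second type.
typealias Byte = UInt8
let sameType = (Byte.self == UInt8.self)
```

Here the value in memory is `-1` as an `Int8`, while the value that comes out of the collection is `255` as a `UInt8`.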

···

On Aug 16, 2016, at 7:13 PM, Karl via swift-evolution <swift-evolution@swift.org> wrote:

‘Byte’ is sufficient, I think.

In some sense, it is typed as bytes. It reflects the fact that anything that is representable to the computer must be expressible as a sequence of bits (the same way we have string de/serialisation — which of course is not to say that the byte representation is good for serialisation purposes). “withUnsafeBytes” can be seen as doing a reversible type conversion the same way LosslessStringConvertible does; only in this case the conversion is free.