Given that we're headed for ABI (and thus stdlib API) stability, I've
been giving lots of thought to the bottom layer of our collection
abstraction and how it may limit our potential for efficiency. In
particular, I want to keep the door open for optimizations that work on
contiguous memory regions. Every cache-friendly data structure, even if
it is not an array, contains contiguous memory regions over which
operations can often be vectorized and which should define boundaries
for parallelism, etc. Throughout Cocoa you can find patterns designed to
exploit this fact when possible (NSFastEnumeration). POSIX I/O bottoms
out in readv/writev, and MPI datatypes essentially boil down to
identifying the contiguous parts of data structures. My point is that
this is an important class of optimization, with numerous real-world
examples.
If you think about what it means to build APIs for contiguous memory
into abstractions like Sequence or Collection, at least without
penalizing the lowest-level code, it means exposing UnsafeBufferPointers
as a first-class part of the protocols, which is really
unappealing... unless you consider that *borrowed* UnsafeBufferPointers
can be made safe.
[Well, it's slightly more complicated than that because
UnsafeBufferPointer is designed to bypass bounds checking in release
builds, and to ensure safety you'd need a BoundsCheckedBuffer—or
something—that checks bounds unconditionally... but] the point remains
that
A thing that is unsafe when it's arbitrarily copied can become safe if
you ensure that it's only borrowed (in accordance with well-understood
lifetime rules).
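
To make the borrowing idea concrete, here is a rough sketch of what it could look like with today's closure-scoped APIs. BoundsCheckedBuffer and withBoundsCheckedBuffer are hypothetical names, not standard library API; only Array.withUnsafeBufferPointer and precondition are real. Note that nothing in the language currently stops the closure from copying the buffer out, so the "borrow" here is enforced by convention only.

```swift
/// Hypothetical wrapper (not part of the standard library).
/// Unlike UnsafeBufferPointer, it checks bounds with `precondition`,
/// which is still evaluated in ordinary -O builds.
struct BoundsCheckedBuffer<Element> {
    private let base: UnsafeBufferPointer<Element>

    init(_ base: UnsafeBufferPointer<Element>) {
        self.base = base
    }

    var count: Int { return base.count }

    subscript(index: Int) -> Element {
        precondition(index >= 0 && index < base.count, "index out of range")
        return base[index]
    }
}

extension Array {
    /// Hypothetical helper: lends the array's contiguous storage to `body`
    /// for the duration of the call only.
    func withBoundsCheckedBuffer<R>(
        _ body: (BoundsCheckedBuffer<Element>) throws -> R
    ) rethrows -> R {
        return try withUnsafeBufferPointer { try body(BoundsCheckedBuffer($0)) }
    }
}

// Usage: the buffer is only meant to be touched inside the closure.
let numbers = [3, 1, 4, 1, 5]
let total = numbers.withBoundsCheckedBuffer { buffer -> Int in
    var sum = 0
    for i in 0..<buffer.count {
        sum += buffer[i]
    }
    return sum
}
```

With real ownership support, the compiler rather than the closure scope would be what guarantees the buffer does not outlive the array.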
And this leads me to wonder about our practice of embedding the word
"unsafe" in names. A construct that is only conditionally unsafe
shouldn't be spelled "unsafe" when used in a safe way, right? So this
*seems* to argue for an "unsafe" keyword that can be used to label
the constructs that actually add unsafety (as has been previously
suggested on this list). Other ideas are of course most welcome.
Yes, I’ve always found this more appealing (“operations are unsafe, not types”). This allows you to make more subtle distinctions, and expose “low level” APIs for otherwise safe types (e.g. unchecked indexing on Array). I believe Graydon made a draft proposal for this a while back, but neither of us can recall what became of it.
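
As a rough illustration of "operations are unsafe, not types": there is no unsafe keyword today, so this sketch can only flag the operation through its name. uncheckedElement(at:) is a made-up method, not an existing Array API.

```swift
extension Array {
    /// Hypothetical low-level API on an otherwise safe type.
    /// The caller must guarantee 0 <= index < count. UnsafeBufferPointer's
    /// subscript only verifies this in debug builds.
    func uncheckedElement(at index: Int) -> Element {
        return withUnsafeBufferPointer { $0[index] }
    }
}

let values = [10, 20, 30]
let second = values.uncheckedElement(at: 1)  // caller vouches for the index
let first = values[0]                        // ordinary, checked indexing
```

The type stays Array throughout; only the one operation that drops the check would need to be marked.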
That said, in this particular case the distinction isn’t very helpful: basically everything you can do with an Unsafe(Buffer)Pointer is truly unsafe today, and I wouldn’t really expect this to change with ownership stuff. You need a completely unchecked pointer type for the very lowest levels of abstractions, where scoped lifetimes can’t capture the relationships that are involved.
I would expect there to be two types of interest, one with safe borrowed semantics (Pointer/BufferPointer?), and one with unsafe unchecked semantics (today’s UnsafePointer/UnsafeBufferPointer). For those familiar with Rust, this is roughly equivalent to: &T, &[T], *mut T, and *mut [T] respectively. Most APIs should operate in terms of the safe types, requiring the holder of an unsafe type to do some kind of cast, asserting that whatever guarantees the safe types make will be upheld.
99% of code should then never interact with the Unsafe types directly, using the safe ones instead. Anything that does use the Unsafe types should try to get into the world of safe types as fast as possible. For instance, much of Rust’s growable array type (Vec) is implemented as “convert my unsafe pointer into a safe, borrowed slice, then operate on the slice”. Similarly, any API which is interested in passing around a non-growable pile of memory communicates in terms of these slices.
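
A minimal Swift sketch of that Vec pattern follows. IntVec and withElements are hypothetical names used only for illustration. Swift has no safe borrowed slice type yet, so a closure-scoped UnsafeBufferPointer stands in for the slice here; the point is only that the raw pointer is turned into a bounded view in exactly one place, and everything else is written against the view.

```swift
/// Hypothetical growable array of Int that owns raw storage.
final class IntVec {
    private var storage: UnsafeMutablePointer<Int>
    private var capacity: Int
    private(set) var count = 0

    init(capacity: Int = 4) {
        precondition(capacity > 0)
        self.capacity = capacity
        storage = UnsafeMutablePointer<Int>.allocate(capacity: capacity)
    }

    deinit {
        storage.deinitialize(count: count)
        storage.deallocate()
    }

    func append(_ value: Int) {
        if count == capacity { grow() }
        (storage + count).initialize(to: value)
        count += 1
    }

    private func grow() {
        let newCapacity = capacity * 2
        let newStorage = UnsafeMutablePointer<Int>.allocate(capacity: newCapacity)
        // Move the initialized elements; the old memory is left uninitialized.
        newStorage.moveInitialize(from: storage, count: count)
        storage.deallocate()
        storage = newStorage
        capacity = newCapacity
    }

    /// The single place where the raw pointer becomes a bounded view.
    func withElements<R>(_ body: (UnsafeBufferPointer<Int>) throws -> R) rethrows -> R {
        return try body(UnsafeBufferPointer(start: UnsafePointer(storage), count: count))
    }

    // Everything below operates on the view, never on the raw pointer.
    func contains(_ value: Int) -> Bool {
        return withElements { $0.contains(value) }
    }

    var sum: Int {
        return withElements { $0.reduce(0, +) }
    }
}

let vec = IntVec()
for i in 1...10 {
    vec.append(i)  // forces the storage to grow twice
}
let vecSum = vec.sum
let hasSeven = vec.contains(7)
```

If a safe borrowed buffer type existed, withElements would hand that out instead, and contains and sum would not change at all.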
On Nov 6, 2016, at 4:20 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
--
-Dave