This is precisely what the new [unchecked:] will give you. It will compile to a bare load in both debug and release builds. Currently the only way to spell that is to go to offset from the baseAddress.
I understand that, but it feels pretty tiresome to have to opt in at every call site to get something so fundamental. Iām imagining looking at a page full of code like my2dArray[0, .unsafe].pointee[24, .unsafe].pointee
and wondering why I didnāt just write the thing in C or C++. There are so many words compared to something that scans easily: my2dArray[0][24]
. And I donāt have to remember to throw the word unsafe
into every pointer access or risk falling off a performance cliff.
Presumably because you wanted to benefit from bounds checks and other memory safety features, so write it as my2DArray[0, 24]
and get those. If you need to opt out for specific accesses after using a profiling tool to identify them, you might write my2DArray[unchecked: 0, 24]
. I (personally) think that's acceptable.
High-performance code in C and C++ is astonishingly verbose, too. Clearly communicating intended program semantics to the compiler requires a lot of words.
I donāt expect the things that are verbose on C/C++ to be less verbose in Swift. The pain comes from things which are concise and commonplace in C/C++ being verbose or presenting gotchas.
I think I can adapt @Karl ās opinion a little bit: āunsafeā should be a superset of āuncheckedā. Iāve clearly indicated my desire to trade off safety for performance by using an unsafe type; please donāt sneak the opposite trade off back in unless I ask, and especially donāt give me a false sense of security that my trade-off is being respected until I happen to exceed the optimizerās current level of intelligence.
Is it not like this today (without "pointee")?
var matrix: UnsafeMutableBufferPointer<UnsafeMutableBufferPointer<Double>> = ...
let element: Double = matrix[0][24] // ā
The "pointee" is analogous to C's *
prefix operator:
let element: Double = ((matrix.baseAddress! + 0).pointee.baseAddress! + 24).pointee
However, I do understand what you mean ā that you don't want to add extra words to get the current "unchecked" behaviour:
let element: Double = matrix[unchecked: 0][unchecked: 24]
Considering some alternatives:
- Here:
((matrix.baseAddress! + 0).pointee.baseAddress! + 24).pointee
we could consider (ab)using &+
operator to get the "non-default" unchecked behaviour:
((matrix.baseAddress! &+ 0).pointee.baseAddress! &+ 24).pointee
- C/C++'s
#pragma
way of doing this would look like:
#pragma optimize(push, checks, off)
func foo() {
// no checks here
}
#pragma optimize(pop)
There's good and bad in that approach, it doesn't look nice and greatly pollutes the code for starters.
- One other possibility would be to achieve almost the same as in (2) on the per-function granularity basis by having an attribute:
@optimization(...) func foo() {
}
- We should also consider leaving the current "unchecked" behaviour the default one and opting in to the checked behaviour by using a long winded form. Personally I don't like it but worth to mention it here for completeness.
Pointer arithmetic is already unchecked, and this proposal would not change that. You're doing pointer arithmetic. There's no need for any shenanigans.
At worst, this would be matrix.baseAddress![0].baseAddress![24]
. But surely one would wrap this into a struct
that provided multi indexing if you were using it more than once, and render it as matrix[unchecked: 0, 24]
.
Got you. Maybe this then?
((matrix + 0).pointee + 24).pointee // checked version
((matrix &+ 0).pointee &+ 24).pointee // unchecked version
A questionable point would be: is this legal way to get the 1st element:
(matrix + 100 - 99).pointee
or would it overflow mid-term.
This does look like only "x" is unchecked
There is no checking for pointer arithmetic currently, and this proposal would not change that. Not just within a buffer; you can wrap around the memory space a couple times and that's totally fine.Ā¹
1> let a = UnsafeMutablePointer<Int>.allocate(capacity: 1)
a: UnsafeMutablePointer<Int> = 0x600002458000 {
pointee = 0
}
2> a + Int.max - Int.max
$R0: UnsafeMutablePointer<Int> = 0x600002458000 {
pointee = 0
}
3>
Ā¹ "Fine" as far as Swift checks are concerned. There is a question about the interaction of this with LLVM-level optimization passes that may assume C-like semantics for such things and make it "not fine";Ā² I would need to investigate that further.
Ā² No more "not fine" than it is in C (which is very not fine).
Ah, no, I didn't mean pointer arithmetic there. More like this:
extension UnsafeMutableBufferPointer {
struct Something {
let buffer: UnsafeMutableBufferPointer
let offset: Int
var pointee: UnsafeMutableBufferPointer.Element {
get { buffer[checked_or_unchecked: offset] }
set { buffer[checked_or_unchecked: offset] = newValue }
}
static func + (something: Self, offset: Int) -> Something {
Something(buffer: something.buffer, offset: something.offset + offset)
}
}
static func + (buffer: Self, offset: Int) -> Something {
Something(buffer: buffer, offset: offset)
}
}
Or something similar to that account (that potentially could check "offset" validity inside "+" operation instead of (or in addition to) the "pointee" operation).
All of my yes, it's bad enough that the C++ equivalent (std::span
) doesn't bound check, there's no need for Swift to continue the same mistake. +1 to the bound checks and +1 to the additional unchecked subscript.
In sympathy to @Karl's line of thinking, would it make sense to extend this [unchecked: x]
syntax to safer collection types too, like Array
? Would that perhaps lessen the need to use Unsafe[Mutable]BufferPointer
to begin with?
So while I'm sympathetic to this argument, I want to reinforce that @karl's fear is entirely reasonable. The Swift optimizer is consistently horrible at hoisting bounds checks. The only reason this hasn't been more of a disaster is that Array has magic semantics annotations that allow the code to be wholesale replaced.
As an example, below is the world's simplest collection type: a stupid reimplementation of Range
:
struct BoundsCheckedRange: RandomAccessCollection {
var startIndex: Int
var endIndex: Int
subscript(_ index: Int) -> Int {
precondition(index >= startIndex)
precondition(index < endIndex)
return index
}
}
We can call firstIndex(where:)
in the safest possible way like this:
func funComputing(_ range: BoundsCheckedRange, element: Int) -> Int? {
return range.firstIndex(where: { $0 >= element })
}
Here is the generated asm for this function (godbolt link for comparison]:
output.funComputing(_: output.BoundsCheckedRange, element: Swift.Int) -> Swift.Int?:
mov rax, rdx
mov dl, 1
cmp rdi, rsi
jne .LBB47_2
.LBB47_1:
xor edi, edi
mov rax, rdi
ret
.LBB47_2:
jge .LBB47_12
cmp rdi, rax
jge .LBB47_8
inc rdi
cmp rdi, rsi
je .LBB47_1
.LBB47_5:
cmp rdi, rsi
jge .LBB47_11
cmp rax, rdi
je .LBB47_10
inc rdi
cmp rsi, rdi
jne .LBB47_5
jmp .LBB47_1
.LBB47_8:
xor edx, edx
mov rax, rdi
ret
.LBB47_10:
xor edx, edx
ret
.LBB47_11:
ud2
.LBB47_12:
ud2
We have two precondition
checks here: one defensible, one not. One is in the setup block LBB47_2
. We confirm that the endIndex is greater than the startIndex, a justifiable check (failure is LBB47_12
).
The other is in the loop body at LBB47_5
. Here we immediately check that our current index (rdi
) is not greater than the endIndex
(rsi
). This is one of our two bounds checks in the subscript, and it has not been eliminated. Swift has successfully dropped one, but not the other.
Now, this leaves aside the fact that an actually optimized version of this code can correctly observe that the correct implementation of this function is validity checks on startIndex
and endIndex
, followed by if element < endIndex { return element} else { return nil }
. It's totally understandable that the optimizer wasn't able to see its way to that, but we're a long way from that goal.
To be absolutely clear about this, this is an optimizer miss, not a stdlib issue. We performed the following checks:
cmp rdi, rsi
jne .LBB47_2 // taken
jge .LBB47_12 // not taken
This means we know that rdi
is not equal to rsi
, and not greater than rsi
: that is, it's strictly less than. All subsequent operations on rdi
are inc
: there is no assignment, and no other math.
Therefore, the subsequent cmp rdi, rsi; jge .LBB47_11
cannot be taken because it's strictly dominated by the loop-end check at the bottom of the loop body. The optimizer had enough knowledge to know this, but failed to.
I want to stress that none of this is me saying that this patch is reasonable. My personal use of the buffer pointers has not been to remove bounds checks, and I've often wanted to add them back. I accept the motivation.
I just want us to be cautious of promising that the optimizer will fix things. Swift's optimizer has never done a good job of removing bounds checks for anything but Array.
Unsafe
prefix tells that using of api is .. unsafe.
So seeing in code something like unsafe[unchecked: Int]
is a little bit misleading.
What are your thoughts about keeping current api as is and add new [checked: Int]
?
There will be no source or abi breakage, existing code will behave the same and in Swift 6 mode it become possible to use new [checked: Int]
.
It is worth to mention that such kind of design doesn't match well with Swift's safety by default
on the one side, and the fact that unsafe
is unsafe by design on the other.
Yes, absolutely, this will be part of the actual proposal.
Unless we have "safe"
/ "unsafe"
as formal attributes/qualifiers on functions, we have to rely on the naming convention of "UnsafeSomething" to denote unsafe constructs. "unchecked" array subscript is unsafe (instead of safely crashing the app on the relevant out of bounds precondition check the app would likely crash or misbehave unsafely at some later point), hence I'd expect to see "Unsafe" embedded in the name for the unchecked array subscript. array[x, .unsafeUnchecked]
or something to that account. Otherwise when auditing code for unsafely we'd have to search for both "unsafe" and "unchecked" terms.
This isn't totally clear. Yes, it's an optimizer miss, but maybe we should be working around that in the standard library instead of waiting for the optimizer to get smarter. Explicitly unchecked subscripts can be one of the tools we use to build those workarounds.
This is a strange divergence. If bounds checking is so cheap to be enabled by default for subscripting, why shouldnāt UnsafeBufferPointer.baseAddress
return a type whose arithmetic operations perform bounds checking?
I would much rather this functionality be introduced under a new safe(r) but still unmanaged BufferPointer type.
How would that type be different from UnsafeBufferPointer
itself? A raw pointer without size can't reasonably provide arithmetic with bounds checking. Perhaps one could argue that pointers alone shouldn't have arithmetic operations at all (but probably as a different proposal from this one).
It would need to be a fat pointer.
I oppose this proposal and the comments suggesting to weaken Swift's pointer types.