Swift has been putting a lot of thought into data-race safety recently.
That hasn't been a problem for me in practice - but what has caused me a lot of pain are fatal errors relating to bounds issues.
Specifically in my own code, I have had two (three?) cases this year of production crashes which amounted to this bug (of course - less obviously arrived at)
For the classic case of accessing an out-of-bounds index on an Array, I have my own helper:

    public extension Collection {
        /// Returns the element at the specified index iff it is within bounds, otherwise nil.
        subscript(safe index: Index) -> Element? {
            return indices.contains(index) ? self[index] : nil
        }
    }
But I wonder if there might be a broader language-supported approach to provide safer variants of these basic language features.
For example
start?...finish
returns nil if start > finish
or perhaps let range = try start...finish
throws if start > finish
and for an array
array[? index]
returns nil if index out of bounds
I'm not suggesting this would be the right syntax - just raising the idea that there might be some syntax
These would of course have to be optional - but might encourage people towards more defensive programming.
I'd love to see some stats on how many Apple crashes relate to out-of-bounds type errors. I bet it's a meaningful proportion...
As has been said before, halting execution is safe. It is continuing execution in an unexpected state that is potentially unsafe.
The idea of a "lenient" subscript has been discussed at length and you can review the existing conversations over the past decade using the search function. As an example:
When resurrecting ideas such as this with an extensive history, it is helpful to the community if you can take some time to summarize existing contributions in such a way that the conversation doesn't end up as a rehash: after all, if the goal is to make progress, repeating the same things won't get us there.
An array indexing model that is statically correct would introduce conceptual complexity and annotation burden that far exceeds that of data race safety, though.
I have also run into issues with crashes in production due to indexes being out of bounds and ranges being improperly formed, but I don't really blame the language for that happening; I think trapping is the right thing to do. In the past couple of years, it's become less and less common for me to encounter these crashes as I've gotten to know the cases where forming a range from unknown values or using indices directly is truly necessary. I don't yet have a conscious understanding of what exactly has changed in my approach, but there are often other ways of accomplishing things that avoid these sorts of issues. My view is that it is not the job of the language to provide these extra "safety" nets.
Yeah, exactly. Replacing all subscripting operations with a “safe” variant is one of those things that often just creates a new problem: how do you handle a nil return?
While it might be technically possible to recover in some way in every case, if some execution path that leads to out of bounds access isn’t properly tested, you’re still going to end up in an undefined state, and it’s very possible that something will end up crashing anyway later, in which case you’ve gained nothing, and now the root cause is harder to debug.
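To make that concrete, here is a minimal sketch of how a lenient subscript can hide a bug rather than fix it. The `safe:` subscript is the helper from the first post; the `scores` data and the off-by-one mistake are made up for illustration.

```swift
extension Collection {
    /// The lenient helper from earlier in the thread.
    subscript(safe index: Index) -> Element? {
        indices.contains(index) ? self[index] : nil
    }
}

let scores = [72, 85, 91]

// Hypothetical bug: the last valid index is scores.count - 1, not scores.count.
let offByOne = scores.count

// No crash here, but the fallback silently replaces the real last score (91) with 0.
let lastScore = scores[safe: offByOne] ?? 0

// The wrong value now flows into later calculations with no sign of where it came from.
let total = scores.reduce(0, +) + lastScore
```

With the plain subscript, `scores[offByOne]` would have trapped at the exact line containing the mistake.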
Ironically, given that the crash is inside a weak-reference collection of some sort, it's possible that this is caused by a data race in your app.
Optional and error-handling are very different beasts from fatal errors caused by programmer error. They are tools for dealing with the inherent unknowability of interacting with the unpredictable runtime environment. A user might not input a valid number, and you need to account for that, etc. There are many ways to handle it, but it's up to the programmer to decide how they want to handle it depending on the context.
Of course, one could say the same thing about subscripts and range literals that fatalError, but every programming language is full of tradeoffs. Swift's creators decided early on that Collection subscripts would not return an Optional<Element>. Same with x...y. You can argue about whether that was correct, but I don't think the argument is likely to get very far. That ship has sailed. And again, I don't think it's up to the language to provide alternative tools for every single case that might cause a fatal error (integer overflow/underflow is another one, for example; hell, even a stack overflow caused by a recursive function call).
The use cases for arrays and dictionaries are different, though.
I’d say about 80% of the time you subscript an array, you’re using an index that was somehow derived from the array: for instance, a range like 0..<array.count, or array.indices, or array[indexPath.row] where tableView(_:numberOfRowsInSection:) returns array.count. This is very different from dictionaries, where the key is usually some piece of data from somewhere else and you’re trying to look up the value corresponding to it. You rarely say, for instance, array[2] or array[someRandomNumberFromSomewhere], but dictionary["myKey"] or dictionary[someRandomValueFromSomewhere] are pretty common.
Because the use cases are different, arrays have a non-optional subscript which fails a precondition when the index is invalid, while dictionaries have an optional subscript which returns nil when the key is not present.
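A small sketch of that difference, using made-up `fruits` and `prices` values:

```swift
let fruits = ["apple", "banana", "cherry"]
let prices = ["apple": 3, "banana": 1]

// Array: the indices come from the array itself, so every access is in bounds
// and the subscript can return a plain String.
var viaIndices: [String] = []
for i in fruits.indices {
    viaIndices.append(fruits[i])
}

// Dictionary: the key usually comes from somewhere else, so the subscript
// returns an Optional and the caller decides what a missing key means.
let knownPrice = prices["apple"]     // Int? holding 3
let unknownPrice = prices["durian"]  // Int? holding nil

// The standard library also offers a defaulting subscript for dictionaries.
let priceOrZero = prices["durian", default: 0]
```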
None of which prevents you from adding your own lenient subscript implementation or lenient range initializer, if you find them useful. The ability to add operations to existing types as if they were built-in is one of the delights of writing Swift.
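For the range case, a lenient initializer might look something like the sketch below. The `lenient:through:` labels are invented for this example and are not a standard library API.

```swift
extension ClosedRange {
    /// Hypothetical lenient initializer: returns nil instead of trapping
    /// when the lower bound is greater than the upper bound.
    init?(lenient lower: Bound, through upper: Bound) {
        guard lower <= upper else { return nil }
        // Safe to use the trapping operator here: the bounds were just checked.
        self = lower...upper
    }
}

let valid = ClosedRange<Int>(lenient: 2, through: 5)    // Optional(2...5)
let invalid = ClosedRange<Int>(lenient: 5, through: 2)  // nil
```

Together with a `safe:` subscript like the one in the first post, this covers both of the cases raised at the top of the thread without any change to the language.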
I'd also suggest that perma-checking indexes at subscripting time is too big a hammer. IRL there are usually "gates" in your code where untrusted indexes will cause problems if allowed through. However, if indexes are validated at the gates, they're essentially trustworthy for as long as they're on the inside, and re-checking them at each subscripting operation is pointless.
Another way of looking at this, without the hokey metaphors, is that indexes should be trusted by default when arrived at via local reasoning, and you should write your own validation checks when they're arrived at via globally complex reasoning.
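As a rough sketch of what I mean by a gate (the `Playlist` type and its method names are hypothetical, just for illustration):

```swift
struct Playlist {
    private let tracks: [String]

    init(tracks: [String]) {
        self.tracks = tracks
    }

    /// The gate: the one place where an index from outside is validated.
    func title(atUntrustedIndex index: Int) -> String? {
        guard tracks.indices.contains(index) else { return nil }
        return numberedTitle(at: index)
    }

    /// Inside the gate: the index has already been validated, so the plain
    /// subscript is used and any failure here would be a genuine bug.
    private func numberedTitle(at index: Int) -> String {
        "\(index + 1). \(tracks[index])"
    }
}

let playlist = Playlist(tracks: ["Intro", "Verse", "Outro"])
let second = playlist.title(atUntrustedIndex: 1)   // Optional("2. Verse")
let missing = playlist.title(atUntrustedIndex: 7)  // nil
```

Because `tracks` is immutable here, an index that passed the gate stays valid for as long as it is used inside the type.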
It surprises me how keen people are to enforce onerous requirements in order to prevent one source of problems - whilst being uninterested in providing options to avoid another.
but hey-ho, I'm clearly way out of sync with the swift mood music...
Just to throw my 2¢ in, the only time I've had a crash from out-of-bounds array access that wasn't because of an obvious bug that was caught early in development, it was because I said array[0] rather than array.first.
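For anyone who hasn't run into it, a quick sketch of the difference using standard library properties that already return an Optional or clamp to the available elements:

```swift
let empty: [Int] = []
let numbers = [4, 8, 15]

// `first` and `last` return nil on an empty array instead of trapping.
let firstOfEmpty = empty.first      // nil
let firstOfNumbers = numbers.first  // Optional(4)
let lastOfEmpty = empty.last        // nil

// `prefix(_:)` returns at most the requested number of elements,
// so asking for more than exist is not an error.
let upToFive = Array(numbers.prefix(5))

// By contrast, `empty[0]` would trap at runtime.
```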
The current industry push to memory safe languages is because this approach has proven not to scale.
Moreover, it fails to account for bugs. For instance, it is trivial to make an index invalid "remotely" by mutating the backing storage -- now that index no longer points to a valid element, even though the index itself didn't change. Code inside the "gate" would have to be vigilant to avoid such issues, but what if it isn't? What if somebody forgets, or a calculation fails to account for some edge case?
If that bug leads to invalid accesses to memory, the effects can be particularly severe. Not only can it lead to crashes on its own (just much more difficult to debug crashes), it can also make your code vulnerable to being exploited.
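A minimal illustration of an index going stale without the index itself changing (the `queue` values are made up):

```swift
var queue = ["a", "b", "c"]

// Validated "at the gate": at this point the index refers to "c".
let lastIndex = queue.index(before: queue.endIndex)
let wasValid = queue.indices.contains(lastIndex)

// Somewhere else, the backing storage is mutated.
queue.removeFirst()

// The index is still 2, but the array now only has indices 0 and 1.
let stillValid = queue.indices.contains(lastIndex)

// `queue[lastIndex]` would trap here. With an unchecked access it would
// instead read past the end of the storage.
```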
There are still things that can be done to optimise bounds checking. In WebURL, I implemented my own bounds checking on top of unsafe pointers with a focus on speed. It drops certain checks related to Collection protocol semantics but not strictly required for memory safety. Last time I checked, the bounds-checking it implements can almost entirely be optimised out by the compiler; you can change the UnsafeBufferPointer.boundsChecked accessor to return self (i.e. disabling bounds-checking), and the performance is exactly the same. Part of it is that I've written the library very deliberately to make that easier on the compiler, though.
Once we have Span to guarantee lifetimes, I'll see about implementing this bounds-checking strategy on top of it and releasing it as a package.
No amount of testing will make it so that a network request cannot fail. try, and Swift errors more generally, are for expected runtime failures: operations that cannot be guaranteed to succeed ahead of time. No amount of precondition checking will tell you whether you can open a file; you just have to try, and handle the failure if it turns out you can't. Indexing an array does not work like that because no one is changing the array behind your back†. You can check whether the index you want to access is in bounds, and if that check succeeds then indexing the array will also always succeed.
†Assuming no data races, which is why Swift is trying to solve that problem.
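A sketch of the contrast, assuming Foundation is available. The `loadText(atPath:)` helper and the file path are made up for illustration.

```swift
import Foundation

// Reading a file: there is no check that guarantees success in advance,
// so the only option is to try and handle the failure.
func loadText(atPath path: String) -> String? {
    do {
        return try String(contentsOfFile: path, encoding: .utf8)
    } catch {
        return nil
    }
}

// Hypothetical path that is assumed not to exist.
let missing = loadText(atPath: "/hypothetical/path/that/does/not/exist.txt")

// Indexing an array: the check is complete. If it passes, the access succeeds.
let values = [10, 20, 30]
let index = 1
var picked: Int? = nil
if values.indices.contains(index) {
    picked = values[index]
}
```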
I'm a bit confused by this. I was suggesting that adding (a) some kind of syntax to wrap the existing compiler-generated array bounds check, and (b) your own code to check your optional subscripting result for nil would count as an "onerous" requirement you've been imposing on yourself. I'm suggesting an option: writing a much smaller amount of checking code in a much smaller number of places.
I was genuinely trying to suggest something easier, not something harder.
As @xwu already pointed out above, this thread isn't about safety (in Swift terms), but about the question, "Is there an easier way to avoid having my app crash at array subscript bounds checks, other than writing boilerplate at every such access to turn the failure into a more ordinary error?"
I wonder if you thought I was suggesting removing the index bounds check from the compiler's code generation? I wasn't. I was suggesting a methodology to avoid wrapping that bounds check in additional boilerplate to avoid the crash.