Add accessor with bounds check to Array

lostatseajoshua · October 10, 2018, 8:12pm

I have found it odd that the Swift Array type doesn't have a safe method to access an element.
I am proposing to add a method that returns an element at a specified index with bounds checking.

struct Array {
   /**
    * Accesses the element at the specified index (position). 
    * If the array is empty or the index is out of bounds this method returns nil
    */
   function at(_ index: Int) -> Element? {
       return (self.startIndex..<self.endIndex).contains(index) ? self[index] : nil
   }
}

This is influenced from c++'s std::vector::at method. The c++ implementation throws an out of range exception but I want to avoid throwing or any fatal errors with this implementation. This should be an easy convenience method to access elements and help remove redundant bounds checking in the code. I believe this method fits in the Swift language arrays as it promotes safety to arrays. There are already similar methods to safely access elements of a Swift array such as: first, last, randomElement().

Alternative names: elementAt(_ index:), element(at index:),

Nobody1707 · October 10, 2018, 8:58pm

Oh, boy. This again. This comes up a lot. A casual search turned up some threads. There were probably more, but I really didn't feel like scrolling that far down.

A list of discussions on "safe" array indexing.

The tl;dr here is that safe does not mean, "does not crash." It means "don't silently propagate invalid data", which can be accomplished here by either crashing or returning nil. As out of bounds indexing is a logic error, crashing is appropriate.

Returning nil has two problems:

it makes using arrays very awkward.
it's not appropriate in a generic context. The only way to get a bad index from a generic collection is if you kept an invalidated index around (or worse, used an index from a different collection of the same type), which is a logic error that you really want to catch early.

That said, there are some niche cases for this (like using an array instead of a Dictionary<Int, Element>), but those would be better handled by a new type that wraps an array, rather than directly using an array. So, while there's nothing wrong with giving array an at(_:) method or a second labled subscript, I don't think it's really worth putting in the standard library.

anthonylatsis · October 10, 2018, 9:06pm

It's also in the commonly rejected list together with the rationale.

Tino · October 10, 2018, 9:11pm

The list also says that

Single-quotes '' for character literals: Swift takes the approach of highly valuing Unicode. However, there are multiple concepts of a character that could make sense in Unicode, and none is so much more commonly used than the others that it makes sense to privilege them. We'd rather save single quoted literals for a greater purpose (e.g. non-escaped string literals).

;-)

Nobody1707 · October 10, 2018, 9:23pm

Yeah, that one is a little out of date after RawStrings were excepted. The core team might want to do some cleanup on that list.

Tino · October 10, 2018, 9:48pm

The rationale for "no checked indexing" might even predate the deprecation of the C-style for loop, and using arrays in this way isn't that popular anymore...
I think it's ok to question old decisions from time to time - and those who are tired of such threads can easily ignore them and leave the rookies alone.

Joe_Groff · October 10, 2018, 9:57pm

It's true that this has been discussed before, and it's helpful to link back to old threads about similar discussions, but having an attitude about it like this is unnecessary.

jawbroken · October 10, 2018, 11:53pm

To be precise, the idea of changing the type of the unlabelled subscript on Array to an optional is on the commonly rejected list, whereas this thread is about adding a new accessor with bounds checking. There has been a lot of past discussion about this though

I wouldn't necessarily be opposed to a separate labelled subscript that returns an optional. I sometimes define this subscript myself to clean up multi-dimensional array processing code where you need to look at neighbouring entries. Nil coalescing is a blessing for code clarity in this situation, because the alternative is a lot of specially casing or conditional testing for the edges.

lostatseajoshua · October 11, 2018, 3:20am

Thanks, Joe!

@Nobody1707 If I am echoing other threads happy to be told so but it doesn't help to make me feel unwelcomed. Either way, thanks for the links to other threads about this topic. Being that there have been many threads about this topic it could be worth finding a happy medium on the solution I believe there is a place for it but happy to discuss. Thanks.

lostatseajoshua · October 11, 2018, 3:31am

I am looking to have this method to be added as an addition and not replace the subscript accessor. I believe that is where the hesitation comes from about this proposal. This is just a method someone could use for convenience. The goal would be to have its performance of the proposed method equal to performing bounds checking before sub-scripting which is what many do in code today.
Ex:

print(array.at(index)) // checks against start index & endIndex - 1

// at(_:) should be yield the same performance as the code below
if index >= array.startIndex && index < array.endIndex - 1 {
  print(array[index])
}

My intention with this proposal is I believe sometimes it is better to take a slight performance hit than crash the app on out of bounds checks. I see this being used interchangeably with sub-scripting. In areas where high performance is needed then use sub-scripting but if it is a one time access and you need safety you can use this method.

Thanks for your input.

jawbroken · October 11, 2018, 3:44am

To expand on what I was saying, and echo some of the earlier threads, it's possible to have multiple subscripts on a type and this functionality would probably be more naturally represented as a labelled subscript rather than a method:

extension Array {
  subscript(checked i: Int) -> Element? {
    get { return indices.contains(i) ? self[i] : nil }
  }
}

let a = Array(1...10)
a[checked:  9] // .some(10)
a[checked: 10] // .none

Jens · October 11, 2018, 11:04am

I guess "checked" would be misleading in that the standard subscript is in no way unchecked (it couldn't trap unless it checked if the index was out of bounds).

Is there some obvious term that can be used here?

Swift has failable initializers, but I don't know what I think about
let maybeElement = arr[failable: 123]

Perhaps there has been some good suggestion in earlier discussions?

Chris_Lattner3 · October 20, 2018, 9:38pm

Agreed. It would be nice to clarify the wording on the commonly rejected list from "Make Array<T> subscript access return" to "Change Array<T> subscript access to return" to clarify this.

Agreed, the most frequently suggested spelling for this is: yourArray[safe: idx], which seems great to me. I am very +1 for adding this.

-Chris

anthonylatsis · October 20, 2018, 11:11pm

Yes, @lostatseajoshua my apologies for not being attentive enough. An addition is a sensible suggestion.

The subscript that Chris mentions looks great, but I feel it will be hard to discover given how often we use the regular one; apparently code completion doesn't respond to a correct opening [ .

salutis · October 21, 2018, 6:58pm

Personally, I would like to see this work with e.g. Data too, not just arrays.

allevato · October 21, 2018, 7:46pm

Maybe we should knock everything out in one go then by making it an extension to Collection where Index == Int. This would provide the protection for any integer-indexable collection where users can put any direct number they want into the subscript, which isn't usually the case for other types of collections that use custom index types that the user must retrieve by querying the collection (thus preventing the problem of invalid indices because the collection itself verifies the integrity of the indices it creates).

ebg · October 22, 2018, 1:46am

If you have let arr: [Int?] = [5, 2, nil, 7, nil], with a safe accessor how do you differentiate between out of bounds and not?

Alejandro · October 22, 2018, 1:57am

You can still check if value is nil or not:

let arr: [Int?] = [5, 2, nil, 7, nil]
let element = arr[safe: 4] // This has type Int??

guard let safeElement = element else { return }

// safeElement is now Int?
print(safeElement as Any) // nil

It's a little awkward, I agree, but it's certainly doable.

maven · October 22, 2018, 11:21am

I dislike this pitch.

You have an index into a collection from somewhere. This index has a meaning (maybe it's a particular element, or the beginning of a range, whatever).

Now you use the safe accessor, ostensibly because you aren't sure the index is valid anymore -- so the collection may have changed, I guess? If you aren't sure whether the collection hasn't changed in the meantime, how do you know that the safe accessor will not return you a different element (now at the original index) than the one you thought you were dealing with?

I fear this "safe" access pattern will lead to more code that is subtly wrong, but we'll notice it in fewer instance and thus have a harder time to fix it.

Tino · October 22, 2018, 11:42am

When a collection can be changed while you are working in it, you're in trouble anyway, and imho this pattern has different applications.
There will always be situations where you have some requirements that can't be enforced with the type system, and only a fraction of those can be resolved with tools like zip.
I guess the most important might be conversion of data structures like the output of an universal parser:
Many of us dislike optional collections, but a backend-developer might decide that it's stupid to explicitly deliver a long list of empty strings.
With non-crashing indexing, this can be addressed easily:

init(serverStrings: [String]) {
    self.value0 = serverStrings[checked: 0] ?? ""
    self.value1 = serverStrings[checked: 1] ?? "n/a"
    //...