Add accessor with bounds check to Array

collection

(Joshua Alvarado) #1

I have found it odd that the Swift Array type doesn't have a safe method to access an element.
I am proposing to add a method that returns an element at a specified index with bounds checking.

struct Array {
   /**
    * Accesses the element at the specified index (position). 
    * If the array is empty or the index is out of bounds this method returns nil
    */
   function at(_ index: Int) -> Element? {
       return (self.startIndex..<self.endIndex).contains(index) ? self[index] : nil
   }
}

This is influenced from c++'s std::vector::at method. The c++ implementation throws an out of range exception but I want to avoid throwing or any fatal errors with this implementation. This should be an easy convenience method to access elements and help remove redundant bounds checking in the code. I believe this method fits in the Swift language arrays as it promotes safety to arrays. There are already similar methods to safely access elements of a Swift array such as: first, last, randomElement().

Alternative names: elementAt(_ index:), element(at index:),


(Nobody1707) #2

Oh, boy. This again. This comes up a lot. A casual search turned up some threads. There were probably more, but I really didn't feel like scrolling that far down.

A list of discussions on "safe" array indexing.





The tl;dr here is that safe does not mean, "does not crash." It means "don't silently propagate invalid data", which can be accomplished here by either crashing or returning nil. As out of bounds indexing is a logic error, crashing is appropriate.

Returning nil has two problems:

  • it makes using arrays very awkward.
  • it's not appropriate in a generic context. The only way to get a bad index from a generic collection is if you kept an invalidated index around (or worse, used an index from a different collection of the same type), which is a logic error that you really want to catch early.

That said, there are some niche cases for this (like using an array instead of a Dictionary<Int, Element>), but those would be better handled by a new type that wraps an array, rather than directly using an array. So, while there's nothing wrong with giving array an at(_:) method or a second labled subscript, I don't think it's really worth putting in the standard library.


(Anthony Latsis) #3

It's also in the commonly rejected list together with the rationale.


(Tino) #4

The list also says that

  • Single-quotes '' for character literals: Swift takes the approach of highly valuing Unicode. However, there are multiple concepts of a character that could make sense in Unicode, and none is so much more commonly used than the others that it makes sense to privilege them. We'd rather save single quoted literals for a greater purpose (e.g. non-escaped string literals).

;-)


(Nobody1707) #5

Yeah, that one is a little out of date after RawStrings were excepted. The core team might want to do some cleanup on that list.


(Tino) #6

The rationale for "no checked indexing" might even predate the deprecation of the C-style for loop, and using arrays in this way isn't that popular anymore...
I think it's ok to question old decisions from time to time - and those who are tired of such threads can easily ignore them and leave the rookies alone.


(Joe Groff) #7

It's true that this has been discussed before, and it's helpful to link back to old threads about similar discussions, but having an attitude about it like this is unnecessary.


#8

To be precise, the idea of changing the type of the unlabelled subscript on Array to an optional is on the commonly rejected list, whereas this thread is about adding a new accessor with bounds checking. There has been a lot of past discussion about this though

I wouldn't necessarily be opposed to a separate labelled subscript that returns an optional. I sometimes define this subscript myself to clean up multi-dimensional array processing code where you need to look at neighbouring entries. Nil coalescing is a blessing for code clarity in this situation, because the alternative is a lot of specially casing or conditional testing for the edges.


(Joshua Alvarado) #9

Thanks, Joe!

@Nobody1707 If I am echoing other threads happy to be told so but it doesn't help to make me feel unwelcomed. Either way, thanks for the links to other threads about this topic. Being that there have been many threads about this topic it could be worth finding a happy medium on the solution I believe there is a place for it but happy to discuss. Thanks.


(Joshua Alvarado) #10

I am looking to have this method to be added as an addition and not replace the subscript accessor. I believe that is where the hesitation comes from about this proposal. This is just a method someone could use for convenience. The goal would be to have its performance of the proposed method equal to performing bounds checking before sub-scripting which is what many do in code today.
Ex:

print(array.at(index)) // checks against start index & endIndex - 1

// at(_:) should be yield the same performance as the code below
if index >= array.startIndex && index < array.endIndex - 1 {
  print(array[index])
}

My intention with this proposal is I believe sometimes it is better to take a slight performance hit than crash the app on out of bounds checks. I see this being used interchangeably with sub-scripting. In areas where high performance is needed then use sub-scripting but if it is a one time access and you need safety you can use this method.

Thanks for your input.


#11

To expand on what I was saying, and echo some of the earlier threads, it's possible to have multiple subscripts on a type and this functionality would probably be more naturally represented as a labelled subscript rather than a method:

extension Array {
  subscript(checked i: Int) -> Element? {
    get { return indices.contains(i) ? self[i] : nil }
  }
}

let a = Array(1...10)
a[checked:  9] // .some(10)
a[checked: 10] // .none

(Jens Persson) #12

I guess "checked" would be misleading in that the standard subscript is in no way unchecked (it couldn't trap unless it checked if the index was out of bounds).

Is there some obvious term that can be used here?

Swift has failable initializers, but I don't know what I think about
let maybeElement = arr[failable: 123]

Perhaps there has been some good suggestion in earlier discussions?


(Chris Lattner) #13

Agreed. It would be nice to clarify the wording on the commonly rejected list from "Make Array<T> subscript access return" to "Change Array<T> subscript access to return" to clarify this.

Agreed, the most frequently suggested spelling for this is: yourArray[safe: idx], which seems great to me. I am very +1 for adding this.

-Chris


(Anthony Latsis) #14

Yes, @lostatseajoshua my apologies for not being attentive enough. An addition is a sensible suggestion.

The subscript that Chris mentions looks great, but I feel it will be hard to discover given how often we use the regular one; apparently code completion doesn't respond to a correct opening [ .


(Rudolf Adamkovič) #15

Personally, I would like to see this work with e.g. Data too, not just arrays.


(Tony Allevato) #16

Maybe we should knock everything out in one go then by making it an extension to Collection where Index == Int. This would provide the protection for any integer-indexable collection where users can put any direct number they want into the subscript, which isn't usually the case for other types of collections that use custom index types that the user must retrieve by querying the collection (thus preventing the problem of invalid indices because the collection itself verifies the integrity of the indices it creates).


#17

If you have let arr: [Int?] = [5, 2, nil, 7, nil], with a safe accessor how do you differentiate between out of bounds and not?


(Alejandro Alonso) #18

You can still check if value is nil or not:

let arr: [Int?] = [5, 2, nil, 7, nil]
let element = arr[safe: 4] // This has type Int??

guard let safeElement = element else { return }

// safeElement is now Int?
print(safeElement as Any) // nil

It's a little awkward, I agree, but it's certainly doable.


(Daniel Vollmer) #19

I dislike this pitch.

You have an index into a collection from somewhere. This index has a meaning (maybe it's a particular element, or the beginning of a range, whatever).

Now you use the safe accessor, ostensibly because you aren't sure the index is valid anymore -- so the collection may have changed, I guess? If you aren't sure whether the collection hasn't changed in the meantime, how do you know that the safe accessor will not return you a different element (now at the original index) than the one you thought you were dealing with?

I fear this "safe" access pattern will lead to more code that is subtly wrong, but we'll notice it in fewer instance and thus have a harder time to fix it.


(Tino) #20

When a collection can be changed while you are working in it, you're in trouble anyway, and imho this pattern has different applications.
There will always be situations where you have some requirements that can't be enforced with the type system, and only a fraction of those can be resolved with tools like zip.
I guess the most important might be conversion of data structures like the output of an universal parser:
Many of us dislike optional collections, but a backend-developer might decide that it's stupid to explicitly deliver a long list of empty strings.
With non-crashing indexing, this can be addressed easily:

init(serverStrings: [String]) {
    self.value0 = serverStrings[checked: 0] ?? ""
    self.value1 = serverStrings[checked: 1] ?? "n/a"
    //...