[Pitch] - Add `firstIndex` and `lastIndex` to `Collection`

I think Collection should have these to match the API for first and last:

extension Collection {
  public var firstIndex: Index? {
    guard !isEmpty else { return nil }
    return startIndex
  }
}
extension BidirectionalCollection {
  public var lastIndex: Index? {
    guard !isEmpty else { return nil }
    return index(before: endIndex)
  }
}

Currently, if someone wants the index of the first item in a collection, they will reach for startIndex, but that is not a valid position to subscript if the collection is empty (it equals endIndex). Or they might write indices.first, which can be less efficient.

Things are worse for lastIndex: if you need the index of the last element of a collection, there is no equivalent API. You either have to write someLongName.index(before: someLongName.endIndex) by hand, which again is error-prone if the collection is empty, or write indices.last, which is usually not optimal.
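To make the motivation concrete, here is a minimal sketch that redeclares the pitched properties locally (they are not in the standard library) and tries them on an ArraySlice, whose indices do not start at zero. The names numbers, middle, and emptySlice are illustrative only.

```swift
// Local copies of the pitched properties; NOT part of the standard library.
extension Collection {
  var firstIndex: Index? {
    isEmpty ? nil : startIndex
  }
}
extension BidirectionalCollection {
  var lastIndex: Index? {
    isEmpty ? nil : index(before: endIndex)
  }
}

let numbers = [10, 20, 30, 40, 50]
let middle = numbers[1..<4]      // ArraySlice that keeps the base array's indices
let emptySlice = numbers[2..<2]  // empty slice: startIndex == endIndex

// Explicit type annotations make it clear the property (not firstIndex(of:)) is meant.
let middleFirst: Int? = middle.firstIndex     // index of the element 20
let middleLast: Int? = middle.lastIndex       // index of the element 40
let emptyFirst: Int? = emptySlice.firstIndex  // nil, rather than an unusable index
let emptyLast: Int? = emptySlice.lastIndex    // nil
```

For the empty slice, startIndex still returns a value, but subscripting with it would trap; the optional properties surface that case in the type.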

It seems to me that the startIndex issue you describe is already solved by the isEmpty check:

guard let firstIndex else { ... }
// doesn't seem particularly beneficial over
guard !isEmpty else { ... }

What are some motivating examples for wanting arbitrary access to the last valid index?

I'm inclined to say it's the responsibility of the library author to make indices appropriately @inlinable, so that accessing it does not perform a retain-and-release on the base collection. It really doesn't seem worth it to give firstIndex and lastIndex spots in the (Bidirectional)Collection witness table.

Why not? Is there a downside?

+1 I am very in favor of this, because We Deserve Nice Things™. I have both of these in my own extensions to the standard library.

These are great for API symmetry. We have firstIndex(of:) and lastIndex(of:); adding .firstIndex and .lastIndex is a natural extension to that. We also have .first (and .last, though only on BidirectionalCollection), and the naming matches with that. It also rounds out the "beginning and end" index retrieval to complement startIndex and endIndex.

Yes, this is trivially implementable using indices.first, but it's not obvious that this is how you'd do it. The more common implementation is to conditionally return startIndex. But … wouldn't it be nice if I didn't have to? If I have a subsequence of a collection, wouldn't it be great if I could 1) know if the subsequence is empty and 2) figure out where it starts, all at once? .firstIndex would let me do that.
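As a rough sketch of the subsequence case described above, the snippet below uses a locally declared firstIndex (an assumption mirroring the pitch, not a standard library API) to learn in one step whether a trimmed Substring is empty and where it starts in the original string. The names line, code, and indentation are made up for illustration.

```swift
// Local copy of the pitched property; NOT part of the standard library.
extension Collection {
  var firstIndex: Index? { isEmpty ? nil : startIndex }
}

let line = "   let x = 1"
// drop(while:) returns a Substring that shares its indices with `line`.
let code = line.drop(while: { $0 == " " })

var indentation: Int? = nil
if let start = code.firstIndex {
  // Non-empty: `start` is a valid position in both `code` and `line`.
  indentation = line.distance(from: line.startIndex, to: start)
}

// A whitespace-only line leaves nothing after trimming, so there is no first index.
let blank = "    ".drop(while: { $0 == " " })
let blankStart: String.Index? = blank.firstIndex
```

The same result can be had today with an isEmpty check followed by startIndex; the property only folds the two steps into one optional binding.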

Ah, but this would make the addition of properties named firstIndex and lastIndex to the standard library a source-breaking change.

Is source-breaking okay for Swift 6? I think the direction is correct: Swift 6 is a time to revisit the Sequence and Collection API set in the standard library (which SE-0132 tried to do, and the decision at that time was to defer it to Swift 5).

If you've run into places where this is less efficient, please file an issue! indices.first and indices.last are/should be a supported way to get these indices right now.
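For reference, a small sketch of the spelling that works today, using only existing standard library API. The variable names are illustrative.

```swift
let numbers = [10, 20, 30, 40, 50]
let tail = numbers[2...]              // ArraySlice with indices 2, 3, 4

// `indices` is itself a collection, so `first` and `last` are optional.
let firstOfTail = tail.indices.first  // index of the element 30
let lastOfTail = tail.indices.last    // index of the element 50

// For an empty slice there are no indices, so both are nil.
let nothing = numbers[5..<5]
let firstOfNothing = nothing.indices.first
let lastOfNothing = nothing.indices.last
```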

The source break here would be that if people used the bare name to refer to the functions without any other type context, the type of the result would switch from (Element) -> Index? to just Index?. Using the same code in a typed context, or using the names in an actual function call, would still compile.

let a = [1, 2, 3]
// Today:
a.firstIndex(of: 2)   // 1
let b = a.firstIndex  // this gets the function a.firstIndex(of:)
b(2)                  // 1
// With additional property:
a.firstIndex(of: 2)   // 1
let c = a.firstIndex  // 0 (different value / type)
c(2)                  // error

This is a rare enough pattern that I'm pretty sure we've made this kind of break in the past, though I can't remember the specific instance. (I think we introduced a different type for a collection operation like joined() or something.)

I think having a pair of properties named firstIndex and lastIndex, and another pair named startIndex and endIndex, would be too confusing and an overall usability regression. Especially since, when the collection is not empty, firstIndex == startIndex, but lastIndex != endIndex.

Moreover, you still need to write an emptiness check - it's just obscured behind optional unwrapping. I don't think it's actually that useful, and it's less clear at the point of use.

That's the whole point, I feel: endIndex is somewhat misleading if you're not familiar with it, because it sounds like it means "the index of the end of the collection" (that is, the index of the last element), but really it's the index after the index of the last item in the collection. If you actually want the last index of a collection, you have to jump through hoops.

Perhaps a consequence of C++ heavily influencing Swift's design.

I think @Karl raises an excellent point about the potential for confusion.

I do wonder, though, if there's inspiration to be taken from ranges - Swift has both half-open and closed ranges, and that works well with little confusion.
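To sketch the range analogy with existing API only: startIndex and endIndex form a half-open range over a collection, while the first and last element positions form a closed one. The names below are illustrative.

```swift
let letters = ["a", "b", "c", "d"]

// Half-open: endIndex is excluded, so it never has to be a subscriptable position.
let halfOpen = letters.startIndex..<letters.endIndex

// Closed: both bounds must be real element positions, so this needs a non-empty collection.
let lastElementIndex = letters.index(before: letters.endIndex)
let closed = letters.startIndex...lastElementIndex

// Both ranges select the same elements.
let sameElements = Array(letters[halfOpen]) == Array(letters[closed])

// The half-open form can also describe an empty collection; a closed range cannot be empty.
let empty: [String] = []
let emptyRange = empty.startIndex..<empty.endIndex
```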

The index of the end of the collection is not the same as the index of the last element. And I don't mean that in a tautological sense (that we just defined them to be different) -- they are truly different locations.

Perhaps it would help to consider a function which inserts an element:

extension RangeReplaceableCollection {
  mutating func insert(_: Element, at: Index)
}

startIndex is the location which causes this function to prepend an element -- to insert it before every existing element, so it is at the start of the collection.

endIndex is the location which causes this function to append the new element -- to insert it after every existing element, so it is at the end of the collection.

Indexes are locations, and as you say, the name endIndex communicates that it is the location of the collection's end. If that were the same thing as "the location of the last element", the insert function above would only be able to insert before the final element, and never after it. Clearly then, that cannot be the location of the collection's end -- there is another location after that element which is not occupied, and which truly represents the collection's end.
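The argument above can be tried out with Array, which conforms to RangeReplaceableCollection. This is a minimal sketch with illustrative values.

```swift
var letters: [Character] = ["b", "c"]

// Inserting at startIndex prepends: the new element lands before every existing one.
letters.insert("a", at: letters.startIndex)

// Inserting at endIndex appends: the new element lands after every existing one.
letters.insert("e", at: letters.endIndex)

// Inserting at the last element's index places the new element BEFORE that element.
let lastElementIndex = letters.index(before: letters.endIndex)
letters.insert("d", at: lastElementIndex)

print(String(letters))
```

Here endIndex is a valid location for insertion even though it is not a valid location for subscripting, which is the distinction between "the end" and "the last element".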

I actually think it's quite elegant.

FWIW, I think the documentation could be refined. The docs for endIndex describe it as a "past the end" position, which seems contradictory (how can the end be past-the-end?). I think it should say "past the last".

Those docs support my opinion on this: if endIndex is the "past-the-end" position, that means that the end position is the last element.

The docs are not completely perfect. They try to explain a concept, but sometimes they could do with refinement to avoid misunderstandings. Documentation is hard.

endIndex is "past the last", not "past the end". endIndex is the end position by definition and by its behaviour. The end position is not the last element, but the position after that.

In the past, our policy on source breaks caused by adding overloads has been that we don't reject proposals because of the theoretical possibility of a source break; we only reject them if we find it's a significant problem in practice. For example, if we find that a bunch of the Source Compatibility Suite stops compiling, or we get a lot of bug reports during the beta cycle, we'll revise or reject the proposal.
