[Pitch] Change the endIndex value for closed Ranges and Collections


(Pedro Vieira) #1

Hi Swift developers,
I hereby present my first Swift proposal regarding the endIndex on
`Collections` and closed `Ranges`. I’ve searched the mailing list archives
for something similar to this but couldn’t find it, so I decided to come
forward.

*The problem:*
I was recently working on a library and used a closed Range to define the
bounds of a board game and the tests that used its endIndex were all
failing. Started debugging it around and, to my complete surprise, found
out that the endIndex of the closed Ranges was always +1 from the value I
initially set. With this, I decided to dive into the Range source code and
discovered that all closed ranges are converted to half-open ones when
initialized:
a) 1..<10 stays 1..<10 (endIndex = 10)
b) 1...10 is converted to 1..<11 (endIndex = 11)

To work around this behavior I had to subtract 1 every time I checked for
the endIndex, since I didn't want to use last! (force unwrapping) or
if-let, which ultimately polluted my code. You could argue that changing
from 1...10 to 1...9 would get rid of all of this, since it gets translated
to 1..<10 with endIndex being 10 (which is the value I expect), but I think
it’s just not worth it to “obscure” that way.

I’ve asked some fellow developer friends their thoughts having the value 11
when accessing endIndex on b) and a lot of them were confused and did not
expect the outcome, and I totally agree, it’s not intuitive at all. To me,
endIndex implies that the returned value is the last valid and accessible
index of a Collection, not its size/count or any other value that is
outside the collection’s bounds.

*The solution:*
Add an optional boolean parameter, `closed`, to the Range init method with
the default value of false (instead of `closed` there’s always the
`halfOpen` alternative):
   init(_start: Element, end: Element, closed: Bool = false)
The parameter is an optional parameter to minimize the impact on existing
code that is currently initializing Ranges using the init method directly.
This parameter would also be a public property.

Then, the ... constructor would become:

public func ... <Pos : ForwardIndex> (minimum: Pos, maximum: Pos) -> Range<

{

  - return Range(_start: minimum, end: maximum.successor())

  + return Range(_start: minimum, end: maximum, closed: true)

}

Also, the next() method from the RangeGenerator would become:

public mutating func next() -> Element? {

  - if startIndex == endIndex.successor() { return nil }

  + if startIndex == (isClosed ? endIndex.successor() : endIndex) { return
nil }

  let element = startIndex

  startIndex._successorInPlace()

  return element

}

Performing this change has 2 main benefits:
1 — The Range description and debugDescription properties would finally
return a *true* self String representation. So, for instance, the
description property would become:

public var description: String {

  - return "\(startIndex)..<\(endIndex)"

  + return "\(startIndex)..\(isClosed ? "." : "<")\(endIndex)"

}

2 — It becomes a lot more intuitive. WYSIWYG. Also, by having the `closed`
parameter in the init method, developers that create Ranges directly from
the init method will actually know what type of Range they’re getting
straight away:
- currently: Range(start: 1, end: 10) ==> is it 1...10 or 1..<10?
- proposed: Range(start: 1, end: 10, closed: true) ==> 1...10

This behavior is also present in Swift Collections:

let array = [5, 10, 15]

array.count // 3

array.endIndex // 3 (should be 2)

array.last! // 15 (apologies for the force unwrap :nerd_face:)

Again, the endIndex returns an index that is outside the array bounds,
which, in fact, acts as the array count. Instead, it should have returned
the index of the last element.
I know that, in the comments, it’s explicit: “A 'past-the-end' element
index; the successor of the last valid subscript argument.”, but, in the
end, it all comes down to readability.

What are your thoughts on this? Let me know.

Cheers,

···

--
Pedro Vieira
http://pedrovieira.me


(Dave Abrahams) #2

Please see the work currently underway in
https://github.com/apple/swift/blob/swift-3-indexing-model/stdlib/public/core/ClosedRange.swift,
which we intend to propose soon. ClosedRange becomes a separate type
that represents its upperBound without modification.

···

on Wed Mar 23 2016, Pedro Vieira <swift-evolution@swift.org> wrote:

Hi Swift developers,
I hereby present my first Swift proposal regarding the endIndex on
`Collections` and closed `Ranges`. I’ve searched the mailing list archives
for something similar to this but couldn’t find it, so I decided to come
forward.

*The problem:*
I was recently working on a library and used a closed Range to define the
bounds of a board game and the tests that used its endIndex were all
failing. Started debugging it around and, to my complete surprise, found
out that the endIndex of the closed Ranges was always +1 from the value I
initially set. With this, I decided to dive into the Range source code and
discovered that all closed ranges are converted to half-open ones when
initialized:
a) 1..<10 stays 1..<10 (endIndex = 10)
b) 1...10 is converted to 1..<11 (endIndex = 11)

To work around this behavior I had to subtract 1 every time I checked for
the endIndex, since I didn't want to use last! (force unwrapping) or
if-let, which ultimately polluted my code.

--
Dave


(Andrey Tarantsov) #3

Pedro,

endIndex is exclusive by definition. You can iterate over any collection by going from startIndex until you reach endIndex.

If we change endIndex on ranges, *all* generic iteration code will be broken when dealing with ranges, even the for statement.

I understand that there are two concerns here:

1) The desire to keep the semantics of 1...10 intact, so that you can access that 10 somehow (perhaps something like lastValidIndex?), the description returns the expected value, etc. To achieve this, a different class (ClosedRange) would probably make more sense, because there's no need to make this decision at runtime.

2) The desire to change endIndex specifically. This is a no-go.

A.


(Andrew Bennett) #4

Hi Pedro,

I'm going to refer to what you're proposing as *lastIndex* to distinguish
it from the current endIndex.

Worth considering is indexes for things like a linked list (or other
ForwardIndexType collections):

   - You can represent the endIndex in O(1) by using a nullable internally.
   - It's likely to be an O(n) operation resolve the *lastIndex* property*.*

The current system seems much cleaner to represent empty ranges:

   - How do you represent an empty range with *lastIndex*? Range(start: 0,
   end: -1) ?
   - Does this mean that UInt cannot be a ForwardIndexType, or cannot be
   used with Range, how do you show this in the protocol?

You should also consider migration in your proposal:

   - I don't think migration will be trivial, or even necessarily safely
   automated. I think the changes will be extensive.
   - Reusing the same name will likely make it confusing when people google
   Swift code, it will be unclear if code is in the old system or the new one.
   - If you do a formal proposal I recommend it proposes to deprecate the
   current system and introduce a new property with a different, clearer,
   name. I recommend *lastIndex*.

It's probably worthwhile for you to consider your proposed changes in the
context of this upcoming proposal, I think it's very likely to go ahead in
some form:

https://github.com/gribozavr/swift-evolution/blob/new-collections/proposals/NNNN-collections-move-indices.md

I'm in favour of having a new field:
    // for example
    var lastIndex: Index? {
        let count = self.count
        guard count > 0 else { return nil }
        return startIndex.advancedBy(count-1)
    }

It's optional to represent an empty collection.

I'm not in favour of the proposal as it is stated. It's a fairly well
established convention for endIndex to be the index after the last element,
although perhaps it could be called something clearer. I think that
discussion is a good idea.

Andrew Bennett

···

On Thu, Mar 24, 2016 at 11:15 AM, Dave Abrahams via swift-evolution < swift-evolution@swift.org> wrote:

on Wed Mar 23 2016, Pedro Vieira <swift-evolution@swift.org> wrote:

> Hi Swift developers,
> I hereby present my first Swift proposal regarding the endIndex on
> `Collections` and closed `Ranges`. I’ve searched the mailing list
archives
> for something similar to this but couldn’t find it, so I decided to come
> forward.
>
> *The problem:*
> I was recently working on a library and used a closed Range to define the
> bounds of a board game and the tests that used its endIndex were all
> failing. Started debugging it around and, to my complete surprise, found
> out that the endIndex of the closed Ranges was always +1 from the value I
> initially set. With this, I decided to dive into the Range source code
and
> discovered that all closed ranges are converted to half-open ones when
> initialized:
> a) 1..<10 stays 1..<10 (endIndex = 10)
> b) 1...10 is converted to 1..<11 (endIndex = 11)
>
> To work around this behavior I had to subtract 1 every time I checked for
> the endIndex, since I didn't want to use last! (force unwrapping) or
> if-let, which ultimately polluted my code.

Please see the work currently underway in

https://github.com/apple/swift/blob/swift-3-indexing-model/stdlib/public/core/ClosedRange.swift
,
which we intend to propose soon. ClosedRange becomes a separate type
that represents its upperBound without modification.

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Andrey Tarantsov) #5

PS:

endIndex is exclusive by definition

If you want a rationale on this...

1) It's how cursors typically work, and people familiar with the concept already expect this.

2) It's often the most convenient choice. Back in the day, you would write C code like this:

  for (int *cur = array, *end = array + size; cur < end; ++cur) {
            process(*cur);
        }

    Similarly, you can compute the number of elements by just getting the distance between startIndex and endIndex (or end - cur in the C example above).

    Making the upper boundary inclusive would require explicit '- 1' and '+ 1' in quite a few places, which is ugly and unnecessary.

3) The upper bound is almost always excluded in programming, in many unrelated APIs (e.g. arc4random_uniform), so it's a good default choice.

A.


(Howard Lovatt) #6

+1 I think it is a good idea to make a collection run from firstIndex to
lastIndex inclusively (note name change to match firstElement and
lastElement). For an empty collection both firstIndex and lastIndex would
be invalid values that would cause both c[c.firstIndex] and c.[c,lastIndex]
to fail.

  -- Howard.

···

On 24 March 2016 at 14:07, Andrew Bennett via swift-evolution < swift-evolution@swift.org> wrote:

Hi Pedro,

I'm going to refer to what you're proposing as *lastIndex* to distinguish
it from the current endIndex.

Worth considering is indexes for things like a linked list (or other
ForwardIndexType collections):

   - You can represent the endIndex in O(1) by using a nullable
   internally.
   - It's likely to be an O(n) operation resolve the *lastIndex* property
   *.*

The current system seems much cleaner to represent empty ranges:

   - How do you represent an empty range with *lastIndex*? Range(start:
   0, end: -1) ?
   - Does this mean that UInt cannot be a ForwardIndexType, or cannot be
   used with Range, how do you show this in the protocol?

You should also consider migration in your proposal:

   - I don't think migration will be trivial, or even necessarily safely
   automated. I think the changes will be extensive.
   - Reusing the same name will likely make it confusing when people
   google Swift code, it will be unclear if code is in the old system or the
   new one.
   - If you do a formal proposal I recommend it proposes to deprecate the
   current system and introduce a new property with a different, clearer,
   name. I recommend *lastIndex*.

It's probably worthwhile for you to consider your proposed changes in the
context of this upcoming proposal, I think it's very likely to go ahead in
some form:

https://github.com/gribozavr/swift-evolution/blob/new-collections/proposals/NNNN-collections-move-indices.md

I'm in favour of having a new field:
    // for example
    var lastIndex: Index? {
        let count = self.count
        guard count > 0 else { return nil }
        return startIndex.advancedBy(count-1)
    }

It's optional to represent an empty collection.

I'm not in favour of the proposal as it is stated. It's a fairly well
established convention for endIndex to be the index after the last element,
although perhaps it could be called something clearer. I think that
discussion is a good idea.

Andrew Bennett

On Thu, Mar 24, 2016 at 11:15 AM, Dave Abrahams via swift-evolution < > swift-evolution@swift.org> wrote:

on Wed Mar 23 2016, Pedro Vieira <swift-evolution@swift.org> wrote:

> Hi Swift developers,
> I hereby present my first Swift proposal regarding the endIndex on
> `Collections` and closed `Ranges`. I’ve searched the mailing list
archives
> for something similar to this but couldn’t find it, so I decided to come
> forward.
>
> *The problem:*
> I was recently working on a library and used a closed Range to define
the
> bounds of a board game and the tests that used its endIndex were all
> failing. Started debugging it around and, to my complete surprise, found
> out that the endIndex of the closed Ranges was always +1 from the value
I
> initially set. With this, I decided to dive into the Range source code
and
> discovered that all closed ranges are converted to half-open ones when
> initialized:
> a) 1..<10 stays 1..<10 (endIndex = 10)
> b) 1...10 is converted to 1..<11 (endIndex = 11)
>
> To work around this behavior I had to subtract 1 every time I checked
for
> the endIndex, since I didn't want to use last! (force unwrapping) or
> if-let, which ultimately polluted my code.

Please see the work currently underway in

https://github.com/apple/swift/blob/swift-3-indexing-model/stdlib/public/core/ClosedRange.swift
,
which we intend to propose soon. ClosedRange becomes a separate type
that represents its upperBound without modification.

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #7

That direction is pretty much a nonstarter with me. There are *many*
algorithms that are correct for empty collections without any
special-case testing, as long as you use half-open ranges. There's no
reason to make every algorithm that uses indices test for emptiness
before proceeding.

See also
https://www.quora.com/Why-are-Python-ranges-half-open-exclusive-instead-of-closed-inclusive
for philosophical background on why most ranges should be half-open.

···

on Wed Mar 23 2016, Howard Lovatt <swift-evolution@swift.org> wrote:

+1 I think it is a good idea to make a collection run from firstIndex to
lastIndex inclusively (note name change to match firstElement and
lastElement). For an empty collection both firstIndex and lastIndex would
be invalid values that would cause both c[c.firstIndex] and c.[c,lastIndex]
to fail.

--
Dave


(Howard Lovatt) #8

Not sure what you are saying?

    for index in empty.indices {
          // never executed
    }

Would still work.

···

On Friday, 25 March 2016, Dave Abrahams via swift-evolution < swift-evolution@swift.org> wrote:

on Wed Mar 23 2016, Howard Lovatt <swift-evolution@swift.org > <javascript:;>> wrote:

> +1 I think it is a good idea to make a collection run from firstIndex to
> lastIndex inclusively (note name change to match firstElement and
> lastElement). For an empty collection both firstIndex and lastIndex would
> be invalid values that would cause both c[c.firstIndex] and
c.[c,lastIndex]
> to fail.

That direction is pretty much a nonstarter with me. There are *many*
algorithms that are correct for empty collections without any
special-case testing, as long as you use half-open ranges. There's no
reason to make every algorithm that uses indices test for emptiness
before proceeding.

See also

https://www.quora.com/Why-are-Python-ranges-half-open-exclusive-instead-of-closed-inclusive
for philosophical background on why most ranges should be half-open.

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <javascript:;>
https://lists.swift.org/mailman/listinfo/swift-evolution

--
-- Howard.


(Dave Abrahams) #9

Not every algorithm is based on a linear traversal (e.g. binary search,
sort, etc.) For those you really need to access startIndex and endIndex
independently.

···

on Thu Mar 24 2016, Howard Lovatt <swift-evolution@swift.org> wrote:

Not sure what you are saying?

    for index in empty.indices {
          // never executed
    }

Would still work.

--
Dave