Proposal: Add Safe Subsequence Access Via subscript For CollectionType


(Daniel Duan) #1

In CollectionType, a `Range` is accepted as the argument in a version of `subscript`, which returns a subsequence.

    [1,2,3,4][2...3] // [3, 4]

`subscript` raises a fatal error if the range is out of bounds, which is really a side effect of accessing an element with an out-of-bounds index. This behavior forces users to check bounds beforehand. It has been serving us well.
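
A minimal sketch of the "check bounds beforehand" pattern described above, written against current Swift. The names `numbers`, `wanted`, and `inBounds` are illustrative assumptions, not part of the proposal:

```swift
let numbers = [1, 2, 3, 4]
let wanted = 2...4          // index 4 is one past the last valid index (3)

// Check both ends of the range against the valid indices before slicing.
let inBounds = numbers.indices.contains(wanted.lowerBound)
    && numbers.indices.contains(wanted.upperBound)

// Slice only when the check passes; `numbers[wanted]` alone would trap here.
let slice: ArraySlice<Int> = inBounds ? numbers[wanted] : []
print(Array(slice))         // empty, because the check failed
```

The caller has to remember to write this check every time, which is the burden the proposal is trying to move into the library.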

I propose adding a new interface where the user can recover from, or defer, an out-of-bounds error in this context. Here are two potential approaches.

Approach #1 is more conservative: we add a throwing version of `subscript` that throws an error if the range is out of bounds. We can give the range parameter a label to distinguish it; the resulting usage would look like:

    do {
        let gimme = try [1,2,3,4][safe: 2...4]
    } catch {
        recover()
    }
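
One way Approach #1 could behave, sketched as a throwing method rather than a subscript so that it compiles on a wide range of Swift versions. `SliceError` and `slice(safe:)` are hypothetical names introduced only for this illustration:

```swift
// Hypothetical error type for this sketch.
enum SliceError: Error {
    case outOfBounds
}

extension Array {
    /// Hypothetical throwing counterpart of the range subscript.
    func slice(safe range: ClosedRange<Int>) throws -> ArraySlice<Element> {
        // Throw instead of trapping when either end is outside the array.
        guard range.lowerBound >= 0, range.upperBound < count else {
            throw SliceError.outOfBounds
        }
        return self[range]
    }
}

do {
    let gimme = try [1, 2, 3, 4].slice(safe: 2...4)
    print(Array(gimme))
} catch {
    // Taken here: index 4 is out of bounds for a four-element array.
    print("recovered from \(error)")
}
```

The caller still has to write `try`, so the possibility of failure stays visible at the call site.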

As an alternative, we can replace the original `subscript` with this version, breaking backward compatibility.

Approach #2 is really sweet syntactic sugar. We add a new subscript that accepts two arguments:

    extension CollectionType where Self.Index: RandomAccessIndexType {
        public subscript(start:Int?, end:Int?) -> Self.SubSequence { ... }
    }

This version would make ANY combination of arguments safe by enabling semantics similar to Python's slicing mechanism. Explanations come after these examples:

    [0,1,2,3][1, -1] // [1, 2]
    ["H","e","l","l","o"][-1000, nil] // ["H","e","l","l","o"]
    [1,2,3,4,5,6,7,8][1,5][2,3] // [4]

This should look familiar to Python users:

* Access is always clamped in-bounds: an out-of-bounds number is interpreted as the boundary. [1,2,3][0, 100] means [1,2,3][0, 3].
* nil indicates the boundary itself. [1,2,3][0, nil] means [1,2,3][0, 3].
* A negative index counts from the end of the sequence. [1,2,3][-2, -1] means [1,2,3][(3-2), (3-1)].
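
The three rules can be sketched roughly as follows. This is not the linked implementation; it is an assumption-laden illustration written against current Swift (`RandomAccessCollection` instead of the Swift 2 `CollectionType` constraint), and it assumes the end offset is exclusive, as the examples above imply. The helper name `clamped(_:or:)` is made up for this sketch:

```swift
extension RandomAccessCollection {
    /// Hypothetical Python-style slice. Offsets are relative to `startIndex`,
    /// `end` is exclusive, `nil` means the boundary on that side,
    /// and negative offsets count back from the end.
    subscript(start: Int?, end: Int?) -> SubSequence {
        let n = count

        // Map an optional, possibly negative offset into 0...n.
        func clamped(_ offset: Int?, or fallback: Int) -> Int {
            guard let offset = offset else { return fallback }
            let resolved = offset < 0 ? offset + n : offset
            return Swift.min(Swift.max(resolved, 0), n)
        }

        let lower = clamped(start, or: 0)
        // Never let the upper bound fall below the lower bound.
        let upper = Swift.max(clamped(end, or: n), lower)
        return self[index(startIndex, offsetBy: lower)..<index(startIndex, offsetBy: upper)]
    }
}

let digits = [0, 1, 2, 3]
print(Array(digits[1, -1]))   // [1, 2]
```

Because offsets are measured from `startIndex`, chained calls such as `[1,2,3,4,5,6,7,8][1,5][2,3]` behave the same on a slice as on an array, even though a slice keeps the indices of its base collection.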

Admittedly, this syntactic sugar seems a little out-of-place in Swift :stuck_out_tongue:

Both approaches require just a little work. As an example, here's one implementation of the second: https://github.com/dduan/Lic/blob/master/Lic/Lic.swift (please ignore the extension for String; that would be in a separate proposal, if any).

What do you think?

- Daniel Duan


(Jordan Rose) #2

Hi, Daniel. Thanks for bringing this up. May I ask where you would use a "safe" subscript? When are you performing a subscript where the bounds being…um, out-of-bound…is not a programmer error?

As for the second half of this, we deliberately decided to not make the subscript operator "smart" (handling negative indexes and such) because extra branches can do very bad things to performance, and because allowing negative indexes sometimes hides bugs. It's also not meaningful for collections whose indexes are not integers.

Best,
Jordan

···

On Dec 14, 2015, at 12:52, Daniel Duan via swift-evolution <swift-evolution@swift.org> wrote:


(Daniel Duan) #3

IMHO, Swift could give users more help to reduce boundary-related errors. Perhaps a more practical thing to have is `array[clamp: x...y]`, which always returns a `SubSequence`.
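
A rough sketch of what such a `clamp:` subscript might look like on `Array`, in current Swift. The label comes from the post; the clamping details are assumptions made for illustration:

```swift
extension Array {
    /// Hypothetical clamping subscript: never traps, and returns an empty
    /// slice when the requested range lies entirely outside the array.
    subscript(clamp range: ClosedRange<Int>) -> ArraySlice<Element> {
        // Clamp the lower bound into 0...count.
        let lower = Swift.min(Swift.max(range.lowerBound, 0), count)
        // A closed upper bound u corresponds to the half-open bound u + 1;
        // clamp before adding 1 so that Int.max cannot overflow.
        let upper = Swift.min(Swift.max(range.upperBound, -1), count - 1) + 1
        return self[lower..<upper]
    }
}

let array = [1, 2, 3, 4]
print(Array(array[clamp: 2...10]))    // [3, 4]
print(Array(array[clamp: 10...20]))   // []
```

Unlike the throwing variant, this one silently narrows the request, so the caller cannot tell whether clamping happened without comparing counts.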

Along that line, I realize “safe” may be conveying the wrong idea.

Right now, a user has to remember that `someArray[3]` may crash while they don’t need to worry about `someDictionary[3]` crashing. I guess what’s really missing is a throwing version of `subscript`, or one such that nil is returned for invalid indexes.
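
The nil-returning variant is commonly written as a small extension like the one below. The `safe` label is an illustrative choice, not an existing standard library API:

```swift
extension Collection {
    /// Hypothetical nil-returning subscript; `safe` is an illustrative label.
    subscript(safe index: Index) -> Element? {
        // `indices` holds exactly the valid positions of this collection.
        return indices.contains(index) ? self[index] : nil
    }
}

let someArray = [10, 20, 30]
let missing = someArray[safe: 3]    // nil instead of a crash
let present = someArray[safe: 1]    // Optional(20)
```

For arrays the `indices.contains` check is cheap because `indices` is a range; for other collections it may cost more, so this is a sketch rather than a drop-in answer.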

As for negative indexes, I accept the performance argument. “Sometimes hides bugs” is less convincing, since “crash early in production” is, depending on context, not necessarily the best way for a programmer error to manifest. (Example: crashing is the worst user experience on iOS, whereas a wrong set of data *sometimes* gets along with the view layer just fine.) From personal experience with Python, it’s a really useful feature most of the time.

Regardless, this proposal doesn’t change any existing behavior: negative indexes are allowed only when retrieving a subsequence (where `Self.Index: RandomAccessIndexType`, so that `count` is accessible at O(1)). But I understand that the idea as a whole is a bit out there :stuck_out_tongue:

Best,
Daniel

···

On Dec 14, 2015, at 1:54 PM, Jordan Rose <jordan_rose@apple.com> wrote:


(Dennis Lysenko) #4

Jordan, I think the inspiration here might come from Ruby. I must admit
that seeing that `Array.first` returns an optional, but Array#subscript
raises a runtime error when the index is out of bounds threw me for a loop.
In Ruby, both Array#first and Array.subscript return an optional.

If one of the original tenets of Swift was to provide greater compile-time
null-safety, which it definitely seems it was given the commendable
emphasis on optionals being easy to use, then returning an optional would
be a solid way to go about subscripting. Think of it this way: when I call
a method with a nullable return value, I am forced at compile time to deal
with the fact that the method can fail. When I subscript an array, I am not
forced to deal with it at compile time, and it will fail at runtime instead.
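
The contrast can be seen in a few lines of current Swift. The variable names are illustrative:

```swift
let scores: [Int] = []

// `first` has type Int?, so the empty case must be handled before use.
let top = scores.first ?? 0

// The plain subscript has type Int and compiles without any check,
// but `scores[0]` would trap at run time on an empty array,
// so the guard has to be written by hand.
let firstScore = scores.isEmpty ? 0 : scores[0]
```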

Nullable subscripting is a big departure from the way most modern languages
do things and that is why I don't blame you for rejecting it. That said, it
is a pleasant change in the way you think about subscripting.

As a closing thought, subscripting hashes returns an optional value. You
might consider this a pretty big inconsistency with arrays. Let me flip
your argument against optional array subscripting, for dictionaries: *When
you are performing a subscript where the key is out of the key set, is it
not a programmer error?*


(Jordan Rose) #5

These are good points to bring up, Dennis. I'm not one of the standard library authors, so I might not get this exactly right, but I'll try to address each point.

> On Dec 14, 2015, at 15:40, Dennis Lysenko <dennis.s.lysenko@gmail.com> wrote:
>
> Jordan, I think the inspiration here might come from Ruby. I must admit that seeing that `Array.first` returns an optional, but Array#subscript raises a runtime error when the index is out of bounds threw me for a loop. In Ruby, both Array#first and Array.subscript return an optional.

I do remember there being a discussion about this. One of the arguments in favor of the current behavior was "seq.first ?? defaultValue", which isn't too uncommon. The equivalent with an arbitrary subscript comes up much less often.

> If one of the original tenets of swift was to provide greater compile-time null-safety, which it definitely seems it was given the commendable emphasis on optionals being easy to use, then returning an optional would be a solid way to go about subscripting. Think of it this way: when I call a method with nullable return value, I am forced to deal with the fact that the method can fail at compile time. When I subscript an array, I am not forced to deal with it at compile time, and it will fail at runtime instead.

From my perspective, array subscripting is not "an operation that can fail". It just has a precondition on its input parameter. Optional return values force you to deal with dynamic failures, but array subscripting should never get to that point.

I think the equivalent would be forcing the user to check the input rather than the output, just as you are forced to check whether an optional is nil before using it rather than after. If you ignore a method return value you're not actually dealing with its failure.

> Nullable subscripting is a big departure from the way most modern languages do things and that is why I don't blame you for rejecting it. That said, it is a pleasant change in the way you think about subscripting.
>
> As a closing thought, subscripting hashes returns an optional value. You might consider this a pretty big inconsistency with arrays. Let me flip your argument against optional array subscripting, for dictionaries: When you are performing a subscript where the key is out of the key set, is it not a programmer error?

No, it is not; it is the canonical way to tell if a key has an entry in the dictionary, and the canonical way to insert a new entry into the dictionary. The same is not true for Array.

(Note also that Dictionary's subscript that takes an Index does not return an optional result.)

Jordan
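
The two `Dictionary` subscripts mentioned in this post can be sketched in current Swift as follows. The dictionary contents and variable names are made up for illustration:

```swift
var ages = ["alice": 30]

// Key subscript returns Int?; nil is how you learn a key has no entry.
let hadBob = ages["bob"] != nil

// Assigning through the same subscript inserts a new entry.
ages["bob"] = 25

// Index subscript returns a non-optional (key, value) pair; the index
// itself must already be valid, much like an array index.
var aliceAge = 0
if let i = ages.index(forKey: "alice") {
    aliceAge = ages[i].value
}
```

Here the optional appears when the index is looked up (`index(forKey:)`), not when it is used, which mirrors the "check the input rather than the output" point.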


(Brent Royal-Gordon) #6

> As a closing thought, subscripting hashes returns an optional value. You might consider this a pretty big inconsistency with arrays. Let me flip your argument against optional array subscripting, for dictionaries: When you are performing a subscript where the key is out of the key set, is it not a programmer error?

The use cases for arrays and dictionaries are different, though.

I’d say about 80% of the time you subscript an array, you’re using an index that was somehow derived *from* the array: for instance, a range like `0..<array.count`, or `array.indices`, or `array[indexPath.row]` where `tableView(_:numberOfRowsInSection:)` returns `array.count`. This is very different from dictionaries, where the key is usually some piece of data from somewhere *else* and you’re trying to look up the value corresponding to it. You rarely say, for instance, `array[2]` or `array[someRandomNumberFromSomewhere]`, but `dictionary["myKey"]` or `dictionary[someRandomValueFromSomewhere]` are pretty common.

Because the use cases are different, arrays have a non-optional subscript which fails a precondition when the index is invalid, while dictionaries have an optional subscript which returns nil when the key is absent.
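
A small sketch of the two usage patterns in current Swift. The data and names are invented for illustration:

```swift
let rows = ["alpha", "beta", "gamma"]

// Array indices are usually derived from the array itself,
// so they are valid by construction.
var labels: [String] = []
for i in rows.indices {
    labels.append("\(i): \(rows[i])")
}

// Dictionary keys usually come from elsewhere,
// so a miss is an expected outcome rather than a bug.
let settings = ["theme": "dark"]
let requestedKey = "font"                       // e.g. read from user input
let font = settings[requestedKey] ?? "system"   // falls back when the key is absent
```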

--
Brent Royal-Gordon
Architechies


(Dave Abrahams) #7

Thank you, Brent; that captures the rationale exactly. The one thing missing is the performance piece: allowing arrays to do anything other than abort on out-of-range accesses would make them non-competitive with C arrays.

HTH,
-Dave

···

On Dec 14, 2015, at 6:13 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:


(Dennis Lysenko) #8

Sorry, looks like I conflated a few different terms in that message. Just
pretend I said "optional" anywhere I said "nullable", and "dictionary"
anywhere I said "hash". Been context switching among Swift, Java, Ruby and
Kotlin all day.


(Dennis Lysenko) #9

Jordan, thanks for the clarification. It is pretty easy to check if your
index is in-bounds before subscripting. Just unexpected as most things
causing a runtime crash tend to be marked with a '!' (try!, unwrapping an
optional with !, implicitly unwrapped optionals...)

···

On Mon, Dec 14, 2015 at 6:52 PM Jordan Rose <jordan_rose@apple.com> wrote:
