[Rant] Indexing into ArraySlice

Ah yes, I missed that comment.

Could you please link me directly to a post in that thread which mentions the idea of types named Start and End?

To be honest, I am not even 100% positive the exact design you have in mind was brought up there, but I am pretty sure something very similar came up. I mentioned it off the top of my head and don't have time to dig up a specific post.

Then it seems quite…out of place to say it was “thoroughly discussed” there.

In fact, I remember that thread quite well, I was an active participant in it, and to the best of my knowledge the idea of Start and End as freestanding types never arose there. Thus I mentioned them here, as a new possibility.

Oh, I see. I didn't mean to be dismissive. That is how I remembered it. I stand corrected.

Please see my response to @taylorswift above. I don't want to go down that path, in order to avoid complexity.

FWIW that drift came from trying to make something that would be able to offset arbitrary indices and not just startIndex and endIndex. But that dried up, as you mention, so I reverted to the earlier design that did get some acceptance. I did this because it seemed a waste to not even try to solve the original problem simply because a completely general solution could not be devised.

A PR for the proposal has been made. I also posted a link to the proposal in that original thread, but not many people looked at it.

I wrote a little playground to try it out and test it, since I think it really works well when you actually try to use it. But anyway, I'll just drop the proposed solution from the actual proposal here for easy access.

Proposed solution

A solution we propose to this problem is to extend Collection and
RangeReplaceableCollection with a subscript that takes a range, which
would be used to offset an index of a collection.
Collection and MutableCollection will receive subscripts that take a
single Int to return a single element from the collection.

A highly requested ability when getting a slice of a collection is the ability
to offset relative to the endIndex of a collection. Currently range support can
cover most of the desired requests; however, because ranges require
upperBound >= lowerBound, certain use cases are not met. To solve this,
four new operators should be added along with a new type to help model this
behavior. To encapsulate this behavior, a new protocol should also be added.
The other range types will then also conditionally conform to this protocol.

Some example cases of the proposed design are shown below. In each case, the
first operation uses the proposed design, and the code under it shows how one
might do the same operation in Swift currently.

var x = "ABCDEFGHIJ"
// CDEFGH
x[offset: 2..<-2]

x[x.index(x.startIndex, offsetBy: 2)..<x.index(x.endIndex, offsetBy: -2)]
// EFGH
x[offset: 4...7] 

let t = x.index(x.startIndex, offsetBy: 4)
let u = x.index(t, offsetBy: 3)
x[t...u]
// x == ABCDEXYZ
x[offset: 5...] = "XYZ"

x.replaceSubrange(x.index(x.startIndex, offsetBy: 5)..., with: "XYZ")

Now to delete the newly inserted characters that are in the last 3 positions of
the string.

// x == ABCDE
x[offset: (-3)...] = ""

x.replaceSubrange(x.index(x.endIndex, offsetBy: -3)..., with: "")

We replace the middle 3 characters with "..."

// x == A...E
x[offset: 1..<-1] = "..."

let start = x.index(x.startIndex, offsetBy: 1)
let end = x.index(x.endIndex, offsetBy: -1)
x.replaceSubrange(start..<end, with: "...")

An example of an offset being used with a slice:

let y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
let z = y[3..<9] //  40 50 60 70 80 90
// 50 60 70 80
z[offset: 1..<-1]

z[z.index(z.startIndex, offsetBy: 1)..<z.index(z.endIndex, offsetBy: -1)]
z[4..<8]

As shown by the last example, the proposed solution helps when using slices that
don't have a zero-based startIndex. In such situations, working with offsets can
be more natural than working with the slice's own indices.
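
To make the end-relative idea easier to try out, here is a minimal sketch. It is not the proposal's API: the method name slice(fromOffset:toOffset:) is hypothetical, and it takes two plain Int arguments instead of the proposed operators and range type. A standard Range such as 2..<-2 cannot be formed because of the upperBound >= lowerBound requirement mentioned above, so the sketch sidesteps ranges entirely.

```swift
extension Collection {
    /// Hypothetical helper for illustration only (not part of the proposal).
    /// A non-negative offset counts forward from startIndex;
    /// a negative offset counts backward from endIndex.
    func slice(fromOffset lower: Int, toOffset upper: Int) -> SubSequence {
        func resolve(_ offset: Int) -> Index {
            return offset >= 0
                ? index(startIndex, offsetBy: offset)
                : index(endIndex, offsetBy: offset)
        }
        return self[resolve(lower)..<resolve(upper)]
    }
}

let letters = "ABCDEFGHIJ"
let middle = letters.slice(fromOffset: 2, toOffset: -2)   // CDEFGH

let y = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
let z = y[3..<9]                                          // 40 50 60 70 80 90
let inner = z.slice(fromOffset: 1, toOffset: -1)          // 50 60 70 80
```

One limitation of this simplification is that an upper offset of 0 means startIndex, so "up to the end" cannot be expressed with it. That gap is part of what a dedicated type would have to model.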

Might you be thinking about one of the keyPath solutions?

Thank you for clarifying it for us. I have a major concern about your proposal:

Adding offset-based APIs to Collection opens the door for abuse. The performance characteristics of offset operations on non-random-access collections can be extremely poor. Having a very easy to (ab)use offset API on them will steer people towards writing bad code. That is why I am limiting my suggestion to RandomAccessCollection.
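
A small sketch of the kind of code this concern is about, using only existing standard library calls. String is not a RandomAccessCollection, so index(_:offsetBy:) has to walk the string one position at a time. The variable names are made up for illustration.

```swift
let text = "ABCDEFGHIJ"

// Offset-style loop: every lookup walks from startIndex again,
// so on a String the loop as a whole does quadratic work.
var viaOffsets: [Character] = []
for offset in 0..<text.count {
    viaOffsets.append(text[text.index(text.startIndex, offsetBy: offset)])
}

// Index-style loop: each step advances one position from the previous
// index, so the loop as a whole does linear work.
var viaIndices: [Character] = []
var position = text.startIndex
while position != text.endIndex {
    viaIndices.append(text[position])
    position = text.index(after: position)
}
```

Both loops produce the same characters. The difference only shows up as the collection grows, which is what makes the first form easy to write by accident.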

Otherwise, we are pretty close.

For the moment, I am concentrating on keeping it simple and focused. I think what I am proposing, although pretty limited in scope, is still worth considering. We need to define how much additional tangible benefit we get from more complex designs and see if that extra benefit justifies the extra complexity.

This was discussed in that thread, from here down through about 10 posts (not all the posts in between are about the performance).

Thank you for the reference.

I stand by my opinion about restricting offset APIs to RandomAccessCollection despite the arguments brought up there. For me, having a lightweight syntax and semantics is more important than the extra functionality and convenience.

By the way, in order to incorporate offsets relative to other given indexes, I would use a function to first convert the index to an offset, and then I would have the full range of integer operations at my disposal to manipulate it.

Continuing from the main code I posted:

public extension RandomAccessCollection {
    func offset(of index: Self.Index) -> Int {
        return distance(from: startIndex, to: index)
    }
}

// Just an excuse to start with an index:
func doSomething<E>(with slice: ArraySlice<E>, index: ArraySlice<E>.Index) {
    let o1 = slice.offset(of: index)
    print("slice[o: (o1+1)...]=\(slice[o: (o1+1)...])")
}

doSomething(with: b, index: 3)
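
The main code this continues from is not shown in this part of the thread, so the snippet above does not compile on its own. Below is a self-contained sketch of the same idea. The `o:` subscript here is a hypothetical stand-in written only for this example, and the values of `a` and `b` are assumptions; the original definitions may differ.

```swift
extension RandomAccessCollection {
    // Same helper as in the post: converts an index into a zero-based offset.
    func offset(of index: Index) -> Int {
        return distance(from: startIndex, to: index)
    }

    // Hypothetical stand-in for the `o:` subscript from the earlier code.
    // Takes a zero-based offset range and returns the matching slice.
    subscript(o offsets: PartialRangeFrom<Int>) -> SubSequence {
        let lower = index(startIndex, offsetBy: offsets.lowerBound)
        return self[lower...]
    }
}

let a = [10, 20, 30, 40, 50, 60]
let b = a[2...]                 // indices 2...5, elements 30 40 50 60

let o1 = b.offset(of: 3)        // index 3 is at offset 1 within b
let tail = b[o: (o1 + 1)...]    // everything from offset 2 on: 50 60
```

Once the index has been turned into an Int, ordinary integer arithmetic such as `o1 + 1` is enough to express positions relative to it.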

Interesting thought. In that case, "indices" shouldn't be used as subscript parameters, as that form of index has (by other languages, and CS literature) been ingrained to range over (0..n) (for better or worse).

That.

It's also ridiculously easy (if breaking by now), isn't it? With the offset stored internally, only the user interface would change.

I think there's a point here. By subscripting a range out of a(n ordered) collection, I kind of expect to receive a collection of the same type. Maybe wrapped in some form of lazy evaluator (stream?), but still equivalent in a value-sense. So the problem is as much the API of slices as that we get them when we don't necessarily expect to.

Can you elaborate on that? I fail to see how "adding the base offset" to the current code would cause runtimes to change that much.

(I don't have the time to reply to the more meaty responses, sorry. Thanks to everybody who contributes in more constructive ways than I have been!)

Maybe we should have a warning when you subscript Slice or ArraySlice with a literal.
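
For anyone who has not run into it yet, this is the situation such a warning would target. The sketch uses only existing standard library behavior; the variable names are made up.

```swift
let numbers = [10, 20, 30, 40, 50]
let slice = numbers[2...]            // ArraySlice<Int> holding 30 40 50

// The slice keeps the indices of its base array, so it starts at 2, not 0.
let first = slice[slice.startIndex]

// slice[0] would trap at runtime here, because 0 is outside 2..<5.

// Copying into a new Array rebases the elements to zero-based indices.
let rebased = Array(slice)
let rebasedFirst = rebased[0]
```

A literal subscript like `slice[0]` compiles without complaint today, which is why a diagnostic at that point could help.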

The generic design of the Collection protocol family creates a user experience for Array/ArraySlice (and other RandomAccessCollections) that is far from ideal. Unfortunately, we are well past the point where we can truly fix this issue.

We need to do more on various fronts to reduce its negative impact on novice developers and recent switchers. Your suggestion of additional warnings and even outright errors (based on static analysis) will certainly help.

We should also emphasize this behavior in the documentation, starting from the very first introduction of array slices in the Swift book and in other introductory material and examples, to raise awareness of this oddity.

My suggestion of providing parallel offset-based APIs for random access collections (while convenient and useful on its own) can also serve as an alternative that gives developers a more familiar behavior.

ArraySliceWithStride works as you expect.

Code:

let abc = Array("abcdef")
let some = abc[1.~]
some[1] == abc[1]
some[1] == abc[2]
let same = some.map{ $0 }
same[1] == abc[1]
same[1] == abc[2]

Output:

["a", "b", "c", "d", "e", "f"]
[b, c, d, e, f]
false
true
(6 times) // f
false
true

ArraySliceWithStride was made to present a possible implementation of striding operators in my pitch.