[Rant] Indexing into ArraySlice

paiv · July 2, 2018, 6:36am

I appreciate the work put into designing Swift collections, but at this point it neglects a very basic user scenario. When slicing a collection, I expect the slice to behave the same: if I'm not required to use startIndex when indexing an array, I shouldn't be required to use startIndex on its slice.

We all learned arrays in our very first steps of programming. Telling me I should treat arrays as some abstract collection makes arrays a second-class citizen.

Make arrays great again.

palimondo · July 2, 2018, 6:46am

@paiv I’m not interested in arguing with you anymore. That last sentence makes your attitude quite clear and further discussion pointless.

For anybody open to learning new things, it’s part of that journey to realize that Swift isn’t C. It has generalized and logically extended a lot of fundamental concepts and packaged them into syntax that might seem familiar at first glance. But indexing using subscript notation isn’t just primitive pointer arithmetic, as is the case in C.

DevAndArtist · July 2, 2018, 7:04am

I can only second this. I know enough folks who haven't even heard of dynamic arrays and live completely in a world of lists. It's hard to explain to them that there is a fundamental difference between those things. Personally, I really enjoy how Swift implemented the collections in the stdlib. Sure, those are not quite the same if you're coming from a different language, but that's the point here: every language has its own little things we have to discover and understand, otherwise it wouldn't make any sense to switch languages.

The only thing I don't like about slices is that not all of them are implemented in the same manner. For instance, Data from Foundation returns Data as its own slice type, which will probably trap if you forget to wrap it in a new Data instance when it escapes and is then indexed under different assumptions about its indices.

paiv · July 2, 2018, 8:50am

I'm advocating for a simple user need. Array is a basic building block: it has its simplified type form [T], it is integer-indexed, and its index is zero-based. These are facts.

These facts are pushed to new learners and systems programmers. They set up expectations and help you adopt Swift when coming from other languages, since these facts are true in other languages and you learned them in your first days of programming.

But then comes slicing, and everything breaks. Suddenly, having the common mental model of an array is error-prone. You are told your mental model is wrong, that you should not think of arrays as arrays, or of an integer index as an integer index.

This is not the point at which a user should have to learn new concepts and switch mental models. This is the point where Swift breaks expectations. Swift should either not set up these expectations from the start, or make the effort to be consistent and deliver on them.

I'm not accepting the current state of things. Indexing into an array slice should be as simple and logical as

>>> abc = list('abcdef')
>>> abc
['a', 'b', 'c', 'd', 'e', 'f']
>>> some = abc[1:]
>>> some
['b', 'c', 'd', 'e', 'f']
>>> some[1] == abc[1]
False
>>> some[1] == abc[2]
True
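
For comparison, here is a minimal Swift sketch of the same session. The names abc and rest are illustrative (rest stands in for Python's some), and the snippet is not part of the original post. It only shows the behaviour under discussion: the slice keeps the parent's indices.

```swift
// Swift counterpart of the Python session above (illustrative names).
let abc = Array("abcdef")        // [Character]: a, b, c, d, e, f
let rest = abc[1...]             // ArraySlice<Character>: b, c, d, e, f

// The slice shares the parent's indices, so it starts at 1, not 0.
let sliceStart = rest.startIndex             // 1
let sameAsParent = rest[1] == abc[1]         // true, both are "b"

// Python's some[1] corresponds to an offset from startIndex in Swift.
let secondElement = rest[rest.startIndex + 1]   // "c", i.e. abc[2]
```

Subscripting rest[0] here would trap at run time, because 0 is not a valid index of the slice.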

paiv · July 2, 2018, 9:31am

And these errors are not caught by the compiler. You slice an array, and Swift lets you think you are still using the array interface. But you get runtime errors.

yxckjhasdkjh · July 2, 2018, 10:32am

As someone relatively new to Swift, but used to several other programming languages, array slices do feel weird to me. It might be different for beginners, I don't know. I like to think functionally and to me, if I create a slice of an array, it seems weird to me that the result is not an independent piece of data but something that depends on where exactly it came from. It also breaks referential transparency: If slice1 == slice2, then slice1[i] == slice2[i] should also be true for every i in the valid range, which currently isn't the case.
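
A small sketch of the situation described here, with made-up values: two slices that compare equal but have no index in common.

```swift
let numbers = [1, 2, 3, 1, 2, 3]
let slice1 = numbers[0..<3]      // indices 0, 1, 2
let slice2 = numbers[3..<6]      // indices 3, 4, 5

// Equality compares elements in order, not the indices they sit at.
let slicesAreEqual = slice1 == slice2                 // true

// The valid index ranges do not overlap at all, so slice2[0] would trap.
let indices1 = Array(slice1.indices)                  // [0, 1, 2]
let indices2 = Array(slice2.indices)                  // [3, 4, 5]

// Comparing by offset from startIndex is safe for both slices.
let firstElementsMatch = slice1[slice1.startIndex] == slice2[slice2.startIndex]  // true
```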

I also kind of moderately dislike the need for the back-and-forth conversion between different kinds of sequences (e.g. recently having had to call Array() on a zip result), reminds me a bit too much of Java, but I can understand that wanting to reason about performance more explicitly implies these kinds of tradeoffs.

Personally, I only start caring about performance once there's a bottleneck, so I'd be the guy who just wraps things in Array() so they're easier to use. YMMV.

jawbroken · July 2, 2018, 11:23am

This is a bad assumption for indices throughout Swift, and is unrelated to slicing.

You can find a complete set of a collection’s valid indices by starting with the collection’s startIndex property and finding every successor up to, and including, the endIndex property. All other values of the Index type, such as the startIndex property of a different collection, are invalid indices for this collection.
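
As a sketch of what that rule looks like in practice (the collection and variable names are illustrative), the loop below walks a slice from startIndex to endIndex using only index(after:), without assuming the indices start at 0:

```swift
let letters = Array("abcdef")[2..<5]   // ArraySlice with elements c, d, e

// Walk from startIndex to endIndex using only the collection's own API.
var position = letters.startIndex
var visited: [Int] = []
while position != letters.endIndex {
    visited.append(position)
    position = letters.index(after: position)
}
// visited is [2, 3, 4]; endIndex (5) is a valid position but not subscriptable.
```

In everyday code the indices property gives the same subscriptable positions without writing the loop by hand.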

Tino · July 2, 2018, 12:46pm

How come?
I think it's sensible to assume that an object can be used in place of another object that is equal, without any problems. Given the current behavior, imho it would actually be better if slices weren't Equatable at all...

slightly OT:
Wouldn't you say that every collection should have an Int-based subscript? Afair you have been strongly supporting the current behavior of Set and Dictionary, and when I can retrieve the first three elements of a collection, shouldn't it be as straightforward to retrieve the element with index 2?

yxckjhasdkjh · July 2, 2018, 12:52pm

I respect that there are valid design decisions that led to the current implementation.

But since the question was raised in this thread whether the array slice indexing behaviour is intuitive for newcomers (to the language) or not, I just wanted to provide my perspective. Personally I find it not very desirable to break equational reasoning in such a drastic way (just making slices not conform to Equatable, as proposed above, would already help). It is certainly a stumbling block; not everyone who picks up a language will read the guides very thoroughly, so this can trip them up.

jawbroken · July 2, 2018, 1:05pm

Sure, that would be great, but for many practical reasons this is not possible. So there are aspects of a type that are documented as being salient for equality and those that aren't. As a simple example, consider ObjectIdentifier. Then try to implement your own efficient versions of standard library data structures. Again, this is nothing to do with slicing; this doesn't hold for collections in general because the indices themselves are not salient for equality (i.e. Collections that are equal do not have equal indices), so removing Equatable from slices doesn't help at all.

No, because that would strongly imply some things about performance that aren't true, and lead to a lot of practical performance issues. It's not straightforward to retrieve the element with index n, because it's generally O(n), not O(1) as a subscript would suggest. You remember incorrectly, as I've stated that I think Collection conformance would ideally be moved to a view on Set and Dictionary, which would be declared to be non-salient for equality. I'm not in favour of a more complex protocol hierarchy with no demonstrated benefit, or pretending that being able to use something in a for-in loop is unrelated to all the other Sequence methods.
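
To make the cost argument concrete, here is a hedged sketch with invented values. For collections that are not random-access, such as String and Set, reaching the element at offset n means stepping n times from startIndex with index(_:offsetBy:), which is why no Int subscript is offered.

```swift
let word = "slices"

// String is not random-access: reaching offset 2 walks two steps from startIndex.
let thirdPosition = word.index(word.startIndex, offsetBy: 2)
let thirdCharacter = word[thirdPosition]              // "i"

let fruits: Set<String> = ["apple", "banana", "cherry"]

// Same pattern for Set; iteration order is unspecified, so the result may vary between runs.
let someIndex = fruits.index(fruits.startIndex, offsetBy: 2)
let someFruit = fruits[someIndex]
```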

reitzig · July 5, 2018, 5:32pm

At least when you have learned about algorithms in any form of CS course, you'll already know how to adjust indices; it's in every algorithm. The inevitable off-by-ones you catch with testing.

So no, I don't think this is more usable. It is bound to surprise many developers and seems to directly violate value semantics.

Joe_Groff · July 5, 2018, 5:36pm

There are tradeoffs to both approaches, sure, but confusion about the indexing model should just as readily be caught by testing as by off-by-one errors, since you'll trap fairly quickly if you try to use the parent collection's index range on a slice. Value semantics is preserved since the slice independently maintains its index range regardless of whether it's modified.
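
A minimal sketch of both points, using invented values: mutating a stored slice leaves the parent untouched, and the slice keeps its own index range throughout.

```swift
let parent = [10, 20, 30, 40, 50]
var slice = parent[2...]          // elements 30, 40, 50 at indices 2, 3, 4

// Mutating the slice copies on write; the parent array is untouched.
slice[2] = 99
slice.append(60)

let parentUnchanged = parent == [10, 20, 30, 40, 50]   // true
let sliceRange = slice.indices                          // 2..<6
let zeroIsValid = slice.indices.contains(0)             // false: slice[0] would trap
```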

reitzig · July 5, 2018, 5:36pm

I think the crucial thing here is expectations.

We don't write down types because they are inferred. Looking at this, I don't care about the concrete type of some. It clearly is something that behaves like an array (or, generally, collection) since it's a subarray (or subcollection). And those are indexed from 0 to n-1. Anything else is surprising, and therefore bad.

That the indices are inconsistent within a method chain is appalling.

reitzig · July 5, 2018, 5:48pm

And then I stare at the code and debug-printlns, unable to find the issue. In fact, I think I created a bug report about a similar thing out of desperation once.

Depends on how you look at it. Yes, slices are values. But some does not stand on its own: its indices depend on abc. some could have the same content but wildly different indices depending on where it came from.

To be clear, technically this is all fine, and I'm sure there are valid technical reasons. What I'm saying is that it's horrible UX, and therefore bad API design. (And yes, that's an opinion, not objective fact.)

Joe_Groff · July 5, 2018, 7:08pm

That's fair, I personally somewhat share your opinion even. I feel like thinking of what Swift calls indexes as somewhat-memory-and-invalidation-safer versions of pointers or STL iterators, instead of indexes as other languages treat the term, is closer to what the abstraction in Swift is trying to achieve.

taylorswift · July 5, 2018, 7:36pm

for what it’s worth I think slices should index starting at 0. having them share indices with the parent array is inconsistent with most other languages (python, C, etc) which I think is far more of a problem than anything else. While I’m not saying Swift should do something just bc all the other languages do it, it does cause problems, even for people like me who know about the behavior, and it’s not something that you get a warning in the compiler for either. if you’re lucky you’ll catch the bug in testing. And if you’re like 95% of Swift developers who omit the type annotations unless the compiler complains, it will be very, very hard to root out the bug just by inspecting the source unless you’ve been doing this for a long time.

Also: adopting 0-based indexing does have the benefit of simplifying slicing for some types like UnsafeBufferPointer. there is no other reason for that type’s slice to return a monstrosity like Slice<UnsafeBufferPointer<Element>>.
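
For readers who have not run into that type, here is a small sketch with invented values. It shows the Slice wrapper that slicing a buffer produces, and the standard library's UnsafeBufferPointer(rebasing:) initializer, which turns it back into a plain zero-based buffer.

```swift
let values = [10, 20, 30, 40]

let (bufferSliceStart, rebasedStart, firstRebased) = values.withUnsafeBufferPointer {
    (buffer: UnsafeBufferPointer<Int>) -> (Int, Int, Int) in
    let slice = buffer[1...]                              // Slice<UnsafeBufferPointer<Int>>
    let rebased = UnsafeBufferPointer(rebasing: slice)    // UnsafeBufferPointer<Int>
    return (slice.startIndex, rebased.startIndex, rebased[0])
}
// bufferSliceStart is 1, rebasedStart is 0, firstRebased is 20
```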

Joe_Groff · July 5, 2018, 8:00pm

In the current model, it'd be a reasonable alternative design for UnsafeBufferPointer to be its own slice type, and use UnsafePointer as its "index" type.

Dante-Broggi · July 5, 2018, 9:13pm

I think perhaps ArraySlice, and possibly also Array, for redundant consistency, should have an additional subscript slice[rebased: Int] which would pretend the slice was 0-based, so that there is a trivial direct conversion from algorithms in languages which automatically rebase slices to Swift.

ArraySlice cannot rebase its indices because not rebasing indices is part of the Collection.Subsequence semantics.
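
A rough sketch of what such a subscript could look like as a user-side extension. The rebased: label comes from the suggestion above; the subscript is hypothetical and not part of the standard library, and the implementation is just one assumed way to write it.

```swift
extension ArraySlice {
    /// Hypothetical zero-based accessor: offset 0 is the slice's first element.
    /// Not part of the standard library.
    subscript(rebased offset: Int) -> Element {
        get { return self[startIndex + offset] }
        set { self[startIndex + offset] = newValue }
    }
}

let letters = Array("abcdef")
var tail = letters[2...]                 // elements c, d, e, f at indices 2...5

let first = tail[rebased: 0]             // "c", the same element as tail[2]
tail[rebased: 1] = "X"                   // writes to index 3 of the slice only
```

The regular Int subscript is left untouched, so code that relies on shared indices keeps working.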

Tino · July 6, 2018, 6:25am

I really like this thread, just because it is much more educational than an average rant.
I'm not sure if I have ever seen a slice in a real-world program, so I don't think their pitfalls are a big threat, but I'll probably think twice before writing "Slice" in a property or parameter declaration.

The docs even say

Long-term storage of ArraySlice instances is discouraged

but who reads the docs anyway? ;-)

One use case for slices is ad-hoc modification of their underlying collection, and the current behavior seems to be the best fit for that task.

But when you store a slice, things change a bit:

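var array0 = Array(stride(from: 0, to: 16, by: 2)) // [0, 2, 4, 6, 8, 10, 12, 14]
var array1 = array0

// Reverse the tail of array1 in place, in one step.
array1[array1.firstIndex(of: 12)!...].reverse()

// Do "the same" to array0, but store the slice first.
var slice = array0[array0.firstIndex(of: 12)!...]
slice.reverse()

print(array0) // [0, 2, 4, 6, 8, 10, 12, 14]
print(array1) // [0, 2, 4, 6, 8, 10, 14, 12]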

I think it is easy to assume that both arrays are identical - I only stored an intermediate step, didn't I?
But just as changing array1 doesn't change array0, changing a stored slice of array0 doesn't alter it either.
COW kicks in, and it looks like the whole underlying array is duplicated:

print(array0[slice.firstIndex(of: 12)!]) // no, not 12 - it's 14

So, transforming a Slice into an Array might not be that terrible - and although I'm not sure how reliable it is, you can even implement slicing functionality that returns a zero-based result:

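extension Array {
    // Overload of suffix(_:) that returns a fresh, zero-based Array
    // instead of an ArraySlice sharing the parent's indices.
    public func suffix(_ maxLength: Int) -> Array<Element> {
        // The type annotation selects the standard library's slice-returning suffix(_:).
        let slice: Array.SubSequence = self.suffix(maxLength)
        return Array(slice)
    }
}

let test = array0.suffix(3)
print(test.firstIndex(of: 10)!) // 0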

My personal conclusion: It's fun to explore Slices - and reading documentation doesn't hurt either ;-)

beccadax · July 6, 2018, 1:47pm

Putting specific algorithms aside, as a purely practical matter, our current slices save us from the range: parameters which infest so many Foundation classes. The fact that you can say foo[i..<j].index(of: bar) means that we don’t need index(of:range:), and the same is true of a hundred other APIs in the standard library. Personally, I find that really nice.
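
A short sketch of that pattern with invented values, assuming Swift 4.2 or later, where index(of:) is spelled firstIndex(of:). Because the slice shares the parent's indices, the result of searching the slice can be used on the original array directly.

```swift
let words = ["a", "b", "c", "b", "a"]

// Search only positions 2..<5; no separate range: parameter is needed.
let found = words[2..<5].firstIndex(of: "b")          // Optional(3)

// The result is expressed in the parent's indices, so it can be used directly.
let element = found.map { words[$0] }                 // Optional("b")
```

With zero-based slices the same search would return 1, and the caller would have to add the lower bound back by hand.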

Bottom line: The more you pretend that Array.Index is not Int, the happier you’ll be.