Dropping Comparable requirement for indices


(Dave Abrahams) #1

I've already raised the question here of whether we should weaken the
requirement that Collection.Index be Comparable to merely being
Equatable.

Part of the motivation was to support data structures that otherwise
would be expensive or impossible to implement. For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.
Supporting Comparable indices for a tree data structure requires
encoding the path from the root to the node. It's only one or two words
in practice, but it's another awkward inefficiency.

Over the weekend, I tried lifting the restriction to see what kind of
effect it would have on the standard library.
https://github.com/apple/swift/pull/3325

Although I *had* been leaning strongly toward making this change, having
looked at the effects, I am now more ambivalent:

* A Range<T>, where T is not Comparable, could be constructed with any
  pair of Equatable T instances. We'd only detect that the Range may be
  invalid when T happened to also be Comparable. However, the fact that
  you are creating a Range already sort of implies an underlying
  ordering.

* A Collection with non-Comparable indices still imposes an ordering
  upon the indices.

* In a Collection with Comparable indices, the ordering implied by <
  needs to match the ordering of indices in the collection. Otherwise,
  there would be no way to form Ranges (for use in slicing, for example)
  without causing a trap. This is *weird*! There's little precedent
  for imposing stronger semantic requirements on an associated type if
  it conforms to a particular protocol (Comparable in this case), except
  where the requirement is part of the protocol.

Also, Dmitri reminded me that even with a Comparable requirement on
indices, there is still no problem converting an equatable iteration
state into an index; you just need to augment it with an integer
counter. So it's still trivial to create a Collection from nearly
everything that is today a multipass Sequence. It does make indexing
slightly more expensive, but it's likely we'd optimize that overhead
away in many critical situations.

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

Thanks!

···

--
Dave


(Brent Royal-Gordon) #2

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

I've wanted the Index protocols to be Comparable since the Swift 1 days. (Apple people, see rdar://17768142; non-Apple people, I was looking to measure distance and, under the old indexing model, the inability to determine which index came later was one of several impediments.)

Going back to the beginning:

For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.

You *can* have O(1) restructuring without index invalidation if you're willing to tolerate O(n) comparisons; you just walk forward from the left-hand side and return `true` if you encounter the right-hand side before reaching the end. I'm not certain, but I think there's a three-way tradeoff here: you can have any two of fast restructuring, fast comparisons, and always-valid indices.

In general, though, `Collection` is not very good at modeling linked lists—for one thing, it can't model a value-typed linked list without enormous index invalidation issues. I can't quite bring myself to worry too much about linked-list handling here.

* A Range<T>, where T is not Comparable, could be constructed with any
pair of Equatable T instances. We'd only detect that the Range may be
invalid when T happened to also be Comparable. However, the fact that
you are creating a Range already sort of implies an underlying
ordering.

Such a Range can't actually test whether a given value is *in* the range. That basically just makes it a glorified typealias for `(lowerBound: Bound, upperBound: Bound)`. A Range without Comparable is a type without any semantics.

* A Collection with non-Comparable indices still imposes an ordering
upon the indices.

And it also still needs to be able to *test* the ordering of indices. For instance, `distance(from:to:)` is a bit difficult to implement (at least in `BidirectionalCollection`) if you can't tell which index comes earlier in the collection.

···

On Jul 5, 2016, at 7:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
Brent Royal-Gordon
Architechies


(Matthew Johnson) #3

I've already raised the question here of whether we should weaken the
requirement that Collection.Index be Comparable to merely being
Equatable.

Part of the motivation was to support data structures that otherwise
would be expensive or impossible to implement. For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.
Supporting Comparable indices for a tree data structure requires
encoding the path from the root to the node. It's only one or two words
in practice, but it's another awkward inefficiency.

Over the weekend, I tried lifting the restriction to see what kind of
effect it would have on the standard library.
https://github.com/apple/swift/pull/3325

Although I *had* been leaning strongly toward making this change, having
looked at the effects, I am now more ambivalent:

* A Range<T>, where T is not Comparable, could be constructed with any
pair of Equatable T instances. We'd only detect that the Range may be
invalid when T happened to also be Comparable. However, the fact that
you are creating a Range already sort of implies an underlying
ordering.

* A Collection with non-Comparable indices still imposes an ordering
upon the indices.

* In a Collection with Comparable indices, the ordering implied by <
needs to match the ordering of indices in the collection. Otherwise,
there would be no way to form Ranges (for use in slicing, for example)
without causing a trap. This is *weird*! There's little precedent
for imposing stronger semantic requirements on an associated type if
it conforms to a particular protocol (Comparable in this case), except
where the requirement is part of the protocol.

Also, Dmitri reminded me that even with a Comparable requirement on
indices, there is still no problem converting an equatable iteration
state into an index; you just need to augment it with an integer
counter. So it's still trivial to create a Collection from nearly
everything that is today a multipass Sequence. It does make indexing
slightly more expensive, but it's likely we'd optimize that overhead
away in many critical situations.

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

Dmitri's suggestion removes the primary concern I had about retaining the Comparable requirement if we require all multi pass sequences to conform to Collection. (I was focused on comparability of the iteration state itself and didn't consider other ways to meet that requirement)

I agree that the points you list above seem to point in the direction of retaining the Comparable requirement.

···

Sent from my iPad

On Jul 5, 2016, at 9:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

Thanks!

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nate Cook) #4

I've already raised the question here of whether we should weaken the
requirement that Collection.Index be Comparable to merely being
Equatable.

Part of the motivation was to support data structures that otherwise
would be expensive or impossible to implement. For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.
Supporting Comparable indices for a tree data structure requires
encoding the path from the root to the node. It's only one or two words
in practice, but it's another awkward inefficiency.

So far in Swift 3 I've enjoyed the Comparable requirement on indices—it's much easier to perform bounds checks in collection algorithms as a result. However, I haven't tried implementing linked-lists or other non-linear collections since the Comparable requirement was added. I can see that requirement being a significant burden, particularly in a structure like a self-balancing tree, where managing index comparability could grow quite cumbersome.

Over the weekend, I tried lifting the restriction to see what kind of
effect it would have on the standard library.
https://github.com/apple/swift/pull/3325

Although I *had* been leaning strongly toward making this change, having
looked at the effects, I am now more ambivalent:

* A Range<T>, where T is not Comparable, could be constructed with any
pair of Equatable T instances. We'd only detect that the Range may be
invalid when T happened to also be Comparable. However, the fact that
you are creating a Range already sort of implies an underlying
ordering.

This was exactly how Ranges worked from Swift's public launch up until the collections-move-indices changes landed, so it doesn't seem like a major problem to bring this behavior back. If we can promise better/faster bounds checking for Comparable indices, hopefully most collection authors would opt into that whenever possible.

* A Collection with non-Comparable indices still imposes an ordering
upon the indices.

* In a Collection with Comparable indices, the ordering implied by <
needs to match the ordering of indices in the collection. Otherwise,
there would be no way to form Ranges (for use in slicing, for example)
without causing a trap. This is *weird*! There's little precedent
for imposing stronger semantic requirements on an associated type if
it conforms to a particular protocol (Comparable in this case), except
where the requirement is part of the protocol.

This sounds like an argument *for* removing the Comparable requirement. Is the issue that it's difficult to guarantee that the < ordering matches the collection, or is the issue that even if you drop the Comparable requirement, there's an implicit comparability anyway?

Also, Dmitri reminded me that even with a Comparable requirement on
indices, there is still no problem converting an equatable iteration
state into an index; you just need to augment it with an integer
counter. So it's still trivial to create a Collection from nearly
everything that is today a multipass Sequence. It does make indexing
slightly more expensive, but it's likely we'd optimize that overhead
away in many critical situations.

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

I don't know that I've helped at all, but those are my thoughts. Thanks!
Nate

···

On Jul 5, 2016, at 9:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:


(Anton Zhilin) #5

What about dropping all requirements on indices? Collections will handle
comparison of their indices, as well. These methods will be
default-implemented in case Index is actually Comparable.
With this change, ranges really become pairs of indices that mean nothing
without their collection. But I don't think this is somethng catastrophic.


(plx) #6

My own 2c is to drop the comparable requirement.

I can elaborate if necessary, but wanted to at least cast my “+1” for removing the requirement.

···

On Jul 5, 2016, at 9:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

I've already raised the question here of whether we should weaken the
requirement that Collection.Index be Comparable to merely being
Equatable.

Part of the motivation was to support data structures that otherwise
would be expensive or impossible to implement. For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.
Supporting Comparable indices for a tree data structure requires
encoding the path from the root to the node. It's only one or two words
in practice, but it's another awkward inefficiency.

Over the weekend, I tried lifting the restriction to see what kind of
effect it would have on the standard library.
https://github.com/apple/swift/pull/3325

Although I *had* been leaning strongly toward making this change, having
looked at the effects, I am now more ambivalent:

* A Range<T>, where T is not Comparable, could be constructed with any
pair of Equatable T instances. We'd only detect that the Range may be
invalid when T happened to also be Comparable. However, the fact that
you are creating a Range already sort of implies an underlying
ordering.

* A Collection with non-Comparable indices still imposes an ordering
upon the indices.

* In a Collection with Comparable indices, the ordering implied by <
needs to match the ordering of indices in the collection. Otherwise,
there would be no way to form Ranges (for use in slicing, for example)
without causing a trap. This is *weird*! There's little precedent
for imposing stronger semantic requirements on an associated type if
it conforms to a particular protocol (Comparable in this case), except
where the requirement is part of the protocol.

Also, Dmitri reminded me that even with a Comparable requirement on
indices, there is still no problem converting an equatable iteration
state into an index; you just need to augment it with an integer
counter. So it's still trivial to create a Collection from nearly
everything that is today a multipass Sequence. It does make indexing
slightly more expensive, but it's likely we'd optimize that overhead
away in many critical situations.

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

Thanks!

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Haravikk) #7

I think the question is why you need to retain indices in these cases?

When it comes to these operations I wonder if we might want to investigate something like a mutating iterator; you might still use an index to jump to an initial position, but then use .insert(), .remove() etc. methods of the iterator to perform modification without the need to track indices at all. This is essentially how you want to edit trees anyway, as indexing them isn't especially pretty, as it avoids the need to track the indices at all for these operations, and many common cases should work well when done as part of an iterator in this way.

···

On 6 Jul 2016, at 03:39, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.


(Tino) #8

Right now I'm working on a lib for charts/plots, and I choose to create a custom interval-type that has no comparable requirement.
Of course, as soon as I leave the completely abstract and generic world, I need to bring things into an order — but if this order has to be defined by a global function "<", there is a loss in flexibility.
My plan is to use "<" as a default wherever possible to define my ranges, with the option to change it to a custom alternative; maybe similar reasoning can be applied to the index-problem.

Tino

Real world example:
Let's assume I want to create a bar-chart to compare the population of several countries.
There is no natural order for countries (I hope we can agree on that ;-), so no matter how I'd model them, it would feel wrong to define "<":
Sort by name might be the most obvious choice, but it could also be the size, the timezone, average income, the peak hight of its highest mountain...
Additionally, I might want to use different orders for different charts, which is only possible when I can change the function which dictates the order.


(Dave Abrahams) #9

Anyway, the thoughts of the community on this topic would be interesting
to us. We need to make a decision about this very soon.

I've wanted the Index protocols to be Comparable since the Swift 1
days. (Apple people, see rdar://17768142; non-Apple people, I was
looking to measure distance and, under the old indexing model, the
inability to determine which index came later was one of several
impediments.)

Going back to the beginning:

For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.
[While I don't think linked lists are terribly good data structures for
most things, they are still useful in teaching, and it bothered me that
we were ruling them out.] Even if you take away the index invalidation
restriction, you have to store a counter in the index, which is an awkward inefficiency.

You *can* have O(1) restructuring without index invalidation if you're
willing to tolerate O(n) comparisons; you just walk forward from the
left-hand side and return `true` if you encounter the right-hand side
before reaching the end.

Yeah, but O(1) is a basic requirement of Comparable. I consider that to
be inviolable.

I'm not certain, but I think there's a three-way tradeoff here: you
can have any two of fast restructuring, fast comparisons, and
always-valid indices.

In general, though, `Collection` is not very good at modeling linked
lists—for one thing, it can't model a value-typed linked list without
enormous index invalidation issues. I can't quite bring myself to
worry too much about linked-list handling here.

Yes, I'm not too worried about it either. I'm more concerned about
trees, FWIW.

* A Range<T>, where T is not Comparable, could be constructed with any
pair of Equatable T instances. We'd only detect that the Range may be
invalid when T happened to also be Comparable. However, the fact that
you are creating a Range already sort of implies an underlying
ordering.

Such a Range can't actually test whether a given value is *in* the
range. That basically just makes it a glorified typealias for
`(lowerBound: Bound, upperBound: Bound)`.

Yes.

A Range without Comparable is a type without any semantics.

Practically speaking, yes.

* A Collection with non-Comparable indices still imposes an ordering
upon the indices.

And it also still needs to be able to *test* the ordering of
indices. For instance, `distance(from:to:)` is a bit difficult to
implement (at least in `BidirectionalCollection`) if you can't tell
which index comes earlier in the collection.

Well, you can make it a precondition that the indices are in order.
That's what Swift 2 did (and what C++ does with its forward and
bidirectional iterators). It is certainly nice that Swift 3 lifts that
restriction.

···

on Wed Jul 06 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:

On Jul 5, 2016, at 7:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
Dave


(Dave Abrahams) #10

For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.

I think the question is why you need to retain indices in these cases?

When it comes to these operations I wonder if we might want to
investigate something like a mutating iterator; you might still use an
index to jump to an initial position, but then use .insert(),
.remove() etc. methods of the iterator to perform modification without
the need to track indices at all.

There is no way, AFAIK, to implement important algorithms like rotate
binarySearch and several others, without having some representation of
position within a collection.

This is essentially how you want to edit trees anyway, as indexing
them isn't especially pretty, as it avoids the need to track the
indices at all for these operations, and many common cases should work
well when done as part of an iterator in this way.

I don't know what you mean by “track,” here. We don't track the
indices of an array.

···

on Wed Jul 06 2016, Haravikk <swift-evolution@swift.org> wrote:

On 6 Jul 2016, at 03:39, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
Dave


(David Sweeris) #11

Do you need indices to be explicitly comparable for that, or will simply being able to test for equality and being within a "range" work? I realize that in most cases, testing an index for being in a range implies comparable, but what about multi-dimensional indices? Comparison isn't well defined for, say, 2D points, but in theory all the points within a circle or something could be the indices for something.

- Dave Sweeris

···

On Jul 6, 2016, at 20:41, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

There is no way, AFAIK, to implement important algorithms like rotate
binarySearch and several others, without having some representation of
position within a collection.


(Xiaodi Wu) #12

I for one would be interested in your elaboration. Based on Dave's
comments, I'm pretty convinced that the Comparable requirement is best left
in place.

···

On Wed, Jul 6, 2016 at 19:19 plx via swift-evolution < swift-evolution@swift.org> wrote:

My own 2c is to drop the comparable requirement.

I can elaborate if necessary, but wanted to at least cast my “+1” for
removing the requirement.

> On Jul 5, 2016, at 9:39 PM, Dave Abrahams via swift-evolution < > swift-evolution@swift.org> wrote:
>
>
> I've already raised the question here of whether we should weaken the
> requirement that Collection.Index be Comparable to merely being
> Equatable.
>
> Part of the motivation was to support data structures that otherwise
> would be expensive or impossible to implement. For example, with
> Comparable indices, you can't build a linked list that supports
> restructuring (e.g. insert, delete, splice) in O(1) without invalidating
> indices... not even an unsafe linked list with reference semantics.
> [While I don't think linked lists are terribly good data structures for
> most things, they are still useful in teaching, and it bothered me that
> we were ruling them out.] Even if you take away the index invalidation
> restriction, you have to store a counter in the index, which is an
awkward inefficiency.
> Supporting Comparable indices for a tree data structure requires
> encoding the path from the root to the node. It's only one or two words
> in practice, but it's another awkward inefficiency.
>
> Over the weekend, I tried lifting the restriction to see what kind of
> effect it would have on the standard library.
> https://github.com/apple/swift/pull/3325
>
> Although I *had* been leaning strongly toward making this change, having
> looked at the effects, I am now more ambivalent:
>
> * A Range<T>, where T is not Comparable, could be constructed with any
> pair of Equatable T instances. We'd only detect that the Range may be
> invalid when T happened to also be Comparable. However, the fact that
> you are creating a Range already sort of implies an underlying
> ordering.
>
> * A Collection with non-Comparable indices still imposes an ordering
> upon the indices.
>
> * In a Collection with Comparable indices, the ordering implied by <
> needs to match the ordering of indices in the collection. Otherwise,
> there would be no way to form Ranges (for use in slicing, for example)
> without causing a trap. This is *weird*! There's little precedent
> for imposing stronger semantic requirements on an associated type if
> it conforms to a particular protocol (Comparable in this case), except
> where the requirement is part of the protocol.
>
> Also, Dmitri reminded me that even with a Comparable requirement on
> indices, there is still no problem converting an equatable iteration
> state into an index; you just need to augment it with an integer
> counter. So it's still trivial to create a Collection from nearly
> everything that is today a multipass Sequence. It does make indexing
> slightly more expensive, but it's likely we'd optimize that overhead
> away in many critical situations.
>
> Anyway, the thoughts of the community on this topic would be interesting
> to us. We need to make a decision about this very soon.
>
> Thanks!
>
> --
> Dave
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Haravikk) #13

By track I just mean store in a variable.

As an example, consider removing multiple values using a closure:

  let delete:(Int) -> Bool = { ($0 % 2) == 1 } // Delete all odd numbers
  let filtered = myArray.filter { !delete($0) } // Produces new copy of array with elements filtered

  // In-place removal
  var index = myArray.startIndex, indices:[Array<Int>.Index] = []
  for eachElement in myArray {
    if delete(eachElement) { indices.append(index) }
    myArray.formIndex(after: &index)
  }
  for eachIndex in indices { myArray.remove(at: eachIndex }

The latter case only works with types where you know that there's a safe way to use the indices (removing in reverse order doesn't invalidate earlier indices of Array) so it's not suitable for generic collections. Since it requires iterating myArray anyway, then if the removals could be performed at the same time it would eliminate the need to store and then use indices at all, and would be a much better way to work with linked-lists, trees and so-on. So with a mutable iterator for example the in-place removal would look like:

  var iterator = myArray.makeMutatingIterator()
  while let eachElement = iterator.next() {
    if delete(eachElement) { iterator.remove() }
  }

Maybe this isn't directly applicable to this topic, but this for example seems like a better solution to the linked list problems raised than removing Comparable as a requirement which was my intended point, that perhaps this isn't necessary?

Otherwise I think the main issue with types that don't seem like they can implement Comparable is that they need a means of detecting that they've been invalidated; arguably a singly-linked list's indices should become unusable if it has been changed, which could be done for example by giving the list a mutationCount value that is incremented each time it is mutated; indices would take a copy of this value and if they don't match, they produce a runtime error. Like so (heavily abridged):

  struct LinkedListIndex<Element> : Comparable {
    let mutationCount:Int
    var position:Int, node:LinkedListNode<Element>
  }
  func == <E>(lhs:LinkedListIndex<E>, rhs:LinkedListIndex<E>) { return lhs.position == rhs.position }
  func < <E>(lhs:LinkedListIndex<E>, rhs:LinkedListIndex<E>) { return lhs.position < rhs.position }

  class LinkedListNode<Element> {
    var element:Element, next:LinkedListNode<Element>?
    init(_ element:Element) { self.element = element }
  }

  struct LinkedList<Element> : Indexable {
    var mutationCount:Int = 0, head:LinkedListNode<Element>?, tail:LinkedListNode<Element>?

    typealias Index = LinkedListIndex
    var startIndex:Index { return LinkedListIndex(mutationCount: self.mutationCount, position: 0) }
    func formIndex(after i:inout Index) {
      precondition(i.mutationCount == self.mutationCount, "Invalid index")
      i.position += 1
      i.node = i.node?.next
    }
    func index(after i:Index) -> Index { var i = i; self.formIndex(after: &i); return i }
    subscript(i:Index) -> Element {
      precondition(i.mutationCount == self.mutationCount, "Invalid index")
      return i.node!.element
    }

    mutating func append(_ element:Element) {
      let node = LinkedListNode(element)
      if self.tail == nil { self.head = node; self.tail = node }
      else { self.tail.next = node; self.tail = node }
      self.mutationCount = self.mutationCount &+ 1
    }
  }

Please forgive typos/omissions, tried to keep it as short as possible. But yeah, this is essentially how I'd try to implement a linked list, while retaining the Comparable constraint.

···

On 7 Jul 2016, at 02:41, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Wed Jul 06 2016, Haravikk <swift-evolution@swift.org> wrote:

On 6 Jul 2016, at 03:39, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.

I think the question is why you need to retain indices in these cases?

When it comes to these operations I wonder if we might want to
investigate something like a mutating iterator; you might still use an
index to jump to an initial position, but then use .insert(),
.remove() etc. methods of the iterator to perform modification without
the need to track indices at all.

There is no way, AFAIK, to implement important algorithms like rotate
binarySearch and several others, without having some representation of
position within a collection.

This is essentially how you want to edit trees anyway, as indexing
them isn't especially pretty, as it avoids the need to track the
indices at all for these operations, and many common cases should work
well when done as part of an iterator in this way.

I don't know what you mean by “track,” here. We don't track the
indices of an array.


(Xiaodi Wu) #14

This sounds like a scenario where you'd be using or extending an existing
stdlib generic type such as Set, no?

···

On Fri, Jul 8, 2016 at 15:53 Tino Heth via swift-evolution < swift-evolution@swift.org> wrote:

Right now I'm working on a lib for charts/plots, and I choose to create a
custom interval-type that has no comparable requirement.
Of course, as soon as I leave the completely abstract and generic world, I
need to bring things into an order — but if this order has to be defined by
a global function "<", there is a loss in flexibility.
My plan is to use "<" as a default wherever possible to define my ranges,
with the option to change it to a custom alternative; maybe similar
reasoning can be applied to the index-problem.

Tino

Real world example:
Let's assume I want to create a bar-chart to compare the population of
several countries.
There is no natural order for countries (I hope we can agree on that ;-),
so no matter how I'd model them, it would feel wrong to define "<":
Sort by name might be the most obvious choice, but it could also be the
size, the timezone, average income, the peak hight of its highest
mountain...
Additionally, I might want to use different orders for different charts,
which is only possible when I can change the function which dictates the
order.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Matthew Johnson) #15

I for one would be interested in your elaboration. Based on Dave's comments, I'm pretty convinced that the Comparable requirement is best left in place.

+1

···

Sent from my iPad

On Jul 6, 2016, at 7:33 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Wed, Jul 6, 2016 at 19:19 plx via swift-evolution <swift-evolution@swift.org> wrote:
My own 2c is to drop the comparable requirement.

I can elaborate if necessary, but wanted to at least cast my “+1” for removing the requirement.

> On Jul 5, 2016, at 9:39 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
>
>
> I've already raised the question here of whether we should weaken the
> requirement that Collection.Index be Comparable to merely being
> Equatable.
>
> Part of the motivation was to support data structures that otherwise
> would be expensive or impossible to implement. For example, with
> Comparable indices, you can't build a linked list that supports
> restructuring (e.g. insert, delete, splice) in O(1) without invalidating
> indices... not even an unsafe linked list with reference semantics.
> [While I don't think linked lists are terribly good data structures for
> most things, they are still useful in teaching, and it bothered me that
> we were ruling them out.] Even if you take away the index invalidation
> restriction, you have to store a counter in the index, which is an awkward inefficiency.
> Supporting Comparable indices for a tree data structure requires
> encoding the path from the root to the node. It's only one or two words
> in practice, but it's another awkward inefficiency.
>
> Over the weekend, I tried lifting the restriction to see what kind of
> effect it would have on the standard library.
> https://github.com/apple/swift/pull/3325
>
> Although I *had* been leaning strongly toward making this change, having
> looked at the effects, I am now more ambivalent:
>
> * A Range<T>, where T is not Comparable, could be constructed with any
> pair of Equatable T instances. We'd only detect that the Range may be
> invalid when T happened to also be Comparable. However, the fact that
> you are creating a Range already sort of implies an underlying
> ordering.
>
> * A Collection with non-Comparable indices still imposes an ordering
> upon the indices.
>
> * In a Collection with Comparable indices, the ordering implied by <
> needs to match the ordering of indices in the collection. Otherwise,
> there would be no way to form Ranges (for use in slicing, for example)
> without causing a trap. This is *weird*! There's little precedent
> for imposing stronger semantic requirements on an associated type if
> it conforms to a particular protocol (Comparable in this case), except
> where the requirement is part of the protocol.
>
> Also, Dmitri reminded me that even with a Comparable requirement on
> indices, there is still no problem converting an equatable iteration
> state into an index; you just need to augment it with an integer
> counter. So it's still trivial to create a Collection from nearly
> everything that is today a multipass Sequence. It does make indexing
> slightly more expensive, but it's likely we'd optimize that overhead
> away in many critical situations.
>
> Anyway, the thoughts of the community on this topic would be interesting
> to us. We need to make a decision about this very soon.
>
> Thanks!
>
> --
> Dave
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #16

There is no way, AFAIK, to implement important algorithms like rotate
binarySearch and several others, without having some representation of
position within a collection.

Do you need indices to be explicitly comparable for that, or will
simply being able to test for equality and being within a "range"
work?

Equality is all that's needed for most of these algorithms.
I don't know what you mean about testing for being within a "range."
Doing that efficiently comes down to index ordering comparison AFAICT.

I realize that in most cases, testing an index for being in a range
implies comparable, but what about multi-dimensional indices?
Comparison isn't well defined for, say, 2D points, but in theory all
the points within a circle or something could be the indices for
something.

If you have to do something like that, you wrap them so they're ordered
by angle or something.

Multi-dimentionsal indices only fit into our model of collection insofar
as they are linearizable. The only place I've seen this come up is in
linear algebra libraries, where indices in matrices have a linear order
that depends on storage format, and .row and .column properties that you
can read to get the two-dimensional position.

···

on Wed Jul 06 2016, David Sweeris <swift-evolution@swift.org> wrote:

On Jul 6, 2016, at 20:41, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
Dave


(Dave Abrahams) #17

For example, with
Comparable indices, you can't build a linked list that supports
restructuring (e.g. insert, delete, splice) in O(1) without invalidating
indices... not even an unsafe linked list with reference semantics.

I think the question is why you need to retain indices in these cases?

When it comes to these operations I wonder if we might want to
investigate something like a mutating iterator; you might still use an
index to jump to an initial position, but then use .insert(),
.remove() etc. methods of the iterator to perform modification without
the need to track indices at all.

There is no way, AFAIK, to implement important algorithms like rotate
binarySearch and several others, without having some representation of
position within a collection.

This is essentially how you want to edit trees anyway, as indexing
them isn't especially pretty, as it avoids the need to track the
indices at all for these operations, and many common cases should work
well when done as part of an iterator in this way.

I don't know what you mean by “track,” here. We don't track the
indices of an array.

By track I just mean store in a variable.

As an example, consider removing multiple values using a closure:

  let delete:(Int) -> Bool = { ($0 % 2) == 1 } // Delete all odd numbers
  let filtered = myArray.filter { !delete($0) } // Produces new copy of array with elements filtered

  // In-place removal
  var index = myArray.startIndex, indices:[Array<Int>.Index] = []
  for eachElement in myArray {
    if delete(eachElement) { indices.append(index) }
    myArray.formIndex(after: &index)
  }
  for eachIndex in indices { myArray.remove(at: eachIndex }

I'm afraid that doesn't work (remove elements from [0, 1] with the
predicate {true}), and even if it did work it would take O(N^2) time and
O(N) space. The right algorithm is O(N) time and O(1) space.

The latter case only works with types where you know that there's a
safe way to use the indices (removing in reverse order doesn't
invalidate earlier indices of Array) so it's not suitable for generic
collections.

Exercise for the reader: write a generic one that works for any
RangeReplaceableCollection. It should be easy. Hint: you only want one
call removeSubrange.

Since it requires iterating myArray anyway, then if the removals could
be performed at the same time it would eliminate the need to store and
then use indices at all, and would be a much better way to work with
linked-lists, trees and so-on. So with a mutable iterator for example
the in-place removal would look like:

  var iterator = myArray.makeMutatingIterator()
  while let eachElement = iterator.next() {
    if delete(eachElement) { iterator.remove() }
  }

In general, because arrays have value semantics, there's no way to
mutate the array through another object such as a mutating iterator.
And even if there was a way, it would still force the whole
array's storage to be cloned at least once, which is not something we
want to pay for.

···

on Thu Jul 07 2016, Haravikk <swift-evolution@swift.org> wrote:

On 7 Jul 2016, at 02:41, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Wed Jul 06 2016, Haravikk <swift-evolution@swift.org> wrote:

On 6 Jul 2016, at 03:39, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

Maybe this isn't directly applicable to this topic, but this for
example seems like a better solution to the linked list problems
raised than removing Comparable as a requirement which was my intended
point, that perhaps this isn't necessary?

Otherwise I think the main issue with types that don't seem like they
can implement Comparable is that they need a means of detecting that
they've been invalidated; arguably a singly-linked list's indices
should become unusable if it has been changed, which could be done for
example by giving the list a mutationCount value that is incremented
each time it is mutated; indices would take a copy of this value and
if they don't match, they produce a runtime error. Like so (heavily
abridged):

  struct LinkedListIndex<Element> : Comparable {
    let mutationCount:Int
    var position:Int, node:LinkedListNode<Element>
  }
  func == <E>(lhs:LinkedListIndex<E>, rhs:LinkedListIndex<E>) { return lhs.position == rhs.position }
  func < <E>(lhs:LinkedListIndex<E>, rhs:LinkedListIndex<E>) { return lhs.position < rhs.position }

  class LinkedListNode<Element> {
    var element:Element, next:LinkedListNode<Element>?
    init(_ element:Element) { self.element = element }
  }

  struct LinkedList<Element> : Indexable {
    var mutationCount:Int = 0, head:LinkedListNode<Element>?, tail:LinkedListNode<Element>?

    typealias Index = LinkedListIndex
    var startIndex:Index { return LinkedListIndex(mutationCount: self.mutationCount, position: 0) }
    func formIndex(after i:inout Index) {
      precondition(i.mutationCount == self.mutationCount, "Invalid index")
      i.position += 1
      i.node = i.node?.next
    }
    func index(after i:Index) -> Index { var i = i; self.formIndex(after: &i); return i }
    subscript(i:Index) -> Element {
      precondition(i.mutationCount == self.mutationCount, "Invalid index")
      return i.node!.element
    }

    mutating func append(_ element:Element) {
      let node = LinkedListNode(element)
      if self.tail == nil { self.head = node; self.tail = node }
      else { self.tail.next = node; self.tail = node }
      self.mutationCount = self.mutationCount &+ 1
    }
  }

Please forgive typos/omissions, tried to keep it as short as possible. But yeah, this is essentially how I'd try to implement a linked list, while retaining the Comparable constraint.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Dave


(plx) #18

I am short on time (and will remain so for several more weeks) so this should be seen as essentially a fire-and-forget message.

I'll preface the rest with the disclaimer that I don't think any of the assessments already made are *wrong* on the facts...I simply (a) disagree on what to make of them and also (b) think there is more cost to requiring `Comparable` than has been noted so far.

For (a), I agree with the following pair of facts:

- there are "collections" for which the "natural" choice of index is awkward to make `Comparable` (thus making conformance to `Collection` awkward)...
- such "non-comparable indices" can easily be made comparable when needed, e.g. by stapling an `Int` onto each "natural" index (or via some similar augmentation)

...but to me this argues in favor of dropping the requirement:

- dropping it allows more "collections" to become proper `Collection`s (while still using the "natural" index)
- when you specifically need comparable indices, they are easy to add (so why not *not* pay for comparability except when you specifically need it)

For (b), I think that the long-term costs of techniques like "`Int`-stapling" are actually rather more severe than just the overhead of the stapled-on `Int` (etc.); to illustrate this, consider linked-lists, since they've already been brought up.

I'll state without proof that the following are reasonable:

- if we only need `Equatable` indices:
  - the list nodes are the natural choice of "index"
  - the list nodes are *robust* indices (very few updates invalidate them)
- if we instead need `Comparable` indices:
  - the easiest index is a pair like `(list-node, counter)`
  - these indices are *fragile* indices (invalidated after most updates)

...and proceed to my point:

It's not hard to imagine at some point introducing a protocol for a mutable collection with *robust* indices -- e.g., behaving like the "natural" indices we could get away with here if we only needed `Equatable` -- and providing either (or both):

- improved generic implementations of standard mutation operations
- additional mutation operations that are only practical with robust indices

...and the unfortunate consequence of the `Comparable`-index requirement is that our linked-list would **not** be able to adopt such a protocol -- and benefit from such algorithms, etc. -- b/c that requirement means that the `Index` it will expose in a generic setting is the more-fragile, “stapled index” instead of the more-robust, “natural index” we could have used otherwise.

This has been discussing linked-lists but I think it generalizes; e.g. for other things that “could be ‘collections’” without the `Comparable`-`Index` requirement, you can easily make indices that *do* satisfy the `Comparable` requirement…at the cost of winding up with comparatively-fragile indices that are a bit pessimal for use with mutation operations. I’ll speculate that the issues being alluded-to for trees are roughly similar: it’s not that it’s hard to make the tree indices comparable, it’s that there’s no good way to do that without making the resulting indices “artificially” fragile.

And so on.

So to conclude, I don’t really disagree that the issues raised by non-comparable indices are real issues, but I *do* think the cost/benefit analysis should include the cost of having "locked-in" the use of fragile, easily-invalidated indices vis-a-vis what might otherwise have been used. It’s not amazingly convincing without more-concrete examples, but this is all I have time for.

···

On Jul 6, 2016, at 7:33 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

I for one would be interested in your elaboration. Based on Dave's comments, I'm pretty convinced that the Comparable requirement is best left in place.


(Dave Abrahams) #19

I for one would be interested in your elaboration. Based on Dave's comments, I'm pretty convinced that the Comparable requirement is best left in place.

I am short on time (and will remain so for several more weeks) so this should be seen as essentially a fire-and-forget message.

I'll preface the rest with the disclaimer that I don't think any of
the assessments already made are *wrong* on the facts...I simply (a)
disagree on what to make of them and also (b) think there is more cost
to requiring `Comparable` than has been noted so far.

For (a), I agree with the following pair of facts:

- there are "collections" for which the "natural" choice of index is awkward to make `Comparable` (thus making conformance to `Collection` awkward)...
- such "non-comparable indices" can easily be made comparable when
needed, e.g. by stapling an `Int` onto each "natural" index (or via
some similar augmentation)

...but to me this argues in favor of dropping the requirement:

- dropping it allows more "collections" to become proper `Collection`s (while still using the "natural" index)
- when you specifically need comparable indices, they are easy to add (so why not *not* pay for comparability except when you specifically need it)

For (b), I think that the long-term costs of techniques like
"`Int`-stapling" are actually rather more severe than just the
overhead of the stapled-on `Int` (etc.); to illustrate this, consider
linked-lists, since they've already been brought up.

I'll state without proof that the following are reasonable:

- if we only need `Equatable` indices:
  - the list nodes are the natural choice of "index"
  - the list nodes are *robust* indices (very few updates invalidate them)
- if we instead need `Comparable` indices:
  - the easiest index is a pair like `(list-node, counter)`
  - these indices are *fragile* indices (invalidated after most updates)

...and proceed to my point:

It's not hard to imagine at some point introducing a protocol for a
mutable collection with *robust* indices -- e.g., behaving like the
"natural" indices we could get away with here if we only needed
`Equatable` -- and providing either (or both):

- improved generic implementations of standard mutation operations
- additional mutation operations that are only practical with robust indices

...and the unfortunate consequence of the `Comparable`-index
requirement is that our linked-list would **not** be able to adopt
such a protocol -- and benefit from such algorithms, etc. -- b/c that
requirement means that the `Index` it will expose in a generic setting
is the more-fragile, “stapled index” instead of the more-robust,
“natural index” we could have used otherwise.

This has been discussing linked-lists but I think it generalizes;
e.g. for other things that “could be ‘collections’” without the
`Comparable`-`Index` requirement, you can easily make indices that
*do* satisfy the `Comparable` requirement…at the cost of winding up
with comparatively-fragile indices that are a bit pessimal for use
with mutation operations. I’ll speculate that the issues being
alluded-to for trees are roughly similar: it’s not that it’s hard to
make the tree indices comparable, it’s that there’s no good way to do
that without making the resulting indices “artificially” fragile.

And so on.

So to conclude, I don’t really disagree that the issues raised by
non-comparable indices are real issues, but I *do* think the
cost/benefit analysis should include the cost of having "locked-in"
the use of fragile, easily-invalidated indices vis-a-vis what might
otherwise have been used.

Agreed, and that's basically what motivated the investigation of
dropping the Comparable requirement. One thing to remember, though:
this ability to avoid invalidating indices would not apply to all
collections. The most-common (and usually best)
RangeReplaceableCollection, Array, can't support it.

The ability to maintain a stable notion of “position” in the collection
after it is restructured is a special property that only really applies
to “node-based” collections where each element gets a separate node.
Individual data structures with this property are free to provide their
own *additional* Index-like type that is less fragile.

Are there any generic algorithms over such data structures that take
advantage of index stability? I don't know of any. If they exist, it's
a potential concern for the protocol design. If they don't, it's
something we can just add to individual concrete types.

The concern that comparability may impose a performance penalty on some
data structures is a bigger one for me. However, since node-based
collections tend to have terrible locality-of-reference, I feel sort of
OK about the idea that people who care about performance will be using
data structures containing at least some contiguous allocations
(e.g. Deques, B+ trees...)

I find either decision to have some distasteful effects on the design,
but at the moment I think the effects of dropping the Comparable
requirement are worse.

It’s not amazingly convincing without more-concrete examples, but this
is all I have time for.

Thanks for making some time despite your busy schedule!

···

on Fri Jul 08 2016, plx <swift-evolution@swift.org> wrote:

On Jul 6, 2016, at 7:33 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

--
Dave