[Review] SE-0074: Implementation of Binary Search functions


(Nicola Salmoria) #4

* What is your evaluation of the proposal?

I support the idea, since binary search is obviously a very important
algorithm which would be important to have in the standard library.

I am, however, unsure about the implementation. The current proposal
suggests to add the new methods to the Collection protocol, leaving to the
caller the burden of ensuring that the collection is properly sorted, and
giving up on the opportunity of using types to enforce the invariant.

In addition, the whole idea of being sorted doesn't even make sense for many
Collections: e.g. a Dictionary can never be in a condition where calling a
sortedIndex() method would make sense, so it seems odd that the protocol
would allow it.

Another thing to notice is that the isOrderedBefore predicate passed to
sortedIndex() can't be arbitrary, but must be the same predicate used to
sort the collection in the first place. It would therefore seem natural for
the predicate to be part of the collection itself, and not be passed to
sortedIndex().

It seems to me that it would be more appropriate to have new
SortedCollection protocol, declaring the proposed methods. The sorted()
method of Collection would need to return a new type similar to Array but
conforming to the SortedCollection protocol. I'm afraid I don't know how
in-place sort() could be handled efficiently.

Mind you, I'm not sure if such an approach would actually be viable, but the
fact that the "Alternatives considered" section doesn't mention it makes me
think that the authors haven't fully evaluated the possibility.

* Is the problem being addressed significant enough to warrant a change to

Swift?

Yes, it's important for the standard library to provide common efficient
algorithms.

* Does this proposal fit well with the feel and direction of Swift?

For the reasons said above, I don't think it fits perfectly. However, if it
isn't possible to do better, this would be at least a first step.

* If you have used other languages or libraries with a similar feature,

how do you feel that this proposal compares to those?

I've used the C++ counterparts that are referenced in the proposal. Very
recently, I worked on a project where an array violated the prerequisite of
being sorted, causing bugs. So my concerns stated above are very real and I
have personally run into them.

* How much effort did you put into your review? A glance, a quick reading,

or an in-depth study?

A quick reading and my past experience with this kind of algorithms.

···

--
Nicola


(Dave Abrahams) #5

I am posting this review on behalf of Dmitri Gribenko, Max Moiseev, and
myself.

Hello Swift community,

The review of "SE-0074: Implementation of Binary Search functions"
begins now and runs through May 9. The proposal is available here:

  https://github.com/apple/swift-evolution/blob/master/proposals/0074-binary-search.md

Reviews are an important part of the Swift evolution process. All reviews should be sent to the swift-evolution mailing list at

  https://lists.swift.org/mailman/listinfo/swift-evolution

or, if you would like to keep your feedback private, directly to the review manager.

What goes into a review?

The goal of the review process is to improve the proposal under review
through constructive criticism and contribute to the direction of
Swift. When writing your review, here are some questions you might
want to answer in your review:

  * What is your evaluation of the proposal?

We think binary searches are fundamental and important, and want to see
them added. However, we don't agree with the form of the current
proposal. In particular, we think:

* Operations that depend on sorted-ness and use binary predicates should
  not be available on all Collections; they're too easy to misuse,
  they're hard to name well, and as Nicola Salmoria has noted, they
  would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
  the predicate with the elements, e.g.

    let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
    let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

  Maybe there should also be protocols for this; CountableRange<T> would
  already already conform to the immutable version. We might want a
  mutable form of the protocol for sorted collections with
  insertion/removal methods. This whole area needs more design.

* The polarity of the predicates is wrong: a method with a “where:
  predicate” should always return you the index of an element that
  *does* satisfy the predicate.

* Any code in the proposal should be updated to:

  - conform to the new
    indexing model; it still shows indices being moved without
    participation of the collection instance.
  
  - attach the @noescape attribute to a parameter's type, rather than
    its name.

* The existing partition algorithms in the stdlib are indeed hobbled.
  The more-general partition function is a great idea, but it belongs in
  a separate proposal.

* Something like the method the proposal calls `partitionedIndex`
  *should* be included in the standard library for Swift 3, with the
  following differences:

  - it should be called partitionPoint(where:)

  - the meaning of the predicate result should be inverted so that the
    result of calling the function points to an element actually
    satisfying the predicate

  This would give people the primitive they need to build higher-level
  functionality without broadening the interface too far, or committing
  us to a design that will be hard to use correctly.

  * Is the problem being addressed significant enough to warrant a
change to Swift?

Yes.

* Does this proposal fit well with the feel and direction of Swift?

Not as written, we think.

* If you have used other languages or libraries with a similar
feature, how do you feel that this proposal compares to those?

We have used the C++ standard library and Python's `bisect` function.
This proposal is obviously inspired by much of the work in C++, but we
think we can do much better for Swift.

* How much effort did you put into your review? A glance, a
quick reading, or an in-depth study?

An in-depth study.

More information about the Swift evolution process is available at

  https://github.com/apple/swift-evolution/blob/master/process.md

Thank you,

-Chris Lattner
Review Manager

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Thanks,

···

on Tue May 03 2016, Chris Lattner <swift-evolution@swift.org> wrote:

--
Dave


Sorted Collection Protocol
(Brent Royal-Gordon) #6

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

   let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
   let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to support. In particular, it's quite common to use sorted `Array`s, and these APIs would help you do that.

What I might consider doing is tying this to `RangeReplaceableCollection`. That protocol is applied only to types which allow insertion at arbitrary indices, which is a good, though not perfect, proxy for types which might allow you to manually maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and the mutable `String` views would get these methods, while `Set` and `Dictionary` would not.

···

--
Brent Royal-Gordon
Architechies


(Nate Cook) #7

Proposal:

  https://github.com/apple/swift-evolution/blob/master/proposals/0074-binary-search.md

I am posting this review on behalf of Dmitri Gribenko, Max Moiseev, and
myself.

Thanks for the feedback! This reply is only from me—I haven't had a chance to consult with the other authors and they may disagree with me or have better arguments on everything below.

  * What is your evaluation of the proposal?

We think binary searches are fundamental and important, and want to see
them added. However, we don't agree with the form of the current
proposal. In particular, we think:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

   let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
   let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I can certainly see this as an alternate approach. I would still prefer to have access to APIs that require but don't enforce sorting as a precondition, since, as Brent mentioned, sorted arrays are fairly common. That said, if there's a thoughtfully designed collection protocol and a "sorted collection" wrapper that doe smart things without duplicating storage, I'd be on board with that as well.

* The polarity of the predicates is wrong: a method with a “where:
predicate” should always return you the index of an element that
*does* satisfy the predicate.

Wow, yeah, this is totally backwards. So these should really be:

let a = [10, 20, 30, 30, 30, 40, 60]
a.partitionedIndex(where: { $0 >= 20 }) // 1

let lowerBound30 = a.partitionedIndex(where: { $0 >= 30 })
let upperBound30 = a.partitionedIndex(where: { $0 > 30 })

* Any code in the proposal should be updated to:

- conform to the new
   indexing model; it still shows indices being moved without
   participation of the collection instance.

- attach the @noescape attribute to a parameter's type, rather than
   its name.

* The existing partition algorithms in the stdlib are indeed hobbled.
The more-general partition function is a great idea, but it belongs in
a separate proposal.

* Something like the method the proposal calls `partitionedIndex`
*should* be included in the standard library for Swift 3, with the
following differences:

- it should be called partitionPoint(where:)

- the meaning of the predicate result should be inverted so that the
   result of calling the function points to an element actually
   satisfying the predicate

This would give people the primitive they need to build higher-level
functionality without broadening the interface too far, or committing
us to a design that will be hard to use correctly.

This would definitely be a more limited and focused proposal. Thanks again for the feedback!

Nate

···

On May 6, 2016, at 5:16 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

  * Is the problem being addressed significant enough to warrant a
change to Swift?

Yes.

* Does this proposal fit well with the feel and direction of Swift?

Not as written, we think.

* If you have used other languages or libraries with a similar
feature, how do you feel that this proposal compares to those?

We have used the C++ standard library and Python's `bisect` function.
This proposal is obviously inspired by much of the work in C++, but we
think we can do much better for Swift.

* How much effort did you put into your review? A glance, a
quick reading, or an in-depth study?

An in-depth study.

More information about the Swift evolution process is available at

  https://github.com/apple/swift-evolution/blob/master/process.md

Thank you,

-Chris Lattner
Review Manager

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Thanks,

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Joe Groff) #8

I am posting this review on behalf of Dmitri Gribenko, Max Moiseev, and
myself.

Hello Swift community,

The review of "SE-0074: Implementation of Binary Search functions"
begins now and runs through May 9. The proposal is available here:

  https://github.com/apple/swift-evolution/blob/master/proposals/0074-binary-search.md

Reviews are an important part of the Swift evolution process. All reviews should be sent to the swift-evolution mailing list at

  https://lists.swift.org/mailman/listinfo/swift-evolution

or, if you would like to keep your feedback private, directly to the review manager.

What goes into a review?

The goal of the review process is to improve the proposal under review
through constructive criticism and contribute to the direction of
Swift. When writing your review, here are some questions you might
want to answer in your review:

  * What is your evaluation of the proposal?

We think binary searches are fundamental and important, and want to see
them added. However, we don't agree with the form of the current
proposal. In particular, we think:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

   let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
   let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I worry that attaching these methods to a strongly-typed `Sorted` wrapper limits their appeal. It's useful to be able to binary-search through data in a standard container that's known to be sorted, or over a subregion of the data that's sorted. While you can of course cobble together a Sorted(Slice(container[sortedRange])), that's pretty inconvenient. Is misapplying binary search algorithms to unsorted data a big problem in practice in C++?

-Joe

···

On May 6, 2016, at 3:16 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Tue May 03 2016, Chris Lattner <swift-evolution@swift.org> wrote:

* The polarity of the predicates is wrong: a method with a “where:
predicate” should always return you the index of an element that
*does* satisfy the predicate.

* Any code in the proposal should be updated to:

- conform to the new
   indexing model; it still shows indices being moved without
   participation of the collection instance.

- attach the @noescape attribute to a parameter's type, rather than
   its name.

* The existing partition algorithms in the stdlib are indeed hobbled.
The more-general partition function is a great idea, but it belongs in
a separate proposal.

* Something like the method the proposal calls `partitionedIndex`
*should* be included in the standard library for Swift 3, with the
following differences:

- it should be called partitionPoint(where:)

- the meaning of the predicate result should be inverted so that the
   result of calling the function points to an element actually
   satisfying the predicate

This would give people the primitive they need to build higher-level
functionality without broadening the interface too far, or committing
us to a design that will be hard to use correctly.

  * Is the problem being addressed significant enough to warrant a
change to Swift?

Yes.

* Does this proposal fit well with the feel and direction of Swift?

Not as written, we think.

* If you have used other languages or libraries with a similar
feature, how do you feel that this proposal compares to those?

We have used the C++ standard library and Python's `bisect` function.
This proposal is obviously inspired by much of the work in C++, but we
think we can do much better for Swift.

* How much effort did you put into your review? A glance, a
quick reading, or an in-depth study?

An in-depth study.

More information about the Swift evolution process is available at

  https://github.com/apple/swift-evolution/blob/master/process.md

Thank you,

-Chris Lattner
Review Manager

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Thanks,

--
Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Joe Groff) #9

We could also introduce a new OrderedCollection protocol. (This would also be useful in the future for supporting `case` pattern matching on collections. It makes sense to pattern-match arrays and other ordered collections in order by element, but you'd expect very different semantics pattern-matching an unordered Set.)

-Joe

···

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

  let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
  let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to support. In particular, it's quite common to use sorted `Array`s, and these APIs would help you do that.

What I might consider doing is tying this to `RangeReplaceableCollection`. That protocol is applied only to types which allow insertion at arbitrary indices, which is a good, though not perfect, proxy for types which might allow you to manually maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and the mutable `String` views would get these methods, while `Set` and `Dictionary` would not.


(Daniel Vollmer) #10

I agree with Joe here, just look at the recursive construction of a kd-tree where you recursively sort (and then search for a partitioning point) in smaller and smaller slices sorted along different dimensions.

I don’t think misapplying is a big problem in C++.

  Daniel.

···

On 10 May 2016, at 19:36, Joe Groff via swift-evolution <swift-evolution@swift.org> wrote:

I worry that attaching these methods to a strongly-typed `Sorted` wrapper limits their appeal. It's useful to be able to binary-search through data in a standard container that's known to be sorted, or over a subregion of the data that's sorted. While you can of course cobble together a Sorted(Slice(container[sortedRange])), that's pretty inconvenient. Is misapplying binary search algorithms to unsorted data a big problem in practice in C++?


(Dave Abrahams) #11

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

   let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
   let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to
support. In particular, it's quite common to use sorted `Array`s, and
these APIs would help you do that.

Sorted<Array<T>> would help you do that even more :slight_smile:

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

That's interesting, but I don't understand why it's better than having
Sorted.

IME it's very rare to have a collection that's transiently sorted and
thus appropriate for a binary search. You may start out un-sorted, but
after a bit you're maintaining sorted order. It makes sense to reflect
that in the type system.

···

on Mon May 09 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:

--
-Dave


(Dave Abrahams) #12

No, but:

1. Algorithms on collections in C++ are not methods that you'd find by
   code completion.

2. We have stronger naming conventions than C++ does, such that we're
   inclined to stick the word “sorted” into the names of all the
   algorithms that have a sortedness precondition. There are a few of
   them; e.g. don't forget unique(). The resulting design gets pretty ugly.

3. The idea of a collection that maintains its sorted order is a
   powerful one that has been useful in practice

···

on Tue May 10 2016, Joe Groff <swift-evolution@swift.org> wrote:

On May 6, 2016, at 3:16 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

I am posting this review on behalf of Dmitri Gribenko, Max Moiseev, and
myself.

on Tue May 03 2016, Chris Lattner <swift-evolution@swift.org> wrote:

Hello Swift community,

The review of "SE-0074: Implementation of Binary Search functions"
begins now and runs through May 9. The proposal is available here:

  https://github.com/apple/swift-evolution/blob/master/proposals/0074-binary-search.md

Reviews are an important part of the Swift evolution process. All
reviews should be sent to the swift-evolution mailing list at

  https://lists.swift.org/mailman/listinfo/swift-evolution

or, if you would like to keep your feedback private, directly to the review manager.

What goes into a review?

The goal of the review process is to improve the proposal under review
through constructive criticism and contribute to the direction of
Swift. When writing your review, here are some questions you might
want to answer in your review:

  * What is your evaluation of the proposal?

We think binary searches are fundamental and important, and want to see
them added. However, we don't agree with the form of the current
proposal. In particular, we think:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

   let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
   let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I worry that attaching these methods to a strongly-typed `Sorted`
wrapper limits their appeal. It's useful to be able to binary-search
through data in a standard container that's known to be sorted, or
over a subregion of the data that's sorted. While you can of course
cobble together a Sorted(Slice(container[sortedRange])), that's pretty
inconvenient. Is misapplying binary search algorithms to unsorted data
a big problem in practice in C++?

--
-Dave


(Nate Cook) #13

Yet another alternative would be to drop Set and Dictionary down a level to a FiniteSequence protocol in between Sequence and Collection. Basically none of the index-based collection APIs (i.e. everything except `count` and `isEmpty`) make sense on sets and dictionaries. index(where:) was marginally useful with dictionaries, but now that Sequence is getting first(where:), née find(...), even that isn't necessary.

Nate

···

On May 9, 2016, at 9:48 PM, Joe Groff via swift-evolution <swift-evolution@swift.org> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to support. In particular, it's quite common to use sorted `Array`s, and these APIs would help you do that.

What I might consider doing is tying this to `RangeReplaceableCollection`. That protocol is applied only to types which allow insertion at arbitrary indices, which is a good, though not perfect, proxy for types which might allow you to manually maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and the mutable `String` views would get these methods, while `Set` and `Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would also be useful in the future for supporting `case` pattern matching on collections. It makes sense to pattern-match arrays and other ordered collections in order by element, but you'd expect very different semantics pattern-matching an unordered Set.)

-Joe


(Thorsten Seitz) #14

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to support. In particular, it's quite common to use sorted `Array`s, and these APIs would help you do that.

What I might consider doing is tying this to `RangeReplaceableCollection`. That protocol is applied only to types which allow insertion at arbitrary indices, which is a good, though not perfect, proxy for types which might allow you to manually maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and the mutable `String` views would get these methods, while `Set` and `Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would also be useful in the future for supporting `case` pattern matching on collections. It makes sense to pattern-match arrays and other ordered collections in order by element, but you'd expect very different semantics pattern-matching an unordered Set.)

big +1

Smalltalk has an OrderedCollection which implies that the elements are ordered (though not automatically sorted) and a SortedCollection which maintains a given sort order like the `Sorted` from Dave’s comment. Smalltalk’s Set and Dictionary are just Collections. I always found these useful distinctions.

-Thorsten

···

Am 10.05.2016 um 04:48 schrieb Joe Groff via swift-evolution <swift-evolution@swift.org>:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

-Joe
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #15

What do you mean by “Ordered” here? Please note that when Cocoa uses
“Ordered” it means something very different from “Sorted.” I don't find
the Cocoa usage intuitive myself, but it might be best to avoid that
term to avoid confusion.

···

on Mon May 09 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they

would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

  let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
  let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case
to support. In particular, it's quite common to use sorted `Array`s,
and these APIs would help you do that.

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would
also be useful in the future for supporting `case` pattern matching on
collections. It makes sense to pattern-match arrays and other ordered
collections in order by element, but you'd expect very different
semantics pattern-matching an unordered Set.)

--
-Dave


(plx) #16

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they
would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case to support. In particular, it's quite common to use sorted `Array`s, and these APIs would help you do that.

What I might consider doing is tying this to `RangeReplaceableCollection`. That protocol is applied only to types which allow insertion at arbitrary indices, which is a good, though not perfect, proxy for types which might allow you to manually maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and the mutable `String` views would get these methods, while `Set` and `Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would also be useful in the future for supporting `case` pattern matching on collections. It makes sense to pattern-match arrays and other ordered collections in order by element, but you'd expect very different semantics pattern-matching an unordered Set.)

I have some high-level comments about this proposal: it feels rather muddled and as if it's mixing different concerns.

Taking a step back, at a *minimum* I think the right way to add this functionality is to:

A) add generic functions like `binarySearch` (etc.) to `Collection` (taking the GIGO risk, of course)
B) add overridable methods like `sortedIndex(of:)` to `Collection` (taking the GIGO risk again, of course)
C) provide default implementations of e.g. `sortedIndex(of:)` that use `binarySearch` (etc.)

…and *ideally* I think both (A) and probably (B) wind up introducing some methods like `_customBinarySearchForComparableElement` (etc.), but that is a level of detail that can be skipped for this discussion.

I understand the motivation to try and add only “intent-focused” methods like `sortedIndex(of:)`, but a binary search should be expressible generically and is a very useful building block to have when building such higher-level methods; it would also prove useful to *implement* the `Sorted(…)` concept mentioned above, one would think.

I also think it will be a mistake to restrict the availability of any of these APIs to “sorted” collections; there are several reasons here.

One reason is simply b/c such sorted collections aren’t part of the standard library yet.

Also, it seems like for *some* of these methods (binarySearch) it’s a category error to restrict them to sorted collections: such sorted collections should instead simply exploit their own ordering/structure where appropriate!

Finally, things like binary searches are often basic building blocks (etc.) for *constructing* such ordered collections, and thus being unable to use them in that context would be limiting (e.g. if you wanted to implement `Sorted(…)` as suggested above, you’d probably benefit from being able to use these methods…).

Thus although I understand the desire for jumping immediately to the higher-level, "intent-focused” API, and although I understand the GIGO risk for having some of these methods defined too broadly, I am not sure the cures aren't worse than the disease here, so to speak.

-Joe

Yet another alternative would be to drop Set and Dictionary down a level to a FiniteSequence protocol in between Sequence and Collection. Basically none of the index-based collection APIs (i.e. everything except `count` and `isEmpty`) make sense on sets and dictionaries. index(where:) was marginally useful with dictionaries, but now that Sequence is getting first(where:), née find(...), even that isn't necessary.

I’ve argued this (unsuccessfully) here in the past, but with 2 such distinctions:

- Sequence -> FiniteSequence
- Sequence -> StableSequence (“stable” meaning identical on each iteration)
- Collection: FiniteSequence, StableSequence

…but it didn’t go well; I can certainly dredge up my design notes if there’s someone else interested in taking up this case.

···

On May 9, 2016, at 10:28 PM, Nate Cook via swift-evolution <swift-evolution@swift.org> wrote:

On May 9, 2016, at 9:48 PM, Joe Groff via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Nate

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #17

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

   s.remove(at: s.index(where: { $0 < 1 }))

···

on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave


(Joe Groff) #18

By "ordered", I only mean "ordering is significant to the value of the collection", so Array is ordered but Set is not.

-Joe

···

On May 13, 2016, at 7:30 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Mon May 09 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they

would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case
to support. In particular, it's quite common to use sorted `Array`s,
and these APIs would help you do that.

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would
also be useful in the future for supporting `case` pattern matching on
collections. It makes sense to pattern-match arrays and other ordered
collections in order by element, but you'd expect very different
semantics pattern-matching an unordered Set.)

What do you mean by “Ordered” here? Please note that when Cocoa uses
“Ordered” it means something very different from “Sorted.” I don't find
the Cocoa usage intuitive myself, but it might be best to avoid that
term to avoid confusion.


(Nate Cook) #19

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

I don't see how the methods that impose meaning on the order of a set are useful. What does mySet.prefix(upTo: i) mean when I have no control or dependable way of knowing which elements lie between startIndex and i? mySet.first is useful, but it's meaning is more like NSSet's anyObject.

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

  s.remove(at: s.index(where: { $0 < 1 }))

Since Set's remove(at:) method is type-specific, it would need to be rewritten as remove(where:).

This example is kind of my point, though - it removes the first element less than 1, but only one such element, and there's no telling which. That's not an operation I've ever needed to perform on a set.

To clarify, I don't think the current system is hurting Set and Dictionary in any way. It's simply providing them with methods that aren't very useful for that particular data structure.

Nate

···

On May 13, 2016, at 9:36 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #20

Thanks for clarifying.

···

on Fri May 13 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 13, 2016, at 7:30 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Mon May 09 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they

would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case
to support. In particular, it's quite common to use sorted `Array`s,
and these APIs would help you do that.

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would
also be useful in the future for supporting `case` pattern matching on
collections. It makes sense to pattern-match arrays and other ordered
collections in order by element, but you'd expect very different
semantics pattern-matching an unordered Set.)

What do you mean by “Ordered” here? Please note that when Cocoa uses
“Ordered” it means something very different from “Sorted.” I don't find
the Cocoa usage intuitive myself, but it might be best to avoid that
term to avoid confusion.

By "ordered", I only mean "ordering is significant to the value of the
collection", so Array is ordered but Set is not.

--
-Dave


(Dave Abrahams) #21

That does not strike me as a useful-enough distinction to be worth
enshrining in a protocol. What generic components would be constrained
on it?

···

on Fri May 13 2016, Joe Groff <swift-evolution@swift.org> wrote:

On May 13, 2016, at 7:30 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Mon May 09 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they

would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case
to support. In particular, it's quite common to use sorted `Array`s,
and these APIs would help you do that.

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would
also be useful in the future for supporting `case` pattern matching on
collections. It makes sense to pattern-match arrays and other ordered
collections in order by element, but you'd expect very different
semantics pattern-matching an unordered Set.)

What do you mean by “Ordered” here? Please note that when Cocoa uses
“Ordered” it means something very different from “Sorted.” I don't find
the Cocoa usage intuitive myself, but it might be best to avoid that
term to avoid confusion.

By "ordered", I only mean "ordering is significant to the value of the
collection", so Array is ordered but Set is not.

--
-Dave


(Dave Abrahams) #22

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

I don't see how the methods that impose meaning on the order of a set
are useful. What does mySet.prefix(upTo: i) mean when I have no
control or dependable way of knowing which elements lie between
startIndex and i? mySet.first is useful, but it's meaning is more like
NSSet's anyObject.

If you give me a random collection of Ts, I may want to find the index
of the first element that satisfies some predicate. Then I may want to
process the prefix of elements that don't satisfy that predicate, and
repeat.

In general, many algorithms that operate on a collection place no
requirements on the semantic relationship and/or ordering of the
elements, and such algorithms can still be useful on Sets. Use your
imagination and you'll see more of them, e.g. find the index of the
maximum element in the set (so that you can remove it).

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

  s.remove(at: s.index(where: { $0 < 1 }))

Since Set's remove(at:) method is type-specific, it would need to be
rewritten as remove(where:).

? What does “type-specific” mean and why do you say so?

If we don't have a remove(at:) method for Set, that's a bug. Whether we
need a RangeRemovableCollection that is refined by
RangeReplaceableCollection is an interesting question.

This example is kind of my point, though - it removes the first
element less than 1, but only one such element, and there's no telling
which. That's not an operation I've ever needed to perform on a set.

To clarify, I don't think the current system is hurting Set and
Dictionary in any way. It's simply providing them with methods that
aren't very useful for that particular data structure.

It's true that it's harder to find algorithms that are *very likely* to
be useful on collections whose ordering is not controlled, but that
doesn't mean those collections shouldn't satisfy the Collection
protocol. Even parallelization of trivial operations such as “find the
minimum element” requires the the fundamental properties of Collection.

···

on Fri May 13 2016, Nate Cook <nate-AT-natecook.com> wrote:

On May 13, 2016, at 9:36 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave


(Dave Abrahams) #23

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

I don't see how the methods that impose meaning on the order of a set
are useful. What does mySet.prefix(upTo: i) mean when I have no
control or dependable way of knowing which elements lie between
startIndex and i? mySet.first is useful, but it's meaning is more like
NSSet's anyObject.

If you want to process the elements of your set in parallel, you want to
break it down into equally-sized regions and distribute the work across
cores. That's indexing and slicing.

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

  s.remove(at: s.index(where: { $0 < 1 }))

Since Set's remove(at:) method is type-specific, it would need to be
rewritten as remove(where:).

That would mean “remove all elements satisfying this predicate,” to me.

This example is kind of my point, though - it removes the first
element less than 1, but only one such element, and there's no telling
which. That's not an operation I've ever needed to perform on a set.

To clarify, I don't think the current system is hurting Set and
Dictionary in any way. It's simply providing them with methods that
aren't very useful for that particular data structure.

I agree that because of Set's nature, some Collection algorithms are
less-applicable than they would otherwise be. Does that mean Set isn't
fundamentally a Collection? No it does not. A Collection is a
multipass sequence with a representation of position. Set definitely
has those properties.

···

on Fri May 13 2016, Nate Cook <swift-evolution@swift.org> wrote:

On May 13, 2016, at 9:36 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave