[Review] SE-0074: Implementation of Binary Search functions

dabrahams · May 17, 2016, 11:30pm

That does not strike me as a useful-enough distinction to be worth
enshrining in a protocol. What generic components would be constrained
on it?

···

on Fri May 13 2016, Joe Groff <swift-evolution@swift.org> wrote:

On May 13, 2016, at 7:30 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Mon May 09 2016, Joe Groff <jgroff-AT-apple.com> wrote:

On May 9, 2016, at 6:23 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* Operations that depend on sorted-ness and use binary predicates should
not be available on all Collections; they're too easy to misuse,
they're hard to name well, and as Nicola Salmoria has noted, they

would not make any sense at all for a Set<T>.

* They should be scoped to a kind of collection that bundles
the predicate with the elements, e.g.

let x = Sorted([3, 4, 1, 5, 2], >) // stores a sorted copy of the array
let y = Sorted(preSorted: 0..<100, <) // stores a copy of the range

Maybe there should also be protocols for this; CountableRange<T> would
already already conform to the immutable version. We might want a
mutable form of the protocol for sorted collections with
insertion/removal methods. This whole area needs more design.

I agree with both of these statements, but not with your conclusion.

There are three classes of collections:

1) Those which are always sorted, like a SortedSet.
2) Those which may or may not be sorted, like an Array.
3) Those which are never sorted, like a Set.

These APIs are useless on a #3, but #2 is still a valuable use case
to support. In particular, it's quite common to use sorted `Array`s,
and these APIs would help you do that.

What I might consider doing is tying this to
`RangeReplaceableCollection`. That protocol is applied only to types
which allow insertion at arbitrary indices, which is a good, though
not perfect, proxy for types which might allow you to manually
maintain a sort order. `Array`, `ArraySlice`, `ContiguousArray`, and
the mutable `String` views would get these methods, while `Set` and
`Dictionary` would not.

We could also introduce a new OrderedCollection protocol. (This would
also be useful in the future for supporting `case` pattern matching on
collections. It makes sense to pattern-match arrays and other ordered
collections in order by element, but you'd expect very different
semantics pattern-matching an unordered Set.)

What do you mean by “Ordered” here? Please note that when Cocoa uses
“Ordered” it means something very different from “Sorted.” I don't find
the Cocoa usage intuitive myself, but it might be best to avoid that
term to avoid confusion.

By "ordered", I only mean "ordering is significant to the value of the
collection", so Array is ordered but Set is not.

--
-Dave

dabrahams · May 15, 2016, 5:58pm

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

I don't see how the methods that impose meaning on the order of a set
are useful. What does mySet.prefix(upTo: i) mean when I have no
control or dependable way of knowing which elements lie between
startIndex and i? mySet.first is useful, but it's meaning is more like
NSSet's anyObject.

If you give me a random collection of Ts, I may want to find the index
of the first element that satisfies some predicate. Then I may want to
process the prefix of elements that don't satisfy that predicate, and
repeat.

In general, many algorithms that operate on a collection place no
requirements on the semantic relationship and/or ordering of the
elements, and such algorithms can still be useful on Sets. Use your
imagination and you'll see more of them, e.g. find the index of the
maximum element in the set (so that you can remove it).

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

s.remove(at: s.index(where: { $0 < 1 }))

Since Set's remove(at:) method is type-specific, it would need to be
rewritten as remove(where:).

? What does “type-specific” mean and why do you say so?

If we don't have a remove(at:) method for Set, that's a bug. Whether we
need a RangeRemovableCollection that is refined by
RangeReplaceableCollection is an interesting question.

This example is kind of my point, though - it removes the first
element less than 1, but only one such element, and there's no telling
which. That's not an operation I've ever needed to perform on a set.

To clarify, I don't think the current system is hurting Set and
Dictionary in any way. It's simply providing them with methods that
aren't very useful for that particular data structure.

It's true that it's harder to find algorithms that are *very likely* to
be useful on collections whose ordering is not controlled, but that
doesn't mean those collections shouldn't satisfy the Collection
protocol. Even parallelization of trivial operations such as “find the
minimum element” requires the the fundamental properties of Collection.

···

on Fri May 13 2016, Nate Cook <nate-AT-natecook.com> wrote:

On May 13, 2016, at 9:36 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave

dabrahams · May 17, 2016, 11:41pm

Yet another alternative would be to drop Set and Dictionary down a
level to a FiniteSequence protocol in between Sequence and
Collection. Basically none of the index-based collection APIs
(i.e. everything except `count` and `isEmpty`) make sense on sets and
dictionaries.

Strongly disagreed. Any read-only operation that makes sense on a
bidirectional collection makes sense on these data structures.

I don't see how the methods that impose meaning on the order of a set
are useful. What does mySet.prefix(upTo: i) mean when I have no
control or dependable way of knowing which elements lie between
startIndex and i? mySet.first is useful, but it's meaning is more like
NSSet's anyObject.

If you want to process the elements of your set in parallel, you want to
break it down into equally-sized regions and distribute the work across
cores. That's indexing and slicing.

index(where:) was marginally useful with dictionaries, but now that
Sequence is getting first(where:), née find(...), even that isn't
necessary.

s.remove(at: s.index(where: { $0 < 1 }))

Since Set's remove(at:) method is type-specific, it would need to be
rewritten as remove(where:).

That would mean “remove all elements satisfying this predicate,” to me.

This example is kind of my point, though - it removes the first
element less than 1, but only one such element, and there's no telling
which. That's not an operation I've ever needed to perform on a set.

To clarify, I don't think the current system is hurting Set and
Dictionary in any way. It's simply providing them with methods that
aren't very useful for that particular data structure.

I agree that because of Set's nature, some Collection algorithms are
less-applicable than they would otherwise be. Does that mean Set isn't
fundamentally a Collection? No it does not. A Collection is a
multipass sequence with a representation of position. Set definitely
has those properties.

···

on Fri May 13 2016, Nate Cook <swift-evolution@swift.org> wrote:

On May 13, 2016, at 9:36 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon May 09 2016, Nate Cook <natecook-AT-gmail.com> wrote:

--
-Dave