Is a set a sequence?

Probably I'm totally confused, then sorry. But Swift's 'Sequence' now means(read 'implemented as') "ordered or unordered" bag of things, no constraint, so it is not in harmony with mathematical definition. But 'Set' currently is unordered bag of things.

As I understand the core team's (and other's) replies on that subject [please correct me if I'm wrong] - the already existed generic code written over Sequence protocol should still be allowed to accept Sets and Dictionaries.

I, as a developer, given current meaning of Sequence, and understanding that this most likely will not be changed[ABI stabilization, breaking changes etc], want at least be able to limit MY generic code to allow only "ordered" Sequences, if this is important for that particular code.

So, isn't the solution(well, workaround) here is to be able to write something like:
func foo(..) where T: Sequence&Ordered {...}
, and don't conform Set and Dictionary to that Ordered protocol, but other sequences should be conformed to it.
I.e. nothing will be changed internally, just new empty(?) protocol Ordered be introduced and each built-in "ordered" sequence will conform to it additionally. Probably even typealias OrderedSequence = Sequence&Ordered could be introduced.

But IIUC the rule we have regarding protocols is that they should not be just "markers" but should be useful. And such Ordered protocol is not useful by its own. Probably there could be an exception in that rule?
Dumb idea? :-)

Well, the documentation can be really confusing (see https://developer.apple.com/documentation/swift/sequence/2947204-fill as example ;-)
But I couldn't find the word "unordered" in it, and as Sequence is just a protocol, there is no implementation of the core functionality.

Sequence can't have any requirements how the order in a type implementing it is defined, but many operations just only make sense when the order is imposed by the creator of the Sequence:
How do you reverse a bag of unordered things, and why should you want to do so?

The motivation of this thread stems from the discussion about one particular method (elementsEqual), which has a confusing effect for Sets:
You can apply it on two Sets that have exactly the same content — and get false as a result!
This happens if the two Sets have a different ordering, and the order inside a Set is something you can only predict when you know the internals of that type.

So the only contradiction with Sequence is that it is implemented by types like Set and Dictionary - and the idea is to simply change that.

1 Like

Mimicking mathematical sets or sequences is not the job of a Swift set. Its job is to provide a set of operations, with guaranteed algorithmic complexity, that cover a set of related use cases, and that all together form a "data structure" which happens to be named "set" because that data structure looks like a mathematical set:

  • uniqued elements
  • containment test
  • element insertion
  • element removal
  • iteration

An interesting divergence with mathematical sets is that a Swift set requires its elements to conform to Hashable, when a mathematical set would have been OK with Equatable (I mean a way to distinguish different elements). That's because Swift set wants its methods to have a low complexity. This divergence is interesting because it's a reminder that we're not talking mathematics, here.

An interesting convergence with mathematical sets and sequences is that a Swift set can be iterated. In maths, you can write "let F = { x in E | x * 2 }", or "let V(n) = U(n) + 1". OK, that's not really the same iteration as the iteration of a Swift set. But none of those objects really want to hide their contents (at least when they're countable).

When you iterate a Swift set, you materialize an ordering. Let's call it the iteration order of the set.

But can I? Does this "iteration order" even exist? Is it a stable property of a set? I mean, can I write the assertions below?

let s = Set(...)
assert(s.elementsEqual(s))
assert(Array(s) == Array(s))
if let i = s.index(where: predicate) {
    assert(predicate(s[i]))
}

The shy answer is: WHY NOT, as long as the Swift set keeps on fulfilling its job at being an implementation of the data structure that is called "set".

The pragmatic answer is: YES OF COURSE, because preventing users from relying on those assertions would 1. require to increase the complexity of the Swift Set implementation, and 2. make the Swift Set a painful data structure to work with.

The consequence is that a Swift set is an ordered data structure. You can't control this order. But you can't say it does not exist. And it's a hard fact.

The sentence "sets are unordered" is a teacher's sentence. It's a way to tell that one can't control the set order, and should not, generally, rely on it. You don't know in which order elements will be outputted in for x in set { print(x) }. It may change on the next program execution. It may be different for two sets that contain the same elements. I find it very funny that the word "unordered" is used to describe... an ordering. It should not be taken too seriously.

6 Likes

As I wrote in the other thread: I don't think that we can build any data structure that is truly unordered, and it would be futile to try to remove the ordering from Set.
What can be done is making a split between types that have their order because its user wants it that way, and those which "accidentally" become ordered, because there is a way to extract elements one after another.

Exactly. But many operations of sequence are there to query for properties of the order - which is always possible when there's iteration, but diminishes the semantic of Sequence and degrades it to a bunch of methods.

Using the same logic that justifies Set having methods like reversed, every single type in Swift should conform to Error.

To me, an ordering is something different. Just because I can iterate the elements and I do get these elements one after another, there is not necessarily ordering in the data structure itself.

Asserting that a Set itself has some kind of ordering isn't a good idea, because to make some sense of the iteration order you get, you need to rely on the implementation details of Set that creates the iteration order.

We seem to have a different understanding of "order", because "unordered" implies to me that there is no rule under which elements have a certain appearance in a row (in the current context), but not that I can't iterate them. Certainly there may be some sort of rule in the implementation, but this should be an implementation detail in case of Set and I would therefor assert this order of appearance to be arbitrary.

In general I do support creating a separate protocol for Iteratable types, because to me iteration is clearly something different then ordering.

As long as addressing inside a computer is based on natural numbers, there is always some ordering in sense of this elements address is before that other elements address, but this is not a sencefull ordering in most use cases.

4 Likes

The problem is that since iteration implies ordering, even if it is an implementation-driven ordering, separating those protocols will just have people dump their so-called unordered types into an array, and proceed with their tasks.

Just like they used to do, in the past:

Look at Objective-C NSSet. Its "unordered" smell is strong: it has no firstObject property, like NSArray, but instead anyObject. Much unordered! Of course, the illusion breaks as soon as you see that NSSet conforms to NSFastEnumeration (meaning it has its iteration order just like Swift sets). And, of course, it has the allObjects method which returns the set elements in an NSArray, because, come on, we sometimes want to leave the "unordered" game and do some actual work.

Enters Swift, a protocol-oriented language.

Protocol-oriented programming destroys the little game that NSSet was playing, the "unordered myth". You can no longer pretend that you are unordered when you are iterable. As soon as you have makeIterator(), you get all the methods that you can build on top of the iteration order. That's why Swift sets have prefix(while:), despite all teachers claiming that a Set is "unordered".

Oh, and we no longer have to consume memory by creating an ad-hoc array, like -[NSSet allObjects] when we want to break the unordered illusion, and actually use the iteration order of the set. That's pretty cool, IMHO.

The unordered set myth has been destroyed by POP.

7 Likes

You can write a (degenerated) version of Gauss' technique to find the sum of integers with reversed, and it works with sets :-)

func gaussSum<S: Sequence>(_ s: S) -> Int where S.Element == Int {
    return zip(s, s.reversed()).reduce(0) { $0 + $1.0 + $1.1 } / 2
}

gaussSum([1, 2, 3, 4, 5]) // 15
gaussSum(Set([1, 2, 3, 4, 5])) // 15
gaussSum(1..<6) // 15

(I'm kidding, this is not a "real use" of Set.reversed)

As I tried to point out, Iteration does not really require (persistent) ordering. If you need to do order dependant operations on a set, in my opinion you should itterate and copy it to an array, even if it is slow, because you can (should) not assume for a Set what is assumable for an ordered collection.

The for loop style iteration needs some kind of ordering in the moment the iteration runs, in terms that you wont get one element twice. Using forEach, which is also iterating for me, does (in it‘s syntax) not require ordering.

It seems as if Sequence destroyed what you call unordered mythos and not protocol orientated programming in general, because Iteratable would just be a protocol too with the operations required for iteration, but without the order dependant ones.

The question is what ordering has a Set?
Array‘s ordering is defined by it‘s indices, such that the first element is before the second, etc. When using an array you may assert some kind of additional ordering implied by your use, e.g. as a Stack where in this example elements with smaller indices are older than those with higher indices.

Now a Set is something diffrent then an Array with unique elements and has no clear ordering.

In my understanding of ordering any arbitrary ordering implies no use cases since to me ordering implies a rule of order. Please correct me if there are use cases for such an (un)ordering, that does not abstract from a specific order but requires specifically an arbitrary ordering.

To summand this: My point of view is: A temporal ordering is no real ordering.

I don‘t know the current implementation of Set, but in a HashSet like java‘s the iteration order could be the order of the hash values, which is an ordering, but no real ordering of the elements and if you are required to use this ordering, you should not use HashSet because this isn‘t a feature of HashSet. This would be like using a TreeSet to implement in-order traversation because TreeSet randomly implements iteration this way.

If you don't mutate a Set, I believe that it multiple iterations over it are guaranteed to be the same. There's no need to copy to an Array in this case.

3 Likes

I guess this is true for Swifts Set implementation, but speaking about all implementations of Sets, it is not required.

Assume a optimized Set implementation that decides to use another iteration strategy if memory is short or a device runs in low power mode. This implementation will not guarantee the same iteration order even if the Set isn‘t mutated.
The point is that this feature is not one of Sets features but follows from implementation details.

Note that the iteration order remains the same, (and also the implicit implementation order), but how can iteration order be kept if an element is inserted? Where should it be placed? At the end? Why? There is just no background rule in iteration ordering which makes it in my opinion no real ordering.

1 Like

It’s a little unclear, but if you are suggesting some equivalent of “allObjects” which returns a Sequence (not necessarily an array), I’d +1 that.

A set’s contents can be viewed as a sequence, but that’s different from the set itself being a sequence.

3 Likes

So we can do stupid things with Set.reversed ;-).
Most programming languages and especially Swift try to protect us from our own stupidity, and as elementsEqual demonstrated, that didn't work that well in this case.

To not only blame Set all the time I have another example:
Someone wrote a program for a sports and included a small new routine to print out the winners.

struct Marathon {
	var succesfulContestants: Array<(String, TimeInterval)>
 }
    // real Marathon type is bigger and defined elsewhere

func printPodium(of marathon: Marathon) {
	marathon.succesfulContestants.prefix(3).forEach {
		print("Contestant: \($0.0) (\($0.1 / 3600))")
	}
}

Thanks to autocompletion, it's very easy to write this down, and maybe there's even a unit test that said everything is ok.
Now, some time later, someone else has to change the software, and he has to add a feature so that he can easily look up the time for each runner.
So, he makes a little change:

var succesfulContestants: Dictionary<String, TimeInterval>

The compiler doesn't complain, the unit tests are green, but the developer silently broke an important functionality.

So, the question shouldn't be "is it harmful that Dictionary conforms to Sequence", but rather "is there any downside when I can only iterate through Dictionary's contents, and not call methods that depend on a meaningful order?"

2 Likes

I guess I didn't explain enough what I'm thinking of:
Just because of POP, it would be rather stupid to dump a Set into an array.

extension Set: Sequence {}

would be everything that's required for tasks like asking a set for a prefix (I'm still looking for an explanation why somebody should want to do so).
Imho "we do because we can" is a poor justification, and removing those methods isn't patronizing - it just makes Sequence a more powerful abstraction.

A single "real use" would be a big step forward for the whole discussion ;-)

Multi-pass iteration is a requirement that Collection imposes, and by conforming, Set must provide this guarantee as well.

Then this "Set" cannot conform to Collection, unless it clearly documents the departure (personally, I'd say that even documenting this doesn't make it right, but I digress)

As I mentioned, this guarantee only holds if you do not mutate the Set. Once you do that, all bets are off.

1 Like

The answer to this good question:

(see the rest of that topic for more interesting guarantees)

1 Like

Ok, I didn‘t knew this requirement. Thanks for pointing it out.

So, ...we would have

  • A new protocol Iterable, with just Element, Iterator, and makeIterator(). Swift will use this to base for-in loops on (instead of Sequence).
  • A refined protocol Sequence with the remaining interface of the old Sequence.
  • A now-directly refined protocol Collection, with the Sequence interface ripped out.
    • Index wouldn't necessarily be Comparable. This could be helpful for multi-dimensional containers. (This could possibly include fixed-size arrays if we decide that structural types can follow protocols.)
    • Maybe there would be a SubCollection, an analog of the no longer present-in-Collection Subsequence.
    • Should Indices only conform to Iterable? Or should they also always conform to Collection too? I feel that Indices should definitely conform to Sequence whenever the owning collection type also does.
  • The "unordered" containers Set and Dictionary would conform to just Collection.
  • But general linear containers, like Array, could conform to both Collection and Sequence.
    • SubSequence and SubCollection would have to be the same type.
    • Index now would have to conform to Comparable.
  • I think that MutableCollection could refine just Collection....
  • ...But BidirectionalCollection, RandomAccessCollection, and RangeReplaceableCollection would refine Sequence too.

Strangely, as someone who hasn't liked the recent talk about crippling interfaces so "not really sequences" Set and Dictionary would be denied, I like this idea.

5 Likes

I'd like for this Sequence to provide a multi-pass guarantee. See my comments on [Pitch] Remove the single-pass requirement on Sequence.

1 Like

Oh, Index for non-Sequence Collections still has to be at least Equatable.

We should do some name shuffling to retain some backwards compatibility.

protocol Iterable { /*...*/ }
protocol Sequence: Iterable { /*...*/ }
protocol UnorderedCollection: Iterable { /*...*/ }
protocol UnorderedMutableCollection: UnorderedCollection { /*...*/ }
typealias Collection = UnorderedCollection & Sequence where SubSequence == SubCollection, Index: Comparable, Indices: Sequence
typealias MutableCollection = UnorderedMutableCollection & Collection
protocol BidirectionalCollection: Collection { /*...*/ }
protocol RandomAccessCollection: BidirectionalCollection { /*...*/ }
protocol RangeReplaceableCollection: Collection { /*...*/ }
1 Like

And through that link, it seems that someone already came up with a similar idea. That link has at least one other attempted pass.