Is a set a sequence?

And through that link, it seems that someone already came up with a similar idea. That link has at least one other attempted pass.

If we're going to be bold about it, then let's just solve this once and for all by removing all order dependent operations from Set and Dictionary, including Collection conformance, use in for-in loops, etc. I still don't see why these are any less problematic than the other operations you've identified. This will leave only the ability to ask if they contain a given element, and remove all abilities to access all the contained elements as such access would inevitably reveal an order. These would then be truly unordered collections.

Unfortunately, that makes them essentially useless to anyone, because many algorithms are now either unimplementable or too computationally expensive to use. So instead I just accept that a Swift Set is also a Swift Sequence. This is inherent, and required to make them useful, not accidental, even if I can't think of a great use for a few Sequence methods. I also can't think of a great use for calling map on an infinite Sequence, which is probably more harmful, and if I dig around the whole Swift protocol hierarchy I'm sure I can find a lot of similar examples that, nonetheless, probably don't justify a more complex protocol hierarchy.

Now, the question remains about how exactly the fundamental Sequence nature should be exposed here. As I said in the other thread, perhaps Collection and Sequence conformance should have been provided as a view on Dictionary and Set instead of conforming them directly, but it's not clear to me that complexity would be worthwhile if we were starting from scratch, and I'm certainly not convinced it's worth breaking source compatibility now. I also definitely don't understand any solution here that retains a random assortment of functionality that reveals the ordering (e.g. for-in loops, Collection functionality), while removing or quarantining another random assortment (e.g. reversed, prefix). What would justify making that arbitrary distinction?

2 Likes

I'm not in the position to complain about satire - but I'd prefer it to be funny ;-)
The confusion with elementsEqual is a real world problem, and there are several other examples that illustrate that the status quo is far from perfect.
On the other hand, I'm still waiting for a single example that shows that Set.elementsEqual can be useful, or that removing this method would cause any harm.

The for loop is a vehicle to process the contents of a collection, and you can use it for tasks that depend on a certain order, as well as to perform operations where you just need to do something with each element.
There is even the possibility to use loop-syntax to perform those operations in parallel. (just emphasized this, because I think that really hasn't come up yet - and this is a case where it makes a big difference if things don't have to be computed sequentially...)

The methods in question however just make no sense (or at least nobody could present it yet) for Set and Dictionary.
We can't prevent users from shooting into their own feet, but we don't have to equip them with a footgun.

2 Likes

Kind of rude, but it's not satirical, it's just an outline of the thought process that led to my conclusions. elementsEqual is indeed a problem, mostly because it takes two Sequences and was an attractive nuisance before things were made Equatable, but that is already being discussed in its own thread so I presume this thread is about the remaining APIs that expose order. The other handful of methods are perhaps not particularly useful in a lot of situations, but I am not convinced they are particularly harmful.

Sure, but equally something like prefix could be used to get the first n elements of a collection that I know is in a given order, as in your marathon example, or to get an arbitrary n elements, perhaps because you have some efficient method of processing data in fixed sized chunks.

Your marathon example could have been reasonably written as

func printPodium(of marathon: Marathon) {
  var place = 1

  for contestant in marathon.succesfulContestants {
    guard place <= 3 else { break }
    print("Contestant: \(contestant.0) (\(contestant.1 / 3600))")
    place += 1
  }
}

which exhibits exactly the same issue as your prefix code. So why is prefix a problem but not for-in? You seem to have a strong opinion about which operations that expose an order are harmful and which ones aren't, but I don't understand the basis for those opinions.

The problematic methods are those which are there to reason about the order in Sequence, and they are problematic because order isn't meaningful for Set.

It is very straightforward to write extension Sequence where Element: Equatable, and that extension is perfect for most true sequences, and even when there are better specific implementations, it will always return the right answer.

Equality for Set and Dictionary needs a special implementation, because despite the common definition of sequence, order doesn't matter for those.
This is just a tiny example, but the general problem is that we can't write generic algorithms that work on sequences, because those can't exclude Set and Dictionary

This is definitely bad, but every language is full of tradeoffs, and there are situations where you have to choose the lesser evil.
But I wonder when this is such a situation, why can nobody tell us what's the "bigger evil" of changing the protocol hierarchy is?
I've read many posts about this issue, and the "best" argument for keeping the status quo that I've heard is "it doesn't bug me, and change means work" (and that's the path to mediocrity).

Somebody did:

As you see, even someone from Core can't give a single example... without details, this is just an opinion as well.
Even if it's the opinion of someone who probably has more insight about Swift than we all together, it's no real explanation that shows the benefits of the current design.
The downsides, on the other hand, can be demonstrated easily.

3 Likes

I guess we could have had the relationship between Collection and Sequence be like Sequence and IteratorProtocol. Collection would return a sequence-accessor from a makeSequence() method. The core language would have to be slightly changed to recognize both Sequence instances to use makeIterator() and Collection instances to use makeSequence().makeIterator(). Sequence and UnorderedCollection would be completely distinct protocols without needing to create a common Iterable base.

Just like a sequence type can conform to both Sequence and IteratorProtocol and have instances be their own iterators, we could have the ordered collections conform to both Sequence and UnorderedCollection and let collections be their own sequences. We would need to enforce having SubSequence and SubCollection being equal and Index being Comparable (instead of just being Equatable).

(Hmm, a collection that conforms to both UnorderedCollection and Sequence couldn't really also triple-conform to IteratorProtocol, because that implies being mutli-pass and suicidally single-pass at the same time. But is there any way to prevent that? Technically, we could do that now, but what does that mean in practice?)

for-in, Collection conformance, etc also expose an order, and I showed that your example of why prefix could be harmful applies equally to for-in. I keep asking about how you are separating harmful API from benign here, but I haven't worked it out yet. I would understand if the argument was that all such order-exposing API be quarantined on a Collection and Sequence conforming view, and I'm not unsympathetic to that view, but I'm struggling to understand why some seemingly random assortment need to be removed. And which assortment should it be? I've given an example of how prefix can be perfectly useful on a Set, for example.

There is API that has no valid use case, and can only be used in a harmful way - and if it's possible to get rid of such functionality that does nothing but cause confusion, that's imho a strong motivation to remove it.

Note that I expose my position quite a lot with the first statement: A single example that shows the merit of Set.elementsEqual or Dictionary.starts(with:) would force me to back down at least a bit... but that didn't happen yet.

for-in has useful applications for Set, so I don't think it makes any sense to forbid it:
If someone iterates over a Set he probably knows what he is doing.
But Sequence does not only say "this can be looped over" - it also builds abstractions on top of this requirement, which don't work for all (current) implementers.

Would you say it's a good idea to "enrich" an array of bytes with string-methods, instead of having a dedicated String type?

Everyone has already agreed that elementsEqual is, at the very least, poorly named in a way that makes it harmful, sure. As you know, it's the subject of a proposal that is currently under review and I hope it is addressed in some way. I'm having trouble understanding how starts(with:) is confusing or harmful though, because the name makes it clear to me that it must apply to the iteration/Sequence order. Usefulness is another question, but I haven't thought about it much. So your criteria for creating a separate protocol hierarchy is “can I find some conforming type where I can't think of a use for one or more methods”?

Strings are a great example for you to raise here, because they were recently made to conform to Collection for pragmatic reasons despite violating some of the semantics/expectations of that protocol. Collection conformance used to be available on a .characters view instead, but this wasn't considered ergonomic. I suppose an alternative would have been to create an NotQuiteACollection protocol that somehow addressed the strange aspects of String as a Collection (appending two Strings may not result in a collection with a length that is the sum of the two, etc). Generic algorithms written for Collections can fail unexpectedly on certain input Strings. There are some good parallels here, I think.

It is harmful because in many cases, you don't actually see or even know that you are calling the problematic methods of Set:
Swift can infer types in many situations, so when you work with the result of a method call, you may expect an array, and call methods that make a lot of sense for Array on it - without even realizing that it's actually a Set or Dictionary!

It's even worse when you try to write algorithms for sequences:

extension Sequence where Element: Equatable {
	static func ==(lhs: Self, rhs: Self) -> Bool {
		var left = lhs.makeIterator()
		var right = rhs.makeIterator()
		while let l = left.next(), let r = right.next() {
			if l != r {
				return false
			}
		}
		return left.next() == right.next()
	}
}

This (or "this" without the bugs I may have included, and some other improvements ;-) is a perfect implementation of an equality check for every true sequence, and it gives a right result even when there are specific solutions that are faster.
It just isn't compatible with types like Set which always needs special treatment.
Of course, I don't have to write Set.== -- but we face the same problem of special treatment whenever we build on top of Sequence, just because it got thrown into the same bucket as Set.

Unlike with your position that a Swift Set isn't a Swift Sequence, I completely agree here. And I'm not trying to say Set has to cease being a Sequence – rather, the problem of lacking granularity of the current hierarchy, which will aggravate as Swift evolves, is something undoubtedly if not important, then at least worth attention and to be addressed sooner or later. And definitely not a topic that must be banned in the community by updating the commonly rejected proposals list.

3 Likes

I hope I never said so, as it's definitely not my position: Set: Sequence is a fact. I try hard never to deny facts (but I sometimes question their reasonableness ;-).
I wouldn't even mind an update of the commonly rejected proposals list... as long as this includes a rationale that really illustrates the reason for rejection.

You indeed haven't said so directly, at least to me. Is "Swift's Set has to drop conformance to Sequence" the right way to put it?

Apparently that's a bit of an ordeal right now.

I'm rarely truly serious when I'm nitpicking -- that would be to much stress ;-)
But if I was asked for a summary... "I asked for a justification of the status quo, with a somewhat muddy vision that a slimmed down Iteratable would be superior to Sequence"

I had to look up "ordeal" -- and depending on which translation I pick, that might apply to all software development ;-)

1 Like

If you expect some random Array from a method call that you seemingly have no idea about the semantics of (e.g. is it ordered in any reasonable way?) then I don't see why these order-dependent methods are any more helpful for that essentially unordered Array than they are for a Set.

Your definition of == here is a perfectly sensible operation for a generic context where you only know you have two Sequences. It is, however, poorly named (just like elementsEqual) because == is tied so closely to the Equatable protocol that it is likely to be confusing. It isn't just incompatible with our understanding of equality for “unordered” collections like Set and Dictionary, it's also incompatible for any type that conforms to Sequence, has a clear notion of order, but also has a notion of equality that isn't solely related to the iterated Sequence (e.g. it has other pertinent properties that also need to be compared). Perhaps you could separate all such types by introducing a EqualInSequenceOrderImpliesEqualInstancesSequence protocol…

It's like the case of the for loop:
If you can do something useful with it, imho it's fine to have it. I just see little value in things that can only be used for mischief.
Asking an array if it starts with a prefix may not always make sense -- but doing so with a Dictionary most likely never makes sense.

I say both method names are just fine, and that instead the requirements for Sequence are poorly defined (so far, every definition of sequence has a big emphasize on order).
It's straightforward to make arrays, vectors, lists and other variants conform to Equatable with a single default implementation (if their elements are equatable as well), and there are many other algorithms that can be implemented in the exactly same way for all those types.

I dislike when people claim that you should always favor composition over inheritance -- but when you have a sequence with additional properties, I'd probably say that this type shouldn't be a Sequence, but rather a separate type with a property that is a Sequence.

(Replying to the original post, I've not been following the discussion very closely)

Does the suggestion (aside from adding Iterable), involve making Sequence require an ordering? Because as far as I can tell, none of the existing collection protocols have this requirement, making this a breaking change. Has adding a new protocol been considered?

It seems all the order-dependant functions on Sequence and inheriting protocols are there simply because no existing protocols offer insight into whether the conforming type maintains an order. Today's Sequence would be better described as Iterable, as the original post implies, and Collection is just a ‘bag of things’, with only RangeReplaceableCollection (one of the five Collection protocols) having ordering requirements via append.

The proper course of action, then, would be to introduce OrderedSequence as an ordered form of Sequence. Existing order-dependent extensions on Sequence would be moved to OrderedSequence, with the originals deprecated, and order-dependant extensions on Collection protocols would have an added where Self: OrderedSequence requirement. RangeReplaceableCollection would be inherit from OrderedSequence, and because OrderedSequence has no additional requirements, this wouldn't be a breaking change.

This way, all existing RangeReplaceableCollection constraints and types get OrderedSequence for free. Deprecation warnings would only appear when a user is using order-dependent functions on a type which doesn't provide an ordering guarantee, which seems almost beneficial.

Whether it's worth adding this level of detail is up for debate, but from what I can tell, the migration path is pretty straightforward. I regret not having the time to forward a proposal right now.

Yes, that is imho one big problem of the current system.
As I understand protocols, they are meant to be an abstraction -- and Sequence fails disastrous in this aspect:
Only a handful of its many methods can be used without risk, and there is no way to write algorithms that operate on ordered collections, so that they can be used with Array as well as with a list...

I don't think this would be sufficient, because of the many methods of Sequence that rely on order -- and it would be kind of a tautology (they often can even be used synonymous)

So far, nobody brought up any fundamental issues with that approach (except that it's work to do ;-) -- and there has been lots of time to come up reasons why it should not be feasible to change some protocols.
What I have seen in other threads is claims that the option already has been discussed and discarded... but I'd prefer Evolution to be evidence-based rather than claim-based.