It wasn't presented that way, but that option was considered.
The combination of first
and dropFirst()
is sometimes used in âreduce-styleâ algorithms which need to visit every element of a collection but are not sensitive to the order of elements. These algorithms work just fine with Dictionary
since first
will return the same element dropFirst()
removes as long as you donât mutate the collection between the two calls.
For instance, consider this implementation of max()
:
extension Collection {
func myMax(
by areInIncreasingOrder: (Element, Element) throws -> Bool
) rethrows -> Element? {
guard var result = first else { return nil }
for elem in dropFirst() where areInIncreasingOrder(result, elem) {
result = elem
}
return result
}
}
It works just fine on a Dictionary
:
print(
âkey of largest value:â,
someDictionary.myMax(by: { $0.value < $1.value })?.key
)
randomElement()
would not work here because there is no equivalent to dropFirst()
for it. Moreover, even if we did have a dropRandomElement()
which returned the elements the previous randomElement()
hadnât, or some other way to split a Dictionary
into first and rest with a random âfirst elementâ, we would be giving up performance for no good reasonâgenerating a random number is relatively slow and it buys us nothing here.
Iâve shown this as an extension on Collection
, but it would make perfect sense in an extension directly on Dictionary
, too. In fact, Iâd argue that there is no better implementation for it on Dictionary
than this one.
Sometimes you just want to get the first element the iteration will return to you, no matter what arbitrary order the iteration might be in. Thatâs what first
is for, and itâs a perfectly coherent thing to want even on a Dictionary
.
Itâs far from obvious why you wouldnât use an iterator for this.
Iteration implies order, just as much as, if not more so than, first
does. I'm not understanding how using iteration for that particular task would change anything.
Not exactly. The for
-in
system only requires that an element-baring instance allows the loop to visit each element exactly once. That does not mean said visitation order is part of that instance's type's semantic! That's the whole problem with Sequence
et al.; its significant methods do imply that semantic, which is an over-specification. It's the cause of the whole "Dictionary
and Set
aren't really sequences" problem.
An improved hierarchy would be:
-
Collective
(i.e.Sequence
without thedropFirst
,prefix
,suffix
, etc. parts) -
PersistentCollective: Collective
(safe for multi-pass) Sequence: Collective
typealias PersistentSequence = PersistentCollective & Sequence
-
IndexedCollective: PersistentCollective
(i.e.Collection
without thestartIndex
,endIndex
, etc. parts; it wouldn't have range-based indexing either, sinceIndex
would beEquatable
instead ofComparable
) Collection: Sequence & IndexedCollective
But inserting new base protocols is ABI-breaking, even if the members are distributed in a way that still preserves source compatibility. I did once see a post on this forum suggesting a way to make a protocol conform to another after the fact (i.e. like an extension). If that's added, then we could reexamine adding the above protocols.
Even if we did, switching Dictionary
and Set
to IndexedCollective
would be source-breaking. We would have to make a NeoDictionary
and NeoSet
with the reduced requirements, change the implementations of the first versions to use the second ones, then transition users to prefer the second ones under most circumstances.
I feel itâs quite different to say âa dictionary can provide an iterator that yields its pairs in some orderâ and âa dictionary has some orderâ. If this kind of algorithm can be rewritten in terms of iterators in general, which I believe is the case, it negates Brentâs conclusion:
I see. Thank you for explaining.
I might be missing something here, but even if we were saying the dictionary or set can only provide an iterator over its element in some order rather than being a collection of elements in some order itself wouldn't solve all problems, would it?
The underlying conflict of these unordered collections vending an iterator and their Equatable conformance would remain. And I really can't think of a realistic way to solve this problem: Iterating over unordered collections has to happen in some order. And the moment this order is nondeterministic or depends on things other than the value of such a collection, i.e. the order in which the elements were inserted, we have this fundamental conflict with Equatable: suddenly these instances are not substitutable any more!
Threading #FAIL
Many of the postings in this topic appear to be replies to earlier postings, but are missing references to the things they're replies to, so I can't tell to whom I should respond. I don't know if this is because respondents are using the global "reply" button instead of the one below the posting to which they're replying, or because of a bug in Discourse, or something else, but it's definitely making it harder to participate.
@forum_admins Is there, perhaps, an update to Discourse that would make this better?
I've experienced that the button below individual posts is hit-or-miss as to whether it actually ends up linking the replied-to comment. It's happened enough that I'm reasonably sure (though not 100% sure!) it's not just a mistake on my part hitting the global reply button. I pretty much always quote now when making a direct response to avoid losing context.
the link doesn't show up when you reply to the last message, otherwise it does
It's consistent for me.
Oh, that's a terrible bug in Discourse, because new posts can come in while you're replying! @forum_admins I really do hope there's an update that could fix this problem.
Is there a reason you think that the link won't show up if new posts come while you're replying? While I'm writing a post, discourse still knows that it's a reply to your post.
It's not conclusive of course, but this post is one reason. It's clearly not a reply to the post just above it.
My observation is that Discourse will show which post a reply is to in the web interface unless it's a reply to the post just above it. It's probably a feature to declutter things. But it tracks this information fine since anyone receiving emails will get a clear mention of the post it is replying to.
Here I'm replying to Dave's post just above to show my point.
(This topic should probably go to a separate thread in the Site Feedback section.)
Actually I took it here. @forum_admins it does seem that there's a setting you can adjust; see the reply to my posting.
I'm not sure that should even be a goal - uniform iteration seems to be saying that we want a single protocol or syntax to represent both the multi-pass sequence (the Array
) and the single-pass stream, so the problem is locked-in to the question. One is a mutating operation and the other is not, so it's unclear to me why they should look the same. That mutability carries important semantic connotations - namely, about whether or not iteration change any state (and therefore may not be able to be repeated).
Iterators are definitely single-pass - once they return a nil
, that's it, all they will ever return is nil
. Sequence
claims to be single-pass, but supports creating an iterator without mutating (which somehow is not the same as non-mutating iteration...? ) - oh, and since almost all real-world
Sequence
s are multi-pass, almost no 3rd-party code gets tested against single-pass sequences, so following that particular piece of documentation is unlikely to be a good idea. If you're actually writing a single-pass sequence, you should expose an Iterator.
It seems to me that changing Sequence
to be multi-pass is the simplest way to fix our collection hierarchy (even just to accept reality). Then we can talk about whether Sequence
(a multi-pass sequence without a notion of position) makes sense as an abstraction level between IteratorProtocol
and Collection
(I guess probably not).
I'm not sure how move semantics would help with this - except perhaps as a general variable "poisoning" tool, so we can say "don't iterate this variable more than once" in the same way we say "don't use this variable after its contents have moved out". But for a single-pass sequence, I don't think either the elements or sequence itself need to be move-only IIUC.
We seem to have a different interpretation of the word "can" ;-) - or can you give a single example of a Collection
which cannot be walked backwards?
Even if this
is "fixed", you could still copy the old implementation from the stdlib - it's just not very efficient.
Fine, I will grant that if you have no concerns about complexity then all Collection
s can be walked backwards, at the very least by way of the trivial algorithm Array(myCollection)
. What I mean by "can be walked backwards" is "attempting to walk this collection backwards does not perform worse than O(n)", as if it does then I can just create an Array
from my Collection
and be done with it.
A very straightforward example of such a Collection
is a singly-linked list. The natural Index
for such a list is simply a pointer to the list element. Calculating the next index from the current one is very cheap (use the next
pointer), meeting the soft O(1) criterion on that operation that Collection
asks for. Attempting to find the previous Index
can only be done by walking from the head until you find the current element, and then returning the previous one. This is an O(n)
operation, so the complexity for attempting to walk the entire list this way is O(n^2)
.
The implementation in the stdlib forbids negative indices on non-bidirectional Collection
s.
Iâm not sure what youâre saying. The Collection sub protocols are about the algorithmic efficiency of specific operations. Pointing out that you can reproduce operations inefficiently does what?