I think an insert(contentsOf:at:)
method would probably be a reasonable addition, with some caveats.
The current version of OrderedSet doesn’t provide an insert(contentsOf:at:)
method because
- the need to filter out duplicates makes it somewhat tricky to efficiently implement this — the number of elements we need to insert can be less then the number of items in the source collection, so we cannot easily make room for the new items in one pass.
- probably more importantly, I’m worried that the semantics of such insertions would not be obvious — i.e., clients may not expect the number of inserted elements to be less than the count of the input.
insert(contentsOf:at:)
is uncomfortably close to the RangeReplaceableCollection requirement, but it would work in a potentially unexpected way.
Like Set, OrderedSet does provide a direct bulk insertion operation in formUnion
— the name of the method helps set expectations about the result there. On the other hand, the name formUnion
doesn’t tell me where exactly the new elements will be inserted, so that isn’t a perfect name, either. formUnion
is hardwired to append items; it would be possible to add a set of overloads of it that take an explicit target index, but it’s not clear to me if it’d better/clearer than insert(contentsOf:at:)
would be.
Bulk insertions can also be done through the OrderedSet.elements
view, which is a read-write property.
var set: OrderedSet<Int> = …
set.elements.insert(contentsOf: 0 ..< 100, at: 0)
Mutations of elements
need to finish by regenerating the entire hash table & filtering out duplicates to restore uniqueness. This adds overhead, but this is still an O(n) operation, as opposed to O(n^2) for your original insert loop.
Note that the order of elements in the resulting ordered set is very sensitive to the way the insertions are implemented:
var s0: OrderedSet = [2, 4, 6]
for value in 0 ..< 8 { s0.insert(value, at: 0) } // O(n^2) đź’Ł
// s0 is [7, 5, 3, 1, 0, 2, 4, 6]
var s1: OrderedSet = [2, 4, 6]
s1.formUnion(0 ..< 8)
// s1 is [2, 4, 6, 0, 1, 3, 5, 7]
var s2: OrderedSet = [2, 4, 6]
s2.elements.insert(contentsOf: 0 ..< 8, at: 0)
// s2 is [0, 1, 2, 3, 4, 5, 6, 7]
var s3: OrderedSet = [2, 4, 6]
s3.insert(contentsOf: 0 ..< 8, at: 0)
// s3 is [0, 1, 3, 5, 7, 2, 4, 6]
Of these, the behavior exhibited by s1 and s3 seem the most appropriate. Reordering existing items or reversing newly added items seems undesirable in the typical use case.
There are several ways we could resolve point (1) above — e.g., by appending items then moving them in place at the end, or by copying new items into a temp buffer first. In fact, the trickiness of these solutions would actually make a really good argument in favor of adding these in the package.
Point (2) I’m more worried about — for example, while RRC does not technically require insertions to increase the count of the collection by the number of elements given, but this seems like an obvious default expectation. On the other hand, it is true that this expectation is already violated by e.g. String:
var string = “hello”
var new = “\u{301}” // combining acute accent
string.count // 4
new.count // 1
string.insert(contentsOf: new, at: string.index(after: string.startIndex))
string // “héllo”
string.count // 4, not 4 + 1
However, while this weirdness is a niche edge case with String, in the case of OrderedSet, such behavior would be the norm, not the exception.
I think if we added OrderedSet.insert(contentsOf:)
, it would probably make sense to consider conforming OrderedSet
to the full RangeReplaceableCollection. The question is: are there any generic RRC algorithms that would stop working correctly when operating on such an unruly type? For instance, any code that tried to reorder elements by inserting duplicates before removing the old items wouldn’t work. But then again, generic algorithms must assume that RRC methods invalidate indices, so perhaps this isn’t a huge concern. But still, RRC conformance would probably be a step too far.
I guess this post is just a long-winded way of saying that I think we should implement OrderedSet.insert(contentsOf:at:)
, in O(count
+ new.count
) time, implementing the same ordering as the s3
example above. The “append new items then move them in place” strategy seems the most appropriate — it reduces the insertion to a formUnion
followed a range-based move operation. (Which may make sense to itself add as a standalone public member.)
PRs would be welcome! 