Overlay - A non-mutating replaceSubrange

Hi,

I'd like to get some thoughts on this potential addition to the algorithms library:

  • Do you think this is useful?

  • Do you think this is a good fit for the algorithms package?

  • What do you think about the naming? Is "overlay" clear about what this does?

  • What do you think about the use of a namespace struct?

  • Do you know of any similar features in other languages/libraries? I haven't been able to find anything.

Thanks


Overlay

Pull Request

Compose collections by overlaying the elements of one collection
over an arbitrary region of another collection.

Swift offers many interesting collections, for instance:

  • Range<Int> allows us to express the numbers in 0..<1000
    in an efficient way that does not allocate storage for each number.

  • Repeated<Int> allows us to express, say, one thousand copies of the same value,
    without allocating space for a thousand values.

  • LazyMapCollection allows us to transform the elements of a collection on-demand,
    without creating a copy of the source collection and eagerly transforming every element.

  • The collections in this package, such as .chunked, .cycled, .joined, and .interspersed,
    similarly compute their elements on-demand.

While these collections can be very efficient, it is difficult to compose them in to arbitrary datasets.
If we have the Range 5..<10, and want to insert a 0 in the middle of it, we would need to allocate storage
for the entire collection, losing the benefits of Range<Int>. Similarly, if we have some numbers in storage
(say, in an Array) and wish to insert a contiguous range in the middle of it, we have to allocate storage
in the Array and cannot take advantage of Range<Int> memory efficiency.

The OverlayCollection allows us to form arbitrary compositions without mutating
or allocating storage for the result.

// 'numbers' is a composition of:
// - Range<Int>, and
// - CollectionOfOne<Int>

let numbers = (5..<10).overlay.inserting(0, at: 7)

for n in numbers {
  // n: 5, 6, 0, 7, 8, 9
  //          ^
}
// 'numbers' is a composition of:
// - Array<Int>, and
// - Range<Int>

let rawdata = [3, 6, 1, 4, 6]
let numbers = rawdata.overlay.inserting(contentsOf: 5..<10, at: 3)

for n in numbers {
  // n: 3, 6, 1, 5, 6, 7, 8, 9, 4, 6
  //             ^^^^^^^^^^^^^
}

We can also insert elements in to a LazyMapCollection:

enum ListItem {
  case product(Product)
  case callToAction
}

let products: [Product] = ...

var listItems: some Collection<ListItem> {
  products
    .lazy.map { ListItem.product($0) }
    .overlay.inserting(.callToAction, at: min(4, products.count))
}

for item in listItems {
  // item: .product(A), .product(B), .product(C), .callToAction, .product(D), ...
  //                                              ^^^^^^^^^^^^^
}

Detailed Design

An .overlay member is added to all collections:

extension Collection {
  public var overlay: OverlayCollectionNamespace<Self> { get }
}

This member returns a wrapper structure, OverlayCollectionNamespace,
which provides a similar suite of methods to the standard library's RangeReplaceableCollection protocol.

However, while RangeReplaceableCollection methods mutate the collection they are applied to,
these methods return a new OverlayCollection value which substitutes the specified elements on-demand.

extension OverlayCollectionNamespace {

  // Multiple elements:

  public func replacingSubrange<Overlay>(
    _ subrange: Range<Elements.Index>, with newElements: Overlay
  ) -> OverlayCollection<Elements, Overlay>

  public func appending<Overlay>(
    contentsOf newElements: Overlay
  ) -> OverlayCollection<Elements, Overlay>

  public func inserting<Overlay>(
    contentsOf newElements: Overlay, at position: Elements.Index
  ) -> OverlayCollection<Elements, Overlay>

  public func removingSubrange(
    _ subrange: Range<Elements.Index>
  ) -> OverlayCollection<Elements, EmptyCollection<Elements.Element>>
  
  // Single elements:
  
  public func appending(
    _ element: Elements.Element
  ) -> OverlayCollection<Elements, CollectionOfOne<Elements.Element>>

  public func inserting(
    _ element: Elements.Element, at position: Elements.Index
  ) -> OverlayCollection<Elements, CollectionOfOne<Elements.Element>>

  public func removing(
    at position: Elements.Index
  ) -> OverlayCollection<Elements, EmptyCollection<Elements.Element>>
  
}

OverlayCollection conforms to BidirectionalCollection when both the base and overlay collections conform.

Conditional Overlays

In order to allow overlays to be applied conditionally, another function is added to all collections:

extension Collection {

  public func overlay<Overlay>(
    if condition: Bool,
    _ makeOverlay: (OverlayCollectionNamespace<Self>) -> OverlayCollection<Self, Overlay>
  ) -> OverlayCollection<Self, Overlay>
  
}

If the condition parameter is true, the makeOverlay closure is invoked to apply the desired overlay.
If condition is false, the closure is not invoked, and the function returns a no-op overlay,
containing the same elements as the base collection.

This allows overlays to be applied conditionally while still being usable as opaque return types:

func getNumbers(shouldInsert: Bool) -> some Collection<Int> {
  (5..<10).overlay(if: shouldInsert) { $0.inserting(0, at: 7) }
}

for n in getNumbers(shouldInsert: true) {
  // n: 5, 6, 0, 7, 8, 9
}

for n in getNumbers(shouldInsert: false) {
  // n: 5, 6, 7, 8, 9
}
7 Likes

I love this. I haven't really thought about how useful it'd be for my personal code, but at the very least I love this idea and want to see it explored further. Thank you for pitching this!

3 Likes

While the idea might be just fine on its own, I'd be worried about introducing yet another structuring concept on top of collections. It'd be another concept β€” rather than an implementation detail β€” because it's not just something that you'd do to a collection, but something that you create separately from the participating collections that has its own lifetime and semantics.

It also seems to overlap a bit with [CollectionDifference].

The other thing (on the third hand) that concerns me here is "premature generalization" (analogous to premature optimization, although Swift is mature enough that the term might sound strange).

What I mean is, how can be sure that a high-quality implementation of this is actually useful for more than a handful or scenarios, leaving the majority of actual scenarios unserved because it's-nearly-right-for-that-but-not-quite-so-I-have-to-write-out-code-to-do-it-the-old-way-instead?

2 Likes

Definitely think this is a good candidate for swift-algorithms. I could easily see how useful this could be in a codebase.

2 Likes

I'm not sure what you mean by this. Why wouldn't this argument apply to every Collection wrapper - including Slice, and almost everything in the algorithms package? They all fit the description of "something that you create separately from the participating collections that has its own lifetime and semantics."

But actually, overlays allow you to substitute elements while preserving more of the characteristics of the original collection.

For instance, let's imagine I have some (possibly large) dataset, stored in a specialised tree structure or whatever. Maybe it isn't even mutable. If I need to insert/remove/replace elements, in general I will need to copy the dataset to new storage with its own characteristics (such as Array). With an overlay, the original dataset is preserved and operations such as index traversal call through to the same underlying data structure as they always did; it just makes a little detour on the way.

Again this is quite abstract so I'm unsure how I can respond to your concern.

One thing I will say, though, is that this isn't a matter of new way vs. old way. I'm certainly not suggesting that developers should replace every mutating operation with an overlay.

Do you think the guide document needs to be clearer about that?

Yes, there are already quite a few abstractions on top of a "plain" collection, like Slice as you mentioned, but also things like lazy access. I think I'm suggesting that the conceptual cost of adding a new abstraction is (or seems) pretty small, but the conceptual cost of knowing all of them at the same time is large β€” greater than the sum of the parts.

You can make (and essentially have) made an argument that it's worth it for this functionality, and that's fine, so long as people consider the whole ecosystem, not just the part you're proposing to add.

FWIW, the lifetime of your overlays seems like it would be similar to (and subject to the same dangers as) indexes to the underlying collections. In particular, mutating a collection independently of the overlays should invalidate the overlay compositions, right?

There have been a lot of proposals lately (say in the last year or so) for things that amount to a quality of life improvement, but they run the risk of increasing the complexity barrier to becoming familiar with Swift. This isn't quite the right language for what you've proposed, but these tend to be cases of "There's a new spelling Y, which you use when you want to do X."

In a way, all of these suggestions are promoting breadth over depth.

However, I'm not trying to shoot your proposal down. Instead, I'm trying to encourage discussions around add-on features like this, to determine whether they are better or worse for the language/standard library as a whole. If the answer is yes, the answer is yes. :slight_smile:

2 Likes

Do I understand correctly that all overlays are essentially Arrays (by behaviour), irrespective of the underlying component(s)? I'd have to think it through properly, but my instinct is that this is unwise - at least for a general "overlay" type.

If I have a Set<Element> and I apply an overlay to it, I would intuitively expect the overlay to preserve the set semantics (e.g. the notion of inserting or removing at a specific is non-sensical in that context).

Naming it something more specific might be a start on those concerns, e.g. someSet.arrayOverlay….

If this were being proposed as part of the standard library I would agree with you, but to be fair, I don't think this is something that someone new to Swift would be expected to be familiar with; it's an addition to a separate library that implements algorithms meant for specific use cases. You already have to go a little out of your way to use this library.

4 Likes

I concur with both @QuinceyMorris and @JuneBash. The barrier for Swift Stdlib & Foundation (let-alone Swift the language) needs to be high. They're bundled with the compiler or OS and as such are very difficult & slow to change, and effectively cannot be replaced by 3rd parties (e.g. try actually using the new Foundation today in a GUI Mac or iOS app - and that's nominally just a version change!).

But the barrier for "non-standard" libraries, like swift-algorithms, should be much lower. Even though they're somewhat 'blessed' by being official Apple projects, functionally they still are optional and easy for anyone to fork or replace as they see fit.

Also, they serve as proving grounds for functionality that might eventually want to 'graduate' to Stdlib or Foundation. So necessarily the barrier has to be lower.

I don't see a need for this overlay functionality myself, but even though I regularly use swift-algorithms I have no objection to something like this going in it. As long as it's sufficiently well thought-out and designed.

4 Likes