Pitch: Sequence enhancements: chaining, repeating, batching

Well, Set is a Sequence in the sense that Set: Sequence.

1 Like

That's an unfortunate design choice, then, that won't make sense to many mathematically inclined readers.

But I'll play along. What's the result of Set([1,3,2]).followed(by: [5,3,0])? [1,3,2,5,0] or [1,2,5,3,0]? Or [0,1,2,3,5]? Imho, the only sanity-preserving answer is, "order doesn't matter for sets" -- yup, : Sequence doesn't make sense.

2 Likes

This is assuming that Set.followed(by:) returns a Set which I don't think it should or will? In which case the most obvious answer would be [1,3,2,5,3,0]

That wouldn't make any sense to me, both from an API standpoint (Sequence would not be a very useful return type) and from the semantic perspective ("joining" two sets shouldn't give me duplicates).

But well, someone apparently decided that, in Swift, sets are sequential. Woe them!

1 Like

If you need to join sets why not use union. followed(by:) reads as though it should just have the elements of one sequence 'follow' the other.

I think you should check out Is a set a sequence? - #38 by jawbroken.

2 Likes

Well, Set does conform to Sequence as a practical matter but it was just an example of mixing sequence types. For better or worse, the example should read, I guess, "the elements of this set in undefined order followed by the elements in this array." I know that's not great, but I'm not proposing changing Set here. :)

4 Likes

Right, Set([7,8,9]).followed(by: [1,2,3]) is 7, 8, 9 in some order, followed by 1, 2, 3 in some order--specifically, the orders that would be obtained if you iterated over the first set and then the second.

1 Like

This isn't quite as easy as it sounds. What we think of as Sequences with shared base elements can look very different to the compiler. So this has to produce a new kind of concatenated sequence that can also be concatenated with an arbitrary sequence, with some kind of way to lazily store these various items until each previous sequence is exhausted.

Yes, this is exactly what I've implemented. It's only about 6 lines of non-boilerplate code, though.

Why wouldn't we be able to create a sequence that grabbed an iterator from each of the sequences that it is handed, store both of them, and move on to the second iterator once the first one returns nil?

Sorry, this is what I've implemented. You don't need to store values, just the iterators.

Once I stopped trying to make appending variadic, it was fine:

<munched>

See @allevato 's version below instead.

That's true. I would even drop the Element == OtherSequence.Element constraint.

I agree that type-erased wrappers aren't practical at all here, so the variadic variant, if any, will have to be left homogeneous for now. But it's still a great complement, I believe.

One problem with this approach is that since your container is generic over the sequence types, you end up proliferating all of that information through the resulting types, recursively. If you call a.followed(by: b).followed(by: c).followed(by: d).followed(by: e), you've constructed a type that looks something like ConcatenatedSequence<ConcatenatedSequence<ConcatenatedSequence<ConcatenatedSequence<A, B>, C>, D>, E>. I would imagine that would have some effects on type checker performance.

The key observation is that when you've concatenated two sequences in this fashion, you don't care about the original sequence types anymore—just the element type. So a more type-checker-friendly approach would be to store type-erased iterators instead:

public struct ConcatenatedSequence<Element>: Sequence, IteratorProtocol {
  public mutating func next() -> Element? {
    if usingFirst {
      if let element = first.next() {
        return element
      }
      usingFirst = false
    }
    if let element = second.next() {
      return element
    }
    return nil
  }

  fileprivate init<First: Sequence, Second: Sequence>(_ first: First, _ second: Second)
  where First.Element == Element, Second.Element == Element {
    self.first = AnyIterator(first.makeIterator())
    self.second = AnyIterator(second.makeIterator())
  }

  private var usingFirst = true
  private var first: AnyIterator<Element>
  private var second: AnyIterator<Element>
}

extension Sequence {
  public func followed<Other: Sequence>(by sequence: Other) -> ConcatenatedSequence<Self.Element>
  where Other.Element == Self.Element {
    return ConcatenatedSequence(self, sequence)
  }
}

Then, no matter how many sequences you concatenate together (and no matter what their concrete types are), you always end up with ConcatenatedSequence<Element>.

Thank you!

The only way I could figure get a variadic version working was to force everything to be type erased, leading to this abomination just so it would look more functional:

extension Sequence {   
    /// Present as `AnySequence<Element>`
    public var typeErased: AnySequence<Element> {
        return AnySequence(self)
    }
}

By the way, I think you can squoosh the first two lines with a compound if

if usingFirst, let element = first.next()

Yes, I noticed that too. For most expression chains in practice (probably less than four or five), I don't think a conformance and equality check should be expensive, but the generics do look less scary with an erased container. All things considered, I don't really have a preference for one over the other. Using type-erased containers results in the same size type with similar performance, so that's probably the way to go.

I think first.next() ?? second.next() suffices, don't you think? :slight_smile:

Something in the back of my brain is saying "it's wrong to use next() repeatedly on a exhausted iterator". Why is this?

So, circling back to the user experience of the API, how are you all feeling about the overall ergonomics for the time being?