Pitch: Sequence enhancements: chaining, repeating, batching

bitjammer · April 9, 2018, 4:45pm

Well, Set does conform to Sequence as a practical matter but it was just an example of mixing sequence types. For better or worse, the example should read, I guess, "the elements of this set in undefined order followed by the elements in this array." I know that's not great, but I'm not proposing changing Set here. :)

xwu · April 11, 2018, 9:47pm

Right, Set([7,8,9]).followed(by: [1,2,3]) is 7, 8, 9 in some order, followed by 1, 2, 3 in some order--specifically, the orders that would be obtained if you iterated over the first set and then the second.

Erica_Sadun · April 11, 2018, 11:54pm

This isn't quite as easy as it sounds. What we think of as Sequences with shared base elements can look very different to the compiler. So this has to produce a new kind of concatenated sequence that can also be concatenated with an arbitrary sequence, with some kind of way to lazily store these various items until each previous sequence is exhausted.

bitjammer · April 11, 2018, 11:57pm

Yes, this is exactly what I've implemented. It's only about 6 lines of non-boilerplate code, though.

griotspeak · April 11, 2018, 11:58pm

Why wouldn't we be able to create a sequence that grabbed an iterator from each of the sequences that it is handed, store both of them, and move on to the second iterator once the first one returns nil?

bitjammer · April 11, 2018, 11:59pm

Sorry, this is what I've implemented. You don't need to store values, just the iterators.

Erica_Sadun · April 12, 2018, 1:11am

Once I stopped trying to make appending variadic, it was fine:

<munched>

See @allevato 's version below instead.

anthonylatsis · April 12, 2018, 3:11am

That's true. I would even drop the Element == OtherSequence.Element constraint.

I agree that type-erased wrappers aren't practical at all here, so the variadic variant, if any, will have to be left homogeneous for now. But it's still a great complement, I believe.

allevato · April 12, 2018, 3:35am

One problem with this approach is that since your container is generic over the sequence types, you end up proliferating all of that information through the resulting types, recursively. If you call a.followed(by: b).followed(by: c).followed(by: d).followed(by: e), you've constructed a type that looks something like ConcatenatedSequence<ConcatenatedSequence<ConcatenatedSequence<ConcatenatedSequence<A, B>, C>, D>, E>. I would imagine that would have some effects on type checker performance.

The key observation is that when you've concatenated two sequences in this fashion, you don't care about the original sequence types anymore—just the element type. So a more type-checker-friendly approach would be to store type-erased iterators instead:

public struct ConcatenatedSequence<Element>: Sequence, IteratorProtocol {
  public mutating func next() -> Element? {
    if usingFirst {
      if let element = first.next() {
        return element
      }
      usingFirst = false
    }
    if let element = second.next() {
      return element
    }
    return nil
  }

  fileprivate init<First: Sequence, Second: Sequence>(_ first: First, _ second: Second)
  where First.Element == Element, Second.Element == Element {
    self.first = AnyIterator(first.makeIterator())
    self.second = AnyIterator(second.makeIterator())
  }

  private var usingFirst = true
  private var first: AnyIterator<Element>
  private var second: AnyIterator<Element>
}

extension Sequence {
  public func followed<Other: Sequence>(by sequence: Other) -> ConcatenatedSequence<Self.Element>
  where Other.Element == Self.Element {
    return ConcatenatedSequence(self, sequence)
  }
}

Then, no matter how many sequences you concatenate together (and no matter what their concrete types are), you always end up with ConcatenatedSequence<Element>.

Erica_Sadun · April 12, 2018, 4:16am

Thank you!

The only way I could figure get a variadic version working was to force everything to be type erased, leading to this abomination just so it would look more functional:

extension Sequence {   
    /// Present as `AnySequence<Element>`
    public var typeErased: AnySequence<Element> {
        return AnySequence(self)
    }
}

By the way, I think you can squoosh the first two lines with a compound if

if usingFirst, let element = first.next()

bitjammer · April 12, 2018, 4:17am

Yes, I noticed that too. For most expression chains in practice (probably less than four or five), I don't think a conformance and equality check should be expensive, but the generics do look less scary with an erased container. All things considered, I don't really have a preference for one over the other. Using type-erased containers results in the same size type with similar performance, so that's probably the way to go.

bitjammer · April 12, 2018, 4:24am

I think first.next() ?? second.next() suffices, don't you think?

Erica_Sadun · April 12, 2018, 4:25am

Something in the back of my brain is saying "it's wrong to use next() repeatedly on a exhausted iterator". Why is this?

bitjammer · April 12, 2018, 4:25am

So, circling back to the user experience of the API, how are you all feeling about the overall ergonomics for the time being?

Erica_Sadun · April 12, 2018, 4:26am

Ergonomically, I've decided that appending one sequence at a time makes the most sense. Appending several looks ugly.

bitjammer · April 12, 2018, 4:28am

Oh, jeez, I sure hope not! That might indicate that the first time you call next() where it returns nil could be bad too.

allevato · April 12, 2018, 4:28am

Agreed—even if we had variadic generics, it would feel odd to me to write first.followed(by: second, third, fourth)—there's no reason to make the first one more prominent. We'd need a global function like zip for such an API to make sense.

So, if we limit ourselves to just two-at-a-time right now, something like first.followed(by: second) feels quite nice.

Erica_Sadun · April 12, 2018, 4:29am

Paging @Lily_Ballard to the white sequence courtesy phone.

(I'd do whatever Kevin agrees is most cromulent).

I'm still not sold on batched (or even really on cycled(count:)). Do you have some use-cases to help with that?

bitjammer · April 12, 2018, 4:57am

Both are sort of alternatives to control flow (actually, all of these are).

For batching, I think probably the most common use case is doing things related to scheduling like grouping items into work units. If you have a large or infinite sequence that you need to chew through, it's nice to be able to be able to do things like:

myHugeSequence
    .batched(maxCount: Worker.batchSize)
    .forEach(funnel.schedule(batch)

rather than creating batch after batch on your own.

For cycling, there's all kinds of cool periodic behaviors, especially with simulations, DSP, or just avoiding while true loops.

For example:

Toggling between periodic values

// Create a 25% duty cycle rectangular wave
let naiveFlipFlop = [true].followed(by: repeatElement(false, count: 3))
let bandLimited = convolve(kernel, with: naiveFlipflop)
let wave = bandLimited.cycle(.forever)
play(wave)

Repeating a periodic waveform

let myPeriodicWaveform = getSineCycle(frequency, length, sampleRate).repeated(.times(count))
play(myPeriodicWaveform)

This saves memory by reusing the same wave.

I think these APIs basically boil down to creating declarative data as an alternative to imperative control flow, which can be more difficult to follow, at least in some context.

Erica_Sadun · April 12, 2018, 5:40am

Have a looksee: 1 SequenceExtensions.swift · GitHub

Although I may rejigger to use your enums