[Starter Pitch] Introducing a `cycled` method to `Sequence`

moiseev · March 23, 2018, 9:22pm

Indeed, it is possible to implement cycle(_:) using what's already in the standard library. What would be interesting to see is the performance difference between a custom type vs. the sequence-joined combination.

It would be an interesting micro-benchmark regardless, actually. Would you like to give it a shot?

moiseev · March 23, 2018, 9:27pm

This will hit the same problem of a potentially trapping count (although in this case it will only be trapping in some cases, when constructed with count: nil). Besides, as @benrimmington posted previously, sequence(element) { $0 } will achieve the same result. Admittedly a little more cryptic, but not too hard to write.

Erica_Sadun · March 23, 2018, 9:59pm

I deleted my benchmarks here, as they were skewed by a silly typo. I fed an array version to one and a sequence to the other

Alexandre_Lopoukhine · March 24, 2018, 12:14am

That's annoying indeed, I don't think that many will hit the trapping count, but it's unfortunate that that is allowed at all. It would be great to come up with a way for Repeated to be possibly infinite, while not sacrificing much efficiency. The easiest way right now would be to make Repeated.count optional, but that would involve a few more checks against nil on every access. Alternatively, Repeated could be a type-erased thing that fed accessors from FiniteRepeated and InfiniteRepeated. These will still all have the constraint of having Int.max - 1 be the last possible index, trapping on Int.max, but that's because of the way Collection works, which I assume is not up for debate at this point

Back to cycled: if we are going to consider the ?? .max trick here, I'd still argue for it being placed in repeatElement. The optional count argument is the API that I expected the function to have anyway, and I reckon that the benefits outweigh the costs.

benrimmington · March 24, 2018, 1:33am

Edit 1: Added sequence(state:next:) test.
Edit 2: Added XCTAssertEqual calls.
Edit 3: Added build configuration.

The custom type was faster than the other combinations.

Average times (seconds):

	Swift 4.0.3	Swift 4.1	Swift 4.2
	(Xcode 9.2)	(Xcode 9.3)	(Xcode 9.3)
testCustom()	0.227	0.587	0.417
testRepeat()	0.320	0.685	0.501
testUnfoldFirst()	0.318	0.677	0.510
testUnfoldState()	0.273	0.632	0.471

Using:

Xcode 9.2 (9C40b) with Swift 4.0.3
Xcode 9.3 beta 4 (9Q127n) with Swift 4.1
Trunk Development (master) March 17, 2018 snapshot.
"Debug" build configuration with no optimization [-Onone].

I added methods to Collection instead of Sequence.

import XCTest

extension Collection {

  public func cycledCustom() -> CycleSequence<Self> {
    return CycleSequence(self)
  }

  public func cycledRepeat() -> FlattenSequence<Repeated<Self>> {
    return repeatElement(self, count: self.isEmpty ? 0 : .max).joined()
  }

  public func cycledUnfoldFirst() -> FlattenSequence<UnfoldFirstSequence<Self>> {
    return sequence(first: self, next: { $0.isEmpty ? nil : $0 }).joined()
  }

  public func cycledUnfoldState() -> UnfoldSequence<Element, Iterator> {
    return sequence(state: makeIterator(), next: { iterator in
      if let nextElement = iterator.next() {
        return nextElement
      }
      iterator = self.makeIterator() // NOTE: captures `self`.
      return iterator.next()
    })
  }
}

public struct CycleSequence<C: Collection> : Sequence, IteratorProtocol {

  internal let _collection: C
  internal var _iterator: C.Iterator

  internal init(_ collection: C) {
    _collection = collection
    _iterator = collection.makeIterator()
  }

  public mutating func next() -> C.Element? {
    if let nextElement = _iterator.next() {
      return nextElement
    }
    _iterator = _collection.makeIterator();
    return _iterator.next()
  }
}

class CycleBenchmark: XCTestCase {

  let _prefix = 1_000_000
  let _source = 0 ... 999

  func testCustom() {
    self.measure {
      var count = 0
      for _ in _source.cycledCustom().prefix(_prefix) {
        count += 1
      }
      XCTAssertEqual(count, _prefix)
    }
  }

  func testRepeat() {
    self.measure {
      var count = 0
      for _ in _source.cycledRepeat().prefix(_prefix) {
        count += 1
      }
      XCTAssertEqual(count, _prefix)
    }
  }

  func testUnfoldFirst() {
    self.measure {
      var count = 0
      for _ in _source.cycledUnfoldFirst().prefix(_prefix) {
        count += 1
      }
      XCTAssertEqual(count, _prefix)
    }
  }

  func testUnfoldState() {
    self.measure {
      var count = 0
      for _ in _source.cycledUnfoldState().prefix(_prefix) {
        count += 1
      }
      XCTAssertEqual(count, _prefix)
    }
  }
}

Alexandre_Lopoukhine · March 24, 2018, 8:29am

Nice! Could you please show the test function? I assume you cycled over an array? Does array size have an impact?

benrimmington · March 24, 2018, 9:20am

@Alexandre_Lopoukhine, the tests are available at https://forums.swift.org/raw/11254/28, or by scrolling the above source code. I cycled over a CountableClosedRange, so it doesn't have an impact.

Using Xcode, you can create a cross-platform "Empty" project; add a macOS "Unit Testing Bundle" target; and paste the code into the target's source file.

Alexandre_Lopoukhine · March 24, 2018, 9:21am

I'm a victim of the invisible scrollbar, didn't realise there was more

Alexandre_Lopoukhine · March 24, 2018, 9:26am

Ok, well if the results are so close, then we better choose the solution for other reasons. I'd go for the custom type, it lets us mutate the implementation if there's ever a reason, and swift's type system plays better with custom types than typealiases in general.

Erica_Sadun · March 24, 2018, 5:57pm

I rewrote and re-ran my tests from scratch.

The candidates were:

public struct SequenceCycle, as originally proposed
public func cycledRepeated, as in my updated gist
public func cycledUnfoldFirst, using sequence(first, next)

I chose to keep these all as sequence and not collection, which requires testing whether you have to see if a first Element is available. This ruled out the unfoldFirst solution.

I ran Ben's benchmarks on an optimized build:

Test Case '-[Test2Tests.CycleBenchmark testCycledRepeated]' passed (3.526 seconds).
Test Case '-[Test2Tests.CycleBenchmark testSequenceCycle]' passed (2.360 seconds).

And I ran my own pressure tests:

Starting Test: Using cycledSequenceCycle()
Ending Test : Using cycledSequenceCycle()
Elapsed time: 0.496450918028131
Starting Test: Using cycledRepeated()
Ending Test : Using cycledRepeated()
Elapsed time: 0.509369512787089

All things being equal, I'm leaning towards reverting to my original implementation, which follows how SE-0094 was implemented.

As for my different tests yesterday, they seem to be due to a typo, where I was feeding an array as the first argument to one test and the original sequence to the other.

bitjammer · April 5, 2018, 3:37am

Sorry I am late to this party. I just wanted to add one thing you might find interesting: it may be useful to differentiate between cycling over the sequence forever, and for some number of times through the entire sequence, as in:

[1,2,3].cycled() == [1,2,3].cycled(.forever /* or what have you */) // [1,2,3,1,2,3...]
[1,2,3].cycled(.times(2)) // [1,2,3,1,2,3]

Erica_Sadun · April 5, 2018, 8:16pm

For known repetitions, just use:

repeatElement([1, 2, 3], count: 2).joined()

benrimmington · April 5, 2018, 9:11pm

@moiseev, I've re-tested the benchmark using Xcode 9.3 (9E145) from the App Store.

The results for Swift 4.1 (the built-in toolchain) are the same, i.e. Swift 4.0.3 is still twice as fast.

I'm using the "Debug" build configuration with no optimization [-Onone] as before.

moiseev · April 5, 2018, 9:14pm

Wow.. I totally missed there was a regression in 4.1. Something @Erik_Eckstein might be interested in.

UPD: filed [SR-7365] 'cycled(_:)' performance regression in Swift 4.1 · Issue #49913 · apple/swift · GitHub

benrimmington · April 5, 2018, 9:37pm

@moiseev, I'd previously assumed the slowdown was due to assertions enabled in the snapshots, which is why I didn't mention it before.

griotspeak · April 5, 2018, 9:41pm

This seems like a reason to reuse repeatElement for cycling, no?

Erica_Sadun · April 5, 2018, 10:34pm

I suppose Int.max is basically infinity but I prefer actual infinity. I haven't re-run my benchmarks yet.

griotspeak · April 6, 2018, 12:47am

Sorry, I meant something along the lines of repeatElement([1, 2, 3], count: .forever).lazy.joined()

Thinking about it now, though, I realize that I am thinking in a lazily evaluated context. This is all to say, I wanted the infinite and finite versions to share a name for discoverability and, maybe, that is what bitjammer was after, as well.

griotspeak · April 6, 2018, 12:57am

maybe repeatElements(of: [1,2, 3], count: .cycled) and repeatElements(of: [1,2, 3], count: 3) where 3 is achieved by ExpressibleByIntegerLiteral and, basically, becomes .finite(3) under the covers.

bitjammer · April 6, 2018, 2:07am

Hm, FWIW doesn't read super well to me in comparison. I look at repeatElement and I imagine someone could get confused as to whether [1,2,3] is repeated or its elements are.