[Pitch] Remove destructive consumption from Sequence


(Jon Hull) #1

What are the actual use cases where people have needed destructive iterators? Every time I have thought I wanted one, I ended up wanting multi-pass later. For example, I had a sequence of random numbers, but ended up having to build a random hash instead so I could reliably re-create that sequence. The other use cases I can think of (e.g. Markov chains) would probably end up needing repeatability at some point as well.

The only thing I can think of without an eventual multi-pass requirement would be reading in an input or signal of some sort.

The most troubling thing to me is having the value type be destructive in such a weird way.

I would like to see Iterator take a more functional approach. In addition to the mutating next() -> T? function, it would be nice to have a non-mutating nextIterator() -> (T, Iterator<T>)?, which returns a tuple of the next value and an iterator advanced to the next step. Because it is nondestructive, destructive sequences would probably need to pre-calculate the next value early. The end result would be that the next value is guaranteed to be the same whenever it is requested from the non-mutated iterator, but future values may quickly diverge (across separate copies of the value type).
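To make the nextIterator() idea concrete, here is a minimal sketch. The protocol name FunctionalIterator and the Countdown type are hypothetical, invented only for illustration; neither exists in the standard library.

```swift
// Hypothetical protocol following the pitch above; not a standard library API.
protocol FunctionalIterator {
    associatedtype Element
    // Returns the next value plus an iterator advanced past it.
    // `self` is left untouched, so repeated calls give the same answer.
    func nextIterator() -> (Element, Self)?
}

// Hypothetical example conformer: counts down from `current` to 1.
struct Countdown: FunctionalIterator {
    let current: Int

    func nextIterator() -> (Int, Countdown)? {
        guard current > 0 else { return nil }
        return (current, Countdown(current: current - 1))
    }
}

let start = Countdown(current: 3)
let first = start.nextIterator()
let again = start.nextIterator()      // `start` was not mutated, so same value
let second = first?.1.nextIterator()  // advance from the returned iterator
```

Calling nextIterator() twice on the same value yields the same element both times; progress only happens by using the iterator returned in the tuple.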

The other alternative would be to make Iterator a reference type.

I would be OK with either solution. I would be most in favor of whichever version allows us to build some sort of yielding iterator construct later. That is: I run through this iterator until a condition is met, then later pick up where I left off. This is a useful pattern, especially for things which get destructively consumed.
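The "pick up where I left off" pattern can already be approximated by holding on to an iterator value and calling next() in two separate loops. This is only a rough sketch using an array iterator, not the yielding construct itself:

```swift
var iterator = [1, 2, 3, 4, 5, 6].makeIterator()

// First pass: consume values until the condition is met.
var head: [Int] = []
while let value = iterator.next() {
    head.append(value)
    if value == 3 { break }
}

// Later: resume from the same iterator, right after the last consumed value.
var tail: [Int] = []
while let value = iterator.next() {
    tail.append(value)
}
```

Here head ends up holding the values up to and including 3, and tail holds the rest.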

I would like sequences to always be nondestructive / multi-pass, and gain everything from CollectionType except those things which require an end index (e.g. count). Sequences could be infinite. Sequences could still vend destructively consumed iterators; they would just be required to vend an equivalent iterator (one which generates the same sequence of values) each time you ask for one.

Then we have CollectionType (which should be renamed FiniteSequence), which regains count, dropLast, etc. It would be basically the same as CollectionType today.

What I like about splitting finite and potentially infinite sequences:
1) It allows compiler warnings for .forEach and for-in without break
2) You could build interesting infinite sequences, and then create a finite sequence by adding limits/predicates (e.g. all prime numbers where x < 250)

FiniteSequence(from: infSequence, boundedBy: 0...100)
FiniteSequence(from: infSequence, stoppingWhen: { $0 % 2 == 0 })
FiniteSequence(from: infSequence, maxCount: 1000) // The above may need a maxCount parameter as well (which defaults to Int.max)
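FiniteSequence and its initializers are hypothetical. As a rough approximation of the same idea, the standard library's sequence(first:next:), prefix(while:) and prefix(_:) can bound an unbounded sequence today. Note that prefix(while:) keeps values while the predicate holds, which is the inverse of the stoppingWhen: wording above.

```swift
// An unbounded sequence of natural numbers: 0, 1, 2, ...
let naturals = sequence(first: 0) { $0 + 1 }

// Bounded by a predicate: keep values while they are below 10.
let belowTen = Array(naturals.prefix(while: { $0 < 10 }))

// Bounded by a maximum count, roughly what `maxCount:` would do.
let firstFive = Array(naturals.prefix(5))
```

Both results are ordinary arrays, so count, dropLast and the rest of the finite-only API are available on them.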

Top of my list for interesting infinite sequences would be RandomSequence, which inits with a seed (defaulting to a random seed), and produces a reproducible sequence of random T.
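A minimal sketch of what such a RandomSequence might look like, assuming a simple linear congruential generator and UInt64 elements rather than a generic T. The type and its layout are hypothetical, and the random default seed is left out to keep the example deterministic.

```swift
// Hypothetical type illustrating the idea; not a standard library API.
struct RandomSequence: Sequence {
    let seed: UInt64

    struct Iterator: IteratorProtocol {
        var state: UInt64

        mutating func next() -> UInt64? {
            // Linear congruential step (constants from Knuth's MMIX generator).
            // Wrapping operators avoid overflow traps. Never returns nil,
            // so the sequence is infinite.
            state = state &* 6364136223846793005 &+ 1442695040888963407
            return state
        }
    }

    // Every call starts a fresh iterator from the stored seed, which is
    // what makes the sequence multi-pass and reproducible.
    func makeIterator() -> Iterator {
        return Iterator(state: seed)
    }
}

let random = RandomSequence(seed: 42)
let firstPass = Array(random.prefix(3))
let secondPass = Array(random.prefix(3))  // same seed, same values
```

The iterator itself is still consumed destructively; reproducibility comes from the sequence vending an equivalent iterator each time.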

Thanks,
Jon


(Russ Bishop) #2

I use it in a LazyRowSequence<T: SqlModelConvertible> where querying SQLite in WAL mode allows multiple concurrent readers to get point-in-time snapshots of the database. The query can’t be replayed without buffering all the rows in memory because SQLite’s step functions are not bi-directional. In some cases we are talking about tens of thousands of rows (or even hundreds of thousands) and the ability to avoid buffering them is a feature, not a bug.

Sequence is the stream of things you can iterate over; that’s the only real promise it makes. If you want repeated iteration use a Collection.
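To illustrate the single-pass behavior, here is a small sketch. Cursor and RowSequence are hypothetical stand-ins for a forward-only database statement, not the actual LazyRowSequence; an in-memory array plays the role of the database purely to keep the example runnable.

```swift
// Hypothetical stand-in for a forward-only cursor such as a database statement.
final class Cursor {
    private var rows: [String]

    init(rows: [String]) { self.rows = rows }

    // Returns the next row and discards it; there is no way to rewind.
    func step() -> String? {
        return rows.isEmpty ? nil : rows.removeFirst()
    }
}

// A Sequence whose iterators all share the same underlying cursor.
struct RowSequence: Sequence {
    let cursor: Cursor

    func makeIterator() -> AnyIterator<String> {
        let cursor = self.cursor
        return AnyIterator { cursor.step() }
    }
}

let rowSequence = RowSequence(cursor: Cursor(rows: ["a", "b", "c"]))
let firstRead = Array(rowSequence)
let secondRead = Array(rowSequence)  // the cursor is already exhausted
```

The first iteration sees every row; the second sees none, because each new iterator reads from the same consumed cursor.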

I’m sure someone has a use for a stream of values that can’t be buffered but can be iterated over multiple times, but is that common enough to warrant redesigning the entire sequence/collection/iterator hierarchy?

Russ

(In reply to Jon Hull's message of Jun 23, 2016, 12:26 AM, sent via swift-evolution.)


(Jon Hull) #3

Good use case!

Would being handed an Iterator work for you in this case? Are there methods on Sequence that you need which aren’t on (or couldn’t be added to) Iterator?

The main issue for me is that Iterators, by the definition of their API, are destructive (you can only use them once), but the ability of a Sequence to vend a brand new Iterator implies multi-pass. I am not saying we should get rid of the single-pass ability; rather, we should use Iterators for that purpose, and Sequences (which vend Iterators) should be multi-pass, because they already kind of pretend to be.
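On the question of which Sequence methods are missing from Iterator: the standard library's IteratorSequence wrapper adapts any iterator so that Sequence methods apply to whatever it has left. A small sketch, with an array iterator standing in for a single-pass source:

```swift
var rows = ["a", "b", "c"].makeIterator()
_ = rows.next()  // consume "a" directly through the iterator

// IteratorSequence wraps the remaining iterator so that map, filter,
// and other Sequence methods can be used on it.
let remaining = IteratorSequence(rows).map { $0.uppercased() }
```

Only the elements not yet consumed are visible through the wrapper.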

Thanks
Jon

(In reply to Russ Bishop's message of Jun 26, 2016, 3:41 PM.)