Trying my hardest to summarize the talking points of the conversation so far. Please let me know if I’ve missed any particular points:
- The thread was started (by yours truly) as an offshoot of another on-list conversation involving Dave Abrahams. In it, I voiced a concern that there were not enough legitimate needs for a single pass sequence as a core type (supporting concepts such as for..in) to warrant the possibility of single-pass behavior in Sequence when writing dependent algorithms
- There was a concern about the possible mutability of the methods causing complexities around standard naming, in particular the recurring effort to rename map/filter/reduce ‘terms of art’.
- For example in the case of filter vs filtered, the immutably-named version might still destroy the ability to access the original data
- Dave Abrahams’s primary concern about the existing system is that the restriction that Sequence is single-pass is not intuitive, and that generalized functions over sequences would be tested with the commonly-available Sequences, all of which are multi-pass.
- Considering single vs multi-pass can affect the algorithmic complexity and memory requirements of their code, this could be a considerably difficult bug to solve later on.
- There is also a desire to simplify things by reducing the # of types if at all possible, and likewise a distaste for increasing the number of types or overall complexity
- One example here would be that if single-pass sequences are rare, it may be possible to drop Sequence completely and just have Collection
- Single-pass sequences seem to be very rare and centered around I/O. One interesting example given was a database row cursor.
- A concern was voiced (by yours truly) that Sequences, unlike Collections, can today be:
- Single pass/destructive
- Infinite
- Non-indexed
and that the elimination of single pass behavior would not eliminate the other differentiating factors
It was countered that
- Infinite sequences may be representable with Collection
- It is possible to have the state in a programmatic generator be encoded in or as the Index in a collection, so even programmatic sequences can support indexes
- A single-pass sequence is likely best represented by not having implement Sequence, but rather having an Iterator implementation directly.
- Such an Iterator should be a reference type rather than representing it as having value semantics - if you actually support value semantics, then your state should be self-contained and thus you can support multi-pass.
- IteratorProtocol might gain several of the sequence methods, such as dropFirst() and prefix().
- Scala’s name for a single-pass sequence is TraversableOnce, and such descriptive naming would be useful in developer understanding on how to use single-pass sequences
- There was concern that the current model might be appropriate, and that the changes may be underestimating developers’ ability to deal with the complexity of single and multi pass sequences represented by the same type.
- It is a possibility that for..in would work with two distinct protocols (such as IteratorProtocol and Sequence) , rather than having single pass iteration be differentiated by where you are in an inheritance chain
-There was also a concern that if we have a single root type that is for..in-able, we would [just be shuffling things around], e.g. calling Sequence multi-pass, and creating a new SinglePassSequence above it.
- While single-vs-multipass is a distinction which needs to be made, there is some disagreement over whether it makes sense to differentiate infinite vs finite sequences. A collection with UInt64.max elements is finite but is still effectively infinite in terms of computation time. E.g., algorithms might fail due to memory consumption or performance for a 2 billion entry collection the same as they would fail for an infinite one.
- There is a desire if possible to eliminate Sequence if possible, and just have Collection. However, Collection has the requirement of returning a finite count and is Indexable.
- Indexable requires a Index which can be used to look up a collection value as an O(1) operation
- Indexable is comparable/equatable, although there is a possibility that Comparable could be dropped to simplify implementation of Collections
- Indexable requires a end index, which for an infinite Collection would be a synthetic value
- Indexable incudes a distance between two indexes, thus having similar problems to Collection.count
- Around infinite sequences, there has been some discussion on whether it is worth differentiating them via different types to provide developer warning that they may be working with an infinite loop. There was discussion of a for..in..until syntax for this (I presume because for..in..while would be harder grammar-wise?). The justification is that requiring until requires the developer to reason about the infinite sequence.
- For computed sequences, it would be possible to use an iterator directly as the index, as the index is not required to be an Integer type to meet Collection requirements.
- There was a discussion of having a Iterator and FiniteIterator. It was also mentioned that Iterator could be func next() ->T while FiniteIterator is func next() -> T? . There was also discussion that there might be a PossiblyInfiniteIterator with Finite and Infinite iterators as subclasses.
- There was discussion about forcing value-semantic iterators to be Collections. This could be done by allowing for infinite Collections, using something similar to a value-semantic iterator as the Index, while reducing the requirements for Index to support such usage (e.g. drop Comparable)
- There was a strawman to move eager operations (map, etc) and count to a FiniteCollection sub-protocol of Collection. (note: it was not clarified what other protocols Collection maintains, like Indexable, where Indexable still has requirements for functions like distance() )
- "AFAIK there is no interesting multipass Sequence that cannot reasonably be made to support indexing.” -Dave Abrahams
- Discussion steered toward using sequence(first:next:) and sequence(state:next:) to create collections, and using this as a replacement for iterator/sequence usage
- If Iterator and Sequence are no longer part of multi-pass sequences, one could instead support for..in using formIndex and subscripting.
- This would support subscript setters, and thus mutation of values in a for..in
- Dave Abrahams indicated a desire to support “partially formed” multipass sequences which cannot meet indexable requirements easy via the existing iterator system. An example given was collections imported from Foundation, such as NSSet.
- Thus it would be likely that Collection would retain makeIterator. Thus iterators themselves would not give a destructive vs nondestructive guarantee.
- Three possible approaches were given for differentiating finite and infinite single-pass sequences, to protect against generalized algorithms that would go into an infinite loop when called:
1. Separate protocols (provide count only on FiniteIterator)
2. Implicit additional requirements from iterator source documentation (don’t call count on the /dev/urandom-backed iterator)
3. Make all provided iterator methods lazy (instead of count, provide underestimateCount or possibly have count return an optional)
-DW