[Pitch] Sequence Cleanup


(Pavol Vaskovic) #1

Hello,

I was planning to wait pitching this after I have tested the implementation, but I’m getting anxious, that I’ll miss the 4.0 boat, so here goes. This is early draft. I would especially appreciate analysis of the impact on source compatibility and ABI stability.

https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md

Sequence Cleanup

Proposal: SE-NNNN <https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md>
Authors: Pavol Vaskovic <https://github.com/palimondo>, Author 2 <https://github.com/swiftdev>
Review Manager: TBD
Status: Pitch
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#introduction>Introduction

This proposal is to change the default implementation of protocol Sequence methods so that all operations are eager (not lazy) and return Array<Element>. This removes the need to return type erased AnySequence that negatively impacts performance in order to hide the heterogenous implementation and provides Swift users with more homogenous interface. The lazy computation of prefix and dropFirst methods will be moved to LazySequence.

Swift-evolution thread: Discussion thread topic for that proposal <https://lists.swift.org/pipermail/swift-evolution/>
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#motivation>Motivation

The API of Sequence protocol evolved over time to include irregular mixture of eager and lazy methods. In particular, it included lazy implementation of prefix and dropFirst from before the lazy sequence views found their home in LazySequence protocol. They are implemented by internal classes that wrap the underlying sequence and delay the computation until its elements are requested through iteration. SE-0045 <https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/SE-0045.md> added third internal class to implement drop(while:_), that eagerly drops the required element and wraps the partially used Iterator from the original sequence.

All other default implementations of Sequence methods are realized on top of Arrays and perform their computation eagerly. Due to the requirements of the associated type SubSequence returned from many Sequence protocol methods, all these heterogeneous implementations need to be hidden using type erasure by wrapping the results in AnySequence. The heterogeneity of implementation is currently well documented, if the user notices the different complexity of the methods that return lazy wrappers -- Complexity: O(1), instead of O(n). But given the existence of other lazy sequence views on LazySequence one might be surprised by this irregularity and overlook this historical quirk of implementation.

Furthermore, the presence of AnySequence and AnyIterator it returns, present serious optimization obstacle for the Swift compiler, when we chain the calls to these sequence methods. This results is non-specialized generic iterator, that ends up in a tight for-in loop. Currently there is no prefix implementation on LazySequence - preventing the workaround for avoiding AnySequence by going through .lazy call.

s.lazy.prefix(maxIterations).prefix(while:{ ... })
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#proposed-solution>Proposed solution

We should take this opportunity to clean up the interface of default Sequence implementation to present Swift users with homogenous interface of eager methods that directly return Array as SubSequence. We'll move the lazy implementation for prefix and dropFirst to LazySequence.

<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#detailed-design>Detailed design

The following changes will be made to the standard library:

extension Sequence {

  public func suffix(_ maxLength: Int) -> [Iterator.Element] { ... }
  
  public func split(
    maxSplits: Int = Int.max,
    omittingEmptySubsequences: Bool = true,
    whereSeparator isSeparator: (Iterator.Element) throws -> Bool
  ) rethrows -> [[Iterator.Element]] { ... }
}

extension Sequence where Iterator.Element : Equatable {
  public func split(
    separator: Iterator.Element,
    maxSplits: Int = Int.max,
    omittingEmptySubsequences: Bool = true
  ) -> [[Iterator.Element]] { ... }
}

extension Sequence where
  SubSequence : Sequence,
  SubSequence.Iterator.Element == Iterator.Element,
  SubSequence.SubSequence == SubSequence {

  /// Returns a subsequence containing all but the given number of initial
  /// elements.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func dropFirst(_ n: Int) -> [Iterator.Element] { ... }

  /// Returns a subsequence containing all but the given number of final
  /// elements.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func dropLast(_ n: Int) -> [Iterator.Element] { ... }
  
  /// Returns a subsequence by skipping the initial, consecutive elements that
  /// satisfy the given predicate.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the collection.
  /// - SeeAlso: `prefix(while:)`
  @_inlineable
  public func drop(
    while predicate: (Iterator.Element) throws -> Bool
  ) rethrows -> [Iterator.Element] { ... }

  /// Returns a subsequence, up to the specified maximum length, containing the
  /// initial elements of the sequence.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func prefix(_ maxLength: Int) -> [Iterator.Element] { ... }
  
  /// Returns a subsequence containing the initial, consecutive elements that
  /// satisfy the given predicate.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the collection.
  /// - SeeAlso: `drop(while:)`
  @_inlineable
  public func prefix(
    while predicate: (Iterator.Element) throws -> Bool
  ) rethrows -> [Iterator.Element] { ... }
}

// TODO describe lazy views for prefix and dropFirst

<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#source-compatibility>


SE-0234: Removing Sequence.SubSequence
(Anders) #2

I do not feel `AnySequence` for these overloads being a problem TBH, at least conceptually. IIUC a subsequence is expected to be a view to a consecutive range of the sequence, and require no extra storage since it is just about index manipulation.

On the other hand, filtering or mapping a collection involves transformations with non-trivial computational cost, in which case the transformed collections are generally expected to be further manipulated, instead of just iteration. That is likely why these are by default eager, since being lazy would mean the transformation has to happen every time an element is touched, given the constraint of no extra storage and no mutation as a read-only view (i.e. cannot cache the results).

Anyway, I remembered in the early days of swift-evolution, a similar topic had been brought up and the response was that the transforming methods e.g. `map` are deliberately chosen to be eager. Please correct me if my memory serves me wrong.

Regards
Anders

···

On 26 Apr 2017, at 4:26 AM, Pavol Vaskovic via swift-evolution <swift-evolution@swift.org> wrote:

Hello,

I was planning to wait pitching this after I have tested the implementation, but I’m getting anxious, that I’ll miss the 4.0 boat, so here goes. This is early draft. I would especially appreciate analysis of the impact on source compatibility and ABI stability.

https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md

Sequence Cleanup

Proposal: SE-NNNN <https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md>
Authors: Pavol Vaskovic <https://github.com/palimondo>, Author 2 <https://github.com/swiftdev>
Review Manager: TBD
Status: Pitch
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#introduction>Introduction

This proposal is to change the default implementation of protocol Sequence methods so that all operations are eager (not lazy) and return Array<Element>. This removes the need to return type erased AnySequence that negatively impacts performance in order to hide the heterogenous implementation and provides Swift users with more homogenous interface. The lazy computation of prefix and dropFirst methods will be moved to LazySequence.

Swift-evolution thread: Discussion thread topic for that proposal <https://lists.swift.org/pipermail/swift-evolution/>
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#motivation>Motivation

The API of Sequence protocol evolved over time to include irregular mixture of eager and lazy methods. In particular, it included lazy implementation of prefix and dropFirst from before the lazy sequence views found their home in LazySequence protocol. They are implemented by internal classes that wrap the underlying sequence and delay the computation until its elements are requested through iteration. SE-0045 <https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/SE-0045.md> added third internal class to implement drop(while:_), that eagerly drops the required element and wraps the partially used Iterator from the original sequence.

All other default implementations of Sequence methods are realized on top of Arrays and perform their computation eagerly. Due to the requirements of the associated type SubSequence returned from many Sequence protocol methods, all these heterogeneous implementations need to be hidden using type erasure by wrapping the results in AnySequence. The heterogeneity of implementation is currently well documented, if the user notices the different complexity of the methods that return lazy wrappers -- Complexity: O(1), instead of O(n). But given the existence of other lazy sequence views on LazySequence one might be surprised by this irregularity and overlook this historical quirk of implementation.

Furthermore, the presence of AnySequence and AnyIterator it returns, present serious optimization obstacle for the Swift compiler, when we chain the calls to these sequence methods. This results is non-specialized generic iterator, that ends up in a tight for-in loop. Currently there is no prefix implementation on LazySequence - preventing the workaround for avoiding AnySequence by going through .lazy call.

s.lazy.prefix(maxIterations).prefix(while:{ ... })
<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#proposed-solution>Proposed solution

We should take this opportunity to clean up the interface of default Sequence implementation to present Swift users with homogenous interface of eager methods that directly return Array as SubSequence. We'll move the lazy implementation for prefix and dropFirst to LazySequence.

<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#detailed-design>Detailed design

The following changes will be made to the standard library:

extension Sequence {

  public func suffix(_ maxLength: Int) -> [Iterator.Element] { ... }
  
  public func split(
    maxSplits: Int = Int.max,
    omittingEmptySubsequences: Bool = true,
    whereSeparator isSeparator: (Iterator.Element) throws -> Bool
  ) rethrows -> [[Iterator.Element]] { ... }
}

extension Sequence where Iterator.Element : Equatable {
  public func split(
    separator: Iterator.Element,
    maxSplits: Int = Int.max,
    omittingEmptySubsequences: Bool = true
  ) -> [[Iterator.Element]] { ... }
}

extension Sequence where
  SubSequence : Sequence,
  SubSequence.Iterator.Element == Iterator.Element,
  SubSequence.SubSequence == SubSequence {

  /// Returns a subsequence containing all but the given number of initial
  /// elements.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func dropFirst(_ n: Int) -> [Iterator.Element] { ... }

  /// Returns a subsequence containing all but the given number of final
  /// elements.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func dropLast(_ n: Int) -> [Iterator.Element] { ... }
  
  /// Returns a subsequence by skipping the initial, consecutive elements that
  /// satisfy the given predicate.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the collection.
  /// - SeeAlso: `prefix(while:)`
  @_inlineable
  public func drop(
    while predicate: (Iterator.Element) throws -> Bool
  ) rethrows -> [Iterator.Element] { ... }

  /// Returns a subsequence, up to the specified maximum length, containing the
  /// initial elements of the sequence.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the sequence.
  @_inlineable
  public func prefix(_ maxLength: Int) -> [Iterator.Element] { ... }
  
  /// Returns a subsequence containing the initial, consecutive elements that
  /// satisfy the given predicate.
  ...
  /// - Complexity: O(*n*), where *n* is the length of the collection.
  /// - SeeAlso: `drop(while:)`
  @_inlineable
  public func prefix(
    while predicate: (Iterator.Element) throws -> Bool
  ) rethrows -> [Iterator.Element] { ... }
}

// TODO describe lazy views for prefix and dropFirst

<https://github.com/palimondo/swift-evolution/blob/sequence-cleanup/proposals/NNNN-SequenceCleanup.md#source-compatibility>_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution