Swift Async Algorithms Proposal: Interspersed

Motivation

A common transformation that is applied to async sequences is to intersperse the elements with
a separator element.

Proposed solution

We propose to add a new method on AsyncSequence that allows to intersperse
a separator between each emitted element. This proposed API looks like this

extension AsyncSequence {
  /// Returns a new asynchronous sequence containing the elements of this asynchronous sequence, inserting
  /// the given separator between each element.
  ///
  /// Any value of this asynchronous sequence's element type can be used as the separator.
  ///
  /// The following example shows how an async sequences of `String`s can be interspersed using `-` as the separator:
  ///
  /// ```
  /// let input = ["A", "B", "C"].async
  /// let interspersed = input.interspersed(with: "-")
  /// for await element in interspersed {
  ///   print(element)
  /// }
  /// // Prints "A" "-" "B" "-" "C"
  /// ```
  ///
  /// - Parameter separator: The value to insert in between each of this async
  ///   sequenceโ€™s elements.
  /// - Returns: The interspersed asynchronous sequence of elements.
  @inlinable
  public func interspersed(with separator: Element) -> AsyncInterspersedSequence<Self> {
    AsyncInterspersedSequence(self, separator: separator)
  }
}

Detailed design

The bulk of the implementation of the new interspersed method is inside the new
AsyncInterspersedSequence struct. It constructs an iterator to the base async sequence
inside its own iterator. The AsyncInterspersedSequence.Iterator.next() is forwarding the demand
to the base iterator.
There is one special case that we have to call out. When the base async sequence throws
then AsyncInterspersedSequence.Iterator.next() will return the separator first and then rethrow the error.

Below is the implementation of the AsyncInterspersedSequence.

/// An asynchronous sequence that presents the elements of a base asynchronous sequence of
/// elements with a separator between each of those elements.
public struct AsyncInterspersedSequence<Base: AsyncSequence> {
  @usableFromInline
  internal let base: Base

  @usableFromInline
  internal let separator: Base.Element

  @usableFromInline
  internal init(_ base: Base, separator: Base.Element) {
    self.base = base
    self.separator = separator
  }
}

extension AsyncInterspersedSequence: AsyncSequence {
  public typealias Element = Base.Element

  /// The iterator for an `AsyncInterspersedSequence` asynchronous sequence.
  public struct Iterator: AsyncIteratorProtocol {
    @usableFromInline
    internal enum State {
      case start
      case element(Result<Base.Element, Error>)
      case separator
    }

    @usableFromInline
    internal var iterator: Base.AsyncIterator

    @usableFromInline
    internal let separator: Base.Element

    @usableFromInline
    internal var state = State.start

    @usableFromInline
    internal init(_ iterator: Base.AsyncIterator, separator: Base.Element) {
      self.iterator = iterator
      self.separator = separator
    }

    public mutating func next() async rethrows -> Base.Element? {
      // After the start, the state flips between element and separator. Before
      // returning a separator, a check is made for the next element as a
      // separator is only returned between two elements. The next element is
      // stored to allow it to be returned in the next iteration. However, if
      // the checking the next element throws, the separator is emitted before
      // rethrowing that error.
      switch state {
        case .start:
          state = .separator
          return try await iterator.next()
        case .separator:
          do {
            guard let next = try await iterator.next() else { return nil }
            state = .element(.success(next))
          } catch {
            state = .element(.failure(error))
          }
          return separator
        case .element(let result):
          state = .separator
          return try result._rethrowGet()
      }
    }
  }

  @inlinable
  public func makeAsyncIterator() -> AsyncInterspersedSequence<Base>.Iterator {
        Iterator(base.makeAsyncIterator(), separator: separator)
  }
}
4 Likes

FYI I'm getting a 404 for the source link, and "Tests" links to TestLazy.swift.

As for the proposal, I thought this already existed, so I'm all for it! :smile:

1 Like

Thanks for catching this. It should now work!

The implementation already existed since the early days of the project; however, we never got around to formally pitch it.

Can you explain why this is useful/desirable?

This is a good point. I was wondering the same when drafting up the proposal. @Philippe_Hausler can you chime in here and explain the reasoning for why we return the separator first and the re throw in this particular case.

Re-looking at the code; It could throw immediately - however the output may look delayed in that case since it has to prime the next iteration.

I could imagine an alteration that makes it throw at the same throwing point, but that might take some fixing.

1 Like

Interspersing is important, but this single addition is insufficient:

public func interspersed(with separator: Element) โ†’ AsyncInterspersedSequence<Self>

This signature assumes that Element is a value type and is clone-able via copy-on-write. If, on the other hand, you're dealing with reference types, this will end up interspersing the same allocated value into the final sequence.

For example: If you're dealing with things like UIViews and wanting to intersperse separators between sections, then you've now interspersed the same UIView between your sections, which means it will only end up once in the final view hierarchy.

So, there needs to also be a version of this that accepts a () โ†’ Element closure so it can generate the interspersed elements.

(FWIW, Swift-Algorithms also has this flaw)

4 Likes

I wish that be () async -> Element. Not sure if throwing makes sense here. The proposed API could be a thin wrapper over it.

1 Like

One signature would be enough if it was with an @autoclosure () -> Element argument.

When I've implemented this in the past in my own extensions, I start with the @autoclosure version and invariably end up splitting it out into two functions. @autoclosure is great if I have a single-line constructor that I want to use (such as Divider()), but:

  1. it cannot be used for more complex constructors that rely on constructing intermediate state/initializer parameters
  2. it cannot accept an actual closure as a parameter

In general I've found @autoclosure to be a nice thing for simplifying the odd API here and there, but it tends to not hold up very well when building a comprehensive API surface.

2 Likes

The one issue with the closure there is it might need to be Sendable.

I know it's not that simple but I wish T and () -> T where just interchangeable.

I think I want the API to be able to intersperse a separator every n elements (defaulting to 1), rather than only every other element. I expect "every other element" to account for like 98% of use, so I wouldn't do this if it added any additional complexity, but it doesn't, so my gut feeling is that we should include it.

7 Likes