Swift Async Algorithms Proposal: Interspersed

Motivation

A common transformation that is applied to async sequences is to intersperse the elements with
a separator element.

Proposed solution

We propose to add a new method on AsyncSequence that allows to intersperse
a separator between each emitted element. This proposed API looks like this

extension AsyncSequence {
  /// Returns a new asynchronous sequence containing the elements of this asynchronous sequence, inserting
  /// the given separator between each element.
  ///
  /// Any value of this asynchronous sequence's element type can be used as the separator.
  ///
  /// The following example shows how an async sequences of `String`s can be interspersed using `-` as the separator:
  ///
  /// ```
  /// let input = ["A", "B", "C"].async
  /// let interspersed = input.interspersed(with: "-")
  /// for await element in interspersed {
  ///   print(element)
  /// }
  /// // Prints "A" "-" "B" "-" "C"
  /// ```
  ///
  /// - Parameter separator: The value to insert in between each of this async
  ///   sequence’s elements.
  /// - Returns: The interspersed asynchronous sequence of elements.
  @inlinable
  public func interspersed(with separator: Element) -> AsyncInterspersedSequence<Self> {
    AsyncInterspersedSequence(self, separator: separator)
  }
}

Detailed design

The bulk of the implementation of the new interspersed method is inside the new
AsyncInterspersedSequence struct. It constructs an iterator to the base async sequence
inside its own iterator. The AsyncInterspersedSequence.Iterator.next() is forwarding the demand
to the base iterator.
There is one special case that we have to call out. When the base async sequence throws
then AsyncInterspersedSequence.Iterator.next() will return the separator first and then rethrow the error.

Below is the implementation of the AsyncInterspersedSequence.

/// An asynchronous sequence that presents the elements of a base asynchronous sequence of
/// elements with a separator between each of those elements.
public struct AsyncInterspersedSequence<Base: AsyncSequence> {
  @usableFromInline
  internal let base: Base

  @usableFromInline
  internal let separator: Base.Element

  @usableFromInline
  internal init(_ base: Base, separator: Base.Element) {
    self.base = base
    self.separator = separator
  }
}

extension AsyncInterspersedSequence: AsyncSequence {
  public typealias Element = Base.Element

  /// The iterator for an `AsyncInterspersedSequence` asynchronous sequence.
  public struct Iterator: AsyncIteratorProtocol {
    @usableFromInline
    internal enum State {
      case start
      case element(Result<Base.Element, Error>)
      case separator
    }

    @usableFromInline
    internal var iterator: Base.AsyncIterator

    @usableFromInline
    internal let separator: Base.Element

    @usableFromInline
    internal var state = State.start

    @usableFromInline
    internal init(_ iterator: Base.AsyncIterator, separator: Base.Element) {
      self.iterator = iterator
      self.separator = separator
    }

    public mutating func next() async rethrows -> Base.Element? {
      // After the start, the state flips between element and separator. Before
      // returning a separator, a check is made for the next element as a
      // separator is only returned between two elements. The next element is
      // stored to allow it to be returned in the next iteration. However, if
      // the checking the next element throws, the separator is emitted before
      // rethrowing that error.
      switch state {
        case .start:
          state = .separator
          return try await iterator.next()
        case .separator:
          do {
            guard let next = try await iterator.next() else { return nil }
            state = .element(.success(next))
          } catch {
            state = .element(.failure(error))
          }
          return separator
        case .element(let result):
          state = .separator
          return try result._rethrowGet()
      }
    }
  }

  @inlinable
  public func makeAsyncIterator() -> AsyncInterspersedSequence<Base>.Iterator {
        Iterator(base.makeAsyncIterator(), separator: separator)
  }
}
5 Likes

FYI I'm getting a 404 for the source link, and "Tests" links to TestLazy.swift.

As for the proposal, I thought this already existed, so I'm all for it! :smile:

1 Like

Thanks for catching this. It should now work!

The implementation already existed since the early days of the project; however, we never got around to formally pitch it.

Can you explain why this is useful/desirable?

This is a good point. I was wondering the same when drafting up the proposal. @Philippe_Hausler can you chime in here and explain the reasoning for why we return the separator first and the re throw in this particular case.

Re-looking at the code; It could throw immediately - however the output may look delayed in that case since it has to prime the next iteration.

I could imagine an alteration that makes it throw at the same throwing point, but that might take some fixing.

1 Like

Interspersing is important, but this single addition is insufficient:

public func interspersed(with separator: Element) β†’ AsyncInterspersedSequence<Self>

This signature assumes that Element is a value type and is clone-able via copy-on-write. If, on the other hand, you're dealing with reference types, this will end up interspersing the same allocated value into the final sequence.

For example: If you're dealing with things like UIViews and wanting to intersperse separators between sections, then you've now interspersed the same UIView between your sections, which means it will only end up once in the final view hierarchy.

So, there needs to also be a version of this that accepts a () β†’ Element closure so it can generate the interspersed elements.

(FWIW, Swift-Algorithms also has this flaw)

4 Likes

I wish that be () async -> Element. Not sure if throwing makes sense here. The proposed API could be a thin wrapper over it.

1 Like

One signature would be enough if it was with an @autoclosure () -> Element argument.

When I've implemented this in the past in my own extensions, I start with the @autoclosure version and invariably end up splitting it out into two functions. @autoclosure is great if I have a single-line constructor that I want to use (such as Divider()), but:

  1. it cannot be used for more complex constructors that rely on constructing intermediate state/initializer parameters
  2. it cannot accept an actual closure as a parameter

In general I've found @autoclosure to be a nice thing for simplifying the odd API here and there, but it tends to not hold up very well when building a comprehensive API surface.

2 Likes

The one issue with the closure there is it might need to be Sendable.

I know it's not that simple but I wish T and () -> T where just interchangeable.

I think I want the API to be able to intersperse a separator every n elements (defaulting to 1), rather than only every other element. I expect "every other element" to account for like 98% of use, so I wouldn't do this if it added any additional complexity, but it doesn't, so my gut feeling is that we should include it.

7 Likes

Thanks for all the great feedback here. I opened a PR that integrates all of the feedback. Here is a quick summary:

  • Change the trailing separator behaviour. We are no longer returning a separator before we are forwarding the error
  • Add a synchronous and asynchronous closure based interspersed method.
  • Support interspersing every n elements

I would love to get some feedback on the updated proposal. Especially on:

  • The naming of the every parameter
  • The closures are not throwing because we can't easily make them rethrows and this would make this sequence always throwing. To support re throws we probably have to introduce another generic parameter to this type.

every is a decent name - it does what it says on the tin and provides natural language of the method; but swift does have a similar nomenclature for things like that - stride.

In this case the API ends up bifurcating to two types - one for throwing closures and one for non-throwing closures. From a usage standpoint it should be isomorphic to users but from the type system it needs to have concrete awareness that the closure participates in the error-ness of the iteration.

Right, we can totally do this by having a AsyncThrowingInterspersedSequence. Should we include it in this proposal?

Yea I think it is a direct consequence of the closure - that should support failure modes and therefore we need throwing variant of the sequence.

1 Like

Updated the PR to also include the throwing closure variants + the backing AsyncThrowingInterspersedSequence. I will keep the PR open for a bit so that we can get more feedback in here.

1 Like

Thanks for the update! I think it looks really good.

I noticed that the three methods have the same example snippet in their doc comments, even though they're different. IE, they all show:

let input = ["A", "B", "C"].async
let interspersed = input.interspersed(with: "-")

... even for the variations that take a closure. If that's "the way" docs are done then so be it, but I think it'd be better if each method had a different example, so it's more obvious from the docs why you'd want to choose one variant over another.

3 Likes

Ah nope good catch. I will update those. It is a bit hard to come up with sensible examples for the async and throws variants. Might just stick to something very basic there or re-use the regular closure based example.

1 Like