startOfSuffix() can be much simplified by using `.lastIndex`?

I got the answer on how to implement dropLast(while:). I was using suffix(while:) from Swift Algorithms. So I was looking at the source of .suffix(while:) and saw this:

extension BidirectionalCollection {
  /// Returns the inclusive lower bound of the suffix of elements that satisfy
  /// the predicate.
  ///
  /// - Parameter predicate: A closure that takes an element of the collection
  ///   as its argument and returns `true` if the element is part of the suffix
  ///   or `false` if it is not. Once the predicate returns `false` it will not
  ///   be called again.
  ///
  /// - Complexity: O(*n*), where *n* is the length of the collection.
  @inlinable
  internal func startOfSuffix(
    while predicate: (Element) throws -> Bool
  ) rethrows -> Index {
    var index = endIndex
    while index != startIndex {
      let after = index
      formIndex(before: &index)
      if try !predicate(self[index]) {
        return after
      }
    }
    return index
  }
}

seems this is kind of re-inventing the wheel and can be much simplified by using BidirectionalCollection.lastIndex(where:).

1 Like

This issue of implementation has been discussed:

I don't understand why someone would rather write some duplicate code to do exactly what BidirectionalCollection.lastIndex does. This startOfSuffix() can be writen in one short line using .lastIndex(where:) without any loop and early exit.

It will probably help to see the difference if we look at what an implementation of startOfSuffix(while:) on top of lastIndex(where:) looks like:

  func startOfSuffix(
    while predicate: (Element) throws -> Bool
  ) rethrows -> Index {
    guard let lastFailingIndex = lastIndex(where: { !predicate($0) }) else {
      return startIndex
    }
    return index(after: lastFailingIndex)
  }

Compared to the implementation in the Algorithms package, this approach requires two (or three) extra operations:

  1. The result of lastIndex(where:) needs to be boxed into an Optional and unwrapped
  2. There's an extra call to index(after:)

Those differences are why it's implemented at that level — startOfSuffix(while:) and its counterpart, endOfPrefix(while:), essentially end up as building blocks used in a few different algorithms. (All that said, I don't believe we've benchmarked the difference in performance, so it may end up being moot.)

5 Likes

And it’s worth noting this call may be arbitrarily expensive. For example in String it could involve traversing a Character with any number of code-points.

1 Like

I prefer this way:

try lastIndex(where: { try !predicate($0) }).map { index(after: $0) } ?? startIndex

less to read and still quite clear.

is index(after:) generally problematic, on a BidirectionalCollection?

It's not generally problematic, but it's unnecessary work. For the Algorithms package and the Standard Library, there's rarely a good excuse for doing unnecessary work. As Karoy said:

It would be really unfortunate if someone had to re-implement one of our algorithms to be faster because we had opted for a slightly more concise but less efficient solution.

1 Like

Maybe add a comment stating this code is written for maximum performance is the reason why .lastIndex is not used. Or stating this "re-write for max performance" choice in the doc.

And it seems this makes the case that .lastIndex is insufficient for this use case, as such, the startOfSuffix() is generally useful, maybe it should be public?

And should the stdlib address this apparent shortcoming of .lastIndex? Perhaps provide a variant that does what the startOfSuffix() does? Or one the returns both (lastIndex, after)

1 Like

I think these questions are addressed adequately by @lorentey's discussion on the thread linked above. I quote some salient parts below:

It is the goal of the standard library (and algorithms library, which is a proving ground for the standard library) to provide a rich array of useful algorithms for general use. It's a non-goal to remove the need for raw loops to be used in its own implementations, and by the same token, the need for raw loops in the standard library or algorithms library doesn't demonstrate that there's a missing building block that's generally useful. Nor is it even really remarkable enough to warrant a comment in the code, since doing this legwork in the standard library and algorithms library so that users don't have to is the entire point of these libraries.

1 Like
Terms of Service

Privacy Policy

Cookie Policy