[Pre-Pitch] Getting the index to the next valid element

I have the index to an element. I want the index to the next one, but only if there actually is a next element. Yes, I can get the next element index, compare it to endIndex, then do a dereference if the test passes. But the standard way of doing that stops the flow by needing a selection statement in the middle of that chain. Flows can be kept up with Optional chaining, so we could add a compatible analog to Collection.index(after:):

extension Collection {
    /// Returns the position immediately after the given index, if they're
    /// dereferencable.
    func validIndex(after i: Index) -> Index? {
        precondition(i < endIndex)
        let next = index(after: i)
        return next < endIndex ? next : nil
    }
}

We would add a BidirectionalCollection.validIndex(before:) too. Thoughts?

1 Like

Here's a fuller version of the code (untested):

extension Collection {

    /// Returns the position immediately after the given index, if they're
    /// dereferencable.
    func elementIndex(after i: Index) -> Index? {
        precondition(i < endIndex)

        let next = index(after: i)
        return next < endIndex ? next : nil
    }

}

extension BidirectionalCollection {

    /// Returns the position immediately before the given index, if they're
    /// dereferencable.
    func elementIndex(before i: Index) -> Index? {
        guard i > startIndex else { return nil }

        return index(before: i)
    }

}

extension RandomAccessCollection {

    /// Returns an index that is the specified distance from the given index,
    /// unless that location isn't dereferencable.
    func elementIndex(from i: Index, offsetBy distance: Int) -> Index? {
        precondition(i < endIndex || distance < 0)

        let limit = distance < 0 ? startIndex : index(before: endIndex)
        return index(i, offsetBy: distance, limitedBy: limit)
    }

}

This is self.index(idx, offsetBy: 1, limitedBy: self.endIndex). It could be more concise, as you note, but I'm kind of wondering how often it comes up. I have needed it myself when doing split-like operations (find a delimiter, then step over it), but I'm not sure I've needed it any other time, and a three-way split is a better way to handle that case.

@jrose My reading of the documentation for index(_:after:limitedBy:) suggests that the limiting index is a valid return value, is that not the case?

Oops, you're right. If you actually want to stop before endIndex you'd have to do that specially, and you can't just ask for that on a non-BidirectionalCollection. That makes this a little more useful, I suppose.

1 Like

+1 for me!

Generally speaking I'm in favor of anything that adds index-safe APIs to Arrays.

IMO it's the most glaring exception to Swift's general safety, resulting in easily created warning-less crashes.

Ideally I would want direct-index accessors to default to Optional, but I'm guessing at this point that ship has sailed. I would be happy with an overloaded subscript, but at least a func get(index: Int) -> Element?

Regardless, this would at least be a step in that direction :slight_smile:

1 Like

Can you provide an example use case?

Since my misadventures in creating an enum-based trie resulted in discovering a compiler bug, I've gone back to manually creating a CR/LF/CRLF/CR-CRLF parser. I tried years ago, and now updating it with my better experience with Swift.

I start with a protocol to indicate the CR and LF values:

/// A type that supports values for the ASCII line feed and carriage return.
protocol InternetLineBreakerValues {
    /// The value of the ASCII+ carriage return code point.
    static var crValue: Self { get }
    /// The value of the ASCII+ line feed code point.
    static var lfValue: Self { get }
}

I could have extended just UInt8, but I made this protocol then extended it with default implementations for ExpressibleByIntegerLiteral and ExpressibleByUnicodeScalarLiteral types so I can then add extensions for UnicodeScalar, Int8, etc. (Just in case I go back to the Swift-language token iterator.)

Now add an iterator to search for line breaks:

/// An iterator over locations within a given collection of where its line-
/// breaking sequences are.
struct LineTerminatorLocationIterator<Base: Collection> where Base.Element: Equatable & InternetLineBreakerValues {
    /// The remaining sub-collection to search.
    var collection: Base.SubSequence
    /// Which line-breaking sequences to search for.
    let targets: LineTerminatorSearchTargets
}

Since I need to look at most 3 code-points back, I originally manually arranged to get the next three values, but had to add special cases when there were less than three elements left. Get a function to simply return nil once going out of bounds was a lot easier then having to stop the flow to create if-let-else branches in my initializations.

extension Collection {
    /// Returns the position immediately after the given index, if they're
    /// dereferencable.
    func elementIndex(after i: Index) -> Index? {
        precondition(i < endIndex)
        let next = index(after: i)
        return next < endIndex ? next : nil
    }
}

extension LineTerminatorLocationIterator: IteratorProtocol {

    mutating func next() -> Range<Base.Index>? {
        var result: Range<Base.Index>?
        var first = collection.isEmpty ? nil : collection.startIndex
        var second = first.flatMap { collection.elementIndex(after: $0) }
        var third = second.flatMap { collection.elementIndex(after: $0) }
        while let firstIndex = first, result == nil {
            defer {
                first = second
                second = third
                third = third.flatMap { collection.elementIndex(after: $0) }
            }

            let secondValue = second.map { collection[$0] }
            let thirdValue = third.map { collection[$0] }
            switch (collection[firstIndex], secondValue, thirdValue) {
            case (Base.Element.crValue, Base.Element.crValue?, Base.Element.lfValue?) where targets.contains(.crcrlf):
                result = firstIndex..<collection.index(after: third!)
            case (Base.Element.crValue, Base.Element.lfValue?, _) where targets.contains(.crlf):
                result = firstIndex..<collection.index(after: second!)
            case (Base.Element.crValue, _, _) where targets.contains(.cr),
                 (Base.Element.lfValue, _, _) where targets.contains(.lf):
                result = firstIndex..<collection.index(after: firstIndex)
            default:
                break
            }
        }
        collection = collection[(result?.upperBound ?? collection.endIndex)...]
        return result
    }

}

I made a Sequence for this iterator, then upgraded it with Collection extensions, then optionally added BidirectionalCollection extensions. Doing index(before:) required the elementIndex(before:) method to help in the same way.

The methods I described go from Index to Index?, where I used Optional.flatMap to cascade to new elements. You still have to start with one known-good element. Hmm, maybe a:

extension Collection {
    var startingIndex: Index? {
        return startIndex < endIndex ? startIndex : nil
    }
}

Would a BidirectionalCollection.endingIndex be useful? Remember, unlike startIndex and startingIndex, endingIndex returns the last valid element index, which is the one that last points to, and always differs from endIndex. (I don't want to use the firstIndex and lastIndex names, since they are already present and have a different meaning.)

If you always stick with Index?, then you can get your optional dereferencing interface with myPossibleIndex.map { myCollection[$0] },

Terms of Service

Privacy Policy

Cookie Policy