String Extension with Full Subscript Support

Peter-Schorn · April 19, 2020, 9:45am

I've written an extension for String that provides getter and setter subscripts for single indices, half-open ranges, closed ranges, PartialRangeFrom, PartialRangeThrough, and PartialRangeUpTo. The extension also accepts negative numbers, which count backwards from the end of the string!

My package, which includes many other helpful extensions, is available here. Here are all the String extensions:

public extension String {
    
    /**
     Enables passing in negative indices to access characters
     starting from the end and going backwards.
     If `num` is negative, it is added to the
     length of the string to retrieve the true index.
     */
    func negativeIndex(_ num: Int) -> Int {
        return num < 0 ? num + self.count : num
    }
    
    func strOpenRange(index i: Int) -> Range<String.Index> {
        let j = negativeIndex(i)
        return strOpenRange(j..<(j + 1), checkNegative: false)
    }
    
    func strOpenRange(
        _ range: Range<Int>, checkNegative: Bool = true
    ) -> Range<String.Index> {

        var lower = range.lowerBound
        var upper = range.upperBound

        if checkNegative {
            lower = negativeIndex(lower)
            upper = negativeIndex(upper)
        }
        
        let idx1 = index(self.startIndex, offsetBy: lower)
        let idx2 = index(self.startIndex, offsetBy: upper)
        
        return idx1..<idx2
    }
    
    func strClosedRange(
        _ range: CountableClosedRange<Int>, checkNegative: Bool = true
    ) -> ClosedRange<String.Index> {
        
        var lower = range.lowerBound
        var upper = range.upperBound

        if checkNegative {
            lower = negativeIndex(lower)
            upper = negativeIndex(upper)
        }
        
        let start = self.index(self.startIndex, offsetBy: lower)
        let end = self.index(start, offsetBy: upper - lower)
        
        return start...end
    }
    
    // MARK: - Subscripts
    
    /**
     Gets and sets a character at a given index.
     Negative indices are added to the length so that
     characters can be accessed from the end backwards
     
     Usage: `string[n]`
     */
    subscript(_ i: Int) -> String {
        get {
            return String(self[strOpenRange(index: i)])
        }
        set {
            let range = strOpenRange(index: i)
            replaceSubrange(range, with: newValue)
        }
    }
    
    
    /**
     Gets and sets characters in a half-open range.
     Supports negative indexing.
     
     Usage: `string[n..<n]`
     */
    subscript(_ r: Range<Int>) -> String {
        get {
            return String(self[strOpenRange(r)])
        }
        set {
            replaceSubrange(strOpenRange(r), with: newValue)
        }
    }

    /**
     Gets and sets characters in a closed range.
     Supports negative indexing
     
     Usage: `string[n...n]`
     */
    subscript(_ r: CountableClosedRange<Int>) -> String {
        get {
            return String(self[strClosedRange(r)])
        }
        set {
            replaceSubrange(strClosedRange(r), with: newValue)
        }
    }
    
    /// `string[n...]`. See PartialRangeFrom
    subscript(r: PartialRangeFrom<Int>) -> String {
        
        get {
            return String(self[strOpenRange(r.lowerBound..<self.count)])
        }
        set {
            replaceSubrange(strOpenRange(r.lowerBound..<self.count), with: newValue)
        }
    }
    
    /// `string[...n]`. See PartialRangeThrough
    subscript(r: PartialRangeThrough<Int>) -> String {
        
        get {
            let upper = negativeIndex(r.upperBound)
            return String(self[strClosedRange(0...upper, checkNegative: false)])
        }
        set {
            let upper = negativeIndex(r.upperBound)
            replaceSubrange(
                strClosedRange(0...upper, checkNegative: false), with: newValue
            )
        }
    }
    
    /// `string[..<n]`. See PartialRangeUpTo
    subscript(r: PartialRangeUpTo<Int>) -> String {
        
        get {
            let upper = negativeIndex(r.upperBound)
            return String(self[strOpenRange(0..<upper, checkNegative: false)])
        }
        set {
            let upper = negativeIndex(r.upperBound)
            replaceSubrange(
                strOpenRange(0..<upper, checkNegative: false), with: newValue
            )
        }
    }


}

Usage:

let text = "012345"
print(text[2]) // "2"
print(text[-1]) // "5"

print(text[1...3]) // "123"
print(text[2..<3]) // "2"
print(text[3...]) // "345"
print(text[...3]) // "0123"
print(text[..<3]) // "012"
print(text[(-3)...]) // "345"
print(text[...(-2)]) // "01234"

All of the above works with assignment as well. All subscripts have getters and setters.
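As a rough illustration of the setters, here is a minimal sketch. To compile on its own it uses a condensed stand-in for the extension above (only the `Int` and `Range<Int>` subscripts, with a hypothetical helper named `offsetIndex`), so it is not the package's actual code.

```swift
extension String {
    // Hypothetical helper: converts an integer offset (negative counts
    // from the end) into a String.Index.
    private func offsetIndex(_ n: Int) -> String.Index {
        return index(startIndex, offsetBy: n < 0 ? n + count : n)
    }

    // Condensed stand-in for the single-index subscript above.
    subscript(_ i: Int) -> String {
        get { return String(self[offsetIndex(i)]) }
        set {
            let start = offsetIndex(i)
            replaceSubrange(start..<index(after: start), with: newValue)
        }
    }

    // Condensed stand-in for the half-open range subscript above.
    subscript(_ r: Range<Int>) -> String {
        get {
            return String(self[offsetIndex(r.lowerBound)..<offsetIndex(r.upperBound)])
        }
        set {
            replaceSubrange(
                offsetIndex(r.lowerBound)..<offsetIndex(r.upperBound), with: newValue
            )
        }
    }
}

var text = "012345"
text[0] = "A"        // "A12345"
text[-1] = "Z"       // "A1234Z"
text[1..<3] = "--"   // "A--34Z"
text[1..<3] = ""     // "A34Z"
```

Because the setters go through replaceSubrange, the replacement does not have to be the same length as the range it replaces, so an assignment can grow or shrink the string.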

Lantua · April 19, 2020, 10:41am

Personally, I'd suggest that subscripts like this have a label. Otherwise they look confusingly like O(1) index subscripts.
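A labeled variant could look something like the sketch below. The label name `offset:` is an assumption chosen for illustration, not something proposed in this thread.

```swift
extension String {
    /// Hypothetical labeled subscript: the `offset:` label signals that this
    /// walks the string from `startIndex` rather than doing an O(1) lookup.
    subscript(offset i: Int) -> Character {
        // Negative offsets count backwards from the end.
        let n = i < 0 ? i + count : i
        return self[index(startIndex, offsetBy: n)]
    }
}

let text = "012345"
let third = text[offset: 2]   // "2"
let last = text[offset: -1]   // "5"
```

At the call site, `text[offset: 2]` can no longer be mistaken for the standard `text[someIndex]` form.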

Peter-Schorn · April 19, 2020, 10:52am

I'm honestly not that worried about performance. I have yet to run into a case where this kind of performance difference matters, although I agree that it's still worth pointing out.

Diggory · April 19, 2020, 11:19am

Looks very useful.

[Disclaimer] - I know almost nil about Unicode but does this work as expected?

let arabic = "من كافة قطاعات الصناعة على الشبكة العالمية "
print("ultimate: \(text[-1])")
let german = "Straße"
print("ultimate: \(text[-1])")

gives:

ultimate:
ultimate:

Peter-Schorn · April 19, 2020, 12:14pm

@Diggory There is a blank space at the end of the arabic and text strings. That's why the output is

ultimate:
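The same check can be made with the standard library alone, as in this small sketch. It uses `last`, which should give the same character as `text[-1]` for a non-empty string, and reuses the two strings from the post above.

```swift
let arabic = "من كافة قطاعات الصناعة على الشبكة العالمية "
let german = "Straße"

// `last` returns the final Character, or nil for an empty string.
let arabicLast = arabic.last   // the trailing space
let germanLast = german.last   // "e"

// Character counts and UTF-8 byte counts diverge for non-ASCII text.
let germanCharacters = german.count   // 6
let germanBytes = german.utf8.count   // 7, since "ß" takes two bytes in UTF-8
```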

Diggory · April 19, 2020, 12:23pm

// Playground generated with 🏟 Arena (https://github.com/finestructure/arena)
// ℹ️ If running the playground fails with an error "no such module ..."
//    go to Product -> Build to re-trigger building the SPM package.
// ℹ️ Please restart Xcode if autocomplete is not working.

import Utilities

func testUtils() {
    print(text[2]) // "C"
    print(text[-1]) // "G"

    print(text[1...3]) // "BCD"
    print(text[2..<3]) // "C"
    print(text[3...]) // "DEFG"
    print(text[...3]) // "ABCD"
    print(text[..<3]) // "ABC"
    print(text[(-3)...]) // "EFG"
    print(text[...(-2)]) // "ABCDEF"
}

var text = "ABCDEFG"
testUtils()

text = "من كافة قطاعات الصناعة على الشبكة العالمية "
testUtils()

let arabic = "من كافة قطاعات الصناعة على الشبكة العالمية "
print("ultimate: \(text[-1])")

let german = "Straße"
print("ultimate: \(text[-1])")

Oh yes, sorry - copy & paste error meant that the second print was not testing the German string...

CTMacUser · April 21, 2020, 3:13am

We got yet another:

1. Slap subscripts based on integer offsets to String (or collections in general).
2. Slap negative-integer-value support on top of [1] for String (or bidirectional collections in general) to index values from the end.

proposal. Someone posts ideas covering [1] or [2] or both every few months. You could probably search for previous threads on why doing [1] or [2] is problematic. (Hint/summary: we don't do [1] because it isn't efficient for the internal structure of String. We don't do [2] because the philosophy of Collection.Index doesn't support "negative is backwards." Although the most popular container, Array, does 0..<count for its indices, that is not a thing in general for collections, not even for array slices.)
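Both points can be seen with standard library types alone. The following is only a small sketch with made-up sample values:

```swift
let numbers = [10, 20, 30, 40, 50]
let slice = numbers[2...]   // ArraySlice<Int>

// A slice shares its indices with the base array, so it does not start at 0.
let sliceStart = slice.startIndex             // 2
let firstOfSlice = slice[slice.startIndex]    // 30
// slice[0] would trap: 0 is not a valid index for this slice.

// String uses an opaque index type. Reaching the nth character means
// walking the string from a known index.
let word = "Straße"
let fifthIndex = word.index(word.startIndex, offsetBy: 4)
let fifth = word[fifthIndex]                  // "ß"

// Counting back from the end is expressed relative to endIndex,
// not with a negative index value.
let lastIndex = word.index(word.endIndex, offsetBy: -1)
let lastCharacter = word[lastIndex]           // "e"
```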

Maybe we need to add both of these to the "commonly rejected ideas" docs.

(Sorry for any cynicism.)

Peter-Schorn · April 21, 2020, 5:37pm

I'm not suggesting that this be added to the standard library. I understand that the performance characteristics of accessing elements by index are poor. I'm just posting this so that other people can import and use it in their projects if they want.

schmidt · August 28, 2022, 2:27am

This is used so often that it should be part of the standard library.
If there are performance concerns, they should be specified in the documentation.
From what I see, most people just copy-paste those extensions from StackOverflow, so we get the same performance concern plus the time spent googling for it.
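For illustration, documenting the cost might look like the sketch below, which uses the `- Complexity:` callout from Swift's documentation markup. The method name `character(atOffset:)` is hypothetical and not an existing standard library API.

```swift
extension String {
    /// Returns the character at `offset` positions from the start of the string.
    ///
    /// - Complexity: O(*n*), where *n* is `offset`. `String` has to walk its
    ///   characters one by one, so this is not a constant-time lookup.
    /// - Precondition: `offset` is in `0..<count`.
    func character(atOffset offset: Int) -> Character {
        return self[index(startIndex, offsetBy: offset)]
    }
}

let street = "Straße"
let letter = street.character(atOffset: 4)   // "ß"
```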

tera · August 28, 2022, 2:37am

I remember this suggestion to handle all range variants in one go.
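One way to cover every range variant with a single subscript is to make it generic over RangeExpression. This is a sketch under that assumption and not necessarily the suggestion referred to above. The `offsets:` label is hypothetical, and negative offsets are not handled.

```swift
extension String {
    /// Hypothetical `offsets:` subscript accepting any integer range expression.
    subscript<R: RangeExpression>(offsets r: R) -> Substring where R.Bound == Int {
        // Resolve closed and partial ranges against the valid offsets 0..<count.
        let offsets = r.relative(to: 0..<count)
        let lower = index(startIndex, offsetBy: offsets.lowerBound)
        let upper = index(lower, offsetBy: offsets.count)
        return self[lower..<upper]
    }
}

let text = "012345"
let closed = text[offsets: 1...3]    // "123"
let halfOpen = text[offsets: 2..<3]  // "2"
let from = text[offsets: 3...]       // "345"
let through = text[offsets: ...3]    // "0123"
let upTo = text[offsets: ..<3]       // "012"
```

Returning Substring rather than String avoids a copy. Callers who need a String can wrap the result in `String(...)`.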

sspringer · August 28, 2022, 3:40am

This is a general problem with the Foundation library: many easy-to-use methods that are common in other programming languages get implemented again and again as extensions, such as a String.trim() (or, in Swift style, String.trimming()) or easy-to-use methods/properties for URLs like isFile or isDirectory. It is quite confusing for people coming from other programming languages not to find these simple methods in Swift. Swift has to become better in this regard. The String index thing is kind of a special case, as here such a "common" method really is inefficient in Swift (for good reasons). Maybe in this case, contrary to what I said before, it deserves a method with a longer name like "findingWithIndices" rather than easy-to-write subscripts.