Is String's subscript O(n) when given a lot of combining code points?

Collection's subscript is documented as O(1), so that is indeed a very reasonable assumption for people to have.

String indexes can store their character stride, so subscripting them is O(1); but they might not, in which case the subscript will have to calculate it. So I think you may be right, and that String might not be a valid Collection in that it does not fulfil its performance guarantees.

startIndex in particular never knows its character stride; and we would need to store that information along with the code-units so that we can continue returning startIndex itself in O(1). I don't believe String's storage class (the one which holds big strings where this may be an issue) is @frozen, so we have room to add storage for that. Of course, all mutating operations would need to check and possibly update that value.

I've had to do similar things when faced with the same problem: when you have variable-length elements that are parsed on-demand from some underlying storage, Swift's Collection model effectively requires that indexes contain both the position and length of the element they refer to, and you need to cache the length of the first element. This is certainly a familiar issue, and I'm sure many collection authors have likewise encountered this.

5 Likes