allevato:
Notice that spaces and word-breaking punctuation are excluded, but the apostrophe in the word "isn't" is handled correctly as part of that word.
If users want access to the intervening spaces/punctuation, they can still get it; given a string S and two adjacent words W1 and W2, the content between those words is S[W1.endIndex..<W2.startIndex].
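To make the index arithmetic concrete, here is a minimal sketch. It does not use the proposed word view; as a stand-in (my assumption, not part of the pitch), it approximates word breaking with `split(whereSeparator:)`, treating anything other than a letter or an apostrophe as a separator. The only point it illustrates is that the resulting substrings share indices with the original string, so the gaps can be recovered by slicing.

```swift
let s = "Hello, world! It isn't hard."

// Stand-in for the proposed word view (an assumption for illustration):
// every character that is not a letter or an apostrophe is a separator.
let words = s.split(whereSeparator: { !($0.isLetter || $0 == "'") })

// Substrings share indices with their base string, so the content between
// adjacent words W1 and W2 is S[W1.endIndex..<W2.startIndex].
var gaps: [Substring] = []
for (w1, w2) in zip(words, words.dropFirst()) {
    gaps.append(s[w1.endIndex..<w2.startIndex])
}

print(words)  // ["Hello", "world", "It", "isn't", "hard"]
print(gaps)   // [", ", "! ", " ", " "]
```

Note that this only yields the content between words; any leading or trailing punctuation (the final "." here) would need `s.startIndex` and `s.endIndex` as the outer bounds.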
IMO, this strikes a nice balance between a clean API for the majority of users’ needs and correctness. If we find that we need to more completely expose word break iterators in Swift, we can do so later; but designing a complete and ergonomic WBI API for Swift is non-trivial and is likely far more advanced than most users would need.
This seems like the same design space that @jrose was describing in another thread:
I’ll note that I had a use case recently where I wanted split to actually preserve the separators in the results. You can do this from the API that’s here, but it’s, uh, not great
Similarly, @Ben_Cohen's blog post sketches a starter pitch for a lazy split collection.
It seems like there's a general need for a (configurable) lazy split collection. The proposed String.Lines would likewise be inventing a (less general) collection for this purpose. It may make sense to spin off a separate discussion to design this general construct.
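As a rough starting point for that discussion, here is a sketch of what such a construct might look like. Everything in it is hypothetical: the name `LazySplitSequence`, the `includeSeparators` option, and the decision to group runs of separators are my own choices, not anything from the pitch or the blog post. It is also only a `Sequence`, not a full `Collection`, to keep the example short.

```swift
// Hypothetical sketch: splits a collection on demand, optionally yielding
// the separator runs as well as the pieces between them.
struct LazySplitSequence<Base: Collection>: Sequence, IteratorProtocol {
    let base: Base
    let includeSeparators: Bool
    let isSeparator: (Base.Element) -> Bool
    var position: Base.Index

    init(_ base: Base,
         includeSeparators: Bool = false,
         whereSeparator isSeparator: @escaping (Base.Element) -> Bool) {
        self.base = base
        self.includeSeparators = includeSeparators
        self.isSeparator = isSeparator
        self.position = base.startIndex
    }

    mutating func next() -> Base.SubSequence? {
        guard position < base.endIndex else { return nil }
        let start = position
        let isSeparator = self.isSeparator  // local copy for the closures below

        if isSeparator(base[start]) {
            // Consume the whole run of separators.
            position = base[start...].firstIndex(where: { !isSeparator($0) })
                ?? base.endIndex
            // Either yield the run, or skip it and yield the piece after it.
            return includeSeparators ? base[start..<position] : next()
        }

        // Consume up to the next separator (or the end of the collection).
        position = base[start...].firstIndex(where: isSeparator) ?? base.endIndex
        return base[start..<position]
    }
}

let text = "one, two"
let withSeparators = Array(
    LazySplitSequence(text, includeSeparators: true, whereSeparator: { !$0.isLetter }))
let withoutSeparators = Array(
    LazySplitSequence(text, whereSeparator: { !$0.isLetter }))

print(withSeparators)     // ["one", ", ", "two"]
print(withoutSeparators)  // ["one", "two"]
```

With something along these lines, the separator-preserving use case and a lines view could both be configurations of one type rather than separate one-off collections.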