AttributedString to String

What is the designated API to get a String from an AttributedString?

2 Likes

As far as I can tell, there are currently two ways to fetch a String from an AttributedString, both revolving around AttributedString.CharacterView:

  1. String has an initializer which takes a Sequence of Characters and constructs itself from that sequence. Given an AttributedString attrStr, you can write String(attrStr.characters).

    However, I believe there may be a performance pitfall here: as far as I can tell, there's currently no fast path to avoid iterating over the characters one at a time with a regular for-in loop, which is slower than copying bytes from an underlying buffer directly.

  2. It's not particularly intuitive to find, but the AttributedString work added an initializer to String which takes a slice of an AttributedString, reaches into its _guts to grab the underlying String, slices the underlying String, and returns a bulk copy of that.

    Although this is also necessarily an O(n) copy, it should be faster in practice than iterating over the character view and copying one character at a time.

    You can achieve this at the moment by writing String(attrStr.characters[...]).
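A minimal sketch of the two spellings side by side, assuming a Foundation version that ships AttributedString (the sample text is only for illustration):

```swift
import Foundation

// Sample value for illustration only.
let attrStr = AttributedString("Hello, world")

// 1. Generic Sequence-based initializer: builds the String by walking
//    the character view one Character at a time.
let viaSequence = String(attrStr.characters)

// 2. Slice-based initializer: `characters[...]` produces a
//    Slice<AttributedString.CharacterView>, which selects the dedicated
//    String initializer that Foundation provides for that slice type.
let viaSlice = String(attrStr.characters[...])

// Both spellings should yield the same text; only the copy strategy differs.
print(viaSequence == viaSlice)
```

The two calls differ only in the `[...]`, so it is easy to pick the slower one by accident.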

I haven't thought through the slicing aspects, but it appears to me that not offering read access to the underlying String directly may only be an API oversight. Someone from the Foundation team, feel free to correct me, but this may be worth filing Feedback for. (As an alternative, it also seems reasonable to offer String.init(_: AttributedString.CharacterView) and String.init<S: AttributedStringProtocol>(_: S) as fast-path shortcuts for performing this conversion with fewer performance implications.)
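Until something like that exists, the slice-based conversion can be wrapped in a small helper. This is only a sketch: `plainText` is a hypothetical name and is not part of Foundation.

```swift
import Foundation

extension AttributedStringProtocol {
    /// Hypothetical convenience (not a Foundation API): the plain text of
    /// this attributed value, obtained through the slice-based String
    /// initializer rather than the generic Sequence-based one.
    var plainText: String {
        String(characters[...])
    }
}

// Sample value for illustration only.
let attrStr = AttributedString("Hello, world")
let text = attrStr.plainText

// Because the extension is on the protocol, it is also available on an
// AttributedSubstring produced by slicing.
if let range = attrStr.range(of: "world") {
    print(attrStr[range].plainText)
}
print(text)
```

Putting the helper on AttributedStringProtocol keeps a single call site to update if a direct initializer is ever added.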

15 Likes

Thank you. I was using 1. without thinking about performance. It just didn't feel like a proper API for the job. For now I'll probably just switch to 2.

+1 for encouraging a more expressive API.

3 Likes

One more way: NSAttributedString(attributedString).string.

Good call with the characters[...] idea.

FWIW some very rough timing tests suggest 'characters[...]' is about 25% faster than 'characters'.

They are both more than 10x faster than the NSAttributedString(attrStr).string route.

But context and caching make it all very approximate.
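For anyone who wants to repeat the comparison, here is a rough harness sketch. It is not the code behind the numbers above; the input size, the iteration count, and the `measure` helper are all arbitrary assumptions, and results will vary with build settings and platform.

```swift
import Foundation

// Arbitrary sample size and iteration count (assumptions, tune as needed).
let attrStr = AttributedString(String(repeating: "Hello, world! ", count: 2_000))
let iterations = 10

/// Hypothetical helper: runs `body` repeatedly and returns the elapsed
/// wall-clock time in seconds.
@discardableResult
func measure(_ label: String, _ body: () -> String) -> TimeInterval {
    let start = Date()
    var totalBytes = 0
    for _ in 0..<iterations {
        // Use the result so the conversion cannot be skipped entirely.
        totalBytes += body().utf8.count
    }
    let elapsed = Date().timeIntervalSince(start)
    print("\(label): \(elapsed) s for \(iterations) runs (\(totalBytes) bytes)")
    return elapsed
}

measure("characters") { String(attrStr.characters) }
measure("characters[...]") { String(attrStr.characters[...]) }
measure("NSAttributedString") { NSAttributedString(attrStr).string }
```

Build in release mode and run it several times before reading anything into the ratios.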

Yes, using attrStr.characters[...] is much faster because we have an explicit overload for it in Foundation, whereas attrStr.characters falls back to the Sequence implementation. The [...] shouldn't be necessary; it's simply an oversight in the original API. The new API to fix this exists in the repo but is disabled, as it needs to be pitched and approved: swift-foundation/Sources/FoundationEssentials/AttributedString/Conversion.swift at e072f824bcd6f0d4bdb28142d15439e8afc4df00 · swiftlang/swift-foundation · GitHub. I don't see any reason why it couldn't be added if it were pitched.

I'd definitely recommend against NSAttributedString(attrStr).string, as that converts the entire AttributedString contents to an NSAttributedString (potentially dlopen-ing UI frameworks to find their attribute scopes, converting attribute values, and bridging the string content to a UTF-16-based NSString before bridging the NSString back to a String), which is why you see that one being far slower.

3 Likes