What is the designated API to get a String from an AttributedString?
As far as I can tell, there are currently two ways to fetch a String from an AttributedString, both revolving around AttributedString.CharacterView:
1. String has an initializer which takes a Sequence of Characters and constructs itself from that sequence. Given an AttributedString attrStr, you can write String(attrStr.characters). However, I believe there may be a performance pitfall here: as far as I can tell, there's currently no fast path in place to avoid iterating over characters one character at a time with a regular for-in loop, which is slower than copying bytes directly from an underlying buffer.
2. It's not particularly intuitive to find, but the AttributedString work added an initializer to String which takes a slice of an AttributedString's character view, reaches into its _guts to grab the underlying String, slices that String, and returns a bulk copy of the result. Although this is also necessarily an O(n) copy, it should be faster in practice than iterating over the character view and copying one character at a time. You can use this at the moment by writing String(attrStr.characters[...]).
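To make the two spellings concrete, here is a minimal sketch that builds a small AttributedString and converts it both ways. The sample text and the names viaSequence and viaSlice are only illustrative, and it assumes a platform where AttributedString is available (macOS 12 / iOS 15 or later, or a recent toolchain elsewhere).

```swift
import Foundation

// Build a small AttributedString to convert (sample text is arbitrary).
var attrStr = AttributedString("Hello, ")
attrStr.append(AttributedString("world!"))

// Approach 1: the generic String initializer that accepts any
// Sequence of Characters.
let viaSequence = String(attrStr.characters)

// Approach 2: slice the character view with an unbounded range, which
// selects the String initializer that takes a slice of the character view.
let viaSlice = String(attrStr.characters[...])

print(viaSequence)
print(viaSlice)
```

Both values hold the same text; the only difference is which initializer overload gets picked.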
I haven't thought through the slicing aspects, but it appears to me that not offering read access to the underlying String directly may only be an API oversight. Someone from the Foundation team, feel free to correct me, but this may be worth filing Feedback for. (As an alternative, it also seems reasonable to offer String.init(_: AttributedString.CharacterView) and String.init<S: AttributedStringProtocol>(_: S) as fast-path shortcuts for performing this conversion with fewer performance implications.)
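Until something like that exists, one option is to hide the slicing trick behind a small helper in your own code. The sketch below is an assumption on my part, not Foundation API: the property name plainText is hypothetical, and it simply forwards to the String(characters[...]) spelling described above.

```swift
import Foundation

extension AttributedString {
    /// Hypothetical convenience (not part of Foundation): returns the
    /// plain text by slicing the character view with an unbounded range.
    var plainText: String {
        String(characters[...])
    }
}

let greeting = AttributedString("Plain text, please")
print(greeting.plainText)
```

Keeping the workaround in one place would make it easy to swap in a dedicated initializer if one is added later.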
Thank you. I was using 1. without thinking about performance. It just didn't feel like a proper API for the job. For now I'll probably just switch to 2.
+1 for encouraging a more expressive API.
One more way: NSAttributedString(attributedString).string.
Good call with the characters[...] idea.
FWIW, some very rough timing tests suggest 'characters[...]' is about 25% faster than 'characters'.
They are both more than 10x faster than the NSAttributedString(attrStr).string route.
But context and caching make it all very approximate.
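For anyone who wants to repeat this kind of rough comparison, a sketch along these lines should do. The measure helper, the sample text, and the iteration count are all made up for illustration, wall-clock timing with Date is crude, and the NSAttributedString case is guarded because I'm only assuming that initializer is available on Apple platforms.

```swift
import Foundation

// Hypothetical helper: runs `body` repeatedly and returns elapsed seconds.
// The UTF-8 count is accumulated so the work is not trivially discarded.
func measure(_ label: String, iterations: Int = 200, _ body: () -> String) -> TimeInterval {
    let start = Date()
    var checksum = 0
    for _ in 0..<iterations {
        checksum &+= body().utf8.count
    }
    let elapsed = Date().timeIntervalSince(start)
    print("\(label): \(elapsed) s (checksum \(checksum))")
    return elapsed
}

// Arbitrary sample content; real results depend heavily on the input.
let sample = AttributedString(String(repeating: "Lorem ipsum dolor sit amet. ", count: 500))

_ = measure("characters") { String(sample.characters) }
_ = measure("characters[...]") { String(sample.characters[...]) }

#if canImport(Darwin)
// Assumption: the bridging initializer is only exercised on Apple platforms here.
_ = measure("NSAttributedString") { NSAttributedString(sample).string }
#endif
```

Build with optimizations and run it several times before reading anything into the numbers.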
Yes, using attrStr.characters[...] is much faster because we have an explicit overload for this in Foundation, whereas attrStr.characters is slower because it falls back to the Sequence implementation. The use of [...] shouldn't be necessary; it's simply an oversight in the original API. The new API to fix this exists in the repo but is disabled, as it needs to be pitched and approved: swift-foundation/Sources/FoundationEssentials/AttributedString/Conversion.swift at e072f824bcd6f0d4bdb28142d15439e8afc4df00 · swiftlang/swift-foundation · GitHub. I don't see any reason why it couldn't be added if it were pitched.
I'd definitely recommend against NSAttributedString(attrStr).string, as that converts the entire AttributedString contents to an NSAttributedString (potentially dlopen-ing UI frameworks to find their attribute scopes, converting attribute values, and bridging the string content to a UTF-16-based NSString before bridging the NSString back to a String), which is why you see that one being far slower.