Best way to operate on the raw string content of an AttributedString

I want to use Foundation's AttributedString as a backing store for syntax-highlighted (LaTeX) source code in my editor app.

There are basically two categories of code dealing with LaTeX source in my app:

  1. Code that's only interested in the raw string content (the character data). This code compares, searches (also case-insensitively), slices, or modifies the source string.
  2. Code that stores or retrieves syntax highlighting information in the attributes (for integration with NSTextStorage / NSTextView).

Previously, syntax highlighting was implemented by copying the source into a separate NSMutableAttributedString, and applying the attributes there. This is of course ugly and undesirable, because the source is now stored twice.

I would like to replace the two separate strings with one AttributedString that can cater to both use cases.

I think porting the existing syntax highlighting should be pretty straightforward, but I am not sure how to best approach category 1, the code that operates on the raw string content.

For example, I can't find API for case-insensitive comparison. StringProtocol has compare(_:options:range:locale:), but I can't find anything like that on AttributedStringProtocol.

Would I need to operate on AttributedString's characters view instead? Or can I get a StringProtocol-like view on an AttributedString somehow?

Today, the best way to access string content is via the AttributedString.CharacterView or AttributedString.UnicodeScalarView (which behave very similarly to String itself and String.UnicodeScalarView). A variety of functionality that is defined generically over Collection<Character>, or over Collection in general, is available for both String and AttributedString.CharacterView. This is the most ergonomic and efficient way to interact with the character contents, since it views the contents directly rather than copying them out of the backing tree structure and into a new type.
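
As a rough illustration of the point above, here is a minimal sketch of generic Collection algorithms applied to the characters view. The sample source string and the variable names are made up for the example and are not from the original post.

```swift
import Foundation

// Hypothetical sample; any LaTeX source would do.
let source = AttributedString("\\section{Intro} Some text")
let chars = source.characters

// Generic Sequence/Collection algorithms operate directly on the view.
let commandPrefix = "\\section"
let startsWithSection = chars.starts(with: commandPrefix)
let backslashCount = chars.lazy.filter { $0 == "\\" }.count

// Slice out the first braced argument using AttributedString.Index values.
// Only the final String(...) copies characters, and only those in the slice.
var firstArgument = ""
if let open = chars.firstIndex(of: "{"),
   let close = chars.firstIndex(of: "}"),
   open < close {
    firstArgument = String(chars[chars.index(after: open)..<close])
}
```

The indices returned here are AttributedString.Index values, so the same range could also be used to subscript the attributed string itself when setting attributes.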

You are right, however, that there are some APIs defined on StringProtocol that might not have a direct equivalent for AttributedString.CharacterView. If you encounter these, it'd be super helpful if you could file a feedback / post an issue on the swift-foundation repo with an example of your use case and the API you're trying to use that isn't present (I believe case-insensitive comparison is likely one of these). If needed, you could create a String from the character view via String(myAttrStr.characters), but note that this will copy the characters out into the String and does not provide a "view" into the contents like the CharacterView does.
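
For the case-insensitive comparison mentioned in the question, a minimal sketch of the copy-based fallback might look like the following. The sample strings and names are assumptions for illustration; the only APIs used are String(_:) on the characters view and StringProtocol's compare(_:options:range:locale:).

```swift
import Foundation

// Hypothetical inputs that differ only in letter case.
let lhs = AttributedString("\\Begin{document}")
let rhs = AttributedString("\\begin{DOCUMENT}")

// String(...) copies the characters out of each attributed string,
// so this is a fallback rather than a view into the contents.
let result = String(lhs.characters)
    .compare(String(rhs.characters), options: .caseInsensitive)
let equalIgnoringCase = (result == .orderedSame)
```

Because each comparison copies both strings, this is probably best kept to short slices or infrequent calls rather than run over the whole document on every keystroke.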

Jeremy,

thanks, that's in line with my understanding.

I'll explore that approach further and will post use cases.