Swift Regex: lookbehind

Notably missing from Swift's new regex features is a way to perform a lookbehind. Is this an intentional omission or something that will eventually arrive?


It is intended to be supported eventually. Can you share your use case? There are many different kinds of lookbehind, and many engines do not support the fully general case because it can add extra algorithmic factors that aren't present with lookahead.

For example, there's lookbehind of fixed-length verbatim content, lookbehind of an alternation of fixed-length verbatim content, and lookbehind of simple, reversible regexes. These can be supported with different levels of efficiency without really changing the regex's algorithmic complexity.
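To make the cheap cases concrete, here is a minimal sketch of what a fixed-length verbatim lookbehind (and an alternation of them) amounts to when written by hand. The helper names `isPreceded(by:in:at:)` and `isPreceded(byAnyOf:in:at:)` are made up for illustration and are not part of any Swift API; this says nothing about how the engine would actually implement it.

```swift
// Hypothetical helper: emulates (?<=literal) at `position`.
// It only inspects the text that ends at `position`, so no search over
// earlier starting positions is needed.
func isPreceded(by literal: String, in text: String, at position: String.Index) -> Bool {
    text[..<position].hasSuffix(literal)
}

// Hypothetical helper: emulates (?<=a|b|c) for an alternation of literals.
func isPreceded(byAnyOf literals: [String], in text: String, at position: String.Index) -> Bool {
    literals.contains { text[..<position].hasSuffix($0) }
}

let sample = "foobar"
let afterFoo = sample.index(sample.startIndex, offsetBy: 3)
print(isPreceded(by: "foo", in: sample, at: afterFoo))
print(isPreceded(byAnyOf: ["xyz", "oo"], in: sample, at: afterFoo))
print(isPreceded(by: "bar", in: sample, at: afterFoo))
```

In both helpers the check is a single suffix comparison against the text before the current position, which is why these forms do not change the overall matching complexity.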

But in the most general form, a lookbehind regex component would need to attempt the regex from every starting position in the input prior to the current position. This is very different from a lookahead, which only tries from the current position.
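As a rough illustration of that cost, the following sketch emulates a general lookbehind by brute force. It assumes a Swift 5.7 or later toolchain with the `Regex` type available; `lookbehindSucceeds(_:in:at:)` is a hypothetical helper written for this post, not an existing API and not a description of the engine's internals.

```swift
// Hypothetical helper: emulates (?<=pattern) at `position` the naive way.
// It walks backwards from `position` and, for every candidate start, asks
// whether the pattern matches exactly the text between that start and
// `position`. In the worst case this tries every earlier index.
func lookbehindSucceeds(
    _ pattern: Regex<AnyRegexOutput>,
    in text: String,
    at position: String.Index
) -> Bool {
    var start = position
    while true {
        if text[start..<position].wholeMatch(of: pattern) != nil {
            return true
        }
        if start == text.startIndex {
            return false
        }
        start = text.index(before: start)
    }
}

let input = "xxab#"
let hashIndex = input.firstIndex(of: "#")!
print(lookbehindSucceeds(try! Regex("a."), in: input, at: hashIndex))
print(lookbehindSucceeds(try! Regex("ba"), in: input, at: hashIndex))
```

A lookahead at the same position would be a single `prefixMatch(of:)` on `input[hashIndex...]`, with no loop over starting positions.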


A place where I just wanted to use a lookbehind was in implementing a hashtag detector that follows the Unicode specification for Hashtag Identifiers. (Who knew such a thing existed?!)

The spec includes the following:

<Hashtag-Identifier> := <Start> <Continue>* (<Medial> <Continue>+)*

When parsing hashtags in flowing text, it is recommended that an extended Hashtag only be recognized when there is no Continue character before a Start character. For example, in “abc#def” there would be no hashtag, while there would be in “abc #def” or “abc.#def”.

One natural way to implement this would be with a negative lookbehind, verifying that there is not a <Continue> character before the <Start> character (which is generally '#').

So this would be a single-character lookbehind.
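In the meantime, one workaround is to match candidates without the lookbehind and then check the preceding character by hand. The sketch below assumes a Swift 5.7 or later toolchain. `isContinue` and `hashtags(in:)` are hypothetical names, the `<Continue>` class is approximated by letters, digits, and underscore rather than the real Unicode definition, `<Medial>` is left out, and at least one `<Continue>` character is required after the `#`.

```swift
// Rough stand-in for the spec's <Continue> class (an assumption, not the
// actual Unicode property set).
func isContinue(_ c: Character) -> Bool {
    c.isLetter || c.isNumber || c == "_"
}

// Hypothetical helper: finds "#word" candidates, then emulates a
// single-character negative lookbehind by inspecting the character
// immediately before each match.
func hashtags(in text: String) -> [String] {
    let candidate = try! Regex("#\\w+")
    return text.matches(of: candidate).compactMap { match -> String? in
        let start = match.range.lowerBound
        if start > text.startIndex, isContinue(text[text.index(before: start)]) {
            return nil  // preceded by a <Continue> character: not a hashtag
        }
        return String(text[match.range])
    }
}

print(hashtags(in: "abc#def"))
print(hashtags(in: "abc #def"))
print(hashtags(in: "abc.#def"))
```

With the spec's own examples, the first call should find nothing and the other two should each find `#def`.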
