Custom string delimiters

David_Catmull · April 11, 2018, 5:16pm

Swift allows custom postfix, prefix, and infix operators, but I find myself also wishing for custom string delimiters to indicate different categories of strings - localizable, user interface identifiers, resource names, etc. Unicode has lots of options for expressive delimiters, some of which aren't too hard to type on a standard Mac US keyboard, such as «», ‹›, and “” (curly quotes).

A currently available option is to create a custom prefix/postfix operator and apply it to a regular string literal, such as §"image_name", but I think using other delimiters could make it more distinctive and expressive.
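For reference, the operator-based workaround can be sketched as follows. `ImageName` is a hypothetical stand-in for `NSImage.Name`, so the snippet compiles without AppKit, and the parameter name `literal` is likewise just illustrative.

```swift
// Hypothetical stand-in for NSImage.Name, so the sketch needs no AppKit.
struct ImageName: RawRepresentable, Equatable {
    let rawValue: String
    init(rawValue: String) { self.rawValue = rawValue }
}

// § is a valid operator character in Swift, so it can be declared as a
// prefix operator and applied directly to an ordinary string literal.
prefix operator §

prefix func § (literal: String) -> ImageName {
    return ImageName(rawValue: literal)
}

let icon = §"image_name"
```

The literal itself is still an ordinary `"…"` string; only the operator marks it as a resource name.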

An idea for how this might work:

delimiter operator «»
delimiter func «»(stringLiteral: String) -> NSUserInterfaceItemIdentifier
{
  return NSUserInterfaceItemIdentifier(rawValue: stringLiteral)
}
delimiter func «»(stringLiteral: String) -> NSImage.Name
{
  return NSImage.Name(rawValue: stringLiteral)
}

This adds delimiter to the existing operator types, where the delimiter is a pair of characters. Should multi-character delimiters be allowed? My gut feeling is that would be too complicated, so I'm recommending a simple pair.

Once the delimiter is defined, you can define functions that take a string literal and return some non-Void type. As with generic functions that differ only in their return type, the compiler could determine which overload of the function to use depending on the context.
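Return-type overloading of this kind already works in today's Swift, which may make the idea easier to picture. The sketch below uses a prefix operator in place of the proposed delimiter; `ImageName` and `InterfaceIdentifier` are hypothetical stand-ins for the AppKit types.

```swift
// Hypothetical stand-ins for NSImage.Name and NSUserInterfaceItemIdentifier.
struct ImageName: RawRepresentable, Equatable {
    let rawValue: String
    init(rawValue: String) { self.rawValue = rawValue }
}

struct InterfaceIdentifier: RawRepresentable, Equatable {
    let rawValue: String
    init(rawValue: String) { self.rawValue = rawValue }
}

prefix operator §

// Two overloads that differ only in their return type.
prefix func § (literal: String) -> ImageName {
    return ImageName(rawValue: literal)
}

prefix func § (literal: String) -> InterfaceIdentifier {
    return InterfaceIdentifier(rawValue: literal)
}

// The type annotation supplies the context that selects the overload.
let toolbarIcon: ImageName = §"toolbar_icon"
let okButton: InterfaceIdentifier = §"ok_button"

// Without any context the call is ambiguous and does not compile:
// let unknown = §"something"
```

The same contextual selection happens when the expression is passed as an argument to a parameter of a known type.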

A string literal with a custom delimiter would of course have to allow for escaping the closing delimiter in case you wanted to have it as part of the string, and that seems straightforward enough, like «example\»string».
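To make the escaping rule concrete, here is a minimal scanner sketch. The function name `scanGuillemetLiteral` is hypothetical, and it deliberately simplifies: a backslash just keeps the next character verbatim, whereas a real lexer would also interpret escapes such as `\n`.

```swift
// Hypothetical helper: returns the body of a «…» literal, or nil if the
// input is malformed. A backslash escapes the character that follows it.
func scanGuillemetLiteral(_ source: String) -> String? {
    var iterator = source.makeIterator()
    guard iterator.next() == "«" else { return nil }

    var body = ""
    while let character = iterator.next() {
        switch character {
        case "\\":
            // Keep the escaped character as-is; a trailing backslash is an error.
            guard let escaped = iterator.next() else { return nil }
            body.append(escaped)
        case "»":
            // An unescaped closing delimiter is only valid at the very end.
            return iterator.next() == nil ? body : nil
        default:
            body.append(character)
        }
    }
    return nil // input ended before the closing delimiter
}

let scanned = scanGuillemetLiteral("«example\\»string»")
```

Note that the backslash is doubled in the Swift source only because the test input is itself written as a regular string literal.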

I considered a requirement that the function return a type that conforms to ExpressibleByStringLiteral and/or RawRepresentable, since those are related concepts, but couldn't think of exactly what value such a restriction would actually add.
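For comparison, this is roughly what conforming to both protocols looks like today. `ResourceName` is a hypothetical type used only for illustration.

```swift
// Hypothetical type that is both RawRepresentable and expressible by a
// plain string literal.
struct ResourceName: RawRepresentable, ExpressibleByStringLiteral, Equatable {
    let rawValue: String

    init(rawValue: String) { self.rawValue = rawValue }

    // Called by the compiler when a string literal appears where a
    // ResourceName is expected.
    init(stringLiteral value: String) { self.rawValue = value }
}

let background: ResourceName = "background_image"
```

With this conformance no operator or delimiter is needed at all, but nothing at the use site distinguishes the literal from an ordinary `String`, which is the distinction the custom delimiter is meant to provide.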

Thoughts? Has this been discussed before? I searched and couldn't find anything.

nuclearace · April 11, 2018, 5:22pm

Yes, it was discussed in SE-0200: "Raw" mode string literals, which was returned for revision.

David_Catmull · April 11, 2018, 5:27pm

Raw mode means interpreting the contents of the literal differently. What I'm proposing is being able to define custom delimiters without otherwise changing how the characters within the delimiters are interpreted.

nuclearace · April 11, 2018, 5:28pm

Yes, but one of the ways that was discussed in that thread was allowing custom delims, and there was interesting discussion around this idea.

David_Catmull · April 11, 2018, 5:32pm

OK. I missed that. It still looks like a significantly different case though - what I see there is about defining custom delimiters for a single instance to avoid escape characters within the string.

MutatingFunk · April 11, 2018, 8:03pm

As you say, the prefix operator on a regular string works well enough, so I'm not sure this is worth the complexity… It'd also be tough to find good delimiters without clashing with existing operators or venturing into Unicode…

But practicality aside, that's a really creative and cool idea.

If the string parameter had to match a custom RegEx rule, it would almost let you create your own literals.
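A rough approximation of that with the existing prefix-operator approach might look like the sketch below. `HexColor`, the `§` operator, and the pattern are all assumptions made for illustration, and the check runs at run time rather than at compile time, so it is weaker than a true literal.

```swift
import Foundation

// Hypothetical type: a color written as "#RRGGBB".
struct HexColor: Equatable {
    let red: UInt8
    let green: UInt8
    let blue: UInt8
}

prefix operator §

// Returns nil when the text does not match the pattern for this "literal".
prefix func § (literal: String) -> HexColor? {
    guard literal.range(of: "^#[0-9A-Fa-f]{6}$", options: .regularExpression) != nil,
          let value = UInt32(literal.dropFirst(), radix: 16) else {
        return nil
    }
    return HexColor(red: UInt8(value >> 16),
                    green: UInt8((value >> 8) & 0xFF),
                    blue: UInt8(value & 0xFF))
}

let orange = §"#FF8000"
let invalid = §"orange"
```

Here `orange` holds the parsed components and `invalid` is `nil`; a compiler-checked rule would instead reject the second line at build time.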

QuinceyMorris · April 11, 2018, 8:47pm

I can think of several reasons why this might not be a good idea:

1. They aren't strings any more. By encapsulating their raw value, the Swift versions of these types make the dependence on string representations an implementation detail.
2. They are subject to the same objection as custom operators generally: readers of the code have no prior knowledge of what they mean. Concision doesn't seem like a desirable goal here.
3. delimiter operator «» isn't actually an operator, since it's not a function from one type to another. The text inside the delimiters isn't a string literal. To make this work, the delimiter characters would have to be recognized during lexical analysis (just like "…"), but at that point the set of custom operators isn't known.

David_Catmull · April 12, 2018, 5:08pm

I thought about your issue #2, but I figured that since the same issue already exists with custom operators, it's not such a big deal; it's nothing new.

On the other hand, I can see how the lexical analysis situation would be much more complicated than with custom operators.