Adding superscript and subscript syntax to Swift-DocC, and a question about strikethrough syntax

Hello! Today i want to propose a couple Markdown extensions that could be added to Swift-Markdown and Swift-DocC.

As the Swift book (TSPL) is getting converted to use Swift-DocC, it’s become apparent that it requires some markup features that Swift-DocC currently doesn’t have. The one i’m looking at today is the little “opt” suffix in the grammar sections in the Reference, which is typeset as a subscript.

The Markdown parser currently in use (swift-cmark, a fork of cmark-gfm) doesn’t have a syntax for subscripts like this, and i’d like to add it. Late last year, i messed around with adding a superscript extension that used Reddit’s syntax (thanks to Christian Selig, who i was chatting with at the time, for the nerd-snipe): word^superscript or word^(superscript multiple words). This is a fairly popular syntax for writing superscripts in Markdown, as several different implementations have syntax much like this. (An alternative would be to use something like Pandoc’s syntax, which wraps the span in carets: word^superscript^ or word^superscript multiple words^.)
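To make the two Reddit-style spellings concrete, here is a minimal Python sketch that rewrites them to HTML sup tags. It is only an illustration: the helper name render_superscripts and the regular expression are assumptions made for this example, not code from swift-cmark or from Reddit, and the sketch ignores the edge cases a real inline parser has to handle (code spans, escapes, trailing punctuation).

```python
import re

# Hypothetical helper for illustration only; not swift-cmark's implementation.
# First alternative: ^(text up to the first closing parenthesis)
# Second alternative: ^word (a run of non-whitespace characters)
_SUPERSCRIPT = re.compile(r"\^\(([^)]*)\)|\^(\S+)")

def render_superscripts(text: str) -> str:
    """Replace Reddit-style superscript spans with <sup> tags."""
    def replace(match: re.Match) -> str:
        # Exactly one of the two groups is set, depending on which form matched.
        content = match.group(1) if match.group(1) is not None else match.group(2)
        return f"<sup>{content}</sup>"
    return _SUPERSCRIPT.sub(replace, text)

print(render_superscripts("word^superscript"))
print(render_superscripts("word^(superscript multiple words)"))
```

The parenthesized form is what lets a superscript contain spaces; the bare form ends at the first whitespace character.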

Finding a common syntax for subscripts is somewhat harder, though: There are far fewer implementations of subscript syntax, and the common thread uses surrounding tildes: word~subscript~. However, this currently clashes with GitHub’s “strikethrough” extension, which also uses surrounding tildes. This ambiguity can be solved by requiring two tildes instead of one, but for the moment Swift-DocC doesn’t do this, so both syntaxes are accepted: word ~stricken~ and word ~~stricken~~.

A way to mitigate the syntax clash is to actually tweak the syntax to appear more like Reddit’s superscript, i.e. word~subscript or word~(subscripting multiple words). This syntax doesn’t seem to have precedent anywhere that i could find, but it also doesn’t clash with GitHub’s strikethrough extension at all, which puts it at an advantage compared to Pandoc’s syntax, written above.
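As a rough sketch of how the two tilde syntaxes could coexist, the Python below treats a tilde run as strikethrough only when it does not directly follow a word character, and treats a tilde that does follow a word character as a subscript marker. That rule is a simplifying assumption made for this example; it is not how cmark-gfm decides whether a delimiter can open a span, and render_tildes is a hypothetical name.

```python
import re

# Assumption for this sketch: strikethrough uses one or two surrounding tildes,
# and its opening tildes never directly follow a word character.
_STRIKE = re.compile(r"(?<![\w~])(~{1,2})([^~\s](?:[^~]*[^~\s])?)\1(?![\w~])")

# Assumption for this sketch: a subscript tilde directly follows a word character
# and is followed by either (text) or a single run of non-space characters.
_SUBSCRIPT = re.compile(r"(?<=\w)~\(([^)]*)\)|(?<=\w)~([^\s~()]+)")

def render_tildes(text: str) -> str:
    """Render strikethrough first, then the proposed subscript syntax."""
    text = _STRIKE.sub(r"<del>\2</del>", text)

    def replace(match: re.Match) -> str:
        content = match.group(1) if match.group(1) is not None else match.group(2)
        return f"<sub>{content}</sub>"

    return _SUBSCRIPT.sub(replace, text)

print(render_tildes("word ~stricken~"))
print(render_tildes("word ~~stricken~~"))
print(render_tildes("word~subscript"))
print(render_tildes("word~(subscripting multiple words)"))
```

Under these assumptions both strikethrough spellings keep working, and neither subscript spelling is picked up as strikethrough because there is no closing tilde to pair with.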

Summary of questions

To summarize the post:

  1. Can/should Swift-DocC adopt Reddit’s superscript syntax (word^superscript and word^(superscript))? This has an open PR on swift-cmark and could be integrated into Swift-DocC relatively easily.
  2. Can/should Swift-DocC adopt its own subscript syntax, based on Reddit’s superscript syntax (word~subscript and word~(subscript))? This would be relatively easy to implement, based on the existing work for superscripts.
  3. Can/should the subscript extension automatically require two tildes to use strikethrough (plain ~~stricken~~ but not plain ~stricken~)? Would this break anyone’s existing documentation?
  4. Can/should Swift-DocC adopt Pandoc’s superscript and subscript syntax instead of Reddit’s (word^superscript^ and word~subscript~)? This would require forcing two tildes for strikethrough, which may break existing documentation.

All of the above references to Swift-DocC also apply to Swift-Markdown and swift-cmark, where the integration would be written to begin with.

One thing about Reddit’s superscript syntax is that it makes it really hard to put a closing parenthesis in a superscript.

I don’t know if that is important for your use-case, but to the best of my knowledge this cannot be typed with Reddit syntax:

2<sup>(a+b)(c+d)</sup>

You can get close, as this:

2^((a+b)^) ^((c+d)^)

produces:

2<sup>(a+b)</sup> <sup>(c+d)</sup>

But besides having a non-obvious spelling, that also puts an unwanted space between the parenthesized terms.

Interesting! This is a really good point; i wonder if that could be mitigated with backslash-escapes, but it's still something worth noting.
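One way the backslash-escape idea could work is sketched below in Python. This is a hypothetical scanner written for illustration, not behavior that Reddit or swift-cmark is known to have: inside ^( ... ), a backslash makes the next character literal, so \) no longer closes the span. The function name is an assumption.

```python
def render_superscripts_with_escapes(text: str) -> str:
    """Render ^(...) spans as <sup>, treating backslash-escaped characters as literal."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("^(", i):
            j = i + 2
            content = []
            closed = False
            while j < len(text):
                if text[j] == "\\" and j + 1 < len(text):
                    # Escaped character: keep it literally and skip the backslash.
                    content.append(text[j + 1])
                    j += 2
                elif text[j] == ")":
                    # First unescaped closing parenthesis ends the span.
                    closed = True
                    j += 1
                    break
                else:
                    content.append(text[j])
                    j += 1
            if closed:
                out.append("<sup>" + "".join(content) + "</sup>")
                i = j
                continue
        # Not a (closed) superscript span: copy the character through unchanged.
        out.append(text[i])
        i += 1
    return "".join(out)

# Without escapes, the span ends at the first closing parenthesis.
print(render_superscripts_with_escapes("2^((a+b)(c+d))"))
# With escapes, both parenthesized terms stay inside one superscript.
print(render_superscripts_with_escapes(r"2^((a+b\)(c+d\))"))
```

The cost is that authors have to remember to escape every closing parenthesis inside a superscript, which is its own kind of non-obvious spelling.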

I don't know that I would think to try ~ for subscript. If anything I might try _, a la TeX, but that conflicts with italics.

Some implementations of markdown allow a subset of HTML; I've used <sup> and <sub> in markdown before. Would it be feasible to implement that?

Right, underscores are right out because they would clash with basic Markdown, like you mentioned. Tildes are one of the few remaining punctuation marks on a standard ANSI keyboard, sadly.

As far as HTML tags go, the base Markdown implementations we use let them through, but i think Swift-DocC filters them out. @ethankusters @marcus_ortiz Do either of you know if it would be a problem for docc to allow some select HTML tags through to Swift-DocC-Render?

Given that it's already used for strikethrough, it's not really one of the remaining punctuation marks... I'd be hesitant to repurpose something that's both already accepted and that people coming from GitHub would expect.

The primary reason i went with tildes is that it's the only existing syntax i could find for subscripts used by any Markdown extension. I think Pandoc settled on that syntax (and also has two tildes for strikethrough, like GitHub), everyone else who wrote an extension for their own purposes just went with that, and anyone who doesn't have an extension sticks with the corresponding HTML tags. I'm hesitant to just rely on the HTML tags, though, since that would mean the Render JSON output by Swift-DocC assumes it's getting translated to HTML in the end as well.

Personally, I think markdown was foolish to reserve both asterisks and underscores for italics.

Like, underscores clearly should’ve been for subscripts.

This is well into the weeds now, but Discord does

*italic*
_italic_
**bold**
__underline__

and I've become somewhat used to that, even though it seems to be more common for double-underscore to also be bold.