Hi everyone,
I'd like to propose an enhancement to Swift-DocC to make the output more easily consumed by tools that either don't evaluate JavaScript when visiting a page or tools that can evaluate JavaScript but are impacted by it in one way or another.
Such tools could for example be a search engine crawler/indexer, a large language model fetching a page in response to a prompt, or a custom tool that reads or processes the documentation pages.
Introduction
The Swift-DocC output—a '.doccarchive' directory—is a single page web application that contains a per-page JSON file which Swift-DocC Render turns into rich documentation web pages using JavaScript.
For compatibility with static hosting environments, like GitHub Pages, DocC added a per-page copy of the render template's "index.html" file back in 2021.
This meant that no matter which documentation page a reader visited, they'd load the blank "index.html" page with the necessary JavaScript imports to request the right JSON data file and render the page without setting up custom routing rules to host the documentation.
However, because the page is ultimately rendered using JavaScript, its content isn't available to tools unless they either run a headless browser that evaluates the JavaScript and waits for the page to finish loading, or know the URL of the JSON data file and know how the content is encoded in that file.
Proposed solution
Instead of making an identical "index.html" copy for each page, DocC would render each page's content into minimal semantic HTML and use that as the contents of the page's <noscript> element.
This means that when a person reads the documentation (with JavaScript enabled) in a web browser they're going to see the full fidelity page that's rendered by Swift-DocC Render—just like they do today.
However, when a tool accesses the documentation pages through curl or web requests through a system like URLSession, the response data will include a representation of the page's content that the tool can access.
This new content would be controlled by a new, off by default, --experimental-transform-for-static-hosting-with-content flag (following the spelling of the existing and on by default --transform-for-static-hosting flag).
You can see a live demo of this in Swift-DocC's library documentation. When you open that page in the browser with JavaScript enabled everything looks the same as it does today. However, if you disable JavaScript in your browser and refresh the page you'll see its documentation content instead of a "This page requires JavaScript" message. These pages are browsable for people too—you can navigate the documentation by clicking links—but it's primarily intended for tools at this point.
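To illustrate the tool side of this, here is a minimal sketch (not part of DocC) of how a tool could pull the text out of a page's <noscript> element using only the Python standard library. The sample page and the names `NoscriptTextExtractor` and `noscript_text` are made up for this example; a real tool would fetch the page over HTTP instead of using an embedded string.

```python
from html.parser import HTMLParser


class NoscriptTextExtractor(HTMLParser):
    """Collects the text found inside the page's <noscript> element."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # greater than 0 while inside <noscript>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "noscript" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that appears inside <noscript>.
        if self._depth > 0 and data.strip():
            self.chunks.append(data.strip())


def noscript_text(page_html: str) -> list[str]:
    parser = NoscriptTextExtractor()
    parser.feed(page_html)
    parser.close()
    return parser.chunks


# Hypothetical response body with placeholder content.
SAMPLE_PAGE = """<!DOCTYPE html>
<html><head><title>emit(_:)</title></head>
<body>
<noscript>
<h1>emit(_:)</h1>
<p>A placeholder abstract for this sketch.</p>
</noscript>
<div id="app"></div>
</body></html>"""

print(noscript_text(SAMPLE_PAGE))
```

The parser is created with its default settings, so the content of <noscript> is parsed as regular markup rather than treated as raw text.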
Detailed design
I propose adding a new, off by default, --experimental-transform-for-static-hosting-with-content flag to the docc executable.
When DocC is passed this flag, it will use the "index.html" file from the render template (located at "../share/docc/render/" relative to the docc executable in the toolchain) as a base for each per-page "index.html" file.
DocC will identify the <head><title> element and the <body><noscript> element and use this information to customize the base "index.html" data with per-page content. Specifically, DocC will make 3 modifications compared to the "base":
- Use the page's title (what appears in the page's <h1> element) as the content for the <title> element, instead of "Documentation" like today.
- Add a <meta name="description" content="..."> element that contains a plain text version of the page's abstract (the paragraph of text right below its title).
- Replace the "This page requires JavaScript" message in the page's <noscript> element with a minimal semantic HTML version of the page's content.
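The three modifications can be sketched roughly like this. This is an illustration in Python, not DocC's actual implementation: the `BASE_INDEX_HTML` stand-in, the `customize_index` function, and the choice to place the <meta> element right after <title> are all assumptions made for the example.

```python
import html
import re

# A stand-in for the render template's "index.html"; the real file has more content.
BASE_INDEX_HTML = (
    "<!DOCTYPE html><html><head><title>Documentation</title></head>"
    "<body><noscript>[This page requires JavaScript message]</noscript>"
    '<div id="app"></div></body></html>'
)


def customize_index(base: str, title: str, abstract: str, content_html: str) -> str:
    """Apply the three per-page modifications to a copy of the base page."""
    escaped_title = html.escape(title)
    description = html.escape(abstract, quote=True)

    # 1. Replace the generic <title> text with the page's title, and
    # 2. add a <meta name="description"> element (placement is an assumption).
    page = re.sub(
        r"<title>.*?</title>",
        lambda _: f"<title>{escaped_title}</title>"
                  f'<meta name="description" content="{description}">',
        base, count=1, flags=re.DOTALL,
    )
    # 3. Replace whatever is inside <noscript> with the page's HTML content.
    page = re.sub(
        r"(<noscript>).*?(</noscript>)",
        lambda m: m.group(1) + content_html + m.group(2),
        page, count=1, flags=re.DOTALL,
    )
    return page


page = customize_index(
    BASE_INDEX_HTML,
    title="emit(_:)",
    abstract="A <placeholder> abstract.",
    content_html="<h1>emit(_:)</h1>",
)
print(page)
```

The title and abstract are escaped because they're plain text, while the <noscript> content is inserted as-is because it's already HTML.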
The information that DocC considers as "the page's content" for this purpose is:
- the "breadcrumbs" that link to the hierarchy of container pages for the current page
- the type of page (for example "Article", "Class", "Instance Method", "Enumeration Case", etc.)
- the title and abstract
- the declaration (for symbols)
- any platform availability information (for example "iOS 12.2+")
- any parameter documentation (for symbols)
- any return value documentation (for symbols)
- a "Mentioned In" section that lists the articles that link to this symbol in their content (for symbols)
- the "Overview" or "Discussion" section and its subsections, which contain the remainder of the page's authored documentation
- any authored documentation organization in a "Topics" section
- any automatic/default documentation organization based on the symbol hierarchy (for example "Instance Properties", "Type Methods", etc.)
- a "Relationships" section that lists the symbols that this symbol conforms to, inherits from, etc. (for symbols)
Because that HTML content is primarily aimed at tools, DocC aims for the HTML to be very minimal/concise and to use semantic elements whenever possible.
For conciseness, DocC would join a symbol's declaration fragments into a single string, rather than creating individual elements for each declaration fragment:
<pre>
<code>func emit(_ problem: Problem)</code>
</pre>
This means that the "func" keyword fragment and the "emit" identifier fragment can't have different text colors for syntax highlighting.
It also means that the "Problem" type identifier fragment won't be a clickable link to that type.
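As a small sketch of what joining the fragments means, assume the declaration is available as a list of (kind, text) pairs. That representation, the kind names, and the `declaration_html` function are assumptions for this example rather than DocC's actual data model.

```python
import html

# Hypothetical fragment data for the declaration shown above.
FRAGMENTS = [
    ("keyword", "func"),
    ("text", " "),
    ("identifier", "emit"),
    ("text", "("),
    ("externalParam", "_"),
    ("text", " "),
    ("internalParam", "problem"),
    ("text", ": "),
    ("typeIdentifier", "Problem"),
    ("text", ")"),
]


def declaration_html(fragments):
    """Join all fragment text into one escaped string inside <pre><code>."""
    # The fragment kinds are ignored, which is why per-fragment
    # highlighting and links aren't possible in this form.
    joined = "".join(text for _kind, text in fragments)
    return f"<pre>\n<code>{html.escape(joined)}</code>\n</pre>"


print(declaration_html(FRAGMENTS))
```

Because the kinds are discarded, the output is one text node instead of ten elements.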
Also for conciseness, DocC wouldn't wrap conceptual sections like "Parameters", "Returns", or "Topics" in a <section> element. Instead, a parameters "section" would start at the <h2>Parameters</h2> heading and conceptually continue until the next <h2> element:
<h2>Parameters</h2>
<dl>
<dt>problem</dt>
<dd>
<p>The diagnostic to dispatch to this engine’s currently subscribed consumers.</p>
</dd>
</dl>
It's possible that this content would be more easily read by tools if DocC instead wrapped each conceptual section in a <section> element with an id attribute like "Parameters", "Returns", etc. See Open Questions below.
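To give a feel for what "continue until the next <h2>" means for a consuming tool, here is a hedged sketch that groups text under the most recent <h2> heading. The class and function names are invented for this example, and the second heading in the sample is placeholder content.

```python
from html.parser import HTMLParser


class HeadingSectionSplitter(HTMLParser):
    """Groups text under the most recent <h2> heading."""

    def __init__(self):
        super().__init__()
        self.sections = {}        # heading text -> list of text chunks
        self._current = None      # heading that following content belongs to
        self._in_h2 = False
        self._heading_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self._heading_parts = []

    def handle_endtag(self, tag):
        if tag == "h2":
            # A new heading closes the previous conceptual section.
            self._in_h2 = False
            self._current = "".join(self._heading_parts).strip()
            self.sections[self._current] = []

    def handle_data(self, data):
        if self._in_h2:
            self._heading_parts.append(data)
        elif self._current is not None and data.strip():
            self.sections[self._current].append(data.strip())


def sections_by_heading(content_html):
    parser = HeadingSectionSplitter()
    parser.feed(content_html)
    parser.close()
    return parser.sections


SAMPLE = """<h2>Parameters</h2>
<dl>
<dt>problem</dt>
<dd>
<p>The diagnostic to dispatch to this engine’s currently subscribed consumers.</p>
</dd>
</dl>
<h2>Topics</h2>
<p>Placeholder content.</p>"""

sections = sections_by_heading(SAMPLE)
print(list(sections))
```

A tool has to track this state itself because nothing in the markup marks where a section ends.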
Because DocC uses a single URL to represent all language representations of a symbol and because this information is meant to be accessed without JavaScript enabled, DocC would only include the "primary" language representation's information inside the <noscript> element. This means that, for example, a Swift-only symbol would display all its information and an Objective-C-only symbol would display all its information, but a symbol that has representations in both Swift and Objective-C would only include the Swift representation's information inside the <noscript> element.
In order to support tools operating on these files locally, without a web server running, DocC would explicitly include the "index.html" component in links between pages.
This also has the added bonus that a person can locally read the documentation information by double clicking on one of the "index.html" files and clicking through the links in the documentation.
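As an illustration of why the explicit "index.html" component matters for local files, here is a sketch that computes a link from one page's directory to another page's "index.html". Whether DocC emits relative links like this is an assumption on my part, and the page paths and the `link_between_pages` function are hypothetical.

```python
import posixpath


def link_between_pages(from_page: str, to_page: str) -> str:
    """Relative link from one page's directory to another page's index.html.

    Both arguments are page paths relative to the archive root.
    """
    # Without the explicit file name, a link to a directory only works
    # when a web server maps the directory to its index.html file.
    target = posixpath.join(to_page, "index.html")
    return posixpath.relpath(target, start=from_page)


link = link_between_pages(
    "documentation/swiftdocc/diagnosticengine/emit(_:)",  # hypothetical path
    "documentation/swiftdocc/problem",                    # hypothetical path
)
print(link)
```

When a file is opened directly from disk, the browser doesn't resolve a directory link to its "index.html" the way a web server would, so the explicit component is what keeps the links clickable.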
The SwiftDocC library won't expose any new public API related to these changes. By not adding any public API, it allows the specifics of the HTML creation to remain an implementation detail and allows for that implementation to change (completely) in the future without the risk of breaking any library clients.
Size and performance impact
Because DocC already makes a per-page copy of the base "index.html" file, this enhancement doesn't add any new files to the output.
Because macOS—and many Linux systems AFAICT—use a 4 kB file system block size, this enhancement adds very little to the total on-disk archive size. The base "index.html" file (without the "This page requires JavaScript" message in the page's <noscript> element) is just over 1 kB meaning that DocC can fit nearly 3 kB of HTML content in each file without increasing its on-disk size.
Using Swift-DocC's library documentation as an example (the live demo from above), the on-disk archive size increased by just 0.2% from adding the HTML content inside the "index.html" files. Only 46 of the 5730 "index.html" files were larger than 4 kB.
Likewise, with Swift-DocC's library documentation as an example, building the documentation 5 times with this flag enabled and 5 times without showed too small a difference to be statistically significant.
It's possible that DocC's library documentation isn't entirely representative of other packages' documentation, but because of the way this enhancement works well with file system block sizes, I would still expect most packages to see less than a 1–2% on-disk size increase of the output.
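The block size reasoning can be checked with a little arithmetic. This sketch assumes a simplified model where every file occupies a whole number of 4 kB blocks, and the byte counts are illustrative stand-ins for "just over 1 kB" rather than measured values.

```python
import math

BLOCK_SIZE = 4096  # bytes; the 4 kB file system block size


def on_disk_size(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes allocated for a file when space is allocated in whole blocks."""
    return math.ceil(file_size / block_size) * block_size


base = 1100                # assumed size of the base "index.html" in bytes
small_page = base + 2900   # base plus 2900 bytes of HTML content
large_page = base + 3500   # base plus 3500 bytes of HTML content

print(on_disk_size(base))        # 4096
print(on_disk_size(small_page))  # 4096, still fits in one block
print(on_disk_size(large_page))  # 8192, spills into a second block

# Share of files in the example archive that were larger than 4 kB.
print(f"{46 / 5730:.1%}")        # 0.8%
```

In this model, only pages whose file crosses a block boundary add to the on-disk size, which matches less than 1% of the files in the example archive.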
Open questions
I'm positive that other people will have ideas that I haven't thought of, but my primary open question is how concise the content should be.
For example, would the information be more accessible to tools if each conceptual section was contained in a <section> element and used "id" attributes so that they could more easily be found and identified? For example, the parameters section from before could be structured as:
+ <section id="Parameters">
<h2>Parameters</h2>
<dl>
<dt>problem</dt>
<dd>
<p>The diagnostic to dispatch to this engine’s currently subscribed consumers.</p>
</dd>
</dl>
+ </section>
Similarly, the declaration from before could be given an "id" attribute:
+ <pre id="Declaration">
- <pre>
<code>func emit(_ problem: Problem)</code>
</pre>
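For comparison, here is a sketch of how a tool could look up content if the elements had "id" attributes as in the examples above. It's an illustration only: the `ElementByID` class and `text_for_id` function are made up, the parameter description in the sample is a placeholder, and the sketch assumes well-formed markup where every non-void element has a closing tag.

```python
from html.parser import HTMLParser


class ElementByID(HTMLParser):
    """Collects the text inside the first element with a given id attribute."""

    VOID = {"br", "wbr", "img", "hr", "meta", "link", "input"}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.text = []
        self._depth = 0      # nesting depth inside the matched element
        self._done = False

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return  # void elements have no closing tag to balance
        if self._depth > 0:
            self._depth += 1
        elif not self._done and dict(attrs).get("id") == self.target_id:
            self._depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0:
            self._done = True

    def handle_data(self, data):
        if self._depth > 0 and data.strip():
            self.text.append(data.strip())


def text_for_id(content_html, target_id):
    parser = ElementByID(target_id)
    parser.feed(content_html)
    parser.close()
    return parser.text


SAMPLE = """<section id="Parameters">
<h2>Parameters</h2>
<dl><dt>problem</dt><dd><p>Placeholder description.</p></dd></dl>
</section>
<pre id="Declaration">
<code>func emit(_ problem: Problem)</code>
</pre>"""

print(text_for_id(SAMPLE, "Declaration"))
print(text_for_id(SAMPLE, "Parameters"))
```

With ids, the tool can ask for a section by name and doesn't need to know what the heading text is or where the next heading starts.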
Alternatives considered
Instead of placing the minimal HTML content inside the <noscript> element, DocC could place it inside the <div id="app"> element and let Swift-DocC Render replace it with the full fidelity rendered content.
As far as I'm aware, the only benefit of having the content inside the <div id="app"> element would be that people can browse the documentation files locally without disabling JavaScript. However, because this isn't the primary focus
Future directions
There are lots of features and future enhancements that can be built on top of this. These are out of scope for this thread, but I'll describe a few because I personally find them to be an exciting direction for DocC. To some extent, the current in-progress implementation already takes these future directions into consideration, laying the groundwork for DocC's ability to produce rich HTML documentation that's meant to be read by people rather than tools.
For both examples from above of what this feature won't do (creating per-fragment elements for a symbol's declaration and wrapping conceptual sections in <section> elements), the current in-progress implementation already has code that does this and tests that verify it, but there's nothing in a docc convert flow that calls that specific code.
The current in-progress implementation also has tested code that inserts <wbr> elements into symbol names for nicer looking word breaks and an initial version of support for symbol pages with language specific content.
With that in mind, here's a list of some of the many future features and enhancements that I think could be built on top of this:
Adding additional per-page <meta> elements for sharing purposes
A small and easy extension to this feature would be to include additional per-page <meta> elements for social/sharing purposes, such as OpenGraph information.
The main thing that's missing would be that og:url is a required property for every page and DocC doesn't have any information about the absolute URL where a developer intends to host their documentation.
This could be a good first feature/enhancement for a new contributor, consisting of adding the means to pass this information to DocC and using it in the same code that adds the per-page <meta name="description" ...> element.
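As a rough idea of the shape such a contribution could take, here is a sketch that builds a few OpenGraph <meta> elements once the hosting base URL is known. The `open_graph_meta` function, its parameters, the example URL, and the page path are all hypothetical, and the sketch only covers the title, description, and URL properties.

```python
import html


def open_graph_meta(base_url: str, page_path: str, title: str, description: str) -> str:
    """Build per-page OpenGraph <meta> elements.

    `base_url` is the absolute URL where the documentation is hosted,
    which is the piece of information that DocC doesn't have today.
    """
    url = base_url.rstrip("/") + "/" + page_path.lstrip("/")
    properties = {
        "og:title": title,
        "og:description": description,
        "og:url": url,
    }
    return "\n".join(
        f'<meta property="{name}" content="{html.escape(value, quote=True)}">'
        for name, value in properties.items()
    )


print(open_graph_meta(
    "https://example.com/docs/",                 # hypothetical hosting base URL
    "documentation/swiftdocc/diagnosticengine",  # hypothetical page path
    "DiagnosticEngine",
    "A placeholder abstract.",
))
```

The title and description are the same values that the <title> and <meta name="description" ...> elements already use, so the base URL is the only new input.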
Adding support for ePub as an alternative output format
As mentioned above, DocC has some functionality for rendering a page's content as rich HTML for people to read.
This could work really well for more long form content with a mostly linear reading order, such as the Swift Programming Language documentation.
With new code to define an ePub package, spine, etc. and some new CSS to format and layout the documentation content, I could personally see ePub being a possible future output format for DocC.
There's likely going to be some discussion regarding how symbol documentation would be represented in this format with regard to the reading order.
If someone is interested in exploring this, and isn't afraid of a slightly larger project that would involve some amount of discussion here in the Swift forums, feel free to reach out to me.
Improvements to the reading experience for people browsing without requiring JavaScript
If, as a result of this enhancement, people start browsing the documentation with JavaScript disabled, either locally or on a web page, someone who is interested in refining this experience could likely do so with a small amount of CSS.
I have some early ideas of how DocC could both include some amount of default CSS and also support customizing the CSS through files in the documentation catalog ('.docc' directory). Feel free to reach out if this sounds interesting to you.
Static HTML as its own output format
In my own completely personal opinion, entirely without any official capacity, I believe that HTML output is the one format that could do it all: be readable by both people and tools, be easy to host and integrate with existing websites, have an easy to customize appearance, and allow for custom content to be integrated into the documentation pages without requiring code changes in either Swift-DocC or Swift-DocC Render.
Obviously this would be a very large undertaking that, even if it was fully prioritized and continuously and actively worked on, would take quite a long time.
However, I also believe that each of the other future directions add something that pushes the needle forward and little by little makes this a slightly more realistic goal.
The biggest separate problem that would need to be solved would be the navigation sidebar and its filtering and open quickly functionality.
One hypothetical solution—either short term or long term—could be that this continues to be rendered entirely through JavaScript based on a JSON data file.
Another hypothetical solution could be that DocC outputs an HTML navigator structure that's augmented with a smaller amount of JavaScript to support the filtering and open quickly functionality. In that case, it's unclear if the HTML navigator would be sufficient data for these features or if they would still require some (JSON?) data file.
At this point, as I see it, this navigator subproject would require some amount of prototyping and exploration of ideas in order to
As excited as I personally am about the possibilities that this could unlock and as much as I personally—entirely without any official capacity—would like to see DocC have a full fidelity HTML output format, I don't think that I'll have any spare time to spend on this in the foreseeable future.