@_nodoc attribute for hiding symbols from the symbol graph

QuietMisdreavus · July 26, 2022, 2:14pm

Hello! I just wanted to post here to let people know about a PR i just opened: introduce a `@_documentation(...)` attribute to influence SymbolGraphGen by QuietMisdreavus · Pull Request #60242 · apple/swift · GitHub

It introduces a @_nodoc attribute to hide a symbol from the symbol graph (and thus from documentation) without marking it as internal or prefixing its name with an underscore. It also works on @_exported import statements, to hide symbols from the exported module.
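
For context, here is a minimal sketch of the two workarounds the attribute is meant to replace. The type and method names are hypothetical, and the proposed attribute appears only in a comment because it would not compile on a toolchain without the PR.

```swift
// Two ways to keep a helper out of generated documentation today.
// All names here are hypothetical.
public struct Tokenizer {
    public init() {}

    /// Documented entry point.
    public func tokens(in text: String) -> [String] {
        _split(text)
    }

    // Workaround 1: an underscore prefix. The symbol stays public, but
    // symbol graph generation skips underscored names by default.
    public func _split(_ text: String) -> [String] {
        text.split(separator: " ").map(String.init)
    }

    // Workaround 2: drop the access level to internal, which also removes
    // the symbol from the module's public API.
    func normalize(_ text: String) -> String {
        text.lowercased()
    }
}

// Assumption: with the attribute from the PR, `_split` could keep a plain
// name and be declared as `@_nodoc public func split(...)` instead.
```

Both workarounds change the API surface itself (its name or its access level), which is the cost the attribute is intended to avoid.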

The name is up for discussion, though since it’s specific to the symbol graph i wanted to make sure the name reflected that. I picked “nodoc” to mirror the in-comment marker for Jazzy.

Let me know what you think! I’m looking forward to getting this in people’s hands.

franklin · July 26, 2022, 2:30pm

Thanks @QuietMisdreavus, this is super helpful for cases where you can't underscore a symbol but still want it omitted from documentation. The @_exported import bit gives more flexibility in what you want to document and resolves `@_exported import`s should not emit symbols from external dependencies · Issue #331 · apple/swift-docc · GitHub. Not to start a naming bikeshed… I like the brevity of @_nodoc, but have you also considered more explicit names like @_excludedFromDocumentation or @_excludedFromSymbolGraph?

QuietMisdreavus · July 26, 2022, 2:34pm

I could get behind @_excludedFromSymbolGraph as the attribute name, since strictly speaking symbol graphs can be used for more than just documentation. @_nodoc was basically a first pass to get the implementation running; as i mentioned, i just picked the name to mirror Jazzy. Unless there are objections to that name in particular, i can update the PR to use @_excludedFromSymbolGraph instead.

Kyle-Ye · July 26, 2022, 3:35pm

As discussed in Enum case with "_" name prefix is ignored by DocC · Issue #342 · apple/swift-docc · GitHub, could we consider adding a pair of attributes to control this behavior (e.g. @nonSymbolGraph and @SymbolGraph)? There are already similar pairs, like @objc and @nonobjc, or isolated and nonisolated.

The positive one means the symbol will always be emitted to the symbol graph (even if its name is prefixed with an underscore).
And the negative one means it will always be hidden (the @_nodoc attribute you describe).

krilnon · July 26, 2022, 3:43pm

This seems like a generally useful tool for API authors. Also great for people reading the declaration, so the intent of the symbol is a bit clearer.

I'll mirror Kyle's comment on the usefulness of the opposite operation, too. Sometimes you wind up in a situation where a symbol doesn't show up in (effectively) the symbol graph, yet the intent of the API author was for the symbol to be public and documentable. (33683668)

franklin · July 26, 2022, 4:03pm

An attribute that has the opposite behavior is a great idea IMO. It would help with cases like the one mentioned in Enum case with "_" name prefix is ignored by DocC · Issue #342 · apple/swift-docc · GitHub.

gwendal.roue · July 26, 2022, 4:15pm

> I could get behind @_excludedFromSymbolGraph as the attribute name, since strictly speaking symbol graphs can be used for more than just documentation.

I would really suggest choosing a name that contains "doc", "documentation", or "DocC", because "excluded from symbol graph" won't mean anything for many people.

Actually I don't have the slightest idea what the Symbol Graph is. And yet I'm quite sure I don't need it in order to write DocC documentation. I know this because I wrote some :-)

If someday I need to exclude a symbol from the DocC documentation, I may search the web for "exclude from DocC", and certainly not for "exclude from Symbol Graph".

xwu · July 26, 2022, 4:22pm

As an underscored attribute, @_nodoc seems perfectly serviceable as-is.

jack · July 26, 2022, 5:38pm

This is great! I like the symmetry with the suggested opposite behavior attribute too. To avoid a scenario where someone accidentally adds @_nodoc and @_doc, what if we did something like @_doc(<#visibility#>) e.g. @_doc(hidden) and @_doc(visible)?

Some future (low priority) suggestions for handling more esoteric workflows:

- An equivalent C/Objective-C __attribute__
- A markup directive that can be added to a documentation extension file, for workflows where the ability to modify visibility post-build or without modifying source is important.
- Configurable/overridable via APINotes.

taylorswift · July 26, 2022, 8:25pm

instead of excluding the symbol (which can interfere with symbolgraph algorithms and have unintended side-effects by creating holes in the symbolgraph), could it instead emit a field like "visibility": "hidden" to indicate that the symbol should not be shown by the frontend?
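
A rough sketch of what that could look like on the consuming side, assuming a heavily simplified symbol record. The `SymbolRecord` type and the sample JSON are hypothetical, and `"visibility"` is the field suggested above, not one the symbol graph format has today.

```swift
import Foundation

// Hypothetical, simplified symbol record; the real symbol graph format
// carries many more fields per symbol.
struct SymbolRecord: Decodable {
    let name: String
    let visibility: String?   // suggested field; absent means "not hidden"
}

let json = """
[
  {"name": "Parser", "visibility": "visible"},
  {"name": "ParserStorage", "visibility": "hidden"},
  {"name": "parse(_:)"}
]
"""

// Every symbol stays in the graph, so relationships between symbols
// remain intact for tools that walk it.
let records = try! JSONDecoder().decode([SymbolRecord].self, from: Data(json.utf8))

// The frontend decides what to display.
let displayed = records.filter { $0.visibility != "hidden" }.map { $0.name }
```

Here the hidden symbol is still present in `records`; only the rendering step leaves it out.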

ethankusters · July 27, 2022, 3:32am

Along these lines, I'm wondering if we should consider aligning this general design with SPI? I can imagine wanting to hide certain symbols from general consumers of my documentation but expose them for certain audiences (just like with SPI).

I'm imagining something like: @_doc(docCategory) and then a corresponding --include-hidden-docs docCategory flag that can be passed to the symbol graph extract tool.
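
As a sketch of the filtering this implies, assuming each symbol carries at most one category and the flag supplies a set of categories to include. `CategorizedSymbol` and `emittedSymbols` are hypothetical names, not part of any existing tool.

```swift
// Hypothetical model of a symbol with an optional documentation category.
struct CategorizedSymbol {
    let name: String
    let docCategory: String?
}

// Keeps uncategorized symbols, plus categorized ones whose category was
// explicitly requested (standing in for the suggested command-line flag).
func emittedSymbols(
    from symbols: [CategorizedSymbol],
    includingCategories requested: Set<String>
) -> [String] {
    symbols
        .filter { symbol in
            guard let category = symbol.docCategory else { return true }
            return requested.contains(category)
        }
        .map { $0.name }
}

let allSymbols = [
    CategorizedSymbol(name: "Client", docCategory: nil),
    CategorizedSymbol(name: "ClientTestHooks", docCategory: "contributors"),
    CategorizedSymbol(name: "ClientStorage", docCategory: "internalDetails"),
]

// Default build: categorized symbols are left out.
let publicDocs = emittedSymbols(from: allSymbols, includingCategories: [])

// Build for a specific audience: opt in to one category.
let contributorDocs = emittedSymbols(from: allSymbols, includingCategories: ["contributors"])
```

This mirrors how SPI groups are opted into by name, with hidden-by-default as the baseline.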

I could see this design fitting well with the above as well. Maybe there's still an option to fully exclude the symbols from the symbol graph, but clients could choose to include them instead, read their @_doc category via a property in the symbol graph, and modify behavior accordingly.

"docCategory": "foo"

jack · July 27, 2022, 5:16am

I like this, but I hesitate to add additional complexity when @spi already provides slicing for modules that present different views of public API to various clients. Do you have an example in mind?

In any case adding this flexibility is likely backwards-compatible if we wanted to introduce it later—we could interpret hidden and visible as special/default categories.

icanzilb · July 27, 2022, 3:12pm

I think this is super useful and everyone's already shared thoughts along the lines of how I feel about the proposal. I'm not ecstatic about the name containing an abbreviation but I do like the brevity.

I think in general I'd stay away from all-lowercase, underscored, and abbreviated naming because these all go against the Swift naming guidelines. I'm sure nobody will mind, but in my opinion it will be nicer if attribute names are coherent. For example, we already have @dynamicCallable and @dynamicMemberLookup, to mention some of the existing attributes that contain multiple words (src: Attributes — The Swift Programming Language (Swift 5.7))

In that sense, if you don't go for some of the variants suggested above that take a parameter, maybe at least the name could be @noDocumentation or @hideFromDocumentation.

Also, I think it's an important detail to mention here that this actually omits the symbol from the symbol graph dump. When documenting the attribute for consumers, though, it's probably better to just say that "documentation will not be compiled for that symbol", because going through the symbol graph is an implementation detail.

Finally, I'm really excited about this feature. I always end up having a "Utilities" topic group where I shove a few types that make no sense from a consumer's point of view; I could finally just hide them.

QuietMisdreavus · July 27, 2022, 3:50pm

There's a policy in Swift-DocC of always printing whatever it's given from the symbol graph; if we could come to a community consensus of what kinds of visibility constraints we would want to handle "by default", then Swift-DocC could do this kind of filtering via something like @ethankusters's suggestion.

Speaking of...

I like the idea of something like @_documentationCategory(...) (making the naming long-form based on @icanzilb's comment) that would special-case forceHidden and forceVisible. Anything else would go into the symbol graph and could be handled by a tool however it would like. Part of me wonders if this would get confused with Swift-DocC's topic groups and SPI over time, though.

The key difference i would see between @_documentationCategory and @_spi is that the latter is already automatically excluded from documentation, whereas the former is only automatically excluded with the proposed forceHidden category; anything else would have to be upon request. I could see this getting a bit thorny with the behavior on @_exported import statements, implementation-wise, but that's something we could work through in the PR.

ethankusters · July 27, 2022, 7:14pm

The main one I can think of is that I might want to hide some public symbols from all viewers of my documentation and others only from non-contributors. So I might use @_documentationCategory(forceHidden) and @_documentationCategory(contributors). Then when I build my documentation for contributors to my framework I would increase (reduce?) the minimum access level to internal and include the contributors documentation category.

There's definitely overlap here with how @_spi is intended to be used but since there's a need for a documentation specific attribute at all, I think there's likely a need for more granular control over what symbols are hidden for what audiences.

taylorswift · July 27, 2022, 8:00pm

i think @_documentationCategory(_:) is a great idea (though i would name it @documentation to follow the precedent of similar attributes like @available). but why not just have it be a free-form string that downstream tooling can interpret according to its own definitions?

@_documentation(DocC.forceHidden)
@_documentation(DocC.forceVisible)
@_documentation(myTool.myCustomCategory)
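
One way downstream tooling could interpret such strings, sketched under the assumption that the text before the first dot names the tool and the rest is the category. `NamespacedCategory` and `parseCategory` are hypothetical helpers for illustration.

```swift
// Hypothetical representation of a free-form category string.
struct NamespacedCategory: Equatable {
    let tool: String?      // nil when the string has no "tool." prefix
    let category: String
}

// Splits on the first dot only, so a category may itself contain dots.
func parseCategory(_ raw: String) -> NamespacedCategory {
    guard let dot = raw.firstIndex(of: ".") else {
        return NamespacedCategory(tool: nil, category: raw)
    }
    return NamespacedCategory(
        tool: String(raw[..<dot]),
        category: String(raw[raw.index(after: dot)...])
    )
}

let hidden = parseCategory("DocC.forceHidden")
let custom = parseCategory("myTool.myCustomCategory")
let bare = parseCategory("contributors")
```

A tool would then act only on entries whose `tool` matches its own name and ignore the rest.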

maybe (and hear me out) it could actually be a good thing to combine documentation for “contributors” and “users”? after all, they need not be mutually exclusive; as a package author i would hope some of my users would eventually become contributors as well. and when i am consuming packages written by others, i often end up browsing through source code on GitHub anyway. pages for things that are “for contributors” could be made visually distinct, maybe with a different background color or styling.

QuietMisdreavus · July 27, 2022, 8:33pm

At least forceVisible needs to be special-cased, due to needing to override the underscored-symbols behavior. And forceHidden will need to be special-cased with the current implementation of @_exported import statements because arbitrary attributes cannot be read in the compiler when these are being processed.

In the average case, the number of incidental users who become avid contributors is vastly outweighed by the number who do not. Including internal implementation details on the off chance that someone trying to figure out how to use a package gets interested in improving it just clutters the experience for the majority of people. If you would like to propose a design for integrating "public" documentation with "internal" documentation and making them distinct, please do so in a new thread. As it stands, i feel like that would be too large a change to the precedent of documentation tooling in general to slip into this discussion.

taylorswift · July 27, 2022, 8:56pm

i don’t know how the compiler parses attributes, but would it be possible to make “magic” cases like forceVisible and forceHidden be recognized by the compiler, while still allowing for a larger set of possible user-defined values?

i think the underlying question here is whether a symbolgraph should be a single artifact containing “all of the information” about a module at a given time, or whether we should have different ‘flavors’ of symbolgraphs for different purposes. i can see two possible directions here:

- symbolgraph is an archive: we generate a single (collection) of symbolgraphs per package revision (e.g. in a CI workflow), and then downstream tooling subsets parts of the archive “for contributors”, “for users”, etc.
- symbolgraph is an interchange format: we generate symbolgraphs on-the-fly, with different contents depending on the intended usage. this is the workflow that @ethankusters proposed, where you would build separate symbolgraphs and then deploy the rendered output for different audiences.

i don’t know which of these is better, as there are pros and cons to both. but i don’t really see this as a “rights of the majority” dispute since the end product would probably be similar once post-processing happens.

QuietMisdreavus · August 1, 2022, 4:02pm

Totally! We could allow whatever we want, while only checking for a couple of specific cases. If this is going to be a broader thing, then we could even add prefixes like your earlier suggestion, and call them something like swift.forceVisible and swift.forceHidden.

This is an interesting question. As it stands, the symbol graph is a kind of implementation detail so that tools like Swift-DocC don't have to be integrated with the Swift compiler or try to parse files themselves. The latter option ("symbol graph as interchange format") is basically where we're at today: If you want internal instead of public symbols, or to include symbols marked SPI, you generate a new symbol graph and rerun Swift-DocC with a new input. This makes it easier for tooling to consume the data: If it's in the symbol graph, display it.

On the other hand, something like the former option ("symbol graph as archive of everything") is useful if you expect to do filtering or analytics on the data. That way, a single file can act as the authoritative source of information about a module. Like i said, this is a bit of a departure from what we currently do, but moving some filtering into Swift-DocC shouldn't be a problem if that's what we decide to do. I would definitely want to make that decision as a group, though, and get more input from others before going down that road.

I feel like this has gotten away somewhat from the original concept, though: Providing an in-language construct for marking an item as "internal" without underscoring its name or actually marking it as internal, as well as the other way around. I think we can decide what to do about this independently of how we decide to treat the symbol graph as a concept.

QuietMisdreavus · August 4, 2022, 8:16pm

Based on the discussion here, i've updated the implementation in the PR to turn the attribute from @_nodoc into @_documentation(...), allowing you to add an arbitrary "documentation category" to a symbol. This includes two special-case categories:

- @_documentation(underscored) treats the symbol as if it had an underscored name (i.e. the old @_nodoc behavior), hiding it from public docs.
- @_documentation(ignoreUnderscored) does the opposite, forcing an underscored symbol to appear as if it were not underscored, allowing public symbols like the enum case in this issue to be visible in public docs.

Regardless of what you write in the attribute, the category will be visible in the symbol graph in the symbol's "category" field, allowing tooling to use these categories for any purpose.
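
The rule described above can be modeled roughly as follows. This is a sketch of the behavior as stated in this post, not the compiler's implementation, and `DeclaredSymbol` and `appearsInPublicDocs` are hypothetical names.

```swift
// Hypothetical model of a public declaration and its attribute argument.
struct DeclaredSymbol {
    let name: String
    let documentationCategory: String?   // argument of @_documentation(...), if any
}

// Models whether a public symbol would show up in public docs.
func appearsInPublicDocs(_ symbol: DeclaredSymbol) -> Bool {
    // Special case 1: treat the symbol as if its name were underscored.
    if symbol.documentationCategory == "underscored" { return false }
    // Special case 2: ignore an underscore prefix on the name.
    if symbol.documentationCategory == "ignoreUnderscored" { return true }
    // Any other category (or none) leaves the default underscore rule alone.
    return !symbol.name.hasPrefix("_")
}

let plain = DeclaredSymbol(name: "parse", documentationCategory: nil)
let helper = DeclaredSymbol(name: "storage", documentationCategory: "underscored")
let enumCase = DeclaredSymbol(name: "_1", documentationCategory: "ignoreUnderscored")
let legacy = DeclaredSymbol(name: "_internalHook", documentationCategory: nil)
let tagged = DeclaredSymbol(name: "debugDump", documentationCategory: "contributors")
```

In this model a custom category such as `contributors` does not change visibility on its own; it is only recorded for tooling to act on.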