Symbol Graph Adaptions for Documenting Extensions to External Types in DocC

franklin · April 13, 2022, 9:05am

Hi @theMomax! This is looking really great.

I'm hugely in favor of modeling the extended types in the main symbol graph as opt-in behavior. The existing model of emitting extended module APIs in separate files is suited for displaying extension information in the extended module's documentation, but not in the defining module's documentation. For example, you'd include SwiftDocC@Swift.symbols.json when building documentation for Swift. I find your proposal much better UX-wise, so it makes sense to me to modify the symbol graph format to cater for that. Modeling the extended types in the main symbol graph allows documentation compilers to continue just representing whatever API hierarchy the symbol graph describes. With this architecture, I believe the work on the Swift-DocC side would be quite minimal (just a matter of understanding these new symbol kinds and relationships).

The new symbol kinds and relationships make sense to me. Just a few notes:

An extended type (swift.TYPE_KIND.extension) can conform to multiple protocols I believe, so this should be a many-to-many relationship, if I understand the diagram correctly.
I'm not quite sure about reusing the memberOf relationship to model extended type -> extended module relationships. I don't have a better naming suggestion off-hand though, so I would like to hear others' thoughts as well.
Regarding the swift.extension symbol type: having the module being modeled as both a member of the swift.TYPE_KIND.extension and swift.extension is a bit odd to me. As you mentioned, this would require special processing on the Swift-DocC side to aggregate the doc comments, and we'd lose the 1 symbol = 1 documentation page that the DocC model currently has. I'm wondering if emitting swift.extension symbols should be done in a separate "mode" entirely (--emit-extension-symbols or alike), in which the symbol -> swift.TYPE_KIND.extension relationship wouldn't exist. If the goal here is to aggregate documentation comments, we could also take your original model without swift.extension, and add a new array field in swift.TYPE_KIND.extension that contains all the (unordered) doc comments and source locations associated with all the extensions of this type in the module. However, from an authoring UX perspective, I'm not quite sure how that would work; you'd want to control the ordering of these. Any ideas? I'm also not opposed to scoping this out of this proposal.

Could you please attach an example symbol graph with your proposed additions, to make sure everyone is on the same page?

This is a great question. I don't think we should use the same precise identifier (aka USR) as the extended type (i.e., s:SS) because consumers would see that type itself as being defined in the module. For example, Swift-DocC does USR-based link resolution as part of its compilation process, and as it stands, would consider s:SS to be a symbol defined in the module, since its symbol graph contains a symbol with that USR. Swift-DocC would need to special-case symbols with kind swift.*.extension, but I think a better approach would be to have a separate USR for them entirely. It would be useful to record the relationship between the symbol and its extended symbol as well though, as you mention further down. This will help with inter-module linking in the future. I'm not quite sure whether the Swift compiler synthesizes a different USR for extensions (maybe @QuietMisdreavus would know). You could place a breakpoint in SymbolGraphGen and inspect the declarations that are being processed.
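To make the concern concrete, here is a rough sketch (Python purely for brevity; this is not how Swift-DocC is actually implemented). The helpers `build_local_index` and `is_defined_locally`, the `swift.struct.extension` kind string, and the `hypothetical:ext:s:SS` USR are all made up for illustration. Only the `identifier.precise` and `kind.identifier` fields mirror the existing symbol graph format.

```python
# Simplified, hypothetical model of USR-based link resolution.

def build_local_index(symbol_graph):
    """Map each precise identifier (USR) to the symbol carrying it."""
    return {
        symbol["identifier"]["precise"]: symbol
        for symbol in symbol_graph["symbols"]
    }

def is_defined_locally(index, usr):
    """A naive consumer treats any USR present in the graph as local."""
    return usr in index

# Extended-type symbol that reuses String's own USR (s:SS).
reused_usr_graph = {
    "symbols": [{
        "kind": {"identifier": "swift.struct.extension"},  # assumed kind name
        "identifier": {"precise": "s:SS"},
    }]
}

# Same symbol, but with a distinct (made-up) USR for the extension.
distinct_usr_graph = {
    "symbols": [{
        "kind": {"identifier": "swift.struct.extension"},  # assumed kind name
        "identifier": {"precise": "hypothetical:ext:s:SS"},
    }]
}

# With the reused USR, String looks as if it were defined in this module.
print(is_defined_locally(build_local_index(reused_usr_graph), "s:SS"))
# With a distinct USR, s:SS stays an external symbol.
print(is_defined_locally(build_local_index(distinct_usr_graph), "s:SS"))
```

With the reused USR, the only way out is to check the symbol kind at every lookup, which is the special-casing I'd rather avoid.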

Yes, I think we should do that. It's not required in the symbol graph model for the target of a relationship to be defined in the same symbol graph (i.e., it doesn't have to be in the symbols array). These relationships get a targetFallback property that textually describes the target. For example:

{
    "kind": "conformsTo",
    "source": "c:objc(cs)Bar",
    "target": "s:SH",
    "targetFallback": "Swift.Hashable"
}

I think tracking this kind of relationship for symbol extension -> symbol makes sense as well, and it will be useful when performing inter-module link resolution, because DocC would be able to look up s:SH in another DocC archive. It might not make sense to display this relationship on the documentation page of the extension though. The info seems a bit redundant, at least until we can make the target symbol an actual link.
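As a minimal sketch of how a consumer might handle a relationship whose target lives outside the current symbol graph (again Python for brevity; `describe_target` and the `local_symbols` lookup table are hypothetical names, not Swift-DocC API): look the target USR up locally first, and fall back to the targetFallback text when it isn't there.

```python
def describe_target(relationship, local_symbols):
    """Return (display_text, is_local) for a relationship's target.

    local_symbols maps USRs to symbols from this graph's symbols array.
    """
    target = relationship["target"]
    symbol = local_symbols.get(target)
    if symbol is not None:
        # Target is defined in this symbol graph, so it can be linked.
        return symbol["names"]["title"], True
    # Target is external: show the fallback text (or the raw USR if absent).
    return relationship.get("targetFallback", target), False

# The conformsTo relationship from the example in this post.
relationship = {
    "kind": "conformsTo",
    "source": "c:objc(cs)Bar",
    "target": "s:SH",
    "targetFallback": "Swift.Hashable",
}

# Only Bar is defined locally; Hashable is not in the symbols array.
local_symbols = {"c:objc(cs)Bar": {"names": {"title": "Bar"}}}

print(describe_target(relationship, local_symbols))
```

Once inter-module link resolution exists, the fallback branch is where a lookup of s:SH in another archive could slot in.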

The symbol graph format is maintained independently from Swift-DocC, as it is used by other documentation compilers as well. It's important to minimize breakages as much as possible. That being said, the symbol graph format has not reached stability yet (Swift's symbol graph format is currently at version 0.5.3, see https://github.com/apple/swift/blob/main/lib/SymbolGraphGen/FormatVersion.h), but we should still aim to reduce breakages. It's also worth noting that the model aims to be generic enough for compilers of other languages to also be able to emit symbol graphs (e.g., there is an effort underway for clang to emit symbol graph files for Objective-C), so there needs to be some alignment so that documentation compilers can consume the same format.