Dealing with Nested Types when Documenting Extensions with DocC

theMomax · September 5, 2022, 10:41am

A couple of months back I started a pitch on how to Document Extensions to External Types Using DocC. While there will be a few PRs with improvements later on, the initial version is pretty much ready for merge. However, before that, I need your help once more to decide how to deal with extensions to nested types.

This decision influences at what URL the pages for extensions to nested types such as SymbolKit's SymbolGraph.Symbol are located and therefore also influences how these pages can be referenced using symbol links.

The original proposal defined the location for extended type pages. However, it didn't specify what the EXTENDED_TYPE_NAME segment would look like for extensions to nested types.

Approaches

I want to propose two approaches, both of which I have already implemented, and I hope that you can help me decide which one is preferable.

Path Contraction

The first approach - I call it "path contraction" - cuts all container type names from the EXTENDED_TYPE_NAME segment, so that literally only the "name" of the extended type remains.

For example, the extended type page for SymbolKit's SymbolGraph.Symbol would be located at hostpath/EXTENDING_MODULE_NAME/SymbolKit/Symbol.

You can find the PR for this approach here.
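As a rough sketch of the idea (not the code from the PR), path contraction boils down to keeping only the last component of the nested type name. The function name and its parameters are hypothetical and only serve as illustration; the returned path is relative to hostpath.

```swift
// Hypothetical helper, not part of DocC: builds the contracted page path
// by dropping every container type name from the extended type.
func contractedPagePath(extendingModule: String,
                        extendedModule: String,
                        qualifiedTypeName: String) -> String {
    // "SymbolGraph.Symbol" -> ["SymbolGraph", "Symbol"]
    let components = qualifiedTypeName.split(separator: ".")
    // Only the innermost name survives; fall back to the input if it has no components.
    let leaf = components.last.map { String($0) } ?? qualifiedTypeName
    return [extendingModule, extendedModule, leaf].joined(separator: "/")
}

print(contractedPagePath(extendingModule: "NestedTypes",
                         extendedModule: "SymbolKit",
                         qualifiedTypeName: "SymbolGraph.Symbol"))
// prints "NestedTypes/SymbolKit/Symbol"
```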

Hierarchical

The second approach is a hierarchical one. It creates extended type pages for all container types of extended nested types, even if these container types are not extended themselves.

Thus, the extended type page for SymbolKit's SymbolGraph.Symbol would be located at hostpath/EXTENDING_MODULE_NAME/SymbolKit/SymbolGraph/Symbol. Furthermore, hostpath/EXTENDING_MODULE_NAME/SymbolKit/SymbolGraph would also be a valid page URL, independently of whether or not SymbolGraph itself is extended.
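To illustrate, here is a minimal sketch (again, not the code from the PR) that lists every page the hierarchical approach would create for one extended nested type. The function name and parameters are hypothetical; paths are relative to hostpath.

```swift
// Hypothetical helper, not part of DocC: returns one page path per level of
// the nesting chain, from the outermost container down to the extended type.
func hierarchicalPagePaths(extendingModule: String,
                           extendedModule: String,
                           qualifiedTypeName: String) -> [String] {
    let base = [extendingModule, extendedModule]
    let components = qualifiedTypeName.split(separator: ".").map { String($0) }
    // Each prefix of the chain gets its own page, extended or not.
    return components.indices.map { index in
        (base + components[...index]).joined(separator: "/")
    }
}

for path in hierarchicalPagePaths(extendingModule: "NestedTypes",
                                  extendedModule: "SymbolKit",
                                  qualifiedTypeName: "SymbolGraph.Symbol") {
    print(path)
}
// prints "NestedTypes/SymbolKit/SymbolGraph"
// prints "NestedTypes/SymbolKit/SymbolGraph/Symbol"
```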

Evaluation

IMO, both of these approaches have advantages and disadvantages in certain situations. I created a small project with a documentation catalog that should serve as support for discussing these aspects. I compiled this same catalog using both approaches. You can find the documentation archive resulting from path contraction here and the hierarchical version here.

These are the aspects I think are important to consider:

Reference Intuitiveness

I think one major advantage of the hierarchical approach is that the URL and therefore also the referencing schema follows the same rules that we know from Swift.

For example:

import SymbolKit  // the module that declares SymbolGraph

public extension SymbolGraph.Symbol {
    func foo() {}
}

IMO, ``SymbolKit/SymbolGraph/Symbol`` (hierarchical) is more intuitive than just ``SymbolKit/Symbol`` (path contraction).

Collisions

This becomes even more apparent if we consider how many more collisions path contraction can cause, which have to be resolved using hash- or type-based disambiguation suffixes. The sample project also features an extension to UnifiedSymbolGraph.Symbol:

public extension UnifiedSymbolGraph.Symbol {
    func foo() {}
}

With path contraction, both extended types would end up at ``SymbolKit/Symbol``, so the actual references need a disambiguation suffix to tell them apart.

The hierarchical approach doesn't have this problem as it contains the unambiguous path segments SymbolGraph and UnifiedSymbolGraph, respectively: ``SymbolKit/SymbolGraph/Symbol`` and ``SymbolKit/UnifiedSymbolGraph/Symbol``.

Note that with path contraction there can also be a collision between a nested type and its container types, as in this example:

struct Nested {
    struct Nested {
        struct Nested {}
    }
}

Here, all three types can only be disambiguated using a hash-based disambiguation suffix.
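The difference in collision behavior can be sketched by grouping type names by the path each scheme would produce. Everything here is a hypothetical illustration rather than DocC's actual resolution logic; the paths are relative to the extended module.

```swift
// Hypothetical path schemes, relative to the extended module.
func leafOnly(_ name: String) -> String {
    // Path contraction: keep only the innermost type name.
    String(name.split(separator: ".").last ?? "")
}

func fullChain(_ name: String) -> String {
    // Hierarchical: keep every container as its own path segment.
    name.split(separator: ".").joined(separator: "/")
}

// Groups names by their path and keeps only paths claimed by more than one type.
func collisions(in names: [String], pathFor: (String) -> String) -> [String: [String]] {
    Dictionary(grouping: names, by: pathFor).filter { $0.value.count > 1 }
}

let symbolTypes = ["SymbolGraph.Symbol", "UnifiedSymbolGraph.Symbol"]
let nestedTypes = ["Nested", "Nested.Nested", "Nested.Nested.Nested"]

print(collisions(in: symbolTypes, pathFor: leafOnly))
// prints ["Symbol": ["SymbolGraph.Symbol", "UnifiedSymbolGraph.Symbol"]]
print(collisions(in: symbolTypes, pathFor: fullChain).isEmpty)  // prints "true"
print(collisions(in: nestedTypes, pathFor: leafOnly)["Nested"]?.count ?? 0)  // prints "3"
print(collisions(in: nestedTypes, pathFor: fullChain).isEmpty)  // prints "true"
```

Every entry in the returned dictionary is a path that would need a disambiguation suffix.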

Cluttering

However, the added path segments in the hierarchical approach can also have a negative impact:

public extension SymbolGraph.Symbol.Swift.GenericConstraint.Kind {
    func foo() {}
}

While we do also extend SymbolGraph.Symbol, we don't extend SymbolGraph.Symbol.Swift or its nested type GenericConstraint. Therefore, the hierarchical approach generates two extended type pages that are basically empty, i.e. they only list their extended child.

With path contraction, these two pages do not exist, and ``SymbolKit`` links to ``SymbolKit/Kind`` directly.
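A small sketch can show which container pages exist only because something nested inside them is extended. The function is hypothetical and only knows about the type names passed to it, so it treats any container missing from that set as having no extension of its own.

```swift
// Hypothetical helper: returns the container types that get a page in the
// hierarchical approach although they are not extended themselves.
func containerOnlyPages(forExtendedTypes extended: Set<String>) -> [String] {
    var containers = Set<String>()
    for name in extended {
        // Walk from the direct container up to the outermost type.
        var chain = name.split(separator: ".").dropLast()
        while !chain.isEmpty {
            containers.insert(chain.joined(separator: "."))
            chain = chain.dropLast()
        }
    }
    return containers.subtracting(extended).sorted()
}

let extendedTypes: Set<String> = [
    "SymbolGraph.Symbol",
    "SymbolGraph.Symbol.Swift.GenericConstraint.Kind",
]

for page in containerOnlyPages(forExtendedTypes: extendedTypes) {
    print(page)
}
// prints "SymbolGraph"
// prints "SymbolGraph.Symbol.Swift"
// prints "SymbolGraph.Symbol.Swift.GenericConstraint"
```

Besides the two pages discussed above, this sketch also reports SymbolGraph, because the input only lists the two extensions from the examples and none of them targets SymbolGraph itself.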

Cross-Target Nesting

One additional complexity surfaces when considering cross-target nesting, which I think is a rather rare use case.

In our sample project, the secondary target SomeSymbolKitExtension extends SymbolKit's SymbolGraph as follows:

public extension SymbolGraph {
    struct ExternalType { }
}

The main NestedTypes target (which imports SomeSymbolKitExtension) extends this nested ExternalType:

import SymbolKit
import SomeSymbolKitExtension

public extension SymbolGraph.ExternalType {
    func foo() {}
}

In the hierarchical approach, this extension would be part of the extended module SymbolKit, because if we define a hierarchy between the extended types, an extension to the inner type is always also an extension to the context it lives in. Therefore, the respective extended type page is located at ``SymbolKit/SymbolGraph/ExternalType``, i.e. as part of the extended module ``SymbolKit``. This follows the Swift ruleset, where the most qualified identifier for the extension is SymbolKit.SymbolGraph.ExternalType.

With path contraction, on the other hand, we view SymbolGraph.ExternalType independently of its parent type SymbolGraph, and thus consider this an extension to the extended module SomeSymbolKitExtension. The respective pages are located at ``SomeSymbolKitExtension/ExternalType`` and ``SomeSymbolKitExtension``.
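The two rules for picking the extended module can be sketched as follows. The TypeDeclaration struct and both functions are hypothetical and only model the example above: the hierarchical rule takes the module of the outermost type in the nesting chain, whereas path contraction takes the module that declares the innermost type.

```swift
// Hypothetical model of one level in a nesting chain.
struct TypeDeclaration {
    let name: String
    let definingModule: String
}

// Hierarchical: the outermost type decides the extended module,
// and every level of the chain becomes a path segment.
func hierarchicalPage(for chain: [TypeDeclaration]) -> String? {
    guard let outermost = chain.first else { return nil }
    return ([outermost.definingModule] + chain.map(\.name)).joined(separator: "/")
}

// Path contraction: only the innermost type and its own module matter.
func contractedPage(for chain: [TypeDeclaration]) -> String? {
    guard let innermost = chain.last else { return nil }
    return "\(innermost.definingModule)/\(innermost.name)"
}

// SymbolGraph comes from SymbolKit, ExternalType from SomeSymbolKitExtension.
let chain = [
    TypeDeclaration(name: "SymbolGraph", definingModule: "SymbolKit"),
    TypeDeclaration(name: "ExternalType", definingModule: "SomeSymbolKitExtension"),
]

print(hierarchicalPage(for: chain) ?? "")  // prints "SymbolKit/SymbolGraph/ExternalType"
print(contractedPage(for: chain) ?? "")    // prints "SomeSymbolKitExtension/ExternalType"
```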

Implementation

Overall, the implementation for path contraction is a little more complicated, as it requires some adaptations to the path resolution and disambiguation logic in order to deal with contracted path elements. However, I think the difference is small enough to base this decision first and foremost on user experience.

I personally think the hierarchical approach is better, as its only real disadvantage is cluttering. However, very deep nesting is rather uncommon and, if really necessary, one can "hide" the empty pages using manual curation.

Please let me know what you think, which approach you prefer, and if I've missed any important usability aspects!

Thanks for your help!

daniel-grumberg · September 5, 2022, 11:02am

You make a lot of good points here! I personally agree that the hierarchical approach seems better here, if only because it follows the Swift ruleset. I think we could strive down the line for a mix of both approaches. I think that the links should follow the hierarchical approach, but if the intermediate pages are empty, we could automatically curate the leaf pages in the right spot in the extended types hierarchy.

finestructure · September 5, 2022, 11:10am

+1 on the hierarchical approach! Feels like it’s easier to explain and the “con” case seems rather convoluted and therefore should be rare.

theMomax · September 5, 2022, 12:03pm

That's a great idea and definitely something I can look into later on. Since curation doesn't influence the URLs, we can also iterate on this without causing a breakage.

scanon · September 5, 2022, 12:19pm

Agreed. The hierarchical approach seems obviously more correct.

theMomax · September 7, 2022, 5:45am

This seems to be decided. I'll close https://github.com/apple/swift-docc/pull/335 in favor of https://github.com/apple/swift-docc/pull/369.

Thanks for your feedback!