Document Extensions to External Types Using DocC

Hi @theMomax! This is looking really great. Thanks for putting so much thought into this.

I've been following the discussion and I think we've arrived at a good solution here. I just have a few clarifying questions.


Are we only including the -swift.module-extension disambiguation suffix if we detect a collision with another top-level symbol? That would follow the way Swift-DocC's type disambiguation suffixes generally work but I wanted to clarify to be sure.

So then the default way to link to an Array extension (assuming you didn't define a top-level Swift symbol) would be:

``Swift/Array/init(_:)``

But in the case where I have defined a collision:

public enum Swift {
    case foo
}

I would have:

``Swift-swift.enum/foo``

and

``Swift-swift.module-extension/Array/init(_:)``

This follows the way Swift-DocC currently handles collisions as documented here: DocC Documentation: Link to Symbols and Other Content. So we would get this behavior effectively for "free" if we start including module extension symbols in the primary symbol graph.

The alternative of always including the disambiguation suffix would lead to a somewhat clunky authoring experience since developers would need to write out the full Swift-swift.module-extension suffix whenever they're referencing an extension symbol in their current package.


I'm also wondering if we should consider adjusting the disambiguation suffix to be -swift.module.extension since I think this follows the existing disambiguation suffixes a little better (like -swift.enum.case).


On another topic: it seems like the proposed solution here is to add two new symbol kinds to the primary symbol graph for a module that includes extensions:

  • module.extension
  • extension

The first to refer to the containing module and the second to refer to the symbol being extended. Is this right?


I'm wondering if we should have more nuanced kinds for the symbols being extended. I'm imagining we'll have auto-curation something like this:

<!-- SlothCreator.md -->

# Sloth Creator

## Topics

### Structures

...

### Protocols

...

### Extended Modules

- ``Swift``
- ``SwiftUI``

<!-- Swift.md -->

# ``Swift``

## Topics

### Extended Symbols

- ``Int``
- ``Sequence``
- ``Array``

But it might be nice to be able to support:

<!-- Swift.md -->

# ``Swift``

## Topics

### Extended Protocols

- ``Sequence``

### Extended Structures

- ``Int``
- ``Array``

This would also allow us to put something like "Extended Structure" in the eyebrow text for the Int and Array pages instead of just "Extended Symbol".

Maybe we should just introduce an extension suffix to the existing kinds in the symbol graph?

So we would have:

- `swift.struct.extension`
- `swift.protocol.extension`

Instead of a more generic:

- `swift.extension`

Thank you again for putting this together! I'm really looking forward to being able to document extensions with Swift-DocC.

Thank you for diving deep on this @theMomax, it's probably the most complex design area in DocC and I think you've ended up in a good place. The approach is intuitive, while still accommodating the most extreme edge cases.

I agree with Ethan, the mix of dashes and periods is a bit jarring. The dash is meant to signal that the symbol name is over and now disambiguation has begun. The periods are there to separate components of the disambiguation suffix.

+1 on this, makes sense!

These are great questions, and I think there is more overall discussion to be had around how the symbol graph format can potentially evolve to better surface extension information. The current format already records some information. For example, the extended types are not currently emitted as symbols in the symbol graph I believe—should they be? If so, how does DocC differentiate the extended type and the definition of the type, which is important for USR-based link resolution?

I'm wondering if we should discuss potential symbol graph format evolutions in a separate post, so that we can keep this thread focused on end-user experience and iterate on this original pitch.

not sure if it would be enough information, but could it infer their existence from pathComponents?

Yes, the disambiguation suffix is only included in case of a collision of course.

Yes, I'd go with the default behavior that currently exists for other collisions.

Definitely, yes! Thanks for pointing this out. Would have been a shame to break this consistency!

This would be the ideal outcome, I think. Knowing the symbol kind of the extended type is definitely a helpful feature. I will include this feature in the write-up of this discussion.

This sounds very reasonable to me! However, while it's good to have these things in mind, I won't include it in the pitch write-up just yet. This pitch is UX-focused and, honestly, I just need a little more time to get a clear picture of the implementation strategy, so I can't really tell what consequences such a decision might have. I'll probably launch another thread to discuss these topics in the coming weeks, as @franklin suggested.

Hi everyone, this is the updated proposal, based on my initial pitch and the discussion in this thread:

(Introduction and Motivation didn't change, but I still included them so that this post is self-contained and because the example is referenced later.)

Introduction

DocC does not include extensions to a type defined in an external module in the documentation catalogue, even though the extension and its contents are declared in the documented module.

Consider the following potential addition to the SlothCreator package:

/// A type that generates sloths.
public protocol SlothGenerator {
    /// Generates a sloth in the specified habitat.
    func generateSloth(in habitat: Habitat) throws -> Sloth
}

public extension Collection where Element == Habitat {
    /// Generates one ``Sloth`` per ``Habitat`` in the collection.
    ///
    /// - Note: Unfortunately, neither this comment nor the function itself is included in
    /// the documentation catalogue yet.
    func mapToSloth(using generator: SlothGenerator) throws -> [Sloth] {
        try self.map(generator.generateSloth(in:))
    }
}

/// A type that generates names for sloths.
public protocol NameGenerator {
    /// Generates a name for a sloth.
    ///
    /// - parameter seed: A value that influences randomness.
    func generateName(seed: Int) -> String
}

/// An array of strings conforms to ``NameGenerator``. Each time
/// ``generateName(seed:)`` is called, it returns the element identified
/// by the given seed.
extension Array: NameGenerator where Element == String {
    public func generateName(seed: Int) -> String {
        self[seed % self.count]
    }
}

Neither of the two extensions is added to the documentation catalogue. That is, neither Collection/mapToSloth(using:) nor Array/generateName(seed:) is listed. Furthermore, they cannot be referenced using their identifiers, so it is also not possible to include them in the documentation catalogue using manual curation. Finally, Array (with Element == String) is not listed among the Conforming Types of NameGenerator.

Motivation

Swift encourages us to use extensions on external types and therefore we should also be able to document such extensions accordingly. As I already mentioned, this capability has also been requested and discussed on the Swift forums before.

While there are possibly infinite use cases, I think that especially the growing ecosystem of SwiftUI packages could benefit hugely from this addition. A large part of their public API surface may be made up of view modifiers, which are usually exposed as function-extensions on SwiftUI's View type.

Proposed Solution

We assume a hosting environment which hosts all documentation catalogues relevant to our project at hostpath/MODULE_NAME. For example, the documentation page for SlothGenerator (in the SlothCreator module) would be located at hostpath/slothcreator/slothgenerator and the standard library's Array could be found at hostpath/swift/array.

New Documentation Pages

Extended modules are documented within the extending module's documentation catalogue, where all content that is added to an external module via extensions is prefixed with the extended module's name. That is, the base path for all external extended contents is hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME. To be more precise, hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME would host a page that lists all types belonging to EXTENDED_MODULE that were (publicly) extended in EXTENDING_MODULE. For each of these types there exists a page at hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME/EXTENDED_TYPE_NAME, which lists all the members and e.g. default implementations added to this type in EXTENDING_MODULE. Of course, all of these also have their respective documentation pages at hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME/EXTENDED_TYPE_NAME/MEMBER_NAME.

We'd get the following new pages for the SlothCreator example above:

  • hostpath/slothcreator/swift (extended module page)
  • hostpath/slothcreator/swift/collection (extended type page)
  • hostpath/slothcreator/swift/collection/maptosloth(using:) (added member page)
  • hostpath/slothcreator/swift/array (extended type page)
  • hostpath/slothcreator/swift/array/namegenerator-implementations (added default implementation page)
  • hostpath/slothcreator/swift/array/generatename(seed:) (added member page)
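
As a rough illustration of the path scheme above, the following sketch assembles the member and type page paths for the SlothCreator example. The helper `extensionPagePath` and its parameters are hypothetical names made up for this sketch (they are not part of DocC), and lowercasing each component is an assumption taken from the example URLs in this post.

```swift
/// Hypothetical helper (not a DocC API): builds the page path for content
/// that `extendingModule` adds to a symbol declared in `extendedModule`.
func extensionPagePath(
    hostpath: String = "hostpath",
    extendingModule: String,
    extendedModule: String,
    symbolPath: [String] = []
) -> String {
    // Assumption: path components are the lowercased names, as in the
    // example URLs listed above.
    let components = [extendingModule, extendedModule] + symbolPath
    return ([hostpath] + components.map { $0.lowercased() }).joined(separator: "/")
}

// Extended module page
print(extensionPagePath(extendingModule: "SlothCreator", extendedModule: "Swift"))
// Extended type page
print(extensionPagePath(extendingModule: "SlothCreator", extendedModule: "Swift",
                        symbolPath: ["Collection"]))
// Added member page
print(extensionPagePath(extendingModule: "SlothCreator", extendedModule: "Swift",
                        symbolPath: ["Collection", "mapToSloth(using:)"]))
```

The "-implementations" overview page is not covered here; it would need its own rule for deriving the final path component.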

Page Contents

The following snippets describe the outline of the newly introduced pages with examples from the SlothCreator framework.

Extended Module Page
<!-- hostpath/slothcreator/swift -->

# ``Swift``

## Topics

### Extended Protocols

- ``Collection``

### Extended Structures

- ``Array``

Extended Type Page
<!-- hostpath/slothcreator/swift/array -->

Extended Structure
# ``Array``

[`Array`](hostpath/swift/array) was originally declared in the [Swift](hostpath/swift) Framework.

## Declaration

\```swift
extension Array
\```

## Topics

### Default Implementations

- [NameGenerator Implementations](hostpath/slothcreator/swift/array/namegenerator-implementations)

## Relationships

### Conforms To

- ``NameGenerator``

The Added Member Page and Added Default Implementation Page look and behave exactly as they do for normal (locally defined) symbols.

Modified Documentation Pages

This proposal also adds content to some existing pages:

Module Page

The module page is extended by a segment called "Extended Modules" listing all extended module pages.

<!-- hostpath/slothcreator -->

# Sloth Creator

## Topics

### Structures

...

### Protocols

...

### Extended Modules

- ``Swift``
- ...

Type Page

The structure of normal type pages is not altered, however, they may receive additional entries in their "Relationships" section, e.g. for NameGenerator:

<!-- hostpath/slothcreator/namegenerator -->

Protocol
# ``NameGenerator``

A type that generates names for sloths.

## Declaration

\```swift
protocol NameGenerator
\```

## Topics

### Instance Methods

- ``generateName(seed:)``

## Relationships

### Conforming Types

- ``Array``
    Conforms when `Element` is `String`.

Manual Curation

Manual curation is permitted via the usual methods, including the possibility to reference any of the newly introduced pages outside of their respective extended module path (e.g. outside of hostpath/slothcreator/swift).

Navigation Sidebar

The navigation sidebar essentially lists the same content as the module page. That is, the new section header (i.e. the one on the same level as e.g. "Structures") would be called "Extended Modules". This section only lists the extended module pages, not the extended symbol pages. This list's items can be expanded to reveal items for the extended type pages, which in turn can be expanded to reveal the member pages, just as with regular pages.

Resolving URL Collisions

There exists one possible collision with the proposed URL scheme for extended content: a module defines a type with the same name as one of its public dependencies (i.e. extended modules). This scenario should be very rare. However, we still propose to resolve such conflicts by applying the standard disambiguation suffix strategy. The EXTENDED_MODULE_NAME would get the suffix -swift.module.extension.

For example, if SlothCreator contained the following enum declaration:

public enum Swift {
    case foo
}

Then we would have the URLs hostpath/slothcreator/swift-swift.enum and hostpath/slothcreator/swift-swift.module.extension. These disambiguation suffixes remain in place even for members that could be referenced unambiguously without the suffixes. For example, we'd get hostpath/slothcreator/swift-swift.enum/foo and hostpath/slothcreator/swift-swift.module.extension/array/generatename(seed:).
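
The collision rule can be sketched as follows. This is only an illustration under assumptions: `TopLevelPage` and `urlComponents(for:)` are hypothetical names, the kind strings are the ones discussed in this thread, and comparing names case-insensitively is an assumption of the sketch rather than a statement about DocC's implementation.

```swift
/// Hypothetical model of a top-level page in a module's documentation.
struct TopLevelPage {
    let name: String   // e.g. "Swift"
    let kind: String   // e.g. "swift.enum" or "swift.module.extension"
}

/// Returns one URL path component per page. The "-<kind>" suffix is only
/// appended when at least two pages share the same lowercased name.
func urlComponents(for pages: [TopLevelPage]) -> [String] {
    var counts: [String: Int] = [:]
    for page in pages {
        counts[page.name.lowercased(), default: 0] += 1
    }
    return pages.map { page -> String in
        let base = page.name.lowercased()
        return counts[base, default: 0] > 1 ? "\(base)-\(page.kind)" : base
    }
}

let withCollision = urlComponents(for: [
    TopLevelPage(name: "Swift", kind: "swift.enum"),
    TopLevelPage(name: "Swift", kind: "swift.module.extension"),
    TopLevelPage(name: "Sloth", kind: "swift.struct"),
])
print(withCollision)
// ["swift-swift.enum", "swift-swift.module.extension", "sloth"]
```

Without the local `Swift` enum, the extended module page keeps the plain `swift` component, which is the common case the proposal optimizes for.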

Identifier Syntax

Identifiers used in DocC code are suffixes of the URLs used above. References to any of the newly introduced pages (i.e. those that list content from extensions to external types) must always contain the EXTENDED_MODULE_NAME. I.e. even if there is no local type Array, just ``Array`` is not a valid identifier for hostpath/slothcreator/swift/array. Instead, ``Swift/Array`` would be the correct syntax.

Once SR-15431 (Support DocC references to symbols defined in another module, issue #208 in apple/swift-docc) is implemented, we always follow a local-first strategy. I.e. if a module shadows a symbol, the simple EXTENDED_MODULE_NAME/SYMBOL_PATH identifier links to the local page of the extension (hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME/SYMBOL_PATH), not to the original page in the extended module's documentation catalogue (hostpath/EXTENDED_MODULE_NAME/SYMBOL_PATH).

In order to reference shadowed symbols, one must use absolute identifiers. Absolute identifiers are basically the URL without the leading hostpath. That is, shadowed symbols can be referenced unambiguously using /EXTENDED_MODULE_NAME/SYMBOL_PATH (note the leading slash).

Firstly, note that absolute identifiers are always an option, i.e. one could also refer to SlothCreator's NameGenerator in the Sloth Creator documentation catalogue using ``/SlothCreator/NameGenerator``.

Secondly, relative identifiers can be used for referencing external symbols, if they are not shadowed. That is, ``Swift/Array/count`` is a valid identifier in the SlothCreator module, even though there is no local extension that defines a count on Swift.Array.

Finally, identifiers use the same disambiguation syntax as the URL does. This applies to both relative and absolute identifiers. In the edge case where the -swift.module.extension suffix is required, a relative path that includes a colliding name but omits the disambiguation suffix is not permitted, even if the path as a whole does not collide. (Without the collision, such an identifier could link to a symbol in another documentation catalogue.) I.e. ``Swift/Array/count`` cannot be used in the example from "Resolving URL Collisions" in order to link to the Array.count definition in the standard library. Instead, the absolute identifier ``/Swift/Array/count`` must be used.
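
To make the resolution rules in this section more concrete, here is a minimal sketch of a resolver. `LinkResolver`, `knownPages`, and `collidingNames` are hypothetical names invented for illustration; the sketch models pages as absolute, lowercased paths without the hostpath and does not reflect how DocC resolves links internally.

```swift
/// Hypothetical resolver illustrating the "local first" strategy.
struct LinkResolver {
    let currentModule: String
    /// Absolute, lowercased page paths, e.g. "/slothcreator/swift/array".
    let knownPages: Set<String>
    /// Lowercased top-level names in the current module that require a
    /// disambiguation suffix (e.g. "swift" if a local `Swift` enum exists).
    let collidingNames: Set<String>

    func resolve(_ identifier: String) -> String? {
        let id = identifier.lowercased()
        // Absolute identifiers (leading slash) are looked up as written.
        if id.hasPrefix("/") {
            return knownPages.contains(id) ? id : nil
        }
        // A relative identifier that starts with a colliding name but
        // lacks the disambiguation suffix is rejected.
        let first = id.split(separator: "/").first.map(String.init) ?? ""
        if collidingNames.contains(first) { return nil }
        // Local first: prefer the page inside the current module ...
        let local = "/\(currentModule.lowercased())/\(id)"
        if knownPages.contains(local) { return local }
        // ... and only then fall back to another documentation catalogue.
        let external = "/\(id)"
        return knownPages.contains(external) ? external : nil
    }
}

let pages: Set<String> = [
    "/slothcreator/swift/array", "/swift/array", "/swift/array/count",
]
let resolver = LinkResolver(currentModule: "SlothCreator",
                            knownPages: pages, collidingNames: [])
print(resolver.resolve("Swift/Array") ?? "unresolved")        // local extension page
print(resolver.resolve("/Swift/Array") ?? "unresolved")       // original page
print(resolver.resolve("Swift/Array/count") ?? "unresolved")  // not shadowed, external
```

With `collidingNames: ["swift"]`, the relative ``Swift/Array/count`` no longer resolves in this sketch, while the absolute ``/Swift/Array/count`` still does.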

Just to say: this looks excellent. I've hit this problem quite a bit, where you can only tell people that things exist, and it is quite frustrating.

The most important details for me (beyond just making these APIs visible), are the ability to reference APIs I add in extensions throughout the rest of the package documentation, and the ability to curate their topic placement, and I think this proposal does a really good job with all of those.

Relying on disambiguation suffixes means you can add a type, then all of your extensions change their URLs. It's better to avoid these things ever coming in to conflict, if possible. Why not namespace the extended module names inside a path component that is either a single rare or invalid identifier?

For example, "_": So instead of "hostpath/slothcreator/swift", we'd use "hostpath/slothcreator/_/swift".

(Or "_$", "_$_", or something better - you get the idea)

This is the case indeed. However, this only applies if this type happens to have the same name as one of your public dependencies, which is really rare I think. (Do you think there is a major use case for this?) And even if you really need a type with such a name, a simple global replacement /MY_NAME => /MY_NAME-swift.module.extension should do the job.

The general idea was to optimize usability for the 99% of cases, while still supporting the 1% with a slightly downgraded experience. We want to have an intuitive syntax for DocC identifiers as well as URLs.

Furthermore, the way DocC currently works each path segment also results in an auto-generated page. With this additional overview-layer the extended modules section of the documentation catalogue would fit in worse into the sidebar and the documentation catalogue's overview page. In both these places you have sections such as "Structures" which list a series of items. With your proposed structure, the new section "Extended Modules" would only list one item: the overview page corresponding to the _ path segment.

I'm not sure if there could be a case for it - but developers like to get creative and push tools to their limits, so I wouldn't rule it out.

...but only if the cost is acceptable, of course; it's fine for tools to have limits. Given the points you raised about the identifiers matching the URLs, and the existing one-page-per-path-component design, I agree that it's not worth the complexity.

what about

/hostpath/swift/array?import=slothcreator

?

That is actually quite a creative idea! However, it introduces a completely different hierarchical structure. I also think that it doesn't really fit in with the existing DocC identifier syntax. Finally, I don't think URL query parameters would work well with static hosting.

documentation is inherently dynamic. designing around the requirements of static hosting limits a lot of things.

Would you like to elaborate on that? While I see that your suggested URL syntax implies a totally different structure, I don't really see what benefits it would have (that would outweigh static hosting). Do you see any major features that are incompatible with the current proposal and if so, could you present a vision of how they could be implemented following your approach?

for example, static hosting means no case-folding; try visiting:

https://apple.github.io/swift-markdown/documentation/markdown/Markup

This, however, is nothing that is incompatible with the current proposal, but with static hosting itself! You could easily use the proposed solution in a dynamic environment and implement case-folding there.

I just started a follow-up thread for discussing the symbol graph structure and further implementation details: Symbol Graph Adaptions for Documenting Extensions to External Types in DocC.

I would be very grateful for any input there! Thanks again to everyone contributing to this discussion!

what happens if there is more than one extended module? meaning, module A extends a type declared by module B, which in turn extends a type declared in module C?

qualified name: FooType.BarType.BazType

dependency graph:

swift-foo                  swift-bar                  swift-baz 
 FooType  -- has member ->  BarType  -- has member ->  BazType
       (perpetrator: swift-bar)   (perpetrator: swift-baz)

what happens if module A conforms a type in module B to a protocol declared by module C, which has extension members declared in module D?

qualified name: FooType.qux(_:)

dependency graph:

swift-foo                   swift-bar                    swift-qux 
 FooType  -- conforms to ->  Barable  -- has member ->  Barable.qux(_:)
       (perpetrator: swift-baz)   (perpetrator: swift-qux)

swift-foo                                 swift-baz 
 FooType  -- has synthesized member ->  FooType.qux(_:)
            (perpetrator: swift-baz)   

Thank you @taylorswift, this is very interesting input!

Maybe my formal schema hostpath/EXTENDING_MODULE_NAME/EXTENDED_MODULE_NAME/SYMBOL_PATH wasn't general enough. The generalized schema is hostpath/EXTENDING_MODULE_NAME/EXTENDED_TYPE_PATH/SYMBOL_PATH.

You'd get the following pages from your first example:

  • hostpath/swift-foo/footype
  • hostpath/swift-bar/swift-foo/footype/bartype
  • hostpath/swift-baz/swift-bar/swift-foo/footype/bartype/baztype

All the respective prefix-paths naturally exist too. All pages behave exactly like outlined in my post above. E.g. hostpath/swift-baz/swift-bar also has a section Extended Modules, which lists the module swift-foo.

This is a very tricky question, indeed. I'd propose the following behavior:

Firstly, page paths are constructed in a manner consistent with the example above. The only remaining question is how to deal with the synthesized member. The URL for those consists of four parts:

hostpath/EXTENDED_TYPE_IDENTIFIER/CONFORMED_PROTOCOL_IDENTIFIER-implementations/SYNTHESIZED_MEMBER_IDENTIFIER, e.g.:

| schema | example |
| --- | --- |
| hostpath | hostpath |
| EXTENDED_TYPE_IDENTIFIER | swift-baz/swift-foo/footype |
| CONFORMED_PROTOCOL_IDENTIFIER-implementations | swift-bar/barable-implementations |
| SYNTHESIZED_MEMBER_IDENTIFIER | swift-qux/swift-bar/barable/qux(_:) |

If the CONFORMED_PROTOCOL or SYNTHESIZED_MEMBER is defined in the local module (in our example swift-baz), the respective module specifier can be omitted, resulting in a simpler path.

Overall, we'd get the following pages (plus prefix-pages):

  • hostpath/swift-foo/footype
  • hostpath/swift-bar/barable
  • hostpath/swift-qux/swift-bar/barable/qux(_:)
  • hostpath/swift-baz/swift-foo/footype/swift-bar/barable-implementations/swift-qux/swift-bar/barable/qux(_:)

I know this last path isn't particularly pretty or easy to remember, but that's just because your example is pretty extreme.

how would you write a symbol link to a path like hostpath/swift-baz/swift-foo/footype/swift-bar/barable-implementations/swift-qux/swift-bar/barable/qux(_:)? would there be any shorthands?

I'm afraid I have to correct my previous draft for the URL schema. I assumed the CONFORMED_PROTOCOL_IDENTIFIER-implementations segment is also included in the URL for the respective member pages, but actually it isn't. So we either have:

hostpath/EXTENDED_TYPE_IDENTIFIER/CONFORMED_PROTOCOL_IDENTIFIER-implementations (for the overview page)

OR

hostpath/EXTENDED_TYPE_IDENTIFIER/SYNTHESIZED_MEMBER_IDENTIFIER (for the member page).

Therefore, the paths in the example would be:

  • hostpath/swift-foo/footype
  • hostpath/swift-bar/barable
  • hostpath/swift-qux/swift-bar/barable/qux(_:)
  • hostpath/swift-baz/swift-foo/footype/swift-bar/barable-implementations
  • hostpath/swift-baz/swift-foo/footype/swift-qux/swift-bar/barable/qux(_:)

Luckily, this also shortens the URLs quite a bit.
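
The corrected schema boils down to two simple concatenations. The sketch below uses hypothetical function names and takes the three identifiers as preassembled strings, exactly as they appear in the example; it is an illustration, not DocC code.

```swift
/// Hypothetical helper: path of the "<protocol>-implementations" overview page.
func implementationsOverviewPath(
    hostpath: String = "hostpath",
    extendedType: String,
    conformedProtocol: String
) -> String {
    "\(hostpath)/\(extendedType)/\(conformedProtocol)-implementations"
}

/// Hypothetical helper: path of a synthesized member page. Note that the
/// "-implementations" segment is not part of this path.
func synthesizedMemberPath(
    hostpath: String = "hostpath",
    extendedType: String,
    synthesizedMember: String
) -> String {
    "\(hostpath)/\(extendedType)/\(synthesizedMember)"
}

let extendedType = "swift-baz/swift-foo/footype"
print(implementationsOverviewPath(extendedType: extendedType,
                                  conformedProtocol: "swift-bar/barable"))
print(synthesizedMemberPath(extendedType: extendedType,
                            synthesizedMember: "swift-qux/swift-bar/barable/qux(_:)"))
```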

The simplest in-code identifier in swift-baz would therefore be:

  • swift-foo/footype/swift-bar/barable-implementations (for the overview page)
  • swift-foo/footype/swift-qux/swift-bar/barable/qux(_:) (for the member page)

I think I mentioned this previously, but here's my take on this: Initially, there won't be any. Removing any of the module specifiers in these URLs introduces the possibility of a naming collision. (We previously agreed that DocC should allow for referencing shadowed symbols.) This is a complexity I want to avoid in the initial implementation. I think that most users will be glad to just have the possibility to link to these symbols even if the syntax is a bit lengthy. However, that doesn't mean we can't iterate on it afterwards. Of course, this fully qualified syntax will always remain an option, but adding shortcuts in situations where naming is unambiguous is definitely a topic for future work.

Ideally, the identifier swift-foo/footype/swift-qux/swift-bar/barable/qux(_:) could then be boiled down to just footype/qux(_:), given that:

  • swift-baz defines no other type FooType
  • swift-baz defines no member qux(_:) on FooType
  • swift-baz does not conform FooType to any other protocol which has a member qux(_:)
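
The three conditions could be checked roughly like this. Everything in the sketch is hypothetical: `ModuleSymbols` is an invented, simplified view of what a module such as swift-baz declares, and `canUseShorthand` is not an existing or planned DocC API.

```swift
/// Hypothetical, simplified view of what the current module declares.
struct ModuleSymbols {
    /// Names of types defined in the module itself.
    var localTypes: Set<String> = []
    /// Members the module declares directly on a type, keyed by type name.
    var localMembers: [String: Set<String>] = [:]
    /// Conformances the module adds, as type -> protocol -> protocol members.
    var addedConformances: [String: [String: Set<String>]] = [:]
}

/// Returns true if `type/member` would be unambiguous in `module`, where
/// the member is synthesized through the conformance to `viaProtocol`.
func canUseShorthand(
    type: String, member: String, viaProtocol: String, in module: ModuleSymbols
) -> Bool {
    // 1. The module defines no other type with this name.
    if module.localTypes.contains(type) { return false }
    // 2. The module defines no member with this name on the type.
    if module.localMembers[type, default: []].contains(member) { return false }
    // 3. No other added conformance brings in a member with this name.
    let conformances = module.addedConformances[type, default: [:]]
    return !conformances.contains(where: { entry in
        entry.key != viaProtocol && entry.value.contains(member)
    })
}

var swiftBaz = ModuleSymbols()
swiftBaz.addedConformances["FooType"] = ["Barable": ["qux(_:)"]]
print(canUseShorthand(type: "FooType", member: "qux(_:)",
                      viaProtocol: "Barable", in: swiftBaz))  // true

// A second conformance with a member of the same name makes it ambiguous.
swiftBaz.addedConformances["FooType"]?["Quxable"] = ["qux(_:)"]
print(canUseShorthand(type: "FooType", member: "qux(_:)",
                      viaProtocol: "Barable", in: swiftBaz))  // false
```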