Unified documentation links

I couldn't attend the Documentation Workgroup meeting where this was discussed but from what I've been told there was general agreement that these individual pieces would be better discussed as separate pitches. I agree with that assessment so I'll reply to each piece as it relates to the DocC link syntax today.

I'll add a description of the DocC link syntax at the end for additional context and to highlight some features that are relevant to these proposed link syntax changes.


Path separators

It feels like a nice refinement to parse consecutive path separators as a single separator and include the subsequent separators in the name of the following path component. This avoids the need to escape the path separator in the most common cases, for example when linking to Swift's division operator.

It's still possible that a separator character occurs in the middle of a function or operator name. For example, the symbol name for this custom operator would be +/-(_:_:) which DocC can't link to today and the proposed syntax doesn't solve. I consider this to be a bug that would best be solved by supporting escaped separators (for example using a backslash).

public extension Int {
    static func +/- (lhs: Int, rhs: Int) -> (added: Int, subtracted: Int) { ... }
}
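To make the two ideas above concrete, here's a minimal sketch of how a path could be split. It is not DocC's parser: the function name splitPathComponents is made up, and the backslash escape is the hypothetical fix I mentioned, not something DocC supports today.

```swift
/// Splits a link path on "/" with two refinements:
/// - a run of consecutive separators counts as one separator, and the extra
///   separators become the start of the following component's name
/// - a backslash (hypothetical syntax) escapes the next character
func splitPathComponents(_ path: String) -> [String] {
    var components: [String] = []
    var current = ""
    var previousWasSeparator = false
    var isEscaped = false

    for character in path {
        if isEscaped {
            // An escaped character is always part of the name.
            current.append(character)
            isEscaped = false
            previousWasSeparator = false
        } else if character == "\\" {
            isEscaped = true
        } else if character == "/" {
            if previousWasSeparator {
                // A second (or later) consecutive separator belongs to the next name.
                current.append(character)
            } else {
                if !current.isEmpty { components.append(current) }
                current = ""
                previousWasSeparator = true
            }
        } else {
            current.append(character)
            previousWasSeparator = false
        }
    }
    if !current.isEmpty { components.append(current) }
    return components
}
```

With this sketch, "Swift/Int//(_:_:)" splits into Swift, Int, and /(_:_:), while the custom operator needs the escape: "Int/+\/-(_:_:)" splits into Int and +/-(_:_:).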

Disambiguation

Adding support for disambiguation written as SymbolName [HASH] in addition to SymbolName-hash feels like a reasonable change that could make it easier for developers to utilize multiple documentation tools in parallel.

However, using Swift-specific keywords instead of the symbol kind identifiers defined in Symbol Kit may prove problematic for links in other languages.

It's not clear to me what's meant by "all keywords must be present". Is this referring to all keywords in the declaration of that symbol or to the minimal set of Swift keywords necessary for that type of declaration? Needing to specify all keywords from the declaration of that symbol poses two problems:

These Objective-C method declarations contain no keywords that can be used to disambiguate the instance method from the class method:

@interface MyClass : NSObject
/// An instance method.
- (id)something;
/// A class method.
+ (id)something;
@end

This Swift declaration contains 5-7 keywords (depending on whether you count the "public" access modifier and the "throws" keyword that applies to the initializer's parameter) and the leading keywords can be ordered in many different ways:

public class Something {
    public required nonisolated convenience init(_ perform: () throws -> Void) async rethrows { ... }
}

On the other hand, mapping Symbol Kit symbol kind identifiers into other identifiers (for example, using "var" for both properties and instance variables) may have issues in languages other than Swift. This Objective-C class definition has both an instance property and an instance variable called "name". If both these have the same kind identifier they instead need to be disambiguated by their hashes.

@interface Person : NSObject {
    NSString *name; // a variable
}
@property(copy) NSString *name; // a property
@end
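As a small illustration of that collision, here's a sketch that checks whether a set of links stays unique under a given suffix. The Symbol type, the reduced keyword mapping, and the helper name are all made up for this example. Only the "property" and "ivar" kind identifiers come from Symbol Kit.

```swift
struct Symbol {
    let name: String
    let kind: String   // a symbol kind identifier such as "property" or "ivar"
}

// Hypothetical reduced mapping from kind identifiers to Swift-like keywords.
let keywordForKind = ["property": "var", "ivar": "var"]

/// Returns true if every symbol gets a distinct "name-suffix" link.
func linksAreUnique(_ symbols: [Symbol], suffix: (Symbol) -> String) -> Bool {
    let links = symbols.map { "\($0.name)-\(suffix($0))" }
    return Set(links).count == links.count
}

// The two "name" members of the Person class above.
let members = [
    Symbol(name: "name", kind: "ivar"),
    Symbol(name: "name", kind: "property"),
]

// name-ivar and name-property are distinct links.
let uniqueByKind = linksAreUnique(members) { $0.kind }
// Both become name-var, so only a hash could tell them apart.
let uniqueByKeyword = linksAreUnique(members) { keywordForKind[$0.kind] ?? $0.kind }
```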

As DocC adds support for additional languages we may run into more of these problems, where a language has collisions that aren't covered by the reduced set of symbol kind identifiers.

Case sensitive

This has no impact on DocC.

Trailing parenthesis

I'm not sure what problem this is solving. From my perspective it just adds ambiguity to links that wouldn't otherwise be ambiguous. For example, an Objective-C class with a property and a method, or a Swift type with a property and a static function, both result in descendant path components named something and something():

@interface MyClass : NSObject
@property NSInteger something; // a property
- (void)something:(NSInteger)argument; // a method
@end

public struct MyStruct {
    public var something: Int
    public static func something() {}
}

Assuming the exact spelling is preferred over the extra trailing parentheses, this could result in a situation where:

  • The property is added first and some links to the property are written with trailing empty parentheses. These links resolve without warning.
  • The method is added later. Now the links with trailing parentheses that used to resolve to the property resolve to the method instead. There is no warning highlighting this issue to the developer.
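The two steps above can be sketched with a toy resolver. This is my assumption of how "exact spelling first, then ignore trailing parentheses" could behave, not the behavior of any existing tool, and the resolve function is hypothetical.

```swift
/// Resolves a link against the known symbol names, preferring the exact
/// spelling and otherwise ignoring a trailing "()".
func resolve(_ link: String, among names: Set<String>) -> String? {
    if names.contains(link) { return link }   // the exact spelling wins
    if link.hasSuffix("()") {
        let stripped = String(link.dropLast(2))
        if names.contains(stripped) { return stripped }
    }
    return nil
}

// Step 1: only the property exists, so "something()" silently finds it.
let before = resolve("something()", among: ["something"])

// Step 2: the function is added and the same link now silently finds it instead.
let after = resolve("something()", among: ["something", "something()"])
```

The link text never changed and neither lookup failed, which is why nothing warns the developer.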

This is different from how symbol overloads and other link collisions work, where adding a new symbol that results in a collision produces a warning requiring links to be updated to unambiguously refer to a specific symbol.

Namespacing

The main reason for requiring a leading slash to refer to a symbol in another module in DocC is about clarity while reading the raw link in source. For example, imagine that this namespacing was implicit and you encountered these two links in some documentation markup:

  • Swift/String
  • Swift/Sequence/partitioned(by:)

Reasoning about what symbols these two links refer to becomes harder than it may seem at first.

If the current module has one or more public[1] extensions to Swift's String type then Swift/String refers to the local page listing the extensions that the current module adds to Swift's String type. It's also possible, but unlikely, that the current module has a local symbol named "Swift" with a nested symbol called "String". If the current module has both a local "Swift/String" symbol and extends Swift's String type this link would be ambiguous, requiring "-struct" or "-struct.extension" disambiguation to uniquely refer to either the local type or the extension page.

If the current module has a dependency on Swift Algorithms then the second link refers to the partitioned(by:) function added to Swift's Sequence protocol in a public extension in Swift Algorithms (assuming the current module doesn't have such an extension).

We'd like to avoid a situation where a link resolves to one symbol without warnings and then, after some project changes, refers to another symbol without warning.

DocC could support implicit namespacing and handle collisions with external symbols the same as collisions within the current module, requiring "-struct" or "-struct.extension" disambiguation. This would solve the issue of ambiguous links but wouldn't help the developer reason about what symbol a link refers to.

It may also require the developer to disambiguate links like String-struct or Collection-protocol if their project also extends those types, or links like MyClass-2b5dq if one of their project's dependencies also has a public class with the same name (since both are classes they need to be disambiguated by hashes).

I personally like that I can look at a link in DocC without a leading slash and know that it refers to some symbol in the current module.

Vector links

I haven't heard the term "vector link" before and couldn't find any DocC feature requests describing the type of issues that it aims to solve but after doing a bit of research on the issue I find this to be an interesting problem with a few possible alternatives to consider.

I searched GitHub for occurrences of concatenated symbol links and found 6 projects doing this: "Actomaton", "JivoSDK", "KeyboardKit", "MIDIKit", "react-native-custom-keyboard", and "SpotifyAPI".

Some of these projects use concatenated symbol links where each link adds another component to the previous link, for example:

  • ``KeyboardContext``.``KeyboardContext/preview``
  • ``Effect``.``Effect/init(id:sequence:)``
  • ``SpotifyAPILogHandler``.``SpotifyAPILogHandler/bootstrap()``

Some instead concatenate distinct symbol links of properties, where later links refer to a member of the previous property's type, for example:

  • ``MIDIManager/endpoints``.``MIDIEndpoints/outputs``
  • ``Jivo``.``Jivo/session``.``JVSessionController/shutDown()``

If the goal is to create a single link that displays multiple path components, developers can accomplish that today using the []() syntax, for example [MyClass.myProperty](doc:MyClass/myProperty). This isn't great since parts of the link are repeated, but it is flexible in that the text can be freely customized, for example: [The myProperty property on MyClass](doc:MyClass/myProperty).

One alternative that could be interesting to discuss would be the ability to globally customize how much of a link should be displayed on the page and/or what the default should be. I could see the argument for displaying the containing type name for properties, methods, enum cases, etc. unless the link is from the containing type's scope.

If we define a new syntax for customizing how a symbol link displays on the page I would like to see if there are syntax alternatives that can accommodate other types of link display configuration as well. I think there's a lot to explore here. For example, one feature that we want to support in DocC but don't know the right syntax for is "inactive links": resolving the link to get the correct symbol name and warning if the link doesn't resolve, but rendering the symbol name in "code voice" without making it a clickable link. This is different from directly putting the symbol in code voice because it allows the symbol to display its language-specific name when switching between the Swift and Objective-C versions of the page that contains the inactive link. I could also imagine cases when it'd be nice to link to a function and display its name but truncate its arguments.

Linking to overload groups

The "Improving the presentation of overloaded symbols in Swift DocC" proposal suggested using links with no disambiguation (or only symbol kind disambiguation where necessary) for overload pages. For example, this class would have two overload pages that can be linked to using ``something()-method`` and ``something()-type.method`` respectively.

public class Something {
    public func something() -> Int { 0 }
    public func something() -> String { "" }

    public static func something() -> Int { 0 }
    public static func something() -> String { "" }
}

I don't see a need to add additional syntax for links that are already unambiguous.


For some of these changes where there's more to discuss it may be easier to create new threads to continue talking about each piece separately instead of having multiple ongoing conversations in the same thread.


How links work in DocC

The DocC link syntax is documented both in the Link to Symbols and Other Content section of the DocC documentation about its documentation markup (aimed towards people using DocC) and in the Linking Between Documentation page of the SwiftDocC framework documentation (aimed towards SwiftDocC contributors) so I'll try to not repeat too much about what's already documented in those places.


DocC supports two types of documentation links:

  • Symbol links; a symbol path surrounded by two grave accents on each side: ``MyClass/myProperty``
  • General documentation links; markdown links with a "doc" scheme: <doc:MyArticle> or <doc:MyClass/myProperty>.

Symbol links can only link to symbols but general documentation links can link to all types of documentation content: symbols, articles, and tutorials. Both symbol links and general documentation links use a "path" in the documentation hierarchy, using forward slashes ("/") to separate each path component. Symbol links only consist of a "path" but general documentation links can also include a URI fragment to reference an on-page element (for example to reference tutorial sections or article headings) and a URI host, which is almost never used.

doc://com.example/path/to/documentation/page#optional-fragment
      ╰────┬────╯╰────────────┬────────────╯ ╰───────┬───────╯
       bundle ID    path in docs hierarchy    on-page element
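Since these general documentation links follow URI syntax, the three parts in the diagram can be pulled apart with Foundation's URLComponents. This is only a sketch to show the mapping. The helper name parts(ofDocumentationLink:) is made up and this isn't how DocC resolves links.

```swift
import Foundation

/// Splits a "doc:" link into the parts labeled in the diagram above.
/// Returns nil if the string isn't a URL with the "doc" scheme.
func parts(ofDocumentationLink link: String) -> (bundleID: String?, path: String, fragment: String?)? {
    guard let components = URLComponents(string: link),
          components.scheme == "doc"
    else { return nil }

    // host -> bundle ID, path -> path in the docs hierarchy, fragment -> on-page element
    return (components.host, components.path, components.fragment)
}

let example = parts(ofDocumentationLink: "doc://com.example/path/to/documentation/page#optional-fragment")
```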

This means that there are a few different syntax alternatives for referencing a symbol in DocC:

  • ``MyClass/myProperty``
  • <doc:MyClass/myProperty>
  • [](doc:MyClass/myProperty)
  • [Arbitrary text](doc:MyClass/myProperty)

Symbol links can't resolve on-page elements so if a developer writes ``MyClass/myProperty#Name-of-some-heading`` they'll get a warning with a fixit to use a <doc:> style link instead.

The link syntax in DocC isn't specific to Swift. Any language that can describe its symbols and their relationships in a symbol graph file can be used in DocC. In addition to the Swift compiler emitting symbol graph files, Clang is capable of emitting symbol graph files for C and Objective-C code.

DocC uses the symbol spelling for each language based on the data in the symbol graph files. In this example an Objective-C class links to its instance method using a symbol link:

/// ``doSomethingWithFirst:second:``
@interface MyClass : NSObject
- (void)doSomethingWithFirst:(NSString *)first
                      second:(NSString *)second;
@end

Symbols that have representations in multiple languages can use either language's symbol spelling (although the spellings need to be consistent throughout the path). Regardless of which language's symbol spelling is used in the link, the rendered page will display the name of the symbol in the source language that the page is being displayed in. In this example a Swift class with custom Objective-C names links to its instance method using both the Swift spelling and the Objective-C spelling:

/// ``MyClass/doSomething(with:and:)``
/// ``TLAMyClass/doSomethingWithText:andNumber:``
@objc(TLAMyClass)
public class MyClass: NSObject {
    @objc(doSomethingWithText:andNumber:)
    public func doSomething(with text: String, and number: Int) -> Bool { ... }
}

If a link could ambiguously refer to more than one page, DocC needs additional disambiguation to make the link unique.
Disambiguation can be added at any path component that makes the link unique and is added to the end of that path component separated by a dash ("-"). Multiple disambiguation suffixes (and redundant disambiguation in general) are supported.

DocC currently supports two disambiguation kinds: "symbol kind" and "symbol hash" disambiguation. Two more disambiguation alternatives ("return type(s)" and "parameter types") have been pitched but are still being implemented.

If a symbol has a different type from the other symbols with the same symbol path, you can use that symbol’s type suffix to disambiguate the link and make the link refer to that symbol. Symbol kind disambiguation can include a source language identifier prefix but it's not needed.

/// ``red-property`` 
/// ``red-type.property``
public struct Color {
    public var red, green, blue: Double

    public static let red = Color(red: 1.0, green: 0.0, blue: 0.0)
}

If the colliding symbols are of the same kind the link needs to be disambiguated with a symbol hash instead. The symbol hash is a folded FNV-1 hash of the symbol's unique identifier, as already explained in the original post.
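For a rough idea of what a folded FNV-1 hash looks like, here's a minimal sketch. The 32-bit width, the XOR-fold down to 24 bits, and the base 36 output are my assumptions for illustration (they do match the length of suffixes like "9jp4d"), so don't expect this to reproduce DocC's exact hashes. The function name and the identifier in the usage line are hypothetical.

```swift
/// Illustration only: 32-bit FNV-1, XOR-folded to 24 bits, printed in base 36.
/// The bit widths and the encoding are assumptions, not taken from DocC's source.
func foldedFNV1Hash(of identifier: String) -> String {
    var hash: UInt32 = 2_166_136_261        // FNV-1 32-bit offset basis
    for byte in identifier.utf8 {
        hash = hash &* 16_777_619           // multiply by the FNV prime (wrapping)
        hash ^= UInt32(byte)                // then XOR in the next byte
    }
    // XOR-fold the top 8 bits into the low 24 bits.
    let folded = (hash >> 24) ^ (hash & 0xFF_FFFF)
    return String(folded, radix: 36)        // at most 5 characters for a 24-bit value
}

// A made-up unique identifier, just to exercise the function.
let exampleHash = foldedFNV1Hash(of: "s:8MyModule7MyClassC")
```

The useful property for links is that the result only depends on the symbol's unique identifier, so it stays stable as long as the identifier does.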

In the extremely unlikely case where a symbol needs to be disambiguated by its symbol hash and that symbol's hash disambiguation spells out one of the 4 or 5 letter symbol kind disambiguations, the parsing ambiguity can be resolved by disambiguating with both a symbol kind and symbol hash. For example, if someFunction() was overloaded and the hash for one of the overloads was "enum", that overload could unambiguously be referenced using both suffixes ("-func" because the symbol is a function and "-enum" because that's the symbol's hash):

/// ``SomeClass/someFunction()-func-enum``
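Here's a simplified sketch of how that kind-then-hash reading could work for a single path component. It's not DocC's parser. The list of kinds is deliberately incomplete and the helper names are made up.

```swift
// A small, incomplete set of symbol kind suffixes, for illustration only.
let knownKinds: Set<String> = [
    "class", "struct", "enum", "protocol", "func",
    "method", "property", "type.method", "type.property",
]

/// Assumes a hash is 1-5 ASCII letters or digits.
func looksLikeHash(_ text: String) -> Bool {
    !text.isEmpty && text.count <= 5
        && text.allSatisfy { $0.isASCII && ($0.isLetter || $0.isNumber) }
}

/// Splits one path component into its name and optional kind and hash suffixes.
func splitDisambiguation(_ component: String) -> (name: String, kind: String?, hash: String?) {
    let parts = component.split(separator: "-", omittingEmptySubsequences: false).map(String.init)
    let count = parts.count

    // "name-kind-hash": the kind comes first, so the last suffix is read as a
    // hash even if it happens to spell a kind such as "enum".
    if count >= 3, knownKinds.contains(parts[count - 2]), looksLikeHash(parts[count - 1]) {
        let name = parts[..<(count - 2)].joined(separator: "-")
        return (name, parts[count - 2], parts[count - 1])
    }
    // "name-kind" or "name-hash": a single suffix is read as a kind when possible.
    if count >= 2 {
        let name = parts[..<(count - 1)].joined(separator: "-")
        let suffix = parts[count - 1]
        if knownKinds.contains(suffix) { return (name, suffix, nil) }
        if looksLikeHash(suffix) { return (name, nil, suffix) }
    }
    // No recognizable suffix, for example an operator name that contains a dash.
    return (component, nil, nil)
}
```

With this, someFunction()-enum on its own reads "enum" as a kind, while someFunction()-func-enum reads "func" as the kind and "enum" as the hash.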

Sometimes it's preferable to disambiguate at an earlier path component to use a symbol kind disambiguation instead of a hash disambiguation. In this example both symbol links refer to the same symbol but only the first link can be understood by a human.

@protocol Something <NSObject>
- (void)something;
@end

@interface Something : NSObject<Something>
- (void)something;
@end

/// ``Something-class/something``
/// ``Something/something-4f2sm``
@interface SomeOtherSymbol : NSObject
@end

Links in DocC are made to resemble URLs for their familiarity to developers, but a link in DocC is not the same as the web URL to that page. You can link to the less-than operator in Swift using /Swift/Comparable/<(_:_:) but the path of its web URL is /documentation/swift/comparable/_(_:_:)-9jp4d. Similarly, a symbol with representations in multiple languages can be linked to using either language's spelling (for example doSomething(with:and:) or doSomethingWithText:andNumber:) but the path of that page's web URL is the same (preferring the Swift spelling in the web URL) regardless of which language's spelling the developer used in the link.


  1. Depending on the access level of symbols included in the symbol graph files this could also include extensions of other access levels, mainly internal extensions. ↩︎
