Symbol Graph Adaptions for Documenting Extensions to External Types in DocC

theMomax · April 11, 2022, 12:41pm

Hi everyone! In my recent post about DocC we worked out how we want to present extensions to external types in DocC. The proposed solution can be found here. Unfortunately, we also found out that the symbol graph files currently do not contain all the information this solution requires and further, that the current structure might not be ideal to ensure easy processing in Swift-DocC.

I am very pleased with the result from the last thread, so I hope we can have a similarly productive discussion regarding the implementation details here!

Current State

I spent the last few days looking into the symbol graph generation in the Swift compiler and the output it produces.

AST

In the AST we have easy access to all the information we need. This includes specifically

comments on extension blocks e.g.:

/// THIS COMMENT HERE
extension SomeExternalType: SomeProtocol {
    /* ... */
}

information on the extended type (e.g. its kind (swift.struct vs swift.protocol vs ...))

During the UX design phase, we weren't sure whether these two pieces of information would be easily accessible.

Symbol Graph Files

Extensions to external types are currently excluded from the module's main symbol graph. For example, where the main module is SwiftDocC, public extensions to the standard library declared in SwiftDocC are located in SwiftDocC@Swift.symbols.json. This file contains the extension members' symbols (of kind swift.property, swift.func, swift.struct, etc.) as well as relationships of kind memberOf and conformsTo.

The symbol has a swiftExtension.extendedModule field containing the extended module's name (e.g. "Swift"). This also applies to extensions to local types.

Both relationships refer to the actual type declaration in the external module, e.g. String with identifier "s:SS". For extensions to external types, the symbol for this original type declaration is never present in the same file. It might be emitted as part of another symbol graph file, but e.g. for the Swift standard library it isn't emitted at all (locally).
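To make the file naming concrete, here is a minimal sketch of how a consumer could tell a main symbol graph file from an extension symbol graph file. It only relies on the MODULE@EXTENDED.symbols.json convention described above. The function name and its return shape are hypothetical and not part of SymbolKit or Swift-DocC.

```swift
/// Hypothetical helper: splits a symbol graph file name into the declaring
/// module and, if the name contains an "@", the extended module.
/// Returns nil if the name does not end in ".symbols.json".
func modules(inSymbolGraphFileName fileName: String) -> (declaring: String, extended: String?)? {
    let suffix = ".symbols.json"
    guard fileName.hasSuffix(suffix) else { return nil }

    // "SwiftDocC@Swift.symbols.json" -> "SwiftDocC@Swift"
    let stem = String(fileName.dropLast(suffix.count))

    // Split at the first "@" only; everything after it names the extended module.
    let parts = stem.split(separator: "@", maxSplits: 1, omittingEmptySubsequences: false)
    guard let declaring = parts.first, !declaring.isEmpty else { return nil }

    let extended = parts.count == 2 ? String(parts[1]) : nil
    return (String(declaring), extended)
}
```

For SwiftDocC@Swift.symbols.json this yields the declaring module SwiftDocC and the extended module Swift, while SwiftDocC.symbols.json has no extended module.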

Proposed Solution

My general idea is to adapt the symbol graph's structure in a way that best fits the structure in the UX design so that meddling with the graph in Swift-DocC is avoided. Meanwhile, the symbol graph should continue to transport raw data only. Specifically, I wouldn't want to bring any of the heuristics mentioned in the UX pitch discussion into the Swift compiler.

General Structure

The basis for my proposed solution is @ethankuster's reply on the UX thread. You can find a diagram of my proposed graph structure below:

We introduce an alternative "extension" version for all type symbol kinds (struct, class, enum, protocol), e.g. next to swift.struct we also introduce swift.struct.extension. There exists one such symbol (of the respective kind) in the graph per extended external type.

The existing memberOf and conformsTo relationships (for extended external types) also end/start in these new symbols instead of referring to the original type-declaration's symbol.

Furthermore a new symbol swift.module.extension is introduced. This symbol represents the external module that is publicly extended. There exists one such symbol in the graph per imported module where at least one type is publicly extended.

The swift.TYPE_KIND.extension symbols are connected to the swift.module.extension symbols via a new memberOf relationship. (We could also use a different name here if we want to reserve the term "member" for type-members.)

Both these types of symbols carry little information (e.g. no comments) as they have no direct representation in the source code. The first type of symbols basically represents all usages of extension SomeExternalType whereas the second type of symbol represents all usages of the respective import SomeModule. I emphasize all, because there can be multiple occurrences of both in the source code! Nevertheless, we need these aggregate-symbols in order to achieve the structure that feels natural in DocC as pointed out in the UX discussion.
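The structure described above can be sketched with a few plain value types. This is only an illustration of the shape of the graph, not SymbolKit's API: the Symbol, Relationship and ExtensionMember types are simplified stand-ins, and the identifiers of the aggregate symbols are placeholders, since their precise identifiers are still an open question.

```swift
/// Simplified stand-ins for symbol graph nodes and edges (not SymbolKit types).
struct Symbol: Hashable {
    let id: String
    let kind: String
}

struct Relationship: Hashable {
    let source: String
    let target: String
    let kind: String
}

/// Input for the sketch: a member declared in an extension to an external type.
struct ExtensionMember {
    let id: String               // e.g. the member's USR
    let kind: String             // e.g. "swift.func"
    let extendedType: String     // e.g. "String"
    let extendedTypeKind: String // e.g. "swift.struct"
    let extendedModule: String   // e.g. "Swift"
}

func proposedGraph(for members: [ExtensionMember]) -> (symbols: Set<Symbol>, relationships: Set<Relationship>) {
    var symbols = Set<Symbol>()
    var relationships = Set<Relationship>()

    for member in members {
        // Placeholder identifiers, chosen only for this sketch.
        let moduleID = "module-extension:\(member.extendedModule)"
        let typeID = "type-extension:\(member.extendedModule).\(member.extendedType)"

        symbols.insert(Symbol(id: member.id, kind: member.kind))
        // One aggregate symbol per extended external type ...
        symbols.insert(Symbol(id: typeID, kind: member.extendedTypeKind + ".extension"))
        // ... and one per extended module. Sets deduplicate repeated inserts.
        symbols.insert(Symbol(id: moduleID, kind: "swift.module.extension"))

        // member -> extended type -> extended module
        relationships.insert(Relationship(source: member.id, target: typeID, kind: "memberOf"))
        relationships.insert(Relationship(source: typeID, target: moduleID, kind: "memberOf"))
    }
    return (symbols, relationships)
}
```

Because the aggregate symbols are keyed by extended type and extended module, several extension blocks on the same type collapse into a single swift.TYPE_KIND.extension symbol, which is the property the UX design relies on.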

Adding Extension Block Comments

If we decide that we want to utilize comments above extension blocks to automatically generate more meaningful documentation on the extended type's documentation pages in Swift-DocC later on, I propose the following solution:

A new symbol of kind swift.extension is introduced. There exists one such symbol per extension block with at least one public member. This symbol carries information such as the comments above the respective extension block.

Additional "memberOf" and "conformsTo" relationships are introduced in accordance with the members/conformance declared in the respective block. The relationships to the swift.TYPE_KIND.extension symbols remain intact!

Finally, new contributesTo (naming totally up for discussion) relationships are added between the swift.extension symbols and the swift.TYPE_KIND.extension symbol that correspond to the same extended external type.

These symbols and relationships would be inspected and transformed into synthesized additions to other symbols based on various heuristics in Swift-DocC, and finally removed from the internal symbol graph representation. The prime example for this would be aggregating comments from above extension blocks and adding them to the respective swift.TYPE_KIND.extension symbol.
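As a rough sketch of the prime example mentioned above, the following code collects the comments of all swift.extension symbols that contribute to the same swift.TYPE_KIND.extension symbol. The ExtensionBlock type is a hypothetical stand-in, and simply concatenating the comments in input order is an assumption made for illustration; the actual heuristics are not decided here.

```swift
/// Hypothetical stand-in for a swift.extension symbol.
struct ExtensionBlock {
    let usr: String
    /// Identifier of the swift.TYPE_KIND.extension symbol this block contributes to.
    let extendedTypeID: String
    /// The documentation comment above the extension block, if any.
    let docComment: String?
}

/// Groups extension block comments by the extended type symbol they contribute to
/// and joins them with a blank line, preserving the order of `blocks`.
func aggregatedComments(from blocks: [ExtensionBlock]) -> [String: String] {
    var commentsByType: [String: [String]] = [:]
    for block in blocks {
        // Blocks without a (non-empty) comment contribute nothing.
        guard let comment = block.docComment, !comment.isEmpty else { continue }
        commentsByType[block.extendedTypeID, default: []].append(comment)
    }
    return commentsByType.mapValues { $0.joined(separator: "\n\n") }
}
```

After this step the swift.extension symbols and their contributesTo relationships could be dropped from the internal representation, as described above.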

Unifying the Graphs

The existence of the swift.module.extension symbols would also allow us to move all these symbols into the main symbol graph, e.g. they are emitted in the main SwiftDocC.symbols.json file, not SwiftDocC@Swift.symbols.json. This should also simplify processing the data in Swift-DocC.

Note that all these changes would be hidden behind a flag in the Swift compiler, so that this changed behavior is opt-in for other projects depending on symbol graphs!

Alternatives Considered

Synthesizing Artificial Symbols in Swift-DocC

One option would be synthesizing the swift.TYPE_KIND.extension and swift.module.extension symbols in Swift-DocC. This would make sense as they do not really exist in code and do not really transport any information/metadata beyond the structure they provide.

However, this structure is required in DocC in order to achieve the desired outcome from a UX perspective and therefore the symbols would have to be synthesized after importing the symbol files in Swift-DocC.

While I cannot fully grasp the architectural consequences this would have inside Swift-DocC yet, I think the biggest problem with this approach is the following:

At some point we have imported the symbol graph file in Swift-DocC. At this point the symbol graph is described using SymbolKit and it would be relatively easy to synthesize the missing nodes and relationships. However, SymbolKit is The specification and reference model for the Symbol Graph File Format. Therefore it would be odd to add symbols to SymbolKit that do not exist in the symbol graph files emitted by the Swift compiler. Furthermore, the most important type, Symbol, is a struct, so extending the library (in terms of altering the graph's structure) from within Swift-DocC should prove to be difficult.

Synthesizing the pages later on in Swift-DocC is quite difficult I think. This would involve re-writing large parts of the conversion logic as there would be many new rules to follow and tricky situations that just couldn't happen before, e.g. relationships where one end doesn't exist.

Do Not Unify Symbol Graph Files

An option to keep the changes to the symbol graph files rather low would be to keep extensions to external types in separate files. That would mean we'd have no swift.module.extension symbols as this information can be deduced from what file contains a symbol.

Just synthesizing the top level documentation pages for the separate symbol graphs could be manageable inside Swift-DocC, however I'm not sure if the effort is justified. After all, the symbol graph files would still contain new symbols and relationships, which I guess is considered a breaking change. And if we already break it, we can also break it in a way that really fits our needs.

Discussion Points

Please keep in mind that I'm still rather new to this entire code-base, so I'm really dependent on your expert opinions for these large architectural decisions!

Of course, any feedback is welcome! I also want to bring up three questions to you all right at the beginning:

Firstly, what would the precise identifier for all the X.extension symbols be? How does the Swift compiler currently generate them? Is there some documentation I could read?

Secondly, should we add an "extensionOf" relationship between swift.TYPE_KIND.extension and the respective original declaration symbols (even though this original declaration is not part of the same symbol graph)? I think this could cause problems in the conversion in Swift-DocC again. However, it also seems right to have a reference to the original type declaration, considering the discussion in the UX thread about [SR-15431] Support DocC references to symbols defined in another module (Issue #208, apple/swift-docc). If I have to mess with the symbol graph file format once, I guess it would be best to plan ahead a bit.

Finally, for my broader understanding: What role does Swift-DocC play in the universe of Swift symbol graph files? Is it the main reason they exist or just one of many consumers of this interface? What would us introducing a new symbol graph file format mean for the old format? Would both versions have to be maintained next to each other indefinitely, or would the old one phase out after a short(ish) period of time?

daniel-grumberg · April 12, 2022, 11:02am

Love the proposal and thank you for doing this work!

I am a little confused by the point above: why would we need the aggregate symbol for each extended imported module? Could the source module information not be encoded in the extension symbol directly via a source module field? Unless we need to know more than the name of the extended module, I don't see the need to have a brand new symbol for it. I also think that the alternative where we do one symbol per extension block might be easier to implement in the compiler. Merging these should be straightforward in DocC, and it would make the symbol graph a closer match with the original source code. This would be useful, as you pointed out, for extension block documentation comments, and if Swift adds extension block level features. One example I can think of is the extension block level default access control specifier, but off the top of my head I think it is reflected in the underlying symbols. I don't think there is anything else like that in the pipeline as far as I know (I also don't know if it would make sense from a language design perspective for Swift to introduce these kinds of features).

theMomax · April 12, 2022, 12:23pm

Hi Daniel, thanks for your feedback!

You are totally right, the information carried by my proposed swift.TYPE_KIND.extension and swift.module.extension symbols already exists in the current symbol graph file format, i.e. it could be reconstructed inside Swift-DocC from e.g. pathComponents or the swiftExtension.extendedModule fields. I already discussed the problem I see with this approach in the alternatives considered section:

theMomax:

At some point we have imported the symbol graph file in Swift-DocC. At this point the symbol graph is described using SymbolKit and it would be relatively easy to synthesize the missing nodes and relationships. However, SymbolKit is The specification and reference model for the Symbol Graph File Format. Therefore it would be odd to add symbols to SymbolKit that do not exist in the symbol graph files emitted by the Swift compiler. Furthermore, the most important type, Symbol, is a struct, so extending the library (in terms of altering the graph's structure) from within Swift-DocC should prove to be difficult.

Synthesizing the pages later on in Swift-DocC is quite difficult I think. This would involve re-writing large parts of the conversion logic as there would be many new rules to follow and tricky situations that just couldn't happen before, e.g. relationships where one end doesn't exist.

At least based on the repository's URL, SymbolKit is kind of owned by Swift-DocC, so maybe redefining SymbolKit as a model that is a superset of the model in the symbol graph files would be an option, though!

I agree that independently of whether or not we add the swift.TYPE_KIND.extension and swift.module.extension symbols, we should definitely add a symbol for each extension block ( swift.extension in my original post) so we can actually transport all the information we need. In case we don't add the swift.TYPE_KIND.extension symbol, this symbol would also have to carry the type kind information!

daniel-grumberg · April 12, 2022, 1:44pm

CC @Franklin as I am new myself to DocC and he knows more than I do.

I am opposed to this as SymbolKit is the reference for SGF processing. If this was really needed we could create the types and extensions in DocC itself though. However, I really don't think we would need to synthesize special nodes, we could merge them when generating the DocumentationNode. However, upon second thought, it might make for less friction to do this at the SGF layer.

franklin · April 13, 2022, 9:05am

Hi @theMomax! This is looking really great.

I'm hugely in favor of modeling the extended types in the main symbol graph as opt-in behavior. The existing model of emitting extended module APIs in separate files is suited for displaying extension information in their extended module's documentation, but not in the defining module's documentation. For example, you'd include SwiftDocC@Swift.symbols.json when building documentation for Swift. I find your proposal much better UX-wise, so it makes sense to me to modify the symbol graph format to cater for that. Modeling the extended types in the main symbol graph allows documentation compilers to continue just representing whatever API hierarchy the symbol graph describes. With this architecture, I believe that the work on the Swift-DocC side would be quite minimal (just a matter of understanding these new symbol kinds and relationships, I believe).

The new symbol kinds and relationships make sense to me. Just a few notes:

An extended type (swift.TYPE_KIND.extension) can conform to multiple protocols I believe, so this should be a many-to-many relationship, if I understand the diagram correctly.
I'm not quite sure about reusing the memberOf relationship to model extended type -> extended module relationships. I don't have a better naming suggestion off-hand though, so I would like to hear others' thoughts as well.
Regarding the swift.extension symbol type: having the module being modeled as both a member of the swift.TYPE_KIND.extension and swift.extension is a bit odd to me. As you mentioned, this would require special processing on the Swift-DocC side to aggregate the doc comments, and we'd lose the 1 symbol = 1 documentation page that the DocC model currently has. I'm wondering if emitting swift.extension symbols should be done in a separate "mode" entirely (--emit-extension-symbols or alike), in which the symbol -> swift.TYPE_KIND.extension relationship wouldn't exist. If the goal here is to aggregate documentation comments, we could also take your original model without swift.extension, and add a new array field in swift.TYPE_KIND.extension that contains all the (unordered) doc comments and source locations associated with all the extensions of this type in the module. However, from an authoring UX perspective, I'm not quite sure how that would work; you'd want to control the ordering of these. Any ideas? I'm also not opposed to scoping this out of this proposal.

Could you please attach an example symbol graph with your proposed additions, to make sure everyone is on the same page?

This is a great question. I don't think we should use the same precise identifier (aka USR) as the extended type (i.e., s:SS) because consumers would see that type itself as being defined in the module. For example, Swift-DocC does USR-based link resolution as part of its compilation process, and as it stands, would consider s:SS to be a symbol defined in the module, since its symbol graph contains a symbol with that USR. Swift-DocC would need to special-case symbols with kind swift.*.extension, but I think a better approach would be to have a separate USR for them entirely. It would be useful to record the relationship between the symbol and its extended symbol as well though, as you mention further down. This will help with inter-module linking in the future. I'm not quite sure whether the Swift compiler synthesizes a different USR for extensions (maybe @QuietMisdreavus would know). You could place a breakpoint in SymbolGraphGen and inspect the declarations that are being processed.

Yes, I think we should do that. It's not required in the symbol graph model for the target of a relationship to be defined in the same symbol graph (i.e., it doesn't have to be in the symbols array). These relationships get a targetFallback property that textually describes the target. For example:

{
    "kind": "conformsTo",
    "source": "c:objc(cs)Bar",
    "target": "s:SH",
    "targetFallback": "Swift.Hashable"
}

I think tracking this kind of relationship for symbol extension -> symbol makes sense as well, and it will be useful when performing inter-module link resolution, because DocC would be able to look up s:SH in another DocC archive. It might not make sense to display this relationship on the documentation page of the extension though. The info seems a bit redundant, at least until we can make the target symbol an actual link.
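To illustrate how a consumer might use targetFallback, here is a small sketch that decodes the relationship shown above and picks a display name for its target. The Relationship struct is a simplified stand-in rather than SymbolKit's actual type, and displayName(forTargetOf:localTitles:) is a hypothetical helper.

```swift
import Foundation

/// Simplified stand-in for a symbol graph relationship (not the SymbolKit type).
struct Relationship: Decodable {
    let kind: String
    let source: String
    let target: String
    /// Textual description of the target; useful when the target symbol
    /// is not part of the same symbol graph.
    let targetFallback: String?
}

let json = """
{
    "kind": "conformsTo",
    "source": "c:objc(cs)Bar",
    "target": "s:SH",
    "targetFallback": "Swift.Hashable"
}
"""

let relationship = try JSONDecoder().decode(Relationship.self, from: Data(json.utf8))

/// Prefers the title of a locally known symbol, then the fallback text,
/// and finally the raw target identifier.
func displayName(forTargetOf relationship: Relationship, localTitles: [String: String]) -> String {
    localTitles[relationship.target] ?? relationship.targetFallback ?? relationship.target
}
```

Since s:SH is not defined in the local graph, the lookup falls through to the fallback text Swift.Hashable.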

The symbol graph format is maintained independently from Swift-DocC as it is used by other documentation compilers as well. It's important to minimize breakages as much as possible. That being said, the symbol graph format has not reached stability yet (Swift's symbol graph format is currently at 0.5.3, see https://github.com/apple/swift/blob/main/lib/SymbolGraphGen/FormatVersion.h), but we should still aim to reduce breakages. It's also worth noting that the model aims to be generic enough for compilers of other languages to also be able to emit symbol graphs (e.g., there is an effort underway for clang to emit symbol graph files for Objective-C), so there needs to be some alignment so that documentation compilers can consume the same format.

theMomax · April 15, 2022, 2:08pm

Thank you for your valuable feedback @franklin!

You are absolutely right, I think I messed that up while reorganizing the drawing.

I agree that this solution doesn't feel perfect with the redundant memberOf relationship between <<Member Symbol>> and <<swift.TYPE_KIND.extension>>. Yes, it would require special processing on Swift-DocC's side, but I think this processing would be rather easy to implement as we are only removing information. Nevertheless, I'm definitely open to other suggestions.

I don't think that would be useful for what we are trying to achieve right now. After all, we don't want to require two compiler runs to build a single documentation catalogue.

Depending on what capabilities we are aiming for, this could be a very good solution! I think with this approach we would commit to the following two restrictions:

The documentation does not know which extension block declares a specific member or if two members belong to the same documentation block.
As you already indicated, extension block comments cannot be referenced for curation. This would basically result in a "take it or leave it" situation. If no comment for the swift.TYPE_KIND.extension page is provided via manual curation, we try to synthesize one from the extension block comments. Otherwise, we just use the provided comment as is. There is no syntax to manually include extension block comments.

If we are willing to accept these (I am), this solution works perfectly fine and should be really straightforward to process in Swift-DocC.

As a JSON file? Yes, sure I'll try to provide one as soon as I find time.

That's also my intuition. I hope I can research the current structure of USRs more extensively in the next weeks.

Thanks for the info, I wasn't aware of that. In that case I think we should definitely include it in the symbol graph files, even though we won't show it in the UI until we can link to other modules.

That's good to know, thank you! In that case I guess we'll have to maintain both symbol graph formats for quite some time...

theMomax · April 22, 2022, 9:25am

I finally got around to investigating the USRs for extension blocks and import statements.

Extension Blocks

Let's start with extension blocks. We need this USR if we want to include swift.extension symbols, but a common prefix of this USR should also be used for the swift.TYPE_KIND.extension symbols.

You can find the relevant USR generation function here: swift/USRGeneration.cpp at ee7446f24341b42da86cdf66d60e1d49b931a678 · apple/swift · GitHub

The extension block USR follows the following pattern:

If the extension block defines at least one member, we just prefix the first member's USR with s:e: as follows:

s:e:<<USR_FIRST_EXTENSION_BLOCK_MEMBER>>

If not, and the extension block defines at least one conformance, we have the s:e: prefix followed by the extended type's USR and the USR of the protocol we conform to:

s:e:<<USR_EXTENDED_TYPE>><<USR_PROTOCOL_EXTENDED_TYPE_CONFORMS_TO>>

A few examples:

// USR: s:e:s:SS9SwiftDocCE15myFuncExtensionyyF
public extension String {
    // USR: s:SS9SwiftDocCE15myFuncExtensionyyF
    func myFuncExtension() {
        
    }
    // USR: s:SS9SwiftDocCE19myPropertyExtensionSbvp
    var myPropertyExtension: Bool {
        true
    }
}

// USR: s:9SwiftDocC17MyAwesomeProtocolP
public protocol MyAwesomeProtocol { }

// USR: s:e:s:s5Int64Vs:9SwiftDocC17MyAwesomeProtocolP
extension Int64: MyAwesomeProtocol {

}

// USR: s:9SwiftDocC17MyBwesomeProtocolP
public protocol MyBwesomeProtocol { }

// USR: s:e:s:s5Int64V9SwiftDocCE02myA4PropSbvp
extension Int64: MyBwesomeProtocol {
    // USR: s:s5Int64V9SwiftDocCE02myA4PropSbvp
    public var myInt64Prop: Bool {
        false
    }
}

// USR: s:e:s:SS9SwiftDocCE4BLUBV
public extension String {
    // USR: s:SS9SwiftDocCE4BLUBV
    struct BLUB { }
}

Therefore I'd suggest using the common prefix of all these USRs as the precise identifier for symbols of kind swift.TYPE_KIND.extension:

s:e:<<USR_EXTENDED_TYPE>>

Note that the <<USR_FIRST_EXTENSION_BLOCK_MEMBER>> always begins with <<USR_EXTENDED_TYPE>>, as the member is defined on this extended type.

The respective identifiers for the swift.struct.extension symbols for Int64 and String would be the following:

s:e:s:s5Int64V
s:e:s:SS

This should be unique within the defining module.
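The suggested identifier scheme can be sketched in a few lines. Both function names are hypothetical. The second function assumes the caller already knows the USRs of the extended types; matching by longest prefix is only a heuristic for this illustration, not something the compiler does.

```swift
/// Builds the proposed identifier for a swift.TYPE_KIND.extension symbol
/// from the USR of the extended type, e.g. "s:SS" -> "s:e:s:SS".
func proposedTypeExtensionID(forExtendedTypeUSR usr: String) -> String {
    "s:e:" + usr
}

/// Finds the proposed swift.TYPE_KIND.extension identifier that an extension
/// block USR belongs to. If several candidates match, the longest one wins.
func matchingTypeExtensionID(forBlockUSR blockUSR: String, knownExtendedTypeUSRs: [String]) -> String? {
    knownExtendedTypeUSRs
        .map(proposedTypeExtensionID(forExtendedTypeUSR:))
        .filter { blockUSR.hasPrefix($0) }
        .max(by: { $0.count < $1.count })
}
```

With the examples above, the block USR s:e:s:SS9SwiftDocCE4BLUBV maps to s:e:s:SS, and both Int64 extension blocks map to s:e:s:s5Int64V, regardless of whether the block declares members or only a conformance.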

Import Statements

I think the closest thing we have to the swift.module.extension symbol in the source code are import statements. Unfortunately, they do not have a USR.

That said, I think we can just use an even shorter prefix by cutting the USR after the module identifier.

Examples would be:

Foundation: s:e:s:10Foundation
Markdown: s:e:s:8Markdown
Swift Standard Library (skips the module identifier): s:e:s:

I could also imagine dropping the s:e: prefix or replacing it with s:m: (for module) for the swift.module.extension symbol's identifier.

Which option would you prefer, or does anyone have any entirely different suggestions?

franklin · April 26, 2022, 10:53am

Ah, that's interesting. I did not expect each extension block to have a separate USR. Thanks for doing that research!

I don't think we should be synthesizing USRs in SymbolGraphGen; we should be using whatever USR the Swift compiler emits. Otherwise the symbol graph would contain references to symbols that effectively don't exist from the Swift compiler's perspective. Preserving the notion that the symbol graph is a representation of whatever APIs the module contains is important design-wise. If we generate identifiers, they shouldn't be treated as the USR of a symbol.

@Xi_Ge it looks like you're the original author of printExtensionUSR; do you know if there is a USR that is unique across all extensions to the same symbol? E.g., a common extension USR for Swift.String in:

public extension String { func foo() {} }
public extension String { func bar() {} }

If so, we could then just relate foo() and bar() to that USR in the symbol graph.

If there is no such USR, then I guess this would mean including each extension block as a separate declaration in the symbol graph with the USR that the Swift compiler assigns it. We'd also need a way of linking these back together (e.g., via a common identifier) on the Swift-DocC side (maybe via a SymbolKit API) so that they are indeed seen as a single symbol. I'd also be interested in hearing @QuietMisdreavus's thoughts on this.

theMomax · May 11, 2022, 9:53am

Since there wasn't any input on the other options, I went ahead exploring this strategy.
I successfully altered the SymbolGraphGen module to emit "swift.extension" symbols for each extension block. These carry all the information we need. There's still some work to be done, but I don't see any blockers there.

I agree. Essentially, we want SymbolKit to remain the reference model for the Symbol Graph File Format, while also being extensible enough so we can use it in Swift-DocC to handle our slightly modified symbol graph format. (We don't want to introduce an extra graph library for our new format for performance and maintenance reasons.)

Unfortunately, that might require some breaking changes in SymbolKit. Currently, Symbol.KindIdentifier is an enum, which means we cannot simply extend it in Swift-DocC to add new cases (swift.module.extension, swift.struct.extension, ...).

I think the prettiest (but also most intrusive) solution would be to adopt the more extensible struct-with-static-constants pattern in Symbol.KindIdentifier as a replacement for the enum, just as it was done for Relationship.Kind. We could then easily add our new "cases" in Swift-DocC, but also lose some of the safety features enums have in combination with switch statements, etc.
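For readers unfamiliar with the struct-with-static-constants pattern, here is a minimal sketch. The KindIdentifier type below is a simplified stand-in written for this post, not SymbolKit's actual Symbol.KindIdentifier, and the constant names are assumptions.

```swift
/// Simplified stand-in for an open set of kind identifiers.
struct KindIdentifier: Hashable, RawRepresentable {
    let rawValue: String
    init(rawValue: String) { self.rawValue = rawValue }

    // "Cases" shipped by the library itself.
    static let `struct` = KindIdentifier(rawValue: "swift.struct")
    static let `enum` = KindIdentifier(rawValue: "swift.enum")
}

// A client (e.g. a documentation compiler) can add its own "cases",
// which is not possible with an enum.
extension KindIdentifier {
    static let structExtension = KindIdentifier(rawValue: "swift.struct.extension")
    static let moduleExtension = KindIdentifier(rawValue: "swift.module.extension")
}

func describe(_ kind: KindIdentifier) -> String {
    // The compiler can no longer check exhaustiveness, so a default case is required.
    switch kind {
    case .struct: return "structure"
    case .structExtension: return "extended structure"
    default: return "other (\(kind.rawValue))"
    }
}
```

The switch still reads like an enum switch, but the mandatory default case is exactly the loss of safety mentioned above: a newly added identifier silently lands in default instead of causing a compile error.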

An alternative (less intrusive) solution is based on the Symbol.KindIdentifier.unknown case. We would use that case when dealing with our custom symbols in Swift-DocC and carry the rest of the information either in Symbol.Kind.displayName or a new optional property.

What do you think?

franklin · May 12, 2022, 10:09am

Making Symbol.KindIdentifier a struct with static properties sounds great to me, it makes the API much more flexible. The SymbolKit API is still evolving, so even though we should prefer not breaking public API, I'd argue that this is an acceptable change because kind identifiers shouldn't be a closed set anyway. This also provides the flexibility for clients to add their own kind identifiers. We'll just want to make sure to leave the PR up for a bit so that the community has time to add a default case to switch statements and provide feedback. There might be some changes to do in Swift-DocC as well before landing the SymbolKit changes to ensure that the Swift CI toolchain builds still succeed.

theMomax · May 24, 2022, 1:47pm

I just opened a draft PR based on that strategy: https://github.com/apple/swift/pull/59047

I described the changes there. Please let me know what you think!

There are three points I'd be especially happy to hear your opinions on:

I currently only changed the behavior for extensions to external types, as we didn't really need any more information for the local case. However, this also implies that we still cannot access documentation comments on top of extensions to internal types. During the pitch discussion, I had the feeling that the community neither wanted to embrace this spot for documentation, nor entirely prohibit its use. My thought process was the following: If we allow this spot to contain documentation, people will use it and we have to somehow process it and find a suitable spot in the documentation catalog or append it to some other documentation string. The result will probably always be worse than if we just forced them to write one comprehensive piece of documentation above the original type declaration.
I did not unify the Symbol Graph Files. We have to transform the graph later anyway, so merging them before doesn't really make a difference. Furthermore, this way programs that only look at the main Symbol Graph Files (without @) wouldn't even break if we were to make the new behavior the default.
This is also the last point: should we make the new behavior the default and basically invert the flag to say -omit-extension-block-symbols (omit)?

Note that all three points should be easy to adapt, so changing them would be of little effort.

Here are also some example symbol graph files I generated from the SwiftDocC module: download zip archive

I added the following code to the project which should make it easy to observe the changes I made:

/// USR: s:e:s:SS9SwiftDocCE4BLUBV
public extension String {
    /// USR: s:SS9SwiftDocCE4BLUBV
    struct BLUB { }
}

/// USR: s:e:s:SS9SwiftDocCE15myFuncExtensionyyF
public extension String {
    /// USR: s:SS9SwiftDocCE15myFuncExtensionyyF
    func myFuncExtension() {
        
    }
    /// USR: s:SS9SwiftDocCE19myPropertyExtensionSbvp
    var myPropertyExtension: Bool {
        true
    }
}

/// USR: s:9SwiftDocC17MyAwesomeProtocolP
public protocol MyAwesomeProtocol { }

/// USR: s:e:s:s5Int64Vs:9SwiftDocC17MyAwesomeProtocolP
extension Int64: MyAwesomeProtocol {

}

/// USR: s:9SwiftDocC17MyBwesomeProtocolP
public protocol MyBwesomeProtocol { }

/// USR: s:e:s:s5Int64V9SwiftDocCE02myA4PropSbvp
extension Int64: MyBwesomeProtocol {
    /// USR: s:s5Int64V9SwiftDocCE02myA4PropSbvp
    public var myInt64Prop: Bool {
        false
    }
}

/// USR: s:e:s:Sa9SwiftDocCSQRzlE11myArrayPropSbvp
extension Array: MyAwesomeProtocol where Element: Equatable {
    /// USR: s:Sa9SwiftDocCSQRzlE11myArrayPropSbvp
    public var myArrayProp: Bool {
        false
    }
}

I can also generate the symbol graph files for another project if you prefer.

franklin · June 14, 2022, 10:25am

Thank you, Max! Sorry for the late response here.

theMomax:

I currently only changed the behavior for extensions to external types, as we didn't really need any more information for the local case. However, this also implies that we still cannot access documentation comments on top of extensions to internal types. During the pitch discussion, I had the feeling that the community neither wanted to embrace this spot for documentation, nor entirely prohibit its use. My thought process was the following: If we allow this spot to contain documentation, people will use it and we have to somehow process it and find a suitable spot in the documentation catalog or append it to some other documentation string. The result will probably always be worse than if we just forced them to write one comprehensive piece of documentation above the original type declaration.

This sounds good to me. If we ever wanted to include extensions to local types in the future, the design you're proposing would allow for that quite nicely.

I'm in favour of making this the default behaviour, as long as clients like Swift-DocC won't break when the Swift compiler changes get merged. Otherwise, we should not make it the default for now to leave some time for clients to add support, then we can make it the default.

Thanks for attaching the symbol graph files, it's greatly appreciated to understand the changes! I'm very excited by your progress here.

theMomax · June 15, 2022, 9:20am

Hi Franklin, no worries at all...I found plenty of other work up the stack I could do in the meantime!

I'm glad you agree with my overall design decisions!

In regards to what should be the default behavior:

"Like Swift-DocC" is difficult to define. Tools that - like Swift-DocC - only parse the main SGFs (i.e. the ones without @) won't break as the only change there is an additional property on the swiftExtension mixin. However, an additional unknown property in JSON should always be ignored.

Tools that do parse extension SGFs (i.e. the ones with @) will probably break! They will encounter symbols and relationships of unknown kind (swift.extension and extensionTo), as well as relationships the source of which is a symbol of unknown kind (conformsTo and extensionTo).
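To show what handling the unknown kinds gracefully could look like for such a tool, here is a small sketch that decodes symbols and skips the ones it does not understand instead of failing. The JSON shape is deliberately simplified and is not the real symbol graph layout; KnownKind and SymbolStub are hypothetical types.

```swift
import Foundation

/// The kinds this hypothetical tool knows how to render.
enum KnownKind: String {
    case function = "swift.func"
    case property = "swift.property"
    case structure = "swift.struct"
}

/// Simplified symbol record: the kind is kept as a raw string so that
/// decoding never fails just because a new kind shows up.
struct SymbolStub: Decodable {
    let identifier: String
    let kind: String

    var knownKind: KnownKind? { KnownKind(rawValue: kind) }
}

let json = """
[
    {"identifier": "s:SS9SwiftDocCE15myFuncExtensionyyF", "kind": "swift.func"},
    {"identifier": "s:e:s:SS9SwiftDocCE15myFuncExtensionyyF", "kind": "swift.extension"}
]
"""

let symbols = try JSONDecoder().decode([SymbolStub].self, from: Data(json.utf8))

// Render what we understand, and remember what we skipped for diagnostics.
let supported = symbols.filter { $0.knownKind != nil }
let skipped = symbols.filter { $0.knownKind == nil }
```

A tool written this way would ignore the new swift.extension symbols until it adds support for them. It would still have to drop relationships whose source or target was skipped.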

I tried it with DocC (current main + earliest open source commit), and both versions ran successfully. Jazzy, which is the only other consumer of symbol graph files I know of, will break, though!

jazzy/symbol_graph/symbol.rb:119:in `init_kind': Unknown symbol kind 'swift.extension' (RuntimeError)

However, they also flag the feature as somewhat experimental in their readme:

Docs from .swiftmodules or frameworks

This feature is new and relies on a new Swift feature: there may be crashes and mistakes: reports welcome.

Swift 5.3 adds support for symbol graph generation from .swiftmodule files. This looks to be part of Apple's toolchain for generating their online docs.

If Jazzy really is the only other tool using SGFs, another option could also be to make the new behavior the default and give their maintainers a friendly heads-up once the PR has been merged. I reckon there'll still be some time left until the next Swift release then.

franklin · June 15, 2022, 9:30am

Swift-DocC does parse @ extension files. If you compile documentation with A.symbols.json, A@B.symbols.json, and B.symbols.json, DocC will pick up the @ symbol graph file. However, you're right that this behavior isn't possible when building via Xcode or SwiftPM. We should still make sure that the workflow continues to work for other integrations of docc, though. Given that, I think we should make the behavior opt-in initially, and after some time (a few months, say), make it the default, as you're proposing. And yes, since these changes aren't for Swift 5.7 but the next release, we have some time :)

theMomax · June 15, 2022, 9:34am

I didn't know that; thank you for clarifying!

It's decided then

johnfairh · June 15, 2022, 9:46am

Don't worry about any Jazzy breakage, I'll sort it out as you land your stuff.

We should probably delete that warning, it really does date from Swift 5.3 time...

theMomax · June 15, 2022, 10:00am

Wow @johnfairh, that was quick!

Thanks for your offer, but I think we'll still make it opt-in first, just to potentially avoid some disruption in the development of docc. I'll then change the default behavior in the Swift compiler once I have made docc compatible with the new format.

taylorswift · June 15, 2022, 5:25pm

swift-biome relies on the @ filenames to assign namespaces to symbols, please make this opt-in!

theMomax · June 15, 2022, 5:34pm

Thanks for the heads-up. As already stated, it’ll be opt-in first. I’ll make sure to notify everyone once the new behavior is fixed and you can start adopting it.