What is the purpose of synthesized symbols in a symbolgraph?

is there ever a situation where a type’s list of inherited symbols could not be computed dynamically from its list of protocol/superclass conformances?

The symbol graph is meant to represent the entire API surface for a module; the intention is to avoid replicating the rules and logic of any particular programming language in tools that consume the data. For example, the initializer inheritance rules for Swift are unique to the language and fairly complex. Additionally, tools that require globally-consistent identifiers (for example, source editor-based requests for Quick Help via USR) would need to reproduce the rules for constructing symbol identifiers for synthesized symbols. Finally, to compute all synthesized symbols you would need access to the symbol graphs for all dependent modules.

So the answer to your question is: technically no, but it requires access to all the symbol graphs and a desire to reproduce compiler/language-specific logic.

That said, I think there are ways to chip away at the problem I suspect you're alluding to; the symbol graph balloons in size significantly when synthesized member emission is enabled and most of the data is duplicated. There's probably a more compact representation, but I'm not sure how significant the improvement would be.


unfortunately, serving documentation for synthetics is an inherently dynamic problem, and a static solution like emitting synthesized symbols into a symbol graph does not actually solve it.

i spent a lot of time trying to wrap my head around this concept, and i think i’ve finally zeroed in on the error in thinking that was causing me so many headaches with inherited members, which i’ll summarize below:

  • false statement: synthetic symbols are part of a module’s symbolgraph, and every module comes with a set of natural symbols and synthetic symbols. the API of a collection of modules (e.g., a package) is the union of the symbolgraphs of its constituent modules.

  • true statement: synthetic symbols are not part of any symbolgraph; rather, they arise through interactions between arbitrary subsets of modules. the API of a collection of modules (e.g., a package) cannot be computed in advance without considering every possible subset of modules in that collection.

to walk through a concrete example, suppose we have four single-module packages with a “Z”-shaped dependency graph:

legend: swift-x <- swift-y ::= “swift-y depends on swift-x”
swift-foo <- swift-baz 
          ↙            
swift-bar <- swift-qux
  • swift-foo declares enum FooType
  • swift-bar declares protocol Barable
  • swift-baz conforms FooType to Barable
  • swift-qux extends Barable with Barable.qux(_:)
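
in source form, the four declarations above might look roughly like the sketch below. this is only an illustration: the modules are collapsed into one file so that it compiles on its own, the name BarModule, the enum case, and the signature and body of qux(_:) are all assumptions made up for the example, and the last line stands in for a hypothetical client that imports both BazModule and QuxModule.

```swift
// Each section would normally live in its own module; they are collapsed
// into a single file here so the sketch is self-contained.

// swift-foo (FooModule)
public enum FooType {
    case a // hypothetical case, only so a value can be constructed
}

// swift-bar (BarModule, name assumed)
public protocol Barable {}

// swift-baz (BazModule): would import FooModule and BarModule
extension FooType: Barable {}

// swift-qux (QuxModule): would import BarModule only, and knows nothing
// about FooType
extension Barable {
    // hypothetical signature and body
    public func qux(_ x: Int) -> Int {
        return x + 1
    }
}

// hypothetical client: would import BazModule and QuxModule.
// FooType.qux(_:) is only callable once both are visible.
let result = FooType.a.qux(1)
```

note that neither the conformance nor the extension mentions the other, so neither BazModule nor QuxModule can list FooType.qux(_:) on its own.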

the natural symbolgraph for these four packages might look like

swift-foo                   swift-bar                    swift-qux 
 FooType  -- conforms to ->  Barable  -- has member ->  Barable.qux(_:)
       (perpetrator: swift-baz)   (perpetrator: swift-qux)

now, you would expect that if you import both BazModule and QuxModule, then FooType should have a synthetic member FooType.qux(_:).

    swift-foo                               swift-??? 
     FooType  -- has synthetic member ->  FooType.qux(_:)
               (perpetrator: swift-???)   

but this symbol doesn’t actually belong to either of swift-{foo, bar, baz, qux}, nor can we really “blame” its existence on any single module. it actually belongs to an imaginary client package that imports both BazModule and QuxModule.

swift-foo <- swift-baz <- (swift-baz × swift-qux)
          ↙            ↙ 
swift-bar <- swift-qux

so, at a minimum, the number of possible imaginary client modules grows as O(n²) in the number of modules involved.
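
to make the "dynamic" part concrete, here is a toy model of the lookup. it is not the symbol graph format and not how any existing tool works; the Conformance and ExtensionMember types and the syntheticMembers(of:loaded:) function are hypothetical names invented for this sketch. each fact is tagged with the module that introduces it, and the answer depends on which modules are loaded together.

```swift
// Toy model: every fact records the module that is responsible for it.
struct Conformance {
    let typeName: String
    let protocolName: String
    let module: String
}

struct ExtensionMember {
    let protocolName: String
    let memberName: String
    let module: String
}

let conformances = [
    Conformance(typeName: "FooType", protocolName: "Barable", module: "BazModule"),
]

let extensionMembers = [
    ExtensionMember(protocolName: "Barable", memberName: "qux(_:)", module: "QuxModule"),
]

/// Members a type picks up from protocol extensions, considering only the
/// conformances and extensions contributed by the loaded modules.
func syntheticMembers(of typeName: String, loaded: Set<String>) -> [String] {
    let visibleProtocols = conformances
        .filter { $0.typeName == typeName && loaded.contains($0.module) }
        .map { $0.protocolName }
    return extensionMembers
        .filter { visibleProtocols.contains($0.protocolName) && loaded.contains($0.module) }
        .map { "\(typeName).\($0.memberName)" }
}

let onlyBaz = syntheticMembers(of: "FooType", loaded: ["FooModule", "BarModule", "BazModule"])
let onlyQux = syntheticMembers(of: "FooType", loaded: ["FooModule", "BarModule", "QuxModule"])
let both = syntheticMembers(
    of: "FooType",
    loaded: ["FooModule", "BarModule", "BazModule", "QuxModule"]
)
```

in this model onlyBaz and onlyQux are empty and both contains "FooType.qux(_:)", which is the point: the member only shows up for the combination, not for any one module.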


IMHO you aren’t thinking of this right and you kinda hit the nail on the head with this statement.

Everything you stated is correct; nonetheless, where is the need to serve FooType.qux(_:)? The documentation for FooModule shouldn’t worry about including documentation for symbols that don’t exist yet. Furthermore, if FooType is just using the default implementation provided by QuxModule, then users should refer to QuxModule’s documentation. If an imaginary package provides a custom implementation of FooType.qux(_:), then it should be the one documenting it, but that ties into the excellent work @theMomax has been doing as part of Document Extensions to External Types Using DocC.

qux would not be a default implementation because Barable has no requirement for it. therefore there is no way for a type conforming to Barable to know it exists just by knowing about Barable — it also has to know about all the third-party modules extending Barable.
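
the distinction can be sketched in a single file. this is an illustration under assumptions, not the packages from the example: describe() is a hypothetical requirement added only for contrast (the Barable in the example has none mentioned), FooType is a struct here for brevity, and the qux(_:) signature is made up.

```swift
protocol Barable {
    // hypothetical requirement, added only to contrast with qux(_:)
    func describe() -> String
}

extension Barable {
    // default implementation of a requirement: conforming types can
    // replace it, and the replacement is used through the protocol
    func describe() -> String {
        return "default describe"
    }

    // not a requirement: just a member added by an extension
    func qux(_ x: Int) -> String {
        return "extension qux"
    }
}

struct FooType: Barable {
    func describe() -> String {
        return "custom describe"
    }

    // same name as the extension member, but it does not replace it when
    // the value is used through the protocol
    func qux(_ x: Int) -> String {
        return "custom qux"
    }
}

// Generic code only knows that T conforms to Barable.
func viaProtocol<T: Barable>(_ value: T) -> (String, String) {
    return (value.describe(), value.qux(0))
}

let (described, quxed) = viaProtocol(FooType())
```

here described is "custom describe" because describe() is a requirement, while quxed is "extension qux" because qux(_:) is not one, so the protocol itself carries no record of it.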

there is no direct relationship between FooType and QuxModule, other than that they both have a friend in common (Barable) which is why this is a documentation problem.

the 'imaginary' package would usually be an end-consumer package (combining libraries in ways none of the individual libraries themselves could have foreseen), while FooModule and QuxModule would usually be libraries, and libraries are more likely to have documentation. so it is not really reasonable to expect every end user to build and host their own documentation for their stack.