What is the purpose of synthesized symbols in a symbolgraph?

is there ever a situation where a type’s list of inherited symbols could not be computed dynamically from its list of protocol/superclass conformances?

The symbol graph is meant to represent the entire API surface for a module; the intention is to avoid replicating the rules and logic of any particular programming language in tools that consume the data. For example, the initializer inheritance rules for Swift are unique to the language and fairly complex. Additionally, tools that require globally-consistent identifiers (for example, source editor-based requests for Quick Help via USR) would need to reproduce the rules for constructing symbol identifiers for synthesized symbols. Finally, to compute all synthesized symbols you would need access to the symbol graphs for all dependent modules.

So the answer to your question is: technically no, but it requires access to all the symbol graphs and a desire to reproduce compiler/language-specific logic.

That said, I think there are ways to chip away at the problem I suspect you're alluding to; the symbol graph balloons in size significantly when synthesized member emission is enabled and most of the data is duplicated. There's probably a more compact representation, but I'm not sure how significant the improvement would be.

unfortunately, serving documentation for synthetics is an inherently dynamic problem, and a static solution like emitting synthesized symbols into a symbol graph does not actually solve it.

i spent a lot of time trying to wrap my head around this concept, and i think i’ve finally zeroed in on the error in thinking that was causing me so many headaches with inherited members, which i’ll summarize below:

  • false statement: synthetic symbols are part of a module’s symbolgraph, and every module comes with a set of natural symbols and synthetic symbols. the API of a collection of modules (e.g., a package) is the union of the symbolgraphs of its constituent modules.

  • true statement: synthetic symbols are not part of any symbolgraph; rather, they arise through interactions between arbitrary subsets of modules. the API of a collection of modules (e.g., a package) cannot be computed in advance without considering every possible subset of modules in that collection.

to walk through a concrete example, suppose we have four single-module packages with a “Z”-shaped dependency graph:

legend: swift-x <- swift-y ::= “swift-y depends on swift-x”
swift-foo <- swift-baz 
          ↙            
swift-bar <- swift-qux
  • swift-foo declares enum FooType
  • swift-bar declares protocol Barable
  • swift-baz conforms FooType to Barable
  • swift-qux extends Barable with Barable.qux(_:)
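for reference, here is a minimal single-file sketch of those four declarations. in the real setup each section would live in its own module; the names FooModule and BarModule, the enum case, and the signature and body of qux(_:) are assumptions made up for illustration, since only BazModule and QuxModule are named in this post.

```swift
// swift-foo (assumed module name: FooModule)
enum FooType {
    case a  // placeholder case, only here so we have a value to call qux(_:) on
}

// swift-bar (assumed module name: BarModule)
protocol Barable {}

// swift-baz (BazModule): retroactively conforms FooType to Barable
extension FooType: Barable {}

// swift-qux (QuxModule): extends Barable with qux(_:)
// the Int parameter and the body are assumptions
extension Barable {
    func qux(_ x: Int) -> Int { x + 1 }
}

// a client that can see both the conformance and the extension can call
// qux(_:) on FooType, even though no single module declares FooType.qux(_:)
let result = FooType.a.qux(41)
print(result)
```

collapsed into one file, the call compiles because the conformance and the extension are both visible at the call site. split across modules, that visibility depends on what the client imports.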

the natural symbolgraph for these four packages might look like

swift-foo                   swift-bar                    swift-qux 
 FooType  -- conforms to ->  Barable  -- has member ->  Barable.qux(_:)
       (perpetrator: swift-baz)   (perpetrator: swift-qux)

now, you would expect that if you import both BazModule and QuxModule, then FooType should have a synthetic member FooType.qux(_:).

    swift-foo                               swift-??? 
     FooType  -- has synthetic member ->  FooType.qux(_:)
               (perpetrator: swift-???)   

but this symbol doesn’t actually belong to either of swift-{foo, bar, baz, qux}, nor can we really “blame” its existence on any single module. it actually belongs to an imaginary client package that imports both BazModule and QuxModule.
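to make the "imaginary client" idea concrete, here is a toy model of the lookup, assuming a client's visible symbols are determined only by its set of imported modules. the Conformance and ExtensionMember types and the syntheticMembers(importing:) function are hypothetical and are not part of the real symbolgraph format. the sketch also ignores transitive imports and visibility rules.

```swift
// toy records, not the real symbolgraph schema
struct Conformance {
    let type: String    // conforming type
    let proto: String   // protocol conformed to
    let module: String  // module that declares the conformance
}

struct ExtensionMember {
    let proto: String   // protocol being extended
    let name: String    // member added by the extension
    let module: String  // module that declares the extension
}

let conformances = [
    Conformance(type: "FooType", proto: "Barable", module: "BazModule"),
]

let extensionMembers = [
    ExtensionMember(proto: "Barable", name: "qux(_:)", module: "QuxModule"),
]

// a synthetic member exists only when the import set contains both the module
// that declares the conformance and the module that declares the extension
func syntheticMembers(importing imports: Set<String>) -> [String] {
    var result: [String] = []
    for c in conformances where imports.contains(c.module) {
        for m in extensionMembers where m.proto == c.proto && imports.contains(m.module) {
            result.append("\(c.type).\(m.name)")
        }
    }
    return result.sorted()
}

print(syntheticMembers(importing: ["BazModule"]))
print(syntheticMembers(importing: ["BazModule", "QuxModule"]))
```

in this model the answer is a function of the import set, not of any one module, which is why FooType.qux(_:) has no natural home in a per-module symbolgraph.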

swift-foo <- swift-baz <- (swift-baz × swift-qux)
          ↙            ↙ 
swift-bar <- swift-qux

so, at a minimum, the number of possible imaginary client modules grows as O(n²) in the number of modules involved.
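as a rough back-of-the-envelope check on the estimate above, the sketch below counts import sets, assuming every combination of modules is a potential client. the function names are hypothetical, and in practice many combinations produce no synthetic symbols at all, so these are upper bounds on the combinations to consider rather than symbol counts.

```swift
// number of unordered pairs of modules: n(n - 1) / 2, which is O(n²)
func pairCount(_ n: Int) -> Int {
    return n * (n - 1) / 2
}

// number of import sets containing at least two modules: 2ⁿ - n - 1
// (all subsets, minus the n singletons, minus the empty set)
func multiModuleSubsetCount(_ n: Int) -> Int {
    return (1 << n) - n - 1
}

// the four-module example from this post
print(pairCount(4), multiModuleSubsetCount(4))
```

for the four modules here that gives 6 pairs and 11 import sets of size two or more. the pairs are the quadratic lower bound, and the full subset count is what "every possible subset" implies.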