Support symbol filtering at the DocC level

this was split off from a larger thread:

the organizational explanation for this is that DocC does not currently have a means of performing filtering itself, so all these responsibilities have been assigned to SymbolGraphGen, which is why we are here talking about explicit attributes + command line options + a convention about underscored APIs.

if DocC were to gain the capability to filter its input, maybe SymbolGraphGen won’t need so many overlapping features?

There's a lot of value in keeping filtering capabilities in SymbolGraphGen itself as well, since it allows the symbol graphs (which can get quite large) to contain only the symbols that the client is interested in. For example, if SymbolGraphGen emitted symbols of all visibilities (private, internal, public) but the client is only interested in public symbols, the symbol graph files would get much larger than they need to be, especially for frameworks that have a small public API surface relative to their internal APIs.

It would make sense to make SymbolGraphGen's behavior more configurable though, depending on what purpose clients need the data for. For example, while it's incorrect to consider public APIs that are underscored as 'public' (per the Lexical Structure chapter of The Swift Programming Language (Swift 5.7)), I'm in favor of being able to configure SymbolGraphGen to include them in symbol graphs if that's something clients need.

i agree. filtering on access level is absolutely something that should remain in SymbolGraphGen. however, there are a lot of filtering rules in SymbolGraphGen, like the ones related to underscore prefixes and @available, that don’t really contribute much to the size savings. those i think could be moved into DocC?
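
To make the idea concrete, here is a rough sketch of what filtering underscored symbols downstream of SymbolGraphGen could look like. It is written in Python purely for illustration and is not how DocC or SymbolGraphGen implement anything. It assumes the symbol graph has already been loaded from JSON into a dict, and that symbols carry `pathComponents` and `identifier.precise` while relationships carry `source` and `target`. The function name `filter_underscored` is hypothetical.

```python
def filter_underscored(graph):
    """Return a copy of a symbol graph dict without underscored symbols.

    Hypothetical helper: a symbol is dropped when any of its path
    components starts with an underscore, and relationships that touch
    a dropped symbol are removed along with it.
    """
    def is_underscored(symbol):
        return any(part.startswith("_") for part in symbol.get("pathComponents", []))

    symbols = graph.get("symbols", [])
    kept = [s for s in symbols if not is_underscored(s)]

    # Precise identifiers of the symbols that were filtered out.
    dropped_ids = {s["identifier"]["precise"] for s in symbols if is_underscored(s)}

    relationships = [
        r for r in graph.get("relationships", [])
        if r["source"] not in dropped_ids and r["target"] not in dropped_ids
    ]

    # Leave every other top-level key (module, metadata, ...) untouched.
    return {**graph, "symbols": kept, "relationships": relationships}
```

A consumer would call this on the result of `json.load` for a symbol graph file before handing the graph on. Because the filter only removes entries, it can only ever work on a graph that still contains the underscored symbols, which is why it depends on SymbolGraphGen being able to emit them in the first place.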

The filtering out of underscored public symbols when generating a symbol graph for symbols of 'public' visibility makes sense IMO given the description in TSPL. Other parts of the Swift compiler have similar treatments. A change in behavior in SymbolGraphGen would require clients to update, which we'd need to allow some time for. I'd lean towards adding a flag/flags to disable filtering for specific use cases or entirely depending on what clients need. Can you please summarize the kind of control you need for your uses, and ideally pitch how you'd like for that to be achieved?

i don’t think there is disagreement on this point. generating “complete” symbolgraphs is a specialized use-case for when you need to do additional post-processing on a symbolgraph. it does not need to be the default and, based on the considerations you’ve mentioned, should not be, at least in the short term.

i think a compiler flag, of the kind you have suggested and that @QuietMisdreavus has filed as issue #60163, is the best solution. (cc @Jazek)

in the medium to long term, if DocC gained the capability to filter its input, then this option could be enabled by default, and SymbolGraphGen could deprecate its own filtering at its own pace.