i am interested in parsing .swiftinterface files in order to extract information about @_spi attributes in swift libraries. i intend to combine this information with the output of lib/SymbolGraphGen, which lacks @_spi awareness. but i am unsure how to map the full signatures in the .swiftinterface files to the partial signatures emitted by lib/SymbolGraphGen.
as far as i am aware, there are two possible ways to identify vertices in symbol graphs:
by mangled name (USR)
by source position
but i am unsure how to obtain either a mangled name or a source position from a .swiftinterface file. any pointers would be appreciated!
The source positions of the original declarations are not preserved in the swiftinterface. The mangled name can be recovered by SILGen if you are the frontend, but there's no real way to do that if you just have swift-syntax.
Your best bet is to extend SymbolGraphGen to emit the relevant information.
A reasonable hypothetical follow-up question: "Why can't swift-syntax emit mangled names?" Because mangled names have types in them, and they look through typealiases and such, so they depend on (at least some) import resolution and type-checking.
i understand that the Right Solution here is to fix SymbolGraphGen. however, my original goal here is to generate documentation for swift packages.
today, this means generating documentation for swift packages on 5.9, as that is the version of swift people expect to read documentation for. in a few weeks time, this will mean generating documentation on 5.10.
even if we are willing to show users documentation that is different from what it is labeled as, there is a high percentage of packages in the ecosystem that do not compile with nightly toolchains.
unless i’m missing something, an optimistic timeline for a SymbolGraphGen-based solution is going to be along the lines of:
if i miss the March 15 deadline (likely, as i am an external developer), we are looking at Q1 2025 as the earliest we can have @_spi visibility in swift documentation. is my understanding correct?
Random idea: Use indexstore to bridge the gap between the USRs and the parsed source.
Invoking the frontend from the command line to compile a .swiftinterface file with -index-store-path won't actually generate an indexstore with meaningful data (there's some code path that bails out early, but I don't remember where). However! You can write your own tool based on Swift's compiler sources that creates a new frontend invocation with the right options set and that does work. We do this at my employer to generate an indexstore from the Apple SDK .swiftinterface files so we can feed that data into our code search pipeline.
So you could probably hack together something like this:
1. Generate an indexstore from the .swiftinterface file, which gives you the mapping from USR to source locations.
2. Parse the .swiftinterface file with swift-syntax to find all the @_spi declarations.
3. Combine the information from 1 and 2 to figure out which USRs have which @_spis.
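To make steps 2 and 3 concrete, here is a rough Python sketch of the join. It is only an illustration under some assumptions: a regex scan stands in for a real swift-syntax parse, the USR-to-line mapping is assumed to have already been pulled out of the indexstore by some other tool, and the function names and placeholder USR strings are made up for the example.

```python
import re

# Leading attributes on a line, e.g. "@_spi(Testing) @inlinable ".
# Assumption: attribute arguments contain no nested parentheses.
LEADING_ATTRS = re.compile(r"^(?:@\w+(?:\([^()]*\))?\s*)*")
SPI_ATTR = re.compile(r"@_spi\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)")


def spi_groups_by_line(interface_text):
    """Map the 1-based line of each @_spi declaration to its SPI group names.

    Attributes that sit on their own lines are carried forward to the next
    line that contains something other than attributes or a comment.
    """
    result = {}
    pending = set()
    for number, line in enumerate(interface_text.splitlines(), start=1):
        stripped = line.strip()
        prefix = LEADING_ATTRS.match(stripped).group(0)
        remainder = stripped[len(prefix):]
        pending |= set(SPI_ATTR.findall(prefix))
        if not remainder or remainder.startswith("//"):
            continue  # blank, comment, or attribute-only line
        if pending:
            result[number] = pending
        pending = set()
    return result


def spi_by_usr(interface_text, usr_to_line):
    """Join declaration lines (hypothetically from the indexstore) with SPI groups."""
    by_line = spi_groups_by_line(interface_text)
    return {usr: by_line[line] for usr, line in usr_to_line.items() if line in by_line}


SAMPLE_INTERFACE = """public func visible()
@_spi(Testing) public func hidden()
@_spi(Internal)
@available(macOS 10.15, *)
public struct Secret {}
"""

# Placeholder USRs; real ones would be mangled names read from the index.
SAMPLE_USR_LINES = {"usr-visible": 1, "usr-hidden": 2, "usr-Secret": 5}
```

Calling `spi_by_usr(SAMPLE_INTERFACE, SAMPLE_USR_LINES)` keeps only the two declarations that carry an SPI group. Joining on line alone is a simplification; if several declarations can share a line, the column from the index would need to be part of the key as well.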
Since the tool you write that does #1 is operating on .swiftinterface files, it doesn't have to match any specific version of the compiler (just new enough to handle any interfaces you pass to it). You may need to do some work to configure the compiler instance with the right search paths for any dependencies you might have, but you have to do that anyway to extract a symbol graph too (symbol graph extraction requires loading all the modules, IIRC?).
It's not the simplest workaround, but it might be worth trying while you wait for the SymbolGraphGen fixes that I hope you're planning to still contribute to land in a released toolchain.
that might just be crazy enough to work! i was actually thinking about integration with indexstore for rendering code snippets the other day, so this might just fit nicely into that. any recommendations for where i can learn about indexstore though?
correct, i accidentally wrote a build system while i was working on Swift Unidoc. the exception would be binary and system targets (which i guess includes the standard library), i do not know where to obtain the .swiftinterface files for those.
I don't know if it's really documented per se, but the C API for navigating indexstore has been fairly stable for a while. The file format has as well; while the emitted files are LLVM bitcode with no compatibility guarantees (that I'm aware of), the record layout has remained more-or-less unchanged for several years, so you don't need to worry about linking against the exact same version of libIndexStore (I hope I'm not jinxing that).
The indexstore is built on top of two main concepts: units and records. Units represent a compiled artifact like an object file or a precompiled Clang module, and they contain information about which file(s) were compiled to make that unit and their dependencies. Records each represent a single source file in the compilation and contain the symbol information.
If you're generating an index from a .swiftinterface file, you should just end up with a single record for that file, so that makes navigating the store a bit easier. From there, you can iterate over all the symbol occurrences that represent declarations and record their line/column information.
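As a mental model only, the unit/record structure and the "iterate over declaration occurrences" step could be sketched like this in Python. None of these types are the libIndexStore C API; the class names, the field names, and the role string "declaration" are assumptions made for illustration, and the real store links units to records by name rather than by nesting them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Occurrence:
    usr: str
    roles: frozenset  # e.g. frozenset({"declaration"}) or frozenset({"reference"})
    line: int
    column: int


@dataclass
class Record:
    """Symbol information for one source file."""
    source_file: str
    occurrences: list = field(default_factory=list)


@dataclass
class Unit:
    """One compiled artifact, the files that produced it, and its dependencies."""
    output_file: str
    records: list = field(default_factory=list)
    dependencies: list = field(default_factory=list)


def declaration_positions(unit):
    """Return {usr: (line, column)} for occurrences that are declarations."""
    positions = {}
    for record in unit.records:
        for occ in record.occurrences:
            if "declaration" in occ.roles:
                # Keep the first declaration seen for each USR.
                positions.setdefault(occ.usr, (occ.line, occ.column))
    return positions
```

For an index built from a single .swiftinterface file, `unit.records` would hold just the one record, so the result is effectively the USR-to-location table for that file.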
thanks Tony, it would have taken me forever to figure all that out on my own.
another idea recently popped into my mind, which is that the challenge is compiling swift packages with nightly toolchains, as many of them either run into ownership errors or crash the compiler. but i am not sure why we must compile the package using the same version of the toolchain that we emit the symbol graph JSON for.
last i checked, this wasn’t supported, as SymbolGraphGen checks for some kind of version header. but i am wondering how complicated it would be to enable this, so that we could use a nightly SymbolGraphGen on a package that was built with a release toolchain.
I think the challenge here is that emitting the symbol graph involves loading the modules—the one you're interested in extracting as well as all of its dependencies. So, all the compiler's Sema and Serialization behavior come into play, and it's only supported to load a serialized binary module using the version of the compiler that compiled it.
If you had .swiftinterface files for the module you're interested in extracting and all of its dependencies, then you could potentially just do an implicit module build that loads all of those modules from those interfaces instead. But I don't know if that's feasible, especially since you specifically called out binary targets above.
Jumping back up to that for a moment:
For the standard library, on Apple platforms the .swiftinterface files are in Xcode's platform SDKs. On Linux, I originally didn't believe they distribute one at all, because Linux isn't an ABI-stable platform, but I was wrong: I see it under <toolchain_dir>/usr/lib/swift/linux/Swift.swiftmodule/x86_64-unknown-linux-gnu.swiftinterface.
But since you're interested in SPI data, the public .swiftinterface files wouldn't have those anyway. It's the .private.swiftinterface files you need, and I'd expect SDKs to not include those when distributing publicly.
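If it helps when surveying what an SDK or toolchain actually ships, a small Python helper along these lines could separate the two kinds of interface file by suffix. It is a sketch that only looks at file names; the function name and the example paths are made up.

```python
def split_interfaces(paths):
    """Split paths into (public, private) .swiftinterface lists by file suffix."""
    public, private = [], []
    for path in paths:
        if path.endswith(".private.swiftinterface"):
            private.append(path)
        elif path.endswith(".swiftinterface"):
            public.append(path)
        # Anything else (e.g. a binary .swiftmodule) is ignored.
    return public, private
```

An empty second list for a given module would suggest that its SPI declarations are not recoverable from what was distributed.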