[GSoC 2025] DocC Language Features in SourceKit-LSP

Hello everyone!

I'm Gunjan Rawat, a final-year Computer Science student, and I'm excited to contribute to Swift as part of Google Summer of Code 2025. I’m particularly interested in working on the project "DocC Language Features in SourceKit-LSP", and I would love to collaborate with @ahoppen and @matthewbastien.

I have about a year of experience as an iOS development intern, where I worked with Swift and gained a strong understanding of the language. I also had the amazing opportunity to participate in last year’s Swift Mentorship Program as a mentee, where I worked on setting up and understanding the Swift compiler with @amritpan. That experience not only deepened my knowledge of Swift’s internals but also introduced me to contributing to open-source projects in the Swift ecosystem.

I have reviewed the project details and successfully set up a local development environment for both SourceKit-LSP and Swift-DocC. At the moment, I’m focusing on exploring how to implement syntax highlighting for DocC .md and .tutorial files in SourceKit-LSP. I believe SourceKit-LSP currently does not provide syntax highlighting support for these files, so I’m trying to understand the best approach to integrate it.

To move forward, I have a few questions regarding how LSP handles syntax highlighting and where the changes should be made:

  1. How does LSP currently handle syntax highlighting in SourceKit-LSP?
  • From my understanding, SourceKit-LSP uses semantic tokens to provide syntax highlighting, but I want to clarify how exactly these tokens are generated within the codebase.
  • Specifically, what part of the codebase is responsible for processing and sending semantic tokens?
  2. Where should DocC syntax highlighting be added?
  • Should I modify the existing syntax highlighter to support DocC, or would it be better to create a separate highlighter specifically for DocC files?
  • Would adding DocC syntax highlighting require parsing .md and .tutorial files within SourceKitLSP/Documentation, or is there an existing mechanism that can be extended for this?

Understanding these details will help me move forward. From there, I plan to work step by step towards understanding the 'go to definition for symbols' and 'diagnostics reporting for symbol declarations' features. If I can get a bit of context on these as well, that would be great.

I also noticed that this project is marked as "medium" with an estimated 90-hour workload, which is the same as some "small" ones. I wanted to confirm whether this classification is accurate or if there are additional considerations I might be overlooking. @MatthewBastien

Hello @GunjanRawat26!

Thanks for your interest in this project! I've just put up a PR to adjust the time to 175 hours because yes, this is supposed to be a medium-sized project.

For syntax highlighting there are two pieces:

  1. A grammar file within the editor (in this case VS Code) which does the basic syntax highlighting. This may be tricky for the Markdown files because VS Code already has a grammar for those. It'll be interesting to see if there's a way to add custom syntax highlighting just for the DocC-specific features.
  2. Semantic highlighting done via the Semantic Tokens request in the Language Server Protocol specification. Not sure if this applies to DocC, but it would be worth looking into.

Diagnostics reporting is probably the most interesting/involved aspect of this project. It'll require searching the Swift comments or Markdown/Tutorial files for symbol links surrounded by double backticks. SourceKit-LSP would then have to search its index to make sure these symbols are valid and send diagnostics to the editor if they are not.
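As a rough sketch of the first half of that check (not SourceKit-LSP's actual implementation), the scanner below finds double-backtick symbol links and reports the ones missing from a set of known names. Everything here is hypothetical: `SymbolLink`, `findSymbolLinks(in:)`, and `unresolvedLinks(in:knownSymbols:)` are illustrative names, and the `knownSymbols` set is only a stand-in for a real lookup in SourceKit-LSP's index. Columns are counted in UTF-16 code units, which is the Language Server Protocol's default position encoding.

```swift
/// A symbol link found in documentation text (hypothetical type for illustration).
struct SymbolLink: Equatable {
    var line: Int       // zero-based line number
    var column: Int     // zero-based UTF-16 column of the first character inside the backticks
    var target: String  // the text between the double backticks
}

/// Scans `text` line by line for links of the form ``Target``.
/// Lines inside fenced code blocks (delimited by lines starting with three backticks) are skipped.
func findSymbolLinks(in text: String) -> [SymbolLink] {
    let backtick: UInt16 = 0x60
    var links: [SymbolLink] = []
    var insideFence = false
    let lines = text.split(omittingEmptySubsequences: false, whereSeparator: \.isNewline)

    for (lineIndex, line) in lines.enumerated() {
        // Toggle the fence state on lines that open or close a fenced code block.
        if line.drop(while: { $0 == " " || $0 == "\t" }).hasPrefix("```") {
            insideFence.toggle()
            continue
        }
        if insideFence { continue }

        let units = Array(line.utf16)
        var i = 0
        while i + 1 < units.count {
            // Look for an opening pair of backticks.
            guard units[i] == backtick, units[i + 1] == backtick else {
                i += 1
                continue
            }
            let start = i + 2

            // Look for the closing pair on the same line.
            var end: Int? = nil
            var j = start
            while j + 1 < units.count {
                if units[j] == backtick && units[j + 1] == backtick {
                    end = j
                    break
                }
                j += 1
            }

            // Stop scanning this line if the link is unterminated or empty.
            guard let close = end, close > start else { break }
            let target = String(decoding: units[start..<close], as: UTF16.self)
            links.append(SymbolLink(line: lineIndex, column: start, target: target))
            i = close + 2
        }
    }
    return links
}

/// Returns the links whose target is not in `knownSymbols`.
/// ASSUMPTION: a real implementation would query SourceKit-LSP's index instead of a set.
func unresolvedLinks(in text: String, knownSymbols: Set<String>) -> [SymbolLink] {
    findSymbolLinks(in: text).filter { !knownSymbols.contains($0.target) }
}
```

Each unresolved link already carries the line and column an editor diagnostic would need. The sketch does not handle doc comment prefixes such as `///`, relative link resolution, or links that span lines.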

Time permitting, there are a few extra pieces that could be implemented, such as hover support for said symbols as well as using DocC to create the hover. Really, the sky's the limit for language features. You can find a full list of language features in the Language Server Protocol specification.

Here are some (hopefully) more specific answers to your questions:

  1. How does LSP currently handle syntax highlighting in SourceKit-LSP?

You can have a look at the SwiftLanguageService to see how other language features are already implemented for Swift. More specifically, the Semantic Tokens request is handled in SemanticTokens.swift, which adds an extension to SwiftLanguageService.
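For context on what a Semantic Tokens response carries: the Language Server Protocol specification encodes tokens as a flat integer array, five integers per token (delta line, delta start character, length, token type index, token modifier bit set), with each position relative to the previous token. The sketch below illustrates only that encoding. `Token` and `encode(_:)` are hypothetical names for illustration and are not taken from SourceKit-LSP.

```swift
/// A highlighted span in absolute, zero-based document coordinates
/// (hypothetical type for illustration, not SourceKit-LSP's own).
struct Token {
    var line: Int
    var startChar: Int  // UTF-16 column, the protocol's default position encoding
    var length: Int
    var type: Int       // index into the token-type legend advertised by the server
    var modifiers: Int  // bit set over the token-modifier legend advertised by the server
}

/// Encodes tokens into the relative, five-integers-per-token layout
/// described by the LSP specification for semantic tokens.
func encode(_ tokens: [Token]) -> [Int] {
    // Relative encoding requires tokens in document order.
    let sorted = tokens.sorted { ($0.line, $0.startChar) < ($1.line, $1.startChar) }

    var data: [Int] = []
    data.reserveCapacity(sorted.count * 5)
    var previousLine = 0
    var previousStart = 0

    for token in sorted {
        let deltaLine = token.line - previousLine
        // The start is relative to the previous token only when both are on the same line.
        let deltaStart = deltaLine == 0 ? token.startChar - previousStart : token.startChar
        data += [deltaLine, deltaStart, token.length, token.type, token.modifiers]
        previousLine = token.line
        previousStart = token.startChar
    }
    return data
}
```

For example, three tokens at (line 0, column 0), (line 1, column 4), and (line 1, column 20) are encoded with the position pairs (0, 0), (1, 4), and (0, 16).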

  2. Where should DocC syntax highlighting be added?

I'm in the process of putting some final pieces of the DocC support into SourceKit-LSP as well as a bit of reorganization of the code, but all of these language features would be implemented in the DocumentationLanguageService.

Hey @matthewbastien, I was trying to see how we could implement parsing logic for DocC files for highlighting. We might need to use DocC's built-in parser to parse directives. Is there an API from DocC that lets me do this?

I'm not sure I fully understand your question, and may have misinterpreted it - but thought I'd try and answer:

DocC has the concept of directives as things that look like:

@Metadata {
    @TitleHeading("Release Notes")
}

These are typically only found in the Markdown files in a DocC catalog. If that's what you're asking, there's not an explicit API for parsing them. Parsing is embedded in the overall flow of how DocC processes catalogs and symbol graphs together: it works through the nodes in the symbol graph and converts them into DocC archive nodes, which it stashes in a data file that is later read by the DocC Renderer.

The parser itself is in swift-docc/Sources/SwiftDocC/Semantics/DirectiveParser.swift (swiftlang/swift-docc at commit 18c027ee91ca28706d53ca501f2292bc099bf71b), but the bit you probably care more about is the variety of classes that conform to AutomaticDirectiveConvertible, or the protocol itself, which is the primary pathway for the processing flow logic (swift-docc/Sources/SwiftDocC/Semantics/DirectiveInfrastructure/AutomaticDirectiveConvertible.swift at the same commit). At least I believe so; I'll defer to @ronnqvist and others to correct me if I'm wildly off target here.

You might start by highlighting every directive the same way using the lower-level BlockDirective type from the Swift Markdown library, and then determine whether any directives from the list @Joseph_Heck shared should get special highlighting. For example, I could imagine @Comment directives receiving special treatment.

Hey folks, I’m still in the process of adding (and deciding how to add) screenshots related to the expected results.
In the meantime, I’m using this as my current proposal: GSoc proposal - Google Docs
Please feel free to leave any comments or suggestions directly on the doc if you have time.
cc: @matthewbastien

That proposal looks great! I don't really have any comments to add. Thank you for putting this together!
