[GSoC 2025] Documentation coverage

sdms · March 11, 2025, 6:23pm

Hello, everyone!

My name is Sofia Diniz, and I’m an undergraduate student majoring in Information Systems, set to graduate in April. I first became acquainted with iOS development as a student at the Apple Developer Academy in 2022. Two years later, I graduated from the Academy and began teaching iOS development there. Last July, I shifted my focus to building projects for the Vision Pro.

Along the way, I’ve won the Swift Student Challenge twice, developed various apps, published an article on leveraging component packages to improve accessibility in mobile apps, and deepened my interest in the Swift ecosystem, going beyond app development.

For the past few months, I’ve been eager to contribute to the ecosystem of open-source Swift projects. Since this will be my first open-source contribution, GSoC seems like the perfect opportunity to receive guidance while giving back to a community that has taught me so much.

I’m particularly interested in enhancing Swift DocC’s experimental documentation coverage feature and have been familiarizing myself with the codebase. So far, I’ve dived into related topics in the forum, cloned and run the repository and I am now diving into the documentation while evaluating potential first issues. I noticed that the tagged good first issues have been open for some time (with the latest comments dating back at least a year), so I was wondering if they are still relevant additions to the project.

As previously mentioned, I’m quite novice to this process, so my plan is to start with a good first issue, gain a deeper understanding of the codebase, draft potential implementations for my GSoC proposal, validate them if possible, and then finalize my proposal.

If you have any recommendations or suggestions on how to approach this process, I would greatly appreciate your guidance!

ronnqvist · March 13, 2025, 6:42pm

Hi @sdms,

I'm happy to hear that you're interested in enhancing Swift DocC’s experimental documentation coverage feature.

I just had a look and those issues are all still relevant. We haven't added any new good first issue items in a while but if there's any particular area of the project that you're interested in I'm sure that we can find some new good first issues in that area as well.

That sounds like a good approach. I would also recommend playing around with the feature as it exists today to get a sense of how it behaves today and what data it output and compare that to how you'd expect it to behave and the data you'd expect it to output.

It could also be useful to prototype something (a website, app, or anything else that you're comfortable with) that displays some coverage data to build your own opinion on the current format—if it contains the information that you need, if it's easy to work with, etc.—and if there are any changes to the data that you'd like to propose to make it easier to present the information.

(That prototype isn't the end product of the project but it can be a relevant tool throughout your project to verify that the data is easy to work with and contains relevant information.)

sdms · March 18, 2025, 2:40am

Hi, @ronnqvist!!

Thank you for your kind and detailed reply! I really appreciate the guidance.

I noticed that you recently added a few new good first issues, and this one in particular caught my interest. However, it seems like someone else is already working on it, so I assume it would be best for me to explore other options.

These pointers were really helpful—thank you. I’ve tested the feature a couple of times already and have been taking notes on thoughts on how to improve it, particularly regarding the presentation of the data. I think this will create a good foundation for the prototype that you suggested.

I’ve also been reading forum posts about this feature in order to try and better understand it. These posts in particular gave me a some valuable insights:

Additionally, I’ve been reading deeper into the codebase, and from what I gathered (please feel free to correct me if I’m mistaken), most of the implementation of the Documentation Coverage feature spans across the following files:

Sources/SwiftDocC/Coverage/DocumentationCoverageOptions,
Sources/SwiftDocC/Infrastructure/Context/DocumentationContext+Configuration,
Sources/SwiftDocC/Infrastructure/CoverageDataEntry,
Sources/SwiftDocCUtilities/Action/Actions/CoverageAction, and
Sources/SwiftDocC/Infrastructure/ConvertActionConverter.

Finally, I took another look into the open issues in the GitHub repository and came across this one. It seems to be a relatively simple fix, and it’s closely related to the area I want to focus on, so I believe it would be a fitting first dive into the codebase.

I would greatly appreciate hearing your thoughts on these matters :)

ronnqvist · March 18, 2025, 1:05pm

Yes, someone has expressed interest in working on that. I'll see if we can find another new good first issue in the same area for you.

Yes, that seems correct to me. CoverageDataEntry is the main place where the coverage data is calculated for each page. Each coverage entry is created by ConvertActionConverter during the "render" phase (since the CoverageDataEntry reads some data from both each page's model object and its render object). After all pages are rendered and have their coverage entries created, the ConvertActionConverter passes the coverage entries to an output consumer type which writes the data to a file within the archive output. This file is then read by the CoverageAction which prints a summary of the data.

DocumentationCoverageOptions is the model for the command line options that the developer passes to docc convert. That would be a good place to add and/or remove command line flags and options based on what configuration you feel like is useful for the developer to be able to make for the coverage feature.

The code in DocumentationContext+Configuration is for making configuration to the context before it's created. Specifically, configuring shouldStoreManuallyCuratedReferences makes the DocumentationContext store curated references in manuallyCuratedReferences which the coverage data checks and reports on. If you need to make other coverage specific configuration changes to the documentation context, you can add values in Configuration.ExperimentalCoverageConfiguration and read them from the context.

sofiaromorales · March 18, 2025, 1:32pm

Hi @sdms. I'm glad to know you're interested in this project.

Just in case you might find it helpful there's another thread from another DocC contributor giving feedback about this feature Experimenting with documentation coverage metrics - a few questions and feedback - #3 by Joseph_Heck.

While the thread is a few years old and some points may no longer be relevant, there could still be valuable insights to consider when planning future enhancements to this tool.

Thanks again for your interest, looking forward to your proposal :)

sdms · March 21, 2025, 9:56am

Just in case you might find it helpful there's another thread from another DocC contributor giving feedback about this feature Experimenting with documentation coverage metrics - a few questions and feedback - #3 by Joseph_Heck.

Hi, @sofiaromorales!

Thank you for the recommendation. Some of the points in that thread were really insightful. Since a good part of the GSoC project I want to work on involves making decisions on how to improve this feature, I’ve been gathering as much feedback as I can find to ensure these decisions are well-informed. Therefore these posts are quite helpful :)

Yes, that seems correct to me. CoverageDataEntry is the main place where the coverage data is calculated for each page. Each coverage entry is created by ConvertActionConverter during the "render" phase (since the CoverageDataEntry reads some data from both each page's model object and its render object). After all pages are rendered and have their coverage entries created, the ConvertActionConverter passes the coverage entries to an output consumer type which writes the data to a file within the archive output. This file is then read by the CoverageAction which prints a summary of the data.

Thank you so much for the detailed explanation! I’ve been taking notes on the pointers you provided, as well as insights and conclusions I’ve drawn from reading the code and forum discussions, and this has been incredibly helpful.

On a more conceptual level, I’ve noticed that the Project Idea and a couple of other forum posts mention the goal of “writing the coverage metrics in a new extensible format that other tools can read and display”. I’ve come across various pieces feedbacks that I believe correlate to this topic, such as:

Making documentation metrics easily accessible for the Swift Package Index
Exporting the data as a CSV file
Displaying the coverage metrics available in the website preview

Given the variety of use cases, my understanding is that it would be better not focusing on one specific tool, but expanding the DocumentationCoverageOptions model to allow for a more granular control over filters and grouping the metrics. This would be a more flexible approach which I believe would benefit both developers and other tools. There have also been feedbacks aligned with this, such as the idea of revisiting the current flag options to ensure wether they're still relevant, and creating new ones that would allow for alignment between the organization of symbols in the detailed coverage metrics.

To gather ideas on customization options, I started looking into other tools with similar needs -mainly other documentation and testing tools. This brought me back to how intuitive and effective the interface for the test coverage in Xcode is. Although I believe this level of integration with Xcode exceeds the scope of the project, it seems like a very interesting reference. I also noticed some requests for clearer, more user-friendly fields in both the "brief" and "detailed" coverage data formats. Structuring the data in a way similar to Xcode's coverage reports could help address that as well.

Would love to hear your thoughts on this, as I believe it will be an important point on the project!

ronnqvist · March 24, 2025, 1:18pm

sdms:

On a more conceptual level, I’ve noticed that the Project Idea and a couple of other forum posts mention the goal of “writing the coverage metrics in a new extensible format that other tools can read and display”. I’ve come across various pieces feedbacks that I believe correlate to this topic, such as:

Making documentation metrics easily accessible for the Swift Package Index

Exporting the data as a CSV file

Displaying the coverage metrics available in the website preview

Given the variety of use cases, my understanding is that it would be better not focusing on one specific tool, but expanding the DocumentationCoverageOptions model to allow for a more granular control over filters and grouping the metrics. This would be a more flexible approach which I believe would benefit both developers and other tools. There have also been feedbacks aligned with this, such as the idea of revisiting the current flag options to ensure wether they're still relevant, and creating new ones that would allow for alignment between the organization of symbols in the detailed coverage metrics.

Yes, it's best to not focus too much on any specific tool but rather try and accommodate a wide range of needs (which is difficult).

It's important to both consider the output data (the "documentation-coverage.json" file) and the caller's ability to make configurations (the command line flags). That portion of the project idea description is mostly about the format of the "documentation-coverage.json" file. Currently that file includes a list of entries like this one:

{
  "usr": "doc://SwiftDocC/documentation/SwiftDocC/PresentationURLGenerator/presentationURLForReference(_:)",
  "title": "presentationURLForReference(_:)",
  "sourceLanguage": {
    "name": "Swift",
    "linkDisambiguationID": "swift",
    "id": "swift"
  },
  "hasAbstract": true,
  "kind": {
    "isSymbol": true,
    "name": "Instance Method",
    "id": "org.swift.docc.kind.instanceMethod"
  },
  "availability": [],
  "hasCodeListing": false,
  "availableSourceLanguages": [
    {
      "name": "Swift",
      "linkDisambiguationID": "swift",
      "id": "swift"
    }
  ],
  "kindSpecificData": {
    "discriminant": "instanceMethod",
    "associatedValue": {
      "total": 1,
      "documented": 1
    }
  },
  "isCurated": false
}

In this entry there are 4 coverage metrics (has abstract, has code listing, is curated, and number of documented parameters (.kindSpecificData.associatedValue). The rest is various metadata about the symbol that the metrics apply to. There is no indication in this file that hasAbstract should be decoded as a boolean. There is also no indication in this file regarding how .kindSpecificData.associatedValue should be decided or what it represents. A hypothetical tool that consumes the current file needs to know what all the keys are and what each of them represent. This means that if DocC were to include some new information (for example if the method's return value is documented) each tool would need to update their decoding code to display the new information.

Some tools may wish to only display a summary of the data. Such tools currently have to consume all the data and summarize it themselves (for example count the percentage of symbols with an abstract).

Some other tools who are more aimed towards a deep dive into the data may with to group the properties and methods of a class together with that class. That information is currently not available in the file.

ronnqvist · April 2, 2025, 5:09pm

I'm sure you're already aware but I just wanted to post a friendly reminder that the proposal submission deadline is April 8 at 18:00 UTC (next week). If you have any further questions that could help shape the project proposal or if you want feedback on a draft proposal don't hesitate to reach out.

sdms · April 5, 2025, 3:07pm

Hey, @ronnqvist! Thank you very much for the kind reminder! I've been a little more quiet as I work on my proposal, but I'm still very invested in the project and excited about it! I plan on sending a draft here soon, and I hope to have enough time to implement any feedbacks.

sdms · April 8, 2025, 12:13pm

Hey! Although it's really close to the deadline, I thought it might still be useful to leave a link to my proposal draft here. I'm aware this isn't the ideal timeline, so it's totally okay if no feedbacks are sent before the deadline - I just wanted to leave space for that while I finish working on it. I'm also aware that I'm likely not going to do it exactly as I would like to, but I'll try my best anyways. Here's the draft

ktoso · April 8, 2025, 12:55pm

Hi Sofia,
the proposal is well written, and overall good but it's missing some pieces which are very important.

We really value a concrete timeline and goals/deliverables (especially the 50% check-in and 100% deliverables are important to call out explicitly). I see you're aware that that's missing since it's marked WIP but as long as you could complete those parts. The Proposed Improvements don't need to be super detailed but if you can add a few sentences to them before submitting that would be quite helpful.

Please make sure to submit a proposal and don't wait to the last hours in case something goes wrong with the submissions app. We have no way to "late" accept proposals, so what's most important is that you submit a revision of it.

I'll leave project specific comments to the proposal mentors. Good luck!

sdms · April 8, 2025, 12:58pm

Hi, @ktoso! Thank you for the feedback! I'll definitely prioritize these points moving forward.

sdms · April 8, 2025, 4:01pm

Update: proposal submitted!

Regardless of the result, I wanted to thank everyone that joined this thread and helped me get started in contributing to the docc project. All the messages I got here were helpful and kind, and I truly do appreciate the time and effort you put into this :)