Hello!
I'm Raghav, and I'm working on a proposal for the sourcekitd refactoring project. I'd love any feedback to tighten it up! I have been contributing to SwiftNIO recently, and in the process I learned a lot about Swift's internals and used Swift Concurrency to develop features such as async packet captures (see "Add AsynchronizedFileSink for non-blocking PCAP recording", Pull Request #253 in apple/swift-nio-extras on GitHub).
I want to continue contributing to the project, and this particular one aligns really well with my goals, so I have written an initial draft. To make sure I'm on the right track, I'm looking for feedback on it (comments are turned on!).
@ahoppen
Proposal Draft: GSoC 2025: Refactoring Sourcekitd - Draft (Google Docs)
Edit: I have also added some questions at the end of the proposal that will help me design the project better. I'd love any help there as well, so I'll copy them here:
Q 1: To understand the motivation better, which areas of sourcekitd are most prone to concurrency issues or race conditions right now? Are there particular locks or data structures in Requests.cpp and SwiftASTManager.cpp that you know are causing contention or complexity?
Q 2: How is the lifetime of the AST data currently managed? Are there reference-counting or memory-management patterns we need to preserve when moving to Swift concurrency? Do we ever share AST objects between multiple threads and do we need a new approach to safely share them in Swift concurrency?
Q 3: Are there any submodules or functionalities that must remain in C++ for compatibility reasons, and can’t be fully converted to Swift concurrency? How ‘deeply’ should we restructure code in Requests.cpp and SwiftASTManager.cpp - do you foresee only concurrency changes or also broader re-architecture?
Q 4: Some misc clarifying questions:
- Will we have to change the representation of AST node pointers when bridging to Swift, or is a simple ‘pointer-wrapping’ approach acceptable?
- Does the existing C++ library code we call from Swift concurrency assume it’s called from a single thread, or can it handle concurrency from multiple Swift tasks?
- Do you envision one big actor for AST management or multiple fine-grained actors (e.g., separate actors for caches, indexing, completions, etc.)?
Finally, if it's better for me to open PRs or work on relevant issues for this particular project, I'd love to do that too, if that's a better way of judging my approach.
Thanks again,
Raghav