[GSoC 2025] Refactoring Sourcekitd to Use Swift Concurrency - Proposal Feedback

Hello!

I'm Raghav, and I'm working on a proposal for the sourcekitd refactoring project. I'd love any feedback to tighten it up! I have been contributing to Swift (NIO) recently, and in the process I learnt a lot about Swift’s internals and used Swift Concurrency in interesting ways to build features like async packet captures (Add AsynchronizedFileSink for non-blocking PCAP recording by RaghavRoy145 · Pull Request #253 · apple/swift-nio-extras · GitHub).

I want to continue contributing to the project, and this particular one aligns really well with my goals, so I have written an initial draft. Just to make sure I'm on the right track, I was looking for some feedback (Comments are turned on!)
@ahoppen

Proposal Draft: GSoC 2025: Refactoring Sourcekitd- Draft - Google Docs

Edit: I have also added some questions at the end of the proposal that will help me design the project better. I'd love any help there as well, so I'll copy them here:

Q 1: To understand the motivation better, which areas of sourcekitd are most prone to concurrency issues or race conditions right now? Are there particular locks or data structures in Requests.cpp and SwiftASTManager.cpp that you know are causing contention or complexity?

Q 2: How is the lifetime of the AST data currently managed? Are there reference-counting or memory-management patterns we need to preserve when moving to Swift concurrency? Do we ever share AST objects between multiple threads and do we need a new approach to safely share them in Swift concurrency?

Q 3: Are there any submodules or functionalities that must remain in C++ for compatibility reasons, and can’t be fully converted to Swift concurrency? How ‘deeply’ should we restructure code in Requests.cpp and SwiftASTManager.cpp - do you foresee only concurrency changes or also broader re-architecture?

Q 4: Some misc clarifying questions:

  1. Will we have to change the representation of AST node pointers when bridging to Swift, or is a simple ‘pointer-wrapping’ approach acceptable?

  2. Does the existing C++ library code we call from Swift concurrency assume it’s called from a single thread, or can it handle concurrency from multiple Swift tasks?

  3. Do you envision one big actor for AST management or multiple fine-grained actors (e.g., separate actors for caches, indexing, completions, etc.)?

Finally, if it's better for me to open PRs or work on relevant issues for this particular project, I'd love to do that too if that's a better way of judging my approach.

Thanks again,
Raghav

Hi @RaghavRoy145,

Thanks for sharing your proposal. I only got to look at it now since I was on vacation last week. It reads well overall. I think the timeline is quite ambitious, especially weeks 1-2, because that is where all the bridging interfaces will need to be defined, which tends to be a bit of work in my experience.

To answer your questions:

Each request is handled sequentially because it calls into the compiler’s codebase, which works sequentially for the most part. That’s why the main concurrency concerns are at the point where requests interact with each other (by sharing an AST through SwiftASTManager and when they are dispatched to be handled in Requests.cpp). SwiftASTManager needs to be considered as one monolithic piece and it’s not really possible to extract small parts of it – it’s just too tightly interlocked for that.

One of the key benefits of refactoring Requests.cpp to Swift would be that it opens the door to future improvements, like the out-of-process sourcekitd on non-Apple platforms that you mentioned, or the use of distributed actors as the transport mechanism. It also flips the mental model of sourcekitd from a C++ service that might use Swift for some methods to a service in Swift that calls into C++ as an implementation detail.

The ASTs are manually memory managed. Consider SwiftASTManager as a cache of ASTs that gets pruned when documents are closed in sourcekitd using the editor.close request. The existence of the AST is guaranteed for the execution time of consumeAsync in SwiftASTConsumer.

I can’t think of any reason why any of Requests.cpp or SwiftASTManager.cpp would need to maintain any C++ interface. As long as we maintain the XPC / in-proc (which is C) interface of sourcekitd, all implementation details can be changed.

Pointer-wrapping is fine.

We already call it from multiple dispatch threads, so calling it from multiple Swift tasks should just work.

I prefer multiple fine-grained actors over a single global actor since it improves local reasoning and allows more parallelism.

– Alex

Thank you for your review and answers! You’re right about the timeline. I’ve been experimenting with the refactoring locally, and I believe I’ll need to change it.

I’ve already submitted the proposal but I bet I have some time to re-upload an edited version - using the community-bonding period - after which I can attend to your answers better.

Best,
Raghav

Just making sure timelines and expectations are understood here:

The "community bonding period" you mentioned is intended for accepted projects to warm up and get ready to contribute. The final proposal must effectively be uploaded today (April 8, 18:00 UTC), and the community bonding period runs May 8 - June 1, which is after the selected projects are announced. If you want, you can submit an updated proposal today, but please take care not to miss the deadline 🙂

We can't guarantee this specific project will be accepted, and results will only be announced on May 8, 18:00 UTC (as per the schedule published on Google Summer of Code 2025 Timeline | Google for Developers).

Thanks @ktoso, I see the miscommunication here:

I bet I have some time to re-upload an edited version - using the community-bonding period - after which I can attend to your answers better.

What I meant was: re-upload an edited version that includes the community bonding period in my currently too-ambitious timeline, and then also respond to Alex's answers here!

But my wacky way of communicating that must've come off as me having the proposal submission timelines really wrong. Thanks a lot for catching that!

Sounds good, yeah, just wanted to make sure you don't miss uploading the updated proposal 🙂

Good luck!

That makes sense – I'll prioritise concurrency management around shared AST interactions. Since SwiftASTManager is monolithic, I’ll treat it as a single unit.

That clarifies lifetimes. I can work within those boundaries on the Swift side, especially when wrapping AST pointers in async contexts.

Perfect – I'll refactor internals as needed (preserving the public interface). That gives me more freedom to build the concurrency model around actors and async/await.

I’ll go ahead with an ASTNodeRef wrapper in Swift.
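
A minimal sketch of what such a wrapper could look like, assuming the C++ side hands Swift an opaque node address and keeps ownership of the memory. Only the name ASTNodeRef comes from this thread; the stored property, the initializer, and `withPointer` are hypothetical and not existing sourcekitd API.

```swift
/// Hypothetical wrapper around an opaque AST node pointer owned by the C++ side.
/// Swift never frees or dereferences the node itself; it only passes the
/// address back into C++ calls.
struct ASTNodeRef: @unchecked Sendable, Hashable {
    /// Opaque address of the C++ node (assumption: ownership stays in C++).
    private let raw: UnsafeRawPointer

    init(_ raw: UnsafeRawPointer) {
        self.raw = raw
    }

    /// Hands the pointer to `body` for the duration of one call.
    /// The caller is responsible for only using this while the AST is
    /// guaranteed to be alive (e.g. while the consumer callback is running).
    func withPointer<R>(_ body: (UnsafeRawPointer) throws -> R) rethrows -> R {
        try body(raw)
    }
}
```

The `@unchecked Sendable` conformance is a promise the compiler cannot verify. It only holds if the reference is used within the window in which the AST is guaranteed to exist, so the wrapper should not be stored beyond that window.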

Nothing much to add here – Noted!

Got it. I’ll perhaps still start with a centralised actor for AST management to get a working baseline, but I will look for clean splits (e.g., completions, indexing) and factor those into separate actors. I could start doing that now as I ramp up.
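
A rough sketch of that progression, using placeholder types only: one actor owns the AST cache as the baseline, and a separate actor for one request family talks to it through its async interface. `ASTHandle`, `ASTCache`, and `CompletionService` are hypothetical names for illustration and do not mirror the real SwiftASTManager.

```swift
/// Hypothetical stand-in for an AST; in sourcekitd this would wrap a C++ pointer.
struct ASTHandle: Sendable, Equatable {
    let documentName: String
    let generation: Int
}

/// Baseline: a single actor owns the whole AST cache.
actor ASTCache {
    private var asts: [String: ASTHandle] = [:]
    private var buildCount = 0

    /// Returns the cached AST for a document, building it on first use.
    func ast(for documentName: String) -> ASTHandle {
        if let cached = asts[documentName] { return cached }
        buildCount += 1
        let built = ASTHandle(documentName: documentName, generation: buildCount)
        asts[documentName] = built
        return built
    }

    /// Drops the AST when the document is closed (cf. the editor.close request).
    func close(_ documentName: String) {
        asts[documentName] = nil
    }

    var cachedDocumentCount: Int { asts.count }
}

/// A later split: completions get their own actor and only reach the cache
/// through its async interface.
actor CompletionService {
    private let cache: ASTCache

    init(cache: ASTCache) {
        self.cache = cache
    }

    func complete(in documentName: String) async -> String {
        let ast = await cache.ast(for: documentName)
        return "completions for \(ast.documentName) (AST generation \(ast.generation))"
    }
}
```

Because `CompletionService` only depends on the async interface of `ASTCache`, further splits (for example an indexing actor) could be added the same way without changing the cache.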