SourceKit-LSP for other build systems

Hi there. We are looking at getting SourceKit-LSP integrated into our mobile repo and need a way to pass flags to the LSP from our existing build system. Compilation databases are a possibility here, but keeping them in sync when files are created or moved is a fair amount of work, and they are hard to scale to large repos because they all have to be generated up front without knowing which subtree to start at.

The solution I am currently sketching out is to provide a delegate build system class to the LSP that will call out to any build system to provide flags for a given file. The workspace would be configured with a sourcekit_config.json file at the root that looks something like:

{
    "indexStorePath": "path/to/index-store",
    "flagDelegate": "flag_script.sh"
}

When compiler flags are needed, the LSP will call the provided script with the path to the file as its only argument, and the script returns the flags for that file, one per line. This makes it pretty trivial to implement LSP support for any build system, e.g. Bazel / Buck / CMake. It also makes background indexing simpler, as the index commands can be used to know where the user is currently working and which indexes need to be generated in the background.
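To make the contract concrete, here is a minimal sketch of the calling side, written in Python for brevity (SourceKit-LSP itself is written in Swift). The function names `parse_flags` and `query_flags` are hypothetical; the only parts taken from the proposal are that the script receives the file path as its sole argument and prints one flag per line.

```python
import subprocess


def parse_flags(output: str) -> list[str]:
    """Split the delegate's stdout into flags, one per line, skipping blank lines."""
    return [line for line in output.splitlines() if line.strip()]


def query_flags(delegate: str, source_path: str) -> list[str]:
    """Run the delegate with the file path as its only argument and return its flags.

    `delegate` is the path to the script named in the config (hypothetical here).
    """
    result = subprocess.run(
        [delegate, source_path],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return parse_flags(result.stdout)
```

Keeping one flag per line means arguments that contain spaces (an SDK path, for instance) need no shell quoting on either side.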

Let me know your feedback, please. If people are happy with this approach, I can put up a PR shortly.

Thanks,
Richard

Thanks for starting this discussion! I'm very interested in improving our ability to work with other build systems. Some design comments follow.

For some use cases, it is important to be able to monitor for changes to the build settings and update them as needed. For example, when you add a new file to a swiftpm target directory, or modify the Package.swift file, we should be able to (a) get the new settings immediately and (b) notify any consumers of build settings (e.g. an existing open document) that the settings have changed. We haven't implemented update notifications for any build system in sourcekit-lsp yet, but it's something I always imagined adding.

This sort of watch-and-notify model seems like it might work better with a build setting server rather than a one-off query. I suppose another approach would be for the command-line tool to provide a list of files to watch, and to re-query when any of them change. But that may be more expensive, and could be limiting if the set of files to watch can change a lot over time. We don't necessarily need to build a full solution for keeping the settings up-to-date right away, but I think it's important to understand how we might get there, or at least how any build settings solution we build fits into this.
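As a rough illustration of the "list of files to watch" alternative, the sketch below polls modification times and reports which watched files changed, at which point the settings would be re-queried. It is written in Python for brevity; the helper names `snapshot` and `changed_paths` are hypothetical, and polling is just one assumed way to detect changes.

```python
import os


def snapshot(paths):
    """Record the modification time of each watched file (None if it is missing)."""
    state = {}
    for path in paths:
        try:
            state[path] = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            state[path] = None
    return state


def changed_paths(old, new):
    """Return the sorted paths whose recorded state differs between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

The cost concern raised above shows up here directly: every poll touches every watched file, and the watch list itself has to be refreshed whenever the build graph changes.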

I like that this gives us a way to provide the "plugin" out of band such that it would work without needing editor-specific support. I think we could also support alternative ways to configure it, such as the initialization options in the protocol, or using command-line arguments if anyone were interested.

Can you expand on this? Are you talking about driving index data generation based on the queries being made? How would that work in practice - for example, you probably want to have more index data than just the files you've looked at.

This is similar to what I had in mind for Bazel - you could watch for modifications to BUILD files - then you would need to rerun Bazel in order to understand the changes that were made. Due to the nature of this, you wouldn't want many of these scripts firing off at once (each creating a Bazel command) - you'd really want some sort of server to manage it, whether it be in sourcekit-lsp itself or a separate process.

There's also the problem of generated file dependencies. More so for Bazel/Buck than other build systems, but it's entirely possible that the file that the user has open depends on some generated files. Once again, you might need some deep build system integration to handle this.

Also not sure what you mean by this; I would think that you really want a single process to understand all of the files that a user has open (effectively: what is the user trying to do? what needs to be built?) and index as needed. sourcekit-lsp currently has this information although currently it's not exposed to the build system (perhaps it should be in some fashion).

Also fine, although potentially a little more fiddly depending on editor. It made sense to me to use a config file as the other integrations are already going this way, but I'm fine with either.

I agree that you will want some sort of server process to handle these requests efficiently, but I was thinking of leaving that as an implementation detail for the specific build system (for instance Buck already has a daemon mode and watches files, so some of this work is handled for us, not sure if Bazel has something similar).

This would definitely be nice to have. How would you envision sending the notifications from an external build system? Adding an interface to the server instance to call, or using Darwin notifications or something?

Basically I was suggesting this as a lazy person's insight into what the developer is working on without directly exposing the editor actions through the LSP. The idea would be that the script is called to gather flags for a file, but it could also evaluate whether or not the dependent swiftmodules and header maps need to be generated and so on, and kick off builds in the background to make sure the index store for that subtree is populated. It would be more elegant to handle this more explicitly, but I think this handles the primary use case of making files buildable already.

SGTM.

Would the Buck/Bazel build servers themselves be responsible for notifying us? For other build systems such as cmake-server we would need to figure out what the lifetime is, since they don't have a long-lived daemon the way Buck/Bazel do.

If we made the server aspect explicit, I would just expect this to be part of the API. If we use an implicit model like you are proposing, something like DistributedNotificationCenter might work on macOS (I'm not an expert here), but we'd need an answer for Linux (dbus?) and Windows. Or we could have the service tell sourcekit-lsp a path to watch for file changes.

Got it.

Yes, I think that's the most sensible. If the build system did not notify the language server, at what point would the AST get updated?

Seems like adding something to the API is simplest then.

I talked to @akyrtzi a bit about this, and he reminded me of a couple other pieces of information that we will want from the build system:

  • List of all known files, which we can use to kick off background indexing as needed
  • List of all output files, which is important for controlling visibility of index data; e.g. if you remove a source file, we need to know that its unit file in the index is now dead/hidden.

And we'll want to keep those up to date for changes as well.

It's not necessary that every build system we integrate with provide all of these features, but I bring it up so that we can design this in a way that extends to a fully featured build system interaction.


I think my biggest concern about using the simple command-line invocation is that it makes it harder for us to evolve the API and harder to build a fully featured solution in the future. An explicit server, while it is more work upfront, makes it easier to evolve.

The AST might get rebuilt because you changed the source file in the editor, but without being told by the build system about the changed settings, we would continue to use whatever command-line arguments we had already cached from the last time. So we wouldn't see the updated build settings until you re-opened the project, or perhaps if you closed the document and we happened to throw out our cached settings.
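A small sketch of that caching behaviour, in Python for brevity. `SettingsCache` and its method names are hypothetical and are not sourcekit-lsp's actual types; the point is only that a cached entry survives until something explicitly tells the cache that the settings changed.

```python
class SettingsCache:
    """Caches build settings per file until the build system reports a change."""

    def __init__(self, query):
        self._query = query  # callable: file path -> list of flags
        self._cache = {}

    def settings(self, path):
        """Return cached settings, querying the build system only on a miss."""
        if path not in self._cache:
            self._cache[path] = self._query(path)
        return self._cache[path]

    def settings_changed(self, path):
        """Notification hook: drop the cached entry so the next lookup re-queries."""
        self._cache.pop(path, None)
```

Without a call to `settings_changed`, repeated lookups keep returning the first answer, which is the stale-settings situation described above.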

Note: there might be other interesting notifications from a build system. For example, if you kick off building .swiftmodules for dependencies in the background, knowing when they finish building would allow us to rebuild the AST to get updated diagnostics.

I'd recommend taking a look at this Build Server Protocol to see if it's something that is close to what we need.

I would rather push the background indexing responsibility to the build system. Wouldn't it be better placed to know what needs to be built and what doesn't?

The output files part is interesting; I had not considered the need to clean stale data out of the index. If we didn't, what would the outcome be? Would the language server get data for unit files with no corresponding source file? Couldn't we ignore the results and delete the stale data in that case?

This does ring true. Maybe it's worth listing all the interactions the language server would want from the build system and designing something minimal around that instead? My only concern would be the cost of entry for simpler build systems, but I guess there are not that many that would be using this facility.

Thanks, I'll take a look at this.

We could theoretically detect a missing file during lookup, but it's harder to do that efficiently. Also, the source file being removed was just one example. Another is that a source file maps to a different output file path than it did before (e.g. because of a build configuration change), so we might end up with stale data and have no way to detect that.
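To illustrate why the list of output files matters, here is a minimal sketch in Python. The function `visible_units` and the dictionary shape are hypothetical and do not reflect the real index store format; the assumption is simply that each unit in the index is keyed by the output path it was produced for.

```python
def visible_units(index_units, current_outputs):
    """Keep only index units whose output path the build system still reports.

    index_units: dict mapping output file path -> unit data (hypothetical shape).
    current_outputs: iterable of output paths currently known to the build system.
    """
    live = set(current_outputs)
    return {out: unit for out, unit in index_units.items() if out in live}
```

With this filter, a unit left behind by an earlier build configuration is hidden even though its source file still exists, which a missing-source-file check alone would not catch.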
