Building index for the server

This topic is a continuation of a discussion that I believe may be interesting for everyone (original discussion: Try to build SPM workspace at open, if index store is not available. by krzyzanowskim · Pull Request #69 · apple/sourcekit-lsp · GitHub)

We have ideas about making it possible to use index data for more things, but I don't foresee this completely replacing live ASTs.

mind to share the ideas?

the fact that we trigger dependency resolution right now is a bug. When I first implemented this, there was no way to opt-out of that resolution, but @Aciid said he'd be fine with us adding a way to opt out. Just haven't got to this yet.

Let's assume dependency resolution is disabled as requested... what left? everything depends on it. Is "opt-out" still a way to go? please elaborate.

To summarize discussion:

  1. SPM has to resolve dependencies to generate a llbuild manifest (it's up to SPM, not up to a server to initiate this task)
  2. server wants to reuse llbuild manifest (debug.yml or build.db), for sourcekitd and index generation.
  3. llbuild manifest could be generated by SPM, without triggering an actual build afterward (requires SPM modification(?)) - SPM build manifest generation is not part of libSwiftPM, hence either spawn spm binary with e.g. --disable-build, or expose this part in SwiftPM and use it - (requires SPM modification). In theory, there is LLBuildManifestGenerator that should be able to generate manifest, but there seems to be more logic in SPM.
  4. Indexer job continuously updates the index as files are modified (bear in mind that an LSP is aware of the changes that are not yet saved to the underlying storage - I'd say most of the time changes are not yet saved, but it is expected to have modification already in the index)

Now interaction with manual calls (initiated by the user) to SPM is more tricky:

  1. Does SPM have any mechanism to prevent simultaneous builds? I see llbuild does have some:
    <unknown>:0: error: unable to attach DB: error: accessing build database "/Users/marcinkrzyzanowski/Devel/swift-source/sourcekit-lsp/.build/build.db": database is locked Possibly there are two concurrent builds running in the same filesystem location.
    error: terminated(1): /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift-build-tool -f /Users/marcinkrzyzanowski/Devel/swift-source/sourcekit-lsp/.build/debug.yaml main output:
    
    don't see anything obvious in spm though. Checking for "build.db" lock feels a bit hacky.
  2. SPM may use a custom build directory that won't be shared with the server.
  3. Server indexer should update on each llbuild manifest change - custom build parameters may appear as a result of manual build command.
  4. A toolchain version used to generate dependencies (swiftmodules) and an indexer toolchain has to be aligned, otherwise, it won't work (will it?)

did I miss anything? thoughts?

What LSP operation is triggering the package resolution? It definitely shouldn't because it can leave the caches in a bad state if the user and server start an operation simultaneously. Also, the llbuild manifest is pretty much an implementation detail of SwiftPM and it shouldn't be used directly by other tools.

the fact that we trigger dependency resolution right now is a bug.

I'm being imprecise with my terminology, I think. I specifically meant that we shouldn't be hitting the network and fetching/updating packages.

We call Workspace.loadPackageGraph() to get a package graph, which we pass on to produce the BuildPlan that gives us compiler arguments.

Can you expand on what this issue is and what we should be doing instead? We're not going out of our way to trigger resolution, it's just part of package loading, which appears to be the only way to eventually get a build plan etc.

As part of loading the package graph, libSwiftPM will implicitly kick off the resolution machinery to fetch and resolve the dependencies if they're not present or if the package requirements change because of modifications to Package.swift file. Since there could be a race condition between LSP and an user initiated package operation, the state files like .build/dependencies-state.json and .build/repositories/checkouts-state.json could end up in a bad or corrupt state.

We should add a boolean to that method that would skip the package resolution process and only load the package graph object with the information it already has. However, if we notice that the graph requires re-resolution, we can emit a warning from SwiftPM that LSP can use to notify users about kicking off a build or resolve operation. Will that work for your usecase?

I think that's what we want, although since you mentioned issues with the intermediate files: what happens if there is no build directory at all yet? Will we be able to dig out any information, such as the set of files in each target, etc? What about in the case where the package has no dependencies?

You will always get the information about the root package (unless the manifest is broken).

1 Like

We're on the same page I guess. What I meant is that dependency resolution shouldn't result in starting the SPM machinery, instead, being a read-only operation. agree?

@akyrtzi said: this may mean that the sourcekit-lsp service never initiates such a build on its own, but only has the hooks for the editor/IDE to control this.

Technically this is feasible:

  • despite missing distinct client -> server notification about the index updates, maybe DidChangeWatchedFiles Redirecting… could trigger this. (+ experimental client capability)
  • I think it's unexpected approach among the LSP implementations to move indexing responsibility to the client, isn't it? - ergo the sourcekit-lsp won't work out of the box with any client.

I've made such a solution for my own use and it seems to work.

One more thing is index format - is it ok to build (and generate index) with Swift 4.2 and index with Swift 5.x at the same time? I expect problems here - correct me if I'm wrong but it rather needs separate swiftmodules, isn't it?

It only requires that the user builds the package, which is what users will do after cloning a package anyway.

There shouldn't be a problem, the same compiler can understand both and the index format is independent of the swift language version.

If the source is in non-buildable state, it won't build successfully, and won't index the source, and client need to force a programmer to fix the sources to build index. Is there a known way to build index despite errors of indexed source code?

The scenario I'm thinking about is as follows:

  1. start sourcekit-lsp where server doesn't trigger the package build by itself - so no index
  2. client build the sources, but build failed due to source errors
  3. sourcekit-lsp dont pickup the SPM changes (because SwiftPMWorkspace.reloadPackage() is not called - correct me if I'm wrong @blangmuir).
  4. client have to restarts the server to pickup on resolved SPM package.

Indexing doesn't require there to be no errors, but it's true that a build system will usually cancel the build some time after the first failure, so we may not have the opportunity to index everything that way.

Reloading after changes is something I want to add soon. This shouldn't be affected by build failures though, only changes to the package model - e.g. adding a new file.

Adding a new file is one thing. This one is clear.

The other thing is a scenario where none of the manifesto files changes - just resolution possibilities changes. We stated earlier that the server shouldn't build a workspace during initialization. In that scenario, the initial package graph contains information about what's inside Package.swift without external deps. We remember and store this graph. Later, when client builds the SPM package - we should load updated graph, but how do we get notified about it? Neither Package.swift nor Package.resolved is modified - we have to load graph again (call SwiftPMWorkspace.reloadPackage()):

  • One options is to have a non-standard notification from the client to the server
  • Other option is to look at changes in .build directory - can't rely on that as it's an implementation detail (right?).

We can have swiftpm decide what needs watching, so this should be fine I think.

is that something that exists already? (I'm not aware of)

update: SwiftPM.Workspace has a bunch of public paths, including managedDependencies.statePath that maybe, maybe is the file we should (and can) look at @Aciid ?

SwiftPM should expose an API that gives you a list of paths to watch.

Can you elaborate more? I try to combile swiftc, -continue-building-after-errors, -index-ignore-system-modules, -index-store-path, -index-file yet the error prevented from generating an index:

$ swiftc -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -continue-building-after-errors -index-ignore-system-modules -index-store-path . -index-file main.swift

main.swift:2:8: error: no such module 'Kitura'
import Kitura
       ^

When I do this I get index data. Could you provide more details about what you're trying?

  • What toolchain version is this?
  • When you say "prevented from generating and index" what are you seeing - no index data directory created at all?

https://swift.org/builds/development/xcode/swift-DEVELOPMENT-SNAPSHOT-2019-03-04-a/swift-DEVELOPMENT-SNAPSHOT-2019-03-04-a-osx.pkg

it does, no records though. It creates something like this:

v5/
├── records
└── units
    └── main-3HU6EVUNNF6IA

the directory for a file without errors looks like this

v5/
├── records
│   └── DO
│       └── main.swift-ILZJ7X5YS6DO
└── units
    └── main-2FLLG6Y2JEC3

do you think it's all fine?

What's the content of main.swift? If the not having the Kitura module causes none of the symbols to resolve at all, that could cause an empty record.

hm... makes sense.

import Foundation
import Kitura
import LoggerAPI
import HeliumLogger
import Application

do {
    HeliumLogger.use(LoggerMessageType.info)

    let app = try App()
    try app.run()

} catch let error {
    Log.error(error.localizedDescription)
}