How Do You Build a Sandboxed Editor that Uses sourcekit-lsp?

flowtoolz · October 4, 2020, 6:39pm

Hello

I've been thinking about how to do this for a long time and finally had time to try. Needless to say, I spent this Sunday on it going bonkers ...

Question:

What is the "proper" way of a sandboxed macOS app to launch sourcekit-lsp?

Context:

I'm building a sandboxed macOS app. Let's call it a code "Editor". Right now it only supports Swift. I wanna employ LSP so I can support more languages in the future. So the first step is to use sourcekit-lsp for Swift.

So far ...

Building the sourcekit-lsp executable worked right away. Awesome job!

I just have not been able to launch sourcekit-lsp from my app ...

As far as I understand now, and I hope I'm wrong, it is impossible to publish such an editor through the AppStore, because it cannot be sandboxed, because it needs to launch "3rd party" executables: our beloved language servers.

I can't just launch it as a Process or NSTask. I can't bundle it with the app and launch it from there (for many reasons). I can't use the NSOpenPanel, because that does not grant permission to launch an executable, even if the user has selected it. I imagine even using an XPC extension with its own entitlements wouldn't solve the problem, because sourcekit-lsp is still a "3rd party" application.

I'm sure I have multiple blind spots here, so I'd be super happy about any pointer in the right direction.

krzyzanowskim · October 4, 2020, 7:00pm

You can ship your own build of sourcekit-lsp in the bundle, it won't help much with sandboxing though.

sourcekit-lsp is missing an XPC transport layer similar to the one in the clangd project (D54428 [clangd] XPC transport layer, framework, test-client).

flowtoolz · October 4, 2020, 8:22pm

... Can't even use /usr/bin/xcrun from the sandbox ... But /usr/bin/perl works

What would then be the (future) official way to do this? My hunch is now:

Language servers will at some point be part of the OS. So sourcekit-lsp will be located in /usr/bin and will be publicly usable. Apple just isn't there yet.
As long as a language server isn't an "official" part of the OS, the IDE developer herself must build and ship that language server with her IDE (whether in a distinct helper app using XPC or as part of the main executable).

Is that accurate in any way?

flowtoolz · October 4, 2020, 8:30pm

So at this point, I need to compile and ship sourcekit-lsp (and every other language server I want to use) with my app? Or is there any other way?

blangmuir · October 5, 2020, 5:46pm

Can you provide more details about how you're calling it and what the error you are getting is?

flowtoolz · October 6, 2020, 9:56am

Absolutely

Well, it starts with not being able to sudo copy sourcekit-lsp to a "world readable" folder:

mv: rename sourcekit-lsp to /usr/bin/sourcekit-lsp: Operation not permitted

So I let the user locate the executable via NSOpenPanel and tried to launch it like so:

let process = Process()
        
process.executableURL = sourcekitLSPExecutable
process.arguments = ["--help"]
        
do {
    try process.run()
}
catch {
    print(error.localizedDescription)
}

The resulting error: The file “sourcekit-lsp” doesn’t exist. This is due to the sandbox. The panel is not intended for executables, as some Apple engineer stated in the Apple developer forums.

So I tried to use sourcekit-lsp via xcrun (if that's possible, I picked the idea up somewhere ...), first checking whether I can use xcrun at all:

process.executableURL = URL(fileURLWithPath: "/usr/bin/xcrun")
process.arguments = ["--help"]

This logs xcrun: error: cannot be used within an App Sandbox.

But I can perfectly use perl:

process.executableURL = URL(fileURLWithPath: "/usr/bin/perl")
process.arguments = ["--help"]

If I put sourcekit-lsp into /usr/local/bin and launch it from there, the error states: The file “sourcekit-lsp” doesn’t exist. But launching pod there doesn't work either, so I guess /usr/local/bin is off the table for sandboxed apps.

Maybe sourcekit-lsp would need to be installed at a "system level" in /usr/bin with the same access level as perl ... ?

I fear, at this point, sandboxed apps simply can't use local language servers.

blangmuir · October 6, 2020, 5:47pm

Thanks @flowtoolz and sorry for the trouble. I confirmed that this is currently the expected behaviour of the sandbox. To use sourcekit-lsp or any other language server from inside a sandboxed app, you would currently have to bundle it in the app itself and add the com.apple.security.app-sandbox and com.apple.security.inherit entitlements.
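As a rough illustration of the bundling route, here is a minimal sketch of launching a helper that ships inside the app bundle. It assumes the helper binary was copied next to the main executable and signed with the two entitlements named above; the function name launchBundledServer is made up for this example, and nothing here has been verified against App Review.

```swift
import Foundation

/// Launches a language server that ships inside the app bundle.
/// Returns nil if the bundle contains no helper executable with that name.
/// Assumption: the helper is signed with com.apple.security.app-sandbox
/// and com.apple.security.inherit so it can inherit the app's sandbox.
func launchBundledServer(named name: String,
                         in bundle: Bundle = .main) throws -> Process? {
    // Looks for the helper next to the bundle's main executable.
    guard let url = bundle.url(forAuxiliaryExecutable: name) else {
        return nil
    }

    let process = Process()
    process.executableURL = url
    // LSP traffic goes over the helper's stdin and stdout.
    process.standardInput = Pipe()
    process.standardOutput = Pipe()
    try process.run()
    return process
}
```

Even if the launch succeeds, the caveats in the next paragraph still apply: the server may need toolchain executables and files that the sandbox does not expose.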

I should also note that we are not testing sourcekit-lsp in a sandboxed configuration, so it's possible it may fail or be missing functionality in such an environment. For example, we call out to other executables and libraries from the toolchain, and we need access to files in the source directory and SDK.

If you're interested in filing enhancement requests for this use case, feedbackassistant.apple.com would be the right place. One potential request would be the ability to execute a user-selected executable inside a sandboxed app when using NSOpenPanel. If you do file such a report, please send me the feedback number so that I can follow up.

Maybe sourcekit-lsp would need to be installed at a "system level" in /usr/bin with the same access level as perl ... ?

/usr/bin/sourcekit-lsp does exist in macOS 11, but it will not help here, because under the hood it uses xcrun to find the real binary inside Xcode (this is also how many other /usr/bin executables work, including /usr/bin/swiftc).

flowtoolz · October 6, 2020, 10:49pm

Hello @blangmuir,

Much thanks for the detailed response and all your work on sourcekit-lsp!

Yes, that's what I also realized.

I had another idea, and it seems to work in principle. Maybe this would also be interesting for the community: What if there was a web service running locally that provides access to different language servers via WebSockets?

Edit: What worked so far: I created a dummy web service with Vapor, ran it on my Mac at http://127.0.0.1:8080 and talked to it via WebSockets from the editor app.

That non-sandboxed Vapor web service can indeed launch sourcekit-lsp, although I'm still lost on how the back and forth works. I tried to put an encoded request message into the input pipe of sourcekit-lsp and expected a response message to come out of the output pipe:

let process = Process()

func testSourceKitLSP() {
    process.executableURL = URL(fileURLWithPath: "/path/to/sourcekit-lsp")
    
    let inputPipe = Pipe()
    process.standardInput = inputPipe
    
    let outputPipe = Pipe()
    process.standardOutput = outputPipe

    let outputFile = outputPipe.fileHandleForReading
    outputFile.waitForDataInBackgroundAndNotify()
    
    NotificationCenter.default.addObserver(forName: .NSFileHandleDataAvailable,
                                           object: outputFile,
                                           queue: nil) { _ in
        let outputData = outputFile.availableData
        print(String(data: outputData, encoding: .utf8) ?? "error decoding output")
    }
    
    do {
        try process.run()
    } catch {
        print(error.localizedDescription)
    }
    
    if let messageData = testMessageData() {
        inputPipe.fileHandleForWriting.write(messageData)
    }
}

testSourceKitLSP()

Edit: No error is logged or comes out of the error pipe, the process runs, but the output pipe never fires ... When I type something into the console, I at least get an error like this:

Fatal error: fatal error encountered decoding message MessageDecodingError(code: LanguageServerProtocol.ErrorCode(rawValue: -32700), message: "expected \':\' in message header", id: nil, messageKind: LanguageServerProtocol.MessageDecodingError.MessageKind.unknown): file LanguageServerProtocolJSONRPC/JSONRPCConnection.swift, line 219

Edit2: https://github.com/flowtoolz/LSPService

flowtoolz · October 8, 2020, 3:35pm

Regarding sourcekit-lsp:

It seems the problem was that sourcekit-lsp only sends any output if the input was correctly encoded. It would be very helpful in developing a client, though, if sourcekit-lsp sent something out of the output or error pipe when it cannot decode the data it receives. Only on the console do I see a decoding error, as sourcekit-lsp passes the error message to fatalError(...). Not sure why nothing at all happens when my client app sends faulty messages ...
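For anyone hitting the same wall: "correctly encoded" here means the LSP base protocol, where every JSON-RPC body is preceded by a Content-Length header and a blank line. A minimal sketch of that framing follows; the helper name frameLSPMessage and the contents of the sample request are illustrative, not taken from my client.

```swift
import Foundation

/// Wraps a JSON-RPC payload in the LSP base protocol header.
func frameLSPMessage(_ json: Data) -> Data {
    // Content-Length counts the bytes of the JSON body, not its characters.
    // Header lines end with \r\n, and an empty line separates header and body.
    var framed = Data("Content-Length: \(json.count)\r\n\r\n".utf8)
    framed.append(json)
    return framed
}

// A bare-bones "initialize" request; the parameter values are placeholders.
let initializeBody = Data(
    #"{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"processId":null,"rootUri":null,"capabilities":{}}}"#.utf8
)
let initializeMessage = frameLSPMessage(initializeBody)
```

The resulting data is what would be written to the server's standard input, for example via inputPipe.fileHandleForWriting.write(initializeMessage).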

Regarding the Sandbox:

Technically it does work now. But I wonder whether this approach will fly with the Mac App Store review, mostly because the Language Server Host app might be considered a plugin that affects the main app's behaviour. But I guess this is off-topic here.

blangmuir · October 9, 2020, 11:23pm

Do you have a suggestion for what should be done differently? I don't think we want to try to recover from invalid messages (e.g. malformed json, incomplete data) in sourcekit-lsp, since we cannot ensure any consistent state between the client/editor and the language server.

flowtoolz · October 10, 2020, 10:12pm

I'm not deep enough in the matter to know what state would need to be recovered. So I see three options:

1. Send an LSP Error Message but Don't Crash

As a client, I'd expect no state to change. A message that can't even be interpreted shouldn't have the power to alter state. But I expect to receive an LSP message informing me that something went wrong. How I deal with that info, I'd assume, would be my responsibility: maybe I shut down sourcekit-lsp and start anew, maybe I go to some fallback state, whatever ...

In particular during development, an LSP message like this would be golden:

Content-Length: 91

{
    "jsonrpc": "2.0",
    "id": null,
    "error": {
        "message": "expected ':' in message header",
        "code": -32700
    }
}

At the point the error is recognized (JSONRPCConnection.swift, line 231), all infos needed to form the above LSP message are present.

Within sourcekit-lsp, types like enum RequestID currently don't explicitly allow null JSON values, but the LSP specification does. Maybe we could adapt sourcekit-lsp so it can build the above message.
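To make the null-ID idea concrete, here is a small standalone sketch of an error response type that encodes a missing ID as an explicit JSON null. The types ErrorResponse and Payload are hypothetical and are not sourcekit-lsp's own; the ID is simplified to an optional Int, although LSP also allows string IDs.

```swift
import Foundation

/// Hypothetical stand-in for a JSON-RPC error response.
struct ErrorResponse: Encodable {
    struct Payload: Encodable {
        let code: Int
        let message: String
    }

    let id: Int?        // nil means the request ID could not be recovered
    let error: Payload

    enum CodingKeys: String, CodingKey { case jsonrpc, id, error }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode("2.0", forKey: .jsonrpc)
        if let id = id {
            try container.encode(id, forKey: .id)
        } else {
            // Write "id": null explicitly instead of omitting the key.
            try container.encodeNil(forKey: .id)
        }
        try container.encode(error, forKey: .error)
    }
}

/// Builds the JSON body of a parse error (-32700) with a null ID.
func encodeParseError(_ message: String) throws -> Data {
    let response = ErrorResponse(id: nil,
                                 error: .init(code: -32700, message: message))
    let encoder = JSONEncoder()
    encoder.outputFormatting = .sortedKeys  // stable key order for readability
    return try encoder.encode(response)
}
```

The custom encode(to:) is needed because the synthesized Encodable conformance would drop a nil id entirely rather than write null.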

2. Send an LSP Error Message and Then Crash

Even if sourcekit-lsp would indeed need to shut down, I'd expect to receive the above mentioned error message before that. I'm not sure how intrusive that change would need to be to the current implementation, as we'd need to wait for the (quite deep reaching asynchronous) message sending to complete before calling fatalError.

3. Deliver Any Kind of Error Message and Then Crash

Getting an error message would enormously help developing a sourcekit-lsp client, in particular since the message I saw in MessageDecodingError was already quite informative. So I (naively) put in this log here:

} catch let error as MessageDecodingError {
  switch error.messageKind {
      ...
    case .unknown:
      log("decoding error (code \(error.code.rawValue)) for message of unknown type: \(error.message)",
          level: .error)
      break
  }
  // FIXME: graceful shutdown?
  fatalError("fatal error encountered decoding message \(error)")
}

But the message didn't show up in the client app that launches sourcekit-lsp. So I tested all thinkable "channels":

FileHandle.standardError.write(("stdErr: " + error.message).data(using: .utf8)!)
FileHandle.standardOutput.write(("stdOut: " + error.message).data(using: .utf8)!)
log("log: \(error.message)", level: .error)
print("print: \(error.message)")

The result:

When I build and run sourcekit-lsp in Xcode and then type faulty input into the Xcode console, all of those channels end up on the Xcode console, even before the fatal error.
When I run sourcekit-lsp in Terminal and type in faulty input there, all but log get printed before the fatal error. The log output arrives after the fatal error, packaged into an LSP message: {"jsonrpc":"2.0","method":"window\/logMessage","params":{"type":4,"message":"log: expected ':' in message header"}}
When I build and run my client app in Xcode, let it launch sourcekit-lsp, and let it write faulty input to the standard input of sourcekit-lsp, nothing at all happens: sourcekit-lsp does not terminate, does not print, and sends neither output data nor error data. But with a correct message as input, it does send an LSP response message to its output.

If you know a way to make sourcekit-lsp provide some feedback in case of undecodable input, I'd happily try to implement that and create a PR.

blangmuir · October 12, 2020, 5:11pm

FileHandle.standardError.write(("stdErr: " + error.message).data(using: .utf8)!)

Are you capturing the stderr from the sourcekit-lsp process? If so, I would have expected this to work. I think this would also work to send the message on stderr

log(...)
Logger.shared.flush()
fatalError(...)

This seems reasonable to me in principle, if the implementation is not overly complex. We would need to synchronously wait on sending the reply, since we want the fatalError to happen with the correct stack trace.
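On the stderr question, a minimal client-side sketch of capturing a child process's standard error might look like the following. The function name runCapturingStandardError is made up for this example, and it reads until the child exits, which suits a quick experiment with a crashing server rather than a long-running session.

```swift
import Foundation

/// Runs an executable and returns everything it wrote to standard error.
func runCapturingStandardError(_ executable: URL,
                               arguments: [String]) throws -> String {
    let process = Process()
    process.executableURL = executable
    process.arguments = arguments

    // Without this pipe, the child's stderr goes to the parent's stderr
    // and never reaches the client code.
    let errorPipe = Pipe()
    process.standardError = errorPipe

    try process.run()
    // Blocks until the child closes stderr, i.e. until it exits or crashes.
    let data = errorPipe.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()
    return String(decoding: data, as: UTF8.self)
}
```

For a server that keeps running, setting a readabilityHandler on errorPipe.fileHandleForReading would be the non-blocking alternative.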

On the subject of recovering from parse errors:

It's not that the state of the service is modified, but that the editor thinks the state has been modified when it has not. Consider the notifications used to modify the contents of text documents. If one of those is malformed, the editor and language server will disagree about the contents of the document. Notifications have no identity, so we have no way to inform the editor which notification failed. For requests, the same can happen if we are unable to recover the request identifier during parsing.

How would the editor recover from this state? It does not know which request or notification failed, so it does not know what the state on the server is. In fact, the particular error you are using as an example is worse: the message base protocol itself failed. How will the server and editor agree about where to restart processing in the byte stream?
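The byte-stream problem can be seen in a sketch of a reader for the base protocol. This is an illustration only, not sourcekit-lsp's actual parser, and the name nextMessageBody is hypothetical. Message boundaries come solely from the Content-Length header, so once a header is malformed the reader has nothing to tell it where the next message starts.

```swift
import Foundation

/// Splits one complete message body off the front of `buffer`.
/// Returns nil if the buffer holds no complete, well-formed message.
func nextMessageBody(from buffer: inout Data) -> Data? {
    // The header block ends at the first empty line.
    guard let headerEnd = buffer.range(of: Data("\r\n\r\n".utf8)) else {
        return nil
    }
    let headerData = buffer[buffer.startIndex..<headerEnd.lowerBound]
    guard let header = String(data: headerData, encoding: .utf8) else {
        return nil
    }

    // Find the Content-Length field among the header lines.
    var contentLength: Int?
    for line in header.components(separatedBy: "\r\n") {
        let parts = line.split(separator: ":", maxSplits: 1)
        if parts.count == 2,
           parts[0].trimmingCharacters(in: .whitespaces).lowercased() == "content-length" {
            contentLength = Int(parts[1].trimmingCharacters(in: .whitespaces))
        }
    }
    // Without a valid length there is no way to locate the next message.
    guard let length = contentLength, length >= 0 else { return nil }

    let bodyStart = headerEnd.upperBound
    guard buffer.distance(from: bodyStart, to: buffer.endIndex) >= length else {
        return nil  // body not fully received yet
    }
    let bodyEnd = buffer.index(bodyStart, offsetBy: length)
    let body = Data(buffer[bodyStart..<bodyEnd])
    buffer = Data(buffer[bodyEnd...])  // keep the remainder for the next call
    return body
}
```

Note that this sketch returns nil both for "not enough data yet" and for "malformed header"; a real implementation would have to distinguish the two, and only the first case is recoverable by waiting for more bytes.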

flowtoolz · October 13, 2020, 1:54pm

Thanks @blangmuir for your help and patience

Here's a tiny PR: Log error before fatalError(...) when message decoding fails by flowtoolz · Pull Request #333 · apple/sourcekit-lsp · GitHub. log(...) already produces an LSP message, so that's what I went for.

Turns out the "Terminal" behaviour of sourcekit-lsp just isn't as comparable to the "client subprocess" behaviour as I assumed. In my client, the sourcekit-lsp input requires at least a proper LSP message header to produce the error feedback. Those "header decoding errors" like expected ':' in message header only show up in Terminal. But as far as I'm concerned, the PR sufficiently closes the feedback gap.

blangmuir · October 13, 2020, 6:50pm

Merged, thanks!

Is this still true after Log to stderr instead of sending window/logMessage by DavidGoldman · Pull Request #327 · apple/sourcekit-lsp · GitHub? CC @DavidGoldman

flowtoolz · October 13, 2020, 11:01pm

Ah, you're right, instead of the message from stdOut I now get this from stdErr:

[2020-10-13 23:59:35.820] error decoding message: jsonrpc version must be 2.0

I was on the latest commit in my forked repo, but still using the build from the cloned repo which had a similar change but was probably on an older commit ...

So, I'm working to replace the log with a synchronous error response using a null ID, something like:

send(async: false) { encoder in
    try encoder.encode(JSONRPCMessage.errorResponse(ResponseError(error), id: .null))
}

flowtoolz · October 14, 2020, 1:24am

@blangmuir 2nd try: Synchronously send error response on message decoding failure by flowtoolz · Pull Request #334 · apple/sourcekit-lsp · GitHub

flowtoolz · October 29, 2022, 12:17pm

Hey @krzyzanowskim !

Since this thread, I went the WebSocket way to enable my sandboxed app Codeface to talk to different LSP servers. Users have understandably taken issue with how cumbersome my solution is.

But recently I heard you figured out a way to do that via XPC after all. Matt also offers the ProcessService package.

My question: What is your current preferred way to launch processes from a sandboxed app? Does the new ExtensionKit change the game? Also, I wonder whether - in terms of AppStore rules - the XPC solution is illegal or in a grey area or just tricky to do. I'd be glad to know a few pointers or even just the relevant terms to google