SourceKit-LSP CPU spinning and libdispatch complications

Hi there! I have been looking into the Windows CPU spinning issue SourceKit-LSP use of CPU · Issue #1461 · swiftlang/sourcekit-lsp · GitHub (caused by the underlying DispatchIO.read spinning while pipe is open on Windows · Issue #820 · swiftlang/swift-corelibs-libdispatch · GitHub ) and investigated the root cause of Windows: swift-package-manger blocks on send to swift-build · Issue #882 · swiftlang/swift-corelibs-libdispatch · GitHub

It turned out that specific hang was on the Read side and not on the Write side that @z2oh described in DispatchIO.read spinning while pipe is open on Windows · Issue #820 · swiftlang/swift-corelibs-libdispatch · GitHub . Both issues are caused by changing pipes to PIPE_WAIT. Note it is the Read side that causes sourcekit-lsp spinning, as confirmed by this quick prototype replacing the read side of DispatchIO with just a reader thread in sourcekit-lsp: Prototype pipe handling without dispatch IO by roman-bcny · Pull Request #2315 · swiftlang/sourcekit-lsp · GitHub

The hang in libdispatch happens in the following PIPE_WAIT scenario:

  • pipe is empty
  • monitor thread calls the 0-byte ReadFile and blocks on the read (which is the expected way to get async IO)
  • another thread calls PeekNamedPipe on the pipe and also blocks, which blocks the corresponding queue and results in a hang

It might be counter-intuitive but synchronous file operations on Windows appear to be holding a lock throughout the entire execution - which is the likely reason why Peek would block in presence of a blocked Read. For example, NtQueryInformationFile in ReactOS calls IopLockFileObject on synchronous operations (NtQueryInformationFile also provides info about available read bytes and hangs the same way as Peek).

I don't see any simple way to get around this in libdispatch. There are additional complications with the way the library is structured, where the handler of the read operation only has access to the pipe handle and no other data seems to be shared between the monitor thread and the operation handling thread, so I don’t think there’s even an easy way to add hacks to try to avoid Peek hangs. Additionally, libdispatch has 3 threads involved in this situation, with 2 of them spinning with a ping-pong event notification while data is present in the pipe and the 3rd thread consuming the data (which then sporadically hangs if the pipe is emptied and one of the other threads is blocked on a read). A proper solution here would require some significant surgery requiring a good understanding of libdispatch internals - which I do not have, and I’m told it’s fairly likely no one does at this point given the unfortunate fate of libdispatch.

I’ve discussed this with @compnerd and we’re not sure what route we should be taking here to get us further. My idea with the prototype above was that we could leave the dispatch use as is, while avoiding specifically the pipe handling (DispatchIO) part which is what is ultimately causing this trouble on Windows. However I’m told this may not be a suitable state for sourcekit-lsp, and might introduce subtle bugs due to mixing dispatch and concurrency, so asking for guidance here. Thanks!

(edit3)

5 Likes

CC: @owenv

NtQueryInformationFile/IopLockFileObject ReactOS link reactos/ntoskrnl/io/iomgr/iofunc.c at 435482912c6dc0aaa4371e37b7336b2b1291e65f · reactos/reactos · GitHub

(edit)

One other idea could be to use FILE_FLAG_OVERLAPPED on Windows, which should allow for truly async pipe access avoiding the hang with PIPE_WAIT. However, pipe creation is generally outside of libdispatch control. Windows API does allow to flip the pipe to PIPE_WAIT on the fly, but this doesn’t seem to be the case with overlapped I/O. So this would have to be configured on the sourcekit-lsp end. I’m not sure if/how this is exposed to Swift.

That is not exposed to SourceKit-LSP as it uses Foundation's APIs and overlapped IO is not a concept on Unix, so there is no API for that. In order to extend that, you would need to go through the Foundation evolution process.

IMO migrating away from dispatch and using proper structured concurrency for SourceKit-LSP seems like a nice approach. It would help simplify and modernise the SourceKit-LSP codebase, resolve this particular issue, and open up SourceKit-LSP to get any improvements in the future due to enhancements of the concurrency support.

3 Likes

FWIW there should be a way to do what I did in my prototype with dispatch queues instead of the different concurrency model. Yes that would be in the workaround category as it would effectively re-implement what DispatchIO already provides (for a reason: it’s suboptimal on Windows and we don’t know how to fix it), but that’s a much smaller scope change than redoing everything without dispatch so we can get the benefit to the user quicker.

Thanks for your interest and further digging here @roman-bcny!

Removing DispatchIO from JSONRPCConnection.swift seems reasonable to me, though I’d prefer moving both read and writing off in that case. Your current PR never has a yield in reader though, so it holds onto one of the worker threads in the co-operative pool for the entire lifetime of the process.

Assuming there’s not similar issues with FileHandle on Windows (@compnerd?), we could probably move to using eg. FileHandle.readabilityHandler for reading + a write on the existing sendQueue for writing. Would be nice to AsyncStream-ify that as well, but I don’t think that’s necessary for the initial replacement.

My PR is not meant to be merge-able and is just an MVP/demo to illustrate the idea of how simple replacing that DispatchIO piece might look (probably a bit more complex in the end, with both read and write parts, but hopefully shouldn’t be too far in terms of complexity). Note that this is my first ever attempt to write anything in Swift so terrible choices are expected :slight_smile:

The underlying reason why this gets very complicated on Dispatch side isn’t some particular flaw of the pipe API, but because Windows support was patched in as an addition to the existing already complex threading model and the choice was made to do this using an additional monitor thread. It probably seemed like the simple way forward back then, but that was probably done without realization that you can’t simultaneously do a blocking Read and a non-blocking Peek. That is what ultimately causes all the trouble, and as long as we stay within that model getting around issues would be hard.

My PR is meant to demonstrate how we could replace this entirely on the Swift side instead with a simpler model that doesn’t have to retrofit into an existing complex framework / threading model. Note that the additional thread that I’m using in the PR also exists with DispatchIO in a slightly different form (monitor thread which continuously spins on 0-byte ReadFile on non-blocking pipe or blocks on it in case of PIPE_WAIT), so it is not a regression in that regard. But as I don’t know what I’m doing in Swift it is very possible that I used the incorrect way to spawn that thread. The idea was that this thread would exist while the pipe is open and terminate after, just like the monitor thread with DispatchIO would: swift-corelibs-libdispatch/src/event/event_windows.c at 0bb6c10fa556722654917b4a18ba2dc39b18392a · swiftlang/swift-corelibs-libdispatch · GitHub

(edit)

Updated to use FileHandle.readabilityHandler per Ben’s suggestion (with @compnerd’s blessing offline) and it’s looking much cleaner and still works in my tests! (although I have little idea what I’m doing with the queues so this might very well be wrong/deadlock-y) Prototype pipe handling without dispatch IO by roman-bcny · Pull Request #2315 · swiftlang/sourcekit-lsp · GitHub

(edit)

One thing to keep an eye on is that we appear to have had problems with availableData on Windows in the past: Use `readData` to read data from a pipe instead of `availableData` by ahoppen · Pull Request #8047 · swiftlang/swift-package-manager · GitHub. I don't have firsthand knowledge of exactly how this was manifesting though

Thanks a lot for your investigations here, @roman-bcny and for opening the PR to switch from DispatchIO to plain FileHandle. If we can switch to use FileHandle directly and remove the dependency on Dispatch for stdin/stdout handling, that would be great, assuming that we don’t buy ourselves a truckload of problems that DispatchIO was previously handling for us – which I am not currently aware of.

If you find time to make your PR review-ready that would be greatly appreciated and I’m sure all SourceKit-LSP users on Windows will love you for it :smiley:

2 Likes

Thanks @owenv, I went ahead and replaced availableData with (try? fileHandle.read(upToCount: Int.max)) ?? Data() ^_^

I will get the write side to a (seemingly) working state and send the PR for review.

Please LMK if there is anything in particular that I should be testing for this change (apart from whatever CI will do on the PR)

ah, one more aspect that we might need to be more careful with is interaction with close()

I’m not sure what would be the equivalent way to express this idea with bare .write and readabilityHandler - some kind of cancellation?

Also one alternative would be to kill the subprocess and let the pipes empty normally, but I’m not sure what is the expected behavior here.

(edit)

Let’s move those code-related discussions to your PR, I think it’s easier to discuss them on GitHub.

Regarding your question about close(flags: stop): I don’t think we need to have any hard guarantees that we won’t read any more messages from stdin after calling close() on the JSONRPCConnection, ie. I would expect everything to work just fine if we didn’t pass flags: .stop here. I suggested setting inFD.readabilityHandler = nil here in the PR, I think that should give fairly similar semantics. Or do you think that’s missing something? If so, let’s continue this conversation in Prototype pipe handling without dispatch IO by roman-bcny · Pull Request #2315 · swiftlang/sourcekit-lsp · GitHub.

Ah yes thanks, I missed this bit in readabilityHandler docs: "To stop reading the file or socket, set the value of this property to nil. Doing so cancels the dispatch source and cleans up the file handle’s structures appropriately."
start() returns the process so supposedly it's the caller's responsibility to deal with process termination, a quick test suggests that we stop receiving updates but the subprocess keeps running (adding inFD.close() keeps the same behavior)
I should be able to put something together soon.

1 Like

One thing to keep an eye on is that we appear to have had problems with availableData on Windows in the past

Sigh, this gets more complicated than it seemed. As this comment pointed out that patch changed the read from EOF=false to EOF=true Use `readData` to read data from a pipe instead of `availableData` by ahoppen · Pull Request #8047 · swiftlang/swift-package-manager · GitHub so this probably does a single blocking read until the process finishes.. which now Fix test output streaming on Windows by owenv · Pull Request #9191 · swiftlang/swift-package-manager · GitHub is trying to solve by going the DispatchIO route. Looks like we’ve gone full circle, lol.

Might need to go back to understanding why availableData was hanging in the first place there. I don’t think reading until EOF is the intended use of readabilityHandler so this doesn’t look right.

@owenv do you still intend to land DispatchIO in that patch or are you looking into alternatives?

I don't, that PR is hitting similar hangs. I'm looking at potentially beginning to adopt swift-subprocess for these use cases instead