The hang in libdispatch happens in the following PIPE_WAIT scenario:
pipe is empty
monitor thread calls the 0-byte ReadFile and blocks on the read (which is the expected way to get async IO)
another thread calls PeekNamedPipe on the pipe and also blocks, which blocks the corresponding queue and results in a hang
It might be counter-intuitive but synchronous file operations on Windows appear to be holding a lock throughout the entire execution - which is the likely reason why Peek would block in presence of a blocked Read. For example, NtQueryInformationFile in ReactOS calls IopLockFileObject on synchronous operations (NtQueryInformationFile also provides info about available read bytes and hangs the same way as Peek).
I don't see any simple way to get around this in libdispatch. There are additional complications with the way the library is structured, where the handler of the read operation only has access to the pipe handle and no other data seems to be shared between the monitor thread and the operation handling thread, so I don’t think there’s even an easy way to add hacks to try to avoid Peek hangs. Additionally, libdispatch has 3 threads involved in this situation, with 2 of them spinning with a ping-pong event notification while data is present in the pipe and the 3rd thread consuming the data (which then sporadically hangs if the pipe is emptied and one of the other threads is blocked on a read). A proper solution here would require some significant surgery requiring a good understanding of libdispatch internals - which I do not have, and I’m told it’s fairly likely no one does at this point given the unfortunate fate of libdispatch.
I’ve discussed this with @compnerd and we’re not sure what route we should be taking here to get us further. My idea with the prototype above was that we could leave the dispatch use as is, while avoiding specifically the pipe handling (DispatchIO) part which is what is ultimately causing this trouble on Windows. However I’m told this may not be a suitable state for sourcekit-lsp, and might introduce subtle bugs due to mixing dispatch and concurrency, so asking for guidance here. Thanks!
One other idea could be to use FILE_FLAG_OVERLAPPED on Windows, which should allow for truly async pipe access avoiding the hang with PIPE_WAIT. However, pipe creation is generally outside of libdispatch control. Windows API does allow to flip the pipe to PIPE_WAIT on the fly, but this doesn’t seem to be the case with overlapped I/O. So this would have to be configured on the sourcekit-lsp end. I’m not sure if/how this is exposed to Swift.
That is not exposed to SourceKit-LSP as it uses Foundation's APIs and overlapped IO is not a concept on Unix, so there is no API for that. In order to extend that, you would need to go through the Foundation evolution process.
IMO migrating away from dispatch and using proper structured concurrency for SourceKit-LSP seems like a nice approach. It would help simplify and modernise the SourceKit-LSP codebase, resolve this particular issue, and open up SourceKit-LSP to get any improvements in the future due to enhancements of the concurrency support.
FWIW there should be a way to do what I did in my prototype with dispatch queues instead of the different concurrency model. Yes that would be in the workaround category as it would effectively re-implement what DispatchIO already provides (for a reason: it’s suboptimal on Windows and we don’t know how to fix it), but that’s a much smaller scope change than redoing everything without dispatch so we can get the benefit to the user quicker.
Thanks for your interest and further digging here @roman-bcny!
Removing DispatchIO from JSONRPCConnection.swift seems reasonable to me, though I’d prefer moving both read and writing off in that case. Your current PR never has a yield in reader though, so it holds onto one of the worker threads in the co-operative pool for the entire lifetime of the process.
Assuming there’s not similar issues with FileHandle on Windows (@compnerd?), we could probably move to using eg. FileHandle.readabilityHandler for reading + a write on the existing sendQueue for writing. Would be nice to AsyncStream-ify that as well, but I don’t think that’s necessary for the initial replacement.
My PR is not meant to be merge-able and is just an MVP/demo to illustrate the idea of how simple replacing that DispatchIO piece might look (probably a bit more complex in the end, with both read and write parts, but hopefully shouldn’t be too far in terms of complexity). Note that this is my first ever attempt to write anything in Swift so terrible choices are expected
The underlying reason why this gets very complicated on Dispatch side isn’t some particular flaw of the pipe API, but because Windows support was patched in as an addition to the existing already complex threading model and the choice was made to do this using an additional monitor thread. It probably seemed like the simple way forward back then, but that was probably done without realization that you can’t simultaneously do a blocking Read and a non-blocking Peek. That is what ultimately causes all the trouble, and as long as we stay within that model getting around issues would be hard.
My PR is meant to demonstrate how we could replace this entirely on the Swift side instead with a simpler model that doesn’t have to retrofit into an existing complex framework / threading model. Note that the additional thread that I’m using in the PR also exists with DispatchIO in a slightly different form (monitor thread which continuously spins on 0-byte ReadFile on non-blocking pipe or blocks on it in case of PIPE_WAIT), so it is not a regression in that regard. But as I don’t know what I’m doing in Swift it is very possible that I used the incorrect way to spawn that thread. The idea was that this thread would exist while the pipe is open and terminate after, just like the monitor thread with DispatchIO would: swift-corelibs-libdispatch/src/event/event_windows.c at 0bb6c10fa556722654917b4a18ba2dc39b18392a · swiftlang/swift-corelibs-libdispatch · GitHub
Thanks a lot for your investigations here, @roman-bcny and for opening the PR to switch from DispatchIO to plain FileHandle. If we can switch to use FileHandle directly and remove the dependency on Dispatch for stdin/stdout handling, that would be great, assuming that we don’t buy ourselves a truckload of problems that DispatchIO was previously handling for us – which I am not currently aware of.
If you find time to make your PR review-ready that would be greatly appreciated and I’m sure all SourceKit-LSP users on Windows will love you for it
Let’s move those code-related discussions to your PR, I think it’s easier to discuss them on GitHub.
Regarding your question about close(flags: stop): I don’t think we need to have any hard guarantees that we won’t read any more messages from stdin after calling close() on the JSONRPCConnection, ie. I would expect everything to work just fine if we didn’t pass flags: .stop here. I suggested setting inFD.readabilityHandler = nil here in the PR, I think that should give fairly similar semantics. Or do you think that’s missing something? If so, let’s continue this conversation in Prototype pipe handling without dispatch IO by roman-bcny · Pull Request #2315 · swiftlang/sourcekit-lsp · GitHub.
Ah yes thanks, I missed this bit in readabilityHandler docs: "To stop reading the file or socket, set the value of this property to nil. Doing so cancels the dispatch source and cleans up the file handle’s structures appropriately."
start() returns the process so supposedly it's the caller's responsibility to deal with process termination, a quick test suggests that we stop receiving updates but the subprocess keeps running (adding inFD.close() keeps the same behavior)
I should be able to put something together soon.
Might need to go back to understanding why availableData was hanging in the first place there. I don’t think reading until EOF is the intended use of readabilityHandler so this doesn’t look right.
@owenv do you still intend to land DispatchIO in that patch or are you looking into alternatives?