I am having a hard time attempting to use Swift in a production app on Windows!
It turns out that if your `URLSessionTask` fails due to a timeout, I get a total crash instead of the task completing with an error.
I thought I'd see if I could use something a little more bare-metal like async-http-client, only to realize there are some major blockers keeping it from building for Windows (mostly swift-nio issues).
Currently there is no pure-Swift way of making a web request that is guaranteed not to crash on Windows.
In my case, if a user blocks network access of my Swift-backed Windows software through a 3rd party tool or if my server times out for any reason, the crash happens.
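For reference, here is a minimal sketch of the behavior I would expect: a timeout or a cancel shows up as a `URLError` in the completion handler, and the process keeps running. The names `FetchOutcome`, `classify`, and `fetch` are made up for illustration, and the sketch assumes the error really is delivered to the handler, which is exactly the part that can crash on Windows instead.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives here in swift-corelibs-foundation
#endif

/// Hypothetical result type: what the caller hopes to get back instead of a crash.
enum FetchOutcome: Equatable, Sendable {
    case success(statusCode: Int)
    case timedOut
    case cancelled
    case failed(description: String)
}

/// Maps the completion handler's arguments to an outcome. No networking involved.
func classify(response: URLResponse?, error: Error?) -> FetchOutcome {
    if let urlError = error as? URLError {
        switch urlError.code {
        case .timedOut: return .timedOut
        case .cancelled: return .cancelled
        default: return .failed(description: urlError.localizedDescription)
        }
    }
    if let error = error {
        return .failed(description: error.localizedDescription)
    }
    guard let http = response as? HTTPURLResponse else {
        return .failed(description: "no HTTP response")
    }
    return .success(statusCode: http.statusCode)
}

/// Starts a data task with an explicit timeout and reports a classified outcome.
@discardableResult
func fetch(_ url: URL, timeout: TimeInterval,
           completion: @escaping @Sendable (FetchOutcome) -> Void) -> URLSessionDataTask {
    var request = URLRequest(url: url)
    request.timeoutInterval = timeout  // seconds before the request should fail with .timedOut
    let task = URLSession.shared.dataTask(with: request) { _, response, error in
        completion(classify(response: response, error: error))
    }
    task.resume()
    return task
}
```

Keeping `classify` separate from the network call means the error mapping can be checked without touching the network, so any remaining crash can be attributed to the layers underneath rather than to the app's own handling.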
Relevant GitHub Issue: [SR-13383] Windows, Dispatch crashes after network request timeout · Issue #609 · apple/swift-corelibs-libdispatch · GitHub
PR with "dubious" fix: Don't crash due to closed socket on Windows by triplef · Pull Request #772 · apple/swift-corelibs-libdispatch · GitHub
P.S. I also get a similar full crash and burn upon attempting to `cancel()` a `URLSessionTask` of any kind on Windows...
If anyone is up to the task of fixing any or all of these problems, I am more than willing to pay top of market for development time.
Follow-up: maybe mostly fixed in corelibs-foundation with this recently merged PR? Only time will tell.
apple:main ← readdle:readdle/urlsession-dispatchsource-close (opened 11:12PM - 26 Dec 23 UTC)
To avoid the issues described in #4791 we have to follow `DispatchSource` cancel procedures in the first place, i.e. we have to prevent the underlying socket from being closed until `DispatchSource` calls its cancel handler.
Everything becomes a bit complicated, because we have multiple cases for the socket and `DispatchSource` lifecycles. Cancelling a `DispatchSource` is an asynchronous operation (with control points at the start and at the end of the process), and other actions, despite being simple and synchronous, become more complicated because they depend on the cancel operation. Some of the basic cases:
- The socket could (but would not necessarily) outlive its owning `easy_handle`, as CURL caches connections in `multi_handle`
- 99% of the time `DispatchSource` cancels quickly, but sometimes other work on the queue squeezes in between the start and the end of cancelling. CURL could easily send a series of register-unregister-register requests, and we have to carefully handle possible overlapping.
This change aims to extend the socket's life by tying its lifecycle to a boxing object (`_SocketReference`). The socket is not closed as long as `_SocketReference` is alive. While the `DispatchSource` cancel process is ongoing, we keep such an object alive, sharing it through a storage with the [close socket function](https://curl.se/libcurl/c/CURLOPT_CLOSESOCKETFUNCTION.html). The close socket function implementation marks `_SocketReference` as eligible for closing. If there is no ongoing `DispatchSource` cancel, the reference is deinited immediately, effectively closing the socket.
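The lifetime rule described above can be illustrated with a small pure-Swift model. This is not the PR's code: `SocketBox` and `SocketRegistry` are hypothetical names, the socket is just an `Int32`, and the real close call is replaced by an injected closure. It only sketches the idea that the socket is closed when the last reference to its box goes away, and that an in-flight cancel holds one extra reference.

```swift
/// Hypothetical stand-in for a boxing object such as `_SocketReference`:
/// it closes its socket exactly once, when the last reference is released.
final class SocketBox {
    let socket: Int32
    private let closeSocket: (Int32) -> Void

    init(socket: Int32, closeSocket: @escaping (Int32) -> Void) {
        self.socket = socket
        self.closeSocket = closeSocket
    }

    deinit { closeSocket(socket) }
}

/// Hypothetical storage shared between the cancel path and the close callback.
/// Not thread-safe; this sketch assumes all calls happen on one serial queue.
final class SocketRegistry {
    private var owned: [Int32: SocketBox] = [:]           // normal ownership
    private var pinnedByCancel: [Int32: SocketBox] = [:]  // extra reference during a cancel

    func register(_ box: SocketBox) { owned[box.socket] = box }

    /// Call when cancellation of the socket's dispatch source starts.
    func cancelDidStart(for socket: Int32) {
        if let box = owned[socket] { pinnedByCancel[socket] = box }
    }

    /// Call from the dispatch source's cancel handler.
    func cancelDidFinish(for socket: Int32) { pinnedByCancel[socket] = nil }

    /// Call from the close-socket callback: the socket becomes eligible for closing.
    func requestClose(of socket: Int32) { owned[socket] = nil }
}
```

With this model, calling `requestClose(of:)` while a cancel is in flight does nothing visible until `cancelDidFinish(for:)` drops the last reference, whereas calling it with no cancel pending closes the socket immediately.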
Also, this change tries to be as minimally invasive as possible: mostly additive, without structural changes. The major work is done by manipulating the state of `_SocketReference`. I believe there is a better way to handle this, but it would probably require a more extensive rework of how Dispatch Sources are managed and stored.
And now the fly in the ointment. This fixes most of the crash scenarios on Windows, but not all. I can now say confidently that Dispatch has some flaws related to socket processing on that platform. During stress testing I observed numerous Dispatch crashes when adding a socket handle after it had been reused by the system, and that is after a graceful and complete cancel of the corresponding `DispatchSource`. Luckily, it appears only under heavy load (the test `test_repeatedRequestsStress` is marked as expected to fail), but it definitely affects the final product and our users.
Also, this changeset includes tests from #4854 and timer fixes from #4858 (as they make everything work more predictably). If this whole PR turns out to be unreliable for some reason, I'd like to merge the aforementioned changes separately.