Proposal: swift-corelibs-foundation: Replace libcurl with SwiftNIO and AsyncHttpClient

[cc @millenomi @Tony_Parker]

Motivation

The swift-corelibs-foundation (scl-f) URLSession implementation for the http and
ftp protocols currently uses libcurl, provided either by the host's package system
or built separately as part of the toolchain build (Windows). Both protocols still
have bugs, and more work is needed on authentication, cookies, session metrics and
some concurrency bugs in the general URLSession codebase. Fully implementing
everything that is still outstanding will take significant effort.

Proposed Solution

I am proposing to replace the libcurl implementation with async-http-client
(AHC) based on swift-nio.

URLSession/http would become a thin wrapper around AHC, which already implements
most of the functionality required, although its delegate needs expanding so that
it can be mapped to all of the URLSession delegate methods. This would let scl-f
benefit from the development effort going into AHC/NIO instead of splitting
resources across two full HTTP clients.
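
To illustrate what "thin wrapper" could mean here, the following is a minimal sketch of an adapter that translates events from an HTTP backend into URLSession-style delegate callbacks. Every name in it (BackendEvent, SessionStyleDelegate, BackendToSessionAdapter, RecordingDelegate) is hypothetical and invented for this illustration; it is neither AHC's actual delegate API nor the real URLSession delegate protocols, and a real mapping would have to cover redirects, authentication challenges, metrics and more.

```swift
// Hypothetical events an HTTP backend might report.
// This is NOT AHC's real delegate API, only a stand-in for it.
enum BackendEvent {
    case head(status: Int, headers: [String: String])
    case bodyChunk([UInt8])
    case finished
    case failed(String)
}

// Hypothetical, heavily reduced set of URLSession-style callbacks.
protocol SessionStyleDelegate: AnyObject {
    func didReceiveResponse(status: Int, headers: [String: String])
    func didReceiveData(_ bytes: [UInt8])
    func didComplete(error: String?)
}

// The "thin wrapper": each backend event is forwarded to the matching callback.
final class BackendToSessionAdapter {
    private weak var delegate: SessionStyleDelegate?

    init(delegate: SessionStyleDelegate) {
        self.delegate = delegate
    }

    func handle(_ event: BackendEvent) {
        switch event {
        case .head(let status, let headers):
            delegate?.didReceiveResponse(status: status, headers: headers)
        case .bodyChunk(let bytes):
            delegate?.didReceiveData(bytes)
        case .finished:
            delegate?.didComplete(error: nil)
        case .failed(let message):
            delegate?.didComplete(error: message)
        }
    }
}

// A delegate that records what it was told, used only to show the mapping.
final class RecordingDelegate: SessionStyleDelegate {
    var log: [String] = []

    func didReceiveResponse(status: Int, headers: [String: String]) {
        log.append("response \(status)")
    }

    func didReceiveData(_ bytes: [UInt8]) {
        log.append("data \(bytes.count)")
    }

    func didComplete(error: String?) {
        if let error = error {
            log.append("error \(error)")
        } else {
            log.append("complete")
        }
    }
}

let recorder = RecordingDelegate()
let adapter = BackendToSessionAdapter(delegate: recorder)
let events: [BackendEvent] = [
    .head(status: 200, headers: ["Content-Type": "text/plain"]),
    .bodyChunk([104, 105]),
    .finished,
]
for event in events {
    adapter.handle(event)
}
print(recorder.log)  // ["response 200", "data 2", "complete"]
```

Any URLSession delegate method with no corresponding backend event is a place where the backend's delegate would need expanding.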

FTP support would be dropped unless an alternative FTP NIO client is developed.

AHC and NIO sources would be vendored into scl-f: a copy of each package would be
placed in scl-f's source tree, and a script would then rename the NIO modules so
that they do not clash with the normal NIO modules. The renamed modules would be
imported with @_implementationOnly.
This would allow Foundation / FoundationNetworking to still be used by applications
that themselves use either NIO1 or NIO2 packages.

The source of the packages in the scl-f source tree would be manually updated as necessary.

Although a complete copy of the NIO sources would be built as part of scl-f, the
extra binary size of including NIO in FoundationNetworking would still be smaller
than linking to libcurl, which on Ubuntu currently links to about 40 non-system
libraries.

CMake would still be used to build scl-f so the NIO packages would require CMakefiles.
The following packages would be used:

This would mean that NIO support would need to be finished on Windows and would
be a future requirement for any platform that wanted to be supported by Swift.

This will not be a quick change to make, as there are many parts that need to be
implemented. The URLSession code that remains would also be ported to async/await
if that is found to be the best way to fix the concurrency issues. The whole effort
will probably take at least a year.

Please note that even though FoundationNetworking will use NIO code this does not
mean that FoundationNetworking or Foundation will become Swift packages.
Both will remain part of the toolchain and continue to be built with CMake as part of the toolchain build.

Main tasks

  • Convert scl-f from an Xcode project to Package.swift to make it easier to
    develop with packages. This DOES NOT mean scl-f itself will be a package, only
    that it is easier for development and testing builds of scl-f itself on Linux
    and macOS with packages.

  • Enhance AHC's client and delegate methods to provide all of the functionality
    that URLSession/http requires. This will also benefit clients of AHC.

  • Convert scl-f's _HTTPURLProtocol to use AHC.

  • Add CMakefiles for all of the NIO projects.

Acknowledgements

Thanks to @johannesweiss, @lukasa and @graskind for ideas on using NIO/AHC and
feedback on this proposal.

This is very cool! And a good move for security and reliability!

It's really not a good idea to have Foundation, with its own URL type and parser, delegating requests to cURL (which also has its own URL parser). It means that an application could use Foundation's URL types to attempt to sanitise a request - maybe checking the hostname, port, etc. - only to find that cURL sends it somewhere entirely different! Those kinds of issues come up in pretty much every major programming language, and even though I don't know of any specific examples in Foundation (I've never looked, TBH), it's entirely possible that they exist.

The only way to eliminate these issues is for the application and request code to use the same URL parser. That way, you know the host you connect to will definitely be the same as the one in the .host property from your URL, the path you send will be the .path property, etc. Since AsyncHttpClient uses the Foundation URL parser, we will have those guarantees after this switch.
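
As a minimal sketch of the kind of sanitising check described above: the allow-list and the validatedURL helper are hypothetical names made up for this illustration, and the URL is deliberately an unambiguous one. The check is only meaningful if the code that eventually sends the request derives the same host, port and path from the string as Foundation's URL does here.

```swift
import Foundation

// Hypothetical allow-list, used only for this illustration.
let allowedHosts: Set<String> = ["api.example.com"]

/// Returns the parsed URL only if Foundation's parser reports an allowed host.
func validatedURL(_ string: String) -> URL? {
    guard let url = URL(string: string),
          let host = url.host,
          allowedHosts.contains(host.lowercased()) else {
        return nil
    }
    return url
}

if let url = validatedURL("https://api.example.com:8443/v1/items") {
    // These are the components the application has checked. A client that
    // re-parses the string with a different parser could disagree with them.
    print(url.host ?? "", url.port ?? -1, url.path)
}
```

With a separate parser in the transport layer, a string that passes this check could in principle still be interpreted differently when the request is actually sent.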

See A New Era of SSRF - Exploiting URL Parser in Trending Programming Languages! for more information about these kinds of bugs and how they can be exploited (that's where the picture is from).

FWIW, Foundation's URL gives the hostname as evil.com for the above example. Assuming cURL hasn't changed their parser, Foundation doesn't even have the option of aligning with other implementations (or the latest standard, which says google.com should be the hostname from that example). After the move to AHC, that could very well change.

Would this also apply to Darwin platforms? From what I've been able to discover, URLSession allows FTP downloads (but not uploads). Is this saying that FTP downloads may no longer be supported (through URLSession) in some future SDK?

FTP almost certainly should go. If it were up to me, I'd support dropping it from Apps built with the latest SDK.

What exactly are the platform requirements for NIO? Is it still possible to implement on less traditional platforms, such as WASI?

Finally, while I know this isn't the goal of the proposal, I think lots of developers would appreciate if there were builds of AHC/NIO with access to some of URLSession's magic behaviour (e.g. background sessions on iOS). New APIs would be ideal, but even binary-only builds of those packages with the additions would be welcome. I'm sure it's been heavily requested and there are good reasons why it hasn't been possible, but maybe it's a good opportunity to take another look at it =)

None of this proposal applies to Darwin's Foundation used on Apple platforms. This change would only apply to the open source Foundation used on Linux, Windows etc. And yes, FTP would be dropped from a future version of the open-source Foundation.

I'm not sure what WASI provides for networking; @Max_Desiatov may have an idea.

There isn't anything for networking in WASI right now. A WASI host can pass a file handle to an executed module, but that's basically it. Because of that, in SwiftWasm we exclude all URLSession-related code from the build. I hope that's going to be just as easy when libcurl is replaced with NIO.

This doesn't mean you can't do HTTP with SwiftWasm. At least in browsers it's easy enough to call the fetch API through JavaScriptKit instead. But that's not highly relevant here. The fetch API is substantially different from URLSession, and currently it's not worth the effort for us to align them.

It's been explicitly the goal of AHC/NIO to not build those features, and to leave it to libraries such as Foundation to build the higher abstractions that URLSession provides.

This proposal doesn't discuss replacing URLSession, just the implementation detail of relying on libcurl. All of the current "magic behavior" would still exist as is; it's just that SwiftNIO and AHC would be the ones driving it under the hood :)

FWIW, NIO’s platform requirements are currently excessive. As the effort to support Windows has shown, the base NIO package requires too much from the OS, making it extremely difficult to shim. This also makes running NIO in WASM hard due to the sheer quantity of things we need out of the OS.

A future-looking piece of work would split out the currency types and protocols from the MultithreadedEventLoopGroup and its associated bootstraps. This would remove, at a stroke, the need to get a giant codebase compiling on the specific platform. It would replace it with the need to implement custom event loops and channels, but ideal implementations do that anyway.

NIO and AHC have no plans to establish stable ABIs at this time. The performance penalty on source-available code is still too great. However, NIO and AHC would be implementation details of URLSession, so in some sense the way you’d get a stable ABI would be to call URLSession instead.

"libcurl, which on Ubuntu currently links to about 40 non-system libraries"

I've already run into this issue with the Swift SDKs I distribute for Android (which includes the corelibs like Foundation), so I look forward to any effort to remove that dependency.

I think on Windows you'd really want to base Foundation on WinSock and WinHTTP for pretty much all the reasons why you'd want to use URLSession on Darwin platforms.
Proxies, SSL, all that stuff a user would expect to configure within the Windows control panel. I think, my Windows is rusty ;-)

First, I agree with @Helge_Hess1 - using WinSock + WinHTTP for network access on Windows is a significantly better approach - and in fact is what currently happens. swift-corelibs-foundation uses libcurl, which is built against WinSock and WinSSL, which means that the global system configuration actually takes part in the configuration of network accesses.

There are multiple technical issues which still need to be resolved before the Windows port can continue to make progress. I do not have the time to complete the entire port myself at this point, and it would require someone else to step up to help with it. The port does require a strong understanding of Windows, Linux, Darwin, ELF, PE/COFF, Mach-O, and networking fundamentals. The library has a very strong assumption of POSIX and Unix and does require re-architecting various pieces (some of which has been done with the extraction of POSIX interfaces, SUS interfaces, and the addition of a WinSock-based semantic compatibility layer).

At this point, I don't think that simply switching Foundation to NIO is viable without someone first being able to complete the NIO port for Windows, which is not very simple to accomplish.

More importantly it would still be a fundamentally wrong approach for Foundation (as much as I'd like to get NIO forward on Windows).
I'm not even sure it makes sense on Linux.

Funny, but relevant, history note: The first thing I did at Apple was porting a libneon (kinda like libcurl for DAV) codebase to WinHTTP native. Because that's what you want.

Ultimately only a version of async-http-client would need to be ported to Windows. NIO is really just an implementation detail of that library.

Why not directly implement URLSession using WinHTTP, i.e. why inject yet another layer with additional dependencies? Getting a decent version running doesn't sound like a huge amount of work, and would further the Windows port w/o having to wait for NIO, which has largely different deployment targets.

Also, there is CoreFoundation (available as FOSS), which does have support for Windows. Not sure whether any HTTP stuff is included in that, though.

P.S.: I'm absolutely not against another URLSession that optionally sits on top of NIO, but I have major doubts that one makes sense on Windows, and very likely not as the default on Linux. For all the usual reasons (security, system setup, integration, etc).