Announcing Swift HTTP Types

Guoye · July 10, 2023, 6:23pm

We're happy to introduce a Swift package for shared HTTP currency types. Details are on the Swift.org blog: Introducing Swift HTTP Types

johannesweiss · July 10, 2023, 7:51pm

super cool, thanks everybody who was involved!!

taylorswift · July 10, 2023, 9:32pm

re: Integrating with SwiftNIO (i wish these headings had anchors),

is there any reason why the SwiftNIO integration is going in the swift-nio-extras package and not swift-nio itself? by my count there would now exist three independently incrementing version tuples in an NIO-based build tree that uses this package, and that seems needlessly contrived.

lukasa · July 10, 2023, 9:54pm

Yes: adding the dependency to swift-nio would unnecessarily force users to add this to their build graph even when they only require NIOCore, for example. Keeping the core of SwiftNIO as dependency-free as possible is a goal here.

Note that the long-term goal is that we will replace NIO's types entirely, at which point we'll revisit the dependency graph.

taylorswift · July 10, 2023, 10:36pm

in my experience there exists a point at which “don’t include what you don’t use” becomes more harmful than it is beneficial.

for modules this threshold is rather distant: for example, there are almost always huge dividends to be reaped by cleaving off a Foundation linkage even if it makes the intra-package dependency graph more complex. another reason for this is that modules within the same package are unversioned with respect to one another, so the cost of breaking up large modules into many smaller modules is low. so, over time, i have gravitated towards package layouts that have many targets with many dependency edges between them, because this kind of atomized structure works well for modules.

for packages i have learned the opposite lesson. dividing up packages into many federated components is almost never worthwhile.

benefits of federated packages:

SPM clones fewer repositories for users who are only using the “core” package
can upgrade small packages one-by-one instead of performing a huge migration
easier to document what a small package “does”

drawbacks of federated packages:

nearly impossible to work on features that require changes to more than one package at a time
cannot possibly test every combination of pin resolutions
developers tend to waste time over-engineering upstream packages for imaginary use cases instead of allowing development to be driven by the needs of the downstream packages
developers end up having to coordinate “joint” PRs that all need to be merged separately into a constellation of source control repositories.
very hard to look back and understand the history of many related packages as a unit

as for the first two benefits, i find that they are not really that valuable in practice.

SPM clones fewer repositories for users who are only using the “core” package
- for me, whether or not SPM needs network access at all to build something is more important than “how much” it uses the network once we have already accepted it needs to download something from GitHub. simply put, if SPM is going to clone swift-nio, it might as well clone swift-http-types along with it.
can upgrade small packages one-by-one instead of performing a huge migration
- i have found that API stability is inversely related to package federation. versions of federated packages usually end up sorting themselves into “epochs”, so upgrading package A from version 0.1 to version 0.2 usually also requires upgrading package B from 0.5 to 0.6, package C from 0.3 to 0.4, et cetera. as much as the individual packages want to pretend they have independent timelines, they actually share one development timeline; we have just taken the version number of one package and given it a large number of “aliases” for the other packages.

at a higher level, i think that a big reason for this is because SPM uses the word package to refer to what other languages think of as a version-controlled repository, and we do not have the ability to organize a project into many packages that still share a common repo and version sequence. and i think this is a problem unique to swift, because of a peculiar limitation of its preferred build system.

stackotter · July 11, 2023, 2:04am

I completely agree. I have personally tried splitting a project into separate repos before, complete development nightmare, and that’s just when I was working on it alone. Changes that affect multiple packages are super annoying to do without ending up with some possible way someone could end up with incompatible versions.

Unless the parts are actually just independent packages, such as a maths package and a graphics package that depends on the maths package, I just find the multi-repo architecture unwieldy.

When a repo is called project-extras, imo it’s going to be getting developed in tandem with the base project and might as well be in the main repo, or split into independent projects with clear purposes.

I can see that that’s probably just a matter of opinion, but I can say that I’ve always avoided NIO because of it feeling unwieldy (can’t exactly explain why in a concise way)

Karl · July 11, 2023, 8:19am

This looks like a very interesting package!

One thing: I noticed that this library uses CoreFoundation directly. IIRC that is not portable; CF is available on Darwin platforms and Linux, but is deliberately omitted from other platforms (e.g. Windows).

lukasa · July 11, 2023, 11:46am

To the core of your feedback, sure, we can retarget the PR towards core NIO. It'll delay our ability to merge until we drop 5.6 support, but that's potentially a worthwhile trade.

As to your broader concerns, this is generally all accurate, but the reason SwiftNIO has historically had federated packages is solely about dependency management in older Swift versions. Specifically:

swift-nio-ssl originally had a dependency on the system OpenSSL. That was unacceptable in the core library, where users should not have build-time failures because of a missing system library dependency. It now carries a copy of BoringSSL.
swift-nio-http2 originally had a dependency on the system copy of nghttp2.
swift-nio-ssh has a dependency on swift-crypto, which forces a build and link of a copy of BoringSSL, which would be the second copy if we also had SSL in here. It also forces a breaking change as SSH has minimum deployment targets, which NIO core does not.
swift-nio-extras depends on the system zlib.

All of this seems like overkill now for ssl, http2, and extras, but it's also worth adding the most current Swift version at the time of their release:

ssl: 4.0
http2: 4.1
extras: 4.1

None of these SwiftPM versions had support for conditional target dependencies (came in 5.3) or target based dependency resolution (came in 5.2), so adding them to the core NIO repo would have unnecessarily forced their system dependencies on all users.
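To make the conditional target dependency feature concrete, here is a minimal manifest sketch. It is not NIO's actual Package.swift: the package, product, and target names are hypothetical, and the system library target only stands in for the kind of zlib dependency mentioned above.

```swift
// swift-tools-version:5.3
import PackageDescription

// Hypothetical layout for illustration only; none of these names come from
// the real swift-nio or swift-nio-extras manifests.
let package = Package(
    name: "example-networking",
    products: [
        .library(name: "ExampleCore", targets: ["ExampleCore"]),
        .library(name: "ExampleCompression", targets: ["ExampleCompression"]),
    ],
    targets: [
        // The core target has no system dependencies at all.
        .target(name: "ExampleCore"),

        // A system library target that locates zlib through pkg-config.
        .systemLibrary(
            name: "CExampleZlib",
            pkgConfig: "zlib",
            providers: [.apt(["zlib1g-dev"]), .brew(["zlib"])]
        ),

        // Conditional target dependency (SwiftPM 5.3+): the system library
        // is only part of the build on the listed platforms.
        .target(
            name: "ExampleCompression",
            dependencies: [
                "ExampleCore",
                .target(name: "CExampleZlib", condition: .when(platforms: [.linux, .macOS])),
            ]
        ),
    ]
)
```

With a layout like this, a client that only depends on the ExampleCore product does not build the zlib-backed target.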

Our strong preference at the time (and now) was to keep all of these in a single repository, and if we were developing NIO from scratch in 2023 we would do so. But we cannot escape our past.

Once we had split them out, it minimises churn on the ecosystem to keep them split. We could forcibly-deprecate the ecosystem repositories and push everyone to move to the core NIO repos, but we'd need to continue shipping bugfixes into the ecosystem repos for quite some time. It is possible that at our next breaking release we'll do exactly that, but for now we're trying to maximise compatibility and stability for the Swift on Server ecosystem.

As a final note, I don't think NIO necessarily gets more approachable in a world where the satellite packages are all merged into the tree. NIO's Sources directory looks like this. NIO has 11 products and 43 targets in the core project alone. Adding SSL, SSH, HTTP2, and Extras would push us up to 20 products and 75 targets. That's a pretty unwieldy project right there.

taylorswift · July 11, 2023, 9:53pm

thanks for all the explanation! in my view, this is something that needs to be done eventually, and the best time to do it is the present, while Swift on Server is still relatively small and doesn’t have much of an “ecosystem” grown around it yet. i think that if we are successful and Swift on Server gains traction beyond its current (and, let’s be honest, niche) user base, then compatibility concerns will be much greater in the future than they are now.

i wrote more about this topic here if you are interested, but to summarize: i personally find it amusing that a project with 20 products and 75 targets is considered “large” in the swift world.

there are plenty of not-so-good reasons for this (swift developers’ penchant for writing very large modules, @inlinable fatigue, no SPM support for multiple package manifests, no multitarget DocC support, et cetera et cetera…) but ultimately i think that trying to adapt by splitting up packages (or keeping existing projects federalized) is not the right approach. it is better to tolerate large manifests in the medium term and wait for SPM to gain better support for large projects. because, as you yourself have observed, it is very, very difficult to stitch projects back together after they’ve been flashpointed into a dozen timelines.

by the way, re: delaying the merge until 5.6 support is dropped:

my longstanding assumption (which may be wrong!) is that most people doing Swift on Server today are deploying to linux machines, and are probably manually installing the toolchain and runtime libraries on the cloud instances. after a while, i learned that maintaining a long toolchain support window for this type of project is a waste of time; it was way easier for me to just SSH into the cloud instance and upgrade the toolchain, which allows me to only target the latest swift release.

i imagine this is quite different from swift on Apple platforms where long toolchain windows are important to a lot of people. but, different world, different incentives…

Guoye · July 12, 2023, 12:24am

The use of CFURL is unfortunate but necessary, since Foundation URL / URLComponents implement the RFC and are incompatible with WHATWG URL in subtle ways (e.g. percent encoding "|" characters). CFURL is more lenient and preserves the raw bytes given to it. The hope is that Foundation URL will one day gain a WHATWG URL parsing mode.

Karl · July 12, 2023, 11:54am

Ah, I see.

We don't need WHATWG-compatible parsing to create HTTP requests from Foundation URLs, though. If you have a Foundation.URL value, the components will already be encoded to Foundation's satisfaction. All we need to create a request are the raw, percent-encoded values of the URL components (which is what is currently being extracted via CF). Fortunately, Foundation recently added APIs to provide that data, in the form of the path(percentEncoded: Bool) family of methods.
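As a rough sketch of what that could look like, the helper below builds an HTTP request target from a Foundation URL using only the path(percentEncoded:) and query(percentEncoded:) getters. The function name requestTarget(for:) is hypothetical and not part of swift-http-types, and the sketch assumes the getters return components suitable for writing to the wire, which is exactly the point under discussion.

```swift
import Foundation

/// Hypothetical helper (not part of swift-http-types): builds the
/// origin-form request target ("/path?query") from a Foundation URL
/// without going through CoreFoundation.
@available(macOS 13.0, iOS 16.0, tvOS 16.0, watchOS 9.0, *)
func requestTarget(for url: URL) -> String {
    // Ask for the percent-encoded path; an empty path becomes "/".
    let path = url.path(percentEncoded: true)
    var target = path.isEmpty ? "/" : path

    // Append the percent-encoded query, if the URL has one.
    if let query = url.query(percentEncoded: true) {
        target += "?" + query
    }
    return target
}

if #available(macOS 13.0, iOS 16.0, tvOS 16.0, watchOS 9.0, *) {
    let url = URL(string: "https://example.com/search?q=swift")!
    print(requestTarget(for: url))
}
```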

Unfortunately, URLComponents can't, in general, be used to get the percent-encoded path because of [SR-15512] URLComponents percent-decodes paths which contain a semicolon · Issue #3352 · apple/swift-corelibs-foundation · GitHub.

This does, however, expose another issue, which is that swift-corelibs-foundation hasn't been updated in many, many years, so the path(percentEncoded: Bool) family of methods is only available on Darwin platforms. But I don't think we need a full codebase sync just to implement these particular methods - they should be fairly straightforward to implement on top of what's already in corelibs-foundation.

If we implement those methods in corelibs, we can drop the use of CF in this package, making it portable to non Darwin/Linux systems. I think that's important for a library that exposes currency types.

Guoye · July 12, 2023, 7:46pm

> If you have a Foundation.URL value, the components will already be encoded to Foundation's satisfaction.

This is not the case. CFURLCreateAbsoluteURLWithBytes and CFURLGetBytes are escape hatches which allow clients (e.g. WebKit) to sneak strings which aren't considered valid by URL past URL.

> All we need to create a request are the raw, percent-encoded values of the URL components (which is what is currently being extracted via CF).

We do need percent-encoded values, but not how URL currently does it.

let cfURL = CFURLCreateAbsoluteURLWithBytes(kCFAllocatorDefault, "https://example.com?q=1|2", 25, kCFStringEncodingASCII, nil, false)!
CFURLGetByteRangeForComponent(cfURL, .query, nil) // range of q=1|2
(cfURL as URL).query(percentEncoded: true) // q=1%7C2

let url = URL(string: "https://example.com?q=1|2")!
url.query(percentEncoded: true) // q=1%7C2

We don't want the CF dependency either, but there are websites that depend on this today.

Karl · July 14, 2023, 8:45am

I appreciate that you guys also don't want the CF dependency; I'm sure that was nobody's first choice.

I've filed a GitHub issue to track it: Drop CF dependency · Issue #10 · apple/swift-http-types · GitHub

Wow! TIL. That's very subtle, and I don't think it is mentioned in the documentation.

For construction, it seems that we have an NSURL initialiser which forwards to CFURLCAUWB (documentation unfortunately is empty). NSURL is available on all platforms.

NSURL(absoluteURLWithDataRepresentation: Data("https://example.com?q=1|2".utf8), relativeTo: nil) as URL
$R34: Foundation.URL = "https://example.com?q=1|2"
                                               ^

As for the component getters, it seems the behaviour can vary - .host does not perform any escaping but .path and .query do.

let url = (NSURL(absoluteURLWithDataRepresentation: Data("https://ex  ample.com?q=1 2".utf8), relativeTo: nil) as URL)

url.host(percentEncoded: true) 
$R54: String? = "ex  ample.com"
                   ^^
url.query(percentEncoded: true) 
$R55: String? = "q=1%202"
                    ^^^

When I suggested that the Foundation team add these APIs, it was always the intention that they give the raw "unadulterated" components; I never imagined the getter would actually add escaping and it is possible that it was unintentional.

Foundation URL Improvements

I think the biggest, easiest improvements would be to:

Add .percentEncodedX properties to URL, so that people don't need to go through URLComponents to avoid URL's automatic percent-decoding. Automatic decoding is really, really, really bad and currently there is no way around it (even converting URL -> URLComponents will automatically percent-decode under certain circumstances, and can corrupt the path).

Add a method which provides the URL's string buffer, with ranges of all of the components. This would be seriously useful for URL -> WebURL conversion -- currently, because there are differences between the standards (as well as bugs in URL), we need to request each component individually to ensure the converted URL points to the same location, and each component potentially allocates a String.

I would suggest that either:

We implement that behaviour in SCF (where percentEncoded: true returns the raw component), and Darwin changes its implementation to do the same. That means we can use official, public APIs everywhere.

Since "normal" users don't sneak incorrectly-escaped URLs into Foundation.URL, they should not be affected by this change. We should urgently document that just because a user passed percentEncoded: true, it doesn't mean the resulting string is escaped, and it MUST be sanitised before it is written to the network -- but as I showed above with .host(percentEncoded:), that is already true today; a hostname with unescaped newlines can be trivially exploited if written as part of an HTTP header.
We add some kind of SPI which exposes these CF methods, whether via an "official" @_spi or just some underscored methods. We don't need this on Darwin Foundation, which can fall back to CF, so it can be limited to the open-source SCF.
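To illustrate the kind of sanitisation meant here, this is a minimal sketch of a check that could run before a host string is written into an HTTP header. The function name isSafeForHostHeader is hypothetical, and the rule it applies (non-empty, visible ASCII only) is a deliberately conservative assumption, not a full host grammar.

```swift
/// Hypothetical helper: returns true only when `host` is non-empty and made
/// up entirely of visible ASCII bytes (0x21...0x7E). This rejects spaces,
/// CR, LF, other control characters, and non-ASCII bytes, which is enough
/// to block header injection through an unescaped hostname.
func isSafeForHostHeader(_ host: String) -> Bool {
    !host.isEmpty && host.utf8.allSatisfy { $0 > 0x20 && $0 < 0x7F }
}

// The unescaped host from the example above is rejected.
print(isSafeForHostHeader("example.com"))
print(isSafeForHostHeader("ex  ample.com"))
print(isSafeForHostHeader("example.com\r\nX-Injected: 1"))
```

A stricter implementation would validate against the host grammar instead; the point is only that the check happens regardless of what percentEncoded: true returned.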

Anyway, we can continue this discussion on the GitHub issue, if you prefer.

yvsong · July 14, 2023, 4:08pm

Is the Retry-After header field missing in HTTPField.Name?

Jon_Shier · July 14, 2023, 5:04pm

For things like that it's probably best to just create a PR to add it.

Guoye · July 14, 2023, 8:10pm

The list was compiled through the combination of 1. telemetry of most common fields used, and 2. internal use case (e.g. Proxy-Authorization even though it's uncommon on the internet).

We don't have an exhaustive list of all fields. For example, Content-Language was omitted even though it's in RFC 9110, since we haven't seen it used in the wild. That being said, if you have a use case for any of the fields defined in the RFC, we are happy to add them.
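In the meantime, a field that is not in the built-in list can still be used by constructing its name from a string. The sketch below assumes the swift-http-types package is available as a dependency; the local constant retryAfter and the value "120" are illustrative only.

```swift
import HTTPTypes

// HTTPField.Name has a failable initializer that rejects strings which are
// not valid field names, so force-unwrapping a known-good literal is safe.
let retryAfter = HTTPField.Name("Retry-After")!

// Illustrative response: ask the client to retry after 120 seconds.
var response = HTTPResponse(status: .serviceUnavailable)
response.headerFields[retryAfter] = "120"

print(response.headerFields[retryAfter] ?? "missing")
```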

dkz · July 21, 2023, 2:28pm

Nice work! this is pretty cool!

andre1sk · August 14, 2024, 8:20pm

Sorry for the newb question, but is the long-term vision to phase out SwiftNIO?

Guoye · August 14, 2024, 8:23pm

No, there is no plan to phase out SwiftNIO.

andre1sk · August 14, 2024, 8:37pm

Cool, thank you for the clarification!