WebURL 0.3.0 released!

I'm happy to announce that WebURL 0.3.0 has been released!

What's Changed

:books: DocC-based Documentation!

All of the documentation has been rewritten and reorganised to take advantage of the new DocC documentation engine. It's a really huge improvement, so do please check it out. And if you find anything which you think could be improved, don't hesitate to file an issue or even submit a PR :slightly_smiling_face:

:link: Foundation Integration

WebURL 0.3.0 includes the WebURLFoundationExtras module, which comes with a way to convert Foundation URL objects to WebURLs. That means your libraries can use WebURL for their internal processing, while continuing to support clients who provide data using Foundation's types.

The async-http-client port is an example of this. Even though it uses WebURL for its internal processing, it is still possible to create requests from a Foundation.URL via an extension. This means it gets to benefit from modern, web-compatible URL parsing (for example, when resolving HTTP redirects), and WebURL's simpler, more efficient API, without breaking compatibility.

New with this release, the async-http-client port offers a build configuration which omits all Foundation dependencies. By doing so, we've measured binary size improvements of up to 16% on a statically-linked & stripped executable, while keeping the full functionality of AHC such as streaming, compression, and HTTP/2. We expect that the size reduction could grow even further with Swift 5.6, as the standard library will no longer need to link all of ICU's Unicode data.

:zap: + :straight_ruler: Performance and Code Size Improvements

WebURL keeps getting faster, and leaner, but not meaner :innocent:. Compared to 0.2.0, WebURL 0.3.0 offers some incredible performance enhancements. URL parsing time has been reduced by almost 1/3, our fantastic in-place component setters can be almost 40% faster, and common operations like iterating path components can now be performed in just half the time.

Full Benchmark Comparison
benchmark                                              column     results/0_2_0 results/0_3_0       %
-----------------------------------------------------------------------------------------------------
Constructor.SpecialNonFile.AverageURLs                 std                25.69         25.36    1.28
Constructor.SpecialNonFile.AverageURLs                 warmup      607364757.00  411871846.00   32.19
Constructor.SpecialNonFile.AverageURLs                 iterations      45277.00      64160.00  -41.71
Constructor.SpecialNonFile.AverageURLs                 time            28390.00      20685.00   27.14
Constructor.SpecialNonFile.AverageURLs filtered        std                24.14         23.13    4.19
Constructor.SpecialNonFile.AverageURLs filtered        warmup      943156822.00  844993929.00   10.41
Constructor.SpecialNonFile.AverageURLs filtered        iterations      30543.00      33231.00   -8.80
Constructor.SpecialNonFile.AverageURLs filtered        time            41049.00      37988.00    7.46
Constructor.SpecialNonFile.IPv4 host                   std                25.99         23.99    7.70
Constructor.SpecialNonFile.IPv4 host                   warmup      655361449.00  418003090.00   36.22
Constructor.SpecialNonFile.IPv4 host                   iterations      43615.00      65144.00  -49.36
Constructor.SpecialNonFile.IPv4 host                   time            28432.00      19104.00   32.81
Constructor.SpecialNonFile.IPv4 host filtered          std                24.54         23.59    3.84
Constructor.SpecialNonFile.IPv4 host filtered          warmup      756658347.00  683989589.00    9.60
Constructor.SpecialNonFile.IPv4 host filtered          iterations      38017.00      41673.00   -9.62
Constructor.SpecialNonFile.IPv4 host filtered          time            33023.00      30306.00    8.23
Constructor.SpecialNonFile.IPv6 host                   std                25.11         25.37   -1.03
Constructor.SpecialNonFile.IPv6 host                   warmup      565434962.00  510942889.00    9.64
Constructor.SpecialNonFile.IPv6 host                   iterations      51923.00      53166.00   -2.39
Constructor.SpecialNonFile.IPv6 host                   time            24641.00      23175.00    5.95
Constructor.SpecialNonFile.IPv6 host filtered          std                25.17         23.98    4.72
Constructor.SpecialNonFile.IPv6 host filtered          warmup      697132861.00  647714822.00    7.09
Constructor.SpecialNonFile.IPv6 host filtered          iterations      41856.00      43791.00   -4.62
Constructor.SpecialNonFile.IPv6 host filtered          time            29755.00      29782.00   -0.09
Constructor.SpecialNonFile.Percent-encoding components std                26.44         28.43   -7.53
Constructor.SpecialNonFile.Percent-encoding components warmup      248098799.00  203956749.00   17.79
Constructor.SpecialNonFile.Percent-encoding components iterations     117423.00     136120.00  -15.92
Constructor.SpecialNonFile.Percent-encoding components time            10866.00       9187.00   15.45
Constructor.SpecialNonFile.Percent-encoded hostnames   std                27.97         28.03   -0.24
Constructor.SpecialNonFile.Percent-encoded hostnames   warmup      224749631.00  190466566.00   15.25
Constructor.SpecialNonFile.Percent-encoded hostnames   iterations     128245.00     146599.00  -14.31
Constructor.SpecialNonFile.Percent-encoded hostnames   time            10064.00       8585.00   14.70
Constructor.SpecialNonFile.Long paths                  std                21.50         20.72    3.61
Constructor.SpecialNonFile.Long paths                  warmup     1997965969.00 1631080065.00   18.36
Constructor.SpecialNonFile.Long paths                  iterations      13800.00      16960.00  -22.90
Constructor.SpecialNonFile.Long paths                  time            92204.00      75105.50   18.54
Constructor.SpecialNonFile.Complex paths 1             std                23.44         22.45    4.22
Constructor.SpecialNonFile.Complex paths 1             warmup      577456224.00  488800985.00   15.35
Constructor.SpecialNonFile.Complex paths 1             iterations      47763.00      57511.00  -20.41
Constructor.SpecialNonFile.Complex paths 1             time            25771.00      22003.00   14.62
Constructor.SpecialNonFile.Complex paths 2             std                22.46         23.20   -3.29
Constructor.SpecialNonFile.Complex paths 2             warmup      696544797.00  598427278.00   14.09
Constructor.SpecialNonFile.Complex paths 2             iterations      39380.00      46624.00  -18.40
Constructor.SpecialNonFile.Complex paths 2             time            32198.00      27043.00   16.01
Constructor.SpecialNonFile.Long query 1                std                36.84         42.24  -14.68
Constructor.SpecialNonFile.Long query 1                warmup       58211463.00   37606140.00   35.40
Constructor.SpecialNonFile.Long query 1                iterations     501394.00     678233.00  -35.27
Constructor.SpecialNonFile.Long query 1                time             2523.00       1922.00   23.82
Constructor.SpecialNonFile.Long query 2                std                24.80         23.64    4.67
Constructor.SpecialNonFile.Long query 2                warmup      475690878.00  395570466.00   16.84
Constructor.SpecialNonFile.Long query 2                iterations      60219.00      69046.00  -14.66
Constructor.SpecialNonFile.Long query 2                time            21042.00      18345.00   12.82
URLEncoded.String.urlEncoded                           std                27.23         29.79   -9.41
URLEncoded.String.urlEncoded                           warmup      186010182.00  199351008.00   -7.17
URLEncoded.String.urlEncoded                           iterations     149345.00     143445.00    3.95
URLEncoded.String.urlEncoded                           time             8293.00       8876.00   -7.03
URLEncoded.String.urlDecoded                           std                32.71         35.58   -8.78
URLEncoded.String.urlDecoded                           warmup       94496194.00   97867243.00   -3.57
URLEncoded.String.urlDecoded                           iterations     288242.00     291888.00   -1.26
URLEncoded.String.urlDecoded                           time             4322.00       4671.00   -8.07
ComponentSetters.Unique.Scheme                         std                70.94         58.17   18.00
ComponentSetters.Unique.Scheme                         warmup       15344399.00   14524985.00    5.34
ComponentSetters.Unique.Scheme                         iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Scheme                         time              687.00        652.00    5.09
ComponentSetters.Unique.Scheme.Long                    std                62.44        117.06  -87.48
ComponentSetters.Unique.Scheme.Long                    warmup       23288436.00   22207436.00    4.64
ComponentSetters.Unique.Scheme.Long                    iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Scheme.Long                    time             1066.00        971.00    8.91
ComponentSetters.Unique.Username                       std                67.32         96.89  -43.93
ComponentSetters.Unique.Username                       warmup       10732897.00    7759100.00   27.71
ComponentSetters.Unique.Username                       iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Username                       time              464.00        398.00   14.22
ComponentSetters.Unique.Username.PercentEncoding       std                59.57         69.44  -16.55
ComponentSetters.Unique.Username.PercentEncoding       warmup       16480653.00   16044546.00    2.65
ComponentSetters.Unique.Username.PercentEncoding       iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Username.PercentEncoding       time              755.00        720.00    4.64
ComponentSetters.Unique.Username.Long                  std                67.37         68.96   -2.35
ComponentSetters.Unique.Username.Long                  warmup       14455228.00   13527540.00    6.42
ComponentSetters.Unique.Username.Long                  iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Username.Long                  time              674.00        625.00    7.27
ComponentSetters.Unique.Password                       std                65.31         71.21   -9.03
ComponentSetters.Unique.Password                       warmup        9598941.00   10257216.00   -6.86
ComponentSetters.Unique.Password                       iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Password                       time              431.00        452.00   -4.87
ComponentSetters.Unique.Password.PercentEncoding       std                51.37         54.54   -6.17
ComponentSetters.Unique.Password.PercentEncoding       warmup       24940450.00   23987255.00    3.82
ComponentSetters.Unique.Password.PercentEncoding       iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Password.PercentEncoding       time             1152.00       1084.00    5.90
ComponentSetters.Unique.Password.Long                  std                75.26         77.68   -3.22
ComponentSetters.Unique.Password.Long                  warmup       14083593.00   11472728.00   18.54
ComponentSetters.Unique.Password.Long                  iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Password.Long                  time              667.00        593.00   11.09
ComponentSetters.Unique.Hostname.Domain.ASCII          std                51.68         71.11  -37.61
ComponentSetters.Unique.Hostname.Domain.ASCII          warmup       21521201.00   17439569.00   18.97
ComponentSetters.Unique.Hostname.Domain.ASCII          iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Hostname.Domain.ASCII          time              996.00        774.00   22.29
ComponentSetters.Unique.Hostname.IPv4                  std                54.01         56.13   -3.92
ComponentSetters.Unique.Hostname.IPv4                  warmup       28033245.00   21298667.00   24.02
ComponentSetters.Unique.Hostname.IPv4                  iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Hostname.IPv4                  time             1240.00        932.00   24.84
ComponentSetters.Unique.Hostname.IPv6                  std                54.86         63.79  -16.28
ComponentSetters.Unique.Hostname.IPv6                  warmup       29897933.00   26471182.00   11.46
ComponentSetters.Unique.Hostname.IPv6                  iterations     940549.00    1000000.00   -6.32
ComponentSetters.Unique.Hostname.IPv6                  time             1342.00       1267.00    5.59
ComponentSetters.Unique.Hostname.Opaque                std                55.77         73.02  -30.93
ComponentSetters.Unique.Hostname.Opaque                warmup       17405851.00   16313367.00    6.28
ComponentSetters.Unique.Hostname.Opaque                iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Hostname.Opaque                time              881.00        734.00   16.69
ComponentSetters.Unique.Hostname.Domain.ASCII.Long     std                60.00         64.66   -7.76
ComponentSetters.Unique.Hostname.Domain.ASCII.Long     warmup       32915885.00   21175049.00   35.67
ComponentSetters.Unique.Hostname.Domain.ASCII.Long     iterations     864752.00    1000000.00  -15.64
ComponentSetters.Unique.Hostname.Domain.ASCII.Long     time             1484.00        933.00   37.13
ComponentSetters.Unique.Hostname.Opaque.Long           std                98.84         66.19   33.03
ComponentSetters.Unique.Hostname.Opaque.Long           warmup       22182606.00   18928126.00   14.67
ComponentSetters.Unique.Hostname.Opaque.Long           iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Hostname.Opaque.Long           time             1009.00        852.00   15.56
ComponentSetters.Unique.Path.Simple                    std                41.89        136.40 -225.64
ComponentSetters.Unique.Path.Simple                    warmup       59342376.00   54345691.00    8.42
ComponentSetters.Unique.Path.Simple                    iterations     488043.00     518814.00   -6.30
ComponentSetters.Unique.Path.Simple                    time             2641.00       2432.00    7.91
ComponentSetters.Unique.Path.DotDot                    std                38.11         37.34    2.02
ComponentSetters.Unique.Path.DotDot                    warmup       61336452.00   55107092.00   10.16
ComponentSetters.Unique.Path.DotDot                    iterations     477329.00     510952.00   -7.04
ComponentSetters.Unique.Path.DotDot                    time             2715.00       2503.00    7.81
ComponentSetters.Unique.Path.PercentEncoding           std                42.89         47.08   -9.77
ComponentSetters.Unique.Path.PercentEncoding           warmup       53212410.00   51359864.00    3.48
ComponentSetters.Unique.Path.PercentEncoding           iterations     516695.00     559329.00   -8.25
ComponentSetters.Unique.Path.PercentEncoding           time             2453.00       2236.00    8.85
ComponentSetters.Unique.Path.Simple.Long               std                45.16         45.23   -0.15
ComponentSetters.Unique.Path.Simple.Long               warmup       44473028.00   40969026.00    7.88
ComponentSetters.Unique.Path.Simple.Long               iterations     609005.00     662796.00   -8.83
ComponentSetters.Unique.Path.Simple.Long               time             2076.00       1868.00   10.02
ComponentSetters.Unique.Path.PercentEncoding.Long      std                37.88         44.60  -17.74
ComponentSetters.Unique.Path.PercentEncoding.Long      warmup       53818775.00   51662870.00    4.01
ComponentSetters.Unique.Path.PercentEncoding.Long      iterations     505777.00     550604.00   -8.86
ComponentSetters.Unique.Path.PercentEncoding.Long      time             2536.00       2335.00    7.93
ComponentSetters.Unique.Query                          std                45.51         50.69  -11.38
ComponentSetters.Unique.Query                          warmup       35362596.00   33197642.00    6.12
ComponentSetters.Unique.Query                          iterations     858584.00     909510.00   -5.93
ComponentSetters.Unique.Query                          time             1500.00       1386.00    7.60
ComponentSetters.Unique.Query.PercentEncoding          std                47.72         46.83    1.86
ComponentSetters.Unique.Query.PercentEncoding          warmup       31587314.00   31973539.00   -1.22
ComponentSetters.Unique.Query.PercentEncoding          iterations     890907.00     908907.00   -2.02
ComponentSetters.Unique.Query.PercentEncoding          time             1412.00       1389.00    1.63
ComponentSetters.Unique.Query.Long                     std                49.44         50.38   -1.90
ComponentSetters.Unique.Query.Long                     warmup       31488606.00   30794671.00    2.20
ComponentSetters.Unique.Query.Long                     iterations     900942.00     927445.00   -2.94
ComponentSetters.Unique.Query.Long                     time             1418.00       1368.00    3.53
ComponentSetters.Unique.Fragment                       std                53.76         57.43   -6.83
ComponentSetters.Unique.Fragment                       warmup       25204267.00   23655742.00    6.14
ComponentSetters.Unique.Fragment                       iterations    1000000.00    1000000.00    0.00
ComponentSetters.Unique.Fragment                       time             1147.00       1066.00    7.06
ComponentSetters.Unique.Fragment.PercentEncoding       std                46.65         52.49  -12.52
ComponentSetters.Unique.Fragment.PercentEncoding       warmup       31553737.00   31072700.00    1.52
ComponentSetters.Unique.Fragment.PercentEncoding       iterations     894227.00     949524.00   -6.18
ComponentSetters.Unique.Fragment.PercentEncoding       time             1432.00       1331.00    7.05
PathComponents.Iteration.Small.Forwards                std                46.18         66.41  -43.80
PathComponents.Iteration.Small.Forwards                warmup       29260013.00   16217749.00   44.57
PathComponents.Iteration.Small.Forwards                iterations     939438.00    1000000.00   -6.45
PathComponents.Iteration.Small.Forwards                time             1354.00        710.00   47.56
PathComponents.Iteration.Small.Reverse                 std                49.71        182.26 -266.64
PathComponents.Iteration.Small.Reverse                 warmup       30192228.00   16381246.00   45.74
PathComponents.Iteration.Small.Reverse                 iterations     894435.00    1000000.00  -11.80
PathComponents.Iteration.Small.Reverse                 time             1405.00        700.00   50.18
PathComponents.Iteration.Long.Forwards                 std                28.72         41.10  -43.09
PathComponents.Iteration.Long.Forwards                 warmup      137151720.00   72185076.00   47.37
PathComponents.Iteration.Long.Forwards                 iterations     203537.00     395360.00  -94.24
PathComponents.Iteration.Long.Forwards                 time             6129.00       3171.00   48.26
PathComponents.Iteration.Long.Reverse                  std                27.29         37.33  -36.79
PathComponents.Iteration.Long.Reverse                  warmup      135440838.00   70423343.00   48.00
PathComponents.Iteration.Long.Reverse                  iterations     203502.00     403059.00  -98.06
PathComponents.Iteration.Long.Reverse                  time             6187.00       3110.00   49.73
PathComponents.Append.Single                           std                51.17         48.75    4.73
PathComponents.Append.Single                           warmup       33583707.00   32078019.00    4.48
PathComponents.Append.Single                           iterations     837295.00     894834.00   -6.87
PathComponents.Append.Single                           time             1542.00       1414.00    8.30
PathComponents.Append.Multiple                         std                45.61         39.78   12.79
PathComponents.Append.Multiple                         warmup       49976440.00   40668172.00   18.63
PathComponents.Append.Multiple                         iterations     505432.00     608896.00  -20.47
PathComponents.Append.Multiple                         time             2467.00       2031.00   17.67
PathComponents.RemoveLast.Single                       std                63.62         79.44  -24.86
PathComponents.RemoveLast.Single                       warmup       14669773.00   13557016.00    7.59
PathComponents.RemoveLast.Single                       iterations    1000000.00    1000000.00    0.00
PathComponents.RemoveLast.Single                       time              661.00        608.00    8.02
PathComponents.RemoveLast.Multiple                     std                62.47         56.77    9.12
PathComponents.RemoveLast.Multiple                     warmup       14615425.00   14256858.00    2.45
PathComponents.RemoveLast.Multiple                     iterations    1000000.00    1000000.00    0.00
PathComponents.RemoveLast.Multiple                     time              671.00        659.00    1.79
PathComponents.ReplaceSubrange.Shrink                  std                37.93         44.86  -18.28
PathComponents.ReplaceSubrange.Shrink                  warmup       41934649.00   33989237.00   18.95
PathComponents.ReplaceSubrange.Shrink                  iterations     675845.00     804741.00  -19.07
PathComponents.ReplaceSubrange.Shrink                  time             1916.00       1575.00   17.80
PathComponents.ReplaceSubrange.Grow                    std                37.21         34.93    6.13
PathComponents.ReplaceSubrange.Grow                    warmup       67791448.00   58102148.00   14.29
PathComponents.ReplaceSubrange.Grow                    iterations     409423.00     489424.00  -19.54
PathComponents.ReplaceSubrange.Grow                    time             3067.00       2598.00   15.29
-----------------------------------------------------------------------------------------------------
                                                       std                                     -12.49
                                                       warmup                                   15.61
                                                       iterations                              -11.03
                                                       time                                     15.39
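A note on reading the table: the "%" column appears to be the relative reduction from 0.2.0 to 0.3.0, so a positive value in a "time" row means 0.3.0 is faster. This is inferred from the figures rather than stated by the benchmark tool, and the helper names below are hypothetical. A minimal sketch that reproduces two of the rows:

```swift
// Assumed formula for the "%" column: (old - new) / old * 100.
// Positive means the 0.3.0 value is lower than the 0.2.0 value.
func percentReduction(from old: Double, to new: Double) -> Double {
    (old - new) / old * 100
}

// Round to two decimal places, as the table does.
func roundedToHundredths(_ value: Double) -> Double {
    (value * 100).rounded() / 100
}

// Constructor.SpecialNonFile.AverageURLs, "time" row.
let parsing = roundedToHundredths(percentReduction(from: 28390, to: 20685))
print(parsing) // 27.14

// PathComponents.Iteration.Small.Forwards, "time" row.
let iteration = roundedToHundredths(percentReduction(from: 1354, to: 710))
print(iteration) // 47.56
```

Under the same formula, a higher iteration count shows up as a negative percentage, which is why the "iterations" rows are mostly negative even though they reflect an improvement.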

And that's not even the best part. All of these improvements come in a package which is 20% smaller!

Title                              Section             Old             New  Percent
WebURLBenchmark                     __text:        1715105         1376177   -19.8%

(Measured on an Intel MBP)

:pensive: System.framework Integration Disabled on iOS For Now

And with all that great news, there had to be one... less great thing. Unfortunately, the last few releases of Xcode have shipped with a broken version of Apple's System.framework for iOS, which broke the build on that platform. Strangely, it is only iOS - macOS, tvOS, and even watchOS all work fine. We've disabled that integration on iOS for now, but we'll keep an eye on things and re-enable it once the issue is fixed (FB9832953).

In the meantime, you can still use swift-system, the open-source distribution of System.framework, on all platforms, including iOS.


There has never been a better time to try WebURL, so check it out!


Fantastic! I've been following WebURL and the related discussion in the Foundation URL Improvements thread, and I have two questions:

  1. What kind of timeline do you expect for 1.0?
  2. How do you see the future of URL handling in Swift? Between WebURL and System we seem to have (or will have) better replacements for Foundation's URL. System will obviously gain traction in at least some domains, but what is the path forward to make WebURL the recommendation for web URLs? Is it something that SSWG would incubate? Is it even possible to supplant Foundation's URL (knowing it would have to stay around for possibly ever given its prevalence now)?

Great questions!

I don't really have a timeline per se, just a list of things to do.

  1. Complete Foundation interop (WebURL -> Foundation.URL)

    Because WebURL is always normalized, this is thankfully quite a bit easier. Most of the time, Foundation can just parse WebURL's string and all the components will be equivalent. The most common failure is when Foundation wants something to be percent-encoded but WebURL doesn't think it needs to be.

    So as part of this, I'm planning to add APIs to normalize percent-encoding (e.g. over-encoding, or removing over-encoding). It's quite a useful feature in general. For example, there have been discussions for years about how SwiftPM could normalize package URLs, and this is something it could use.

  2. Rework query params/form params API

    This is the part of the API I'm least happy with, personally. The current formParams view is approximately a port of JavaScript's URLSearchParams class. I pretty much added it because it's important to have something to work with the query, and this thing was defined in the URL Standard. But it has some... not always desirable behaviour and I think we can do better, and should do better before declaring the API stable.

  3. (Maybe) Introduce a Domain type

    WebURL.Host.domain is currently a String. It might be worth wrapping it in a type. There's a bunch of useful things that WebURL (and other libraries) could do with such a type.

And that's basically it. I'd love to do IDNA as well, but we need the standard library to expose its Unicode normalization algorithms before that is practical. I doubt that will happen soon.

I don't know.

Ultimately, the way I see it, this really boils down to whether we, as an ecosystem, are going to rely on Apple to solve all of our problems, or if there is really room for independent developers to take the initiative and create packages to fix things ourselves. Basically, is the Swift community self-sustaining?

Sometimes I get that response from people - "why not just wait until Apple does something?" or "I won't use it unless it's from Apple". And it's really frustrating and sad for the ecosystem; Foundation.URL has a lot of problems, some of which can't reasonably be fixed. I think even the Foundation team know this, which is why it already ships 2 URL types - URL and URLComponents (which conform to different standards).

I'm not against Foundation in principle. The issues that I have with it are technical/factual, and I realise that the people working on it do not always have complete freedom to fix things. That said, I still think we need to be open and honest about the problems (which I accept is difficult, especially for a company like Apple). It doesn't serve the community to just pretend everything is great for the sake of being polite, when there are clearly issues which have needed urgent attention for years and years and haven't gotten it.

I'd certainly be open to working with the Foundation team to address those problems. They could use WebURL as a dependency and export it - it's not completely absurd; they already depend on third-party libraries such as cURL, libxml2, and ICU, and the project is open-source/Apache-2.0. I think something like that would send a good message about the health of the Swift community, and it would help counter concerns that Foundation is too monolithic -- WebURL would still exist as a separate package, if all you want is URLs.

The trickiest part would probably be the API, because if it were exported, it would then be part of Apple's OS frameworks. I wouldn't mind giving up some control of the API direction, and I'd certainly appreciate their expertise in improving the API - perhaps we'd both have to agree on additions/changes to the WebURL package API, but Foundation obviously can add their own extensions in their library (because that's how Swift works).

I'd want things to go through some kind of swift-evolution style community process rather than being driven only by Foundation's internal discussions, but given the recent pitches, perhaps that would be acceptable. I also don't really expect the API to change much after it stabilises - how much can you really add to a URL type?

But yeah, I don't know. I'm just doing what I'm doing and we'll see what happens. There haven't been any discussions.

Pitch is coming soon! Foundation integration was a prerequisite for it to be realistically considered IMO. I still need to implement the other direction, but it's at least at the point where packages could use WebURL internally without breaking all of their clients.


FYI, I've just landed the initial implementation of WebURL -> Foundation.URL conversion on the main branch.

It still needs more tests and fuzzing before I'm happy to tag a release with it, but so far I've been testing it against the web-platform-tests (the shared test suite used by the major browsers). Because WebURL is automatically normalized, for the most part, it can just be parsed by Foundation and everything is fine.

Percent-Encoding

The most significant area where we actually need to do something is when the URL contains technically-invalid characters like square brackets or curly braces, for example:

"http://example.com/foo?color[R]=100&color[G]=230&color[B]=123"
"http://example.com/products?id={uuid}"

These kinds of URLs are indeed in use on the web (for example, JSONAPI and OpenAPI use square brackets to reference objects in a graph). Even though these characters are technically invalid, the WHATWG URL Standard requires that we allow them, even without percent-encoding - and all the major browsers do that. But Foundation.URL is very strict and doesn't allow them. It's actually quite inconsistent - it accepts strings with square brackets and percent-encodes them, but rejects curly braces and other characters.

Now, percent-encoding is a tricky business. The server has complete freedom to treat a URL however it likes, so in the most general case we can't say for sure when percent-encoding would alter how the URL is processed. If a server sees "color%5BR%5D", we can't guarantee that it won't interpret that as an object literally named "color[R]" rather than a reference to a sub-object. It's all just application-level semantics; you can even invent your own ways of encoding additional structure if you need it.

Given that we don't know what is safe to encode or not, the safest thing to do is to send the URL as it was written, and if you use something like our async-http-client port to make the request directly from WebURL, that is what will happen (matching Safari, Firefox, and Chrome). But if you make the request via URL and URLSession, those characters need to be percent-encoded.

So the conversion initializer offers to percent-encode those characters for you. The default is true, because it's probably fine (?) for most servers, but if you encounter a situation where a server treats encoded and non-encoded characters differently, you can opt out of it. Unfortunately, opting out means the conversion will fail if disallowed characters are present, because URL just will not accept it.
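To make "percent-encode those characters" concrete, here is a minimal sketch using only the standard library. It is not WebURL's implementation or API: the function name `percentEncode(_:escaping:)` and the choice of exactly these four characters are assumptions made for illustration.

```swift
// Hypothetical helper: percent-encodes only the listed ASCII characters
// and leaves everything else (including existing escapes) untouched.
func percentEncode(_ string: String, escaping disallowed: Set<Character>) -> String {
    var result = ""
    for character in string {
        if disallowed.contains(character), let byte = character.asciiValue {
            let hex = String(byte, radix: 16, uppercase: true)
            // Pad to two hex digits, e.g. 0x5B -> "%5B".
            result += "%" + (hex.count == 1 ? "0" + hex : hex)
        } else {
            result.append(character)
        }
    }
    return result
}

let brackets = percentEncode(
    "http://example.com/foo?color[R]=100",
    escaping: ["[", "]", "{", "}"]
)
print(brackets) // http://example.com/foo?color%5BR%5D=100

let braces = percentEncode(
    "http://example.com/products?id={uuid}",
    escaping: ["[", "]", "{", "}"]
)
print(braces) // http://example.com/products?id=%7Buuid%7D
```

Whether a server treats "color%5BR%5D" the same as "color[R]" is exactly the application-level question described above, which is why the behaviour is something you can opt out of.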

Results

With encoding enabled (again, the default), we can convert 650/666 tests in the WPT suite (97.6%), and even without encoding, 93.7% of them pass. And this is a URL test suite, so it includes a lot more weird URLs than most users are likely to ever encounter.

So yeah, that's a quick update. Do please try it out and let me know if you have any issues. I know URLSession is critical for network requests on Apple platforms, and now you can use WebURL throughout your application - right up to the point where you make the request.

I'm also looking into wrapping some Foundation APIs so you won't need to manually convert URL types. But that's a little way off because I'd ideally like to propagate the original WebURL through to the URLResponse. I'm looking at a couple of cheeky tricks which could allow that ;)


Wouldn’t the remote service be unescaping the query string before parsing it as a subscript?

Not necessarily. Actually, it is generally advised to split the query (and any URL component) before decoding, because percent-encoding (also known as percent escaping) is used to escape characters which you mean literally but would conflict with some (standard or non-standard) delimiter.

Take form-encoding for example: http://test/search?q=tom%26jerry

If you decode before splitting, you end up with the query "q=tom&jerry", which breaks down as:

  • q = tom
  • jerry

But really, this was supposed to represent a literal "&" sign in the value of the q parameter. If you split before decoding, you get the correct result.
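The difference between the two orderings can be sketched with Foundation's `removingPercentEncoding`. This is only an illustration, not how WebURL's formParams is implemented: the function name `parsePairs` is hypothetical, and the sketch ignores other form-encoding details such as "+" meaning a space.

```swift
import Foundation

let query = "q=tom%26jerry"

// Wrong order: decode the whole query, then split on "&".
let decodedFirst = query.removingPercentEncoding ?? query
let wrongPieces = decodedFirst.split(separator: "&").map(String.init)
print(wrongPieces) // ["q=tom", "jerry"]

// Right order: split on the delimiters first, then decode each piece.
func parsePairs(_ query: String) -> [(key: String, value: String)] {
    query.split(separator: "&").map { pair in
        let parts = pair.split(separator: "=", maxSplits: 1, omittingEmptySubsequences: false)
        let rawKey = String(parts[0])
        let rawValue = parts.count > 1 ? String(parts[1]) : ""
        return (
            key: rawKey.removingPercentEncoding ?? rawKey,
            value: rawValue.removingPercentEncoding ?? rawValue
        )
    }
}

let pairs = parsePairs(query)
print(pairs[0].key, pairs[0].value) // q tom&jerry
```

Splitting first keeps the escaped "&" inside the value, so there is a single pair whose value is the literal string "tom&jerry".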

(And if the remote service was using WebURL's formParams API, they'd get the correct result without even worrying about these sorts of details - it does the right thing for you :slight_smile: )


If there's only one thing I've learned from your WebURL posts, it is this: whatever you do with a URL, even if it works for you, it's probably still wrong.


That's why I'm so motivated to continue! I had no idea when I started this project just how many intricate details there are, and previously, I probably made every mistake you could think of.

But at the same time, that's what API design is all about - managing the complexity, and making things easy and correct so developers don't need to care about those details.
