[Back from revision] Foundation URL Improvements

Looks good! Some comments:

String Literals

What exactly does an "invalid web address" mean? For example, https:///apple.com is technically a valid URL according to RFC-2396, but the HTTP spec says this is invalid and it will fail at request-time. Since this is about string literals, there's an argument that it would be good to catch obvious misspellings like that early.

Also, some URLs may be valid but ill-formatted (such as uppercase schemes, e.g. "HTTPS://apple.com"). RFC-2396 says schemes are supposed to be lowercase, and again, since this is about string literals, there may be room to be stricter:

Scheme names consist of a sequence of characters beginning with a
lower case letter and followed by any combination of lower case
letters, digits, plus ("+"), period ("."), or hyphen ("-"). For
resiliency, programs interpreting URI should treat upper case letters
as equivalent to lower case in scheme names (e.g., allow "HTTP" as
well as "http")

RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax

DirectoryHint

Looks like a nice improvement!

Still, I'm not sure if it's a good idea to even have blocking filesystem calls at all, even as an option - local drives are relatively fast these days, but POSIX OSes allow you to map network resources and other things in to a shared, singular filesystem. That means a blocking FS call really has completely unpredictable performance.

So what could a better solution look like? Well, with static member syntax, we can use leading-dot syntax to resolve generic types:

func someFunc<T>(_: T) {}

someFunc(.foo) // calls someFunc<Foo>(_:)
someFunc(.bar) // calls someFunc<Bar>(_:)

If we had something like rethrowing protocol conformances but for async, theoretically we could make .is(Not)Directory and .inferFromPath be synchronous, while .checkFileSystem would be async.

Obviously that relies on language features which do not currently exist. Still, it's worth thinking about - how would this API be extended to support such a feature if were added to the language? Blocking filesystem calls are so bad that it would be really worth doing that as soon as it became possible. It's the kind of thing where I think developers shouldn't even have an option; why have an option to run with scissors?

Nit: There's a typo here: url contains an absolute path (leading /), so this example actually prints "/directory/file2".

FilePath initializers

Presumably, the same is true for FilePath.init?(URL) - that FilePath("file:///usr/bin/swift") would actually resolve to FilePath.init(String) and interpret the entire thing as some kind of file path?

It may be more of an edge case, but I'd recommend adding a url: label to that initialiser as well, just to make sure that things are clear at the call site. These sorts of overlapping overloads with literal arguments are generally not a good thing IMO.

AppendingPathComponents Changes

Looks good! I like that there's the ability to add a single component!

URLs have one path (that's why the property is called .path), so pluralising it reads a bit strangely to me. I think the real win here is the ability to append multiple components in a single call, right? Being able to avoid this:

let photoURL = baseURL
    .appendingPathComponent("Photos")
    .appendingPathComponent("\(id.first!)")
    .appendingPathComponent(id)

I'm not sure dropping the word 'components' from the argument label is really necessary.

Common Directories

I think this needs more explanation. The "current working directory" is the filesystem path against which relative filesystem paths are resolved, but file URLs are absolute. My concern in the previous thread was that the working directory is (a) not predictable and (b) often unsafe to access (it's just a mutable global, not even a thread-local - so it could, for example, tear if one thread is setting it while another thread simultaneously tries to read it).

Is the idea that this should accept a URL as an argument, which then affects how relative paths are resolved by FileManager and/or URL.init(FilePath)?

Also, the closure should be async if people are expected to perform file operations within it.

Swift globals can be lazily evaluated and cached, so it seems like they could be O(1) on average. None of these strike me as things which can change during the life of a process, so caching should be fine AFAICT.

PercentEncodedPath and co.

OMG Yes! :raised_hands:

Currently, for Foundation.URL -> WebURL conversion, my approach is to verify every component. But this has hit a snag because there is literally no way to get an accurate path from URL. So these properties would be a big help.

Just having this would be really great. If I could ask for more (and these opportunities do not come up very often, so I will), I'd really like to get component ranges for use with the absoluteString. Currently, the conversion process spends ~65% of its time just calling getters from URL (which it only calls once).

Surely that will improve by avoiding percent-decoding, but I'd really like to avoid extra string allocations and bridging overheads. For comparison, getting the equivalent components as slices of WebURL's UTF8View takes a negligible amount of time (<1%), precisely because it can avoid those things.

image

Summary

Overall, some really good changes!

P.S.

Please - not that name (I know it's not being proposed). :wink:

1 Like