Foundation URL Improvements

Saklad5 · December 23, 2021, 2:34pm

The exact requirement of LosslessStringConvertible is as follows:

The description property of a conforming type must be a value-preserving representation of the original value. As such, it should be possible to re-create an instance from its string representation.

In other words, a conforming type MUST have a description property that, when passed to the initializer init?(_ description: String) (note it is failable), produces an equivalent instance.

Does URL meet that standard?

James_Dempsey · December 23, 2021, 3:18pm

LosslessStringConvertible is expecting the string provided by description to be able to round trip back to a URL without loss. Returning the absolute string from description would discard the base/relative info and not be able to do a lossless round-trip. So changing description in that way would preclude implementing LosslessStringConvertible.
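To make the round-trip concern concrete, here is a minimal sketch (the example.com URLs are placeholders, not taken from the thread). It rebuilds a relative URL from its absoluteString and shows that the resolved address survives but the base/relative split does not:

```swift
import Foundation

// Placeholder URLs for illustration only.
let base = URL(string: "https://example.com/a/")!
let relative = URL(string: "docs/page", relativeTo: base)!

// Round-trip through absoluteString, as a description based on it would.
let roundTripped = URL(string: relative.absoluteString)!

print(relative.relativeString)      // the part that was given relative to the base
print(relative.baseURL != nil)      // the original still remembers its base
print(roundTripped.baseURL == nil)  // the rebuilt URL has no base any more
```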

Since they have different argument types, the StaticString initializer and the init?(_ description: String) failable initializer in LosslessStringConvertible should be able to coexist.

But, I think a call with a static string such as URL("https://swift.org") would always resolve to the String init method in the protocol and never the StaticString init method.

I don't know if there's a way to get the compiler to prefer the StaticString init method. If there is, then the current proposal wouldn't preclude adding LosslessStringConvertible in the future.

I think if preserving base/relative info is not important for a developer's use case, then absoluteString and init?(string:) provide similar functionality but with different naming.

xwu · December 23, 2021, 4:00pm

As I said above, there is: by making URL conform to ExpressibleByStringLiteral with StaticString as the literal type.
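A rough sketch of what that conformance could look like, written against a hypothetical wrapper type (LiteralURL is an assumption for illustration; the suggestion in the thread is to conform URL itself). Note that this version still traps at runtime on a bad literal; it does not validate at compile time:

```swift
import Foundation

// Hypothetical wrapper type, used here instead of extending URL directly.
struct LiteralURL: ExpressibleByStringLiteral {
    let url: URL

    // Declaring the parameter as StaticString makes StaticString the literal type,
    // so this initializer is fed from string literals written in source.
    init(stringLiteral value: StaticString) {
        guard let url = URL(string: "\(value)") else {
            preconditionFailure("Invalid URL literal: \(value)")
        }
        self.url = url
    }
}

let home: LiteralURL = "https://swift.org"
print(home.url.absoluteString)
```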

chrispix · December 27, 2021, 9:46pm

Karl:

I see. I actually find it somewhat surprising that .appendingPathComponent allows you to append multiple components. It seems like the user is explicitly saying that the provided string should be considered a single component and percent-encoded accordingly (even if it contains user input which happens to contain a forward-slash).
func urlForBand(_ bandName: String) -> URL {
  URL(string: "https://example.com/music/bands")!
    .appendingPathComponent(bandName)
}

// Imagine the band name is not a string literal,
// but instead derived from runtime data (perhaps a search field).
// This results in 2 components:
urlForBand("AC/DC") // https://example.com/music/bands/AC/DC
Also, if I can control the bandName parameter in the above example, I can add any number of ".." components and take control of the entire path. That could potentially have security implications in certain circumstances, and in other situations just leads to particular inputs returning the incorrect result unless the developer knows about this and manually adds percent-encoding.

This is an excellent point. It feels like this function could use an argument like urlEncoded: Bool = false. Then the default behavior would be to URL encode the path component, unless the caller overrides by explicitly stating it's already encoded.
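As a sketch of the "encode by default" behavior, the helper below treats its argument as exactly one component. The name appendingSingleComponent is hypothetical and not part of Foundation. It percent-encodes "/" and refuses "." and ".." outright, and it goes through URLComponents because appendingPathComponent would encode the "%" a second time:

```swift
import Foundation

extension URL {
    /// Hypothetical helper: appends `component` as a single path component.
    func appendingSingleComponent(_ component: String) -> URL? {
        // Dot segments would still change the path after resolution, so reject them.
        guard component != ".", component != ".." else { return nil }

        // Allow the usual path characters except the separator itself.
        var allowed = CharacterSet.urlPathAllowed
        allowed.remove(charactersIn: "/")

        guard let encoded = component.addingPercentEncoding(withAllowedCharacters: allowed),
              var parts = URLComponents(url: self, resolvingAgainstBaseURL: true)
        else { return nil }

        // Write the already-encoded text directly so it is not encoded again.
        let separator = parts.percentEncodedPath.hasSuffix("/") ? "" : "/"
        parts.percentEncodedPath += separator + encoded
        return parts.url
    }
}

let bands = URL(string: "https://example.com/music/bands")!
print(bands.appendingSingleComponent("AC/DC")?.absoluteString ?? "rejected")
```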

tera · December 28, 2021, 8:13pm

The "AC/DC" example above is very good. Ideally URL would provide both abilities, e.g.:

let newUrl = url + .component("AC/DC")
let newUrl = url + "AC/DC" // same
let newUrl = url + .path("folder/file.txt")
let newUrl = url + .extension("png")
let newUrl = url + .components(["folder", "AC/DC", "file.txt"])

The third is a shortcut for:

let newUrl = url + "folder" + "file" + .extension("txt")

Karl · January 2, 2022, 11:37pm

This might be of interest to anybody wondering why it could be unwise for URL to just adopt the WHATWG URL parser (and to the Foundation team, if they are indeed looking at parser changes): Fixing the Unfixable: Story of a Google Cloud SSRF.

Today, Foundation will refuse to parse this URL: "http://creds\\@host.com". The back-slash is part of the "unwise" character set in RFC-2396 and so must be percent-encoded. However, RFC-3986 doesn't require back-slashes to be percent-encoded. The WHATWG model goes even further, and considers back-slashes and forward-slashes to be equivalent in certain schemes, so the URL is interpreted as "http://creds/@host.com" (note: the host is "creds").

What happened in this bug is that when the allow-list checker parsed the URL string, it saw the host as "host.com" (as per 3986), which was allowed, but ended up sending a request (using the WHATWG model) to an attacker-controlled domain and leaked authorisation tokens which were used to compromise Google's own cloud account.

URLs are a bit of a nightmare, as you can see - 3 different standards, 3 different parsing behaviours, and mixing them in the same program or even across systems which share data or communicate with each other can lead to (potentially severe) misunderstandings. You may even be mixing standards right now without even realising it (in fact, it's quite likely). The WHATWG parsing behaviour is not necessarily better, but it's the model which incorporates all the weird compatibility behaviour actors on the web platform need to avoid those misunderstandings.

So... adding the WHATWG parsing behaviour to the existing URL type would be pretty risky, IMO. It's quite reasonable to have them as separate types. Whilst it is often possible to convert between them, the process is quite delicate and even depends on ordering (e.g. String -> WHATWG -> 2396 may have a significantly different result to String -> 2396 -> WHATWG). But even if we used the type-system to isolate the different standards in this way, and even if we had the most careful conversion routines possible, it still would not be enough; one subsystem (e.g. the allow-list checker) might be using one standard, and another subsystem (e.g. the request engine) might use a different standard. At least if they use different types, there's more of a signal that they might behave slightly differently.
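One partial mitigation, sketched here with a hypothetical helper (vettedURL and the host names are assumptions for illustration), is to parse the string once and hand the same URL value to both the allow-list check and the request, rather than giving each subsystem the raw string. It does not help if the request engine turns the value back into a string and re-parses it under a different standard:

```swift
import Foundation

/// Hypothetical helper: returns a URL only if its host is on the allow-list.
/// The caller should issue the request with the returned value, not the input string.
func vettedURL(from string: String, allowedHosts: Set<String>) -> URL? {
    guard let url = URL(string: string),
          let host = url.host?.lowercased(),
          allowedHosts.contains(host)
    else { return nil }
    return url
}

let allowed: Set<String> = ["host.com"]
print(vettedURL(from: "https://host.com/token", allowedHosts: allowed) != nil)
print(vettedURL(from: "https://attacker.example/token", allowedHosts: allowed) == nil)
```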

Long-term, we're going to need a single standard throughout programs and across systems, and other systems will be using the WHATWG standard (because the web), so it is inevitable that we will need to, as well. There are lots of options for how that could be achieved with Foundation, but just changing URL's parser is probably not the best way.

But yeah - I thought this would be of interest to people in this discussion.

Saklad5 · January 5, 2022, 5:03pm

I completely agree: Foundation.URL’s existing behavior is not in a position to be changed.

It is far easier, and healthy for the community, to replace it instead. If necessary, Foundation can simply add initializers for converting to and from its successors (thereby removing the need for said successors to implement that themselves).

rex-remind · May 27, 2022, 2:02am

This does not seem correct or even "swifty", why is this being proposed as such?

One of Swift's values is being "Safe" (Swift.org - About Swift):

developer mistakes should be caught before software is in production. Opting for safety sometimes means Swift will feel strict, but we believe that clarity saves time in the long run.

Alternatives I would consider:

URL("invalid URL")!
URL(unsafe: "invalid URL")

Every app I've ever worked on mostly eschews ! to begin with, because when erroring, logging and letting the user know something is not right with that feature is better than crashing. An app which crashes may turn a user off forever, an app that works 95% but has a bug the user can back away from is always a better option (unless possibly all the app does is dependent on 1 URL, but I digress). Making the force unwrap hidden can lead to poorer quality applications and more frustrated users. Instead why not allow the developer to decide to potentially crash their app or not?

stackotter · May 27, 2022, 5:55am

The feature that @icharleshu was talking about is safe. A URL literal would be required to be known at compile time, meaning that the compiler can throw an error if the URL is invalid. This avoids the common pattern of initialising a URL from a string literal and then instantly unwrapping it, which isn’t fully safe because it would only catch invalid urls at runtime instead of compile time.

Initialising a URL from a string determined at runtime would still return an optional.
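For the runtime case, a small sketch of handling the optional without a force unwrap (LinkError and parseURL are hypothetical names). Exactly which strings URL(string:) rejects depends on the Foundation version; the empty string is one input it does reject:

```swift
import Foundation

// Hypothetical error type for reporting a string that could not be parsed.
enum LinkError: Error {
    case invalidURL(String)
}

/// Converts the optional result of URL(string:) into a thrown error.
func parseURL(_ string: String) throws -> URL {
    guard let url = URL(string: string) else {
        throw LinkError.invalidURL(string)
    }
    return url
}

do {
    let url = try parseURL("https://swift.org")
    print("Opening \(url)")
} catch {
    // Log or show a message instead of crashing.
    print("Could not open link: \(error)")
}
```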

dhoepfl · June 1, 2022, 6:51am

See this issue.

Muescha · August 31, 2022, 5:23pm

I have a use case where I would like to rename the file name without the extension. The solutions I see look something like this:

let ext = filePath.pathExtension
// Base name without the extension (not the parent directory's name)
let oldFilename = filePath.deletingPathExtension().lastPathComponent
let newFilename = oldFilename + " copy"
filePath.deleteLastPathComponent()
filePath.appendPathComponent(newFilename)
filePath.appendPathExtension(ext)

This reads as very verbose and smelly, and I think there should be a better syntax for it:

filePath.file.name = filePath.file.name + " copy"

where file is some URL.File object holding the last path component (which is not a directory), with these properties: { name, fullName (="name.ext"), ext, parentPath }

tera · September 2, 2022, 5:37pm

You can hide that complexity,

extension URL {
    var fileBaseName: String {
        get {
            deletingPathExtension().lastPathComponent
        }
        set {
            self = deletingLastPathComponent()
                .appendingPathComponent(newValue)
                .appendingPathExtension(pathExtension)
        }
    }
    var fileExtension: String {
        get {
            pathExtension
        }
        set {
            self = deletingPathExtension().appendingPathExtension(newValue)
        }
    }
}

and then use it like so:

file.fileBaseName += " copy"
file.fileExtension = "123"

sspringer · September 2, 2022, 6:52pm

What I am missing concerning URL (as it is the case with other parts of Foundation) are some easy to use methods like isFile, isDirectory, size, or osPath (with either slash or backslash according to the current system). As mentioned by others, other missing easy to use methods would involve the file manager under the hood, but I think having according methods of URL would make things much easier, e.g. copy one file to another with a force option, or finding files with names according to a regular expression with a findRecursively option. See there for some of those methods.
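A minimal sketch of what such convenience properties could look like today as thin wrappers over FileManager. The property names are hypothetical (chosen to avoid clashing with existing URL members), and the checks are synchronous and only meaningful for file URLs:

```swift
import Foundation

extension URL {
    /// Hypothetical: true if something exists at this path and it is a directory.
    var isExistingDirectory: Bool {
        var isDir: ObjCBool = false
        return FileManager.default.fileExists(atPath: path, isDirectory: &isDir) && isDir.boolValue
    }

    /// Hypothetical: true if something exists at this path and it is not a directory.
    var isExistingFile: Bool {
        var isDir: ObjCBool = false
        return FileManager.default.fileExists(atPath: path, isDirectory: &isDir) && !isDir.boolValue
    }

    /// Hypothetical: size in bytes, or nil if the attributes cannot be read.
    var fileSizeInBytes: Int? {
        let attributes = try? FileManager.default.attributesOfItem(atPath: path)
        return (attributes?[.size] as? NSNumber)?.intValue
    }
}

let tempDirectory = FileManager.default.temporaryDirectory
print(tempDirectory.isExistingDirectory)
```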

Muescha · September 3, 2022, 6:22am

Thx. A very nice solution.

tera · September 3, 2022, 1:22pm

"osPath" from your list is fine to do at the URL level; the others do indeed involve FileManager, and if I had those wrappers I'd make them "async". Note that "resourceValue" is particularly problematic due to (the lack of proper) cache handling and validation. Long-running calls like "copy" or "fileEqual" (and in fact pretty much everything in FileManager, due to its blocking nature) would benefit from supporting cancellation and progress reporting.

sspringer · September 3, 2022, 10:45pm

Async means that you need an async context to call them, I am not sure if this is always what you want; but yes, I have „the feeling“ that „all“ is somehow going async now… Or having both? (Hope async/await will soon also be usable on Windows.)

tera · September 5, 2022, 5:00pm

In some cases the result of a file manager operation is not even needed, and when it is, there are various means of getting it: synchronous/blocking execution, polling, promises, callbacks/closures/bindings, delegate methods, notifications. Some of these are more convenient (or even a must) for certain use cases (e.g. being notified about the progress of a copy operation, being able to cancel a running operation midway, or monitoring file or folder changes). See FileManagerDelegate for one of the alternatives.

alobaili · August 5, 2024, 11:37am

Hello @icharleshu & @Tony_Parker Did this proposal ever get a formal Proposal number? I looked for it in swift-foundation/Proposals at main · apple/swift-foundation · GitHub and couldn't find it.

Thank you.

alobaili · August 5, 2024, 11:51am

I was wondering if it would be an appropriate extension to this proposal to introduce new static URL properties for UIApplication.openSettingsURLString and UIApplication.openNotificationSettingsURLString so we can have something like URL.openSettings and URL.openNotificationSettings.

This will allow us to write convenient and safe code in SwiftUI when using the openURL environment value.

struct ContentView: View {
    @Environment(\.openURL) private var openURL

    var body: some View {
        Button("Change Language") {
            openURL(.openSettings)
        }
    }
}

Or is it something that is better kept under UIKit or SwiftUI and requires submitting feedback using Apple's Feedback Assistant instead?