Foundation URL Improvements

Hi all, Apple’s Foundation team is working on some improvements to URL, focused on simple changes we can make to improve its ergonomics. Given that this is such a widely used type in Swift, we’re interested in everyone’s input on these ideas. In particular, please let us know (1) whether you think this will improve the ergonomics of using URL and (2) whether there are other straightforward changes we should consider. Thanks!


URL Enhancements

  • Proposal: FOU-NNNN
  • Author(s): Charles Hu
  • Status: Active review

Introduction

URL is one of the most common types used by Foundation for File IO as well as network-related tasks. The goal of this proposal is to improve the ergonomics of URL by introducing some convenience methods and renaming some verbose or confusing ones. Specifically, we propose to:

  • Introduce a new StaticString initializer
  • Refine the existing "file path" initializers
  • Refine appendingPathComponent and friends
  • Introduce common directories as static URL properties

We will discuss each improvement in the following sections.

The New StaticString Initializer

URL represents two concepts: 1) an internet address such as https://www.apple.com, initialized by init?(string: String), and 2) a file path such as /System/Library/, initialized by init(fileURLWithPath: String). We propose to introduce a new StaticString initializer that can initialize a URL for both concepts. This initializer will require the StaticString to contain a URL scheme; it will initialize strings with the file:/// scheme as file paths and everything else as web addresses. Because this initializer takes a StaticString known at compile time, it considers "schemeless" strings a programmer error and asserts the existence of a scheme. It will also assert on invalid web addresses, which mainly means empty strings and strings with invalid characters. Here are some examples:

String Literal | URL Type | URL absoluteString
"https://www.apple.com" | Web Address | https://www.apple.com
"file:///var/mobile" | File Path | file:///var/mobile
"www.apple.com" | - | Assertion failure
"/System/Library/Frameworks" | - | Assertion failure
"file://www.ap le.com" | File Path | file:///current/dir/www.ap%20le.com
"https://www.ap le.com" | Web Address | Assertion failure

@available(TBD)
extension URL {
    /// Initialize a URL with a String literal. This initializer **requires** the
    /// string literal to contain a URL scheme such as `file://` or `https://` to correctly
    /// interpret the URL. It will `assert` the validity of web addresses
    /// (i.e. URLs that don't start with the `file` scheme).
    init(_ string: StaticString)
}
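
To make the intended behavior concrete, here is a minimal sketch of how a similar initializer could be approximated with today's API. The `literal:` argument label and the use of `preconditionFailure` are assumptions made for illustration; the proposal itself uses an unlabeled initializer and speaks of assertions, and it treats `file` URLs more leniently than this sketch does.

```swift
import Foundation

extension URL {
    /// Hypothetical helper (not the proposed API): builds a URL from a
    /// compile-time string and traps if it cannot be parsed or has no scheme.
    init(literal string: StaticString) {
        let text = string.description
        guard let url = URL(string: text), url.scheme != nil else {
            // Assumption: trap in all build configurations.
            preconditionFailure("Invalid or schemeless URL literal: \(text)")
        }
        self = url
    }
}

let home = URL(literal: "https://www.apple.com")
let folder = URL(literal: "file:///var/mobile")

print(home.absoluteString)  // https://www.apple.com
print(folder.isFileURL)     // true
```

Calling `URL(literal: "www.apple.com")` in this sketch traps at run time, because the string has no scheme.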

Revisit the FilePath Initializers

The new StaticString initializer conflicts with the existing FilePath based initializer because neither has an argument label and FilePath conforms to ExpressibleByStringLiteral. The FilePath initializer is problematic because it can lead to unexpected results. For example, contrary to what most developers would assume, URL("https://www.apple.com") actually initializes a file path (isFileURL returns true) because "https://www.apple.com" is interpreted as a FilePath due to its ExpressibleByStringLiteral conformance. We therefore propose to rename the FilePath family of initializers to init(filePath: FilePath), and to deprecate the original versions and mark them @_disfavoredOverload.

Similarly, we propose to rename the existing family of file path initializers, init(fileURLWithPath path: String) to init(filePath: String), to slightly improve the ergonomics of these methods due to the more concise naming. This will, of course, conflict with the new FilePath initializer, so we will mark them as @_disfavoredOverload because they require additional transformations to and from FilePath.

extension URL {
    @available(*, deprecated, renamed: "init(filePath:)")
    @_disfavoredOverload
    public init?(_ path: FilePath)

    @available(*, deprecated, renamed: "init(filePath:isDirectory:)")
    @_disfavoredOverload
    public init?(_ path: FilePath, isDirectory: Bool)

    @available(TBD)
    @_disfavoredOverload
    public init?(filePath: FilePath)

    @available(TBD)
    @_disfavoredOverload
    public init?(filePath: FilePath, isDirectory: Bool)
}

extension URL {
    @available(TBD)
    public init(filePath: String)

    @available(TBD)
    public init(filePath: String, isDirectory: Bool)

    @available(TBD)
    public init(filePath: String, relativeTo: URL?)

    @available(TBD)
    public init(filePath: String, isDirectory: Bool, relativeTo: URL?)

    @available(*, deprecated, renamed: "init(filePath:)")
    public init(fileURLWithPath: String)

    @available(*, deprecated, renamed: "init(filePath:isDirectory:)")
    public init(fileURLWithPath: String, isDirectory: Bool)

    @available(*, deprecated, renamed: "init(filePath:relativeTo:)")
    public init(fileURLWithPath: String, relativeTo: URL?)

    @available(*, deprecated, renamed: "init(filePath:isDirectory:relativeTo:)")
    public init(fileURLWithPath: String, isDirectory: Bool, relativeTo: URL?)
}
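
The reason an explicit label matters can be seen with the existing initializers alone. The sketch below uses only `URL(fileURLWithPath:)` and `URL(string:)` (no FilePath) to show that the same text yields two very different URLs depending on which initializer interprets it.

```swift
import Foundation

let text = "https://www.apple.com"

// Interpreted as a (relative) file path: the result is a file URL.
let asFilePath = URL(fileURLWithPath: text)

// Interpreted as a URL string: the result is a web address.
let asWebAddress = URL(string: text)

print(asFilePath.isFileURL)               // true
print(asWebAddress?.isFileURL ?? false)   // false
print(asWebAddress?.scheme ?? "none")     // https
```

With an unlabeled initializer, the reader cannot tell from the call site which of these two interpretations applies.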

Revisit appendingPathComponent and Friends

We also propose to rename the (painfully) long, but popular appendingPathComponent() and friends to .appending(path:), as well as adding a new .appending(paths:) variant that takes a collection of path components:

extension URL {
    @available(TBD)
    public mutating func append(path: String)

    @available(TBD)
    public mutating func append(path: String, isDirectory: Bool)

    @available(TBD)
    public func appending(path: String) -> URL

    @available(TBD)
    public func appending(path: String, isDirectory: Bool) -> URL

    @available(TBD)
    public mutating func append<C: Collection>(paths: C, isDirectory: Bool? = nil) where C.Element == String

    @available(TBD)
    public func appending<C: Collection>(paths: C, isDirectory: Bool? = nil) -> URL where C.Element == String

    @available(*, deprecated, renamed: "append(path:)")
    public mutating func appendPathComponent(_ pathComponent: String)

    @available(*, deprecated, renamed: "append(path:isDirectory:)")
    public mutating func appendPathComponent(_ pathComponent: String, isDirectory: Bool)

    @available(*, deprecated, renamed: "appending(path:)")
    public func appendingPathComponent(_ pathComponent: String) -> URL

    @available(*, deprecated, renamed: "appending(path:isDirectory:)")
    public func appendingPathComponent(_ pathComponent: String, isDirectory: Bool) -> URL
}

The addition of .appending(paths:) will make appending multiple path components to URL simpler. For example:

let baseURL: URL = ...
let id = UUID().uuidString

// Before
let photoURL = baseURL
    .appendingPathComponent("Photos")
    .appendingPathComponent("\(id.first!)")
    .appendingPathComponent(id)

// After
let photoURL = baseURL.appending(paths: ["Photos", "\(id.first!)", id])
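
For readers who want to try the idea today, a rough equivalent can be built on the existing `appendingPathComponent(_:)`. The method name `appendingComponents(_:)` below is hypothetical and chosen only to avoid clashing with the proposed `appending(paths:)`; the sketch also ignores the `isDirectory` parameter.

```swift
import Foundation

extension URL {
    /// Hypothetical stand-in for the proposed `appending(paths:)`:
    /// appends each element in order using the existing API.
    func appendingComponents<C: Collection>(_ components: C) -> URL
    where C.Element == String {
        components.reduce(self) { partial, component in
            partial.appendingPathComponent(component)
        }
    }
}

let base = URL(string: "https://example.com/media")!
let photo = base.appendingComponents(["Photos", "A", "cat.jpg"])

print(photo.absoluteString)  // https://example.com/media/Photos/A/cat.jpg
```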

Common Directories as Static URL Properties

We propose to add all "get URL" style methods from FileManager to URL as static members. This allows the call site to use Swift's static member lookup to get the URLs of predefined directories instead of always needing to spell out FileManager.default. We also propose to add a few more static directory URLs that correspond to FileManager.SearchPathDirectory. Note that these are computed properties, not stored ones, and not all of them will run in O(1). We will add documentation to specify the complexity of each.

@available(TBD)
extension URL {
    /// The working directory of the current process.
    /// Calling this property will issue a `getcwd` syscall.
    public static var currentDirectory: URL
    /// The home directory for the current user (~/).
    /// Complexity: O(1)
    public static var homeDirectory: URL
    /// The temporary directory for the current user.
    /// Complexity: O(1)
    public static var temporaryDirectory: URL
    /// Discardable cache files directory for the
    /// current user. (~/Library/Caches).
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var cachesDirectory: URL
    /// Supported applications (/Applications).
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var applicationDirectory: URL
    /// Various user-visible documentation, support, and configuration
    /// files for the current user (~/Library).
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var libraryDirectory: URL
    /// User home directories (/Users).
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var userDirectory: URL
    /// Documents directory for the current user (~/Documents)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var documentsDirectory: URL
    /// Desktop directory for the current user (~/Desktop)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var desktopDirectory: URL
    /// Application support files for the current
    /// user (~/Library/Application Support)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var applicationSupportDirectory: URL
    /// Downloads directory for the current user (~/Downloads)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var downloadsDirectory: URL
    /// Movies directory for the current user (~/Movies)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var moviesDirectory: URL
    /// Music directory for the current user (~/Music)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var musicDirectory: URL
    /// Pictures directory for the current user (~/Pictures)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var picturesDirectory: URL
    /// The user’s Public sharing directory (~/Public)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var sharedPublicDirectory: URL
    /// Trash directory for the current user (~/.Trash)
    /// Complexity: O(n) where n is the number of significant directories
    /// specified by `FileManager.SearchPathDirectory`
    public static var trashDirectory: URL
    /// Returns the home directory for the specified user.
    public static func homeDirectory(forUser user: String) -> URL?

    /// Locates and optionally creates the specified common directory in a domain.
    ///
    /// - parameter directory: The search path directory. The supported values are
    ///     described in FileManager.SearchPathDirectory.
    /// - parameter domain: The file system domain to search. The value for this
    ///     parameter is one of the constants described in
    ///     `FileManager.SearchPathDomainMask`. You should specify only one domain for
    ///     your search and you may not specify the allDomainsMask constant for
    ///     this parameter.
    /// - parameter url: The file URL used to determine the location of the returned
    ///     URL. Only the volume of this parameter is used. This parameter is ignored
    ///     unless the directory parameter contains the value
    ///     `FileManager.SearchPathDirectory.itemReplacementDirectory` and the domain
    ///     parameter contains the value `userDomainMask`.
    /// - parameter shouldCreate: Whether to create the directory if it does not
    ///     already exist.
    public init(
        for directory: FileManager.SearchPathDirectory, 
        in domain: FileManager.SearchPathDomainMask, 
        appropriateFor url: URL?, 
        create shouldCreate: Bool) throws
}

Now code for common file tasks such as writing files to the Downloads directory will be much cleaner:

let secretData = ...

// Before
let downloadDirectoryURL = FileManager.default.urls(for: .downloadsDirectory, in: .userDomainMask)[0]
let targetFolderURL = downloadDirectoryURL.appendingPathComponent("TopSecrets")
try FileManager.default.createDirectory(at: targetFolderURL, withIntermediateDirectories: true, attributes: nil)
let secretDataURL = targetFolderURL.appendingPathComponent("Secret.file")
try secretData.write(to: secretDataURL)

// After
let targetFolderURL: URL = .downloadsDirectory.appending(path: "TopSecrets")
try FileManager.default.createDirectory(at: targetFolderURL, withIntermediateDirectories: true, attributes: nil)
try secretData.write(to: targetFolderURL.appending(path: "Secret.file"))
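
The static-member style can be approximated with existing API by wrapping FileManager in a computed property. In this sketch the property name `scratchDirectory` and the function `storeSecret(_:)` are hypothetical, and the temporary directory is used instead of Downloads so the example can run anywhere.

```swift
import Foundation

extension URL {
    /// Hypothetical property for illustration; it simply forwards to
    /// FileManager, so it is computed on every access, not stored.
    static var scratchDirectory: URL { FileManager.default.temporaryDirectory }
}

/// Writes `data` to TopSecrets/Secret.file under the scratch directory.
func storeSecret(_ data: Data) throws -> URL {
    // Leading-dot lookup works because the declared type is URL.
    let folder: URL = .scratchDirectory.appendingPathComponent("TopSecrets", isDirectory: true)
    try FileManager.default.createDirectory(
        at: folder, withIntermediateDirectories: true, attributes: nil)
    let file = folder.appendingPathComponent("Secret.file")
    try data.write(to: file)
    return file
}

let written = try storeSecret(Data("hello".utf8))
print(written.lastPathComponent)  // Secret.file
```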

Impact on Existing Code

Very minimal. Most changes introduced are additive with a few additional method renames.

Alternatives Considered

Use FilePath for File IO Tasks

Swift System introduced FilePath to represent a location on the file system, which functionally overlaps with URL for File IO tasks. Although FilePath is more lightweight and has better cross-platform support, replacing URL with FilePath for File IO tasks in Foundation could lead to two major issues:

  • Bigger impact on existing code. Callers of FileManager will need to be updated to use FilePath.
  • URL supports resource value caching, a feature that many systems depend on. FilePath will need to support fetching and caching URLResourceValues before it could replace URL for file tasks.

73 Likes

Wow, those are huge ergonomic improvements! Excellent job!

6 Likes

FWIW it seems FilePath is quite carefully crafted and would perhaps be worth the friction if taking the long-term view - it’s a nice currency type and it’d be great to standardize on it throughout the Swift ecosystem for file paths… just 0.02c

12 Likes

There are some welcome changes, glad to see them happening.

What's even more interesting to me is that these changes are being proposed for Foundation, which I assume means they will find their way into both the closed-source version shipped with Apple's OSes and the open-source version. Is this setting a new precedent, or is this simply a one-off?

In terms of other straightforward changes, something off the top of my head, which isn't directly linked to the URL type but which I've had to build out previously, is URL query parameters. At the moment you can use the URLQueryItem type in conjunction with URLComponents etc., but it doesn't support arrays and dictionaries or formatting them down into strings accordingly. I've had to support this a few times, so it would be nice if this were visited separately. I actually don't know whether it's defined in a standard somewhere, but it seems to be relatively common in my experience.
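
As a rough illustration of the gap, here is one way to flatten an array into query items with the existing types. Repeating the key for each value is only one of several conventions in use (others use `tag[]=` or comma-separated values); the helper name `queryItems(name:values:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: encodes an array as repeated `name=value` pairs.
func queryItems(name: String, values: [String]) -> [URLQueryItem] {
    values.map { URLQueryItem(name: name, value: $0) }
}

var components = URLComponents(string: "https://example.com/search")!
components.queryItems = queryItems(name: "tag", values: ["swift", "url"])

print(components.url!.absoluteString)
// https://example.com/search?tag=swift&tag=url
```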

4 Likes

If you want the String initializers to “win”, then using @_disfavoredOverload is neither necessary nor desirable. Swift favors String by default; @_disfavoredOverload is necessary to favor other ExpressibleByStringLiteral types over String or to break ties between two other ExpressibleByStringLiteral types.

7 Likes

Always happy to see improvements to URL, and even happier to see improvements to Foundation being pitched on these forums! I hope this is the beginning of a trend ;)

This is a nice addition, although I have some questions:

  • The description says it will assert, but do you actually mean fatalError? What happens in a release build if the string cannot be parsed?
  • If you're doing this for string literals, why not go all the way and make it ExpressibleByStringLiteral?

Yep. I encountered the same issue in WebURL, and solved it in the same way (adding a filePath argument label). For symmetry, is it worth adding a url argument label to the FilePath.init(URL) initializer? (Note: this would be required if URL became ExpressibleByStringLiteral).

There's actually a subtle interaction here which I noticed while creating WebURL, and which essentially forced me to add a WebURL.init(filePath: String) initializer (I really wanted to abolish the idea of file paths as strings -- they are not strings, and certainly don't have to be UTF-8; they are binary data, and FilePath handles this correctly).

Basically, for Windows users, FilePath stores the contents as UTF-16. That means that constructing a file URL from a string literal would incur a performance penalty, as the string passes from UTF-8 (in the string literal) -> UTF-16 (in FilePath) -> UTF-8 again (for the URL).

In order to not penalise Windows users, file paths as string literals have to be consumed directly by the URL initialiser, and not go through FilePath.

Does this append a single path component, or multiple components? What happens if I call:

URL("https://example.com/git/my-repo").appending(path: "issues/2959")
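
For comparison, the existing `appendingPathComponent(_:)` can be probed with the same input. This sketch only prints what the current API produces, so readers can check the behavior on their own platform; it makes no claim about what the proposed `appending(path:)` would do.

```swift
import Foundation

let repo = URL(string: "https://example.com/git/my-repo")!

// Pass a string that contains a path separator to the existing API.
let issue = repo.appendingPathComponent("issues/2959")

print(issue.absoluteString)   // inspect whether "/" was kept or percent-encoded
print(issue.pathComponents)   // inspect how many components were added
```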

I think a lot more can be done with path components. This shortens some method names, which is nice, but I think a larger overhaul would be preferable.

Yes, please! This would be waaaaay better.

IMO, the best and Swiftiest™ approach to this would be to decouple these completely from the path/URL, and just return those values as a Dictionary or some new type. Hidden variables are the cause of all sorts of headaches, for example:

Do 2 URLs with the same string but different cached values compare as ==?

  • If no, that's an unintuitive user model. If I use a URL as a key in a Dictionary, I won't be able to find it again unless I happen to have precisely the same cached values.
  • If yes, I can insert a URL into a Set, retrieve it again, and possibly get back an older instance with stale cached values, making the feature unreliable. And how can I depend on a feature which is not reliable?

16 Likes

How long have you got? :sweat_smile:

In all seriousness, I think the biggest, easiest improvements would be to:

  1. Add .percentEncodedX properties to URL, so that people don't need to go through URLComponents to avoid URL's automatic percent-decoding. Automatic decoding is really, really, really bad and currently there is no way around it (even converting URL -> URLComponents will automatically percent-decode under certain circumstances, and can corrupt the path).

  2. Add a method which provides the URL's string buffer, with ranges of all of the components. This would be seriously useful for URL -> WebURL conversion -- currently, because there are differences between the standards (as well as bugs in URL), we need to request each component individually to ensure the converted URL points to the same location, and each component potentially allocates a String.
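
To illustrate the first point with existing API only: `URL.path` hands back a percent-decoded string, and getting the encoded form currently means a detour through URLComponents. This is a small sketch of that detour, not of the suggested `.percentEncodedX` properties.

```swift
import Foundation

let url = URL(string: "https://example.com/docs/my%20file.txt")!

// URL.path is automatically percent-decoded.
print(url.path)                       // /docs/my file.txt

// The encoded form is only reachable through URLComponents.
let components = URLComponents(url: url, resolvingAgainstBaseURL: false)!
print(components.percentEncodedPath)  // /docs/my%20file.txt
print(components.path)                // /docs/my file.txt
```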

There are more, but I think these 2 would be relatively straightforward and come with some big wins.

6 Likes

Long time listener, first-ish time caller. I apologise in advance for any misconceptions that my comment may reveal.

  • Bigger impact on existing code. Callers of FileManager will need to be updated to use FilePath.

I don’t think I understand why it is that existing code would need to be updated. Assuming that it would, I still don’t think this should necessarily be a dealbreaker, if the benefits obtained are meaningful enough.

FilePath will need to support fetching and caching URLResourceValues before it could replace URL for file tasks.

I would appreciate an exploration of adding that support, rather than discarding this option simply because it’s not currently supported.

As a more general comment, my perception of Foundation is that it has a lot of baggage, Objective-C and otherwise, and is a particularly unwieldy dependency. Rightly or wrongly, I find myself a little disappointed every time I have to resort to using it. Based on some comments on Swift System issues, it seems that it’s intended to be lower level than Foundation. If functionality fits there, it should be implemented there. If Swift users could accomplish certain tasks idiomatically with either Foundation or Swift System, we should prefer Swift System.

4 Likes

I love the literal initializer. Thanks!

Oh, and one more thing:

    /// The working directory of the current process.
    /// Calling this property will issue a `getcwd` syscall.
    public static var currentDirectory: URL

I really wish we wouldn't do this. I discussed some of the reasons on this swift-system PR. Basically: the current working directory is process-wide, mutable state, making it unsuitable in multithreaded environments. Many OS vendors (including Apple) have specific APIs for thread-local working directories to work around this.

Now, there are reasons why swift-system might not care about that. Its job is to represent the system APIs as they are, warts and all - but IMO it is dangerous to expose in a high-level library such as Foundation.

Introducing a Task-local current directory is probably out of scope for this proposal, but we shouldn't make things worse by adding sugar for dangerous APIs.
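
A small sketch of the hazard, using only existing FileManager API: the working directory is a single value for the whole process, so a change made anywhere affects how every relative file path is resolved afterwards. The file name `notes.txt` is arbitrary and the file does not need to exist.

```swift
import Foundation

let fm = FileManager.default
let original = fm.currentDirectoryPath

// Resolve a relative path, then change the process-wide working directory
// (as any other thread or library could) and resolve it again.
let first = URL(fileURLWithPath: "notes.txt")
let changed = fm.changeCurrentDirectoryPath(NSTemporaryDirectory())
let second = URL(fileURLWithPath: "notes.txt")

print(first.path)
print(second.path)  // usually differs from the first, same input string

// Restore the original directory so the sketch has no lasting effect.
_ = fm.changeCurrentDirectoryPath(original)
```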

13 Likes

Glad to hear about such useful improvements!

I was wondering if there was an initializer that allows explicitly setting the encoding strategy of URL components. I had encountered some old-styled APIs that took ; as query separators:

https://example.com/portal/notices;columnsId=123;pageNo=1

I didn't find a way to correctly initialize such a URL in Swift at that time, so I had to deploy a simple proxy for such an API. Is such a URL allowed today?
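
One way to check is to feed the string to the existing `URL(string:)` and inspect the pieces. The sketch below prints the components without asserting how the `;` segments are split between path and query, since that has varied between Foundation versions.

```swift
import Foundation

let text = "https://example.com/portal/notices;columnsId=123;pageNo=1"

if let url = URL(string: text) {
    print(url.absoluteString)                  // round-trips the input string
    print(url.path)                            // inspect where the ";" parts end up
    print(url.query ?? "no query component")
} else {
    print("URL(string:) rejected the input")
}
```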

In fact, I don't think FilePath has better cross-platform support at this time. FilePath is still a multi-platform API that behaves very differently across platforms (Windows versus other OSes).

I had proposed to generalize FilePath in swift-system#67, which will also allow existing FilePath APIs to be used in remote paths (like scp arguments and object storage services). I believe we shouldn't replace any existing APIs with FilePath until it has a stable cross-platform behavior. And once FilePath is generalized, it should also become an individual target instead of remaining a part of SystemPackage.

1 Like

Common Directories as Static URL Properties

I second Karl that this is not a good idea, in part because it would introduce a hidden, implicit dependency on the file system (similar to how I feel Date.now introduces a hidden, implicit dependency on the hardware clock).
While I understand the appeal (of making it easy to call using the static member lookup aka "dot-leading" syntax), this would have unexpected repercussions imho.

URL should be DTO-like, i.e. just a model object holding data, but not relying on querying external things (like the FileSystem and FileManager) to return its value. Not only because of thread-safety concerns as mentioned by Karl, but for consistency with the URL type's goal and design being a data holder only.

The fact that URL and FileManager are decoupled today is for a good reason imho, with FileManager encompassing the interaction with the file system and being able to check the presence of files on disk, and rely on the File System implementation (macOS vs Windows vs Linux, etc) of the currently mounted root, etc… while URL should be agnostic to all this and should not interact with hardware, system or disk.


That being said, the rest of the suggestions all look like welcome changes, and thanks for opening the discussion on this!

13 Likes

These look great! How about also adding something like this:

public func appending(query: String, value: String?) -> URL
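
A minimal sketch of how such a method could be layered on URLComponents today. The name `appendingQueryItem(name:value:)` is hypothetical, and returning `self` when the URL cannot be decomposed is an assumption made to keep the sketch non-optional.

```swift
import Foundation

extension URL {
    /// Hypothetical helper: returns a copy with one more query item.
    func appendingQueryItem(name: String, value: String?) -> URL {
        guard var components = URLComponents(url: self, resolvingAgainstBaseURL: true) else {
            return self  // assumption: fall back to the unchanged URL
        }
        components.queryItems = (components.queryItems ?? [])
            + [URLQueryItem(name: name, value: value)]
        return components.url ?? self
    }
}

let search = URL(string: "https://example.com/search?q=swift")!
let paged = search.appendingQueryItem(name: "page", value: "2")

print(paged.absoluteString)  // https://example.com/search?q=swift&page=2
```
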
2 Likes

That would be nice, wouldn't it?

Unfortunately, URL.appendingPathComponent just forwards to NSURL.appendingPathComponent, which makes blocking calls to the filesystem. So does URL(fileURLWithPath:) (here).

But I do agree that a URL type should be a pure model type, and that the results of URL manipulation should never depend on the state of the filesystem. That's why I also objected to the recent URL.lines async addition, which continues this trend of blurred lines -- you can request data (and text lines) using a URL, but the URL API should be kept pure and focussed on its task of... basically, string manipulation.

It would be impolite to link it here, but there are alternatives if you'd prefer a true model type for URLs.

15 Likes

I believe the goal of making the language easier through some of these ergonomics decisions is on point.

That said, the URL struct seems to be taking on responsibilities that are unrelated to what a URL is. Bringing FileManager functionality over to URL seems to offer little benefit beyond easy syntax for accessing the directories through static members. The recently added URL.lines is another example.

@Karl makes some great points about using dedicated tools for the job. If FilePath is a better tool for handling file and directory work, let’s normalize that instead of making URL fill too many roles without doing all of them well.

In my opinion, and if I understand this correctly, the codebases of many projects would not be as heavily affected as this proposal implies. Maybe an example would better communicate your motivation for not pursuing this alternative.

To clarify, I am in no way saying we should break everything in order to innovate, but rather that we should guide developers toward better tools for the job over time. Maybe even, at first, with an optional opt-in.

On another subject, I would love to see a refinement of, or replacement for, the use of an array of URLQueryItems to modify a URL’s query.

Ideally, there could be methods like set, append and remove on URL, and a URLQuery struct that allows iterating over the query items as a sequence and even getting their values through O(1) key access.

This would enrich the concept of a URL as a specialized string manipulator, which is its main responsibility.
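
Until something like that exists, keyed access can be approximated by building a dictionary from the query items. This is only a sketch: it drops items without a value and keeps the first value when a name repeats, both of which are arbitrary choices made for illustration.

```swift
import Foundation

let components = URLComponents(string: "https://example.com/list?page=2&sort=name&page=3")!

// Build a name -> value table for average-case O(1) lookups.
let pairs = (components.queryItems ?? []).compactMap { item in
    item.value.map { (item.name, $0) }
}
let lookup = Dictionary(pairs, uniquingKeysWith: { first, _ in first })

print(lookup["page"] ?? "missing")  // 2
print(lookup["sort"] ?? "missing")  // name
```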

7 Likes

I very warmly welcome the first public Foundation pitch (who would guess we'd read "FOU-NNNN" one day?). I hope to see more in the future :clap:

if there are other straightforward changes we should consider to help

I wonder, @icharleshu, if URL will become Sendable eventually.

15 Likes

Do I understand correctly that these are runtime assertions? Thus URL("invalid URL") would still compile and type-check as non-optional URL without any compile-time errors?

Also Comparable.

It sounds weird - like, surely everybody would have noticed already if URLs didn't conform to Comparable - but it's true! It doesn't conform!

print(URL(string: "http://example-a.com")! < URL(string: "http://example-b.com")!)
// ❌ Referencing operator function '<' on 'Comparable' requires that 'URL' conform to 'Comparable'

I find it really interesting that this isn't a bigger deal.
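
In the meantime, ordering URLs means choosing a key explicitly. The sketch below sorts by `absoluteString`, which is just one possible ordering and not necessarily what a Comparable conformance would define.

```swift
import Foundation

let urls = [
    URL(string: "http://example-b.com")!,
    URL(string: "http://example-a.com")!,
]

// Sort with an explicit string comparison instead of `<` on URL.
let sorted = urls.sorted { $0.absoluteString < $1.absoluteString }

print(sorted.map(\.absoluteString))
// ["http://example-a.com", "http://example-b.com"]
```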

1 Like

Very excited about this! A huge portion of static URL initializers that I encounter are just immediately force-unwrapped (e.g. URL("https://forums.swift.org")!) so making this the default in StaticString cases like this will be a huge improvement.

5 Likes