Foundation URL Improvements

I agree with the decision to avoid ExpressibleByStringLiteral conformance: it’s inappropriate for types that can’t accept all strings. The new StaticString initializer has runtime-checked preconditions, so the explicit call is important.

As for changing existing Foundation.URL behavior: I think the best thing to do is replace the type entirely with things like SystemPackage.FilePath and WebURL. Foundation.URL should be kept as a sort of legacy option.

Note that WebURL isn’t actually stable yet, so everyone should probably keep using Foundation.URL in production code anyway.

2 Likes

URL indeed feeds tons of APIs, from Data.init(contentsOf:options:) to URLRequest, and URLRequest itself feeds a lot of other tools: URLSession, URLProtocol... It's called "Foundation" for a reason.

2 Likes

Foundation is an excellent framework full of powerful tools. It is also literally incapable of fully embracing Swift, due to the need for Objective-C compatibility (enumerations, etc.). Its monolithic nature also makes it difficult to implement across all the platforms Swift can run on, which include things like embedded systems.

In light of that, Foundation should avoid breaking changes to defined behavior: ultimately, it should be replaced by modern Swift packages (the core libraries) and/or the Standard Library. And yes, that means that all of those will likely need to be replaced someday too. That day is not now.

5 Likes

URL is the most frequently used Foundation type after String / Data / Array / Dictionary. It is used in a gazillion places (QuickLook, Metal, CoreImage, EventKit, to name just a few of the less obvious ones). It seems it would be much easier to fix URL than to move everything on top of a new type. If WebURL can run in a special mode that makes it 100% compatible with URL (including API and bug-for-bug compatibility), then URL can be just a typealias for WebURL, and with an appropriate opt-in the bug-for-bug compatibility requirement can be lifted ("opt_in_for_fewer_url_bugs_and_faster_performance = true").

Foundation.URL, through no fault of the Foundation team, is a bad tool for the job. It conforms to an obsolete and ambiguous standard, tries to fill many disparate roles, and is shackled by the need to remain Objective-C compatible.

Three of the four examples you just gave suffered from the same sort of issues: Foundation.NSString, Foundation.NSArray, and Foundation.NSDictionary. They were vastly more common than Foundation.URL ever was or will be, and they were wisely replaced by completely new types. Sure, they did have to implement hacky toll-free bridging to make that happen, but it happened.

Foundation.URL is unfixable. Applying the fixes needed would render it unsuitable for at least some of its existing roles, and constitute an overhaul so comprehensive that you would really be replacing it with a new type anyway.

Its primary successors (SystemPackage.FilePath, WebURL, etc.) are specifically designed to make conversion as painless as it can be without compromising on other things. For the moment, WebURL is not stable. FilePath is, however, and everyone using Foundation.URL for files should be considering whether they can use it instead (converting to Foundation.URL as necessary for APIs still tied to it).

1 Like

Whatever the Foundation team does here is, at the end of the day, widely expected to be Objective-C compatible, even if the new implementation is in Swift with an Objective-C shim layer on top. Or are you suggesting that Objective-C users should not benefit from the latest and greatest changes?

I do not believe it is so bad. Please give concrete examples that prove its unfixability. The very question of this topic is "how can we improve it", and I believe we can, even if at the end of the day the improvement means:

 typealias URL = WebURL

with WebURL having some compatibility mode that replicates all URL's bugs and quirks (with explicit opt-out of the bugs & quirks).

Bridging is the crux of the matter here. You may change URL to WebURL or to whatever you like; just do the bridging so that, for all intents and purposes, it acts the same way as the original URL. That's the way to go.

There's a lot of great feedback in this thread. A sincere thank you to everyone who has responded so far. I'll leave some of the more specific responses on URL itself to @icharleshu but do want to discuss Foundation and its process at a higher level.

I talk about Foundation to a lot of people, and I have worked on the library myself for quite some time. I think one of the best ways to describe it is as a double-edged sword.

On one hand, it has the burden of maintaining compatibility with hundreds of thousands of distinct apps and libraries. Those apps use APIs like URL in many ways that are unanticipated or unexpected, beyond the promises we thought we made about how these types should behave.

On the other hand, it has the potential of improving hundreds of thousands of apps and libraries with even small changes to APIs like URL -- both in the implementation and in the interface.

There are certainly times where it makes sense to start over. We've done this a few times in Foundation for new Swift API over the past few releases. Most recently it was with AttributedString and the new FormatStyle family for formatting dates, numbers, and more. There are also times where it makes sense to improve the existing APIs, because then the cost of adoption is very low for all existing clients.

So, we have responsibility but also opportunity. These are the kinds of tradeoffs we make in our API design every day. Personally, I think it's an exciting place to be. This thread, and the one about Locale, are about opening up this opportunity to more people: the ones who count themselves as authors of those thousands of apps and libraries.

Our basic goal is to gather more ideas and feedback. Of course, we will have to do some categorization of changes on the spectrum of "minor, compatible improvement" to "complete rethink." Sometimes the latter may not be practical or even possible. I expect that different people with different interests will come to different conclusions about what the best path forward is. I don't think that we will end up in a place where literally everyone is in agreement. For this particular RFC, the changes are certainly more on the "minor, compatible improvement" side. That doesn't mean ideas on the other side aren't valuable or interesting. We may just need to split them up.

Overall, what I believe is that your ideas and discussions here, along with the experience and time commitment from the Foundation team, will result in a better API for everyone in the end.

Now, as far as a concrete process for this RFC and also Locale: what we hope to do is open up this discussion here for a few weeks and iterate on changes. Then we will put a bow on it so that we can move on to other APIs and ideas. Hopefully we can land implementations in swift-corelibs-foundation around that time so that people can try out these APIs before they are finalized. There isn't a formal process yet, because honestly I'd like to grow one a bit more organically to see what works best for all of us, because the needs of a library like this are a bit different than those for the compiler and language. We'll figure it out together.

36 Likes

As far as the language is concerned, I think the biggest issue Foundation (and all system frameworks) have is portability: once you use it in Swift code, especially packages, the code becomes considerably less versatile. Platform support becomes a very complicated question.

I believe that people will be more willing to adopt Apple’s libraries and technologies if they could be confident they will work on non-Apple platforms. I’d certainly be more willing to use Combine if it was an open-source package. Since Foundation is dependent on the Objective-C runtime, that’s only really feasible if Foundation’s functionality is reimplemented as various packages too.

There’s been a lot of progress recently towards accomplishing that, and I think it should continue.

2 Likes

@icharleshu

(1) you think this will improve the ergonomics of using URL

Those, definitely, yes.
I create almost exactly the same extensions for projects that have to deal with local file system paths.
Very happy to see the type being reconsidered.

I often create extensions that resemble:

extension URL {
    static var cachesDirectory: URL { get }
    var contents: [URL] { get }
    var fileExists: Bool { get }
    var isDirectory: Bool { get }
    func remove()
    func appendingDir(path: String) -> URL
    func appendingFile(path: String) -> URL
    func createDirectoryIfNeeded() -> URL
}
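
A minimal sketch of how a few of these helpers might be implemented on top of FileManager, assuming the URL is a file URL. The names mirror the list above and are illustrative only; they are not part of Foundation, and the error handling (returning an empty array on failure) is an assumption made for brevity.

```swift
import Foundation

extension URL {
    /// True if a file or directory exists at this file URL.
    var fileExists: Bool {
        FileManager.default.fileExists(atPath: path)
    }

    /// True if this file URL points at an existing directory.
    var isDirectory: Bool {
        var isDir: ObjCBool = false
        let exists = FileManager.default.fileExists(atPath: path, isDirectory: &isDir)
        return exists && isDir.boolValue
    }

    /// Immediate children of this directory, or an empty array on failure.
    var contents: [URL] {
        (try? FileManager.default.contentsOfDirectory(
            at: self, includingPropertiesForKeys: nil)) ?? []
    }

    /// Creates the directory (and any missing parents) if it does not exist yet.
    @discardableResult
    func createDirectoryIfNeeded() throws -> URL {
        try FileManager.default.createDirectory(
            at: self, withIntermediateDirectories: true)
        return self
    }
}
```

Wrapping FileManager this way keeps call sites short, at the cost of hiding which calls actually touch the disk.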


(2) if there are other straightforward changes we should consider to help. Thanks!

  • Pure Swift? (This is Foundation improvement so pointless?)
  • As already mentioned by others, URL query support incl. automatic (or, as it's mentioned above that there's no single rule, so something close) percent encoding would be nice.
  • More imports from FileManager?

I know this goes beyond 'improvements' (please don't take this too seriously), but this is a good opportunity to express what I've been feeling.

I remember the very first time I started to learn Swift and use NSURL (it was Swift 1 or 2 I guess), I got confused that it handles both Remote (http, etc) and Local (file://) schemes (I didn't have Obj-C experience back then).

Now I know how the type, Foundation and ObjC behave and it's okay, but sometimes it still feels a bit (just a little bit) awkward that it's mixed in a single type. Kind of feels like 2 types are in 1 type, suffering from ObjC legacy.

I mean, ultimately I want something like:

public protocol URL
public struct WebURL : URL
public struct FileURL : URL

2 Likes

The replacements for Foundation.URL are in the works, and they work pretty much exactly like that: SystemPackage.FilePath for files, WebURL.WebURL for webpages.

FilePath is already stable, and as the original post says is the better option if you aren’t dependent on resource value caching.

WebURL is not stable yet, but it will be eventually.

2 Likes

In addition to the wonderful URL.resourceBytes, the ability to read a number of bytes from an offset would be great to have, e.g. in this form:

url.resourceBytes[10000 ..< 10200]

If that's impossible, then e.g. this:

url.bytes(offset: 10000, size: 200)

For a file URL, reading from an offset is straightforward; for a web URL, the HTTP Range header can be used, raising an error if the server doesn't support it.

The slice could be possible in a general form on all async sequences and not just URL's AsyncBytes:

extension AsyncSequence {
  public subscript<R: RangeExpression>(_ range: R) -> AsyncSlice<Self> where R.Bound == Int {
    let r = range.relative(to: 0..<Int.max)
    return AsyncSlice(self, range: r)
  }
}

public struct AsyncSlice<Base: AsyncSequence> {
  let base: Base
  let range: Range<Int>
  init(_ base: Base, range: Range<Int>) {
    self.base = base
    self.range = range
  }
}

extension AsyncSlice: AsyncSequence {
  public typealias Element = Base.Element
  
  public struct Iterator: AsyncIteratorProtocol {
    let range: Range<Int>
    var iterator: Base.AsyncIterator
    var index = 0
    
    init(_ iterator: Base.AsyncIterator, range: Range<Int>) {
      self.iterator = iterator
      self.range = range
    }
    
    public mutating func next() async rethrows -> Element? {
      // skip the elements that precede the start of the range
      while index < range.lowerBound {
        _ = try await iterator.next()
        index += 1
      }
      
      if index >= range.upperBound {
        return nil
      }
      defer { index += 1 }
      return try await iterator.next()
    }
  }
  
  public func makeAsyncIterator() -> Iterator {
    Iterator(base.makeAsyncIterator(), range: range)
  }
}

I think those types of additions should be considered separately for their own merit and not just on URL.
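
For comparison, a small sketch of the same kind of slice expressed with operators that AsyncSequence already provides in the standard library (dropFirst(_:) and prefix(_:)). The AsyncStream source and the `sliceDemo` function name are assumptions made only for illustration. Like the slice above, this still consumes the leading elements rather than skipping them at the source.

```swift
// Collects elements 3..<6 of a ten-element async sequence.
func sliceDemo() async -> [UInt8] {
    // Stand-in source; a real caller would use something like URL's byte sequence.
    let source = AsyncStream<UInt8> { continuation in
        for byte in UInt8(0)..<10 { continuation.yield(byte) }
        continuation.finish()
    }

    var result: [UInt8] = []
    // Drop the first 3 elements, then keep the next 3.
    for await byte in source.dropFirst(3).prefix(3) {
        result.append(byte)
    }
    return result
}
```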

3 Likes

So long as the solution leads to "HTTP range header" for web URLs and "read from offset" for file URLs (and thus won't incur unnecessary I/O of reading some leading bytes and throwing them away), it's good. To put it differently, read(offset:length:) must move O(length) bytes across the bus / wire. It could possibly be accompanied by some readAnyway(offset:length:) that first tries the fast path and, if that fails (say, for web URLs on servers that do not support HTTP range headers), performs the operation anyway by reading all bytes from the start and throwing the leading bytes away, moving O(offset + length) bytes across the bus / wire.

3 Likes

Ah, you are talking about a different request than just a normal URL fetch, then. The general solution is perhaps not ideal for that particular action. I think these improvements are focused more on the ergonomics of URL as a storage type than on the networking side of it; that portion has a different set of folks responsible for it.

1 Like

Got you.

A quick & dirty attempt of mine.
extension URL {
    /// transfers O(range.count) bytes across the bus / wire
    func bytes(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> URLSession.AsyncBytes? {
        if scheme == "file" {
            fatalError("TODO file url")
        } else {
            return await getBytes(in: range, delegate: delegate)
        }
    }
    
    /// tries to transfer O(range.count) bytes across the bus / wire
    /// if that fails, transfers O(range.upperBound) bytes across the bus / wire
    func bytesAnyway(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> URLSession.AsyncBytes? {
        if scheme == "file" {
            fatalError("TODO file url")
        } else {
            if let bytes = await getBytes(in: range, delegate: delegate) {
                return bytes
            }
            let bytes = await getBytes(in: nil, delegate: delegate)
            fatalError("TODO: fetch range from bytes")
        }
    }
    
    /// private method that optionally uses range request
    private func getBytes(in range: ClosedRange<Int>?, delegate: URLSessionTaskDelegate?) async -> URLSession.AsyncBytes? {
        var request = URLRequest(url: self)
        if let range = range {
            // HTTP byte ranges are inclusive at both ends, like ClosedRange
            request.addValue("bytes=\(range.lowerBound)-\(range.upperBound)", forHTTPHeaderField: "Range")
        }
        // return nil rather than crashing on network errors or non-HTTP responses
        guard let (bytes, response) = try? await URLSession.shared.bytes(for: request, delegate: delegate),
              let r = response as? HTTPURLResponse else {
            return nil
        }
        let status = r.statusCode
        guard status >= 200 && status < 300 else {
            return nil
        }
        if range != nil {
            // a server that honours the Range header answers 206 Partial Content;
            // a plain 200 means the header was ignored and the full body follows
            guard status == 206 else {
                return nil
            }
        }
        return bytes
    }
}

For file URL range reading, some manual code based on either DispatchIO or FileHandle is possibly the way to go.
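
A minimal sketch of the FileHandle route, assuming a synchronous read is acceptable and the whole requested range fits in memory. The `readBytes(of:in:)` helper is hypothetical; it uses FileHandle's throwing seek(toOffset:) and read(upToCount:) calls, so only the requested bytes are read from disk.

```swift
import Foundation

/// Hypothetical helper: reads `range` from a local file without
/// reading (and discarding) the bytes that precede it.
func readBytes(of fileURL: URL, in range: Range<Int>) throws -> Data {
    let handle = try FileHandle(forReadingFrom: fileURL)
    defer { try? handle.close() }

    // Jump straight to the start of the range.
    try handle.seek(toOffset: UInt64(range.lowerBound))

    // read(upToCount:) returns nil at end of file and may return
    // fewer bytes than requested if the file is shorter than the range.
    return try handle.read(upToCount: range.count) ?? Data()
}
```

An async variant would need to wrap this in a task or use DispatchIO; that part is left out here.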

Possible API choices:

extension URL {
    // Obvious choices of optional vs. throwing

    // Option 1: two names
    func urlBytes(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> URLSession.AsyncBytes?
    func fileBytes(in range: ClosedRange<Int>) async -> FileHandle.AsyncBytes?

    // Option 2: same name overloaded by result type
    func bytes(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> URLSession.AsyncBytes?
    func bytes(in range: ClosedRange<Int>) async -> FileHandle.AsyncBytes?
    
    // Option 3: combined result
    enum CommonAsyncBytes {
        case url(URL.AsyncBytes)
        case fileHandle(FileHandle.AsyncBytes)
    }
    func bytes(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> CommonAsyncBytes?

    // Option 4 (delegate is unused for file url and optional for web url)
    func bytes<T: AsyncSequence>(in range: ClosedRange<Int>, delegate: URLSessionTaskDelegate? = nil) async -> T?
}

Hi @icharleshu. As many have said, this is a delight to read and many of these changes are definitely improvements.

My principal feedback has to do with the cascade of @_disfavoredOverloads in order to make everything work; a substantial amount of time is spent discussing this and, having reflected for a few days, I think literal conformance is the answer you're looking for. Here's why—

First, a bit of a detour to say that I agree with @Karl's point quoted below:

Indeed, Swift's syntax has been changed explicitly so that, for types expressible by a literal, the second line has identical meaning to the former (i.e., it uses the literal initializer and not the unlabeled converting initializer), a change that was made because users so pervasively confused the two.

And to your original point, it's not the case that a type expressible by a literal has to be expressible by all values representable by that literal; indeed, by that standard, it would be very wrong to conform UInt8 to ExpressibleByIntegerLiteral!

With that background out of the way, the reason I suggest that you consider this route in connection with the issue of @_disfavoredOverloads is this:

  1. Suppose you adopt ExpressibleByStringLiteral with StaticString as your string literal type; then, URL("https://example.com/") will behave exactly as you propose due to that conformance alone.

  2. The FilePath initializers would not then require deprecation with @_disfavoredOverload, or a labeled initializer, to disambiguate. Both of these necessary changes stem from (as you say) the user assumption that URL("https://example.com") will behave as described in (1), which it will do simply with ExpressibleByStringLiteral conformance without any cascading change required to these APIs.

  3. If you can avoid creating a new labeled initializer in (2) for init(filePath: FilePath)—which, incidentally, is suboptimal because it incurs the "muffin man" problem common to Obj-C APIs, -(id)doYouKnowTheMuffinMan:(TheMuffinMan *)theMuffinMan;, that Swift naming guidelines tried to eliminate—you can also avoid having to use @_disfavoredOverload for your proposed init(filePath: String) APIs, since there wouldn't be another overload with the same label. To me, using an undocumented feature for compatibility reasons as in (2) is one thing, but using it for a newly introduced API not to conflict with another newly introduced API is (IMHO) kind of a "design smell."

I think overall this would greatly simplify and minimize the number of changes needed for the desired behavior of URL("https://www.apple.com/") to "just work." Additionally, just outright conforming to ExpressibleByStringLiteral would (as discussed above) align with empirical evidence that users have a hard time distinguishing between unlabeled converting initializers and literal initialization. (I wonder also if it would obviate some of @Karl's concerns regarding unnecessary UTF8-to-UTF16 roundtripping on Windows.)
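
To illustrate point (1), here is a rough sketch using a hypothetical wrapper type, `LiteralURL`, rather than Foundation.URL itself. It assumes validation is done with URL(string:) and that an invalid literal should trap at run time; neither detail comes from the proposal.

```swift
import Foundation

/// Hypothetical wrapper, not the proposed Foundation API.
struct LiteralURL: ExpressibleByStringLiteral {
    let url: URL

    // Using StaticString as the literal type means only literals can reach
    // this initializer: a StaticString cannot be built from a runtime String.
    init(stringLiteral value: StaticString) {
        guard let url = URL(string: value.description) else {
            preconditionFailure("Invalid URL literal: \(value)")
        }
        self.url = url
    }
}

// Both spellings go through init(stringLiteral:); the second does so because
// of literal initialization via coercion (SE-0213).
let annotated: LiteralURL = "https://example.com/"
let called = LiteralURL("https://example.com/")
```

No unlabeled String initializer is declared here, so there is no overload for the literal form to be confused with.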

11 Likes

Swift.UInt8 has compile-time checking to ensure that unrepresentable integer literals aren’t used.

Until it is possible to implement compile-time literal evaluation in Swift, I am strongly opposed to using literals for URL.

No custom numeric type has such compile-time checking. Would you argue that no fixed-width integer type outside the standard library should conform to ExpressibleByIntegerLiteral? This would be plainly inconsistent with how FixedWidthInteger has been designed.

That’s because of a failing of the language, and in theory they shouldn’t conform to it.

So, by your theory, custom numeric types shouldn’t conform to integer literal protocols. By the same token, custom set types (ordered sets, etc.) should not be expressible by array literals, because they don’t reject non-unique values at compile time, and custom dictionary types should not be expressible by dictionary literals, because they don’t reject non-unique keys at compile time?

I fail to see a single literal type for which a third-party conformance—or even in some cases, existing conformances even in the standard library—could adhere to your theory. No, this theory plainly contradicts the actual design of Swift. There is neither a requirement for compile-time literal evaluation nor a requirement that every literal value should be accepted.
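
As a sketch of the kind of conformance being discussed, here are two hypothetical types (`Percent` and `UniqueList`, both invented for illustration). Neither can reject a bad literal at compile time: one traps at run time, the other silently drops duplicates.

```swift
/// Hypothetical bounded integer type; the range check happens at run time.
struct Percent: ExpressibleByIntegerLiteral {
    let value: Int

    init(integerLiteral value: Int) {
        precondition((0...100).contains(value), "Percent out of range")
        self.value = value
    }
}

/// Hypothetical ordered-set-like type; duplicate literal elements are dropped.
struct UniqueList<Element: Hashable>: ExpressibleByArrayLiteral {
    private(set) var elements: [Element] = []

    init(arrayLiteral items: Element...) {
        var seen = Set<Element>()
        // Keep only the first occurrence of each element, preserving order.
        for item in items where seen.insert(item).inserted {
            elements.append(item)
        }
    }
}

let half: Percent = 50
let list: UniqueList = [1, 2, 2, 3]   // compiles despite the duplicate
```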

3 Likes