(core library) modern URL types

Okay, so a few days ago there was a discussion
<https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20170814/038923.html>
about getting pure Swift file system support into Foundation or another
core library. In my opinion, doing this requires a total overhaul of
the `URL` type (which is currently little more than a wrapper for NSURL),
so I’ve just started a pure Swift URL library project at
<https://github.com/kelvin13/url>.

The library’s parsing and validation core (~1K lines of pure Swift) is already
in place and functional; the goal is to eventually support all of the
Foundation URL functionality.

The new `URL` type is implemented as a value type with UTF-8 storage backed
by an array buffer. Each URL value is just 56 bytes, so it should be
able to fit within a cache line. (NSURL, by comparison, is over 128 bytes in
size; it’s only saved by the fact that it is passed around as a reference
type.)
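
To make the idea concrete, here is a minimal sketch of a value type that keeps its text as UTF-8 bytes in an array buffer. The names `SketchURL` and `schemeEnd` are hypothetical and are not taken from the library, and the size this reports will not necessarily match the 56 bytes quoted above, since the real type presumably stores more than this.

```swift
// Hypothetical sketch; not the layout used by kelvin13/url.
struct SketchURL {
    // UTF-8 code units of the whole URL. The array itself is a single
    // reference to a heap buffer, so copies of the struct stay cheap.
    private let utf8: [UInt8]
    // Offset of the ":" that ends the scheme, found once at init time.
    private let schemeEnd: Int

    init?(_ text: String) {
        let bytes = Array(text.utf8)
        guard let colon = bytes.firstIndex(of: UInt8(ascii: ":")), colon > 0 else {
            return nil
        }
        self.utf8 = bytes
        self.schemeEnd = colon
    }

    // Decodes only the scheme bytes on demand.
    var scheme: String {
        return String(decoding: utf8[..<schemeEnd], as: UTF8.self)
    }
}

// Inline size of the struct in bytes, excluding the heap buffer.
let inlineSize = MemoryLayout<SketchURL>.size
```

Storing offsets next to the buffer is one way to keep the inline footprint small: components are sliced out of the same bytes instead of being stored as separate strings.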

As I said, this is still really early on and not a mature library at all,
but everyone is invited to observe, provide feedback, or contribute!

I'm not convinced that URLs are the appropriate abstraction for a file system path. For the record, I'm not a fan of existing Foundation methods that create objects from a URL. There is a useful and fundamental difference between a local path and a remote path, and conflating the two has been a security pain point in many languages and frameworks that allow it. Examples include remote file inclusion in PHP and malicious doctypes in XML. Windows has also had its share of issues with UNC paths.

Even when loading an arbitrary URL looks innocuous, many de-anonymizing hacks work by causing a program to access a URL controlled by an attacker, which makes it disclose the user's IP address or some other identifier.

IMO, this justifies having separate types to handle local and remote resources, so that developers at least have to be explicit about allowing remote resources. This makes a new URL type less necessary for supporting file I/O.
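
A rough sketch of what separate types could look like follows. `LocalPath`, `RemoteLocation`, and both functions are hypothetical names made up for illustration; none of them exist in Foundation.

```swift
// Hypothetical types for illustration; neither exists in Foundation.
struct LocalPath {
    let components: [String]
}

struct RemoteLocation {
    let address: String
}

// An API that only ever touches the file system takes LocalPath, so a
// remote address cannot reach it without an explicit conversion.
func describeLocal(_ path: LocalPath) -> String {
    return "/" + path.components.joined(separator: "/")
}

// Remote access is a separate, clearly named entry point.
func describeRemote(_ location: RemoteLocation) -> String {
    return "remote resource at " + location.address
}

let config = LocalPath(components: ["etc", "hosts"])
let feed = RemoteLocation(address: "https://example.com/feed")
// describeLocal(feed) would be rejected by the compiler.
```

With this shape, allowing remote input becomes a decision that shows up in the function signature rather than a runtime check.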

Félix

···

In reply to Taylor Swift via swift-evolution, 20 August 2017 at 21:37.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

I think that’s more a problem with the Foundation methods that take
URLs than with URLs themselves. A URL is a useful abstraction because
it captures a lot of functionality common to local file paths and
internet resources, for example relative URI reference resolution. APIs
which take URLs as arguments should be responsible for ensuring that the
URL’s scheme is `file:`. The new URL type I’m writing actually makes the
scheme an enum with cases `.file`, `.http`, `.https`, `.ftp`, and `.data`
to ease checking this.
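
The check described above could look roughly like this. Only the five case names come from this message; `SchemedURL`, `FileAccessError`, and `localPath(of:)` are hypothetical, and the library's real declarations may well differ.

```swift
// Hypothetical sketch; only the case names come from the message above.
enum Scheme: String {
    case file, http, https, ftp, data
}

struct SchemedURL {
    let scheme: Scheme
    let path: String
}

enum FileAccessError: Error {
    case notAFileURL(Scheme)
}

// A file-only API checks the scheme before doing anything else.
func localPath(of url: SchemedURL) throws -> String {
    guard url.scheme == .file else {
        throw FileAccessError.notAFileURL(url.scheme)
    }
    return url.path
}
```

Because the scheme is an enum rather than a string, the comparison is a cheap, typo-proof equality test.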

···

In reply to Félix Cloutier, Mon, Aug 21, 2017 at 2:23 AM.

I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is absolutely the wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.
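
As a rough illustration of the split between the two kinds of path, here is a minimal sketch. Only the type names `AbsolutePath` and `RelativePath` come from the paragraph above; every member shown is an assumption, and this is not PathKit's API.

```swift
// Hypothetical sketch of the two path types; not PathKit's API.
struct RelativePath {
    let components: [String]

    init(_ text: String) {
        components = text.split(separator: "/").map(String.init)
    }
}

struct AbsolutePath {
    let components: [String]

    // Fails for text that does not start at the root.
    init?(_ text: String) {
        guard text.first == "/" else { return nil }
        components = text.split(separator: "/").map(String.init)
    }

    private init(components: [String]) {
        self.components = components
    }

    // Only a relative path can be appended, and the result stays absolute.
    func appending(_ relative: RelativePath) -> AbsolutePath {
        return AbsolutePath(components: components + relative.components)
    }

    var string: String {
        return "/" + components.joined(separator: "/")
    }
}
```

The type of a value then says whether it is anchored at the root, so there is no need for a runtime check comparable to `.isFileURL`.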

Dave

···

In reply to Félix Cloutier via swift-evolution, Aug 20, 2017, at 11:23 PM.

I have an alternative implementation of a URI (a superset of URL). It is
quite ready to use but still requires some polishing and support for
additional specific URI schemes.
It is designed primarily for server-side usage but can be useful on a
client too. The main goals of the design are correctness (according to
RFC 3986 and RFC 7230) and efficiency.

You can take a look at it here: GitHub - my-mail-ru/swift-URI: Swift implementation of a URI in accordance with RFC3986

···

In reply to Taylor Swift via swift-evolution, 2017-08-21 7:38 GMT+03:00.

There's no question that there's a lot in common between file: URLs and other URLs, but that hardly makes URLs better in the file: case.

I saw the enum. The problem remains that it's a common API principle to accept the broadest possible input. It's not fundamentally wrong to accept and resolve common URL types, as there's a ton of things that need to read documents from the Internet by design. However, if this is the default facility for file handling too, it creates either:

   - A responsibility for API users to check that their URL is a file: URL; or
   - A responsibility for API authors to provide separate entry points for file URLs and remote URLs, and a responsibility for API users to use the right one.

It would also add a category of errors to common filesystem operations: "can't do this because the URL is not a file: URL", and a category of questions that we arguably shouldn't need to ask ourselves. For instance, what is the correct result of glob()ing a file: URL pattern with a hash or a query string? Should each element include that hash/query string too?

Félix

···

In reply to Taylor Swift, 20 August 2017 at 23:33.

There’s also a library somewhat similar to PathKit called FileKit (https://github.com/nvzqz/FileKit). FileKit has some functionality I’m not sure how I feel about, such as different types based on the type of data in the file. It also uses a lot of Foundation for NSBacked types, which breaks Linux compatibility.

I do agree that URL is not the best option for file paths, despite the many similarities with URL paths, and agree that much of Foundation should be ported to pure Swift in order to increase Linux compatibility.

···

In reply to Dave DeLong via swift-evolution, Aug 22, 2017, at 10:25 AM.

So are you saying we need *three* distinct “URI” types for local-absolute,
local-relative, and remote? That’s a lot of API surface to support.

···

In reply to Dave DeLong, Tue, Aug 22, 2017 at 12:24 PM.

Great work! It looks like there is quite a lot of duplicated work going on
here, though, which is unfortunate. How do we reconcile these two
implementations?

···

In reply to Aleksey Mashanov, Thu, Sep 21, 2017 at 12:16 PM.

If value-based generics become a thing, then you’ll be able to specify a URL type with `URL<.file>`, which would pretty much solve this problem.
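
Swift does not have value-based generics, so `URL<.file>` cannot be written as such. The closest approximation I can sketch uses marker types as generic parameters. Every name below (`SchemeMarker`, `FileScheme`, `HTTPSScheme`, `TypedURL`, `openFile`) is hypothetical.

```swift
// Hypothetical sketch: marker types stand in for the scheme values that
// value-based generics would let us write as URL<.file>.
protocol SchemeMarker {
    static var name: String { get }
}

enum FileScheme: SchemeMarker { static let name = "file" }
enum HTTPSScheme: SchemeMarker { static let name = "https" }

struct TypedURL<S: SchemeMarker> {
    // Everything after the "scheme:" prefix.
    let rest: String

    var string: String {
        return S.name + ":" + rest
    }
}

// Accepts file URLs only; passing TypedURL<HTTPSScheme> does not compile.
func openFile(_ url: TypedURL<FileScheme>) -> String {
    return "opening " + url.string
}
```

The scheme is then checked at compile time, at the cost of one marker type per scheme.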

···

In reply to Félix Cloutier via swift-evolution, Aug 21, 2017, at 11:24 AM.

That might be true, but I’d need to hear from more people to justify
creating two very similar URI types. Technically `file:` is just a string
that the file system interpreter gives meaning to; the scheme of a URL can
be any alphanumeric string. For example, you could write a file manager
which interprets "foo:" as a local address, or ignore the scheme entirely.
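
To illustrate the point that the scheme is just a string the consumer interprets, here is a small sketch. For what it's worth, RFC 3986 defines a scheme as a letter followed by letters, digits, `+`, `-`, or `.`, so the parser below follows that grammar. `parseScheme`, `localSchemes`, and `isLocal` are hypothetical names.

```swift
// Sketch: splits off the scheme using the RFC 3986 grammar
// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ).
func parseScheme(_ text: String) -> String? {
    let bytes = Array(text.utf8)
    guard let colon = bytes.firstIndex(of: UInt8(ascii: ":")), colon > 0 else {
        return nil
    }
    func isAlpha(_ b: UInt8) -> Bool {
        return (b >= 65 && b <= 90) || (b >= 97 && b <= 122)  // A-Z, a-z
    }
    func isDigit(_ b: UInt8) -> Bool {
        return b >= 48 && b <= 57  // 0-9
    }
    // The first character must be a letter.
    guard isAlpha(bytes[0]) else { return nil }
    for b in bytes[1..<colon] {
        // 43, 45, 46 are "+", "-", "."
        let allowed = isAlpha(b) || isDigit(b) || b == 43 || b == 45 || b == 46
        if !allowed { return nil }
    }
    return String(decoding: bytes[..<colon], as: UTF8.self)
}

// A hypothetical file manager that decides for itself which schemes are local.
let localSchemes: Set<String> = ["file", "foo"]

func isLocal(_ text: String) -> Bool {
    guard let scheme = parseScheme(text) else { return false }
    return localSchemes.contains(scheme.lowercased())
}
```

Nothing in the grammar marks `file` as special; treating `foo:` as local is purely a decision made by `isLocal`.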

···

On Mon, Aug 21, 2017 at 11:23 AM, Félix Cloutier <felixcloutier@icloud.com> wrote:

There's no question that there's a lot in common between file: URLs and
other URLs, but that hardly makes URLs better in the file: case.

I saw the enum. The problem remains that it's a common API principle to
accept the broadest possible input. It's not fundamentally wrong to accept
and resolve common URL types, as there's a ton of things that need to read
documents from the Internet by design. However, if this is the default
facility for file handling too, it invents either:

   - A responsibility for API users to check that their URL is a file:
   URL;
   - A responsibility for API authors to provide separate entry points
   for file URLs and remote URLs, and a responsibility for API users to use
   the right one.

It would also add a category of errors to common filesystem operations:
"can't do this because the URL is not a file: URL", and a category of
questions that we arguably shouldn't need to ask ourselves. For instance,
what is the correct result of glob()ing a file: URL pattern with a hash or
a query string, should each element include that hash/query string too?

Félix

Le 20 août 2017 à 23:33, Taylor Swift <kelvin13ma@gmail.com> a écrit :

I think that’s more a problem that lies in the Foundation methods that take
URLs than in the URLs themselves. A URL is a useful abstraction because
it captures a lot of functionality common to local file paths and
internet resources, such as relative URI reference resolution. APIs
that take URLs as arguments should be responsible for ensuring that the
URL’s scheme is `file:`. The new URL type I’m writing actually makes
the scheme an enum with cases `.file`, `.http`, `.https`, `.ftp`, and
`.data` to ease checking this.
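
To make the scheme-checking idea concrete, here is a minimal sketch in Swift. It is not the API of the library under discussion: `Scheme`, `ParsedURL`, `FileAccessError`, and `localPath(of:)` are hypothetical names, and the parsing is deliberately simplistic (it splits at the first colon and ignores hosts, queries, and fragments).

```swift
// Hypothetical sketch; none of these names come from the library in the thread.
enum Scheme: String {
    case file, http, https, ftp, data
}

struct ParsedURL {
    let scheme: Scheme
    let rest: String   // everything after "scheme:"

    // Returns nil when the scheme is missing or not one of the enum cases.
    init?(_ string: String) {
        guard let colon = string.firstIndex(of: ":"),
              let scheme = Scheme(rawValue: string[..<colon].lowercased())
        else { return nil }
        self.scheme = scheme
        self.rest = String(string[string.index(after: colon)...])
    }
}

enum FileAccessError: Error {
    case notAFileURL(Scheme)
}

// An API that only makes sense for local files checks the scheme up front.
func localPath(of url: ParsedURL) throws -> String {
    guard url.scheme == .file else {
        throw FileAccessError.notAFileURL(url.scheme)
    }
    // Drop the empty authority of "file:///..." (a simplification).
    return url.rest.hasPrefix("//") ? String(url.rest.dropFirst(2)) : url.rest
}
```

With this shape, the "not a file: URL" failure is an explicit error case that every file-oriented API has to handle, which is exactly the extra category of errors raised as a concern earlier in the thread.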


I suppose, if you squint at it weirdly.

My current Path API is a “Path” protocol, with “AbsolutePath” and “RelativePath” struct versions. The protocol defines a path to be an array of path components. The only real difference between an AbsolutePath and a RelativePath is that all file system operations would only take an AbsolutePath. A URL would also only provide an AbsolutePath as its “path” bit.

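public enum PathComponent {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: Array<PathComponent> { get }
    init(_ components: Array<PathComponent>) // used by protocol extensions that mutate paths, such as appending components
}

// Each struct stores its components to satisfy the protocol requirements.
public struct AbsolutePath: Path {
    public let components: Array<PathComponent>
    public init(_ components: Array<PathComponent>) { self.components = components }
}
public struct RelativePath: Path {
    public let components: Array<PathComponent>
    public init(_ components: Array<PathComponent>) { self.components = components }
}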

By separating out the concepts of an absolute and a relative path, I can put additional functionality on each one where it makes semantic sense (for example, you cannot concatenate two absolute paths, but you can concatenate any path with a relative path). It also lets me require that all file system operations take an AbsolutePath.
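
As a rough illustration of how the type split can enforce those rules at compile time, here is a self-contained sketch. The `appending(_:)` method and the `isRoot(_:)` function are assumptions made up for this example; the message does not show the actual extensions.

```swift
enum PathComponent: Equatable {
    case this
    case up
    case item(name: String, extension: String?)
}

protocol Path {
    var components: [PathComponent] { get }
    init(_ components: [PathComponent])
}

struct AbsolutePath: Path {
    let components: [PathComponent]
    init(_ components: [PathComponent]) { self.components = components }
}

struct RelativePath: Path {
    let components: [PathComponent]
    init(_ components: [PathComponent]) { self.components = components }
}

extension Path {
    // Hypothetical helper: appending a relative path keeps the receiver's type,
    // so absolute + relative is absolute and relative + relative is relative.
    func appending(_ other: RelativePath) -> Self {
        return Self(components + other.components)
    }
}

// Stand-in for a file system operation: it accepts only an AbsolutePath.
func isRoot(_ path: AbsolutePath) -> Bool {
    return path.components.isEmpty
}

let home = AbsolutePath([.item(name: "Users", extension: nil),
                         .item(name: "dave", extension: nil)])
let notes = RelativePath([.item(name: "notes", extension: "txt")])
let full = home.appending(notes)   // inferred as AbsolutePath

// home.appending(home) does not compile: the argument must be a RelativePath.
// isRoot(notes) does not compile either: a RelativePath is not an AbsolutePath.
```

The point of the sketch is that both mistakes are rejected by the type checker rather than discovered at runtime.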

One of the key things I realized is that a “Path” type should not be ExpressibleByStringLiteral, because you cannot statically determine if a Path should be absolute or relative. However, one of the initializers for an AbsolutePath would handle things like expanding a tilde, and both types try to reduce a set of components as much as possible (by filtering out “.this” components, and handling “.up” components where possible, etc). Also in my experience, it’s fairly rare to want to deal with a known-at-compile-time, hard-coded path. Usually you’re dealing with paths relative to known “containers” that are determined at runtime (current user’s home folder, app’s sandboxed documents directory, etc).
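
The component reduction described above could look roughly like the following sketch. The function name `reduced(_:isAbsolute:)` is an assumption, tilde expansion is left out, and the collapsing is purely lexical, so it can disagree with the file system when a component is a symbolic link.

```swift
enum PathComponent: Equatable {
    case this
    case up
    case item(name: String, extension: String?)
}

// Collapses "." and ".." components where that is possible.
// At the root of an absolute path a leading ".." is dropped; a relative
// path has to keep it, because its base directory is not known yet.
func reduced(_ components: [PathComponent], isAbsolute: Bool) -> [PathComponent] {
    var result: [PathComponent] = []
    for component in components {
        switch component {
        case .this:
            continue                      // "." never changes the location
        case .up:
            if let last = result.last, last != .up {
                result.removeLast()       // ".." cancels the previous item
            } else if !isAbsolute {
                result.append(.up)        // unresolved ".." stays in a relative path
            }
        case .item:
            result.append(component)
        }
    }
    return result
}
```

For example, reducing `a/./b/..` gives `a`, while reducing the relative path `../..` leaves both `..` components in place.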

Another thing I’ve done is that no direct file system operations exist on AbsolutePath (like “.exists” or “.createDirectory(…)” or whatever); those are still on FileManager/FileHandle/etc in the form of extensions to handle the new types. In my app, a path is just a path, and it only has meaning based on the thing that is using it. An AbsolutePath for a URL is used differently than an AbsolutePath on a file system, although they are represented with the same “AbsolutePath” type.
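
A hedged sketch of what such a FileManager extension might look like follows. The `string` property, the "/" separator, and the `fileExists(at:)` overload are assumptions for illustration; only `FileManager.fileExists(atPath:)` is an existing Foundation call.

```swift
import Foundation

enum PathComponent {
    case this
    case up
    case item(name: String, extension: String?)
}

struct AbsolutePath {
    let components: [PathComponent]

    // Renders a POSIX-style string; the "/" separator is an assumption.
    var string: String {
        let names = components.map { (component: PathComponent) -> String in
            switch component {
            case .this: return "."
            case .up: return ".."
            case .item(let name, let ext):
                if let ext = ext { return "\(name).\(ext)" }
                return name
            }
        }
        return "/" + names.joined(separator: "/")
    }
}

extension FileManager {
    // The path type stays a plain value; the file system meaning lives here.
    func fileExists(at path: AbsolutePath) -> Bool {
        return fileExists(atPath: path.string)
    }
}
```

Because the overload takes an AbsolutePath, a relative path or a remote URL cannot reach the file system through it by accident.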

I’m not saying this is a perfect API of course, or even that a hypothetical stdlib-provided Path should mimic this. I’m just saying that for my use-case, this has vastly simplified how I deal with paths, because both URL and String smell really bad for what I’m doing.

Dave

···

On Aug 22, 2017, at 12:37 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

So are you saying we need three distinct “URI” types for local-absolute, local-relative, and remote? That’s a lot of API surface to support.

On Tue, Aug 22, 2017 at 12:24 PM, Dave DeLong <delong@apple.com> wrote:
I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is the absolute wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.

Dave


Since an absolute URI is still relative to root, I feel like there should be a way to finagle relative and absolute URIs into a single type.


I would absolutely love to see an API like AbsolutePath / RelativePath for file system operations!


And then you could make APIs that are specific to certain schemes, and maybe even have file references working again.

extension URL where Scheme == .file {
  var path: String { … } // non-optional
}

extension URL where Scheme == .fileReference {
  var path: String? { … } // optional
}
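
The syntax above depends on value-based generics, which Swift does not have. Something similar can be approximated today with marker types as the generic parameter. In the sketch below, `URLScheme`, `FileScheme`, `HTTPSScheme`, and `TypedURL` are all hypothetical names, and the path extraction is simplified.

```swift
// Marker types stand in for the enum cases; all names are hypothetical.
protocol URLScheme {
    static var name: String { get }
}
enum FileScheme: URLScheme { static let name = "file" }
enum HTTPSScheme: URLScheme { static let name = "https" }

struct TypedURL<Scheme: URLScheme> {
    let body: String   // everything after "scheme:"

    var description: String {
        return "\(Scheme.name):\(body)"
    }
}

extension TypedURL where Scheme == FileScheme {
    // Only file URLs expose a non-optional local path.
    var path: String {
        // Drop the empty authority of "file:///..." (a simplification).
        return body.hasPrefix("//") ? String(body.dropFirst(2)) : body
    }
}

let hosts = TypedURL<FileScheme>(body: "///etc/hosts")
let site = TypedURL<HTTPSScheme>(body: "//swift.org")
// hosts.path is available; site.path does not compile.
```

The trade-off is that each scheme needs its own marker type, and the scheme has to be known when the value is created.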

Charles

···

On Aug 21, 2017, at 11:29 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

If value-based generics become a thing, then you’ll be able to specify a URL type with URL<.file> which would pretty much solve this problem.

It could be problematic to initialize, however. You would need to know Scheme in URL<Scheme> ahead of initialization time for at least two reasons: first because value types aren't polymorphic, and second because the size of a type can depend on its generic parameters.

(If we have a good solution for both, then that would be great, yes.)
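
One way to picture the initialization concern, using marker types in place of value generics: a parser only learns the scheme at runtime, so it cannot return a single `TypedURL<Scheme>`. The sketch below works around that with a sum type that has one case per supported scheme. Every name in it is hypothetical, and this is only one possible approach.

```swift
protocol URLScheme {
    static var name: String { get }
}
enum FileScheme: URLScheme { static let name = "file" }
enum HTTPSScheme: URLScheme { static let name = "https" }

struct TypedURL<Scheme: URLScheme> {
    let body: String   // everything after "scheme:"
}

// The scheme is discovered while parsing, so the result is wrapped in an
// enum and callers switch on it to recover the statically typed URL.
enum AnyTypedURL {
    case file(TypedURL<FileScheme>)
    case https(TypedURL<HTTPSScheme>)

    init?(_ string: String) {
        guard let colon = string.firstIndex(of: ":") else { return nil }
        let body = String(string[string.index(after: colon)...])
        switch string[..<colon].lowercased() {
        case FileScheme.name:
            self = .file(TypedURL(body: body))
        case HTTPSScheme.name:
            self = .https(TypedURL(body: body))
        default:
            return nil   // unsupported scheme
        }
    }
}
```

Callers that already know which scheme they expect can construct `TypedURL<FileScheme>` directly and skip the wrapper.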

Félix


Yes, and a URI class that doesn’t provide any FS operations, but only takes care of proper URI parsing and building.

···

Le 23 août 2017 à 12:03, Jakob Egger via swift-evolution <swift-evolution@swift.org> a écrit :

I would absolutely love to see an API like AbsolutePath / RelativePath for file system operations!

On 22. Aug 2017, at 21:02, Dave DeLong via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

I suppose, if you squint at it weirdly.

My current Path API is a “Path” protocol, with “AbsolutePath” and “RelativePath” struct versions. The protocol defines a path to be an array of path components. The only real difference between an AbsolutePath and a RelativePath is that all file system operations would only take an AbsolutePath. A URL would also only provide an AbsolutePath as its “path” bit.

public enum PathComponent {
    case this // “."
    case up // “..”
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: Array<PathComponent> { get }
    init(_ components: Array<PathComponent>) // used on protocol extensions that mutate paths, such as appending components
}

public struct AbsolutePath: Path { }
public struct RelativePath: Path { }

By separating out the concept of an Absolute and a Relative path, I can put additional functionality on each one to make semantic sense (you cannot concatenate two absolute paths, but you can concat any path with a relative path, for example). Or all file system operations must take an AbsolutePath.

One of the key things I realized is that a “Path” type should not be ExpressibleByStringLiteral, because you cannot statically determine if a Path should be absolute or relative. However, one of the initializers for an AbsolutePath would handle things like expanding a tilde, and both types try to reduce a set of components as much as possible (by filtering out “.this” components, and handling “.up” components where possible, etc). Also in my experience, it’s fairly rare to want to deal with a known-at-compile-time, hard-coded path. Usually you’re dealing with paths relative to known “containers” that are determined at runtime (current user’s home folder, app’s sandboxed documents directory, etc).

Another thing I’ve done is that no direct file system operations exist on AbsolutePath (like “.exists” or “.createDirectory(…)” or whatever); those are still on FileManager/FileHandle/etc in the form of extensions to handle the new types. In my app, a path is just a path, and it only has meaning based on the thing that is using it. An AbsolutePath for a URL is used differently than an AbsolutePath on a file system, although they are represented with the same “AbsolutePath” type.

I’m not saying this is a perfect API of course, or even that a hypothetical stdlib-provided Path should mimic this. I’m just saying that for my use-case, this has vastly simplified how I deal with paths, because both URL and String smell really bad for what I’m doing.

Dave

On Aug 22, 2017, at 12:37 PM, Taylor Swift <kelvin13ma@gmail.com <mailto:kelvin13ma@gmail.com>> wrote:

So are you saying we need three distinct “URI” types for local-absolute, local-relative, and remote? That’s a lot of API surface to support.

On Tue, Aug 22, 2017 at 12:24 PM, Dave DeLong <delong@apple.com <mailto:delong@apple.com>> wrote:
I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is the absolute wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/\). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.

Dave

On Aug 20, 2017, at 11:23 PM, Félix Cloutier via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

I'm not convinced that URLs are the appropriate abstraction for a file system path. For the record, I'm not a fan of existing Foundation methods that create objects from an URL. There is a useful and fundamental difference between a local path and a remote path, and conflating the two has been a security pain point in many languages and frameworks that allow it. Examples include remote file inclusion in PHP and malicious doctypes in XML. Windows also had its share of issues with UNC paths.

Even when loading an arbitrary URL looks innocuous, many de-anonymizing hacks work by causing a program to access an URL controlled by an attacker to make it disclose the user's IP address or some other identifier.

IMO, this justifies that there should be separate types to handle local and remote resources, so that at least developers have to be explicit about allowing remote resources. This makes a new URL type less necessary towards supporting file I/O.

Félix

Le 20 août 2017 à 21:37, Taylor Swift via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> a écrit :

Okay so a few days ago there was a discussion <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20170814/038923.html&gt; about getting pure swift file system support into Foundation or another core library, and in my opinion, doing this requires a total overhaul of the `URL` type (which is currently little more than a wrapper for NSURL), so I’ve just started a pure Swift URL library project at <https://github.com/kelvin13/url&gt;\.

The library’s parsing and validation core (~1K loc pure swift) is already in place and functional; the goal is to eventually support all of the Foundation URL functionality.

The new `URL` type is implemented as a value type with utf8 storage backed by an array buffer. The URLs are just 56 bytes long each, so they should be able to fit into cache lines. (NSURL by comparison is over 128 bytes in size; it’s only saved by the fact that the thing is passed as a reference type.)

As I said, this is still really early on and not a mature library at all but everyone is invited to observe, provide feedback, or contribute!
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

+1

-Thorsten

On 23 Aug 2017, at 12:03, Jakob Egger via swift-evolution <swift-evolution@swift.org> wrote:

I would absolutely love to see an API like AbsolutePath / RelativePath for file system operations!

On 22. Aug 2017, at 21:02, Dave DeLong via swift-evolution <swift-evolution@swift.org> wrote:

I suppose, if you squint at it weirdly.

My current Path API is a “Path” protocol, with “AbsolutePath” and “RelativePath” struct versions. The protocol defines a path to be an array of path components. The only real difference between an AbsolutePath and a RelativePath is that all file system operations would only take an AbsolutePath. A URL would also only provide an AbsolutePath as its “path” bit.

public enum PathComponent {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: Array<PathComponent> { get }
    init(_ components: Array<PathComponent>) // used on protocol extensions that mutate paths, such as appending components
}

public struct AbsolutePath: Path { }
public struct RelativePath: Path { }

By separating out the concept of an Absolute and a Relative path, I can put additional functionality on each one to make semantic sense (you cannot concatenate two absolute paths, but you can concat any path with a relative path, for example). It also lets me require that all file system operations take an AbsolutePath.
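As a rough illustration of the concatenation rule, the declarations above could be filled in along these lines. The stored properties and the `appending(_:)` method are assumptions added to make the sketch compile; they are not taken from the actual app.

```swift
public enum PathComponent: Equatable {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: [PathComponent] { get }
    init(_ components: [PathComponent])
}

public struct AbsolutePath: Path {
    public let components: [PathComponent]
    public init(_ components: [PathComponent]) { self.components = components }
}

public struct RelativePath: Path {
    public let components: [PathComponent]
    public init(_ components: [PathComponent]) { self.components = components }
}

extension Path {
    // Any path can be extended by a relative path; the result keeps the
    // receiver's type. There is deliberately no overload taking an
    // AbsolutePath, so appending one absolute path to another does not compile.
    public func appending(_ other: RelativePath) -> Self {
        return Self(components + other.components)
    }
}

let home = AbsolutePath([.item(name: "Users", extension: nil),
                         .item(name: "dave", extension: nil)])
let notes = RelativePath([.item(name: "notes", extension: "txt")])
let full = home.appending(notes) // AbsolutePath with three components
```

The point of the sketch is that the "cannot concatenate two absolute paths" rule is enforced by the type checker rather than by a runtime check.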

One of the key things I realized is that a “Path” type should not be ExpressibleByStringLiteral, because you cannot statically determine if a Path should be absolute or relative. However, one of the initializers for an AbsolutePath would handle things like expanding a tilde, and both types try to reduce a set of components as much as possible (by filtering out “.this” components, and handling “.up” components where possible, etc). Also in my experience, it’s fairly rare to want to deal with a known-at-compile-time, hard-coded path. Usually you’re dealing with paths relative to known “containers” that are determined at runtime (current user’s home folder, app’s sandboxed documents directory, etc).
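The component reduction described above might look something like the following. This is a sketch under assumptions: the function name `reduced` and the exact policy for a leading ".." (it is kept, since a relative path can legitimately start with one) are guesses, not the actual implementation.

```swift
enum PathComponent: Equatable {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

// Drops "." components and cancels each ".." against a preceding named item.
// A ".." with nothing to cancel is kept in the output.
func reduced(_ components: [PathComponent]) -> [PathComponent] {
    var result: [PathComponent] = []
    for component in components {
        switch component {
        case .this:
            continue
        case .up:
            if let last = result.last, case .item = last {
                result.removeLast()
            } else {
                result.append(.up)
            }
        case .item:
            result.append(component)
        }
    }
    return result
}

let a = PathComponent.item(name: "a", extension: nil)
let b = PathComponent.item(name: "b", extension: nil)
let c = PathComponent.item(name: "c", extension: "txt")

// "a/./b/../c.txt" reduces to "a/c.txt"
let simplified = reduced([a, .this, b, .up, c])
```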

Another thing I’ve done is that no direct file system operations exist on AbsolutePath (like “.exists” or “.createDirectory(…)” or whatever); those are still on FileManager/FileHandle/etc in the form of extensions to handle the new types. In my app, a path is just a path, and it only has meaning based on the thing that is using it. An AbsolutePath for a URL is used differently than an AbsolutePath on a file system, although they are represented with the same “AbsolutePath” type.
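A minimal sketch of that split, assuming a simplified `AbsolutePath` that stores plain component names: the path type stays a pure value, and the file system query lives in a `FileManager` extension. The `string` property and the `fileExists(at:)` overload are hypothetical names; `fileExists(atPath:)` is the existing Foundation method being wrapped.

```swift
import Foundation

// Simplified stand-in for the AbsolutePath type discussed above: it only
// stores component names and knows nothing about the file system.
struct AbsolutePath {
    let components: [String]

    // POSIX-style rendering; an empty component list is the root "/".
    var string: String {
        return "/" + components.joined(separator: "/")
    }
}

extension FileManager {
    // File system behaviour is added to FileManager, not to the path type.
    func fileExists(at path: AbsolutePath) -> Bool {
        return fileExists(atPath: path.string)
    }
}

let root = AbsolutePath(components: [])
let rootExists = FileManager.default.fileExists(at: root)
```

Because the path type carries no file system behaviour, the same value could also be used as the path portion of a URL.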

I’m not saying this is a perfect API of course, or even that a hypothetical stdlib-provided Path should mimic this. I’m just saying that for my use-case, this has vastly simplified how I deal with paths, because both URL and String smell really bad for what I’m doing.

Dave

On Aug 22, 2017, at 12:37 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:

So are you saying we need three distinct “URI” types for local-absolute, local-relative, and remote? That’s a lot of API surface to support.

On Tue, Aug 22, 2017 at 12:24 PM, Dave DeLong <delong@apple.com> wrote:
I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is the absolute wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.

Dave

Totally agree!

-Thorsten

On 24 Aug 2017, at 20:06, Jean-Daniel via swift-evolution <swift-evolution@swift.org> wrote:

Yes, and a URI class that doesn’t provide any FS operations and only takes care of proper URI parsing and building.

On 23 Aug 2017, at 12:03, Jakob Egger via swift-evolution <swift-evolution@swift.org> wrote:

I would absolutely love to see an API like AbsolutePath / RelativePath for file system operations!
