(core library) modern URL types

Just putting it out there that the iOS/macOS app experience with paths is likely to be very different from the server (or system program) experience. Most of your paths are relative to a container because the effective root of your data is the container. I would tend to believe that a lot of your paths, while not fully static, could be expressed with something like "${DOCUMENTS}/static/part", where ${DOCUMENTS} is the only thing that effectively changes. A system-level program just happens to use / as its prefix.

Most platforms already have some notion of file domains, which are really just path prefixes (the user's home folder, the file system root, the network root, etc). I think that some level of support for these would be desirable (even if it were only a set of functions that return different path prefixes). It's also my humble opinion that tilde expansion should be part of a larger shell expansion feature to avoid surprises.
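To illustrate what "a set of functions that return different path prefixes" might look like, here is a minimal sketch. The names `FileDomain`, `prefix`, and `path(_:)` are made up for this example and are not part of any proposal; only `NSHomeDirectory()` and `NSTemporaryDirectory()` are real Foundation calls.

```swift
import Foundation

// Hypothetical type for illustration only: each case stands for one "file domain".
enum FileDomain {
    case root
    case home
    case temporary

    /// The path prefix for this domain, resolved at runtime.
    var prefix: String {
        switch self {
        case .root:      return "/"
        case .home:      return NSHomeDirectory()
        case .temporary: return NSTemporaryDirectory()
        }
    }

    /// Joins a static relative part onto the domain's prefix.
    func path(_ relative: String) -> String {
        // Drop a trailing "/" so the result never contains "//".
        let base = prefix.hasSuffix("/") ? String(prefix.dropLast()) : prefix
        return base + "/" + relative
    }
}

// Only the prefix changes between environments; "static/part" stays fixed.
let settings = FileDomain.home.path("static/part")
print(settings)
```

A system-level program would use `FileDomain.root` in the same way, which is the "just happens to use / as its prefix" case.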

I think that I support the conclusion.

Félix

On Aug 22, 2017, at 12:02, Dave DeLong <delong@apple.com> wrote:

I suppose, if you squint at it weirdly.

My current Path API is a “Path” protocol, with “AbsolutePath” and “RelativePath” struct versions. The protocol defines a path to be an array of path components. The only real difference between an AbsolutePath and a RelativePath is that all file system operations would only take an AbsolutePath. A URL would also only provide an AbsolutePath as its “path” bit.

public enum PathComponent {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: Array<PathComponent> { get }
    init(_ components: Array<PathComponent>) // used on protocol extensions that mutate paths, such as appending components
}

public struct AbsolutePath: Path { }
public struct RelativePath: Path { }

By separating out the concept of an Absolute and a Relative path, I can put additional functionality on each one to make semantic sense (you cannot concatenate two absolute paths, but you can concatenate any path with a relative path, for example). It also lets me require that all file system operations take an AbsolutePath.
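The concatenation rule can be expressed in the type system. The following is only a sketch of one way to do it: the stored properties and the `appending(_:)` method are assumptions added for illustration, and only the type names and the `components`/`init` requirements come from the message above.

```swift
public enum PathComponent: Equatable {
    case this
    case up
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: [PathComponent] { get }
    init(_ components: [PathComponent])
}

public struct AbsolutePath: Path {
    public let components: [PathComponent]
    public init(_ components: [PathComponent]) { self.components = components }
}

public struct RelativePath: Path {
    public let components: [PathComponent]
    public init(_ components: [PathComponent]) { self.components = components }
}

extension Path {
    /// Hypothetical method: appending a relative path keeps the receiver's kind,
    /// so absolute + relative is absolute and relative + relative is relative.
    public func appending(_ other: RelativePath) -> Self {
        return Self(components + other.components)
    }
}

let home = AbsolutePath([.item(name: "Users", extension: nil),
                         .item(name: "dave", extension: nil)])
let notes = RelativePath([.item(name: "notes", extension: "txt")])
let full = home.appending(notes)   // statically an AbsolutePath

// home.appending(home) does not compile: the argument must be a RelativePath.
```

Because the parameter type is `RelativePath`, joining two absolute paths is rejected by the compiler rather than detected at runtime.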

One of the key things I realized is that a “Path” type should not be ExpressibleByStringLiteral, because you cannot statically determine if a Path should be absolute or relative. However, one of the initializers for an AbsolutePath would handle things like expanding a tilde, and both types try to reduce a set of components as much as possible (by filtering out “.this” components, and handling “.up” components where possible, etc). Also in my experience, it’s fairly rare to want to deal with a known-at-compile-time, hard-coded path. Usually you’re dealing with paths relative to known “containers” that are determined at runtime (current user’s home folder, app’s sandboxed documents directory, etc).
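As a rough idea of what "reducing" a set of components could mean, here is a minimal sketch. The function name `reduce(_:)` is hypothetical and the rules are assumptions: the reduction is purely lexical (it ignores symlinks), and a leading `.up` is kept because it cannot be resolved without a base path.

```swift
public enum PathComponent: Equatable {
    case this
    case up
    case item(name: String, extension: String?)
}

/// Hypothetical helper: drops `.this` and cancels `.up` against a preceding item.
func reduce(_ components: [PathComponent]) -> [PathComponent] {
    var result: [PathComponent] = []
    for component in components {
        switch component {
        case .this:
            continue                    // "." never changes the location
        case .up:
            if let last = result.last, last != .up {
                result.removeLast()     // "a/.." cancels out
            } else {
                result.append(.up)      // leading ".." cannot be resolved here
            }
        case .item:
            result.append(component)
        }
    }
    return result
}

let reduced = reduce([.item(name: "a", extension: nil),
                      .this,
                      .item(name: "b", extension: nil),
                      .up,
                      .item(name: "c", extension: "txt")])
// reduced is [a, c.txt]
```

An AbsolutePath could go one step further and discard a leading `.up`, since the parent of the root is usually the root itself; that choice is left out of the sketch.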

Another thing I’ve done is that no direct file system operations exist on AbsolutePath (like “.exists” or “.createDirectory(…)” or whatever); those are still on FileManager/FileHandle/etc in the form of extensions to handle the new types. In my app, a path is just a path, and it only has meaning based on the thing that is using it. An AbsolutePath for a URL is used differently than an AbsolutePath on a file system, although they are represented with the same “AbsolutePath” type.
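A minimal sketch of that split might look like the following. The `fileSystemPath` property and the `fileExists(at:)` overload are assumptions made for this example (the rendering assumes POSIX-style paths); `fileExists(atPath:)` is the existing FileManager method being wrapped.

```swift
import Foundation

public enum PathComponent {
    case this
    case up
    case item(name: String, extension: String?)
}

public struct AbsolutePath {
    public let components: [PathComponent]
    public init(_ components: [PathComponent]) { self.components = components }

    // Assumption: a POSIX-style rendering with "/" as root and separator.
    public var fileSystemPath: String {
        let names = components.map { component -> String in
            switch component {
            case .this: return "."
            case .up: return ".."
            case .item(let name, let ext): return ext.map { "\(name).\($0)" } ?? name
            }
        }
        return "/" + names.joined(separator: "/")
    }
}

extension FileManager {
    /// Hypothetical overload: the file system operation lives on FileManager
    /// and only accepts an absolute path.
    public func fileExists(at path: AbsolutePath) -> Bool {
        return fileExists(atPath: path.fileSystemPath)
    }
}

let root = AbsolutePath([])
print(FileManager.default.fileExists(at: root))
```

The path type itself stays a plain value with no knowledge of the file system, so the same type could be reused for the path portion of a URL.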

I’m not saying this is a perfect API of course, or even that a hypothetical stdlib-provided Path should mimic this. I’m just saying that for my use-case, this has vastly simplified how I deal with paths, because both URL and String smell really bad for what I’m doing.

Dave

On Aug 22, 2017, at 12:37 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:
So are you saying we need three distinct “URI” types for local-absolute, local-relative, and remote? That’s a lot of API surface to support.

On Tue, Aug 22, 2017 at 12:24 PM, Dave DeLong <delong@apple.com> wrote:
I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is the absolute wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.

Dave

On Aug 20, 2017, at 11:23 PM, Félix Cloutier via swift-evolution <swift-evolution@swift.org> wrote:

I'm not convinced that URLs are the appropriate abstraction for a file system path. For the record, I'm not a fan of existing Foundation methods that create objects from a URL. There is a useful and fundamental difference between a local path and a remote path, and conflating the two has been a security pain point in many languages and frameworks that allow it. Examples include remote file inclusion in PHP and malicious doctypes in XML. Windows also had its share of issues with UNC paths.

Even when loading an arbitrary URL looks innocuous, many de-anonymizing hacks work by causing a program to access a URL controlled by an attacker to make it disclose the user's IP address or some other identifier.

IMO, this justifies that there should be separate types to handle local and remote resources, so that at least developers have to be explicit about allowing remote resources. This makes a new URL type less necessary for supporting file I/O.
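One way to picture that separation is sketched below. All of the type and function names (`LocalFile`, `RemoteResource`, `Resource`, `readText(from:)`) are hypothetical, and the classification rules are deliberately simplistic assumptions; the point is only that a function taking a local type cannot be handed a remote location by accident.

```swift
import Foundation

// Hypothetical wrapper types: one for local paths, one for remote locations.
struct LocalFile { let path: String }
struct RemoteResource { let url: URL }

enum Resource {
    case local(LocalFile)
    case remote(RemoteResource)

    /// Classifies untrusted input once, at the boundary.
    /// Assumption: only absolute POSIX paths and http(s) URLs are accepted.
    init?(untrusted string: String) {
        if string.hasPrefix("/") {
            self = .local(LocalFile(path: string))
        } else if let url = URL(string: string), let scheme = url.scheme,
                  scheme == "http" || scheme == "https" {
            self = .remote(RemoteResource(url: url))
        } else {
            return nil
        }
    }
}

/// Accepts only local files; passing a RemoteResource is a compile-time error.
func readText(from file: LocalFile) throws -> String {
    return try String(contentsOfFile: file.path, encoding: .utf8)
}
```

Code that wants to allow remote resources then has to handle the `.remote` case explicitly instead of getting network access as a side effect of accepting a URL.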

Félix

On Aug 20, 2017, at 21:37, Taylor Swift via swift-evolution <swift-evolution@swift.org> wrote:

Okay so a few days ago there was a discussion (https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20170814/038923.html) about getting pure Swift file system support into Foundation or another core library, and in my opinion, doing this requires a total overhaul of the `URL` type (which is currently little more than a wrapper for NSURL), so I’ve just started a pure Swift URL library project at <https://github.com/kelvin13/url>.

The library’s parsing and validation core (~1K loc pure swift) is already in place and functional; the goal is to eventually support all of the Foundation URL functionality.

The new `URL` type is implemented as a value type with utf8 storage backed by an array buffer. The URLs are just 56 bytes long each, so they should be able to fit into cache lines. (NSURL by comparison is over 128 bytes in size; it’s only saved by the fact that the thing is passed as a reference type.)

As I said, this is still really early on and not a mature library at all but everyone is invited to observe, provide feedback, or contribute!
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


Yep, I agree. My early thoughts on this included ideas around a “Container” for a path, but I ended up at the point that it wasn’t a useful abstraction (that I could see), because a container was really just a different AbsolutePath “prefix”.

I toyed around with the idea of something like:

protocol PathContainer: Path { }

struct ContainedPath<C: PathContainer>: Path { … }

I ultimately decided against it. A “Documents” container, for example, doesn’t make sense because there can be many Documents folders, and static typing for this sort of thing tended to fly in the face of the dynamic nature of the file system. Ultimately, the simple distinction between an AbsolutePath and a RelativePath felt like the right sort of balance.

My “AbsolutePath” type does have some static properties to some well-known “container” locations. For example:

public struct AbsolutePath: Path {
    
    public static let root = AbsolutePath()
    public static let temporaryDirectory = AbsolutePath(fileSystemPath: NSTemporaryDirectory())
    public static var applicationCacheDirectory: AbsolutePath { return ... }
    public static var applicationSupportDirectory: AbsolutePath { return ... }

}

But, I agree that server needs are probably quite different from app needs.

Dave


I liked the polymorphic identifiers paper presented at Dynamic Languages Symposium 2013.

http://dl.acm.org/citation.cfm?doid=2508168.2508169

There's a lot of power in URIs that remains largely untapped in most systems.

There's not a great reason for this heterogeneity other than historical baggage.

Properly done, URIs can unify keypaths, user defaults, environment variables, files, network stores, databases, etc.
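As a very rough illustration of the general idea (one identifier syntax in front of several stores), here is a minimal sketch. It is not taken from the paper; the `resolve(_:)` function and the `env:` and `file:` schemes are assumptions chosen for this example, and only environment variables and files are covered.

```swift
import Foundation

/// Hypothetical resolver: the scheme in front of the first ":" selects the store,
/// and the remainder is interpreted by that store.
func resolve(_ identifier: String) -> String? {
    guard let colon = identifier.firstIndex(of: ":") else { return nil }
    let scheme = String(identifier[..<colon])
    let rest = String(identifier[identifier.index(after: colon)...])
    switch scheme {
    case "env":
        // Look the name up in the process environment.
        return ProcessInfo.processInfo.environment[rest]
    case "file":
        // Read the file at the given path as UTF-8 text.
        return try? String(contentsOfFile: rest, encoding: .utf8)
    default:
        return nil
    }
}

let home = resolve("env:HOME")
let hosts = resolve("file:/etc/hosts")
```

Callers see a single lookup function, and adding another store (user defaults, a database) would mean adding another scheme rather than another API.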


On Aug 24, 2017, at 23:07, Eagle Offshore via swift-evolution <swift-evolution@swift.org> wrote:

I liked the polymorphic identifiers paper presented at Dynamic Languages Symposium 2013.

http://dl.acm.org/citation.cfm?doid=2508168.2508169

There's a lot of power in URIs that remains largely untapped in most systems.

There's not a great reason for this heterogeneity other than historical baggage.

Properly done, URIs can unify keypaths, user defaults, environment variables, files, network stores, databases, etc.

Would you mind summarizing the idea, since the paper is not freely available?

-Thorsten


Looks like the author posted a free copy of the paper here:

Thanks,
Jon

···

On Sep 20, 2017, at 10:03 PM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Am 24.08.2017 um 23:07 schrieb Eagle Offshore via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>>:

I liked the polymorphic identifiers paper presented at Dynamic Languages Symposium 2013.

http://dl.acm.org/citation.cfm?doid=2508168.2508169 <http://dl.acm.org/citation.cfm?doid=2508168.2508169&gt;
There's a lot of power in URI's that remains largely untapped in most systems.

There's not a great reason for this heterogeneity other than historical baggage.

Properly done, URI's can unify keypaths, user defaults, environment variables, files, network stores, databases, etc.

Would you mind summarizing the idea as the paper is not freely available?

-Thorsten

On Aug 22, 2017, at 12:52 PM, Félix Cloutier via swift-evolution <swift-evolution@swift.org> wrote:

Just putting it out there that the iOS/macOS app experience with paths is likely to be very different from the server (or system program) experience. Most of your paths are relative to a container because the effective root of your data is the container. I would tend to believe that a lot of your paths, while not fully static, could be expressed with something like "${DOCUMENTS}/static/part", where ${DOCUMENTS} is the only thing that effectively changes. A system-level program just happens to use / as its prefix.

Most platforms already have some notion of file domains, which are really just path prefixes (the user's home folder, the file system root, the network root, etc). I think that some level of support for these would be desirable (even if it were only a set of functions that return different path prefixes). It's also my humble opinion that tilde expansion should be part of a larger shell expansion feature to avoid surprises.
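
As a rough illustration of the "set of functions that return different path prefixes" idea, here is a minimal Swift sketch. FileDomain, prefix, and resolve(_:in:) are hypothetical names made up for this example; only the Foundation calls (NSHomeDirectory, NSTemporaryDirectory, FileManager.urls(for:in:)) are existing API.

```swift
import Foundation

// Hypothetical type: a handful of "file domains", each of which is just a path prefix.
enum FileDomain {
    case root
    case home
    case documents
    case temporary

    /// The path prefix for this domain, or nil if the platform cannot provide one.
    var prefix: String? {
        switch self {
        case .root:
            return "/"
        case .home:
            return NSHomeDirectory()
        case .documents:
            return FileManager.default
                .urls(for: .documentDirectory, in: .userDomainMask)
                .first?.path
        case .temporary:
            return NSTemporaryDirectory()
        }
    }
}

/// Joins a domain prefix and a static relative part, as in "${DOCUMENTS}/static/part".
/// Only the prefix changes between an app container and a system-level program.
func resolve(_ relative: String, in domain: FileDomain) -> String? {
    guard let prefix = domain.prefix else { return nil }
    // Drop a trailing slash so that "/" and "/tmp/" do not produce "//".
    let base = prefix.hasSuffix("/") ? String(prefix.dropLast()) : prefix
    return base + "/" + relative
}
```

With this shape, resolve("static/part", in: .documents) and resolve("static/part", in: .root) differ only in the prefix, which is the point being made above. Tilde expansion is deliberately left out.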

I think that I support the conclusion.

Félix

On Aug 22, 2017, at 12:02 PM, Dave DeLong <delong@apple.com> wrote:

I suppose, if you squint at it weirdly.

My current Path API is a “Path” protocol, with “AbsolutePath” and “RelativePath” struct versions. The protocol defines a path to be an array of path components. The only real difference between an AbsolutePath and a RelativePath is that all file system operations would only take an AbsolutePath. A URL would also only provide an AbsolutePath as its “path” bit.

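public enum PathComponent {
    case this // "."
    case up // ".."
    case item(name: String, extension: String?)
}

public protocol Path {
    var components: Array<PathComponent> { get }
    init(_ components: Array<PathComponent>) // used on protocol extensions that mutate paths, such as appending components
}

public struct AbsolutePath: Path {
    public let components: Array<PathComponent> // minimal stored-property conformance (sketch)
    public init(_ components: Array<PathComponent>) { self.components = components }
}
public struct RelativePath: Path {
    public let components: Array<PathComponent> // minimal stored-property conformance (sketch)
    public init(_ components: Array<PathComponent>) { self.components = components }
}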

By separating out the concept of an Absolute and a Relative path, I can put additional functionality on each one to make semantic sense (you cannot concatenate two absolute paths, but you can concatenate any path with a relative path, for example). Likewise, all file system operations must take an AbsolutePath.
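
The concatenation rule can be pushed into the type system. The following is a minimal sketch, not the actual API described in this message: PathComponent is simplified to carry a single name string, and appending(_:) is a hypothetical method name.

```swift
// Simplified restatement of the types from this message; all bodies are assumptions.
enum PathComponent: Equatable {
    case this
    case up
    case item(String)
}

protocol Path {
    var components: [PathComponent] { get }
    init(_ components: [PathComponent])
}

struct AbsolutePath: Path {
    let components: [PathComponent]
    init(_ components: [PathComponent]) { self.components = components }
}

struct RelativePath: Path {
    let components: [PathComponent]
    init(_ components: [PathComponent]) { self.components = components }
}

extension Path {
    // Only a RelativePath can be appended, and the result keeps the receiver's kind:
    // absolute + relative is absolute, relative + relative is relative.
    func appending(_ other: RelativePath) -> Self {
        return Self(components + other.components)
    }
}

let documents = AbsolutePath([.item("Users"), .item("me"), .item("Documents")])
let part = RelativePath([.item("static"), .item("part")])

let full = documents.appending(part)   // an AbsolutePath
let nested = part.appending(part)      // a RelativePath
// documents.appending(documents) would not compile: nothing accepts an AbsolutePath argument.
```

The design choice here is that the illegal case (absolute + absolute) is rejected at compile time rather than checked at runtime.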

One of the key things I realized is that a “Path” type should not be ExpressibleByStringLiteral, because you cannot statically determine if a Path should be absolute or relative. However, one of the initializers for an AbsolutePath would handle things like expanding a tilde, and both types try to reduce a set of components as much as possible (by filtering out “.this” components, and handling “.up” components where possible, etc). Also in my experience, it’s fairly rare to want to deal with a known-at-compile-time, hard-coded path. Usually you’re dealing with paths relative to known “containers” that are determined at runtime (current user’s home folder, app’s sandboxed documents directory, etc).
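
To make the "reduce a set of components" step concrete, here is one possible sketch. It is purely lexical (it ignores symlinks, which can make ".." mean something different on a real file system), PathComponent is simplified to a single name string, and the reduce(_:isAbsolute:) function with its isAbsolute flag is a hypothetical stand-in for the AbsolutePath/RelativePath distinction.

```swift
// Simplified component type; the real one in this message also carries an extension.
enum PathComponent: Equatable {
    case this
    case up
    case item(String)
}

/// Lexically reduces a list of components as far as possible.
func reduce(_ components: [PathComponent], isAbsolute: Bool) -> [PathComponent] {
    var result: [PathComponent] = []
    for component in components {
        switch component {
        case .this:
            continue                    // "." never changes the location
        case .up:
            if let last = result.last, last != .up {
                result.removeLast()     // ".." cancels the preceding named item
            } else if !isAbsolute {
                result.append(.up)      // a leading ".." must survive in a relative path
            }
            // In an absolute path, ".." at the root is dropped (it stays at the root).
        case .item:
            result.append(component)
        }
    }
    return result
}
```

For example, reducing "a/./b/.." gives "a" for either kind of path, while "../a/../.." stays as "../.." when relative, because there is nothing left to cancel the remaining ".." components against.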

Another thing I’ve done is that no direct file system operations exist on AbsolutePath (like “.exists” or “.createDirectory(…)” or whatever); those are still on FileManager/FileHandle/etc in the form of extensions to handle the new types. In my app, a path is just a path, and it only has meaning based on the thing that is using it. An AbsolutePath for a URL is used differently than an AbsolutePath on a file system, although they are represented with the same “AbsolutePath” type.
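
A sketch of what keeping file system operations on FileManager could look like. The AbsolutePath here is reduced to a list of plain names, and fileSystemString, exists(at:), and the createDirectory(at:) overload are hypothetical names; fileExists(atPath:) and createDirectory(atPath:withIntermediateDirectories:attributes:) are existing FileManager API.

```swift
import Foundation

// Simplified stand-in for the AbsolutePath type: the path itself knows nothing
// about the file system.
struct AbsolutePath {
    let components: [String]

    // Hypothetical helper: renders the components as a POSIX-style string.
    var fileSystemString: String {
        return "/" + components.joined(separator: "/")
    }
}

// The file system meaning lives in extensions on FileManager, not on the path.
extension FileManager {
    func exists(at path: AbsolutePath) -> Bool {
        return fileExists(atPath: path.fileSystemString)
    }

    func createDirectory(at path: AbsolutePath) throws {
        try createDirectory(atPath: path.fileSystemString,
                            withIntermediateDirectories: true,
                            attributes: nil)
    }
}
```

The same AbsolutePath value could be handed to a URL type instead, which would interpret it differently; only the consumer gives the path its meaning.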

I’m not saying this is a perfect API of course, or even that a hypothetical stdlib-provided Path should mimic this. I’m just saying that for my use-case, this has vastly simplified how I deal with paths, because both URL and String smell really bad for what I’m doing.

Dave

On Aug 22, 2017, at 12:37 PM, Taylor Swift <kelvin13ma@gmail.com> wrote:
So are you saying we need three distinct “URI” types for local-absolute, local-relative, and remote? That’s a lot of API surface to support.

On Tue, Aug 22, 2017 at 12:24 PM, Dave DeLong <delong@apple.com> wrote:
I completely agree. URL packs a lot of punch, but IMO it’s the wrong abstraction for file system paths.

I maintain an app that deals a lot with file system paths, and using URL has always felt cumbersome, but String is the absolute wrong type to use. Lately as I’ve been working on it, I’ve been experimenting with a concrete “Path” type, similar to PathKit (https://github.com/kylef/PathKit/). Working in terms of AbsolutePath and RelativePath (what I’ve been calling things) has been extremely refreshing, because it allows me to better articulate the kind of data I’m dealing with. URL doesn’t handle pure-relative paths very well, and it’s always a bit of a mystery how resilient I need to be about checking .isFileURL or whatever. All the extra properties (port, user, password, host) feel hugely unnecessary as well.
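
For context, this is roughly the kind of defensive check that a URL-based API pushes onto every caller. It is only a sketch: LoadError and loadLocalData(from:) are hypothetical names, while isFileURL and Data(contentsOf:) are existing Foundation API.

```swift
import Foundation

// Hypothetical error type for this example.
enum LoadError: Error {
    case notALocalFile(URL)
}

/// Loads data only from local file URLs. Anything else (http, ftp, ...) is rejected
/// up front instead of silently triggering a network request.
func loadLocalData(from url: URL) throws -> Data {
    guard url.isFileURL else {
        throw LoadError.notALocalFile(url)
    }
    return try Data(contentsOf: url)
}
```

With a dedicated path type, the guard would be unnecessary because a remote location could not be expressed in the first place.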

Dave

On Aug 20, 2017, at 11:23 PM, Félix Cloutier via swift-evolution <swift-evolution@swift.org> wrote:

I'm not convinced that URLs are the appropriate abstraction for a file system path. For the record, I'm not a fan of existing Foundation methods that create objects from a URL. There is a useful and fundamental difference between a local path and a remote path, and conflating the two has been a security pain point in many languages and frameworks that allow it. Examples include remote file inclusion in PHP and malicious doctypes in XML. Windows also had its share of issues with UNC paths.

Even when loading an arbitrary URL looks innocuous, many de-anonymizing hacks work by causing a program to access a URL controlled by an attacker to make it disclose the user's IP address or some other ide

Thanks!

-Thorsten

On Sep 21, 2017, at 10:07 AM, Jonathan Hull <jhull@gbis.com> wrote:

Looks like the author posted a free copy of the paper here:
http://www.hirschfeld.org/writings/media/WeiherHirschfeld_2013_PolymorphicIdentifiersUniformResourceAccessInObjectiveSmalltalk_AcmDL.pdf

Thanks,
Jon
