I think that if you consistently separate Root from Component everywhere, then consistent use of components in the FilePath APIs would not cause ambiguity, and you wouldn't have to preface every use of the word "component" with "relative." Alternatively, it would be reasonable to just call it "relative component" everywhere.
I had a long detailed thing about Path APIs and my experience with working on them. Then I realized this is about the System package. Read more here if you're interested.
I am very excited to see this API get proposed, as the lack of a good path API has been one of Swift's most obvious sore points. String isn't powerful enough, and URL is just completely the wrong thing to use.
In my own code, I've developed my own path type, and based on that experience I have a couple of points of feedback on this proposal as it stands:
- I think the name FilePath is the wrong name. Paths are useful beyond their application to a file. A URL, for example, has a .path, but that isn't necessarily a file path. Paths can be used to refer to traversals through named nodes, or as a resource identifier. I think it would be a mistake to see this API introduced as FilePath, because it implies unsuitability for all other applications of paths, leaving developers to wonder if 1) they're using the wrong thing or 2) they should be using something inferior (String/URL).
- Because of the applicability of paths to stuff beyond the filesystem, I don't think a proper Path type should have any built-in notions of how the path applies to the file system. Just like a Date has no intrinsic knowledge of calendars or timezones, a Path shouldn't have any intrinsic knowledge of volumes, drives, roots, file system separators, and so on. Instead, a Path should be interpreted by another object (e.g., FileManager), at which point those considerations can be applied. This eliminates the need for over-burdening the public API with use-case specific information.
- One thing I haven't seen (or may have missed) is using a distinct type for relative paths. In my own code, I have a Path protocol, with two concrete adopters: AbsolutePath and RelativePath. Having distinct types makes it very easy to know what I need to be providing to an API. Appending always takes a RelativePath. Looking things up on the filesystem always takes an AbsolutePath. Pushing a current directory can take either. Having a RelativePath type makes working with consistent structures much easier. For example, if I'm working with lots of content bundles in an application, I can construct some standard relative paths (e.g. ./Contents/Resources, ./Contents/Frameworks, ./Info.plist) and then quickly apply them to all of my available bundles.
With this approach, some things become clearer:
- The need for any sort of Root component is eliminated, because rootedness is only applicable to an AbsolutePath.
- PathComponents becomes a pretty straightforward enum with three cases: .this (.), .up (..), and .item(String). A Path is little more than an array of PathComponents.
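A minimal sketch of the design described above, for illustration only. The names PathComponent, Path, AbsolutePath, RelativePath, and appending(_:) are hypothetical stand-ins for the poster's own types, which are not shown in this thread; nothing here is part of System or the standard library.

```swift
// Hypothetical types modelling the description above; not an existing API.
enum PathComponent: Equatable {
    case this            // "."
    case up              // ".."
    case item(String)    // any other named component
}

protocol Path {
    var components: [PathComponent] { get }
}

struct RelativePath: Path, Equatable {
    var components: [PathComponent]
}

struct AbsolutePath: Path, Equatable {
    var components: [PathComponent]

    // Appending only accepts a RelativePath, so joining two absolute
    // paths by accident is rejected at compile time.
    func appending(_ relative: RelativePath) -> AbsolutePath {
        AbsolutePath(components: components + relative.components)
    }
}

// A reusable relative path applied to one (made-up) bundle location.
let resources = RelativePath(components: [.this, .item("Contents"), .item("Resources")])
let bundle = AbsolutePath(components: [.item("Applications"), .item("Example.app")])
let bundleResources = bundle.appending(resources)
```

Because rootedness lives in the type rather than in a component, no Root case is needed in the enum.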
I'm intrigued that this API exists, as I've not really had occasion to use the System package. My big piece of feedback is...
What would it take to get this considered for the Standard Library as an actual Path API, and not as part of the System package? Paths are useful for much, much more than measly file system operations.
What's the distinction that you perceive between these two things?
(This isn't to say that there's no difference, but rather I'm interested in which differences are relevant to you.)
Visibility: The System package (so far) is not an "on by default" package and must be manually imported. This makes it far less likely to actually be used and gives the impression that it's not universally useful. This is in contrast to URL, which lives in the standard library and is therefore implied to be universally useful. Since URL (the current unsatisfactory Path type) is in the standard library, why not put a more appropriate and broadly applicable Path type there as well?
Applicability: The package name "system" implies that it's referring to the operating system. Paths are useful beyond their usage by an operating system's filesystem.
Layering: We have path-like types in a lower layer, so why would this API live at a different layer?
A point of clarification: URL does not live in the standard library; it's in Foundation, which, like System, must be explicitly imported. This feeds into your layering point as well; System is a lower-level module¹ than Foundation, and is the lowest-level module providing a path-like thing as far as I know.
¹ I don't think that there's an explicit dependency of Foundation on System yet, but Foundation absolutely sits above the C-language API that System binds for Swift, and I expect that this dependency will exist in the fullness of time.
One way to do this would be to have the choice of path styles be based on the root, which is really a statement about string conversion:

    enum Root {
        case posix                               // "/"
        case dos(String)                         // "C:\", limiting to Character possible too
        case unc(server: String, share: String)  // "\\server\share\"
    }
    // in practice they won't actually be Strings, since we'd want to have only one underlying buffer
One downside of the Root+Components model that I'm only thinking of now is that it doesn't allow for DOS "C:relative\path" paths unless you consider "C:" a root, which would result in a world where "has a root" ≠ "is absolute". Fortunately these paths are pretty much useless in a modern world, so I think "not supported by System.FilePath" is a valid answer, but that would at least have to be documented.
The other place where this falls down is for parsing, where "foo\bar/baz" is two components in a Unix path but three components in a Windows path. I'm not sure whether the default behavior should be "the current OS" (what you want up until it isn't), "always POSIX" (more restricted and so more likely to be caught in testing), or "always Windows" (more generous behavior but could cause problems when someone finally puts a backslash in a path), but you should always be able to override it with an initializer parameter.
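To make the parsing difference concrete, here is a rough sketch with the separator style passed explicitly. SeparatorStyle and splitComponents(_:style:) are hypothetical names invented for this example and are not part of the proposal.

```swift
// Hypothetical helper: the caller chooses the separator style explicitly
// instead of inheriting it from the current OS.
enum SeparatorStyle {
    case posix    // only "/" separates components
    case windows  // both "/" and "\" separate components
}

func splitComponents(_ path: String, style: SeparatorStyle) -> [String] {
    let isSeparator: (Character) -> Bool
    switch style {
    case .posix:
        isSeparator = { $0 == "/" }
    case .windows:
        isSeparator = { $0 == "/" || $0 == "\\" }
    }
    // split(whereSeparator:) omits empty subsequences by default.
    return path.split(whereSeparator: isSeparator).map(String.init)
}

let posixParts = splitComponents("foo\\bar/baz", style: .posix)      // ["foo\\bar", "baz"]
let windowsParts = splitComponents("foo\\bar/baz", style: .windows)  // ["foo", "bar", "baz"]
```

The same string yields two components under POSIX rules and three under Windows rules, which is why an overridable parameter seems preferable to any implicit, process-wide setting.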
I'd absolutely shy away from thread-local or task-local storage here; aside from slowing down otherwise CPU-and-memory-bound operations, I don't think it's a good idea to introduce a situation where different parts of a process are implicitly dealing with paths differently.
[EDIT: I went through this whole exercise without thinking about relative paths. Back to the drawing board… though I think the idea of having an underlying Form enum containing all the roots plus relativePOSIX and relativeDOS isn't the worst.]
The main other use I can think of is as part of a URL. What else are you thinking of?
- As far as I can tell, this FilePath doesn't expand ~ automatically, is that correct?
  This is how NSURL and Ruby's File and Pathname work, but personally, I think it's very non-intuitive and bad UX. I've run into several pieces of software that misbehave because the developer forgot to call expandingTildeInPath. I don't think that a currency path type should distinguish between expanded and intact tildes; it's just a trap.
- I'm not a fan of starts(with:) and ends(with:). My guess is that you're trying to distinguish these from the naming conventions established by Swift.String, such as hasPrefix. However, I think we should use domain language instead, like:
    "/usr".isParent(of: "/usr/bin")
    "/usr/bin".isChild(of: "/usr")
- How does stem work with multiple extensions, like foo.tar.gz? Is the result foo.tar, or just foo?
What about recognizing or normalizing separators? \ is part of a component on Unix but a separator on Windows.
This proposal does, along with C++17 and C#.
Yes, that is the world we live in. Rust does use the term "root" to only mean the \ separator and "prefix" to refer to anything before it. But, even there, \foo\bar is not absolute.
My knee-jerk reaction is to support these, but I don't have a good intuition of Windows best practices. @compnerd?
However, we don't support legacy DOS devices at the library level (though they would pass through to a syscall). The latter combined with separator normalization does mean that you can't use a trailing separator to distinguish between a legacy device and a folder/file named after a legacy device, but that's probably for the better. This is also consistent with C#, FWIW, though they don't strip a trailing separator.
Or, in the future we stick a bit on FilePath and introduce UnixPath and WindowsPath, likely conforming to some common protocol. Or perhaps even make one a typealias depending on platform; I haven't thought that through, though.
No, System is a library for systems level programming. Shell expansion happens, well, in the shell prior to a syscall, so we wouldn't want to automatically expand these. We will be adding support for accessing the environment in the future which would include variable and tilde expansion. But that should remain an explicit operation, as its behavior is dependent on the environment.
Nope, they're from BidirectionalCollection. No strings involved.
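A small illustration of why component-wise comparison differs from string prefixing. Plain arrays of strings stand in for a path's components here (an assumption made for the example; the proposal's own component type is not used), and starts(with:) is the standard library's Sequence method.

```swift
// Arrays of strings stand in for path components in this sketch.
let usrBin = ["usr", "bin"]
let usrBin2Tool = ["usr", "bin2", "tool"]
let usrBinTool = ["usr", "bin", "tool"]

// String prefixing matches partway through a component name...
let stringPrefix = "/usr/bin2/tool".hasPrefix("/usr/bin")   // true
// ...whereas element-wise comparison only matches whole components.
let componentPrefix = usrBin2Tool.starts(with: usrBin)      // false
let realChild = usrBinTool.starts(with: usrBin)             // true
```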
foo.tar. Some API for traversing the "components" of a component is future work (which I'll call out explicitly in the next version of the proposal). I'll also update the comment to show this example; thanks for pointing that out.
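As a rough model of that answer, the sketch below splits a component at its last dot. stemAndExtension(of:) is a hypothetical helper written for this example, not the proposal's implementation, and its handling of edge cases such as leading or trailing dots is an assumption.

```swift
// Hypothetical model: only the final extension is split off.
func stemAndExtension(of component: String) -> (stem: String, ext: String?) {
    // A missing dot, or a dot in first position (e.g. ".profile"),
    // is treated here as "no extension".
    guard let dot = component.lastIndex(of: "."), dot != component.startIndex else {
        return (component, nil)
    }
    let stem = String(component[..<dot])
    let ext = String(component[component.index(after: dot)...])
    return (stem, ext)
}

let archive = stemAndExtension(of: "foo.tar.gz")   // stem "foo.tar", ext "gz"
let plain = stemAndExtension(of: "README")         // stem "README", ext nil
```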
But that should remain an explicit operation, as its behavior is dependent on the environment.
Why is that? It seems needlessly error-prone to me. What use case is there for distinguishing ~/Desktop from /Users/User/Desktop? A failure to expand the first into the second is just a programming error (IMO), so why would we want to allow it to go unfixed?
Yep, makes sense, I figured you were trying to distance yourself from the terminology used by String. But I think you missed this part:
The result of this expansion completely depends on which user executes the expansion. ~/Desktop will expand to /Users/User1/Desktop when executed as User1 (I'm not sure if the expansion would actually read from the USER environment variable, but that would make sense), and to /Users/User2/Desktop when executed as User2. A call to setuid could change the current user between multiple expansion calls, which would lead to different results. This may not be desirable in a lot of cases. I personally would prefer a library implementing this to be explicit about it, and definitely would not expect such a low-level library as Swift System to implement this at all.
You can have a file or folder named ~. Auto-expansion would yield surprising behavior in programmatic code. I do agree that a UI/CLI would normally want to expand the tilde prior to making any syscalls.
Tilde expansion is dependent on environment variables, which can be modified during program execution. We want programmatic consistency during purely syntactic operations, such as creating a path.
Even worse, reading the value of an environment variable can be a data race. We don't want to make FilePath.init's behavior dependent on what other threads in the process are doing.
@lukasa, how would server-side programming view automatic and implicit expansion of tilde?
Clients of System are encouraged to wrap FilePath with a type ensuring/asserting on their notion of "canonical". This is highly use-case dependent, as mentioned in the proposal.
➜ test mkdir -p \~/Desktop
➜ test cd \~/Desktop
➜ ~ pwd
/Users/avi/test/~/Desktop
Yes, in the shell one must escape the ~, but in a Swift script or program, I would not expect to have to do so. Shell expansion should not be part of a low-level API.
As you said upthread, the answer is contextual and depends very much where you use it, but in general it would be considered surprising.
In general, shell expansion on paths is not something we'd expect to see in the lowest-layer path API. It's important to have an API that says "treat this the way a shell would", but the more magic you allow in your path API, the more security risk you open yourself up to. In this context ~ is effectively a magic spelling of ${HOME}, and I don't think anyone would propose that arbitrary environment variable injection into paths is a good idea either.
So I agree with the rest of the thread: expanding ~ at this API level is an anti-feature.
I wish I had the time and energy to review the proposal and remember what is good about Path.swift.
Certainly, as usual with my open source, I designed it for me; however, there are some nice things:
- Always absolute paths (avoids a whole category of potential (potentially devastating) bugs where the developer doesn't realize a path they have is relative but tries to use it anyway)
- Chainable syntax (typically you have to do a series of operations, but this is perhaps less relevant if the API does not provide copy, move, delete, etc.). Due to how Swift works, this also leads to a single try, which is pleasant.
- Separate functions for copying (moving, etc.) files into directories versus to files (again, prevents common bugs where a directory exists that you didn't expect; shell scripts suffer here all the time, and NSFileManager does the same as the shell)
- Always normalizing paths (I see this is done)
- Codable implementation can take relative paths (avoids bugs where paths change, eg. app launches after reinstall, or username change, edge case bugs, but I aim for APIs that are as robust as possible)
There are some other nice features that an official API shouldn't have (e.g. operator /, @dynamicMember paths), so I didn't mention them.
In general I was quite thoughtful about every aspect; however, I wrote this years ago, so I cannot remember all the little details.
I will try to watch this proposal and be useful, thanks.
Another recommendation: don't aim to cover 100% of what people will ask for; doing the basics will suffice. The community will then fill in the gaps with extensions, and some of those will be great and can then become future proposals.
Do you have some concrete examples of this? Most of what I've come up with would be better handled by a lexicallyStarts(with:). There is a point where we don't want to duplicate every API with a lexicallyFoo variant, but if lexicallyContains is the right tool and covers the important cases, we can definitely add it.
My biggest objection to that method is simply its name. Its name does not, IMO, clearly communicate what it does. This minimises the odds that it will occur to users to call that function when we want them to call it.
In this instance I think I'm proposing a special-case, where an alternative name would be preferred solely to make it more discoverable for users.
Version 2 of this proposal can be found here. This thread was tremendously helpful and I want to thank everyone who took the time to review and comment on this API.
Changes:
- Spin off FilePath.Root from Component
  - Provides a much clearer separation of API
  - Allows for many corner cases to be handled by the type system
  - FilePath.Root can in the future be a namespace for Windows root analysis
- FilePath.ComponentView is now also a RangeReplaceableCollection
  - Standard Swift algorithms operate over homogeneous components of a path
- FilePath.Component has a Kind enum, illustrating mutual exclusivity
- relativePath was renamed removingRoot()
  - Consistent with other "removing" APIs
  - More precise on Windows (where a rooted path can be relative)
- Rename basename/dirname to lastComponent and removingLastComponent()
  - Include @available(unavailable, renamed:) entries for discoverability
  - removeLast is now removeLastComponent (it doesn't remove a root)
- append overhaul:
  - append now only takes Components, so there are no roots involved
  - append overload taking a String for common stringy treatment of paths
    - Ignores a leading separator if needed
    - Will be preferred for string literal arguments
  - push is introduced for the common cd-like semantics
    - aka join in Python, push in Rust, Combine in C#, append in C++17
- -ing variants of everything introduced for expression chaining
  - __consuming and __owned annotations added to make these efficient
- Add lexicallyResolving(), a secure-ish append over untrusted subpaths
- Add CTypes empty enum to serve as a namespace for C typealiases
  - PlatformChar and PlatformUnicodeEncoding are nested inside CTypes
  - Allows us to add more C types without polluting global namespace
- Added deferred/future-work section about working with paths from another platform
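The difference between the stringy append and the cd-like push can be modelled roughly as below. This is a string-based, POSIX-only illustration written for this summary: appending(_:to:) and pushing(_:onto:) are hypothetical free functions, not the FilePath methods themselves, and they skip roots, normalization, and Windows entirely.

```swift
// Hypothetical model of "append": a leading separator on the argument
// is ignored, so the result always stays beneath `base`.
func appending(_ other: String, to base: String) -> String {
    let trimmed = String(other.drop(while: { $0 == "/" }))
    if trimmed.isEmpty { return base }
    return base.hasSuffix("/") ? base + trimmed : base + "/" + trimmed
}

// Hypothetical model of "push": like cd, an absolute argument replaces
// the base, while a relative argument is appended to it.
func pushing(_ other: String, onto base: String) -> String {
    other.hasPrefix("/") ? other : appending(other, to: base)
}

let appended = appending("/bin", to: "/usr")         // "/usr/bin"
let pushedAbsolute = pushing("/bin", onto: "/usr")   // "/bin"
let pushedRelative = pushing("bin", onto: "/usr")    // "/usr/bin"
```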