var resultStorage: ObjCBool = false // true or false doesn't matter here, but
// it has to be set before taking its
// address, which we have to do below.
guard FileManager.default.fileExists(atPath: url.path, isDirectory: &resultStorage)
else { fatalError("Handle case where url is not an existing file.") }
let isDir = resultStorage.boolValue // Finally!
Is this the recommended way to do something as elementary as checking if a given URL is a directory?
Can you provide an example of the files that are incorrectly judged to be folders? Are they packages or something else? Other than that, I'd try two alternatives: FileManager.fileExists(atPath:isDirectory:), which is mentioned above, and FileManager.attributesOfItem(atPath:), which is not. Note that the former method follows symlinks while the other two methods don't.
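For anyone who wants to try the attributesOfItem(atPath:) route, here is a minimal sketch. The helper name isDirectoryViaAttributes is made up for illustration and is not part of Foundation. It assumes you want to classify the item itself rather than whatever a symlink in the last path component points at.

```swift
import Foundation

/// Hypothetical helper: reports whether the item at `url` is itself a directory.
/// attributesOfItem(atPath:) does not traverse a symlink in the final path
/// component, so a symlink to a directory is reported as a symlink here.
func isDirectoryViaAttributes(_ url: URL) throws -> Bool {
    let attributes = try FileManager.default.attributesOfItem(atPath: url.path)
    return (attributes[.type] as? FileAttributeType) == .typeDirectory
}
```

Called as `try isDirectoryViaAttributes(someURL)`, this throws if the item does not exist, so the "not an existing file" case surfaces as an error rather than as a Bool.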
I'm doing a little research on exactly this question and though the thread mentions all the possible ways, I was hoping to find some clarity on the different methods.
Traditionally, I've used FileManager's fileExists(atPath:isDirectory:) because it seemed the oldest and hopefully most reliable (always fun to see an ObjCBool). In my internet travels I have found three ways, all of which are discussed here in brief:
I'm vaguely aware that resourceValues(forKeys:) caches values.
Are attributesOfItem(atPath:) and resourceValues(forKeys:) using the same API under the hood? Are both functions using cached values?
Is there any benefit to sticking with the older API over a newer (possibly cached) alternative? Personally, cached values (aka shared state) in a Sendable righteous world makes me hesitate.
I came here looking for answers and yes they're all here, but presented without comment.
Maybe this is a solved problem and any solution is foolproof enough? Anyone have any recommendations based on usage, speed, compatibility, gotchas/pitfalls, or known issues?
How do you want to treat things that are neither files nor directories?
What layer of the stack are you working at? Folks working at the BSD layer have different options from folks working with Foundation.
Do you want just this "is a directory" value? Or are you getting this value along with a bunch of other attributes?
Do you want this value for a single URL? Or are you iterating through the file system hierarchy and want the values for every URL you get back?
The last point is the most critical. If you're working with a single URL then performance generally doesn't matter, so you can use whatever API you like. However, if you're working with a large bunch of URLs then choosing the right API can yield a massive performance win (or lose, depending on your perspective :-).
Specifically, when operating on high-latency file systems, like a network file system, it's absolutely critical that you get all the attributes you need while you're enumerating the directory. If you enumerate the directory and then get attributes for each item, the latencies add up and performance will tank [1].
That means using getattrlistbulk. Except that you don't want to use getattrlistbulk because it's really hard to call, so you'll wanna use some sort of wrapper. The obvious wrapper is FileManager, where the contentsOfDirectory(at:includingPropertiesForKeys:options:) returns the contents of the directory as an array of URLs, with each URL holding the resource values you requested in its cache.
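A minimal sketch of that pattern, assuming all you need is the directory flag. The helper name subdirectories(of:) is hypothetical. The keys passed to includingPropertiesForKeys are the same ones read back from each URL afterwards.

```swift
import Foundation

/// Hypothetical helper: lists the immediate subdirectories of `directory`.
func subdirectories(of directory: URL) throws -> [URL] {
    let keys: [URLResourceKey] = [.isDirectoryKey]
    // Ask for the attributes as part of the directory listing itself.
    let contents = try FileManager.default.contentsOfDirectory(
        at: directory,
        includingPropertiesForKeys: keys,
        options: []
    )
    // Read the requested value back from each returned URL.
    return try contents.filter { url in
        try url.resourceValues(forKeys: Set(keys)).isDirectory == true
    }
}
```

If you need more attributes (size, modification date and so on), add their keys to the same array so that they are requested together.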
And that's why I generally favour the resource value APIs. They were designed to get you on to this performance happy path.
If there are specific places where you're worried about a resource value being cached inappropriately, deal with that using removeCachedResourceValue(forKey:) or removeAllCachedResourceValues(). But IME this is rarely needed.
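As a rough illustration of the cache-clearing calls, here is a sketch with a hypothetical helper, freshIsDirectory. On Swift's URL both removal methods are mutating, so the sketch works on a local copy.

```swift
import Foundation

/// Hypothetical helper: reads isDirectory after discarding whatever
/// resource values the URL may have cached.
func freshIsDirectory(_ url: URL) throws -> Bool {
    var copy = url                        // the removal methods are mutating
    copy.removeAllCachedResourceValues()  // or removeCachedResourceValue(forKey: .isDirectoryKey)
    return try copy.resourceValues(forKeys: [.isDirectoryKey]).isDirectory ?? false
}
```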
Share and Enjoy
Quinn "The Eskimo!" @ DTS @ Apple
[1] And I mean really tank. Imagine a directory with 50 items in it on a file server with a 100 ms latency. If you enumerate the directory properly, you can expect all your results in 100 ms. If you do it badly, that'll take 5 seconds. Ouch!
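The arithmetic behind that footnote, as a back-of-the-envelope sketch. It assumes every request costs exactly one round trip and ignores the cost of the initial listing in the per-item case. The variable names are made up for illustration.

```swift
let itemCount = 50
let roundTripMilliseconds = 100

// Bulk: one request returns the names and the attributes together.
let bulkMilliseconds = 1 * roundTripMilliseconds

// Per item: one attribute request per entry, each paying the full latency.
let perItemMilliseconds = itemCount * roundTripMilliseconds

print("bulk: \(bulkMilliseconds) ms, per item: \(perItemMilliseconds) ms")
```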
I'm vaguely aware that resourceValues(forKeys:) caches values
Beware, the cached value is not revalidated until the next run loop invocation!
Create a file
call resource values on it - good
delete the file / make other changes
call resource values on it - it's still there / unchanged
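Those steps can be tried with something like the sketch below. The file name is hypothetical, and no expected output is given: what the second lookup reports may depend on the platform, the OS version and whether a run loop turn happens in between, so treat this as a way to observe the behaviour rather than a statement of what it is.

```swift
import Foundation

let fileURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("cache-demo-\(UUID().uuidString).txt") // hypothetical name

// 1. Create a file.
let created = FileManager.default.createFile(atPath: fileURL.path, contents: Data("hello".utf8))

// 2. Ask for a resource value; this may populate the URL's cache.
let before = try? fileURL.resourceValues(forKeys: [.fileSizeKey]).fileSize
print("before delete:", before as Any)

// 3. Delete the file.
try? FileManager.default.removeItem(at: fileURL)

// 4. Ask again through the same URL value and compare with step 2.
let after = try? fileURL.resourceValues(forKeys: [.fileSizeKey]).fileSize
print("after delete:", after as Any)
```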
If there are specific places where you're worried about a resource value being cached inappropriately, deal with that using removeCachedResourceValue(forKey:) or removeAllCachedResourceValues().
To add: I believe the resource-aware variant of contentsOfDirectory that is recursive is FileManager.enumerator(at: includingPropertiesForKeys: options: errorHandler:)
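A minimal sketch of the recursive variant, assuming you only want the directories below some root. The helper name allDirectories(under:) is hypothetical. Passing nil for errorHandler means errors are not reported to the caller, and a real implementation would probably want to handle them.

```swift
import Foundation

/// Hypothetical helper: collects every directory below `root`, recursively.
func allDirectories(under root: URL) -> [URL] {
    let keys: [URLResourceKey] = [.isDirectoryKey]
    guard let enumerator = FileManager.default.enumerator(
        at: root,
        includingPropertiesForKeys: keys,
        options: [],
        errorHandler: nil
    ) else { return [] }

    var result: [URL] = []
    // The enumerator vends Any, hence the cast in the loop pattern.
    for case let url as URL in enumerator {
        let values = try? url.resourceValues(forKeys: Set(keys))
        if values?.isDirectory == true { result.append(url) }
    }
    return result
}
```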
Thank you, this helps clear things up in a number of ways.
From what you're saying, the cached resource values are stored in the URL and the FileManager API populates them according to the included keys. It's resourceValues all the way down; some APIs just populate them differently (getattrlistbulk vs. non-bulk NSURL.getResourceValue()).
Right now I'm just building a basic abstraction:
struct File {
var location: URL
private let manager: FileManager
}
In this case it would be better to ask the stored location URL for resourceValues (which get cached there) than to query the FileManager using location.path(), which would rebuild the URL each time and possibly need to repopulate the resourceValues. There are some assumptions there, but hopefully my mental model of it is more accurate?
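In code, that idea might look like the sketch below. FileSketch is a hypothetical stand-in for the File struct above, with the FileManager property left out to keep it short. Whether the second access is actually served from the URL's cache is an assumption that this sketch does not verify.

```swift
import Foundation

struct FileSketch {               // hypothetical stand-in for File
    var location: URL

    /// Asks the stored URL directly; treats any error (such as a missing
    /// item) as "not a directory".
    var isDirectory: Bool {
        (try? location.resourceValues(forKeys: [.isDirectoryKey]))?.isDirectory ?? false
    }
}
```

Usage would be along the lines of `FileSketch(location: someURL).isDirectory`.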
Since I'm mostly dealing in local files, I should really only be concerned about finding bottlenecks when I'm making a Sequence/Collection for subfolders and accessing attributes during iteration.
Thanks for putting things in practical context. Always appreciated :D
Wasn't there a recent conversation about that not working?
Either way, the caching of resource values is definitely a two-edged sword.
Ah, yes. I always forget about this API because I have a general aversion to the type casting required by NSEnumerator. However, it's definitely the best way to handle recursive enumeration.
Right. While network file systems bring this issue into stark relief, you'll observe this phenomenon with local file systems too. You can think of the kernel as a very low-latency network file server (-: Every kernel call has a cost. getattrlistbulk has more overhead than readdir but, if you want attributes other than the name [1], readdir requires more kernel calls.