var resultStorage: ObjCBool = false // true or false doesn't matter here, but
// it has to be set before taking its
// address, which we have to do below.
guard FileManager.default.fileExists(atPath: url.path, isDirectory: &resultStorage)
else { fatalError("Handle case where url is not an existing file.") }
let isDir = resultStorage.boolValue // Finally!
Is this the recommended way to do something as elementary as checking if a given URL is a directory?
Can you provide an example of those files that are incorrectly judged as folders? Are they packages or what? Other than that, I'd try the two alternatives: one of which is mentioned above: FileManager.fileExists(atPath:isDirectory:) and another which is not: FileManager.attributesOfItem(atPath:). Note that the former method follows symlinks while the other two methods don't.
I'm doing a little research on exactly this question and though the thread mentions all the possible ways, I was hoping to find some clarity on the different methods.
Traditionally, I've used FileManager's fileExists(atPath:isDirectory:) because it seemed the oldest and hopefully most reliable (always fun to see an ObjCBool). In my internet travels (and as discussed here in brief), I have found three ways: fileExists(atPath:isDirectory:), attributesOfItem(atPath:), and resourceValues(forKeys:).
I'm vaguely aware that resourceValues(forKeys:) caches values.
Are attributesOfItem(atPath:) and resourceValues(forKeys:) using the same API under the hood? Are both functions using cached values?
Is there any benefit to sticking with the older API over a newer (possibly cached) alternative? Personally, cached values (aka shared state) in a Sendable-righteous world make me hesitate.
I came here looking for answers and yes they're all here, but presented without comment.
Maybe this is a solved problem and any solution is foolproof enough? Anyone have any recommendations based on usage, speed, compatibility, gotchas/pitfalls, or known issues?
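For anyone comparing them side by side, here is a minimal sketch of the three approaches mentioned in this thread. The function names are made up for illustration, and each one collapses "does not exist" and "not a directory" into false, which may not be what you want in real code. The symlink comments repeat what was said earlier in the thread.

```swift
import Foundation

// 1. fileExists(atPath:isDirectory:). Per the thread, this one follows symlinks.
func isDirectoryViaFileExists(_ url: URL) -> Bool {
    var isDir: ObjCBool = false
    let exists = FileManager.default.fileExists(atPath: url.path, isDirectory: &isDir)
    return exists && isDir.boolValue
}

// 2. attributesOfItem(atPath:). Per the thread, this does not follow symlinks.
func isDirectoryViaAttributes(_ url: URL) -> Bool {
    guard let attrs = try? FileManager.default.attributesOfItem(atPath: url.path) else {
        return false
    }
    return (attrs[.type] as? FileAttributeType) == .typeDirectory
}

// 3. resourceValues(forKeys:). The value may come from the URL's resource cache.
func isDirectoryViaResourceValues(_ url: URL) -> Bool {
    let values = try? url.resourceValues(forKeys: [.isDirectoryKey])
    return values?.isDirectory ?? false
}
```

For a plain directory or a plain file all three should agree; the differences show up with symlinks and with stale cached values.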
How do you want to treat things that are neither files nor directories?
What layer of the stack are you working at? Folks working at the BSD layer have different options from folks working with Foundation.
Do you want just this ‘is a directory’ value? Or are you getting this value along with a bunch of other attributes?
Do you want this value for a single URL? Or are you iterating through the file system hierarchy and want the values for every URL you get back?
The last point is the most critical. If you’re working with a single URL then performance generally doesn’t matter, so you can use whatever API you like. However, if you’re working with a large bunch of URLs then choosing the right API can yield a massive performance win (or loss, depending on your perspective :-).
Specifically, when operating on high-latency file systems, like a network file system, it’s absolutely critical that you get all the attributes you need while you’re enumerating the directory. If you enumerate the directory and then get attributes for each item, the latencies add up and performance will tank [1].
That means using getattrlistbulk. Except that you don’t want to use getattrlistbulk because it’s really hard to call, so you’ll wanna use some sort of wrapper. The obvious wrapper is FileManager, where the contentsOfDirectory(at:includingPropertiesForKeys:options:) returns the contents of the directory as an array of URLs, with each URL holding the resource values you requested in its cache.
And that’s why I generally favour the resource value APIs. They were designed to get you on to this performance happy path.
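To make that concrete, here is a rough sketch of the happy path: ask contentsOfDirectory(at:includingPropertiesForKeys:options:) for the keys up front, then read them back from each returned URL. The helper name subdirectories(of:) is hypothetical, and whether the read is actually served from the cache is an implementation detail this code can't prove.

```swift
import Foundation

// Hypothetical helper: returns the immediate subdirectories of `directory`.
func subdirectories(of directory: URL) throws -> [URL] {
    let keys: [URLResourceKey] = [.isDirectoryKey]
    // Request the keys while enumerating so they can be fetched in bulk.
    let children = try FileManager.default.contentsOfDirectory(
        at: directory,
        includingPropertiesForKeys: keys,
        options: []
    )
    // Read back the same keys that were requested above.
    return try children.filter { url in
        try url.resourceValues(forKeys: Set(keys)).isDirectory == true
    }
}
```

The key point is that the set passed to resourceValues(forKeys:) matches what was requested during enumeration.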
If there are specific places where you’re worried about a resource value being cached inappropriately, deal with that using removeCachedResourceValue(forKey:) or removeAllCachedResourceValues(). But IME this is rarely needed.
Share and Enjoy
Quinn “The Eskimo!” @ DTS @ Apple
[1] And I mean really tank. Imagine a directory with 50 items in it on a file server with a 100 ms latency. If you enumerate the directory properly, you can expect all your results in 100 ms. If you do it badly, that’ll take 5 seconds. Ouch!
I'm vaguely aware that resourceValues(forKeys:) caches values
Beware, the cached value is not revalidated until the next run loop invocation!
Create a file
call resource values on it - good
delete the file / make other changes
call resource values on it - it’s still there / unchanged
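If you hit that sequence and need a fresh answer, one option is to drop the cache before asking, using the removeAllCachedResourceValues() call mentioned earlier in the thread. This is only a sketch; freshIsDirectory(_:) is a made-up name, and it returns nil when the value can't be read (for example because the item is gone).

```swift
import Foundation

// Hypothetical helper: discards cached resource values, then asks again.
func freshIsDirectory(_ url: URL) -> Bool? {
    // removeAllCachedResourceValues() is mutating, so work on a local var.
    var copy = url
    copy.removeAllCachedResourceValues()
    return (try? copy.resourceValues(forKeys: [.isDirectoryKey]))?.isDirectory
}
```

This trades the performance benefit of the cache for freshness, so it's probably best kept to the few places where stale values actually hurt.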
If there are specific places where you’re worried about a resource value being cached inappropriately, deal with that using removeCachedResourceValue(forKey:) or removeAllCachedResourceValues().
To add: I believe the resource-aware variant of contentsOfDirectory that is recursive is FileManager.enumerator(at:includingPropertiesForKeys:options:errorHandler:)
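A minimal sketch of that recursive variant, assuming you only care about .isDirectoryKey. The function name allDirectories(under:) is hypothetical, and the error handler here just keeps going, which may be too forgiving for real code.

```swift
import Foundation

// Hypothetical helper: collects every directory below `root`, recursively.
func allDirectories(under root: URL) -> [URL] {
    let keys: [URLResourceKey] = [.isDirectoryKey]
    guard let enumerator = FileManager.default.enumerator(
        at: root,
        includingPropertiesForKeys: keys,
        options: [],
        errorHandler: { _, _ in
            // Returning true tells the enumerator to continue after an error.
            return true
        }
    ) else { return [] }

    var result: [URL] = []
    // The enumerator vends Any, so pattern-match each element to URL.
    for case let url as URL in enumerator {
        let values = try? url.resourceValues(forKeys: Set(keys))
        if values?.isDirectory == true {
            result.append(url)
        }
    }
    return result
}
```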
Thank you, this helps clear things up in a number of ways.
From what you're saying, the cached resource values are stored in the URL and the FileManager API populates them according to the included keys. It's resourceValues all the way down; some APIs just populate them differently (getattrlistbulk vs. non-bulk NSURL.getResourceValue()).
Right now I'm just building a basic abstraction:
struct File {
var location: URL
private let manager: FileManager
}
In this case it would be better to ask the stored location URL for resourceValues (which get cached there) than to query the FileManager using location.path(), which would rebuild the URL each time and possibly need to repopulate the resourceValues. There are some assumptions there, but hopefully my mental model is more accurate now?
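Roughly what I have in mind, as a sketch only. The isDirectory property and the explicit initializer are additions for illustration, and treating a failed lookup as false is an assumption that real code might not want.

```swift
import Foundation

struct File {
    var location: URL
    private let manager: FileManager

    init(location: URL, manager: FileManager = .default) {
        self.location = location
        self.manager = manager
    }

    // Ask the stored URL directly instead of rebuilding a path for FileManager.
    var isDirectory: Bool {
        let values = try? location.resourceValues(forKeys: [.isDirectoryKey])
        return values?.isDirectory ?? false
    }
}
```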
Since I'm mostly dealing in local files, I should really only be concerned about finding bottlenecks when I'm making a Sequence/Collection for subfolders and accessing attributes during iteration.
Thanks for putting things in practical context. Always appreciated :D
Wasn’t there a recent conversation about that not working?
Either way, the caching of resource values is definitely a two-edged sword.
Ah, yes. I always forget about this API because I have a general aversion to the type casting required by NSEnumerator. However, it’s definitely the best way to handle recursive enumeration.
Right. While network file systems bring this issue into stark relief, you’ll observe this phenomenon with local file systems too. You can think of the kernel as a very low-latency network file server (-: Every kernel call has a cost. getattrlistbulk has more overhead than readdir but, if you want attributes other than the name [1], readdir requires more kernel calls.
Maybe I need to dig farther into the CFURL constructor, but I don't see any way those included keys do anything besides disappear at the call site.
Both functions on FileManager that take Set<URLResourceKey> mention the keys having some effect, but there is no specific mention of how non-empty or nil keys affect things:
If you wish to only receive the URLs and no other attributes, then pass '0' for 'options' and an empty NSArray ('[NSArray array]') for 'keys'. If you wish to have the property caches of the vended URLs pre-populated with a default set of attributes, then pass '0' for 'options' and 'nil' for 'keys'.
I would like to use the \.isDirectory resource value while enumerating the contents of a directory (I'm making a Sequence). Now I'm not sure how to make that performant. Am I missing something about how URL's resource value caching mechanism works?
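For reference, this is the shape of the Sequence I'm aiming for, as a sketch. DirectoryEntries is a made-up type, it only lists immediate children, and it relies on the enumerator honouring the requested keys as the documentation states.

```swift
import Foundation

// Hypothetical Sequence over the immediate children of a directory.
struct DirectoryEntries: Sequence {
    typealias Element = (url: URL, isDirectory: Bool)

    let root: URL
    // Request exactly the keys that get read back below.
    static let keys: [URLResourceKey] = [.isDirectoryKey]

    func makeIterator() -> AnyIterator<Element> {
        let enumerator = FileManager.default.enumerator(
            at: root,
            includingPropertiesForKeys: Self.keys,
            options: [.skipsSubdirectoryDescendants]
        )
        return AnyIterator {
            // nextObject() returns Any?, so cast each element to URL.
            guard let url = enumerator?.nextObject() as? URL else { return nil }
            let values = try? url.resourceValues(forKeys: Set(Self.keys))
            return (url: url, isDirectory: values?.isDirectory ?? false)
        }
    }
}
```

Usage would be something like `for entry in DirectoryEntries(root: someURL) where entry.isDirectory { ... }`, with someURL standing in for whatever directory you are walking.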
I thought about this too, so I double-checked, and there are only path-based APIs, with no mention of URL/URLResourceKey. Though I'd love a link to what you're mentioning in case I'm missing something.
In FoundationEssentials, NSEnumerator subclassing is out in favour of Sequence. Unfortunately, they deal with filenames rather than URLs and use readdir as discussed before (so not as efficient as getattrlistbulk).
I'm not certain that readdir is recursive, but it's handy that the Sequence exposes the d_type in the Element.
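To show what d_type gives you at the BSD layer, here is a rough sketch that calls opendir/readdir directly. This is not the FoundationEssentials code, the function name is made up, and it assumes Foundation re-exports the C library on your platform. Note that d_type can be DT_UNKNOWN on some file systems, in which case you would need a follow-up stat-style call, so the sketch returns nil for that case.

```swift
import Foundation

// Hypothetical helper: lists one directory level using readdir and d_type.
func entries(inDirectoryAtPath path: String) -> [(name: String, isDirectory: Bool?)] {
    guard let dir = opendir(path) else { return [] }
    defer { closedir(dir) }

    var result: [(name: String, isDirectory: Bool?)] = []
    while let entry = readdir(dir) {
        // d_name is imported as a fixed-size tuple of CChar; view it as a C string.
        let name = withUnsafePointer(to: &entry.pointee.d_name) { tuplePtr in
            tuplePtr.withMemoryRebound(
                to: CChar.self,
                capacity: MemoryLayout.size(ofValue: tuplePtr.pointee)
            ) { String(cString: $0) }
        }
        if name == "." || name == ".." { continue }

        let type = Int32(entry.pointee.d_type)
        // nil means the file system did not report a type for this entry.
        let isDirectory: Bool? = (type == Int32(DT_UNKNOWN)) ? nil : (type == Int32(DT_DIR))
        result.append((name: name, isDirectory: isDirectory))
    }
    return result
}
```

readdir itself only reads a single directory, so any recursion has to be layered on top by the caller.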
Regardless, I don't mean to mislead anyone into believing the API, and I really appreciate it when people go the extra step of investigating and testing, and share their work. I wish Apple really did more to avoid people having to do that, particularly with such core API.
I wonder how one could write a test to show that resource attributes were or weren't cached?
I missed the resourceInfo related copying, thanks for pointing that out.
I'm going to write some code to see what I can unearth. I assumed that it was something I'm missing, because that [URLResourceKey]? is just abandoned. It looks like a bug, but I'm not sure if that's just my inexperience showing.
Finding the right source code for some bit of Foundation functionality is quite tricky, so I decided to avoid that path and look at this in the debugger. The code never lies, as long as you interpret code as disassembled code (-:
Consider this code:
let u = URL(fileURLWithPath: "/Users/quinn/Test")
let e = FileManager.default.enumerator(at: u, includingPropertiesForKeys: nil)!
while let o = e.nextObject() {
let u = o as! URL
print(u)
}
If I run this on macOS 15.4 and set a breakpoint on the print(…), I can do this:
(lldb) p u as NSURL
(NSURL) 0x0000600002ac8780 "file:///Users/quinn/Test/.DS_Store"
(lldb) expr -l objc -O -- __CFURLResourceInfoPtr(0x0000600002ac8780)
<_FileCacheRef 0x11ef06910>{
… lots of details …
}
IMPORTANT: __CFURLResourceInfoPtr is an implementation detail I gleaned from the open source, so the usual caveats apply.
As you can see, the URL has an attached resource cache. You can explore the debug description of the cache to see what was actually cached and what wasn’t.
I followed your example and it appears that the includingPropertiesForKeys parameter is respected like the documentation states. Though, to me, a function that doesn't accept values still receiving those values is indistinguishable from magic.
I found that isDirectory can be derived from attributes.hasBase.FLAGS as seen in CFURLHasDirectoryPath(_:). I'm not sure how URLResourceValues derives it exactly, but it's clear when I included the .isDirectoryKey key, attributes.hasBase.FLAGS is populated accordingly.
Interestingly, if I included .isDirectoryKey, the whole of attributes.hasBase was included, so I didn't need to include .nameKey as well. This is not true between differently grouped resource values, though.
Here are simplified __CFURLResourceInfoPtr() dumps for different combinations of keys, all for the first enumerated url, which was my home folder:
[.nameKey]
attributes
hasBase
hasDevice
[.nameKey, .volumeTypeNameKey]
attributes
hasBase
hasDevice
volumeInfo
propertyValues
... key/values
[.volumeTypeNameKey]
attributes
hasDevice
volumeInfo
propertyValues
... key/values
[.volumeTypeNameKey, .effectiveIconKey] ensures
attributes
hasBase
hasDevice
hasUserAccess
hasFinderInfo
hasFileDataLength
hasFullPath
volumeInfo
propertyValues
NSURLEffectiveIconKey... icon providers
... more key/values
nil
attributes
hasBase
hasDevice
My initial assumptions didn't hold up, so I hope this helps clarify. The best I can deduce is: be as explicit as possible when including property keys, or specify nil.
I'm still baffled by how functions that aren't passed values obtain those values. My money is on Objective-C shenanigans... but any insight would be welcome.