We need `#fileName`


(Dave DeLong) #1

As I've been dusting off some proposals, I've realized something:

#file is the wrong thing for us to be using to describe code context. We should be using #fileName instead.

There are two main reasons that I've come up with to not use #file:

1️⃣ Using #file frequently bloats the size of your binary. Every #file usage results in a new StaticString value that has to get encoded into a __TEXT section of your binary. If your typical file path is 64 characters long and you've got 128 paths in your binary, you're looking at 8KB of space just for file paths.

As a small example, I took a look at a small iOS app I've got on the store, and sure enough about a dozen KB of my binary is taken up by paths. This is for a small app with a handful of screens. How much space would a larger app be wasting by encoding full path names?

2️⃣ Encoding paths into a binary is a security concern. It leaks details about the machine on which the binary was built and can expose information about the build process itself. For example, by examining the paths in the binary, you can make very educated guesses about the size of a development team and what sort of CI setup they have (if they have one at all).

Additionally, the elements in the paths themselves might be of concern. A folder might be named after a top-secret project, because that's the name of the tag or branch used to build the app, or it might be the code name of the project itself.

Apple employees need only look at your typical Apple rumors site to get an idea of the sort of information that can be discovered by running strings on a binary.


I believe a far better value to encode into the binary would be the name (last path component) of the file. There is still a small chance of leaking sensitive information, but only if the file itself has a sensitive name. The binary size would be reduced, and we wouldn't be losing much meaningful information. Swift already disallows two files with the same name but different paths within a single compilation target, so disambiguation isn't a huge issue.
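To make the idea concrete, here is a minimal sketch of the value a #fileName would carry, computed at runtime from a full path. The helpers `fileName(fromPath:)` and `describe(_:file:line:)` are hypothetical names made up for illustration. Note that this sketch does not solve the problem described above: the full path from #file is still compiled into the binary, and only the displayed value is shortened.

```swift
// Hypothetical helper: returns the last component of a "/"-separated path.
// If the path has no components (for example, an empty string), it is
// returned unchanged.
func fileName(fromPath path: String) -> String {
    guard let last = path.split(separator: "/").last else { return path }
    return String(last)
}

// Hypothetical logging helper that reports only the file name.
// The default argument still captures the full #file path at the call site.
func describe(_ message: String, file: String = #file, line: Int = #line) -> String {
    return "\(fileName(fromPath: file)):\(line): \(message)"
}

print(fileName(fromPath: "/Users/dave/Code/Project/Target/file.swift")) // file.swift
```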

So.

Can we replace #file with #fileName please?


(^) #2

How do you differentiate files with the same name but in different paths? main.swift shows up in every executable target.


(Dave DeLong) #3

main.swift does show up in every executable, but the compiler does not allow multiple .swift files with the same name in the same target: you cannot have (for example) UITableView.swift in both your ./Extensions folder and another (different) UITableView.swift file in a ./Classes/Implementations/ folder.

The only time you would end up with ambiguous file names is when you have two files with the same name in different targets. In my experience, the other information (such as #function, #line, and the backtrace) provides more than enough context to eliminate the ambiguity in this case.


If you try to have two files with the same name in a single target, your compilation will fail with this error:

<unknown>:0: error: filename "test.swift" used twice: '/Users/dave/Desktop/Test/Test/Subfolder/test.swift' and '/Users/dave/Desktop/Test/Test/test.swift'
<unknown>:0: note: filenames are used to distinguish private declarations with the same name

(Michel Fortin) #4

Another idea would be to use relative paths. Add a base directory parameter on the command line and let the compiler populate #file accordingly. No language change is required.


(Dave DeLong) #5

Yeah, this could work too, although it would complicate the code a bit. There are times when you're compiling code that lives outside your SRCROOT, and so you'd end up with ../../../Code/Project/Thing/Foo.swift paths, and that might be inadvertently revealing as well.


(Michel Fortin) #6

Then add a flag to suppress the path entirely. Especially if it's for security reasons, it's better to do it as a command line flag.

The alternative is to change all the functions you call that take #file as a default argument to take #fileName instead. This includes those in the standard library and other libraries you'll be using. I can't see this working very well since it only takes one leak to reveal the full path.


(Dave DeLong) #7

Hm, I really like that idea. A compiler flag to change what #file gets expanded to. Default expansion for source compatibility would be to expand to the full path.

SWIFT_FILE_EXPANSION_TYPE =                 // ex: /Users/dave/Code/Project/Target/file.swift
SWIFT_FILE_EXPANSION_TYPE = absolute        // ex: /Users/dave/Code/Project/Target/file.swift
SWIFT_FILE_EXPANSION_TYPE = relative        // ex: Target/file.swift (relative to SRCROOT)
SWIFT_FILE_EXPANSION_TYPE = name            // ex: file.swift (last path component)
SWIFT_FILE_EXPANSION_TYPE = none            // ex: (empty string)

Compiler flag name can easily be bikeshedded.
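As a rough illustration of what each proposed mode would produce, the following sketch models the expansion as a plain function over strings. Everything here is hypothetical: `FileExpansion`, `expand(_:sourceRoot:as:)`, and the case names are invented for this example, and this is not how the compiler implements #file. The `none` setting from the list is spelled `empty` here to avoid confusion with `Optional.none`.

```swift
// Hypothetical model of the proposed expansion modes.
enum FileExpansion {
    case absolute   // full path, the current behavior
    case relative   // path relative to the source root
    case name       // last path component only
    case empty      // "none" in the proposal: an empty string
}

// Hypothetical function showing the string each mode would yield.
func expand(_ path: String, sourceRoot: String, as mode: FileExpansion) -> String {
    switch mode {
    case .absolute:
        return path
    case .relative:
        // Normalize the root so it always ends in "/" before stripping it.
        let root = sourceRoot.hasSuffix("/") ? sourceRoot : sourceRoot + "/"
        // Paths outside the source root are left untouched in this sketch.
        return path.hasPrefix(root) ? String(path.dropFirst(root.count)) : path
    case .name:
        return path.split(separator: "/").last.map { String($0) } ?? path
    case .empty:
        return ""
    }
}

let path = "/Users/dave/Code/Project/Target/file.swift"
let root = "/Users/dave/Code/Project"
print(expand(path, sourceRoot: root, as: .relative)) // Target/file.swift
print(expand(path, sourceRoot: root, as: .name))     // file.swift
```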


(Pyry Jahkola) #8

The last time I checked #file was indeed relative whenever the command-line argument to the Swift compiler was relative. Unfortunately Xcode never had an option to use relative paths when building projects.


(Josh Caswell) #9

basename (actual name up for bikeshedding) would be good to include as well: no path and no extension, since extension is fairly redundant.


(Tony Allevato) #10

Swift 5.0 will include the -debug-prefix-map flag that was added to remap absolute paths encoded in debug info, because build systems like Bazel rely on being able to cache build artifacts from remote machines whose workspaces may not have the same absolute paths (and then also debug them on entirely different machines).

I think it would make a lot of sense to reuse that flag and extend that remapping to #file as well.
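A minimal sketch of what such a remapping could look like when applied to a #file string, assuming the map is a list of prefix/replacement pairs and that the first matching prefix wins. The function `remap(_:using:)` and its tuple labels are hypothetical and are not taken from the compiler; the paths are made up for illustration.

```swift
// Hypothetical prefix remapping: replaces the first matching prefix
// of `path` with its replacement, or returns `path` unchanged.
func remap(_ path: String, using prefixMap: [(from: String, to: String)]) -> String {
    for entry in prefixMap where path.hasPrefix(entry.from) {
        return entry.to + String(path.dropFirst(entry.from.count))
    }
    return path
}

// Illustrative mapping from a machine-specific workspace to ".".
let prefixMap = [(from: "/private/var/ci/workspace-42", to: ".")]
print(remap("/private/var/ci/workspace-42/Sources/App.swift", using: prefixMap))
// ./Sources/App.swift
```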


(Jens Ayton) #11

Can we have this for Clang too plz. 🙂


(Daniel Höpfl) #12

I like the idea.

Regarding privacy concerns:

SWIFT_FILE_EXPANSION_TYPE = hashed     // ex: Target/file.swift -> 880b724b5553f4182100e9c2402bc1d33f4924af

(Joe Groff) #13

To me, it makes sense to change the behavior of #file rather than add a new #fileName modifier, since for all the reasons the original post notes, the full path is rarely desirable. With the help of build systems telling the compiler what the root directory of a build is, it'd be great for it to provide just a relative path from that root as the standard behavior. Absent any compiler flag, the directory containing the current source file strikes me as a safer default root than the current behavior as well.


(Dave DeLong) #14

Hm, I could see the use for hashing, but I think it'd deserve to be a separate compiler flag, so I could hash the full path, the relative path, the file name, or the base name.

SWIFT_FILE_EXPANSION_TYPE = relative
SWIFT_FILE_HASH_ALGORITHM = none // default. also: sha1, sha256, md5, etc

Thoughts?


(Tony Allevato) #15

I think we might be approaching feature creep here. In the majority of cases such fine control isn't necessary, so we don't need to overcomplicate the feature for every niche case. Particularly, if you're concerned about the privacy of your source file paths to the point that even the basename is something you want to hash instead, then you ought to be using #if guards to not ship the strings in the first place.
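One way the #if approach might look, as a sketch: the helper below only mentions #file in builds where a `DEBUG` compilation condition is set, so call sites in other builds never reference the path at all. The name `debugLog` is hypothetical, and `DEBUG` is assumed to be defined by your build settings for development builds.

```swift
#if DEBUG
// Development builds: include the full source path in the message.
func debugLog(_ message: String, file: String = #file, line: Int = #line) -> String {
    return "\(file):\(line): \(message)"
}
#else
// Other builds: the signature has no #file default argument,
// so no path string is captured at the call site.
func debugLog(_ message: String, line: Int = #line) -> String {
    return "line \(line): \(message)"
}
#endif

print(debugLog("cache miss"))
```

Call sites look the same in both configurations as long as they rely on the default arguments.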

In addition to the existing -debug-prefix-map flag mentioned above, another possibility would be to use the existing -working-directory flag and stem the paths based on that. So if you have the file /confidential/path/to/my/app/Sources/App.swift and you pass -working-directory /confidential/path/to/my/app, then #file would be "Sources/App.swift".

That would jibe nicely with @Joe_Groff's suggestion of falling back to the CWD absent any other flag, because (I think) the driver internally uses CWD if -working-directory is not provided (correct me if I'm wrong).


(Tony Allevato) #16

In fact, when invoking swiftc directly, #file appears to just be whatever exact path you passed to it:

// main.swift
print(#file)

$ swiftc main.swift && ./main
main.swift

$ swiftc ./main.swift && ./main
./main.swift

$ swiftc $PWD/main.swift && ./main
/Users/myusername/path/to/main.swift

(Joe Groff) #17

Yeah, the hashing use case strikes me as stretching the purpose of this language feature, and also a potential security hazard if used with hash tables that are also exposed to user input, since the supposedly "random" unique hashed IDs are being seeded with predictable strings.


(Nick Lockwood) #18

I do actually use the full #file path in a couple of projects. My use case is locating the source directory from inside a running app in order to monitor and hot-load resource file changes on the fly.

I also pretty regularly use #file inside test targets to refer to source or resource files at relative locations inside the project.


(Jeremy David Giesbrecht) #19

I also make heavy use of it during development. Because the package manager and Xcode have different ideas about what the working directory for unit tests should be, it is the only way I know to reliably find the package directory in both contexts in order to load/export Git‐tracked specifications.

There is a lot of package testing code out there that relies on things like this:

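import Foundation

// Assumes this file sits three path components below the package root,
// e.g. <package>/Tests/<Target>/<File>.swift.
let packageRoot = URL(fileURLWithPath: #file)
    .deletingLastPathComponent() // drop the file name
    .deletingLastPathComponent() // drop the test target directory
    .deletingLastPathComponent() // drop the Tests directory

let resources = packageRoot.appendingPathComponent("Resources")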

However, I agree that there is probably never a good reason to have full paths still kicking around in a binary that will be shipped without its source. fatalError() and friends are over‐eager about it and also hide what they are doing.

Something should be done, but completely removing the ability to discover the full path even during development would be crippling.


(Joe Groff) #20

If we were going to make #file relative to a given source root, then maybe we could provide that #sourceRoot as a special value too. Would that work for your use cases?