Error embedding file data with a macro

I've written a Swift macro to substitute the contents of a file into a string. However, I get the following error:

Failed to read file at path: .../main.swift, error: Error Domain=NSCocoaErrorDomain Code=257 "The file “main.swift” couldn’t be opened because you don’t have permission to view it." UserInfo={NSFilePath=.../main.swift, NSURL=file://.../main.swift, NSUnderlyingError=0x600002d5d5f0 {Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"}} (from macro 'stringify')

It's reading its own file, so how am I getting a permission error? Reading other files in the same directory, which I ought to have permission to read, fails as well.

Second, how do you usually embed files into an executable? (This is for a CLI and Swift package where filesystem access may not be available, so an app package/bundle is not an option, unless bundles somehow support embedding in compiled executables.)

Macros are run in a sandbox which limits filesystem access, for a variety of reasons.

You could consider using a C target in your package, and utilizing the newish #embed directive.

Is there somewhere I can get more information on the sandbox? The API provides the filesystem symbols as well as full filenames, so that's something I'd expect to have access to.
Also, when filesystem access is limited, this is usually noted explicitly, e.g. MEVideoDecoder | Apple Developer Documentation

You have to scroll down a little, but here is the quote (emphases mine):

Worth noting that macro expansion file system access works on Linux (on some distros; use at your own risk, but you really shouldn't).


Yes, you absolutely shouldn't. Macros must not have side-effects, including I/O access. The build system makes an assumption that for the same set of inputs macros always provide exactly the same output. I/O is not a designated input for a macro, as it isn't deterministic and there's no way for the build system to know which exact file a macro is reading.

There are some solutions to this at a prototyping stage that would enforce macro virtualization on all platforms, e.g. [Macros] Add support for wasm macros by kabiroberai · Pull Request #73031 · swiftlang/swift · GitHub and it was mentioned in the corresponding vision for Swift evolution. In this case, macros would consistently not have access to any I/O on all platforms.


Why should macros be free of side effects? Or, for I/O more specifically: I don’t see how reading from the filesystem is a problem, as it’s obviously necessary for the compile step. That is, why does C get to have macros with filesystem read access but not Swift?

And granting that macros don’t get filesystem access, is there a way I can build a Swift-compatible object file from some source data, bypassing the Swift compiler and going directly to an LLVM frontend?

I also face the same problem but can't find any reliable solution.

That was explained in my original message: source code placed on the file system is tracked by the build system, as source code file paths are known to the build system. Arbitrary side effects within Swift macros are opaque and unknown to the build system, breaking determinism and assumptions built on top of that determinism.

For the same reason that C gets to have unrestricted memory access and data races, but Swift does not thanks to its memory safety and strict concurrency. In the same way, macros in Swift are supposed to be free from side effects (partially enforced only on some platforms for now) to prevent build system errors and incorrect incremental builds.

You could embed data from a module written in C or C++, but do that at your own risk, as the Swift build system would have no way to ensure correctness of the build.


It's also a potential security risk if macros have unfettered access to the file system of the build machine, where some third-party dependency could read arbitrary files and embed their contents into a binary that gets shipped somewhere.

One way to handle this, if you want macros to be able to read files, is to require that the paths to those files be passed to the compiler, and provide a specialized interface that allows macros to only access the contents of allowlisted files, instead of using raw file APIs. That provides an audit trail of sorts because the package manifest/build system have to explicitly list the files that they're permitting access to.
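
To make the allowlist idea concrete, here is a minimal sketch of what such a restricted interface might look like. Everything in it is hypothetical: AllowlistedFiles, AccessError, and the way the allowed paths reach the macro are assumptions for illustration, not part of SwiftSyntax or any existing compiler plugin API.

```swift
import Foundation

/// Hypothetical interface: NOT an existing SwiftSyntax or compiler API.
/// The build system would construct it from paths listed in the manifest.
struct AllowlistedFiles {
    enum AccessError: Error {
        case notAllowlisted(String)
    }

    private let allowed: Set<String>

    /// `paths` are the files the package manifest explicitly permits.
    init(paths: [String]) {
        // Normalize so "a/../b" style paths cannot bypass the check.
        allowed = Set(paths.map { URL(fileURLWithPath: $0).standardizedFileURL.path })
    }

    /// Returns the file contents only if the path was allowlisted.
    func contents(of path: String) throws -> Data {
        let normalized = URL(fileURLWithPath: path).standardizedFileURL.path
        guard allowed.contains(normalized) else {
            throw AccessError.notAllowlisted(normalized)
        }
        return try Data(contentsOf: URL(fileURLWithPath: normalized))
    }
}
```

A macro written against an interface like this would never call raw file APIs itself, so the set of files it can observe is exactly the set the build system already knows about.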

Those issues aside, IMO something like #embed would be better implemented as a compiler built-in that just reads the file contents and generates the appropriate constant data at the IR level. That would be far more efficient than a macro that generates a huge array that then has to be compiled, especially because today the compiler performs inconsistently when doing this.

I’ll emphasise that, however it is implemented in the compiler, it has to tell the build system which file path was read, both to enforce access control, as you said, and to rebuild everything correctly when the file changes.


Just add another executable target that generates those swift files and run that target before building the main target.
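
As a rough sketch of what such a generator target's main.swift could contain (the function names, argument order, and constant name here are made up for illustration), it reads a file and writes a Swift source file declaring its bytes as an array:

```swift
import Foundation

/// Renders `data` as Swift source declaring a `[UInt8]` constant.
/// `name` is whatever identifier the main target should see.
func renderEmbeddedSource(name: String, data: Data) -> String {
    let bytes = data.map { String($0) }.joined(separator: ", ")
    return "let \(name): [UInt8] = [\(bytes)]\n"
}

/// Reads `input` and writes the generated Swift source to `output`.
func generate(input: String, output: String, name: String) throws {
    let data = try Data(contentsOf: URL(fileURLWithPath: input))
    let source = renderEmbeddedSource(name: name, data: data)
    try source.write(toFile: output, atomically: true, encoding: .utf8)
}

// Assumed usage: generator <input-file> <output.swift> <constant-name>
let arguments = CommandLine.arguments
if arguments.count == 4 {
    do {
        try generate(input: arguments[1], output: arguments[2], name: arguments[3])
    } catch {
        FileHandle.standardError.write(Data("error: \(error)\n".utf8))
        exit(1)
    }
}
```

As noted earlier in the thread, the compiler can be slow on very large array literals, so this is probably best suited to small files.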

Or, if you want to do it the hard way, you can always just switch to CMake.

Edit: you might also be able to make this into a build tool plugin, so the file-generation step runs whenever you use swift build instead of having to be run manually. You can look up examples of this; R.swift, for instance, has several plugins.
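
For reference, a build tool plugin for this might look roughly like the sketch below. The tool name EmbedGenerator, the input location Resources/data.bin, and the constant name are all assumptions; the plugin would also need to be declared in Package.swift and attached to the main target. This uses the Path-based PackagePlugin spellings; newer toolchains also offer URL-based equivalents.

```swift
import PackagePlugin

@main
struct EmbedFilesPlugin: BuildToolPlugin {
    func createBuildCommands(context: PluginContext, target: Target) async throws -> [Command] {
        // "EmbedGenerator" is a hypothetical executable target in the same package.
        let generator = try context.tool(named: "EmbedGenerator")

        // Hypothetical input file inside the target's source directory.
        let input = target.directory.appending("Resources", "data.bin")
        // Generated sources belong in the plugin's work directory.
        let output = context.pluginWorkDirectory.appending("EmbeddedData.swift")

        return [
            .buildCommand(
                displayName: "Embedding data.bin",
                executable: generator.path,
                arguments: [input.string, output.string, "embeddedData"],
                // Declaring inputs and outputs lets the build system
                // rerun the command when the embedded file changes.
                inputFiles: [input],
                outputFiles: [output]
            )
        ]
    }
}
```

Because the input file is declared to the build system, this approach avoids the hidden-dependency problem discussed above for macros.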

I believe Rust deals with this by re-expanding all proc macros on every build, to check they produced the same output as last time, rather than by knowing "ahead of time". This also allows macros that have other forms of nondeterminism, e.g. expanding to the build timestamp (though "at what cost"!)