I'm posting this in Development > Compiler instead of Evolution > Pitches because I'm not yet sure what shape the feature I want should take. I have a few competing ideas and no single concrete feature to pitch, so I'd like to have a more focused implementation discussion first.
Motivation
Our use of Bazel encourages projects to be split up into many fine-grained modules instead of larger omnibus modules (not just for Swift, but historically for Objective-C dependencies as well). It's not uncommon for a single Swift source file to contain tens of import statements. Since we have a monorepo, we autogenerate module names by transforming the build system's target label into a module name. So for example,
# some/project/BUILD
swift_library(
    name = "my_library",
)
This target is referred to in Bazel as //some/project:my_library, and we give it the Swift module name some_project_my_library. This has two advantages: (1) users don't have to contrive module names for their many Swift/Obj-C targets, and (2) it avoids collisions between targets in the same dependency graph.
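For concreteness, the derivation can be sketched in a few lines of Swift. This is a simplified assumption rather than our exact rule: it strips the leading // and replaces every character that isn't an ASCII letter, digit, or underscore with _. The name derivedModuleName(forLabel:) is made up for illustration, and labels with an @repo prefix get no special treatment.

```swift
/// Derives a Swift module name from a Bazel target label.
/// Hypothetical helper: the real rule may differ in its details.
func derivedModuleName(forLabel label: String) -> String {
    // Drop the leading slashes of an absolute label such as "//some/project:my_library".
    var trimmed = Substring(label)
    while trimmed.hasPrefix("/") {
        trimmed = trimmed.dropFirst()
    }
    // Replace anything that can't appear in a plain identifier with "_".
    let mapped = trimmed.map { ch -> Character in
        let isIdentifierChar = ch.isASCII && (ch.isLetter || ch.isNumber || ch == "_")
        return isIdentifierChar ? ch : "_"
    }
    return String(mapped)
}
```

With this rule, derivedModuleName(forLabel: "//some/project:my_library") returns "some_project_my_library".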
We also have additional tooling that tries to automatically maintain BUILD files, using source files as the source-of-truth. So if someone writes this:
import some_shared_library
import another_shared_library
We can keep the BUILD file automatically in sync:
# some/project/BUILD
swift_library(
    name = "my_library",
    deps = [
        "//another/shared:library",
        "//some/shared:library",
    ],
)
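A rough sketch of what that sync step does, assuming the tool already knows the set of candidate labels in the repository. Everything here is illustrative: inferredDeps(forSource:knownLabels:) is a made-up name, the label-to-module rule is a simplified assumption, and the import scan is line-based, so it ignores forms like import struct Foo.Bar.

```swift
/// Simplified label-to-module-name rule (an assumption, not the exact production rule).
func derivedModuleName(forLabel label: String) -> String {
    var trimmed = Substring(label)
    while trimmed.hasPrefix("/") { trimmed = trimmed.dropFirst() }
    return String(trimmed.map { ch -> Character in
        (ch.isASCII && (ch.isLetter || ch.isNumber || ch == "_")) ? ch : "_"
    })
}

/// Returns the sorted labels whose derived module names are imported by `source`.
func inferredDeps(forSource source: String, knownLabels: [String]) -> [String] {
    // Invert the derivation: module name -> label.
    var labelsByModule: [String: String] = [:]
    for label in knownLabels {
        labelsByModule[derivedModuleName(forLabel: label)] = label
    }

    var deps = Set<String>()
    for line in source.split(whereSeparator: \.isNewline) {
        // Only plain `import Name` lines are recognized in this sketch.
        let tokens = line.split(whereSeparator: { $0 == " " || $0 == "\t" })
        guard tokens.count == 2, tokens[0] == "import" else { continue }
        if let label = labelsByModule[String(tokens[1])] {
            deps.insert(label)
        }
    }
    return deps.sorted()
}
```

Given the two imports above and a known-label list containing //some/shared:library and //another/shared:library, this returns the two labels in the sorted order shown in the deps list. Imports that don't match a known label (such as Foundation) are simply skipped.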
The problem is that users have to internalize and write the "ugly" derived names, rather than the target labels that fall out naturally from the project's directory structure. I want to invert this, so that users write the label and the module name is derived from it, because that would yield a much better developer experience.
Ideas I've Considered
1. Allow macros to expand to import declarations
#bazelImport("//some/project:my_library")
// expands to: import some_project_my_library
In this case, the macro would implement the same module name derivation logic as the build system (or, in a future where a macro could read an input file, we could have the build system provide the mapping that way).
I hacked up the compiler to allow this, but found that my import was never actually resolved. That makes sense: the compiler has to resolve imports first to know which macros are available, so even if I generate an import declaration from a macro, there's no later opportunity to resolve it. I don't know if this is feasible without opening a can of worms. You could have a macro that introduces imports that introduce new macros, which you'd expect to be expanded, but then those could introduce their own imports, ad nauseam.
2. Introduce a new notion of "module origins"
import from "//some/project:my_library"
My idea here was that we could provide a mapping (probably in the -explicit-module-map-file JSON manifest) from opaque "origin identifiers" to module names and allow the import declaration to be written in terms of that origin. In Bazel, we would just use the target label as the identifier.
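To make the mapping idea a bit more concrete, here is a sketch of how a tool might read such a manifest and build an origin-to-module table. The "origin" key is entirely hypothetical (it is the thing being proposed), and the moduleName / modulePath keys reflect my understanding of the current manifest rather than a checked specification. The names ModuleEntry and originTable(fromManifest:) are made up.

```swift
import Foundation

/// One manifest entry. `origin` is a hypothetical new key; the other keys
/// are assumed to match what the manifest contains today.
struct ModuleEntry: Decodable {
    let moduleName: String
    let modulePath: String?
    let origin: String?
}

/// Builds a lookup table from opaque origin identifiers to module names.
func originTable(fromManifest json: Data) throws -> [String: String] {
    let entries = try JSONDecoder().decode([ModuleEntry].self, from: json)
    var table: [String: String] = [:]
    for entry in entries {
        // Entries without an origin can only be imported by module name.
        guard let origin = entry.origin else { continue }
        table[origin] = entry.moduleName
    }
    return table
}
```

Resolving import from "//some/project:my_library" would then amount to a lookup in this table, followed by the normal import of the resulting module name.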
This would probably be the most work, and has the disadvantage that it introduces new syntax. A possible advantage is that it might be able to serve the needs of the oft-requested package-import feature for Swift scripts. Just brainstorming, a Swift script could contain something like the following:
import from "https://github.com/apple/swift-argument-parser.git"
@main
struct MyCommand: ParsableCommand { ... }
and when the script was run under SPM, it could use a syntax scan to resolve the repositories, generate the mapping to module names, and pass that to the compiler. There are some details that would need to be worked out here though, like how to handle repositories that export multiple targets. (Import all of them? Let the user choose a single one, like import ArgumentParser from "..."?)
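As a sketch of what that syntax scan might collect, the following handles both spellings floated above: the bare import from "..." form and the import Name from "..." form. It is a line-based approximation, not a real parser, and it assumes the origin string contains no spaces. OriginImport and scanOriginImports(in:) are hypothetical names.

```swift
/// One origin-based import found in a source file.
struct OriginImport: Equatable {
    var moduleName: String?  // nil for the bare `import from "..."` form
    var origin: String
}

/// Collects origin-based imports from `source`, ignoring ordinary imports.
func scanOriginImports(in source: String) -> [OriginImport] {
    var results: [OriginImport] = []
    for line in source.split(whereSeparator: \.isNewline) {
        let tokens = line.split(whereSeparator: { $0 == " " || $0 == "\t" })
        guard tokens.first == "import",
              let fromIndex = tokens.firstIndex(of: "from"),
              fromIndex + 1 < tokens.count
        else { continue }

        // The origin must be a double-quoted string without spaces.
        let quoted = tokens[fromIndex + 1]
        guard quoted.count >= 2, quoted.first == "\"", quoted.last == "\"" else { continue }
        let origin = String(quoted.dropFirst().dropLast())

        switch fromIndex {
        case 1: results.append(OriginImport(moduleName: nil, origin: origin))
        case 2: results.append(OriginImport(moduleName: String(tokens[1]), origin: origin))
        default: continue  // anything else isn't a form this sketch understands
        }
    }
    return results
}
```

A driver could then fetch each origin, work out which modules it provides, and hand the resulting mapping to the compiler. The open question about multi-target repositories shows up here as what to do when moduleName is nil.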
3. Allow module names to contain non-identifier characters
import `//some/project:my_library`
SE-0275 would have allowed backtick-delimited identifiers to contain non-identifier characters, which ought to have allowed this. However, that proposal was rejected, and I'm not sure if a subset of it for module names would warrant revisiting it.
I think this would also work for Objective-C modules. To my knowledge, Clang lets you write module "//some/project:my_library" { ... } in a modulemap file. We do this to implement layering checks in Bazel's C++ support so that the diagnostics show the target label, and I imagine ClangImporter could be updated to handle those names.
Separately, there's the question of how to deal with serialized module filenames for modules with non-identifier characters. Search-path-based imports require the module name to match the file name, which wouldn't work if module names could contain characters that aren't allowed in file names (such as /). We'd need to invent an encoding, or restrict the feature to modules imported using the -explicit-module-map-file JSON manifest.
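If we went the encoding route, one possible scheme is a reversible percent-style escape: keep ASCII letters, digits, and underscores, and write every other UTF-8 byte as % followed by two hex digits. This is just one candidate to illustrate the idea, not something the compiler does today, and both function names are made up.

```swift
/// Encodes a module name into a string that is safe to use as a file name stem.
func encodeModuleFileStem(_ moduleName: String) -> String {
    var out = ""
    for byte in moduleName.utf8 {
        let isSafe = (0x30...0x39).contains(byte)  // 0-9
            || (0x41...0x5A).contains(byte)        // A-Z
            || (0x61...0x7A).contains(byte)        // a-z
            || byte == 0x5F                        // _
        if isSafe {
            out.append(Character(Unicode.Scalar(byte)))
        } else {
            // "%" itself is escaped too, which keeps the encoding reversible.
            let hex = String(byte, radix: 16, uppercase: true)
            out += "%" + (hex.count == 1 ? "0" + hex : hex)
        }
    }
    return out
}

/// Reverses `encodeModuleFileStem`. Returns nil if an escape is malformed.
func decodeModuleFileStem(_ stem: String) -> String? {
    var bytes: [UInt8] = []
    var iterator = stem.utf8.makeIterator()
    while let byte = iterator.next() {
        if byte == 0x25 {  // "%"
            guard let hi = iterator.next(), let lo = iterator.next(),
                  let value = UInt8(String(decoding: [hi, lo], as: UTF8.self), radix: 16)
            else { return nil }
            bytes.append(value)
        } else {
            bytes.append(byte)
        }
    }
    return String(decoding: bytes, as: UTF8.self)
}
```

Under this scheme the module //some/project:my_library would be stored with the stem %2F%2Fsome%2Fproject%3Amy_library, which is not pretty but is unambiguous and never contains a path separator.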
OTOH, a big advantage of this idea is that the module name is exactly what's written in the import. So, if the user needs to write it elsewhere in source (e.g., to fully qualify a name), the name is the same. The other options above would require the user to still be aware of the transformation scheme or for us to provide an affordance to retrieve it.
Other options I haven't thought of
Anything I've missed?
Wrap-up
This is something I'd really like to make progress on in the near future, but none of the options above feels totally satisfying yet. I'd love to hear other folks' thoughts!