Cross-Import Overlays
The problem
[Note: I've added new information to this section with some additional benefits of this approach after reading the comments. These points are prefixed with a ★.]
Say we have a pair of frameworks, NumericKit and FormatKit, where NumericKit covers numerical simulations and FormatKit covers serialization and deserialization for a bunch of data formats. Ideally, a downstream project that depends on both of these frameworks would "automatically" get cross-cutting functionality. In this example, it could be something like serializing complex data using the HDF5 format (say).
If one wants the automatic behavior today, the cross-cutting functionality has to live either in NumericKit (in which case NumericKit depends on FormatKit) or in FormatKit (in which case FormatKit depends on NumericKit). This is an unsatisfying solution (★ especially when neither NumericKit nor FormatKit is a system framework) because it means that:
1. clients potentially incur dependencies which are unused: (★) this increases the amount of code that might need to be audited, especially if it is relying on a C/ObjC core for functionality
2. (★) clients incur additional compile time + link time overhead:
   a. It reduces build parallelism because swiftmodules are built in dependency order.
   b. It increases the amount of code one needs to compile and link.
The other option is to have a separate framework, say NumericKitFormatKitAdditions, for the cross-cutting functionality, which solves problems 1, 2a and 2b mentioned above. ★ Such cross-cutting functionality includes not only extensions for third-party types but also conformances. We would generally like people to avoid writing retroactive conformances as adoption of library evolution increases; such an Additions framework would provide a natural home for retroactive conformances without issues.
However, this solution requires downstream users to write import NumericKitFormatKitAdditions, which is inconvenient and a bit ugly. If there is just one such module, it is probably fine. However, if a framework's functionality overlaps with many other frameworks, this can lead to a displeasing wall of imports:
import NumericKitFormatKitAdditions
import NumericKitPlotKitAdditions
import PlotKitFormatKitAdditions
import NumericKitTestKitAdditions
import FormatKitTestKitAdditions
import PlotKitTestKitAdditions
Wouldn't it be nice if we could mark these "Additions" modules in a way such that if you wrote
import NumericKit
import FormatKit
import PlotKit
import TestKit
(which you are probably writing anyways) you automatically get the cross-cutting functionality as well?
The proposed solution
The typical case (N = 2)
We propose a new language feature, 'Cross-Import Overlays', that allows framework authors to provide cross-cutting functionality in a modular fashion without having to burden users with additional import clutter.
Consider the example shown earlier. Say NumericKit decides to expose an overlay for FormatKit. NumericKit can then add a new JSON file, FormatKit.json, under an Overlays subdirectory in its swiftmodule:
NumericKit.framework/Modules/NumericKit.swiftmodule/Overlays/FormatKit.json
The contents of this JSON file will tell the compiler what the name of the overlay module is.
{
"version": 1,
"modules": [
{"name": "_NumericKitFormatKitAdditions"}
]
}
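As a rough illustration of how a tool might read such a file, here is a minimal sketch using Foundation's JSONDecoder. The type name OverlayDeclaration and the function parseOverlayDeclaration are hypothetical; only the version and modules[].name fields come from the example above, and this is not meant to reflect how the compiler would actually parse the file.

```swift
import Foundation

// Hypothetical model of an Overlays/<Module>.json file. Only `version` and
// `modules[].name` are taken from the pitch; the type names are illustrative.
struct OverlayDeclaration: Codable, Equatable {
    struct Module: Codable, Equatable {
        let name: String
    }
    let version: Int
    let modules: [Module]
}

// Decodes the JSON text of an overlay declaration file.
func parseOverlayDeclaration(_ json: String) throws -> OverlayDeclaration {
    try JSONDecoder().decode(OverlayDeclaration.self, from: Data(json.utf8))
}

// The contents of NumericKit's FormatKit.json from the example above.
let formatKitJSON = """
{
  "version": 1,
  "modules": [
    {"name": "_NumericKitFormatKitAdditions"}
  ]
}
"""

// `try?` turns a decoding failure into nil instead of aborting the script.
let declaration = try? parseOverlayDeclaration(formatKitJSON)
print(declaration?.modules.map(\.name) ?? [])
```

Because modules is an array, a single file could in principle list more than one overlay module for the same pair of underlying modules.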
Here, version represents the schema version for the JSON file and _NumericKitFormatKitAdditions is the name of the cross-import overlay. As a convention, we recommend that the name consist of an underscore, followed by the name of the primary underlying module (here NumericKit), followed by the name(s) of the secondary underlying module(s) (there’s just one here: FormatKit), followed by Additions.
(Aside: The exact naming scheme is up for discussion; this need not be the final one. One benefit of it being a convention, instead of say a new symbol like +, is that we don't need to change the grammar, parser and potentially other parts of the compiler and downstream clients.)
Once the compiler sees this module name, it can try to find a module named _NumericKitFormatKitAdditions using the usual search paths.
The expected behavior would be that when the compiler sees both NumericKit and FormatKit, it will try to find _NumericKitFormatKitAdditions, and if successful, the resulting code behaves as if import _NumericKitFormatKitAdditions had been explicitly written in the source.
The general case (N >= 2)
Say we want to add a cross-import overlay which has three underlying modules: NumericKit, FormatKit and TestKit. In such a case, we can add the following two JSON files:
# same as before
NumericKit.framework/Modules/NumericKit.swiftmodule/Overlays/FormatKit.json # contains
# new overlay
_NumericKitFormatKitAdditions.framework/Modules/_NumericKitFormatKitAdditions.swiftmodule/Overlays/TestKit.json
[Note: the second path was initially using NumericKit.framework, which was incorrect, as pointed out below. Thanks @Xi_Ge and @allevato.]
Once the compiler sees this directory structure, and all three of NumericKit, FormatKit and TestKit are imported, it will automatically import the two overlay modules as well:
import _NumericKitFormatKitAdditions // via FormatKit.json
import _NumericKitFormatKitTestKitAdditions // via TestKit.json
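The chaining described here can be modeled as a small fixed-point computation. The sketch below is only an illustration under stated assumptions: OverlayTable is a hypothetical in-memory stand-in for the Overlays/*.json files, resolveImplicitImports is a made-up name, and the sketch ignores search paths and the ambiguity cases discussed later.

```swift
// Hypothetical stand-in for the Overlays/*.json files:
// declaring module -> (secondary module -> overlay module names).
typealias OverlayTable = [String: [String: [String]]]

// Repeatedly adds overlays whose declaring module and secondary module are
// both visible, until no new overlay can be added.
func resolveImplicitImports(explicit: [String], overlays: OverlayTable) -> [String] {
    var visible = Set(explicit)
    var implicit: [String] = []
    var changed = true
    while changed {
        changed = false
        // Iterate over a sorted snapshot so the result order is deterministic.
        for declaring in visible.sorted() {
            guard let bySecondary = overlays[declaring] else { continue }
            for secondary in bySecondary.keys.sorted() where visible.contains(secondary) {
                for overlay in bySecondary[secondary, default: []] where !visible.contains(overlay) {
                    visible.insert(overlay)
                    implicit.append(overlay)
                    changed = true
                }
            }
        }
    }
    return implicit
}

// Mirrors the two JSON files from the example above.
let table: OverlayTable = [
    "NumericKit": ["FormatKit": ["_NumericKitFormatKitAdditions"]],
    "_NumericKitFormatKitAdditions": ["TestKit": ["_NumericKitFormatKitTestKitAdditions"]],
]

let implicitImports = resolveImplicitImports(
    explicit: ["NumericKit", "FormatKit", "TestKit"],
    overlays: table
)
print(implicitImports)
```

Note that the three-module overlay is only reachable through the two-module overlay, which is why the second JSON file lives in _NumericKitFormatKitAdditions rather than in NumericKit.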
Impact on name lookup
We recommend the following shadowing behavior:
- A cross-import overlay’s declarations shadow those in the primary underlying module.
- A cross-import overlay’s declarations do not shadow those in the secondary underlying module(s)
The second rule has a chance of creating ambiguity: what if there is a name collision between a declaration in a cross-import overlay and a declaration in one of its secondary underlying modules?
There is an existing pitch, Fully qualified name syntax, which tackles this issue in a more general setting. We defer to that proposal for the core mechanism and assume that we somehow have a way to fully qualify names using the module name. The question then becomes: what module name should be used to qualify declarations that come from the cross-import overlay? We suggest that the name of the primary underlying module be used.
Going back to our running example, say all of NumericKit, FormatKit and _NumericKitFormatKitAdditions have a function named blop() (yes, I realize the example is a bit contrived).
// Say the accepted syntax is using '::' for module qualification instead of '.'
import NumericKit
import FormatKit
// import _NumericKitFormatKitAdditions // overlay imported implicitly
blop() // error: ambiguous call to blop()
FormatKit::blop() // works
NumericKit::blop() // works: refers to the declaration from the overlay
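To make the two shadowing rules and the qualification rule concrete, here is a toy model of the lookup in the example above. Everything in it (Declaration, LookupResult, Scope, primaryOfOverlay) is hypothetical and only mimics the recommended behavior; it says nothing about how the compiler's name lookup is actually implemented.

```swift
struct Declaration: Equatable {
    let name: String
    let definingModule: String
}

enum LookupResult: Equatable {
    case found(Declaration)
    case ambiguous([Declaration])
    case notFound
}

struct Scope {
    var declarations: [Declaration]
    // Overlay module name -> its primary underlying module.
    var primaryOfOverlay: [String: String]

    // Rule 1: an overlay's declaration shadows the same-named declaration in
    // its primary module. Secondary modules are never shadowed (rule 2).
    private func applyShadowing(_ candidates: [Declaration]) -> [Declaration] {
        let shadowedPrimaries = Set(candidates.compactMap { primaryOfOverlay[$0.definingModule] })
        return candidates.filter { !shadowedPrimaries.contains($0.definingModule) }
    }

    func lookup(_ name: String, qualifiedBy module: String? = nil) -> LookupResult {
        var candidates = declarations.filter { $0.name == name }
        if let module = module {
            // Overlay declarations are qualified with the primary module's name.
            candidates = candidates.filter {
                $0.definingModule == module || primaryOfOverlay[$0.definingModule] == module
            }
        }
        let visible = applyShadowing(candidates)
        switch visible.count {
        case 0: return .notFound
        case 1: return .found(visible[0])
        default: return .ambiguous(visible)
        }
    }
}

let overlayName = "_NumericKitFormatKitAdditions"
let scope = Scope(
    declarations: [
        Declaration(name: "blop", definingModule: "NumericKit"),
        Declaration(name: "blop", definingModule: "FormatKit"),
        Declaration(name: "blop", definingModule: overlayName),
    ],
    primaryOfOverlay: [overlayName: "NumericKit"]
)

let unqualified = scope.lookup("blop")                           // ambiguous
let viaFormatKit = scope.lookup("blop", qualifiedBy: "FormatKit")   // FormatKit's blop
let viaNumericKit = scope.lookup("blop", qualifiedBy: "NumericKit") // the overlay's blop
```

In this model there is no way to reach NumericKit's own blop() once the overlay is visible, which matches the behavior described in the text.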
One might wonder: “how can you refer to the declaration from NumericKit?” It’s not possible to do that. We anticipate that it is extremely unlikely that the owners of the cross-import overlay (who would also own the primary underlying module) would want to use the same name in both modules and have them both be usable. If they do want both to be usable, the simplest way to achieve that is to give the two declarations different names from the beginning.
Interaction with other kinds of imports
- @_exported imports: The cross-import overlay is @_exported-imported if all the underlying modules are also @_exported-imported.
- @_implementationOnly imports: The cross-import overlay is @_implementationOnly-imported if any of its underlying modules is @_implementationOnly-imported.
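The two rules above can be written down as a single function. This is a sketch only: ImportKind and overlayImportKind are hypothetical names, and giving @_implementationOnly precedence when the underlying imports are mixed is simply what falls out of applying the "any" rule and the "all" rule together.

```swift
// Hypothetical model of how an underlying module was imported.
enum ImportKind {
    case regular
    case exported            // @_exported import
    case implementationOnly  // @_implementationOnly import
}

// Computes how the cross-import overlay is implicitly imported, given how
// each of its underlying modules was imported.
func overlayImportKind(underlying: [ImportKind]) -> ImportKind {
    // "Any" rule: one @_implementationOnly underlying import is enough.
    if underlying.contains(.implementationOnly) {
        return .implementationOnly
    }
    // "All" rule: every underlying import must be @_exported.
    if !underlying.isEmpty && underlying.allSatisfy({ $0 == .exported }) {
        return .exported
    }
    return .regular
}

let bothExported = overlayImportKind(underlying: [.exported, .exported])
let oneExported = overlayImportKind(underlying: [.exported, .regular])
let oneImplementationOnly = overlayImportKind(underlying: [.exported, .implementationOnly])
```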
Alternatives
Maintain the status quo
As argued in the beginning, this is not a good state of affairs as it prevents framework authors from providing a natural home for cross-cutting functionality.
Have a single Overlays.json instead of individual JSON files
One might ask: why have so many individual JSON files, one per overlay, instead of having one combined Overlays.json file that describes all of them? One reason to have separate files is that, in a build system, they avoid the race conditions that could occur if multiple jobs happened to modify the same Overlays.json file at once.
Use special feature flags to enable extra functionality
For example, one could have a framework-level feature flag. The flag would need to be strictly additive: if it could remove definitions, and one happened to transitively depend on the framework with the flag both on and off, one would need both copies of the framework.
We think this solution is not as good as our approach outlined above because:
- It requires more implementation work: we would need to come up with a design for strictly additive feature flags and the implementation would probably touch many different parts of the compiler.
- It increases the amount of work the compiler needs to do: we would need to make sure that the framework compiles successfully both with the flag on and off.
- It will lead to poorer build parallelism and incrementality: (assuming library evolution is enabled) changes to the flag-imported framework might end up rebuilding the whole flag-using framework (which might be large), instead of just rebuilding the overlay (which would probably be smaller).
We think the cross-import overlays approach is cleaner in terms of implementation and is likely to lead to better build times.
Concerns and Limitations
How will ambiguities be resolved in case there are multiple possible candidates?
Say we somehow end up having two modules, _NumericKitFormatKitAdditions and _FormatKitNumericKitAdditions. How will the compiler pick which one to import? Will it import both? Will it import neither?
We think that if both of these modules are present, it is probably a bug, either in some code or in the communication between framework authors. We don't think there is a good engineering reason for such ambiguous cases to arise in practice. If they do, the compiler will emit a note (not a warning) asking the developer to contact the framework authors about the situation, and will not implicitly add any import statement.
The downstream developer may still write out import _NumericKitFormatKitAdditions (or import _FormatKitNumericKitAdditions, or both) if they wish to do so.
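The decision described above could look roughly like the following sketch. OverlayDecision, decideOverlay and the declared table are all hypothetical; in particular, how the compiler would actually discover the two competing overlays is not specified in this pitch, and the sketch simply assumes each module declares its own overlay for the other.

```swift
// Hypothetical outcome of looking for an overlay for a pair of imported modules.
enum OverlayDecision: Equatable {
    case importImplicitly(String)  // exactly one candidate overlay
    case noteAndSkip([String])     // several candidates: emit a note, import nothing
    case noOverlay                 // neither module declares an overlay for the other
}

// `declared` is a stand-in for the Overlays/*.json files:
// declaring module -> (secondary module -> overlay module name).
func decideOverlay(
    for first: String,
    and second: String,
    declared: [String: [String: String]]
) -> OverlayDecision {
    // Collect the overlay each module declares for the other, if any.
    let candidates = [declared[first]?[second], declared[second]?[first]].compactMap { $0 }
    switch candidates.count {
    case 0: return .noOverlay
    case 1: return .importImplicitly(candidates[0])
    default: return .noteAndSkip(candidates)
    }
}

// Both frameworks ship an overlay for the same pair: the ambiguous case.
let conflicting = [
    "NumericKit": ["FormatKit": "_NumericKitFormatKitAdditions"],
    "FormatKit": ["NumericKit": "_FormatKitNumericKitAdditions"],
]

let decision = decideOverlay(for: "NumericKit", and: "FormatKit", declared: conflicting)
```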
Will my framework be able to provide retroactive cross-import overlays, i.e., overlays for which all underlying modules are third party modules?
No. For example, if you work at Capitalist Enterprise Inc., and your CapitalistEnterprise.framework contains a module StockKit, then it can provide an overlay for StockKit + NumericKit. However, it cannot provide an overlay for NumericKit + FormatKit. Allowing such overlays would make things more confusing for downstream clients (adding a new, seemingly unused framework somehow changed the build, huh?), and would increase the I/O the compiler needs to do because such retroactive overlays could be in any framework.
Will this adversely affect diagnostics and autocomplete in an IDE?
Since the exact naming scheme will be decided up front, the compiler's diagnostics, and the information it gives to downstream IDE clients can be adjusted to provide more information on whether a declaration came from a cross-import overlay or not.
Similarly for autocomplete: we expect that the exact name of the cross-import overlay will be hidden in most cases, but one could perhaps ask for it explicitly if one wanted.
We expect the behavior here to be the same as that for the existing overlays.
Will I be able to turn this off if I don't want these invisible imports?
We do not expect many people to want to turn off these automatic imports. So the initial version of the feature will not have the ability to turn it off. If usage reveals that, yes, this is indeed something desired by many developers, we can add an attribute to achieve this functionality, say something like (strawman syntax):
@skipOverlay("FormatKit")
import NumericKit
While we don't want people to do this, pedantically speaking, it would be possible to manually pass framework/include paths to contrive a situation in which the cross-import overlay is not present on the search path. In such a situation, the compiler will emit a note about the missing overlay, not a hard error.
Will this create additional challenges for my company's build system?
We don't think it will. The design is very deliberate in that the paths where swiftmodules are located are kept unchanged, so that the caching logic in a build system doesn't need to be changed. The only change is to how the compiler figures out which module needs to be imported.
The compiler's emitted dependencies will certainly contain paths to the cross-import overlays that were imported, as one might expect.
Will these replace the existing overlays?
Conceptually, the existing overlays may be thought of as cross-import overlays with the primary underlying module being the standard library. Since the standard library is implicitly imported by every module, the effect is that adding a single import gets you the overlay as well.
It would take some effort to make sure that this works well in practice without any gotchas. So for now, the existing overlays will continue to exist as they are. In the future, we might change that if we are able to do so in a source-compatible and binary-compatible way.
Will I be able to "sink" code from a module to a cross-import overlay?
No, you can't do that, as that would break source compatibility for clients who only import one of the underlying modules. It would also break binary compatibility as the linker would not know that the cross-import overlay library also needs to be linked.
More generally, it is not possible to move code from a module to one of its dependents without breaking compatibility.
What about submodules?
Submodules do not solve the same problems that cross-import overlays do, so they are not an alternative to cross-import overlays. It is hard to speculate about features for which there are no active pitches/proposals — we anticipate that if submodules are added in the future, this design will still work without many compromises.