Supporting strict imports

Keith · December 3, 2020, 6:36pm

We have a large multi-module Swift codebase where we want to support strict imports, meaning files only import what they use, and they import everything they use (not particularly at the class level with import class Foo.bar, but at least import Foo). We have a few specific reasons for wanting this to be enforced:

When reading a diff it makes it clear when a new dependency on a module is being introduced
When this is explicitly defined we can enforce what types of modules are allowed to import what other types of modules, for example we don't want modules with business logic importing our UI layer etc
By trimming unused imports, we can eliminate unnecessary intermodule dependencies, simplifying the dependency graph and speeding up compilation

We've hit a bunch of roadblocks trying to support this over time and I'm curious what folks think about a solution to this problem could be.

Currently we use SwiftLint's unused_import rule for this. It works by requesting the cursor info from every location in every source file, and cross referencing that with the imports in the file. Unfortunately this approach has some false negatives due to cases like this bug (I don't want to harp to much on the specific bugs since I'm not sure they'll ever be fixed given source breaking concerns + other reasons I'm probably unaware of). There was also an attempt to switch to using the SourceKit index instead, but it seems there were similar issues there. There are also some other difficulties with these approaches in terms of outlier cases like custom operators, and name collisions from multiple modules defining type with the same name. If the SourceKit index had a more complete set of dependencies containing these outliers, we could rely on it for this kind of tooling.

I've also done a bit of testing with other approaches such as trying to dependencies determine from the symbols in the binary, fetching info in the swiftmodule file, using -emit-imported-modules, or using -emit-loaded-module-trace, but I haven't been able to come up with any alternatives that successfully cover the cases mentioned above.

We briefly brainstormed some solutions to this that I would love to have folks input on.

We could ignore the specifics of this issue, and just try and fix the “bugs” causing these. This would likely make the SourceKit index usable for this use case
We could make the compiler produce a warning for unused / implicit imports (potentially with a flag if that felt too aggressive by default)
We could provide another -emit flag that dumped more complete info about the modules a specific file depended on

Overall I'm curious about a few things:

Is this something people agree we should solve?
Do any of the solutions above sound reasonable, are there other potential solutions to solving this?

Thanks to @jpsim for help writing this up

jpsim · December 10, 2020, 1:32pm

I'd like to ask the sourcekit folks if there's any reason why we shouldn't extend the index's key.dependencies values to include more modules used in a file, such as operator overloads, protocol conformances, and initializers. Perhaps there's a good reason why the dependencies in the index aren't comprehensive. If there isn't, and this is considered a bug, then we can make a push to cover the known missing cases. cc @akyrtzi @rintaro

akyrtzi · December 10, 2020, 5:38pm

Could you clarify with more details, I don't quite follow..

jpsim · December 10, 2020, 5:41pm

If we remove all the imports in a file that aren't in its index's key.dependencies, the file fails to compile in some cases, such as when the module being imported was being used for its operator overloads, or initializers, or protocol conformances.

Would it be acceptable for us to identify these cases and add those missing dependencies to the index's key.dependencies or is there a reason why these are sometimes missing?

rintaro · December 10, 2020, 5:55pm

MyFW/MyFW.swift

public extension String {
  func foo() {}
}

MyApp/AppDelegate.swift

import MyFW
...

MyApp/ViewController.swift

import UIKit
// NOTE: no 'import MyFW'

class ViewController: UIViewController {
  override func viewDidLoad() {
    super.viewDidLoad()
    "".foo() // no error!
  }
}

String.foo() is visible in ViewController.swift even without import MyFW. So ViewController.swift actually depends on MyFW framework. But we don't include it in key.dependencies in source.request.indexsource response.

@jpsim Is that what you are saying? If so, could you file a bug with concrete example for operator overloads, initializers, and protocol conformances?

rintaro · December 10, 2020, 5:58pm

This is actually a bug in compiler [SR-3908] Extension methods visible without importing module in a file · Issue #46493 · apple/swift · GitHub , but I don't expect it'll be fixed in near future as it's source breaking.

jpsim · December 10, 2020, 6:25pm

Yes, Keith and I are referring to issues like that one. However, adding MyFW to ViewController.swift's dependencies wouldn't be a breaking change. What I'd really like to know is if you'd consider that a bug or if there's something that you know of that depends on those modules being missing from the dependencies deliberately. It'd be good to know how you feel about this before we put in the effort to collect all these cases with minimal reproduction examples.

blangmuir · December 10, 2020, 7:01pm

The test case @rintaro posted above may be related, but I don't think on its own it would cause

If we remove all the imports in a file that aren't in its index's key.dependencies , the file fails to compile in some cases

The module MyFW is still listed in the dependencies for AppDelegate.swift, so it wouldn't get removed by the rule described.

Do you have a full example to make sure we fully understand the issue?

I'm not necessarily opposed to including these unfortunate implicit dependencies in key.dependencies if it's not expensive to compute, but it would be good to make sure we fully understand the use case and impact first.

typesanitizer · December 10, 2020, 7:28pm

Could you talk about how you tried using the module trace and what issues you encountered with it? I also have a PR up for more fine-grained information as a module graph: [ModuleTrace] Emit import graph to ease debugging import issues. by varungandhi-apple · Pull Request #33980 · apple/swift · GitHub, but right now it's setup mainly for debugging and I haven't had the chance to work on it lately.

Keith · December 11, 2020, 12:00am

I've filed [SR-13954] Sourcekit index request doesn't include polluted transitive dependencies · Issue #56351 · apple/swift · GitHub which shows an example of how key.dependencies is missing the polluted module dependency

Keith · December 11, 2020, 12:44am

I didn't hit any specific issues with that, it's just that it didn't shine a light on the specific pollution issues. For example whether or not you were using something in the example above from the polluted module, all transitive swiftmodules were represented the same way in the output json file. Specifically in this example you see:

".build/x86_64-apple-macosx/debug/transitive.swiftmodule",
".build/x86_64-apple-macosx/debug/model.swiftmodule",
".build/x86_64-apple-macosx/debug/extender.swiftmodule",

Whether or not you're using polluted symbols from extender.

My last ditch hope was that I could get something from the detailed info, but none of the attributes in this case (whether or not you use the polluted symbol) reflected what I was hoping for:

{
    "name": "extender",
    "path": ".build/x86_64-apple-macosx/debug/extender.swiftmodule",
    "isImportedDirectly": false,
    "supportsLibraryEvolution": false
},

Any idea if your proposed changes there would shed light onto whether or not the polluted module was being used?

typesanitizer · December 11, 2020, 2:10am

Okay, got it. I'm not entirely sure what a good short-term solution to your problem would look like. The module trace and import graph are certainly not sufficiently granular for your use case.

One of the possibilities you discuss is warning on unused/implicit imports: IMO this is a no-go because it runs the risk that you pointed out; someone looks at the warning, deletes the import, and then it breaks something downstream. The core problem is that the default import is equivalent (as undesigned behavior) to something like @_exported(extension, operator). You could do this kind of check for @_implementationOnly imports, which as you've seen don't suffer from this problem, but I don't think there are many gains to be had unless you are already using @_implementationOnly imports pervasively.

I think one potential option that might be worth considering is warning on usage of indirect import. After type-checking, compute the subgraph of loaded modules which are reachable via @_exported imports, including the current module (the module trace has logic similar to this for computing the isImportedDirectly field). Then walk over all decls referred to in the module's AST and emit a diagnostic + import fix-it if the decl originates in a module outside that subgraph.

This "diagnostic" can either be a proper diagnostic or a separate compiler output (behind some -emit frontend flag). The latter is likely to get less resistance, but then it also doesn't offer easy IDE integration with Xcode. The former... I imagine we'd have to test it out and see how many warnings it emits in practice. If it's too many, there is likely to be push-back behind having additional warnings behind a compiler flag (we don't have precedent for it).

That's the short-term aspect. IMO the ideal medium/long-term fix would be to have a new kind of import that is something like (strawman name) @_exported(none), as opposed to the default ~ @_exported(extension, operators) import kind we have today. Or perhaps (strawman name) @_fileExported(none), i.e. even other files in the same module can't see the contents of the imported module. That said, from what I understand (which is not a lot), changing name lookup to work properly with a new import kind might be tricky. And you'd have to go in and audit your modules to use this new attribute for imports wherever possible.

Keith · December 11, 2020, 6:55pm

IIUC what you're saying you mean in my example we'd get a warning in the file that only imports a module, and removing it there would break a module upstream? IMO this would be a fine case for a warning since that behavior is not expected and the upstream module should import its dependencies instead, but I understand that might not work in practice depending on how we think about warnings leading to breakages (my interpretation of the current policy is that it would actually be ok since it's "just" a warning and warnings are "allowed" to warn about breaking changes as long as it still builds).

So we actually have gone down this path wherever we can, and most critically we've done this for one of our modules that exclusively adds extensions on system types so it was disproportionally polluting other modules. But we've generally hit some problems here such as [SR-11910] @_implementationOnly types not allowed as private properties on public types · Issue #54328 · apple/swift · GitHub that lead us to not really be able to use this much unfortunately.

This would still be an improvement for sure, even though it wouldn't solve removing unused imports.

I like the goal of this but I think it would be tough to justify significant extra syntax for this case not only from a user facing complexity point of view but also just a usage pov. I don't think we would want to change / enforce every import be annotated with this.

typesanitizer · December 11, 2020, 7:21pm

Yes. (Although, I would probably say 'dependent' instead of 'upstream' just because 'upstream' vs 'downstream' is often confusing.)

Right, if a warning comes with a suggestion (or a fix-it), often people will simply apply the fix-it without thinking too much. It would be dangerous if that changed the semantics of your program. Worse, in this case, it doesn't affect you, but it affects other modules which are importing yours, so you are likely to discover any breakage much later on.

But I think this would be definitely be okay as a separate compiler output.

Yeah... Thinking back to your suggestion in the original post, maybe we can combine the two.

Initially, the module graph has indirect dependencies + redundant imports. Enable the warnings on usage of indirect imports. Fix those warnings. Now the module graph has explicit dependencies + redundant imports. Next, you warn on unused imports. Because the module graph has explicit dependencies now, you can remove these redundant imports without breaking dependent modules.
That way, you end up with a module graph with explicit dependencies + no redundant imports. After that, you can have both warnings be turned into errors, so nobody can break the explicit + no redundancies as invariants.

Of course, it's much easier said than done, but having the two in concert might be better than having one or the other.

I'm not suggested every import be annotated with this, but I think this is one possible good default for the ~majority of imports (and better than what we have today), IF (that's a big if) we are able to change the default import semantics for (say) Swift 6 or Swift 7. Before we can change that, we'd need a way to have early adopters start using it and provide feedback to guide refinements to the semantics.

Keith · December 11, 2020, 7:28pm

I agree with the sentiment here but IMO if folks were onboard with fixing this in general this short term, not required, migration pain may be worth it.

For us I guess this would be fine too, but I agree with what you implied above that this would greatly limit how many folks would benefit from this since it would heavily complicate usage.

+1 I think this would be a great migration plan that would actually alleviate the concerns above I believe.

Sorry I wasn't more clear, my point here was as someone who wants this everywhere, we would have to annotate every import with this to get the behavior we wanted, and therefore I imagine very few folks would use it (unless we swapped the defaults).

Keith · December 11, 2020, 10:32pm

For more context on the original issues here's what I believe to be all the distinct pollution / missing data issues:

issue	tickets	workaround
protocol conformances from other modules are missing from index data	[SR-13959] Protocol conformance in other module not present in sourcekit index · Issue #56356 · apple/swift · GitHub	none
initializers fail cursor info	[SR-13958] Sourcekit cursor info request with USR fails for initializers · Issue #56355 · apple/swift · GitHub [SR-13140] Impossible to get cursorinfo for trailing closure initializer · Issue #55587 · apple/swift · GitHub [SR-13120] SourceKit syntaxMap doesn't contain info about trailing closure initializers · Issue #55566 · apple/swift · GitHub	get data from index instead
index missing polluted things in `keys.dependencies`	[SR-13954] Sourcekit index request doesn't include polluted transitive dependencies · Issue #56351 · apple/swift · GitHub	get data from cursor info / index USRs instead
extensions from other modules are polluted without imports	[SR-11901] extensions on directly depended on types are polluted from transitive dependencies · Issue #54318 · apple/swift · GitHub	cursor info should still contain the right place (except initializers above)

If my list here is actually exhaustive I believe we can almost achieve what we want by fetching both sourcekit's index data + cursor info for every offset in the file, and cross reference those with the imports. Then the only case we'd miss would be dependencies on modules that define protocol conformances for types they do not define.

I'm still interested in the options above, I just wanted to more accurately represent the state of today.