@_exported and fixing import visibility

jrose · February 1, 2018, 4:34am

I told @Jon_Shier I'd write something up on @_exported and visibility in general, so here it is. It's a bit of a brain dump, but I know if I wait until I have time to edit it, it won't get posted for months. (That's already true, in fact; I've had this model kicking around my head for probably close to a year now.) So, thanks for the prompt, Jon.

Today, if your Swift library "Foo" uses a library "Bar", the headers (or swiftmodule files) of "Bar" must be available to all clients of "Foo". This is clearly undesirable, especially when "Bar" is just an implementation detail. (It may even be linked statically.)

At the same time, if part of Foo's public interface uses a type from Bar, the interface for "Bar" must be available. Otherwise, a client of Foo wouldn't be able to use that particular API from Foo.

(Technically, we could make this more fine-grained: a client of Foo can't use that particular API without Bar being available, but they can use other APIs. But I'm not sure if that level of control is worth it at the moment.)

Note that Swift already has a fairly robust (and well-understood) model for access control. However, just because an entity is public does not actually mean it is visible; you may not have imported its module yet. This proposal is only concerned with modifying the rules for visibility.

On the flip side, there is no supported way to get the effect of C headers in Swift. With C, you can make something like Cocoa.h, whose only purpose is to make the Foundation, AppKit, and CoreData modules available to clients all in one go. Or you can have UIKit re-export Foundation, so that import UIKit is enough to get Foundation as well. In practice, developers for Apple platforms have gotten very used to this model in both Objective-C and Swift; most people don't care that some of the APIs made available through import UIKit are actually from the MobileCoreServices module.

Today's Swift is designed more like Java or C# or Python, in that if you import Bar in the implementation of Foo it doesn't affect clients who import Foo. Or, well, it doesn't make the top-level names of Bar visible to clients who import Foo.

1. You still need Bar around, because the compiler doesn't track whether you've used one of its types in Foo's public interface. (That's the previous section.)
2. Extensions in Bar are still made visible to clients who import Foo, because the compiler doesn't distinguish where extensions come from today.
3. Operator declarations in Bar are still made visible to clients who import Foo, because the compiler finds operators in a different way than it finds everything else at the top level.

(2) and (3) are closer to "bugs" than "features", but we may have no chance of fixing them at this point, because there'd be so much code depending on them, and we don't actually have an implementation of the "right" model for either.

But we're supposed to be looking on the "feature" side now, too. What if I want to make the top-level names of Bar visible to clients who import Foo, the way UIKit exposes the types in Foundation?

(There's a non-supported attribute for this, @_exported. It pretty much works, as far as I know.)
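As a rough illustration of what that attribute looks like in use, here is a minimal sketch of one source file in a hypothetical library module "Foo". The module name and the function are made up for the example, and since @_exported is underscored and unsupported, its behavior may change.

```swift
// Foo.swift: one source file in a hypothetical library module "Foo".
// @_exported is an unsupported, underscored attribute.
@_exported import Foundation

// Date comes from Foundation. Because Foundation is re-exported above,
// a client that writes only `import Foo` should also be able to name
// Date directly, without its own `import Foundation`.
public func secondsSinceReferenceDate(_ date: Date) -> Double {
    return date.timeIntervalSinceReferenceDate
}
```

Without the attribute, a client of Foo would have to add its own import of Foundation before it could spell the Date type.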

Bonus complication: on Apple platforms, we generate headers for the parts of a framework that are exposed to Objective-C, and those headers have (Objective-C) imports of their own. The net effect is that some modules end up getting re-exported when you import a library...but it's only those that are needed to define the Objective-C interface. What a mess!

While I think the Java / C# / Python model is a good one, I don't think there's a good path to get there for Swift, particularly on Apple platforms where its visibility model has to play with C's. So the way I think we should go is something like this: Formalize @_exported, then require that all types exposed in a module's public API come from @_exported modules.

Then we have a simple rule: any time you can see a type in a particular API, you can also see the (public) contents of the module that defines that type, regardless of whether you imported it directly. This is consistent with the behavior of the generated Objective-C header on Apple platforms today.

I would implement this requirement with a warning, not an error. That is, if a type is not available from the current file's set of re-exported imports, the compiler would emit a warning and then implicitly re-export the module containing the type.
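To make the warn-then-re-export behavior concrete, here is a toy model of the check. It is only a sketch: ModuleModel and checkPublicAPI are hypothetical names, nothing here corresponds to real compiler code, and a real implementation would walk declarations rather than string tables.

```swift
// A toy model of the proposed rule; none of these types exist in the compiler.
struct ModuleModel {
    let name: String
    var reexported: Set<String>            // modules imported as re-exported
    var publicAPI: [String: Set<String>]   // public decl -> modules defining its types
}

// Emits one warning per (declaration, module) pair where a public declaration
// uses a type from a module that is not re-exported, then re-exports that
// module implicitly so the rule holds afterwards.
func checkPublicAPI(_ module: inout ModuleModel) -> [String] {
    var warnings: [String] = []
    for api in module.publicAPI.keys.sorted() {
        for dependency in module.publicAPI[api, default: []].sorted() {
            if dependency == module.name || module.reexported.contains(dependency) {
                continue
            }
            warnings.append("warning: '\(api)' uses a type from '\(dependency)', which is not re-exported")
            module.reexported.insert(dependency)
        }
    }
    return warnings
}

var foo = ModuleModel(
    name: "Foo",
    reexported: ["Foundation"],
    publicAPI: [
        "fetch(_:)": ["Foundation", "Bar"],
        "reset()": ["Foo"],
    ]
)
let firstPass = checkPublicAPI(&foo)    // warns about Bar, then re-exports it
let secondPass = checkPublicAPI(&foo)   // nothing left to warn about
```

In this model the first pass produces a single warning for Bar, and the second pass is silent because Bar has already been added to the re-exported set.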

This doesn't directly solve any of the "too much is visible" problems I mentioned above, but I think it will help us get to a point where we no longer need to have Bar available to import Foo. We're not there yet because the compiler currently expects to have full knowledge of every type it deals with, even internal and private members, but I think it's a step in the right direction. It fixes the semantics to be something sensible and consistent with C, giving us space to work towards dropping the non-re-exported dependencies altogether.

To cap it off, I would suggest that the natural spelling for a proper supported @_exported is public, as in public import Foundation.

hlovatt · February 1, 2018, 4:51am

I agree with your assessment. In particular, that 2 and 3 are probably bugs and that a method of exporting imported 'headers' is needed, and I like your proposed public import XXX. It is a shame 2 and 3 can't be tackled, because both are problematic for me. If 2 and 3 were deprecated, I am not sure how much code this would actually break; when I have come up against these issues I didn't want this behaviour and had to code round it, e.g. by picking a different operator symbol. Therefore, if 2 and 3 were treated as bugs and fixed, it wouldn't break my code. Maybe that is true of other people's code too, in which case we could fix 2 and 3.

davedelong · February 1, 2018, 4:53am

I love this!

Can we take this a step further?

Let's say I have ModuleA and ModuleB and ModuleC.

ModuleB wants to import ModuleA as part of its public interface:

public import ModuleA

However, ModuleC wants to import ModuleB, but only as an internal implementation detail:

internal import ModuleB

This would mean that our existing import ModuleC code, which only exposes the symbols from that module to the current file, would more accurately be described as:

private import ModuleC // or fileprivate import ModuleC

anandabits · February 1, 2018, 4:14pm

Hi Jordan, thanks for posting this! I agree that 2 and 3 are troublesome. I would really like to see this fixed. We should not be getting any symbols from a module Foo that we did not import directly, or at least indirectly after importing a module that uses your proposed public import Foo.

I appreciate your aside:

> (Technically, we could make this more fine-grained: a client of Foo can’t use that particular API without Bar being available, but they can use other APIs. But I’m not sure if that level of control is worth it at the moment.)

A significant consideration here is that if we don't do this, it reduces our ability to control the dependencies of our project. I may wish to depend on a 3rd party module A that depends on modules B, C and D. However, I only want to depend on APIs in module A that use types from A and B. I do not need to use APIs that use types from C and D and I do not want to take on a dependency on C and D.

In theory, it may be possible to factor these APIs in A out into a separate module. In practice, I do not have control over 3rd party module A and it isn't practical to create factorings to satisfy all users.

In theory it may also be possible to simply use discipline to avoid the APIs in A that involve C and D. In practice, on large teams with mixed skill levels, enforcing such discipline is challenging.

If we want Swift to give people precise control over their dependencies and encourage a robust library ecosystem I think we should give consideration to a finer-grained approach. At the same time, I do think the "you get everything" approach is appropriate in some cases. import UIKit is obviously a good example.

When you say "I’m not sure if that level of control is worth it at the moment" are you thinking primarily in terms of value to users or effort to implement?

jrose · February 6, 2018, 1:34am

General reply first: I forgot to mention that requiring public import for all visible things sneakily "fixes" (2) and (3) if we get to a world where we don't need to import the non-public things at all. Unfortunately, source compatibility, so it might end up being locked to a language version or compiler option or something.

More specifically to Matthew's point:

I was wondering if someone was going to call me on this. :-) Swift isn't set up in general to handle this today, because it's in the C world of "compile a library, which has run-time dependencies", but also has the additional runtime reflection features that make it very hard to eliminate dead code. There's a very attractive solution here for modules distributed as source: expose custom options (via Package.swift, probably) that turn into conditional compilation guards for the library. (Or do something very clever with canImport.)
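For the canImport route, a minimal sketch might look like the following. Foundation stands in for the optional dependency here, and hasDateSupport, describe(_:), and libraryName() are hypothetical names invented for the example; this only works for libraries built from source, as noted above.

```swift
// Hypothetical library source. The Date-based API is only compiled in
// when the optional dependency (Foundation, as a stand-in) can be imported.
#if canImport(Foundation)
import Foundation

public let hasDateSupport = true

public func describe(_ date: Date) -> String {
    return "t=\(Int(date.timeIntervalSince1970))"
}
#else
public let hasDateSupport = false
#endif

// This API has no optional dependencies, so it is always available.
public func libraryName() -> String {
    return "Foo"
}
```

The custom-option variant would have the same shape, with the guard written against a compilation condition set by the build system instead of canImport.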

But that doesn't solve all the problems; in theory we want to get to a world where we can distribute binary libraries too, and then we want those libraries to have conditional dependencies. That requires more than just this "conditional re-exporting"; it also means making sure we don't end up with unexpectedly-undefined symbols when we try to link the library into the app. (I'm willing to punt on protecting someone who uses the runtime to, say, instantiate a type whose initializer calls something from a missing library.)

@John_McCall and I have had cool thoughts in the past around a "generalized availability model", some of which made it into the "Library Evolution" planning doc / laundry list that's guiding a bunch of our efforts this year. I think some of that would help—if you conditionally import something, you'd guard all of its uses with @available(MagicKit) or something. But I'm wary of over-design, and I'd like to have the main problem in a better shape sooner rather than later. So I don't think it would be the end of the world to punt on that now and have something like @conditional public import MagicKit if/when we decide to support that.

anandabits · February 6, 2018, 4:57pm

Thanks for the reply @jrose.

> General reply first: I forgot to mention that requiring public import for all visible things sneakily “fixes” (2)

Sure, at the cost of requiring us to transitively depend upon any modules whose symbols are used in the public interface of a module we depend upon.

> and (3) if we get to a world where we don’t need to import the non-public things at all. Unfortunately, source compatibility, so it might end up being locked to a language version or compiler option or something.

Can you elaborate on this? I don't quite follow it.

> There’s a very attractive solution here for modules distributed as source: expose custom options (via Package.swift, probably) that turn into conditional compilation guards for the library. (Or do something very clever with canImport.)

Are you talking about allowing library A to detect whether library B is available using a conditional compilation feature of some kind? This might be interesting, but it also sounds quite complex and isn't what I was thinking.

My concern is primarily that I want to be able to control whether my module directly depends on module B. If I choose not to depend upon B then clearly the APIs from module A that make use of symbols declared in module B will not be available to my module. Further, no symbols (including extensions and operators) declared in module B will be available to my module.

However, I don't mind if module B is available to module A internally - no conditional compilation required. This indirect dependency is a separate issue. I want to be aware of it but will usually not be as concerned about the indirect dependency. The flexibility I want to preserve is the ability to replace module A with some other module providing similar functionality. Specifically, I don't want any possibility that code depending directly on module B creeps into the codebase because importing module A implicitly results in importing module B. In fact, I would prefer if the project-wide linker configuration prevents import B in my code: I do not want the direct dependency.

To make this more concrete, let's imagine a networking library which makes liberal use of a Result module in its APIs. Now imagine this library adopts async / await (assuming that is added to Swift) as an alternative set of APIs. A lot of user code still depends upon the Result-based APIs so this library continues to support these APIs.

In this context, somebody is writing an app and wants to ensure async / await is used rather than callback-based APIs. They do not want to depend upon Result and do not want it to be available to their code. But they do want to use the async / await family of APIs provided by the networking library. This is a pretty pragmatic example of the kind of control over dependency management that I think is most important. Does this example help?
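The rule being asked for in this example can be sketched as a simple filter: an API is usable only if every module its signature depends on has been imported by the client. This is a toy model, and the API struct, usableAPIs, and the module and function names are all hypothetical.

```swift
// Toy model of "you can only use API that relies on modules you imported".
struct API {
    let name: String
    let requiredModules: Set<String>   // modules whose types appear in the signature
}

func usableAPIs(_ apis: [API], clientImports: Set<String>) -> [String] {
    return apis
        .filter { $0.requiredModules.isSubset(of: clientImports) }
        .map { $0.name }
}

// A networking library with callback-based and async variants of one call.
let networking = [
    API(name: "fetch(_:completion:)", requiredModules: ["Networking", "Result"]),
    API(name: "fetch(_:) async", requiredModules: ["Networking"]),
]

// The app imports Networking but deliberately not Result.
let usable = usableAPIs(networking, clientImports: ["Networking"])
```

Under this model the app sees only the async variant, and nothing from Result becomes available to it as a side effect of importing the networking library.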

michelf · February 6, 2018, 6:05pm

This is a bit like Removing File Scope Restriction for Import Statements, where you limit the visibility of the imported symbols to a specific scope. Except here, we're expanding the scope to the whole module (with internal) or to other modules (with public).

jrose · November 30, 2018, 1:47am

I'm starting to pick this up again (though I'm also still working on module stability), and it seems like we've still got some open questions.

@anandabits's concern about unwanted indirect dependencies

(which I didn't respond to back in February)

Hm. Other than the "APIs from SomeNetworkingLibrary that rely on CustomTypeKit shouldn't be visible if I haven't imported CustomTypeKit" part, this is pretty close to the Python/Java/C# model, i.e. the model Swift was originally aiming for. Which I think indicates the following analysis:

If we wanted to stick to what we have today, we'd have two, possibly three kinds of imports:

implementation detail imports, not to be exposed to others except possibly as link dependencies (new, the problem I'm most trying to solve)
imports used anywhere in API/ABI, which must be present for a client to import this module
re-exported modules (optional, but highly requested and basically already working)

If we expanded that to include your "you can only use API that relies on types you've already imported"*, the list changes a little:

implementation detail imports, not to be exposed to others (same as before)
imports used in API/ABI, which may be present on the client side
re-exported modules
possibly another kind for "imports used in API/ABI that are pretty much required because without them you can't do anything, but still not re-exported"

We could leave that last kind off for simplicity's sake, though.

* This is a non-trivial amount of work, but it's at least conceptually similar to the work needed to check that types within the module are only using imports you can see.

The thing I don't like about both of these solutions is that they don't match up with C's model, but I suppose we could say "types used in @objc ways have to come from re-exported modules [possibly unless they're in a forward-declarable position]". For comparison, my proposal was a much more C-ish model:

implementation detail imports, not to be exposed to others (same as above)
re-exported modules

and that's it. It does close off some opportunities, but it's a way simpler model, and it defines away the need to fix the extension and operator problems.
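One way to compare these models is to write down what each kind of import demands of a client. The following is a toy sketch of the three-kind variant; ImportKind, ClientRequirements, and the module names are invented for illustration, and the two-kind model is the same thing with the usedInAPI case removed.

```swift
// A toy model of the import kinds discussed above; the names are made up.
enum ImportKind {
    case implementationOnly   // hidden from clients (except maybe at link time)
    case usedInAPI            // must be present for clients; names not visible
    case reexported           // must be present; names visible to clients
}

struct ClientRequirements {
    var mustBePresent: Set<String> = []
    var namesVisible: Set<String> = []
}

// Computes what a client of a module needs, given that module's imports.
func requirements(for imports: [String: ImportKind]) -> ClientRequirements {
    var result = ClientRequirements()
    for (module, kind) in imports {
        switch kind {
        case .implementationOnly:
            break
        case .usedInAPI:
            result.mustBePresent.insert(module)
        case .reexported:
            result.mustBePresent.insert(module)
            result.namesVisible.insert(module)
        }
    }
    return result
}

let fooImports: [String: ImportKind] = [
    "Foundation": .reexported,
    "Bar": .usedInAPI,
    "Baz": .implementationOnly,
]
let needs = requirements(for: fooImports)
```

In the two-kind model the mustBePresent and namesVisible sets are always equal, which is what makes it simpler to explain.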

Finally, it's also possible to add additional kinds of import in the future as long as they're not the default, which brings me to the next section…

What do we do with all the existing code that just says `import ModuleA`?

I'm coming around to the idea that the least breaking change says "all existing imports become re-exported; you need a new syntax to do an 'implementation detail import'". I don't love this for two reasons:

it's a change from the current behavior that will have observable effects (more things being re-exported)
it goes against the principle that anything that's part of your interface should be clearly marked as such

But anything else I can think of will result in massive churn for everyone who has already published a package that uses Foundation types.

(We could say that an unannotated import gets a warning in Swift 6 or whatever, so we're only dealing with this during the backwards-compatibility period, but I don't think we'd want to introduce a language version break just for this, so it might be a long time. Possibly forever.)

Spelling

Let's hold off on this one and pick awful underscored names for all these things for now. :-) No matter what we pick this whole effort probably needs a proper proposal anyway.

Sketch of a plan

No matter what we go with, we're going to need a way to tell what modules a public declaration depends on, so I (or someone else) can start working on that.

We will have to deal with the implementation detail module not being available, which means that anything that depends on implementation details has to degrade nicely. This is similar to some of the work we do papering over Clang importer differences between Swift versions:

Non-public methods in classes can be replaced with placeholders (for vtable layout).
Members of structs have to have their definitions around, even if they're private, because that affects the struct's ABI. :-(
Non-public conformances are a big mess if some of the associated types are missing and we need to think about them carefully.
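The struct case is easy to see from a single file. This sketch assumes ordinary (non-resilient) layout, and the two struct names are hypothetical: both types expose the same one field, but the private stored property still takes up space in every value.

```swift
struct PublicOnly {
    var id: Int32
}

struct WithPrivateState {
    var id: Int32
    private var cache: Int64 = 0   // invisible to clients, still stored inline

    init(id: Int32) { self.id = id }
}

// The private member changes the size of the struct, so a client that
// allocates or copies values needs to know about it.
let publicOnlySize = MemoryLayout<PublicOnly>.size
let withPrivateSize = MemoryLayout<WithPrivateState>.size
print(publicOnlySize, withPrivateSize)
```

If the type of a private member came from an implementation-detail module that the client cannot load, the client would have no direct way to compute this layout.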

I think I've written quite enough in this post, but how does this sound?

Slava_Pestov · December 6, 2018, 1:16am

Do you think we need private imports to work for non-resilient modules? If not, you can avoid both of these issues by dropping these members entirely.

Another general concern I have is this is yet another case where Swift has "multiple kinds" of things where the other languages you mentioned (Python, Java, C#) have only one construct. We're always finding ourselves adding new special cases to handle some new odd situation and we can never seem to keep up. Why is this so?

jrose · December 6, 2018, 4:58pm

I'll answer these in reverse order.

This is a really good question. After consideration, I've come up with two answers.

The first is C. Python, Java, and C# can get away with their import models—models which I think we would have preferred!—because they don't have to interoperate directly with C, which dumps everything you #include into a flat namespace. Transitively. The consequence of this is the UIKit/MobileCoreServices thing I keep bringing up: C developers don't have to know which header specifically defines the thing they need, and they'd be very frustrated if they had to start keeping track. That's now set in place in Swift—we're not going to start requiring people to keep track of where things come from in the Cocoa APIs, or even in C on Linux.

Okay, so why not just do it for C things? Well, partly because our implementation isn't great anyway (see (2) and (3) in the original post), but mostly because whenever Swift generates an Objective-C header, that header plays by Objective-C rules. Which means we're stuck with the C model for at least some of what we're doing. Currently Swift is trying to pretend that it's not using the C model for Swift-native things, and I think we should fix that inconsistency.

The second reason is binary frameworks. If you want to send someone a binary library in Python, well, you can't. If you want to do it in Java, you have to include all your dependencies in the jar file, which means in theory anyone can use them—there's no information hiding. But there are parts of iOS that Apple doesn't ship public interfaces for, and the same logic could apply to a third-party binary framework as well. That suggests a use for "an import I can use where I don't have to ship an interface as well".

My conclusions from this are that we're in a place where we should use the C model (always exported) to be consistent, but that we then need to make up for C having separate interface and implementation files. That's why we end up with two things: because C just puts them in two files, and Python and Java don't have a way to hide dependencies in the first place.

jrose · December 6, 2018, 5:00pm

I thought about this one, and maybe we won't have it in the first cut, but this has definitely been a pain point in the package manager, where people want to use an API in their implementation but not re-export it. (See again points (2) and (3) about our current incomplete handling of non-exported imports.)

Joe_Groff · December 6, 2018, 6:06pm

There's another way to achieve consistency, which would be to be able to describe what happens with C imports in terms of what's possible in the "native" Swift model. It seems to me that if Swift modules have a reexport feature, then you can describe what happens with imported C modules in terms of that: they always reexport their dependencies, and it's a limitation that you can't disable that behavior. I worry about dooming Swift to that model because, while it may be barely workable for an SDK that's developed and managed as a single unit like Apple's for everything to swim in one giant namespace, that's not a scalable model if we expect people to start using large numbers of packages, and it isn't particularly easy even for Apple to manage.

jrose · December 6, 2018, 6:07pm

I maintain that we don't have a choice here because of mixed-source frameworks. Right now, the rule is "Swift modules re-export the set of modules necessary to represent the Objective-C parts".

Joe_Groff · December 6, 2018, 6:09pm

Why can't that behavior be contained to the Objective-C interface?

jrose · December 6, 2018, 6:10pm

Because a downstream Swift client can import the mixed-source framework, and then they pick up both the Swift and Objective-C parts.

Joe_Groff · December 6, 2018, 6:13pm

I see, I misunderstood what you meant by mixed-source. That still seems fine; it's the Objective-C part that's imposing the transitive behavior, and that's understandable when you know how Objective-C imports work. There are only going to be more Swift-interface-only libraries and packages going forward.

jrose · December 6, 2018, 6:16pm

I guess that's another option in this space: add _exported properly, try to fix the bugs for (2) and (3) when not using _exported, and require imports used in Objective-C-methods-that-are-going-to-be-printed to be re-exported…

…that "going-to-be-printed" modifier seems really complicated. At the same time, it doesn't make sense to do this to all Objective-C methods. Maybe just public ones, and punt on the fact that app modules will leak some details?

millenomi · December 6, 2018, 6:41pm

A small aside: as we are splitting Foundation into multiple modules, I am going to start relying on @_exported to achieve umbrella framework-like effects in swift-corelibs-foundation.

jrose · December 6, 2018, 6:54pm

Yeah, "_exported as formalized feature" is also a very useful thing, and I think you can get that effect in Python (though not Java). Hopefully that's just a matter of picking the spelling, though.

jrose · December 11, 2018, 2:32am

Reporting some progress from early experimentation:

Checking whether all types used in a public decl are from exported modules is tedious but easy; it's yet another form of access control checking. I prototyped this by just checking enum element associated types in a manner similar to the existing AccessControlChecker and UsableFromInlineChecker.

There is, however, one complication: C modules with redeclarations. If modules Foo and Bar both declare a type foobar_t (and neither imports the other), and I import both but have only re-exported Bar, the naïve checking logic will think that I haven't satisfied the criterion. This is "just a minor matter of engineering" to fix, but it is annoying.
Similarly, the restrictions on what decls can be used in inlinable code would also need to include a check like this. I didn't look into how hard this would be but I think it's pretty much the same as what's already there. (This is a place where "this module must be present on a client machine, even if it doesn't make its names accessible" is a potentially interesting level.)
On a separate path, I tried to see how much would break if Swift modules didn't automatically re-export their imports. The answer is both "almost nothing" and "absolutely everything": "almost nothing" because the overlays always re-export everything anyway, and "absolutely everything" because no one else does even for the simplest things they use. This again seems like evidence towards treating an unadorned import Foo as @_exported import Foo in this new world.

There is one interesting case here: the standard library needs to now re-export SwiftShims instead of just importing it because some of its types are in public ABI. This isn't great but everything in there already has an underscore. Still, maybe we can get away with separate SwiftShims and SwiftShimsTypes modules, the former of which we can then hide.
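The redeclaration complication mentioned above can be sketched with a toy model. The tables and function names here are hypothetical and do not reflect how the compiler represents Clang declarations; the point is only the difference between checking the one module a name resolved to and checking every module that declares it.

```swift
// Toy model: a C type name can be declared by more than one module.
let declaringModules: [String: Set<String>] = ["foobar_t": ["Foo", "Bar"]]

// Which module the importer happened to pick when resolving the name.
let resolvedModule: [String: String] = ["foobar_t": "Foo"]

// Both Foo and Bar are imported, but only Bar is re-exported.
let reexported: Set<String> = ["Bar"]

// Naive check: only looks at the module the type resolved to.
func naiveIsExported(_ type: String) -> Bool {
    guard let owner = resolvedModule[type] else { return false }
    return reexported.contains(owner)
}

// Redeclaration-aware check: any re-exported module that declares the
// type is enough to satisfy the rule.
func redeclarationAwareIsExported(_ type: String) -> Bool {
    guard let owners = declaringModules[type] else { return false }
    return !owners.isDisjoint(with: reexported)
}
```

With these tables the naive check rejects foobar_t even though a re-exported module declares it, while the redeclaration-aware check accepts it.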
