New Access Modifier: package

Motivation

Packages are often composed of multiple modules; packages exist as a way to organize modules in Swift, and organizing often involves splitting a module into smaller modules. For example, a module containing internal helper APIs can be split into a utility module that contains only the helper APIs and one or more other modules containing the rest. To remain accessible from those other modules, however, the helper APIs need to be made public. The side effect is that they can “leak” to a client that should not have access to those symbols. Besides widening the scope of visibility, making them public also has implications for code size and performance.

For example, here’s a scenario where App depends on modules from package gamePkg.

App (Xcode project or appPkg)
  |— Game (gamePkg)
      |— Engine (gamePkg)

Here are source code examples.

[Engine]

public struct MainEngine {
    public init() { ... }
    // Intended as public
    public var stats: String { ... }
    // A helper function made public to be accessed by Game
    public func run() { ... }
}

[Game]

import Engine

public func play() {
    MainEngine().run() // Access a helper API within the same package
}

[App]

import Game
import Engine

let engine = MainEngine()
engine.run() // `run` is a helper API and should not be accessed here
Game.play()
print(engine.stats) // `stats` is intended as public so can be accessed here

In the above scenario, App can import Engine (a utility module in gamePkg) and access its helper API directly, even though the API is not intended to be used outside of its package.

Proposal

A current workaround for the above scenario is to use @_spi, @_implementationOnly, or @testable. However, each has caveats. @_spi requires a group name, which makes it verbose to use and harder to keep track of, and @_implementationOnly can be too limiting, as we want to be able to restrict access to only portions of APIs. @testable elevates all internal symbols to public, which increases the binary size and the shared cache size. If there are multiple symbols with the same name from different modules, they will clash and require module qualifiers everywhere. It is hacky, and its use is strongly discouraged.

We propose to introduce a new access modifier package. This would limit the visibility of the symbols to only modules within a package.

Declaration Site

Using the scenario above, the helper API run can now be declared package:

public struct MainEngine {
    public init() { ... }
    public var stats: String { ... }
    package func run() { ... }
}

The package access modifier can be added to any declaration that an existing access modifier can be added to, e.g. class, struct, enum, func, var, protocol, etc. The access level of package will sit between internal and public. It will allow access as well as subclassing (unless final) across modules within a package, and it does not make the symbol visible outside of the package. The parallels to internal are strong. With internal, every potential client is in the same module. With package, every potential client is in the same package. Since a package is developed as a unit, package-visible declarations can be updated together with their clients. The exportability rule will be similar to the existing behavior; for example, a public class is not allowed to inherit from a package class, and a public func is not allowed to have a package type in its signature.
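To make the rules above concrete, here is a minimal single-file sketch. The names Renderer, DebugRenderer, and makeRenderer are hypothetical and not part of the pitch, and the sketch assumes a toolchain that implements the pitched modifier and is invoked with -package-name (for example `swiftc -package-name gamePkg main.swift`). The two declarations the exportability rule would reject are left commented out.

```swift
// Assumes a compiler that supports `package` and is invoked with -package-name.

// Visible to every module in the same package, hidden from other packages.
package class Renderer {
    package init() {}
    package func draw() -> String { "base frame" }
}

// Allowed: Renderer is not final, so it can be subclassed within the package.
package final class DebugRenderer: Renderer {
    package override func draw() -> String { "debug " + super.draw() }
}

// Rejected by the exportability rule: a public class cannot inherit from a package class.
// public class PublicRenderer: Renderer {}

// Rejected by the exportability rule: a public func cannot expose a package type.
// public func exportedRenderer() -> Renderer { Renderer() }

// Allowed: a package func may use package types in its signature.
package func makeRenderer() -> Renderer { DebugRenderer() }

print(makeRenderer().draw())
```

In a real package the subclass and the factory function could live in a different module from Renderer, as long as both modules are built with the same package name.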

Use site

A package name will be stored in a .swiftmodule and looked up during type checking to allow or disallow access to package APIs. Swift Package Manager knows the package boundary and will pass it down to the compiler. Other build systems such as Xcode and Bazel will need to pass a new command-line argument -package-name to the build command to let the compiler know which package the compiled module is in, as shown below.

[Engine] swiftc -module-name Engine -package-name gamePkg ...
[Game] swiftc -module-name Game -package-name gamePkg ...
[App] swiftc -module-name App -package-name appPkg ...
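For SwiftPM users the flag would not be written by hand, since the manifest already defines the package boundary. A manifest for the gamePkg scenario might look like the sketch below. The tools version shown is an assumption (it would need to be whichever version ships this feature), and Engine is listed as a product only because App imports it in the example above.

```swift
// swift-tools-version:5.9
// Assumption: a tools version at which SwiftPM passes the package name to the compiler.
import PackageDescription

let package = Package(
    name: "gamePkg",
    products: [
        // Both libraries are vended because App imports Game and Engine in the scenario.
        .library(name: "Game", targets: ["Game"]),
        .library(name: "Engine", targets: ["Engine"]),
    ],
    targets: [
        // Engine and Game share one package, so Game may use Engine's package APIs.
        .target(name: "Engine"),
        .target(name: "Game", dependencies: ["Engine"]),
    ]
)
```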

The input to -package-name is a package identity, which, besides alphanumeric characters, can contain a hyphen, a dot, and other characters valid in a URL; such characters will be transformed so that the result is a C99 identifier.
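The pitch does not spell out the exact transformation. As a rough illustration only, the sketch below assumes a simple rule: every character that is not an ASCII letter, digit, or underscore becomes an underscore, and a leading digit is also replaced. The function name c99Identifier is hypothetical, and the real mapping may differ.

```swift
// Hypothetical helper; the actual mapping used by the tools is not specified in the pitch.
func c99Identifier(from identity: String) -> String {
    var result = ""
    for ch in identity {
        let isLetter = ch.isASCII && ch.isLetter
        let isDigit = ch.isASCII && ch.isNumber
        // A C99 identifier cannot start with a digit, so a leading digit is replaced too.
        if isLetter || ch == "_" || (isDigit && !result.isEmpty) {
            result.append(ch)
        } else {
            result.append("_")
        }
    }
    return result
}

print(c99Identifier(from: "game-pkg.core")) // game_pkg_core
print(c99Identifier(from: "gamePkg"))       // gamePkg
```

One consequence of any such lossy mapping is that two distinct identities (say, "game-pkg" and "game.pkg") could collapse to the same identifier under this assumed rule.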

When building the Engine module, the compiler will store the package name gamePkg in Engine.swiftmodule. When building Game, it will store the package name of Game, compare it with that of Engine, and allow access to package APIs if they are the same. When building App, it will detect that the package name of App differs from the package name of the Game and Engine modules it imports, so it will disallow access to package APIs and emit an error if App attempts to use them. With the helper API run in MainEngine now declared package, App can only access its public API stats, which is the intended behavior.
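The comparison described above can be modeled in a few lines. This is only a toy model and not compiler code: ModuleInfo, AccessLevel, and canAccess are hypothetical names, and treating a missing package name as "no package access" is an assumption rather than something the pitch states.

```swift
// Toy model of the package-name check; none of these types are compiler APIs.
enum AccessLevel { case packageLevel, publicLevel }

struct ModuleInfo {
    let name: String
    let packageName: String? // nil models a module built without -package-name (assumption)
}

func canAccess(_ level: AccessLevel, declaredIn owner: ModuleInfo, from client: ModuleInfo) -> Bool {
    switch level {
    case .publicLevel:
        // Public declarations are visible to every client.
        return true
    case .packageLevel:
        // Package declarations are visible only when both package names exist and match.
        guard let ownerPkg = owner.packageName, let clientPkg = client.packageName else {
            return false
        }
        return ownerPkg == clientPkg
    }
}

let engine = ModuleInfo(name: "Engine", packageName: "gamePkg")
let game = ModuleInfo(name: "Game", packageName: "gamePkg")
let app = ModuleInfo(name: "App", packageName: "appPkg")

print(canAccess(.packageLevel, declaredIn: engine, from: game)) // true: run() is usable from Game
print(canAccess(.packageLevel, declaredIn: engine, from: app))  // false: run() is hidden from App
print(canAccess(.publicLevel, declaredIn: engine, from: app))   // true: stats stays visible
```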

Future

Limiting the scope of visibility per package can open up a whole lot of optimization opportunities. A package containing several modules can be considered as a unit for applying size and performance optimizations, which could yield notable improvements in the future.

As my packages have been getting more and more modular, I've been yearning for a package level access modifier as well. @_spi is an ok workaround but it feels quite forced. So big +1 from me.

My brain instantly responded to this with the following term: submodules

Sorry, I hate to be that guy. :sweat_smile:

I’d be happy to say a package is a kind of tool-managed @_spi, and conversely that this is a good way to make the niche library-evolution-focused “SPI” into something more generally relevant and reviewable. I’ll also be the other That Guy and ask how this interacts with open. >:-)

I'm super excited for this! :partying_face: I've had to either resort to making things sit in the same module, or make an annoying number of "no one should call this" methods public when writing extensions in the distributed actors cluster - this would solve all those issues -- really looking forward to it!

Are package symbols ever exported (in the sense that dlsym would find them if you used the correct mangled name) in final libraries/executables?

Yes, please. It is so annoying not to have this.

Would @usableFromInline be applicable to package declarations? Currently it's only applicable to internal ones, since public ones are always implicitly usable-from-inline.

I assume @inlinable would be applicable, since it's applicable for both internal and public.

I’ll also be the other That Guy and ask how this interacts with open. >:-)

open will continue to behave how it currently behaves: possible to access (i.e. public) and subclass from another module, and is visible to any client. package will allow accessing and subclassing from another module as long as both modules are within the same package; this is parallel to how internal behaves; internal allows accessing and subclassing within the same module.

Would @usableFromInline be applicable to package declarations? Currently it's only applicable to internal ones, since public ones are always implicitly usable-from-inline.
I assume @inlinable would be applicable, since it's applicable for both internal and public.

Yes, both will be applicable to package.
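As a sketch of what that could look like, assuming a toolchain that implements the pitched modifier and is invoked with -package-name (the function names here are hypothetical):

```swift
// Assumes a compiler that supports `package` and is invoked with -package-name.

// A package helper that inlinable code in this module is allowed to reference.
@usableFromInline
package func clampedScore(_ raw: Int) -> Int {
    min(max(raw, 0), 100)
}

// A public inlinable function may call the helper because it is @usableFromInline.
@inlinable
public func displayScore(_ raw: Int) -> String {
    "score: \(clampedScore(raw))"
}

// @inlinable applied directly to a package declaration.
@inlinable
package func doubledScore(_ raw: Int) -> Int {
    clampedScore(raw) * 2
}

print(displayScore(140))
print(doubledScore(-5))
```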

Are package symbols ever exported (in the sense that dlsym would find them if you used the correct mangled name) in final libraries/executables?

Yes, they will be exported.

Big +1 from me. I have been resorting to spi for a lot of use-cases lately which all could have been solved with package level access.

Another +1 here. I maintained a multi-module project where something like a package modifier was clearly missing and would have helped greatly had it been available. Very excited to see this pitched!

Big +1 too. I like that the proposed approach is straightforward, is similar to existing concepts, and solves a number of use cases we’ve had.

As an alternative name to package, would external be better?
It is more naturally an adjective, and so fits better with the other privacy modifiers.

I do like package as pitched, it’s crystal clear what the scope is - external could be confused with public imho.

Thanks for considering us Bazel folk in the design! This looks super easy to support; I can imagine the Swift build rules mapping the concept of a Bazel package (a directory containing a BUILD file that defines one or more targets) one-to-one with -package-name. Bazel encourages builds to be split into many very small libraries, which currently forces us to declare much more as public than should otherwise be, and I see many SwiftPM packages moving in a similar direction away from large monolithic modules, so this fills a very important gap.

Does this mean that anyone who passes -package-name Foo to the compiler manually would have access to Foo's package-visible symbols, even if they aren't "physically" part of the same SwiftPM package/Bazel package/etc.? Are we concerned about risks from people doing this? (In a past life, I remember declaring some Java code to be in the same package as another library I was using because I needed access to something that was only package-scoped, and nothing in the tooling prevented me from doing it.)

Currently (or at least, the last time I checked), public symbols are emitted as llvm.used so it's never possible to remove them with dead-stripping, for example if they're never used but statically linked into a large binary. Would package symbols have the same limitation then?

+1 to the implementation, glad to see Bazel addressed.

This proposal further cements separate libraries/modules as the preferred way of organizing code in Swift, and I’m wondering what effect encouraging this will have on link times for a typical SPM project using popular libraries like TCA, which if memory serves has a lot of submodules.

My concern is not that it wouldn’t be clear what the scope of the access modifier is, but rather that it wouldn’t be clear that it’s an access modifier. How about packageprivate? It’s long, but it is at least an extremely clear name that we will never wish we had reserved to mean something else (unlike simply package). Although I think it makes sense to add this access modifier to the language, I am also interested in the more thorough solution of “submodules”, so I like the idea that the new keyword is more obscure and less in the way if it becomes somewhat obsolete in the face of future solutions.

The reason I put submodules in quotes is that it’s a term I see used frequently and which I think I can intuit a lot about, but I don’t believe I’ve ever read a detailed pitch about it, nor, in truth, have I thought very deeply about it myself. Can anyone point me to some good writing on the topic?

We can prevent it at the build system level, but that would be harder to enforce if the flag were fed directly to the compiler by hand. Then again, anyone determined to access package symbols could also copy and paste the code over directly.

That's correct, although we could introduce a build setting that lets users decide whether to hide package (or public) symbols if statically linked.

Currently symbols have to be declared public which already affects the link time; if we introduce a build setting in the future to not export public or package symbols for static linking, we could optimize for link time.

This package concept is likely to pave the path towards a structure of an "umbrella module", i.e. a package and the modules in it. A package consisting of several modules would be considered a unit from the symbol visibility point of view and possibly treated as a resilience domain (for optimizations, etc.).