SE-0386: `package` access modifier

What if we call it a "module group", then make the keyword groupinternal?

  • It's a little verbose, but I think that's okay here: it calls attention to the unusual scope and gently discourages unnecessary use of it.
  • We wouldn't want to use group as a keyword, but making compounds of it seems reasonable, and "module group" is a nice, generic phrase for the concept.
  • The keyword plays a bit with the ambiguity between a group of modules and a group of programmers, which I think should be beneficially suggestive.
  • Using internal as the suffix rather than private takes what we did with fileprivate and makes it a pattern: these "secondary" access modifiers are more-permissive tweaks on a "primary" modifier.

Then the SwiftPM rule is that all of the modules in a package go into a default group unless you say otherwise, but you can override that to take things out of the group or put them in a different, non-default group. In any case, the group is namespaced to the package, meaning only modules in the same package can be part of it. Other build systems can pretty easily use their own mechanisms for grouping modules.
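As a rough illustration of that rule, here is a minimal sketch that models it in plain Swift. Everything in it (the `Module` type, the `group` property, `sharesGroup`, and the package and module names) is hypothetical and exists only for this example; none of it is SwiftPM or compiler API.

```swift
// Hypothetical model of the "module group" rule; not SwiftPM API.
struct Module {
    let name: String
    let package: String
    // nil means the module has been taken out of every group.
    let group: String?

    // Modules land in the package's default group unless told otherwise.
    init(_ name: String, package: String, group: String? = "default") {
        self.name = name
        self.package = package
        self.group = group
    }
}

// Two modules can see each other's group-internal declarations only if
// they belong to the same package and the same group within that package.
func sharesGroup(_ a: Module, _ b: Module) -> Bool {
    guard let groupA = a.group, let groupB = b.group else { return false }
    return a.package == b.package && groupA == groupB
}

let core = Module("Core", package: "example-package")
let helpers = Module("Helpers", package: "example-package")
let benchmarks = Module("Benchmarks", package: "example-package", group: nil)
let tools = Module("Tools", package: "example-package", group: "tooling")
let foreign = Module("Core", package: "other-package")

print(sharesGroup(core, helpers))     // same package, both in the default group
print(sharesGroup(core, benchmarks))  // benchmarks were taken out of the group
print(sharesGroup(core, tools))       // different group in the same package
print(sharesGroup(core, foreign))     // groups are namespaced to the package
```

Because the comparison includes the package, a group name like "default" never matches across packages, which is the namespacing described above.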

11 Likes

Yes. Whoever writes the build‐system‐agnostic documentation will need to be careful to define “module group” in a way that makes the ABI effects intuitive: “a group of modules developed, versioned, and distributed together which can therefore interact using an unstable interface exclusive to the group; that exclusive interface is marked with groupinternal” (or some such).

1 Like

It just occurred to me that this feature might be good for those concerned about their internal modules currently leaking and being accidentally importable by clients. (We get many such comments around here.) They would now have a way to seal such modules off by doing a find‐and‐replace of public with groupinternal.

i think this kind of stance, while understandable, is just not realistic given how critical and widespread __shared and its sibling __owned are in high performance swift. today we have a number of underscored keywords and attributes that are part of the language in all but name:

  • @_spi
  • @_exported
  • @_disfavoredOverload
  • _modify
  • __owned
  • __shared
  • __consuming

as much as certain language designers want to pretend otherwise, these features are load-bearing columns, and removing or renaming any one of them would cause large swaths of the library ecosystem to collapse. in fact i myself was kind of taken aback on the other thread by how many people not only know about but are depending on @_spi, as i had assumed that attribute was still pretty obscure.

yes, there are efforts to naturalize __shared (last i heard, we were thinking of renaming it to borrowed). but even if that were to land this spring in 5.8, we would not be able to replace any instances of __shared with borrowed until our various minimum toolchain requirements reach 5.8, because older toolchains would not understand borrowed. and that could take quite some time, possibly around the release of 5.11, if the minor version number ever gets that far.

so to summarize, __shared is not going away any time soon, and yes, options for evolution do need to account for its existence.
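for anyone who has not run into these, a minimal sketch of the two parameter modifiers in use. the function names are made up for the example, and the comments describe the intended calling convention informally rather than quoting any official documentation.

```swift
// `__shared`: the callee only borrows the argument; the caller keeps ownership.
// This is the underscored, unofficial spelling discussed above.
func total(of values: __shared [Int]) -> Int {
    values.reduce(0, +)
}

// `__owned`: the callee takes ownership of the argument.
// Parameters are still immutable, so we copy into a local variable to mutate.
func appendingZero(to values: __owned [Int]) -> [Int] {
    var result = values
    result.append(0)
    return result
}

print(total(of: [1, 2, 3]))
print(appendingZero(to: [1, 2]))
```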

2 Likes

This definitely resonates with me. I get why something like “team” here is compelling but it feels weirdly prescriptive to me to have the language surface directly describing the human organizational structure working on it.

As such,

I like this better, though “group” does strike me as a bit generic. It doesn't really say anything about the modules other than they're grouped together, and the most direct answer of how they're grouped together is a bit tautological: they're grouped together by virtue of being able to access groupinternal declarations across modules.

To the extent we would want this access control modifier to be the thing that formalizes the "resilience domain"/unit of distribution concept that Becca mentions above, "package" (or packageinternal, whatever) strikes me as a suitable name. It is a bit more meaningful than "group" and maps onto how we actually talk about "packaging" code up for distribution. As I understand it, the objection to the correspondence with SwiftPM packages is (at least) twofold:

  1. There may be products in a SwiftPM package which are not suitable to have frog-level access to the rest of the package and should instead receive an "external" view of the package they are distributed with.
  2. One may want to have multiple 'frogs' within a single SwiftPM package because they want to keep the unit-of-distribution benefits provided by a single package but still present a simplified intra-package interface between unrelated sets of files within the same package.

The first objection strikes me as something that is perhaps suitable for a solution on the SwiftPM side. It seems reasonable to me to have products like benchmarks and examples which are conceptually 'part' of the package for distribution purposes, but which should receive an external view of the package for organizational purposes. If SwiftPM were to provide a way to specify "external" products like this, would it resolve this objection?

My difficulty with the second objection is that I have trouble seeing its limiting conditions. In @taylorswift's example the two 'nuclei' themselves seem like they could be liable to grow larger and end up with 'sub-nuclei' of their own while still wanting to take advantage of the unit-of-distribution benefits of being in a single package. In this case, the ability to have multiple frogs in a single package would no longer be enough—you'd need a tree of frogs! It seems like this objection motivates a potentially significantly finer-grained access control feature than what 'package' offers. (Something like folderinternal has come up in the past which would have that tree structure fall out naturally...)

I guess at the point where we've already accepted that it makes sense for there to be a 'default' group and a further 'external' group, it may be worth just going the distance and allowing for fully arbitrary top-level groups. But it does feel to me like something is a bit lost if the vast majority of packages just use the default group, so that almost everywhere groupinternal really does just mean packageinternal but with the added cognitive step of "oh what group is this in? Ah, it's just the package".

1 Like

I don’t think that level of disconnect for package authors/users is solvable without calling it “package”-something, though, which there are specific arguments against. Once it’s not called “package”, there’s always going to be that question of “what frog is this…?” And fortunately, I’m not sure that’s an important question to be confused about, at least for code readers, because intra-project differences between groups just mean that code that doesn’t exist would be invalid.

I will say that I’m not personally motivated by having multiple groups within a package, and if SwiftPM decided not to allow that as a policy matter, I think that’s an acceptable answer. (The decision to namespace groups within packages would be a similar policy statement; after all, someone could want to span packages.) I care a lot more about making this feature feel available to Xcode users and other people not working within the package ecosystem.

3 Likes

I agree with this. There are several existing drawbacks to not having a way to semantically delineate these in the package manifest somehow, e.g. it can be unclear which of the executables a package vends are examples, testing tools, etc., so it would make sense to offer a solution to this in SwiftPM. This doesn't seem like a blocking concern for this proposal, though, since today any APIs we're talking about need to be public anyway, so e.g. examples already have access to them.

4 Likes

I definitely agree with this as a goal, though on 'group' specifically it is perhaps worth noting that Xcode already uses the 'group' terminology to refer to a unit of the project organization tree (which may or may not map onto filesystem folders), and those would be completely unrelated to the 'group' terminology at the language level.

More broadly, in your post above you mention two drawbacks to the 'package' nomenclature: that 'package' might be weird for folks who don't think of themselves as developing a 'package,' and that it could be weird to have any code in a SwiftPM package (examples, benchmarks, whatever) that can't access packageinternal declarations.

I don't really see how choosing 'group'-based terminology helps with the first issue, since I doubt folks think of themselves as writing a 'group' either. Is the concern here that users who don't interact with the package ecosystem will be turned off from even considering using the feature because "we don't use packages," so that picking a term entirely unrelated to SwiftPM will make this new feature more approachable?

On the second point, I agree with you that even if SwiftPM had support for making the distinction, it is perhaps a bit difficult to talk about and explain. But I think there are reasonable lines to draw that aren't totally convoluted (e.g. if we referred to 'external' package targets then it seems somewhat natural that such targets wouldn't have access to the 'internals' of the package). If we expect the access control level to, in practice, map onto a single SwiftPM package, I don't love the idea of having a completely different word for this concept in the language. It would be nice if we could coordinate the opinionation between SwiftPM and the Swift language here.

No, I think people will say, "Oh, they added a way to use things across modules within packages, I guess that's cool but not for me." And if an IDE like Xcode wanted to add a way to use the feature within a project, we'd really have put their UI in a bind, because presumably the IDE also wants to support writing SwiftPM packages and/or adding them as dependencies.

Thanks for reminding me about Xcode groups. I think we'd want the UI and documentation to always call these module groups, though, so this shouldn't really be a problem. The only place it'd be shortened to just "group" is in the keyword, where the "module" part is implicit because internal is always about module-level stuff.

But of course if someone suggests a better name, I'm open to that.

1 Like

i am not sure if i understand correctly the scenario you are describing here. if one of the ‘nuclei’ grew to a size where it became difficult to manage, i would probably just split it again into two nuclei, and the package would now have three nuclei instead of two.

i don’t really envision a situation where recursive subsystem divisions would ever come into play, because it doesn’t make sense to me for package nuclei to have nuclei of their own. a flat package partitioning model seems perfectly scalable in my view.

2 Likes

There are a lot of good points in this thread; it's been hard to follow along. The point about targets within a package that only exercise public API is a great one; I've done that myself several times. I'm also sympathetic to the point that a large package can be developed by multiple "teams" of people, each wanting to expose functionality for their teammates, but keep it hidden from other "nuclei" in the package. Some of the stuff I work on would definitely fit that description as well.

So this makes me wonder… what if teams could define their access scopes themselves to suit their needs? There was a preliminary pitch for that almost 6 (!!!) years ago (direct link to the file), but it came at the tail end of The Great Access Control Wars and everyone was too exhausted to litigate another pitch. Perhaps it's time to resurrect and refine the idea of code that can precisely define its own access restrictions.

9 Likes

Thanks John, this makes sense. And I see why splitting hairs between "SwiftPM package" and "module package" is perhaps more difficult than between the entirely unrelated concepts of "Xcode project group" and "Swift module group."

I'm not sure that this is much consolation to me since I expect most users will primarily be interacting with this feature at the source level rather than via the UI/documentation. It seems like a pretty natural inference when coming across groupinternal in source that this has some relationship to the "group" already present in Xcode. Perhaps I'm overestimating how much people's minds would jump to Xcode groups as the most obvious referent of "group," though.

Fair enough. I suppose my concern is that if you want to take advantage of the unit-of-distribution benefits regardless of how complex the internal conceptual structure of your package becomes then it seems not unreasonable to me to have a situation where you want some stuff to be shared only within each of the MongoA and MongoB groups, some stuff shared between MongoA and MongoB but not with BSON, and some stuff shared across all modules within the package.

I guess this is just a more general discomfort about defining a new level of access control between public and internal that is completely disconnected from any sort of opinion about what sort of organization is appropriate. Adding an invented level of organization above module seems like it is one step down the road toward a full submodule/namespace system, and so it might make sense to design such an access level with that in mind. What makes us sure that one more level of access control is the right stopping point? What is it about "group of modules" that wouldn't have folks wanting to reach for "families of groups of modules" somewhere down the line?

I don't think such a direction would necessarily be wrong, but I think it would be pretty silly to just pick a new word for each level of organization, so if we see ourselves going down this road I think it would make sense to just call these organizational units sub-/super-modules and come up with a way for users to specify where in the module tree a particular declaration should be visible. Today's internal could default to either the root or the leaf, and if you had a module structure such as:

A
|- B
|  |- C
|  |- D
|
|- E

then maybe you'd write B.internal on a declaration in C to have it be visible to B, C, and D (but not A or E).
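To make the intended visibility concrete, here is a minimal sketch that models the tree above and checks which modules would see a declaration marked `B.internal`. The `parent` table and both helper functions are hypothetical names for this illustration only; nothing like `B.internal` exists in the language today.

```swift
// Hypothetical model of the module tree drawn above: child -> parent.
let parent: [String: String] = ["B": "A", "E": "A", "C": "B", "D": "B"]

// Returns the chain from a module up to the root, starting with the module itself.
func ancestorsIncludingSelf(of module: String) -> [String] {
    var chain = [module]
    var current = module
    while let next = parent[current] {
        chain.append(next)
        current = next
    }
    return chain
}

// A declaration marked `scope.internal` is visible to `scope` itself and to
// every module nested underneath it, but not to the scope's parent or siblings.
func isVisible(scope: String, from client: String) -> Bool {
    ancestorsIncludingSelf(of: client).contains(scope)
}

let allModules = ["A", "B", "C", "D", "E"]
let seesBInternal = allModules.filter { isVisible(scope: "B", from: $0) }
print(seesBInternal) // ["B", "C", "D"]
```

Under this model, today's internal would correspond to picking either the root ("A") or the leaf (the declaring module itself) as the default scope.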

2 Likes

-1 on this proposal, because I think I disagree with how the authors view access control, and IMO it doesn't align with how the language has done it in the past either. The fundamental problem that access control solves is that some code should be able to access certain APIs. In a perfect world everyone would magically have a precise set of clients that could call them and they'd list those out, but in reality we've decided that the approximations that people are typically fine with include things like "everyone can use this" (public), "only I can use this" (private), and two also fairly common ways of organizing code, "I pasted a bunch of stuff in a file and it should be able to see each other" (fileprivate) and "let all the files in this target access each other" (internal). We then shut down discussion of adding additional broad scopes like these, because we figured that if someone wanted them they should just make do with the ones we had provided.

In the meantime, we also added one specific access modifier, @_spi, to precisely control the list of clients rather than providing broad control. This has actually been pretty popular! There are a lot of APIs that only really have one or two clients, but that are separated from each other in ways that don't make the existing scopes a good fit.
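For readers who haven't used it, a minimal sketch of how @_spi is typically applied. The `Cache` type, its members, the SPI group name `Diagnostics`, and the module name `Storage` in the comments are all made up for this example; keep in mind that @_spi is an underscored, unofficial attribute.

```swift
// Library side (imagine this living in a module called `Storage`).
public struct Cache {
    private var entries: [String: String] = [:]

    public init() {}

    // Ordinary public API: visible to every client of the module.
    public mutating func store(_ value: String, for key: String) {
        entries[key] = value
    }

    public func lookup(_ key: String) -> String? {
        entries[key]
    }

    // Still public as far as the binary is concerned, but hidden from
    // clients that have not opted in to the `Diagnostics` SPI group by name.
    @_spi(Diagnostics) public func allKeys() -> [String] {
        entries.keys.sorted()
    }
}

// Client side, in another module, would opt in with:
//   @_spi(Diagnostics) import Storage
// A plain `import Storage` would see `store` and `lookup`, but not `allKeys`.

// Inside the defining module the SPI is usable without any opt-in.
var cache = Cache()
cache.store("1", for: "a")
cache.store("2", for: "b")
print(cache.allKeys())
```

The opt-in is per import and per named group, which is what makes it feel like a list of clients rather than a broad scope.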

Anyways, this proposal suggests another access control modifier, package, which does scoping on the package level. If we look at this from a purely access control perspective, why should we consider packages special? People organize their code in all sorts of ways, not just packages. A lot of people don't use SPM at all, as has been mentioned in this thread. I don't actually think this unit of organization is any better than, say, "directory" (where all files in the same directory can access each other) or "bundleid" (where clients sharing the same reverse DNS prefix can access each other). Lots of projects are organized in a way where this would be about as useful as package. People group their code in all sorts of ways, and I think if we actually feel like doing this properly we should just take @davedelong's suggestion and let people pick their own custom access modifiers that fit their project the best. I feel like a lot of people who are asking for @_spi to be more powerful are really asking for "I have some sort of internal API and a specific type of client in mind, let me specify who I think those should be and if we both agree to be adults by opting in to this internal API boundary then they can access it".

Additionally, I feel like too much of the motivation for the proposal (and the subsequent arguments in this thread) is to make compiler things line up at the expense of how most users actually care about access control. We shouldn't pick an access control scope because it allows for eliding some inter-module resilience boundaries inside of a package, or because we want to minimize the number of public symbols exported from a binary. There's been a lot of pushback on the concept of "arbitrary clients". I think some of it comes from a place of being unhappy that this requires these things to be exposed publicly ABI-wise and reduces opportunities for optimizations. I feel like the authors have decided that packages are a convenient place to do some sort of intra-module grouping for the compiler and feel that it's also a workable place to do some access control work to make it happen. I don't actually think it's the wrong abstraction for a lot of code, but I think this proposal blesses this grouping specifically for reasons that aren't all that unique or compelling if you measure them up to the previous access control modifiers we had. And, I mean, a lot of what you view to be an ad-hoc coupling is a completely sane relationship for my organization. I mean, there are all sorts of things that are totally reasonable to support in Swift but are far more similar to @_spi than package. I don't think there is any static grouping that can cover the range of access control people look for in their projects.

I think if we move ahead with this it should come as an instance of a more general solution that allows developers to make their own classification based on their project's needs. If it so happens that we can only make access modifiers like package fast, and someone's directoryprivate access control expends all its benefits at compile time and doesn't end up making things any faster than public does, so be it.

6 Likes

I’m going to use this access modifier a lot, whatever it is called. And I’m an indie developer working all alone in most of my projects. Naming the access identifier after some motivation about the needs within Apple seems wrong to me. Yes, those needs are real and I totally respect them. But the name should really reflect what the access modifier does, not how it’s used in most cases. At least for me it would feel weird to use it.

5 Likes

I feel like those are pretty good reasons to introduce a new access modifier, to be honest.

And also, unlike “directory” or “bundleid”, a “package” is an official and accepted grouping concept within the Swift language project. Long term, I think it could also replace targets within Xcode in some way.

3 Likes

I usually refer to Xcode groups as “folders”, but sometimes that has led to confusion when the group structure in Xcode doesn’t correspond 1:1 with the directory structure on disk.

1 Like

It still seems to me that coupling the unit of code distribution to an access modifier designed to improve visibility of declarations across modules is rather arbitrary, because they are separate concepts. In other words, the fact that some blob of code is distributed as a single Swift Package is unrelated to how complex its internals are, and it's still definitely possible that multiple, independent people or teams work on different parts of that codebase, and would love to take advantage of this feature.

To be clear, I think that the feature as presented is good enough for my particular set of use cases, so I pretty much support it, but I can very well think of cases where one group per Swift Package is going to be too limiting, given the way SPM currently works.

a package is not a grouping concept, a package is a distribution concept that this proposal would now promote to a grouping concept.

what i fear will happen if this proposal is accepted as-is is that if i have a multi-nuclear package, like the BSON/MongoDB example i gave, then only one of the nuclei would be able to use the feature. and since neither nucleus is clearly “more important” than the other, then i would have no choice but to split the package into two packages with independently incrementing version numbers, which would be very very bad.

things like subsystems within subsystems, overlapping subsystems, or arbitrary user-defined access control levels are unimportant to me, because lack of those things will not exacerbate dependency hell, whereas promoting packages to a grouping concept certainly will.

to reiterate, all i really care about is not having to manage multiple independently-incrementing version numbers, and that is the only reason i am -1 on this proposal as is.

3 Likes

I had already mentioned it in the pitch, but for the sake of completeness I’ll repeat it here again:

I can imagine instead of a new access modifier that we could also alter the behavior of the existing “internal” modifier for targets by introducing a new target option in the Package manifest, called something like “extendInternalToPackage”.

This way we would effectively have the same as the suggested “package” behavior, but we would lose the “module wide internal” access modifier and would only have private & fileprivate left for hiding things away from other modules within the same package. But for my use cases, where my main goal is to hide away public APIs in helper modules from the actual public APIs of the main library target(s), this would do exactly what I need without a new access modifier.

I’m not saying this is my preferred way, I’m just mentioning it to bring another alternative to the table for consideration.

To riff off this thought, and after thinking about the multi-nuclei use case and going through the proposal again (and thinking of the test/benchmark considerations): isn’t the major issue with the proposal as it stands this passage?

The Swift Package Manager already has a concept of a package identity string for every package, as specified by SE-0292. This string is verified to be unique via a registry, and it always works as a package name, so SwiftPM will pass it down automatically. Other build systems such as Bazel may need to introduce a new build setting for a package name. Since it needs to be unique, a reverse-DNS name may be used to avoid clashing.

What if this logic were changed slightly in the context of SwiftPM, such that the package identity string is passed by default but can be overridden for a target: either set to something else (“BSON” or “nuclei2”, which would be appended to the default identity string) for advanced use cases, or removed (e.g. for a benchmark), similar to how custom source paths can be overridden? Wouldn’t this solve most of the issues brought up, or am I missing something? Then most users can just use the default logic, but it’s possible to tweak if needed: progressive disclosure. Other build systems would need a similar approach.
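A minimal sketch of what that override logic could look like, modeled in plain Swift. The `PackageNameSetting` enum, the `effectivePackageName` function, the "." separator, and the `example-package` identity are all assumptions made up for this illustration; SwiftPM has no such per-target setting.

```swift
// Hypothetical per-target setting; not an existing SwiftPM option.
enum PackageNameSetting {
    case inherited          // default: pass the package identity as-is
    case suffixed(String)   // advanced: append a sub-group name to the identity
    case removed            // e.g. a benchmark that should only see public API
}

// Computes the package name the build system would hand to the compiler
// for one target, or nil if the target should get no package name at all.
func effectivePackageName(identity: String, setting: PackageNameSetting) -> String? {
    switch setting {
    case .inherited:
        return identity
    case .suffixed(let suffix):
        // The "." separator is an arbitrary choice for this sketch.
        return "\(identity).\(suffix)"
    case .removed:
        return nil
    }
}

let identity = "example-package"
print(effectivePackageName(identity: identity, setting: .inherited) ?? "none")
print(effectivePackageName(identity: identity, setting: .suffixed("BSON")) ?? "none")
print(effectivePackageName(identity: identity, setting: .removed) ?? "none")
```

Targets that end up with the same effective name would share package-level declarations; targets with a different suffix, or with no name, would only see each other's public API.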

I’ve tried to follow all the discussion and hope I haven’t missed if this was already suggested, apologies in advance if that’s the case.

4 Likes