Pitch: Fully qualified name syntax

beccadax · September 3, 2019, 10:38pm

Swift's name lookup is…messy. Very messy. This causes a number of problems, and today I'd like to talk about a potential solution for one of them.

Swift allows names to be shadowed by declarations in a nested scope. The idea is that, if you import WidgetKit to access its Widget type, but you yourself declare a Widget type, you can always access WidgetKit's version by writing WidgetKit.Widget. But what if something shadows the name WidgetKit? Then you can't fully qualify the names in WidgetKit at all.

That might seem unlikely, but many frameworks don't use the Kit suffix; in particular, it's quite common for a framework to be named after a type it provides, like the XCTest class in the XCTest framework. If you import a module like this, you have no way to fully qualify any of its names—if you say XCTest.Something, it will look for Something inside the XCTest.XCTest class.
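To make the failure mode concrete, here is a toy model of the lookup order described above. It is only a sketch and not how the Swift compiler implements name lookup: `Module`, `Resolution`, `resolveAsModule`, and `resolveToday` are hypothetical names, and the member lists are purely illustrative.

```swift
// Toy model only: this is NOT how the Swift compiler implements lookup.
struct Module {
    let name: String
    /// Top-level type name -> member names of that type (illustrative data).
    let types: [String: Set<String>]
}

enum Resolution: Equatable {
    case member(type: String, name: String)
    case topLevel(module: String, name: String)
    case notFound
}

/// Treats `moduleName` strictly as a module name, which is what an
/// explicit "this is a module" syntax would ask for.
func resolveAsModule(_ moduleName: String, _ name: String, in imports: [Module]) -> Resolution {
    guard let module = imports.first(where: { $0.name == moduleName }),
          module.types[name] != nil else { return .notFound }
    return .topLevel(module: moduleName, name: name)
}

/// Mimics the behaviour described above: an imported type named `first`
/// wins over a module named `first`.
func resolveToday(_ first: String, _ second: String, in imports: [Module]) -> Resolution {
    for module in imports {
        if let members = module.types[first] {
            return members.contains(second) ? .member(type: first, name: second) : .notFound
        }
    }
    return resolveAsModule(first, second, in: imports)
}

let xctest = Module(
    name: "XCTest",
    types: ["XCTest": ["run"], "XCTestCase": ["setUp"]]
)

// `XCTest.XCTestCase` is searched for inside the XCTest class and fails...
let today = resolveToday("XCTest", "XCTestCase", in: [xctest])
// ...while treating the first component as a module finds the type.
let qualified = resolveAsModule("XCTest", "XCTestCase", in: [xctest])
print(today, qualified)
```

In this model the class named `XCTest` intercepts the first component before the module is ever considered, which is exactly why a spelling that forces the module interpretation is needed.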

This problem is especially severe in swiftinterface files, where we would like to fully qualify all names; XCTest and a couple other modules are currently working around this with a special flag, but we'd like to handle this properly. However, we don't want to give name lookup different semantics in swiftinterface files, and this problem does come up in developer-written code sometimes too, so some kind of language change is needed to address this problem.

Although the problem is clear, the solution is not, so I'd like to open discussion on that.

Three approaches

I can think of three ways to deal with this:

1. Change the semantics of all name lookups in some way, for instance by looking up module names before imported module contents.
2. Add a syntax which unambiguously indicates that the first identifier in a dotted name is a module name. For instance, using a straw syntax, XCTest.Something might refer to Something inside the XCTest.XCTest class, but @qualified XCTest.Something would refer to Something inside the XCTest module. This approach has two variants:

a. Continue to prefer the current undifferentiated names wherever possible, but provide this new syntax as an escape hatch to be used when necessary.

b. Deprecate the use of undifferentiated module names and encourage developers to use the new syntax for all fully-qualified names. We might even remove support for undifferentiated module names in a future source-breaking version of Swift.

I don't think we can seriously consider #1 for two reasons: It would be badly source-breaking and the changed semantics would probably trade precision for convenience. (For instance, you might always need to say XCTest.XCTest to address the XCTest class.)

Both variants of #2 seem workable, but they would have different syntactic trade-offs—if we chose 2b, we would want a short, unobtrusive syntax since it would be used frequently, whereas for 2a, we would want something longer and more self-identifying since it would be used only when necessary. 2a changes the feel of the language less, but 2b simplifies the language. I think I prefer 2a, but I could probably be persuaded otherwise.

Syntax

The other open question is, what should the "this is a module name" syntax look like? I'm prototyping this with the syntax @qualified XCTest.Something because the compiler already understands type attributes, but there are a lot of other directions we could go:

Different spellings of a type attribute
Type attribute containing the module, like @module(XCTest) Something
Magic root identifier, like #Qualified.XCTest.Something or #Modules.XCTest.Something
Similarly, some sort of parameterized thing, like #module<XCTest>.Something or #module(XCTest).Something or even #<XCTest>.Something.
Sigil indicating the following is a module name, like 'XCTest.Something or ::XCTest.Something or ..XCTest.Something
Different symbol to look up a name inside a module vs. a type/instance, like XCTest'Something or XCTest::Something
Straight single-quotes mean a module name, like 'XCTest'.Something; this would obviously preclude use for character literals

I'm open to other suggestions too, so bikeshed away.

owenv · September 3, 2019, 11:05pm

Thanks for writing this up, this definitely seems like a problem worth solving!

A couple of quick thoughts:

For what it's worth, I'm not a huge fan of the @qualified spelling the prototype uses; it seems a little too vague. Maybe @fullyQualified or @moduleQualified would be better?
The suggestions like #module(XCTest).Something which don't introduce whitespace between the module and member names read a little more clearly in my opinion. I think it's partly a matter of whether this feature is framed as "apply this attribute to opt in to different name lookup rules" or "use this syntax to unambiguously name a module, and then look something up in it."

Just out of curiosity, did you consider allowing for aliased imports like import XCTest as XCTestFramework? I assume that wouldn't necessarily solve the problem for module interfaces.

anandabits · September 3, 2019, 11:25pm

This is an important problem to solve. I just ran into it recently and had to resort to importing a (sometimes long) list of declarations from the module in question instead of importing the module itself. This is obviously not a scalable approach and we should offer a better solution.
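The workaround described here presumably relies on Swift's scoped import form, which names a single declaration instead of the whole module. A minimal sketch, using Foundation's `Date` purely as a stand-in for the declarations that had to be listed one by one:

```swift
// Scoped import of a single declaration rather than a plain `import Foundation`.
import struct Foundation.Date

// The imported type can then be used by its bare name.
let epoch = Date(timeIntervalSince1970: 0)
print(epoch.timeIntervalSince1970)
```

Every additional declaration needs its own import line, which is why the approach does not scale.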

I think we should go with 2a. The problem with 2b is that the need for disambiguation is rare enough that a more concise syntax such as a :: prefix sigil is not something Swift programmers will encounter that often even if required to use it for disambiguation. The shorthand syntax is thus unlikely to be familiar to a lot of Swift programmers.

2a has the advantage of being non-breaking, being simpler in the common case for disambiguation, and being clear and explicit in the new support offered for explicitly qualified disambiguation.

This is just my initial reaction, but of the ideas you listed, I like the #Modules.XCTest.Something syntax best. I don’t like the space between the attribute syntax and the module name. This seems to fit reasonably well with other uses of # in the language. That said, it’s worth considering whether using # here would preclude anything we might want to use # syntax for in the future. It seems easier to accommodate with other syntax than some other uses might be if there is a potential conflict.

anandabits · September 3, 2019, 11:26pm

This wouldn’t solve the disambiguation problem within the XCTest module either (unless it was accompanied by syntax that supported aliasing the declaring module).

beccadax · September 3, 2019, 11:28pm

Those are definitely clearer, but they're also on the verbose side. It might be better to present the feature in a different way instead, like that you're specifying an @absolute name or the name is @rooted or something.

(I am not attached to @qualified; I used it only because I needed to use something and an underscored type attribute name would create some awkward identifiers inside the compiler.)

~~Just off the top of my head, module interfaces could probably use that syntax by detecting whether the module name was ambiguous and adding a suffix if necessary.~~ @anandabits is right and I'm wrong. We'd need a way to rename your own module.

I don't love it, though, because in a sense it makes name lookup harder rather than easier. Instead of introducing a way to remove ambiguity, it adds ambiguity because you're indirecting through an alias for the real name. It's also the kind of thing you can't really offer a fix-it for—I could imagine a future version of Swift offering to replace XCTest.Something with @qualified XCTest.Something if that would make your code valid; a fix-it to rename an import and change all references to it would be a bit more challenging. And although we could stop with just renaming the module, it does open a can of worms around renaming individual declarations from the module; I'd rather keep that can closed.

owenv · September 3, 2019, 11:33pm

That's perfectly reasonable, feel free to completely ignore that suggestion then! I agree it would be really nice to be able to provide a great fix-it since I assume most users won't be familiar with this feature the first time they encounter it.

typesanitizer · September 3, 2019, 11:37pm

Would it be worthwhile to discuss this in light of potential things we might add in the future?

If there's a chance that we add hierarchical modules (I don't know how likely this is), we should have a terse syntax supported.
If there's a chance that we add type ascription (i.e. adding inline type annotations for subexpressions), and we decide that we're going to use a sigil here, that sigil should not be : (not to mention that it would lead to parsing ambiguities around dictionaries).

jrose · September 3, 2019, 11:50pm

We have type ascription; it's _ as Foo. There are places where that can additionally do coercion, but it will never change the meaning of code to make an inferred type explicit.
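A small sketch of what `as` does to a subexpression; the variable names are only illustrative:

```swift
let inferred = 2 / 4              // both literals default to Int
let ascribed = (2 as Double) / 4  // `as` pins the first literal's type, so 4 becomes a Double too
let optional = 5 as Int?          // `as` can also coerce, here from Int to Optional<Int>
print(inferred, ascribed, optional as Any)
```

The ascription only selects among types the expression could already have had; it is not a new sigil-based syntax.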

GetSwifty · September 4, 2019, 12:12am

I've come up against this a couple times, so thanks for addressing it!

Seems like 2a is the way to go. It's a relatively constrained issue, so it seems like it should be something that can be fixed with additions to the language, with fix-it-able warnings for the places where it's ambiguous.

I would prefer some sort of symbol to indicate "treat this as a module name" over a # or @ option. As much as I'm hesitant to suggest something that looks like C++, ModuleName::ClassName seems like a good option.

Reasons:

It's foreign in Swift, but it's a syntax that exists in other languages and could be reasoned through if seen in the wild.
If nested modules are ever supported, the syntax would be self-explanatory while still concise: ParentModule::ParentModuleClass, ParentModule::ChildModule::ChildModuleClass
It would feel natural at both call site and as an import.

Hopefully there's a better sigil than ::, but unfortunately nothing better comes to mind immediately. I'm unsure how the C++ compatibility is coming, but I can see it potentially being used there (even if just internally).

Jon_Shier · September 4, 2019, 1:06am

I like this option well, and agree with the earlier reasoning about this type of disambiguation being rare. However, if we do want a sigil, I'd like to suggest @XCTest, as a callback to @import.

Would it be possible to have a non-prefixed Modules discriminator? Perhaps in the future it could expose API to see what modules are currently visible, but for now just be used to namespace the available modules? Modules.XCTest seems fine to me.

marcusrossel · September 4, 2019, 7:42am

Perhaps using backticks to refer to a module explicitly would fit the language: `Module`.Something.
Swift already uses backticks to remove the ambiguity from using names that collide with keywords. So this could feel natural, as it would also remove ambiguity - just in the context of module names.
Not sure if this could lead to any collisions with the already existing backtick functionality though.

beccadax · September 4, 2019, 8:06am

Yeah, that would be the problem—it would change the meaning of existing code. I’d wager that at least a third of backtick uses are on unqualified names.
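For reference, a short sketch of how backticks behave today; `Widget` is a hypothetical type used only for illustration:

```swift
struct Widget { let label = "widget" }

// Backticks let a reserved word serve as an identifier.
let `class` = "economy"

// Around an ordinary name they are accepted and change nothing.
let plain = Widget()
let quoted = `Widget`()
print(`class`, plain.label == quoted.label)
```

Because the last form is already legal and means the same as the bare name, giving backticks a "this is a module" meaning would reinterpret code that compiles today.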

strictlyswift · September 4, 2019, 12:06pm

I agree with this - the requirement, the use of :: and the reasoning for doing so. I don’t think we should avoid something just because it looks a bit like C++!

michelf · September 4, 2019, 1:31pm

ModuleName::ClassName

While this looks nice, I don't think solving this problem is worth introducing a new scoping operator in Swift. Things are much simpler as they are now with a dot (.) everywhere.

The problem at hand only requires a way to refer to a module unambiguously. Any variation of this would work fine:

#.ModuleName.ClassName

where # acts as some sort of language root to which modules belong.

xwu · September 4, 2019, 1:45pm

If we want our design to be able to accommodate the possibility of submodules, a new scoping operator (either in name or in effect) is actually inevitable. It's just a question of how clunky it appears.

Consider that submodules may themselves need disambiguation; if we go with your idea, we have:

#.ModuleName.#.SubmoduleName.ClassName

...and then .#. becomes a new scoping operator in all but name (particularly if we subscribe to @anandabits's point about option 2a versus 2b).

Likewise, if we go with something like Modules, then we have:

Modules.ModuleName.Modules.SubmoduleName.ClassName

...and then .Modules. becomes a new scoping operator in all but name.

Compared to these options, the far superior choice in my view is ::; it's terse but not to the point of confusion, visually distinct as an actual operator that can be documented and taught, and precedented in other languages:

ModuleName::SubmoduleName::ClassName

jayton · September 4, 2019, 1:48pm

I’m in favour of ::, but I don’t think it makes sense to introduce that while preferring Module.Name, so I see it as a 2b option.

Lantua · September 4, 2019, 2:13pm

I’d prefer

Module.SubModule::ClassName

Dante-Broggi · September 4, 2019, 2:18pm

Would the current naming system work in the presence of submodules, save the OP issue, if a module is not allowed to have a submodule and a top level type with the same name?

I would think so, and if so we would only need to create a way to name the (current) true top level, i.e. uniquely name the namespace of currently imported modules. (OP syntax 3)

harlanhaskins · September 4, 2019, 2:24pm

The more I think about this problem, the more I like :: as “explicit module qualification” syntax.

It seems like we could use the same syntax for module qualification of members as well:

myStr.Foundation::range(of: "hi")

This would also solve the very real problem of 2 dependencies introducing the same overload with no way to disambiguate.

michelf · September 4, 2019, 2:34pm

I think this deserve more thought. The reason you need two scoping operators is so you can avoid clashes with identifiers coming from two different uncoordinated sources. Wouldn't it make sense for submodules to be coordinated with the parent module so they don't cause clashes?