[Pitch] Module selectors

Parsing details

...
The :: token may have whitespace on either, both, or neither sides without affecting how the code is parsed.

I'm not fully convinced by this. In my opinion, the module selector should behave more like a part of the identifier, so whitespace should not be allowed on either side.

Consider the following example:

foo.bar
::baz()

Here, bar doesn’t visually appear to be part of a module selector at first glance.

foo::
bar()

This might feel acceptable, especially if either the module name or function name is particularly long. But consider:

foo.bar::
  baz()

In this case, I don’t find foo.bar:: to be a meaningful grouping.

Another syntactic consideration is whether to allow bare keywords as identifiers after ::, just as we do after the . token. For example:

MyModule::switch = true
thing.MyModule::as(.that)

My opinion is YES because there's no ambiguity here.

But I think allowing whitespace after :: can cause confusion in the parser. For example:

Foo::
switch bar {
  ...
}

This Foo:: is probably an incomplete expression, or a mistyped Foo: (statement label). But allowing whitespace after :: would make the parser consume Foo::\nswitch as a single expression, and the resulting diagnostics would be confusing ("consecutive statements must be separated by ;" between switch and bar).

edit:
I thought it might be worth mentioning that . actually allows whitespace before the member name, but only when the whitespace is balanced:

foo . 
  bar() // OK

But I consider this a bug, though it’s probably too late to change now. That said, I don't think we need to follow it.

I'm not sure if that's a bug, or just a design oversight. Allowing whitespace around an operator when that whitespace is balanced is the general parsing rule for infix operators, and avoiding it for . would have required special casing.
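
For anyone who wants to check the current behavior of . and of ordinary infix operators, here is a minimal sketch that should compile with today's Swift. Foo and bar are hypothetical names used only for illustration, and the commented-out line is the unbalanced case that is not accepted as an infix expression.

```swift
// Hypothetical type, used only to have a member to access.
struct Foo {
    func bar() -> Int { 42 }
}

let foo = Foo()

// No whitespace, and balanced whitespace: both are member accesses.
let tight = foo.bar()
let balanced = foo . bar()

// A line break after a balanced `.` is accepted as well.
let wrapped = foo .
    bar()

// Unlike ordinary infix operators, `.` also accepts whitespace on the left only.
let leadingSpace = foo .bar()

// Ordinary infix operators: balanced whitespace (or none) on both sides.
let sum = 1 + 2
let sumTight = 1+2
// let unbalanced = 1 +2   // not accepted as infix addition

print(tight, balanced, wrapped, leadingSpace, sum, sumTight)
```

The point of the sketch is only that . already behaves differently from a plain infix operator, which is relevant when deciding how much of that behavior :: should inherit.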

From a formatting point of view, I think it would be unfortunate if we disallowed breaking here. I'm imagining a post-SE-0451 world where Bazel users are using target-labels-as-module-names and need to use a module selector to disambiguate something:

let x = someVerboselyNamedReceiver
  .`//my/project/location/and/maybe/its/really/long:MyModule`::aDescriptivelyNamedMethod(...)

This definitely shouldn't happen very often, but it's nice to have options so that we don't end up with extremely long unbreakable token sequences. My personal preference would be to allow breaking before the :: since that's where swift-format breaks member references today; I don't care too strongly about breaks after the ::. Then, we could reflow to this if necessary:

let x = someVerboselyNamedReceiver
  .`//my/project/location/and/maybe/its/really/long:MyModule`
    ::aDescriptivelyNamedMethod(...)

Now, this raises a question. Let's simplify that:

let x = a
  .b
    ::c()

Are we willing to close off a possible situation where the thing on the left-hand-side of the :: is empty/implied, similar to how C++ uses ::blah to reference the global namespace? Because if we do allow whitespace around :: and want to consider that syntax later for something, the code above would be ambiguous. Is it one statement, assigning a.b::c() to x? Or is it two statements—assigning a.b to x and then calling whatever ::c() might mean later on?

We could probably finesse it similar to how contextual keywords like copy disallow line breaks between themselves and the subsequent identifier, but then we're back to the initial problem—we've made it impossible to line wrap in some situations again.

This can be resolved by having the current global module be spelled _::.

My point was that I'm not so much interested in solving for a specific purpose of the no-left-side ::X syntax, but rather make sure we're ok with being unable to use it for anything (or ok with having odd whitespace-sensitive rules for it, if we did).

The only reasonable meaning for the no-left-side syntax that I can think of would be look up in the current module. So, as there exists a reasonable explicit syntax for inferring that module in the form of _::, I don't think forgoing the no-left-side spelling is, or even could be, a problem.

Sure, but special casing would not be an issue. The parsing behavior of . is already distinct from that of infix operators (e.g. foo .bar is valid). And it's not just .: many punctuators such as ( in function calls, { in trailing closures, [ in subscripts, @ in attributes, and # in macro expansions all have their own specialized parsing rules.

My observation here would be that I think the particular syntax issue here is another manifestation of the mismatch between the pitched relative precedence of :: versus . and (at least many people's) intuitions on that point.

If we instead reconciled that intuition and adopted leftward hoisting rules as mooted above, mandating parens for componentwise qualification as an advanced use, I think we wouldn't have to choose between whitespace-sensitive rules or disqualifying bare ::X syntax.

It doesn’t necessarily have to be used for module selectors. In the future, we might want to support something like ::alpha:: as an expression. I think what @allevato is concerned about is whether we’re okay with closing off that possibility.

If this does come to a proposal, I think there should be some serious bikeshedding around using ::, which has precedent in two languages that read like symbol soup: C++ and Rust.

I just want to chime in with support for having some way to spell “the current module, without naming it”.

I’ve run into situations with playgrounds where I need to disambiguate, but I don’t actually know the name of the playground module (is it dynamically generated and different at each execution?)

On balance, I agree. I'm editing that change in.

Less convinced about this. In C++, I often end up writing things like:

SomeBigPolysyllabicReturnType VeryLongAndComplicatedClassName::
evenLongerAndMoreComplicatedMethodName(SuperVerboseParameterType param) { ... }

Granted, function declarations are especially wordy and module selectors aren't valid there in Swift, but I don't think it's inherently confusing or bad style. We could just declare it off limits anyway, but it wouldn't be without cost.

Personally? Yeah, I'm willing to close that off.

In the proposal as written, module selectors are always rooted because modules exist in a single flat, global namespace, and if it were accepted, any future extensions would still have to look in that namespace first or they would break source. That means there would never be any need for a "no, this really is rooted, I pinkie-swear" syntax.

We could still use "no module name" to mean "rooted in the current module", but I prefer Self::foo() or Any::foo()* for that—they're nearly as short and a lot clearer.


* Any makes sense as a spelling because a module selector for the current module will also end up looking into modules it imports.

I understand that allowing whitespace can sometimes improve readability. However, I’d still vote against allowing whitespace around :: .

Unlike :: in C++:

  • In Swift, module selectors will be used relatively rarely; they are only needed when there are name conflicts between modules. The likelihood of conflicts is also lower when symbol names are long or descriptive.
  • Swift doesn't require ; to separate statements. Stricter whitespace rules help enable more sensible diagnostics and better parser recovery.

Yes, applying whitespace rules is not free, but that happens only when the parser sees ::. That cost is probably negligible.

Okay. We'll review the proposal with a newline restriction and see if there's a lot of pushback.
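
To make the idea of a newline restriction concrete, here is a rough sketch of a check that flags a line break directly after ::. It assumes, purely for illustration, that the restriction forbids a newline between :: and the following name; the thread does not spell out the exact rule. The function name hasLineBreakAfterSelector is hypothetical, and the scan works on raw text, so it is not a real parser (it does not skip string literals or comments).

```swift
/// Hypothetical helper: returns true if some `::` in `source` is followed
/// by a newline, ignoring any spaces or tabs in between.
/// Assumption: the restriction under discussion forbids exactly this.
func hasLineBreakAfterSelector(_ source: String) -> Bool {
    let chars = Array(source)
    var i = 0
    while i + 1 < chars.count {
        if chars[i] == ":" && chars[i + 1] == ":" {
            // Skip horizontal whitespace after the `::` token.
            var j = i + 2
            while j < chars.count, chars[j] == " " || chars[j] == "\t" {
                j += 1
            }
            if j < chars.count, chars[j].isNewline {
                return true
            }
            i += 2
        } else {
            i += 1
        }
    }
    return false
}

// The confusing case from earlier in the thread is flagged.
print(hasLineBreakAfterSelector("Foo::\nswitch bar {\n}"))
// A selector on one line, or a break before `::`, is not flagged.
print(hasLineBreakAfterSelector("foo::bar()"))
print(hasLineBreakAfterSelector("let x = a\n  .b\n    ::c()"))
```

Under this assumed rule, breaking before :: stays available for line wrapping, while the Foo:: followed by switch case can be diagnosed at the selector itself.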

My main concern about the current direction is that SomeClass.AModule:: may look very similar to SomeClass.ANamespace. where AModule and ANamespace are spelled the same. In general it breaks reading left to right in increasing specificity. It might be possible to work around this with syntax highlighting in the editor, but just reading it as text I feel the need to have this new syntax scream more loudly at me.

There’s a lot of discussion about operator precedence and I do share some of the confusion raised. I also agree with a suggestion raised upstream that this syntax should look unclean and surprising.

Have we considered always requiring brackets on the bit being operated on? SomeClass.(AnotherModule::MyNamespace).doThing()?

FWIW in that scenario, anything after :: would have to be directly implemented in AnotherModule for the syntax to make sense to me. And I’d “left-attach” it such that MyNamespace must be implemented in AnotherModule, not “just” the thing on the right end of it. (This is the same as what’s currently proposed, just stricter due to the parens.)

You most likely want to post in the current proposal review thread rather than the old pitch thread.