SE-0491: Module selectors for name disambiguation

Note that today you can declare a function named subscript and call it this way. The name of a subscript declaration is a special DeclName and it’s not the identifier subscript.

Ideally, every declaration reference in every @inlinable body in a .swiftinterface should be rewritten with a module selector to prevent future breakage. However, if we cannot attach a module selector to a subscript (or to any declaration reference), that benefit would be lost.

In that context, I think the proposal should discuss module selectors for operators (binary, prefix, and postfix). For infix operators, we can already rewrite operand1 + operand2 as (Swift::+)(operand1, operand2). But as far as I know, there is currently no way to disambiguate prefix or postfix operators using this syntax. For example, (^)(operand) is ambiguous: it could mean either ^operand or operand^.

prefix operator ^
postfix operator ^
prefix func ^(x: Int) -> Int { x + 1 }
postfix func ^(x: Int) -> Int { x + 2 }

_ = ^42 // OK
_ = 42^ // OK
_ = (^)(42) // error: ambiguous use of operator '^'
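
A minimal sketch of how this can be worked around with today's syntax, assuming the same operator declarations as above: an infix operator can be referenced as a function value, while the prefix and postfix forms have to be wrapped in named functions. The wrapper names prefixCaret and postfixCaret are hypothetical and exist only for illustration.

```swift
prefix operator ^
postfix operator ^

prefix func ^(x: Int) -> Int { x + 1 }
postfix func ^(x: Int) -> Int { x + 2 }

// Infix operators can already be referenced as ordinary function values.
let sum = (+)(40, 2)

// Hypothetical wrappers: each body uses the operator in an unambiguous
// position, so the wrapper name selects the prefix or postfix form.
func prefixCaret(_ x: Int) -> Int { ^x }
func postfixCaret(_ x: Int) -> Int { x^ }

let viaPrefix = prefixCaret(42)
let viaPostfix = postfixCaret(42)
```

This only sidesteps the ambiguity; it does not give a way to attach a module selector to the operator reference itself, which is the gap being pointed out.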

I'm investigating this now and trying to figure out the exact shape of the restriction. Are there things that the compiler models as (possibly structural) DependentMemberTypes which ought to support module selectors? I notice that there are code paths in GenericSignatureImpl::lookupNestedType() which return concrete types; that makes me wonder if perhaps the presence of a module selector means that the type must be concrete.

This code path is there to support some legacy behaviors that were part of the GenericSignatureBuilder, but are now just modeled in an approximate way. Consider these two functions:

class C {
  struct A {}
} 

func f<T: C>(_: T, _: T.A) {}

protocol P {}

extension P {
  typealias A = Int
} 

func g<T: P>(_: T, _: T.A) {}

The T.A in each case refers to a concrete member type, and not an associated type. They are resolved in interface type resolution, and we don't actually form a DependentMemberType in this case. We ought to be able to support module selectors here, so you could write T.Mod::A.
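
A small sketch of what "concrete member type" means here, reusing the declarations above. The conforming type S, the parameter labels, and the return types are additions for illustration only. The bodies type-check only because T.A is resolved to C.A and Int respectively, rather than to an opaque associated type.

```swift
class C {
  struct A {}
}

protocol P {}

extension P {
  typealias A = Int
}

// Hypothetical conforming type, added so that g can be called.
struct S: P {}

// T.A resolves to the nested struct C.A, so `a` can be returned as C.A.
func f<T: C>(_: T, _ a: T.A) -> C.A { a }

// T.A resolves to Int, so integer arithmetic on `a` is accepted.
func g<T: P>(_: T, _ a: T.A) -> Int { a + 1 }

let nested: C.A = f(C(), C.A())
let incremented = g(S(), 41)
```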

The hack you saw comes in if you reference a concrete type member of a type parameter from a where clause. In the general case, T.A in a where clause can only refer to an associated type declaration named A. However, there is a hack in lib/AST/RequirementMachine/ConcreteContraction.cpp (on the main branch of swiftlang/swift) to allow some things like this to work:

class C {
  struct A {}
}

func f<T: C, U: Sequence>(_: T, _: U) where U.Element == T.A {}

Here, we start with a DependentMemberType T.A, but we fold it down to a concrete type in concrete contraction.
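
As a usage sketch of the where-clause case, assuming the behavior described above: the call below should be accepted because U.Element == T.A ends up meaning U.Element == C.A. The parameter label items and the Int return value are additions for illustration.

```swift
class C {
  struct A {}
}

// T.A in the where clause starts out as a structural dependent member type
// and is folded down to the concrete type C.A.
func f<T: C, U: Sequence>(_: T, _ items: U) -> Int where U.Element == T.A {
  var count = 0
  for _ in items { count += 1 }
  return count
}

// An array of C.A satisfies the requirement.
let n = f(C(), [C.A(), C.A()])
```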

In the first case, you could allow module selectors to appear if you think it's worth it -- interface type resolution doesn't form DependentMemberTypes unless name lookup returns an associated type declaration.

When we're resolving the where clause though, we don't have enough information to do name lookups, so we just form structural DependentMemberTypes. I don't think it's worth trying to hack module selectors into DependentMemberType; we should just say module selectors cannot be used with a type parameter base in a where clause.

(If we were doing this all over again, the first behavior of member types in interface type resolution might be defensible, but the thing that concrete contraction models should not exist at all.)

If this discussion needs to continue could we move it to Development > Compiler?

I feel like nailing down the positions in the grammar where module selectors should be accepted is a pretty relevant part of the review of this proposal, and not just something to be shunted off to the "implementation details" corner.

That’s fair, and I agree! I was overeager with “continue this elsewhere”; I suppose what I meant was more “if we’re going to have this discussion here can we keep it a bit more focused on intended semantics/syntax and less on implementation plan?”

I don’t want to go off-topic, but I wish the “Special syntax for the current module” future direction were part of the proposal’s scope. I’d really love to see this included, as I worry that if it’s left as a future direction, it won’t be picked up.

I've been looking into this, and yeah, I think the right answer is to ban module selectors in dependent member types even if they ultimately end up resolving to a concrete type. I've pushed a commit to the PR that implements this, and I'll revise the proposal shortly.
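
A rough sketch of the shape of that restriction, as I understand it from the discussion above. The module name Mod and the names h and S are hypothetical, and the rejected spelling is shown only in a comment because it uses the proposed syntax.

```swift
protocol P {
  associatedtype A
}

// Hypothetical conforming type, added so that h can be called.
struct S: P {
  typealias A = String
}

// Fine: T.A is a member type of a type parameter, written without a selector.
func h<T: P>(_ value: T, _ member: T.A) -> T.A { member }

// Under the restriction described above, a module selector on a member of a
// type parameter would be rejected, even if the member resolves to a
// concrete type:
//
//   func h<T: P>(_: T, _: T.Mod::A) {}   // not allowed
//
let result = h(S(), "x")
```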

Update: Revision PR: https://github.com/swiftlang/swift-evolution/pull/2970

I agree with the suggestion above. When reading this pitch, I also felt that parentheses would fit Swift better.

We cannot rule out every letter and symbol used by other languages just because it has a different meaning there. Those who come from another language can be expected to realize that:

  1. there is a limited number of letters and symbols suitable for programming syntax, so no programming language can invent a syntax that is completely unique and unlike any other.
  2. each syntax construct may have a different meaning, so anything I know from one language may carry different semantics in another. I can hardly imagine someone coming from C++ who starts writing Swift with only C++ knowledge, without reading the Swift documentation, and is then so confused by :: that all progress is blocked.

I don’t know whether there are hidden pitfalls in the parser or the grammar rules for using parentheses, but they look clean and fairly intuitive.

If parentheses cannot be used, I’m OK with ::.

I’m out of my depth, but I haven’t seen it suggested here yet: something like a from keyword?

let withColons: Foundation::NSString // currently proposed
let withFrom: NSString from Foundation // easier to read

// Simple nested type
func makeIonThruster() -> Spacecraft.Engine from IonThruster { ... }

And it allows for more complex nesting with different semantics?

// Base module
public struct Mission {
    public struct Booster { ... }
}


// NASA module
public extension Mission.Booster from BaseModule {
    struct Exhaust { ... }
}


// Current module
let exhaust1 = Mission.Booster.Exhaust from NASA

// Base module
public struct Mission { ... }


// NASA Module
public extension Mission { 
    struct Booster { ... }
}


// Current module
protocol BoosterProtocol { 
    associatedtype Exhaust
}

extension Mission.Booster from NASA: BoosterProtocol {
    typealias Exhaust = CO2 from RocketScience
}

let exhaust2 = Mission.(Booster from NASA).Exhaust

// Base module
public struct Mission { ... }


// NASA module
extension Mission {
    public struct Booster {
        public struct Exhaust { ... }
    }
}


// Current module
let exhaust3a = Mission.(Booster.Exhaust from NASA)
let exhaust3b = Mission.(Booster from NASA).Exhaust from NASA // same thing, maybe?

This revision has been merged. To allow time for the fully amended proposal to be considered, the review will be extended through September 30, 2025.

Indeed, there is an ongoing conversation which highlights a very practical pitfall related to this issue:

With the proposed feature as it is, Michael's preferred spelling would end up allowing us to write File.System::Stat to mean FileStat rather than FileSystemStat. That is to say, the very problem that Michael is trying to avoid by a deliberate naming choice ("System.Stat implies the status of the system instead of a file") would be undermined.

Mind you, this namespacing practice already has precedent in the standard library: we have decided to put a bunch of APIs under the Unicode namespace, such as Unicode.Encoding. The ability to use an unparenthesized module qualifier for any nested component interrupts the intended straight-through reading: Unicode.Swift::Encoding reads like it's a Swift encoding rather than a Unicode encoding. In some places, that reading is merely nonsensical and not actually confusing, but in others (such as the File.Stat example above) it would be a dealbreaker.

Going forward, unparenthesized module qualifiers would require authors to consider how nested types will be read with any combination of module prefixes, making current namespacing practices probably unwise. That would admittedly advance the goal of not having ambiguous nested types in clashing modules, but not exactly in the way the proposal authors intend.

I think it's worth keeping in mind that module selectors, especially in member position, are basically meant for two situations:

  1. You are machine-generating code and want to be extremely defensive against possible collisions, even at the cost of readability.

  2. You are working around a known conflict between two APIs.

I submit that, in both of these circumstances, we don't actually care very much about APIs reading "naturally". In the first case, the code is not primarily meant for human consumption; in the second, the API design has already "failed" so dramatically that the same type has two members with the same name but different behavior. It's better in these scenarios to focus on precision—making it clear that (a) a module selector is being used and (b) what it is being applied to—rather than reading like a sentence.

I expect some people will use :: defensively within the package ecosystem to improve source compatibility. I think usage could become more widespread than what seems to be expected here.

Currently if you import library A and library B, you may get surprises if a newer version of library B introduces a symbol with the same name as one that is already present in library A. If you’re vending a package C that makes use of A and B, an update to either A or B with new symbols could break compilation of package C and impact every client of C.

If you gain the ability to prefix everything with A:: and B:: I wouldn’t be surprised if some package authors start to use this extensively. You could even automate it with a linter. This would make package C shielded from new symbols in A and B that could create ambiguities.

For reference, here’s somewhere in the middle of a previous discussion about this:

So this “nest everything” rule could easily transform into “:: everything” for some people who care about semantic versioning.

Yeah, I think this is a very good point.

I do continue to believe that some sort of expression-level "hoisting" of the module qualifier best addresses both use cases above, and particularly in light of the point made earlier that we can't be completely defensive due to lack of support for subscripts and operators. (It would also provide a principled way to use parens that respects the existing convention that they surround whole expressions, and it would answer the critique that :: is heavier than . but binds more tightly.)

That is to say, I think we'd be better off with a feature where you can write Swift::A.B.frobnicate(bar() + baz[1]) to indicate to the compiler that it should resolve everything in the expression in favor of the Swift module whenever possible, behaving something like an expression-local import statement with a behavior like !important in CSS. This would, even without prioritizing readability for the reasons you argue above, be more ergonomic than Swift::A.Swift::B.Swift::frobnicate(Swift::+(Swift::bar(), Swift::baz.Swift::[1])), and it would not necessitate casting about for a specific way to include subscripts and operators.

It's true that you'd lose the pinpoint precision of resolving exactly one component some way and another component some other way, but as you point out, that is neither common scenario (1)—where you want to be defensive about everything—nor common scenario (2)—where it's a single rare conflict between APIs from two libraries A and B.

However, since the rules around what it would mean to hoist a module qualifier to the whole expression and then "resolve in favor" are not so easily defined, I would concede that the proposal represents a sufficiently worked out feature that addresses the sorely needed ability to use single-type libraries of the form Foo::Foo.

I would also say this is a natural consequence of reusing the C++ :: syntax. It will encourage developers to act defensively, like they do in C++, and will confuse them when it doesn't have the same semantics.

I wouldn't be surprised if we see handwritten defensive use on unqualified lookups (the first name in a chain), but I think it's a lot more readable there:

// Might be done defensively, but not likely to confuse
func encode<T>(as _: T.Type) where T: Swift::Unicode.Encoding { ... }

// More confusion-prone, but probably not done defensively
func encode<T>(as _: T.Type) where T: Unicode.Swift::Encoding { ... }

The reason I think we're more likely to see handwritten defensive use on unqualified lookups is that they search a very cluttered, collision-prone namespace—local declarations, methods of self, top-level declarations in the same module, top-level public declarations in imported modules, etc.—whereas qualified (i.e. member) lookups only search the members of a specific type. In other words, developers would be a lot more worried about some import shadowing the top-level Unicode type than they'd be about an import shadowing the Encoding type inside Unicode.
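
To illustrate how easily an unqualified lookup can be captured, here is a small sketch using only today's dot-qualified syntax. The top-level abs below is a deliberately wrong, hypothetical function standing in for a declaration that happens to share a name with a standard library function.

```swift
// A top-level declaration in the current module that collides by name with
// the standard library's abs. It is deliberately wrong so the difference is
// visible.
func abs(_ x: Int) -> Int { -1 }

// Unqualified lookup picks the current module's declaration.
let shadowed = abs(-5)

// Qualifying the first name in the chain selects the standard library's
// function. Under the proposal this would presumably be written Swift::abs(-5).
let qualified = Swift.abs(-5)
```

The qualified call is still easy to read, which matches the point that defensive qualification of the first name in a chain is the less confusing case.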

Your example is using standard library types, which are less likely to come into conflict because every library depends on the standard library and has to compile with it: a third party can’t realistically add a conflicting Encoding to Swift::Unicode.

But if those types were from third party libraries, I think the only fully unambiguous version would be more like:

import A
import B
import C

func encode<T>(as _: T.Type) where T: A::Unicode.B::Encoding { ... }

Anything less is potentially source-breaking: a future version of C could add Unicode to the root namespace, or a future version of either A or C could add Encoding to A::Unicode. So if someone is using automated tools to make things fully unambiguous (and thus minor-version-upgrade-proof), the above is what you’re likely to see in a source-stable library.

I don’t really know how it’ll actually play out. I’m also not saying this is bad. I’m only pointing out that we could see a lot more use of this syntax than machine-generated code and disambiguating actual conflicts, as suggested by your previous post.

To require this two-part disambiguation, wouldn't this imply that B declared an Encoding type namespaced under A::Unicode and further that there's some other B::Encoding type namespaced under, say, Swift::Unicode? If so, I'd be pretty comfortable saying that the author of B is deliberately trying to torture their clients in that case, and that we needn't prioritize this important feature around being able to solve such a problem ergonomically.