Infer associated types as generic parameters more eagerly

Here's a small change to associated type inference that @Douglas_Gregor and I have been discussing. It's source breaking; the Deferred project in the source compatibility suite hits this case.

Introduction

This proposal changes associated type inference behavior to short-circuit a large amount of inference work in the case where the conforming type defines a generic parameter with the same name as an associated type.

This breaks source compatibility with certain valid programs; in Swift 5.1, it is possible for an associated type and a generic parameter with the same name to have different types. However, the name lookup behavior in this case was already very fragile.

Motivation

The Swift language allows the programmer to omit declarations of associated type witnesses inside a conforming type, as long as the associated types can be inferred from other declarations using a set of rules, which are attempted in order.

One important rule states that if we have any value requirements that reference the associated type, we can try to match up the type of the requirement with the type of a witness from the conforming type, and "guess" the associated types as long as the rest of the types match.

For example, the associated type Bag.Contents is inferred to be Int by matching the type of the protocol requirement takeOut() with the witness in ConcreteBag:

protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

struct ConcreteBag : Bag {
  func takeOut() -> Int { return 0 }
}
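
One way to observe the inferred witness is to ask for it through a generic function. The sketch below repeats the declarations above so that it stands alone; contentsType(of:) is a hypothetical helper introduced only for illustration, not part of the proposal.

```swift
protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

struct ConcreteBag : Bag {
  // No explicit `typealias Contents` here; the witness is inferred
  // from the return type of this method.
  func takeOut() -> Int { return 0 }
}

// Hypothetical helper: returns the type witness chosen for Bag.Contents.
func contentsType<B: Bag>(of _: B) -> Any.Type {
  return B.Contents.self
}

let inferred = contentsType(of: ConcreteBag())
print(inferred == Int.self) // true
```

Going through a generic parameter constrained to Bag sidesteps any question of how the name Contents is looked up inside ConcreteBag itself.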

Another rule checks if the conforming type declares a generic parameter having the same name as the associated type, and attempts to use the parameter as the witness, if one exists.
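
As a minimal sketch of the uncontroversial case for this rule, here is a conforming type whose generic parameter has the same name as the associated type and is also the type the witness returns, so both rules agree. SingleItemBag and contentsType(of:) are hypothetical names chosen for illustration.

```swift
protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

// The generic parameter is named Contents, like the associated type, and
// takeOut() returns exactly that parameter, so the generic parameter rule
// and the value witness rule pick the same witness.
struct SingleItemBag<Contents> : Bag {
  let item: Contents
  func takeOut() -> Contents { return item }
}

// Hypothetical helper: returns the type witness chosen for Bag.Contents.
func contentsType<B: Bag>(of _: B) -> Any.Type {
  return B.Contents.self
}

let bag = SingleItemBag(item: "apple")
print(contentsType(of: bag) == String.self) // true
```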

In Swift 5.1, the value witness rule takes precedence over the generic parameter rule. That is, you can have an associated type with the same name as a generic parameter, but a different type.

Here is an example, using the same Bag protocol as above:

struct GenericBag<Contents> : Bag {
  func takeOut() -> [Contents] { return [] }
}

extension GenericBag {
  func getContentsType() -> Any.Type {
    return Contents.self
  }
}

let bag = GenericBag<Int>()
let type: Any.Type = bag.getContentsType()
// 'type' is Int.self, the generic parameter
let value: GenericBag<Int>.Contents = bag.takeOut()
// type of 'value' is [Int], not Int

To understand what's going on here, suppose we have a value of type GenericBag<Int>. The generic parameter Contents is Int, but in order for GenericBag.takeOut() to witness Bag.takeOut(), we must substitute the associated type Bag.Contents with [Int].

To add to the confusion, Swift 5.1 always introduces an implicit typealias to the conforming type when an associated type was inferred. The existence of this typealias is observable via unqualified name lookup, which leaks out declaration order and compiler implementation details.

Proposed solution

This proposal changes the order in which the two inference rules are applied, so that the generic parameter with the same name as an associated type is always preferred over any other means of inferring the associated type.

Source compatibility

Source compatibility is affected, as in the GenericBag example above. This proposal does not impact the ability to explicitly define a typealias with the same name as a generic parameter; so the GenericBag type can still be implemented by explicitly declaring the type witness for Contents, which will type check under both Swift 5.1 and the language change in this proposal:

struct GenericBag<Contents> : Bag {
  typealias Contents = [Contents]
  func takeOut() -> [Contents] { return [] }
}

Compatibility of module interface files is not affected. In an interface file, almost all associated type witnesses are explicitly printed out as typealias declarations. The one exception where a typealias is not printed is specifically the case that is still allowed under this proposal: an associated type that is inferred to be a generic parameter of the same name. Under this proposal, the type witness can be inferred unambiguously while type checking the module interface file.

Effect on ABI stability

This proposal has no effect on ABI stability.

Effect on API resilience

This proposal has no effect on API resilience.

Alternatives considered

One alternative is to stage this change in with a -swift-version flag. This proposal as written introduces an unconditional breaking change.

struct GenericBag<Contents> : Bag {
  typealias Contents = [Contents]
  func takeOut() -> [Contents] { return [] }
}

That example is very confusing to me, because I can't tell what takeOut() actually returns: [Contents] or [[Contents]].


In general I think I'm in favor of this change as it seems to help lay out a direction towards qualified name lookup on generic types.


I think it’s strange that we have a formal definition for the language syntax, but the inference rules (equally important to source compatibility) are essentially undocumented AFAIK.

It would be nice if there was some way to learn what exactly is supposed to be supported and what isn’t.

This change seems to make things more straightforward and understandable, so I support that.


This change makes the language align with what I would intuitively expect, so +1. I do agree with @DevAndArtist that allowing a typealias to shadow a generic parameter is extremely confusing, especially when the generic parameter itself is allowed to be part of the definition of the typealias. I think it would be better to support the conformance with @implements than by shadowing.

I also have a question of what the intended behavior is in this case:

protocol Bag {
  associatedtype Contents: Collection
  func takeOut() -> Contents
}

struct GenericBag<Contents> : Bag {
  func takeOut() -> [Contents] { return [] }
}

Here, the generic parameter does not meet the constraints specified by the requirement, but the return type does. What is the proposed behavior in a case like this?


I think there is some misunderstanding about the example (there is no shadowing going on afaict). Some clarification might help here.

The following works before the proposal but will break if this proposal is approved. (Example 1)

protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

struct GenericBag<Contents> : Bag {
  func takeOut() -> [Contents] { return [] }
}

let value: [Int] = GenericBag<Int>().takeOut()

If you add one line, the explicit typealias, it works with the existing behavior and will continue to work in the future. (Example 2)

protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

struct GenericBag<Contents> : Bag {
  typealias Contents = [Contents]
  func takeOut() -> [Contents] { return [] }
}

let value: [Int] = GenericBag<Int>().takeOut() // Ok

Now the question is why the first example breaks.

It is breaking because:

  • Right now, we try to infer the associated type first by looking at the arguments. So, the compiler will look at Example 1 and automatically map it to Example 2 by implicitly generating a type alias Contents = [Contents]. This means the compiler needs to do extra work to synthesize the type alias by looking at the types in different places.

  • After this change, we will skip the inference work if we see a generic parameter with the same name, and say "hey, I saw a generic parameter with the same name as the associated type, I'm going to do typealias Contents = Contents". This means that the return type of takeOut() is expected to be [associatedtype] Contents (because of the protocol) and Array<[genericParam] Contents> (because of the signature and scoping rules). But we just synthesized [associatedtype] Contents = [genericParam] Contents, so we have an inconsistency, hence a type error.

Note that we will only skip the work if the names match up. If the names don't match, we will still need to infer the associated type. The only thing changing is the ordering of the rules; we are not removing or adding rules.
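
To illustrate the case where the names don't match, here is a minimal sketch in which the generic parameter is named Element rather than Contents. No generic parameter shares the associated type's name, so the witness still has to be inferred from takeOut(). ArrayBag and contentsType(of:) are hypothetical names used only for this example.

```swift
protocol Bag {
  associatedtype Contents
  func takeOut() -> Contents
}

// The generic parameter is called Element, not Contents, so the generic
// parameter rule does not apply and Contents is inferred from the
// return type of takeOut().
struct ArrayBag<Element> : Bag {
  func takeOut() -> [Element] { return [] }
}

// Hypothetical helper: returns the type witness chosen for Bag.Contents.
func contentsType<B: Bag>(of _: B) -> Any.Type {
  return B.Contents.self
}

print(contentsType(of: ArrayBag<Int>()) == [Int].self) // true
```

Renaming the generic parameter like this is also one way to keep Example 1 compiling without an explicit typealias, at the cost of a rename.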


Would you like to work your explanation into my pitch, and add yourself as a co-author? ;-)

I was referring to the way typealias Contents = [Contents] shadows the generic parameter Contents. The fact that I read quickly and thought it was shadowing the generic parameter only strengthens my opinion that @DevAndArtist is right that allowing this is very confusing. @implements would be a much clearer way to support this conformance after the pitched change.

The shadowing behavior is actually even more subtle; unqualified lookup always picks the generic parameter, but qualified lookup will find the typealias. I think changing the shadowing behavior of nested types is outside the scope of this proposal; it doesn't have anything to do with associated type inference per se.


Partial shadowing. :cry:

  1. I agree that it is confusing.
  2. When I first read it, I expected the type alias to shadow the generic parameter (same as you, and I think @DevAndArtist).
  3. However, the typealias is not shadowing the generic parameter -- the Contents in the takeOut() type refers to the generic parameter and not the typealias (I thought this wasn't clear, which is why I wrote the long-ish comment to clarify :slight_smile:)
  4. This proposal is not about changing the scoping rules, it is about changing the order of the inference rules. The scoping/shadowing rules are the same as before. We are not "allowing a typealias to shadow a generic parameter" any more than before.
  5. I concur that having a clearer way of doing the same (whether it be spelled out as @implements or something else) instead of having to remember "this looks like shadowing but it is not actually shadowing" would be a better outcome.

To be quite honest, unfortunately at this point many of those rules are extremely complex and intertwined with implementation details. This results in emergent behavior that can change in subtle ways across compiler versions too.

The work we're doing now on refactoring the declaration checker is partially about surfacing corner cases like the one described in this proposal and putting everything on a more solid foundation so that we can clean up the language in a way that could one day be documented in detail, like you describe.


Understood. I'm only offering my comments on this corner of the language since it came up for discussion.

When we have @implements, if we can stomach the source breakage I would support disallowing conflicting generic parameter names and typealias declarations.


It's worth pointing out that this isn't even specific to typealiases; any nested type with the same name as a generic parameter would behave this way. I agree with you that it's probably best to prohibit this, diagnosing it like any other kind of redundant redeclaration; the only problem is that such a change closes off an easy, localized fix for the source break in this proposal. While it's arguably better style to not have a generic parameter and nested type with the same name, it at least allows someone to hypothetically migrate their codebase without having to rename anything.


Yeah, the proposal could probably do a better job drawing an explicit distinction between the rules that are implemented today that explain the current behavior, and the proposed change to those existing rules.

I'm only offering my comments on this corner of the language since it came up for discussion.

Definitely. Sorry if I came across a bit too strongly.

When we have @implements, if we can stomach the source breakage I would support disallowing conflicting generic parameter names and typealias declarations.

That is certainly a reasonable position to take. There is also a reasonable argument to be made that type parameters and typealiases ought to behave analogously to function parameters and local variable declarations, i.e. usual lexical scoping instead of banning duplicates. I suppose we can cross that bridge when we get there. :slight_smile:
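
For reference, this is the function-level behavior the analogy points at, as a small sketch. The describe(count:) function is hypothetical; it only shows that a local declaration may reuse a parameter's name under ordinary lexical scoping.

```swift
func describe(count: Int) -> String {
  // This local declaration shadows the parameter of the same name.
  // The initializer on the right-hand side still refers to the parameter.
  let count = count + 1
  return "count is \(count)"
}

print(describe(count: 1)) // count is 2
```

Whether generic parameters and nested typealiases should follow the same model is exactly the open question in this part of the thread; the sketch takes no position on it.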

Right, that's why I suggested not considering it until we have @implements, which is desirable as a way to resolve naming conflicts when providing conformances in general. There aren't always workarounds available, particularly if the conflict is between two protocols defined in different third party modules.

Unfortunately, here is another example of a source break: SR-11791, "Generic parameter conflict with nested iterator" (apple/swift issue #54201).

I wonder how common code like this is. Maybe this proposal isn't as clear of a win as I thought.

To be honest, I’d rather suffer a one-off source change than repeatedly wait for the compiler and indexer to inefficiently infer the correct type. In the bigger picture, the sum of optimisations such as these will do much to alleviate many of the frustrations we experience.


Hi Slava,

The proposal sounds reasonable to me.

Question for you though: what is the payoff of doing this? Is it an improvement to compile times for some important case, or is it a clarification of semantics and more predictable behavior for users? Both?

-Chris


A little bit of both. I don't think it's a huge win in either performance or understandability, but at least on a couple of test cases, cutting out the inference work allowed us to type check code that would previously fail.

While of course it would be better to fix up the inference algorithm to work better, anything we do to limit the scope of the search space is a good thing, IMO.
