Overload resolution for `init` ambiguous but same signature `func` is not. Bug? Intended?

toastersocks · March 9, 2023, 6:58pm

The same signatures for func and init give different overload resolution results. Is this a bug?
If I have an overloaded method/function: (I've edited this from the original to use just free functions to make things clearer)

func doSomething<Thing>(_ thing: Thing) {
    print("Thing")
}

func doSomething<Thing>(_ thing: Thing?) {
    print("Optional Thing")
}

func doSomething<Thing>(_ thing: Array<Thing>) {
    print("List of Things")
}


let thing = 3
let optThing: Int? = 3
let listOfThings = [3]

doSomething(thing)
doSomething(optThing)
doSomething(listOfThings)

/* 
Prints:
Thing
Optional Thing
List of Things
*/

The compiler can disambiguate the calls and calls the intended overload.
But change the funcs to init:

struct Something2<Thing> {
    init(_ thing: Thing) {
        print("Thing")
    }

    init(_ thing: Thing?) {
        print("Optional Thing")
    }

    init(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

let thing = 3
let optThing: Int? = 3
let listOfThings = [3]

Something2(thing)
Something2(optThing)
Something2(listOfThings) // Error: Ambiguous use of 'init(_:)'

And I get an error for the call to init(_: Array<Thing>): Ambiguous use of 'init(_:)'
The compiler can't decide if it should call init(_:Thing) vs init(_: Array<Thing>) even though for the funcs it could. It seems that Array<Thing> is more specific than just Thing so it should prefer that, even though it could call the plain Thing overload. This seems like a bug to me, but maybe there's something special about init rules that I'm missing.
Is this a bug?
Would making it so inits resolve the same way as funcs break anything? Seems like it would "just" make things more permissive, so purely additive?

I know there are a couple ways to resolve this but...

I know I can add the unstable attribute @_disfavoredOverload or a dummy default argument (thanks @jrose), a la init(_ thing: Thing, noOp: Void = Void()), to resolve the ambiguity for the compiler. But things get reeeally complicated if, for instance, I'm creating a SwiftUI view that takes a StringProtocol parameter in addition to the Thing parameter and I then want to add LocalizedStringKey overloads as well. I have to use a very specific combination of @_disfavoredOverload and dummy default arguments to disambiguate and make sure the proper init is called, due in part to the compiler's preference for ExpressibleBy... protocols over concrete types. It's very confusing and not at all straightforward to get this right.
Plus having these weird dummy arguments pop up in code completion and documentation is confusing for users of your API.

Jumhyn · March 9, 2023, 7:22pm

In the function example, the generic param Thing is fixed to Int at the point of the call to doSomething, so the call to something.doSomething(listOfThings) can only be resolved to the (Array<Int>) -> Something function.

But when you're calling the initializer directly, the Thing param is not fixed, and must be resolved as part of type checking the initializer call. However, as you note, the compiler doesn't know whether to resolve Thing == Int and call the Array<Thing> initializer, or Thing == Array<Int> and call the Thing initializer. The compiler doesn't have any built-in rules for resolving Array<Thing> as more specific than Thing, so the compilation has to fail. If you manually fix the type of Thing, though, this works as expected:

Something2<Int>(listOfThings) // OK!
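
To make the two competing solutions concrete, here is a small sketch. Wrapper and its picked property are hypothetical names introduced for illustration (the property just records which initializer ran instead of printing). Spelling out the generic argument, either explicitly or through a type annotation, should leave only one best initializer each time:

```swift
// Hypothetical stand-in for Something2; `picked` records which overload ran.
struct Wrapper<Thing> {
    let picked: String

    init(_ thing: Thing) { picked = "Thing" }
    init(_ thing: Thing?) { picked = "Optional Thing" }
    init(_ thing: Array<Thing>) { picked = "List of Things" }
}

let numbers = [3]

// Thing == Int: only the Array<Thing> initializer accepts an [Int].
let asList = Wrapper<Int>(numbers)

// Thing == [Int]: the plain Thing initializer is an exact match.
let asThing = Wrapper<[Int]>(numbers)

// A type annotation fixes Thing just as explicit specialization does.
let annotated: Wrapper<Int> = Wrapper(numbers)

print(asList.picked, asThing.picked, annotated.picked)
```

Both Wrapper<Int> and Wrapper<[Int]> are valid readings of the bare call Wrapper(numbers), which is exactly the choice the compiler refuses to make on its own.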

Also worth noting that the compiler does have disambiguation rules for subtyping relationships, which allows it to pick the Thing init over the Thing? init when calling with a value of type Thing.

toastersocks · March 9, 2023, 7:48pm

Thanks @Jumhyn I think I see what you're saying. I think my example made things more muddled than was necessary. A simpler example of the func versions would be just 3 free functions:

func doSomething<Thing>(_ thing: Thing) {
    print("Thing")
}

func doSomething<Thing>(_ thing: Thing?) {
    print("Optional Thing")
}

func doSomething<Thing>(_ thing: Array<Thing>) {
    print("List of Things")
}

doSomething(thing)
doSomething(optThing)
doSomething(listOfThings)
/*
Still prints:
Thing
Optional Thing
List of Things
*/

If I'm understanding you correctly, these exactly mimic the init versions. What Thing is is not fixed, but they are not ambiguous to the compiler and the correct ones get called.

toastersocks · March 9, 2023, 8:07pm

Hmm, ok if I do:

struct Something {
    init<Thing>(_ thing: Thing) {
        print("Thing")
    }

    init<Thing>(_ thing: Thing?) {
        print("Optional Thing")
    }

    init<Thing>(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

The calls are differentiated as I intend. But if I need the generic parameter at the type level, things are ambiguous. I guess I don't really understand why these are treated differently by the compiler.

orobio · March 9, 2023, 8:10pm

I don’t have an answer, but just wanted to note that your edited first post doesn’t have the generic type parameter in the free functions.

Jumhyn · March 9, 2023, 8:52pm

Ah! Thank you for the updated example. I'm able to reproduce that behavior but don't have a great explanation off the top of my head. I suspect there is some ranking rule that scores Array<Thing> better than Thing when Thing is a generic param of the overloaded declaration, but when the generic param is attached to the type the overloaded declaration itself is not considered generic and so the ranking rule doesn't apply.

Jumhyn · March 9, 2023, 8:53pm

You can see this isn't specific to init by using static functions instead:

struct Something2<Thing> {
    static func doSomething(_ thing: Thing) {
        print("Thing")
    }

    static func doSomething(_ thing: Thing?) {
        print("Optional Thing")
    }

    static func doSomething(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

Something2.doSomething(thing)
Something2.doSomething(optThing)
Something2.doSomething(listOfThings) // Error: Ambiguous use of 'doSomething'

toastersocks · March 9, 2023, 9:22pm

Whoops! Fixed. Thanks!

toastersocks · March 9, 2023, 11:08pm

Yeah, it does seem to be the case. This seems like a bug to me, or at least an inconsistency of the overloading rules.

toastersocks · March 9, 2023, 11:17pm

Jumhyn:

You can see this isn't specific to init by using static functions instead:

struct Something2<Thing> {
    static func doSomething(_ thing: Thing) {
        print("Thing")
    }

    static func doSomething(_ thing: Thing?) {
        print("Optional Thing")
    }

    static func doSomething(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

Something2.doSomething(thing)
Something2.doSomething(optThing)
Something2.doSomething(listOfThings) // Error: Ambiguous use of 'doSomething'

Oh, very interesting. But the non-static versions work even with the generic parameter at the type level. Hmm... Now that I think about it, init is just a static function with a different spelling. That does seem to indicate it's something to do with the type-level (static) inference rules with type-level generic parameters.

Jumhyn · March 11, 2023, 6:51pm

Well, really, it's because when you have an already resolved instance of a generic type, the type parameter is fixed as discussed before, so it doesn't get resolved in the same expression. We can recover the failure in the instance method case by contriving the example so that the instance generic parameter still gets resolved all as part of the same expression:

struct Something2<Thing> {
    func doSomething(_ thing: Thing) {
        print("Thing")
    }

    func doSomething(_ thing: Thing?) {
        print("Optional Thing")
    }

    func doSomething(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

let thing = 3
let optThing: Int? = 3
let listOfThings = [3]

Something2().doSomething(thing)
Something2().doSomething(optThing)
Something2().doSomething(listOfThings) // Error: Ambiguous use of 'doSomething'
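
For contrast, here is a sketch of the non-contrived instance case, assuming the instance is created in its own statement with Thing written out. Holder and the String return values are hypothetical, used only so the chosen overload can be inspected. With Thing already fixed to Int, each argument has a single best overload:

```swift
// Hypothetical stand-in for Something2 with instance methods that
// return the name of the overload that was chosen.
struct Holder<Thing> {
    func doSomething(_ thing: Thing) -> String { "Thing" }
    func doSomething(_ thing: Thing?) -> String { "Optional Thing" }
    func doSomething(_ thing: Array<Thing>) -> String { "List of Things" }
}

let thing = 3
let optThing: Int? = 3
let listOfThings = [3]

// Thing is fixed to Int here, before any doSomething call is type-checked.
let holder = Holder<Int>()

let r1 = holder.doSomething(thing)
let r2 = holder.doSomething(optThing)
let r3 = holder.doSomething(listOfThings)

print(r1, r2, r3)
```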

I spent a little more time looking into this and it's a bit more subtle than I first thought. I believe it's expected behavior, but I'm not certain. I'm using the following example, which uses the same pattern as before:

struct Something {
    static func doSomething<Thing>(_ thing: Thing) {
        print("Thing")
    }

    static func doSomething<Thing>(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

struct Something2<Thing> {
    static func doSomething(_ thing: Thing) {
        print("Thing")
    }

    static func doSomething(_ thing: Array<Thing>) {
        print("List of Things")
    }
}

let listOfThings = [3]

Something.doSomething(listOfThings)
Something2.doSomething(listOfThings) // Error: Ambiguous use of 'doSomething'

The call with Something compiles fine, but the call using Something2 fails to compile. I looked at the overload resolution behavior and this is what happens:

In both calls, we have two type variables that represent the Thing generic parameters in each overload.
The way we do generic opening overload resolution causes us to only substitute in a generic replacement for one of these type variables, with the other being left as a free type variable.
In the call to Something.doSomething, the two Thing parameters are different (since each is specific to its own declaration), and free to differ, so we discover a 'subtype' relationship between Array<Thing2>: Thing1 (by letting Thing1 == Array<Thing2>).
- A subtype relationship between parameters means that the overload with the subtype param is better than the other overload, and we rank it better.
- This subtype relationship is only discovered when Thing1 is the free type variable and Thing2 is bound to the generic replacement, since when Thing1 is bound to the generic replacement we are not free to bind it to Array<Thing2>.
- The result is that doSomething(Array<Thing>) is considered more specialized than doSomething(Thing). This makes some amount of sense, as you note in your original post—it feels more specific to call a function that takes a concrete type than one that takes an unrestricted generic parameter.
When we call Something2.doSomething, though, the generic parameter sits on the type, above the level of the doSomething declarations themselves.
- There's a step when comparing overloads that compares the self types of the two overloads (these could be different if, for instance, one overload was declared on a superclass and the other on a subclass).
- In this case, both declarations appear on the same type, which means the types get identified with one another. But because the generic parameter appears in the type, this has the effect of fixing the generic parameters across the two overloads as being identical. IOW, we've lost the flexibility when comparing the overloads to bind the generic parameters to different types.
- When we do subtype checking, now the comparison in both directions fails, since Thing1 == Array<Thing2> is no longer a solution (there's only one Thing and we can't have Thing == Array<Thing>).
- Since the subtype relationship is never discovered, we can't treat either declaration as more specialized than the other, and so it is not possible for us to prefer one to the other. Thus, we end up with an ambiguity error for the call to Something2.doSomething.
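
One way to picture the ranking step described above is as a forwarding test: an overload looks at least as specialized as another if its parameter can always be passed along to the other one. The sketch below is only an analogy written in ordinary Swift, not the compiler's actual implementation, and every name in it (general, specific, Box, plain, forwardAcross) is hypothetical.

```swift
// Declaration-level generics: each function has its own generic parameter.
func general<T>(_ thing: T) -> String { "Thing" }
func specific<U>(_ things: [U]) -> String { "List of Things" }

// Forwarding [U] to `general` works by choosing T == [U], so `specific`
// behaves as the more specialized of the two.
func forwardToGeneral<U>(_ things: [U]) -> String { general(things) }

// The reverse direction does not type-check, because an arbitrary T
// is not an array:
// func forwardToSpecific<T>(_ thing: T) -> String { specific(thing) }

// Type-level generic: both methods share the single Thing of Box<Thing>.
struct Box<Thing> {
    static func plain(_ thing: Thing) -> String { "Thing" }
    static func list(_ things: [Thing]) -> String { "List of Things" }

    // With the shared Thing, forwarding fails in this direction too,
    // since [Thing] is not Thing:
    // static func forward(_ things: [Thing]) -> String { plain(things) }

    // It only works by naming a different specialization of Box, which is
    // the flexibility the comparison gives up when the self types are unified.
    static func forwardAcross(_ things: [Thing]) -> String {
        Box<[Thing]>.plain(things)
    }
}

print(forwardToGeneral([3]))
print(Box<Int>.forwardAcross([3]))
```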

The step that seems potentially suspect to me is while comparing the Something2 overloads when we decide because both doSomething declarations appear in Something2 that they must be equivalent—I don't see why it shouldn't be permitted for the generic arguments to Something2 to differ based on the overload that gets chosen. Of course, we may be stuck with this behavior regardless of whether it's desirable!

cc @xedin do you have any thoughts about whether this behavior makes sense?

toastersocks · March 11, 2023, 8:02pm

Wow, @Jumhyn thanks for your work looking into this!

Jumhyn:

There's a step when comparing overloads that compares the self types of the two overloads (these could be different if, for instance, one overload was declared on a superclass and the other on a subclass).

In this case, both declarations appear on the same type, which means the types get identified with one another. But because the generic parameter appears in the type, this has the effect of fixing the generic parameters across the two overloads as being identical. IOW, we've lost the flexibility when comparing the overloads to bind the generic parameters to different types.

When we do subtype checking, now the comparison in both directions fails, since Thing1 == Array<Thing2> is no longer a solution (there's only one Thing and we can't have Thing == Array<Thing>).

Since the subtype relationship is never discovered, we can't treat either declaration as more specialized than the other, and so it is not possible for us to prefer one to the other. Thus, we end up with an ambiguity error for the call to Something2.doSomething.

Ok, I think I get what you're saying.
It doesn't seem completely consistent to me, especially when you consider the optional case (especially spelling out the Optional):

struct Something3<Thing> {
    init(_ thing: Thing) {
        print("Thing")
    }

    init(_ thing: Optional<Thing>) {
        print("Optional Thing")
    }

    init(_ thing: Array<Thing>) {
        print("List of Things")
    }
}
let thing = 3
let optThing: Int? = 3

Something3(thing) // prints "Thing"
Something3(optThing) // prints "Optional Thing"

In this case the compiler is able to see that the optional is more specific and call the right one. But with other generic types it doesn't.
I wonder if there's a language reason for this limitation, or if this is something that could be lifted. It seems like it would be generally really useful to differentiate the same way the generic function arguments do.
Right now I have to use @_disfavoredOverload or add dummy default arguments (init(_ thing: Thing, noOp: Void = Void())) to make the right thing happen, but neither of these is ideal for this case IMO.

Jumhyn · March 11, 2023, 8:26pm

The key with the analogous construction with Optional is that the type system does explicitly encode a subtype relationship between Thing and Optional<Thing>, so even in the instance where we’ve fixed the generic Thing argument between the overloads, the subtype relationship between the function arguments is discovered during ranking. But since no such relationship exists between Thing and Array<Thing>, the same doesn’t hold for that case.

Jumhyn · March 11, 2023, 8:29pm

I’ll also say that I agree with you that it seems like there’s an inconsistency here. The discovery of the subtype relationship that enables the call to Something.doSomething in my example above seems a bit suspect to me, as does the rule in the Something2.doSomething case that constrains the generic arguments to be equal to one another. It would be nice if there’s a way to make these behave the same, unless there’s some higher-level rule I’m not seeing that explains why we treat these two cases differently.

toastersocks · March 11, 2023, 8:55pm

Ok, so there's a special case in the compiler to differentiate Optional, if I'm understanding correctly.

toastersocks · March 11, 2023, 9:04pm

Haha, to me it's suspect in the other direction. It makes sense to me that Something.doSomething works, but I feel like the compiler should enable the call in Something2.doSomething.
But yeah, I agree that it would be nice if the behavior was consistent. And I too wonder if there's some higher-level language design aspect that I'm missing that would make this not workable.

Jumhyn · March 11, 2023, 9:27pm

Ah yeah, actually, my first explanation for why the Optional case works is a bit confused, since the logic that Thing is a subtype of Optional<Thing> would actually make the (Thing) -> Void function a better choice than the (Optional<Thing>) -> Void function. In fact, you are correct—we explicitly disallow subtype relationships that require value-to-optional conversions from being used to decide that a given overload is better than another.

I'm actually somewhat ambivalent about which 'should' work, I can see the case in both directions. But it's certainly not obvious to me that these two cases should behave differently from one another and feels more like an emergent behavior of how exactly we've defined the subtyping rule for generic declarations.

toastersocks · March 11, 2023, 9:59pm

Ah, ok makes sense, thanks. That CSRanking.cpp file looks very interesting. I'll take a look at it more thoroughly.

Oh, thanks for clarifying. For me, I feel it makes the language more expressive, or at least more easily expressive, to have it work. My API can behave differently depending on what's passed in at the type level and I don't have to do runtime checks and try to cast, etc...
Out of curiosity, if you feel like sharing, what would you say is the case for the direction of having it not work?

Jumhyn · March 11, 2023, 10:14pm

Narrowly, it’s not obvious to me that this case falls into the (somewhat) easily explainable high-level rules that we’ve defined for overloading (generic declarations are worse than non-generic ones, declarations that have a subtype relationship with another are better, etc.) and so I am loath to permit even more subtle overload resolution rules than we already have, especially if it only works ‘by accident’ today.

More broadly, I think type-based overloading is generally just not that great, especially when we have argument labels. I have seen many developers (and I myself have been) confused by the currently implemented behavior because their mental model of “more specialized” doesn’t map onto the compiler’s very specific definition of “more specialized.”

Less importantly, I’ve found it to be a bit of a nightmare to maintain source compatibility in the face of type based overloading when adding new features to the type system—if you improve the compiler’s ability to infer types for some expressions, you may bring new declarations into the set of valid overloads, and it is Not Good to suddenly change the meaning of existing code.
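
As a sketch of the argument-label alternative mentioned above (the type name Labeled, the labels optional: and list:, and the picked property are all hypothetical choices made for illustration), giving each initializer its own label means the call site names the overload, so no type-based ranking between them is needed:

```swift
// Hypothetical type: one initializer per label instead of three
// overloads of init(_:).
struct Labeled<Thing> {
    let picked: String

    init(_ thing: Thing) { picked = "Thing" }
    init(optional thing: Thing?) { picked = "Optional Thing" }
    init(list things: [Thing]) { picked = "List of Things" }
}

let thing = 3
let optThing: Int? = 3
let listOfThings = [3]

let a = Labeled(thing)
let b = Labeled(optional: optThing)
let c = Labeled(list: listOfThings)  // Thing can only be Int here

print(a.picked, b.picked, c.picked)
```

The trade-off is a wordier call site, but the labels also show up in code completion and documentation as meaningful names rather than as dummy arguments.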

xedin · March 12, 2023, 6:47am

@Jumhyn what happens when you add a result type of Something<Thing> to your instance and static functions in the Something example?