Modeling different generic types at head and tail of implicit member chains

Per this discussion thread, I've been working on extending my implementation for implicit member chains to allow different generic parameters at the beginning and end of implicit member chains, so that e.g. something like the following could compile:

struct Foo<T> {}

extension Foo where T == Int {
    static var intToString: Foo<String> { Foo<String>() }
}

let _: Foo<String> = .intToString
                  // ^-- base inferred as Foo<Int>, but result is Foo<String>!

Lookup successfully binds the generic param at the base to Int since that's the only way lookup would succeed. I have an implementation working that allows this to compile, but it fails in cases like this:

extension Foo {
    static var someFooInt: Foo<Int> { Foo<Int>() }
}

let _: Foo<String> = .someFooInt.intToString // error!

Here, if the generic arguments at the head and tail of the chain aren't bound to one another, lookup doesn't help resolve the base param, since someFooInt is present on Foo<T> for all T.

So, what I really want to express is "infer the generic params from lookup, but if that fails, utilize the contextual params". I'm planning on modeling this as a disjunction between an "equality up to generic params" constraint and a "deep equality" constraint, but AFAICT the former doesn't exist in the constraint system.

The current implementation hacks around this by special casing the simplification rule around the head-to-tail equality constraint for implicit member chains, but it feels really messy and like I'm fighting the constraint system, so I wanted to ask:

  1. Does this concept of "match only a generic base, ignoring params" exist anywhere else in the type system?
  2. If not, what is the advisable way to approach this? Is this concept worthy of a new constraint kind, or could a new restriction kind capture what needs to happen here? (Even though "restriction" isn't quite what's going on, since the modified constraint is more lenient than a normal equality constraint)

Would love to hear anyone's thoughts on this!

In the example that you give:

extension Foo {
    static var someFooInt: Foo<Int> { Foo<Int>() }
}

let _: Foo<String> = .someFooInt.intToString // error!

I think the error makes sense here. Consider the following analogous code today, which does not use member chains and also gives an error:

struct S<T> {}
extension S {
    static var sint : S<Int> { S<Int>() }
}
let s = .sint // error: reference to member 'sint' cannot be resolved without a contextual type

since polymorphic types are not first-class. If polymorphic types were first-class, then s would have the type forall T. S<Int> (where T is unused). For your example, the overall type of .someFooInt.intToString would become forall T. Foo<String> (which still doesn't match Foo<String>).
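
For concreteness, the explicit spellings do resolve today, and the choice of T is arbitrary precisely because sint doesn't use it:

// Pinning T explicitly makes 'sint' resolvable; T itself is unused by the member.
let a = S<Int>.sint    // a: S<Int>
let b = S<String>.sint // b: S<Int> as well -- T only picks the base to look into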

Wouldn't the analogous code be:

let s: S<Int> = .sint

?

Without the "different generic params allowed" rule for member chains, the error example from my post is straightforward—the context expects Foo<String>, so the base of the chain is inferred as Foo<String>, and the rest of the chain resolves from that.

Hmm. On thinking about it a bit more, I think the correct analog is actually a third thing.

let s = S.sint // error: generic parameter 'T' could not be inferred; note: explicitly specify the generic arguments to fix this issue

This reflects that there is a type parameter in between to be inferred.

let _ : Foo<String> = .someFooInt.intToString should behave as if it were written Foo.someFooInt.intToString as Foo<String>, where we need to infer the type parameter for the first Foo (the Foo<String> is essentially "irrelevant" after prefixing the Foo). This is similar to S.sint where we also need to infer the type parameter for the first S. In the latter case, we don't infer the type parameter to be anything specific (because the extension is applicable for all T). Hence, in the former case, we shouldn't be inferring the type parameter to be anything specific either.

Now, the thing I put in the parens ("the Foo<String> is essentially 'irrelevant' after prefixing the Foo") seems questionable -- where did this come from? Mostly that is from my mental model of how the feature is supposed to work. Maybe you had a different design in mind though. :slightly_smiling_face:


The model I was originally trying to achieve with the member chains was simpler, and just involved the substitution of the contextual type of the expression to the base of the chain, generic parameters and all. As I mentioned, this model allows the problematic construction from the OP:

let _: Foo<String> = .someFooInt.intToString
                 // ^-- context is Foo<String>, so base is Foo<String>

This is just like how let s: S<Int> = .sint compiles today, just with more bits in the middle.

(Aside: as the title of this thread suggests, I'm still trying to formalize my mental model of the feature with this rule, so I appreciate you taking the time to talk through it with me! If it seems like I'm telling you things you already know, that's probably me just organizing my thoughts to reason things through :slightly_smiling_face:)

The idea to allow different generic parameters at the beginning and end of the chain came from the discussion thread. The motivating example from the thread was basically to allow constructions like this:

func fetchArray() -> [String] {
    return .init(repeating: 0, count: 50).map { "\($0)" }
}

Where the base is first inferred to Array<_> by the contextual type of the tail, then .init(repeating: 0, count: 50) provides the information to resolve the base to Array<Int>. This shouldn't prevent inferring the generic arguments of the head from the generic arguments of the tail, but it should be able to provide a solution when equating the head and the tail fails (e.g. when Array<String>.init(repeating: 0, count: 50).map { "\($0)" } fails to type check).
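
For reference, this explicit-base spelling (hypothetical fetchArrayExplicit name) compiles today, and is what the inference is meant to recover:

func fetchArrayExplicit() -> [String] {
    // Same chain with the base's generic argument written out by hand.
    return Array<Int>(repeating: 0, count: 50).map { "\($0)" }
}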

As I think this through, though, it's definitely a bit more complicated than a disjunction between "just equate the generic bases" and "deeply equate the types". In order to be really useful, I'd want this to work with something like:

func process(_ callback: (Result<String, SomeError>) -> ()) {
    callback(.success(0).map { "\($0)" })
}

Here, the base should be inferred as Result<Int, SomeError>, but the two generic params are inferred from different sources. SomeError comes from the contextual type, but Int comes from the call to .success(0). That makes it even more difficult to model, and I'm starting to worry about the performance impact it would have on the type checker if there were, say, a disjunction for each generic parameter...
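
To make that concrete, here's the fully explicit version (names hypothetical, with a stand-in SomeError filled in for self-containment); it compiles today, and it's what the per-parameter inference would need to reproduce:

// Hypothetical stand-in for the SomeError used throughout this thread.
struct SomeError: Error {
    let code: Int
    init(_ code: Int) { self.code = code }
}

func processExplicit(_ callback: (Result<String, SomeError>) -> ()) {
    // Int is written at the base by hand; SomeError comes along with it.
    callback(Result<Int, SomeError>.success(0).map { "\($0)" })
}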

The idea to allow different generic parameters at the beginning and end of the chain came from the discussion thread. The motivating example from the thread was basically to allow constructions like this:

Right. I think another way to go about this is to ask: what is an intuitive/predictable extension of the type system we already have along the direction that we want (i.e. the resulting extended type system is both intuitive and consistent with the unextended type system), and can that extended type system check the expressions we want to be able to type check. After all, the type checker's job is not just to check/infer types but to do so in a predictable manner, so that when it fails, it is possible to easily reason about why it failed and how one might go about fixing it.

More concretely, here's my thought process:

  1. Consider incrementally adding member chains with both same-base-type and same-generic-parameter restrictions.
  2. Consider incrementally removing the same-generic-parameter restriction and keeping the same-base-type restriction.
  3. Consider removing the same-base-type restriction altogether (so the same-generic-parameter restriction is inapplicable).

Let's try to analyze examples after we come up with what is hopefully an intuitive model. (I'm going to use e as T as shorthand for representing the contextual type, instead of let _ : T = e but the idea is the same.)

With (1), the design is essentially forced; we don't really have any freedom. We attach the contextual type as the base, including generic parameters:

.member1...memberN as G<T1 ... Tn> 
-> (behaves as if)
G<S1 ... Sn>.member1...memberN as G<T1 ... Tn> where Si = Ti
// Can simply start checking member chain, no inference needed! \(^_^)/

I have deliberately manifested the same-generic-parameter restriction as an extra set of equalities.

With (2), a natural way to extend (1) is to drop the additional equalities, because we want to lift the same-generic-parameter limitation. Since we don't have any equalities for these type variables, we'll need to infer them:

.member1...memberN as G<T1 ... Tn> 
-> (behaves as if)
G<_1 ... _n>.member1...memberN as G<T1 ... Tn>
// Need to infer n generic parameters somehow (^_^;)
-> (short-hand which I used earlier with S.sint)
G.member1...memberN as G<T1 ... Tn>

With (3), because we have no limitations, we literally cannot make any progress at all. We need to infer the base type!

.member1...memberN as G<T1 ... Tn> 
-> (behaves as if)
_.member1...memberN as G<T1 ... Tn>
// Need to infer the base fully?!?! (O_O)

Clearly (3) is not feasible from an efficiency perspective (it requires looking at all types in scope, essentially). Even ignoring efficiency, it creates "spooky action at a distance" which is not how we normally do things. So the question is between (1) and (2).

As described, neither is a strict subset of the other in terms of programs that type check! With (1), we can't type-check chains with changing generic parameters. With (2), in some cases, we might hit an inference failure that would have been avoided in (1), because we were checking instead of inferring. This is the usual generality vs. inference tradeoff in action. I'd argue that it would've been very suspicious if the programs type-checked by (2) were a strict superset of the programs type-checked by (1).

Ok! Now, let's look at some examples with (2) (since we're interested in more flexibility) and see if this holds up in theory.


The fetchArray() example will work out as you pointed out:

func fetchArray() -> [String] {
    return .init(repeating: 0, count: 50).map { "\($0)" }
}

-> (trying to check body)
.init(repeating: 0, count: 50).map { "\($0)" } as [String]

-> (follow the recipe)
Array<_e>.init(repeating: 0, count: 50).map { "\($0)" } as [String]

-> (infer _e = Int based on argument to first method)
(Array<_e>.init(repeating: 0, count: 50) as Array<Int> where _e = Int).map { "\($0)" } as [String]

-> (look at map using Array<Int> as the type of the inner expression)
((Array<_e>.init(repeating: 0, count: 50) as Array<Int> where _e = Int).map { "\($0)" } as Array<String>) as [String]

-> (finally check Array<String> == [String])
Success!

(Aside: Not entirely sure if the last == should be a proper equality or a subtype check. I think it should be subtype.)

Let's look at the second example:

func process(_ callback: (Result<String, SomeError>) -> ()) {
    callback(.success(0).map { "\($0)" })
}

-> (remove the uninteresting bits)
.success(0).map { "\($0)" } as Result<String, SomeError>

-> (follow recipe)
Result<_s, _f>.success(0).map { "\($0)" } as Result<String, SomeError>

-> (infer first parameter through success)
(Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int).map { "\($0)" } as Result<String, SomeError>

-> (map doesn't yield equalities, no progress on _f. :-/)
((Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int)
  .map { "\($0)" } as Result<String, _f>) as Result<String, SomeError>

-> (final check reveals what _f is; it's useful unlike the last example)
((Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int)
  .map { "\($0)" } as Result<String, _f>) as Result<String, SomeError> where _f = SomeError

At least in theory, these two examples seem to work! Phew! (I hope I didn't make any mistakes. :sweat_smile:)

Now let's look at the original example that you posted:

struct Foo<T> {}
extension Foo where T == Int {
    static var intToString: Foo<String> { Foo<String>() }
}
extension Foo {
    static var someFooInt: Foo<Int> { Foo<Int>() }
}
let _: Foo<String> = .someFooInt.intToString

-> (focus on main thing)
.someFooInt.intToString as Foo<String>

-> (follow the recipe)
Foo<_t>.someFooInt.intToString as Foo<String>

-> (look through first member)
(Foo<_t>.someFooInt as Foo<Int>).intToString as Foo<String>
// No additional equalities because someFooInt is present for all possible _t :(

// Now there are two options: an eager check or a lazy check.
// We can eagerly say that _t will not be inferrable because it hasn't been inferred yet
// and it is not getting propagated outwards. This will immediately lead to an error.
// A lazy check will defer that, so we might end up doing some extra checking
// and confirm that Foo<Int> does indeed have an intToString member. Furthermore,
// we can check that the type of that member is Foo<String>, which matches the outer type.
// However we'll still fail later, because we can't leave unsolved type variables around.

So I think there is a model for (2) which doesn't seem to require additional forms of constraints, but still type-checks the desired examples with fetchArray and process but rejects the example with Foo. Now whether this is implementable easily/efficiently is a separate question (and not one I'm qualified to answer), but I think this might serve as a good jumping off point.

Ok! Now let me try to address the alternative you brought up.

This shouldn't prevent inferring the generic arguments of the head from the generic arguments of the tail, but it should be able to provide a solution when equating the head and the tail fails (e.g. when Array<String>.init(repeating: 0, count: 50).map { "\($0)" } fails to type check).

I'd argue that we should not check Array<String>.init(repeating: 0, count: 50).map { "\($0)" } at all. That leads to this demand for a disjunction (try this, if that fails try that) and we want to avoid disjunctions unless they are inevitable (because of the performance risks as you rightly pointed out). Uniformly creating type variables at the start, discovering new equalities while processing the chain, and checking against the contextual type at the end avoids that problem, while still being able to type-check a bunch of useful programs.


Thinking about this a bit more, I think you're right about the S<Int> parallel (and I finally understand what this thread is about, ugh).
This seems to imply that we are enforcing the same-generic-parameter restriction for single-member chains. So we're kinda' in a tricky situation where none of the options is great :cry: (unless I am missing something, which is very much possible)

  1. Enforce the same-generic-parameter restriction for both single-member and multi-member chains, making it less useful particularly for multi-member chains.
  2. Keep enforcing the same-generic-parameter restriction for single-member chains and allow multi-member chains to have changing parameters, making things inconsistent (if not tricky implementation-wise).
  3. Use the method I suggest in (2) in my previous comment. However, because of the non-subset behavior that I pointed out, some programs where the generic parameter is not inferrable will start failing to compile. Perhaps that can be fixed by assigning default fallback types for the type variables of single-member chains (I'm guessing we already have this facility for ExpressibleByIntegerLiteral and friends; see the snippet after this list), utilizing the generic parameters from context? I'm not sure if this'll work but possibly worth exploring?
  4. Use disjunctions.
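
Here's the literal-defaulting facility from point 3 in action today; the integer literal's type variable carries a fallback that fires only when nothing else constrains it:

let a = 42         // nothing else constrains the literal, so the Int default applies
let b: Int8 = 42   // the contextual type wins; the default never fires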

I've tried it as a "head conv tail" constraint, but that caused issues with lvalues at the tail of the chain (which need to be allowed since things like

struct Foo {
    static var foo = Foo()
}

var x = Foo(); // semicolon so the next line's leading dot isn't parsed as a continuation
.foo = x //  works!

compile today!)

That all matches my intuitive expectation of how type checking should proceed in these examples, and I don't see any mistakes. Thanks for writing those up! When I wrote that, I was thinking that the inference of _f = SomeError was analogous to the inference of _t = String in the previous example, but your step by step solution does an excellent job of demonstrating why we can recover the link between the generic parameters at the head and tail for Result, but not Foo.

Unfortunately, I think it does mean that this very similar example would fail to type check:

func process(_ callback: (Result<String, SomeError>) -> ()) {
    callback(.success(0).map { "\($0)" } .mapError { SomeError($0.code + 1) })
}

-> (remove the uninteresting bits)
.success(0).map { "\($0)" } .mapError { SomeError($0.code + 1) } as Result<String, SomeError>

-> (follow recipe)
Result<_s, _f>.success(0).map { "\($0)" } .mapError { SomeError($0.code + 1) } as Result<String, SomeError>

-> (infer first parameter through success)
(Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int).map { "\($0)" } .mapError { SomeError($0.code + 1) } as Result<String, SomeError>

-> (map doesn't yield equalities, no progress on _f. :-/)
((Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int)
  .map { "\($0)" } as Result<String, _f>).mapError { SomeError($0.code + 1) } as Result<String, SomeError>

-> (mapError gives us output type, but relies on lookup in _f)
(((Result<_s, _f>.success(0) as Result<Int, _f> where _s = Int)
  .map { "\($0)" } as Result<String, _f>).mapError { SomeError($0.code + 1) } as Result<String, SomeError) as Result<String, SomeError>

At this point, we're stuck again. We can solve this down to a value member constraint _f[.code] == Int, but without any guidance on _f, lookup can't proceed and type checking fails. IMO, this failure would be unacceptable and makes a strict version of rule (2) unattractive. It "feels" like, since the type checker knew that _f == SomeError after map when mapError wasn't present, we "should" be able to run a .mapError on the result of map and have mapError's input type be inferred.

Is there a current solution for "these types are equivalent up to generic base"?

My current solution is to defer simplifying the head-to-tail equality constraint until we have at least resolved (the outer layer of) the tail, then check whether we have some BoundGenericType (if _t_tail is non-generic, we proceed as we normally would with an equality constraint). If _t_tail == G<_t1 ... _tn>, then we introduce type variables _s1 ... _sn, bind _t_head to G<_s1 ... _sn>, and carry on, leaving _s1 ... _sn to be resolved at some point by lookup.

This seems to work, but it means that in several locations I'm fighting the constraint system when it wants to simplify my _t_head == _t_tail constraint (e.g. by merging the equivalence classes) before the tail has been resolved to a type. It really feels like I'm abusing an equality constraint (because I don't actually want the types to be equal!), which is why I asked about introducing a new constraint/restriction kind.
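
To make the shape of that rule concrete outside the solver, here's a toy model of the deferral; every type and name here is hypothetical and has nothing to do with the actual C++ constraint system:

// Toy stand-ins for solver types (illustrative only).
indirect enum Ty {
    case variable(Int)                     // an unresolved type variable
    case generic(base: String, args: [Ty]) // e.g. G<_t1 ... _tn>
    case concrete(String)                  // a non-generic, resolved type
}

enum Simplified {
    case unsolved          // tail's outer layer isn't resolved yet; retry later
    case bindHead(Ty)      // bind the head, leaving fresh vars for lookup to solve
    case plainEquality     // non-generic tail: fall back to ordinary equality
}

// Deferred simplification of the head == tail constraint described above.
func simplifyHeadTail(tail: Ty, freshVar: () -> Ty) -> Simplified {
    switch tail {
    case .variable:
        return .unsolved
    case .generic(let base, let args):
        // G<_t1 ... _tn> at the tail becomes G<_s1 ... _sn> at the head,
        // with _s1 ... _sn left for member lookup to resolve.
        return .bindHead(.generic(base: base, args: args.map { _ in freshVar() }))
    case .concrete:
        return .plainEquality
    }
}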

Yep, that hits the nail on the head. Today, the "head" and "tail" (since there's only one member allowed) are linked by a good-old-fashioned equality constraint without any funny business. Ideally, all of users' intuitions about single-member chains would carry over to multi-member chains, so that a chain can just be thought of as a single "unit," without having to worry too much about generic parameter inference along the chain (so long as it works by equating the parameters at the head and tail).

In light of that, with regards to your potential solutions, (2) is the only one that feels unacceptable to me. I wouldn't want a chain to break just because a user adds a member to a chain thinking that the simple equality rule holds.

I really like (3), so I'll see if something like that would yield the results I'm looking for. The only item of concern I have about this solution is what happens in a situation where the "default" binding would yield one solution, but the lookup based binding would yield another? I think this should never happen, since if there were two solutions S(s1, ..., sn) != T(t1, ..., tn), then at some point along the chain, a member result or argument has si != ti, and so the lookup based approach would fail because of ambiguity.

If that assumption holds, then (3) sounds like a pretty attractive option! Thank you for that suggestion. I'd still really appreciate your advice on how to model that "generic bases only" equality constraint, but for now I'll move forward with the hack I have and see if (3) yields the results I'd like. Really appreciate you taking the time to write all that out!

So close! Using a defaultable binding for the generic arguments gets at just about every issue. We ignore the arguments of the base if they can be resolved via lookup, but default back to them in case lookup can't resolve things for us. The last piece of the puzzle is how to handle this:

struct ImplicitGeneric<T> {}

extension ImplicitGeneric where T == Int {
    static var implicitString: ImplicitGeneric<String> { ImplicitGeneric<String>() }
}

extension ImplicitGeneric where T == String {
    static var implicitString: ImplicitGeneric<String> { ImplicitGeneric<String>() }
}

let _: ImplicitGeneric<String> = .implicitString // Error: Ambiguous use of 'implicitString'

It's apparent why this happens, since with the generic parameters unbound the type checker can't choose between ImplicitGeneric<Int>.implicitString and ImplicitGeneric<String>.implicitString, but unfortunately it is a regression (since with the production compiler, ImplicitGeneric<String> is the only option).

The system is actually able to find both solutions here, so I think the appropriate way to solve this is to increase the score of a solution which applies the "wrong" type to a defaultable generic argument constraint at the tail of an implicit member chain.

I've tried it as a "head conv tail" constraint, but that caused issues with lvalues at the tail of the chain

Good point, I forgot to consider lvalues.

Unfortunately, I think it does mean that this very similar example would fail to type check:
[..]
At this point, we're stuck again. We can solve this down to a value member constraint _f[.code] == Int, but without any guidance on _f, lookup can't proceed and type checking fails. IMO, this failure would be unacceptable and makes a strict version of rule (2) unattractive. It "feels" like, since the type checker knew that _f == SomeError after map when mapError wasn't present, we "should" be able to run a .mapError on the result of map and have mapError's input type be inferred.

I'm guessing you're saying "very similar" because the syntactic diff between the two is somewhat small. I think the way to go about thinking about similarity is thinking about information flow. In terms of information flow, that example is very similar to something like

struct X<T> {
  static func new<U>() -> X<U> { X<U>() }
  func display() { print(self) }
}
let _: () = X<Int>.new().display() // error: generic parameter 'U' could not be inferred

Method chains materializing unconstrained type parameters don't work today. Given that, I think it's consistent that analogous code (:crossed_fingers: I didn't mess up the analogy) with leading dot syntax doesn't work either.

It "feels" like, since the type checker knew that _f == SomeError after map when we didn't have mapError present, we "should" be able to run a .mapError on the result of map and have the input be inferred as mapError .

I kinda' see where this is coming from. There's also the opposite side of the coin where, if there's an ambiguity for type variables created along the way, someone might be surprised that the contextual type is getting propagated (generic parameters included) along the way. This is a tricky design decision. For the sake of concision, let's call these alternatives "conservative" (ambiguity => error) and "flexible" (ambiguity => default).

  1. Both designs have some level of counter-intuitive behavior.
  2. If we adopt the more conservative behavior at the beginning (and report an ambiguity error), we can still get feedback, potentially revisit the design and make it more relaxed and add more defaulting later. On the other hand, we cannot change the behavior to go from flexible -> conservative later on.

Given these two points, my personal preference is to favor predictability over flexibility. Moreover, since we can go from conservative -> flexible later on (if we get a bunch of feedback saying that it's a problem), we're not painting ourselves into a corner. I can understand if different people (including you or the core team!) feel differently.

There's also a case to be made that if there's an ambiguity, your program is too polymorphic in a way, making it harder for someone else to understand (consider a chain with 6-7 things later on getting defaulted, as opposed to just one extra mapError!), and that you should probably add some type annotations along the way as machine-checked documentation for the reader.
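
For example, pinning the base up front (again assuming the hypothetical SomeError used throughout this thread) keeps even a long chain locally checkable:

struct SomeError: Error { let code: Int; init(_ code: Int) { self.code = code } }

// With the base written out, each step's type is determined as you read.
let r = Result<Int, SomeError>.success(0)
    .map { "\($0)" }                     // Result<String, SomeError>
    .mapError { SomeError($0.code + 1) } // still Result<String, SomeError>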


Is there a current solution for "these types are equivalent up to generic base"?

My current solution is to [..] resolved at some point by lookup.

This seems to work, [..], which is why I asked about introducing a new constraint/restriction kind.

So close! Using a defaultable binding for the generic arguments gets at just about every issue. [..], but default back to them in case lookup can't resolve things for us

Could you clarify if the former is still a problem, or if the defaulting addressed that? (My understanding is the defaulting fixed the problem you had with the equalities but maybe I'm misunderstanding.)


The only item of concern I have about this solution is what happens in a situation where the "default" binding would yield one solution, but the lookup based binding would yield another? I think this should never happen, since if there were two solutions S(s1, ..., sn) != T(t1, ..., tn), then at some point along the chain, a member result or argument has si != ti, and so the lookup based approach would fail because of ambiguity.

I think this depends on to what extent we are having the defaulting behavior. In my original comment, I suggested having the defaulting for single-member chains primarily to preserve backwards compatibility. For multi-member chains, I outlined some reasons for why we may not want to have defaulting. However, having the defaulting work for multi-member chains too would make the mapError example compile.

Getting back to the question, if I'm understanding the situation correctly, I think this property (of the two solutions being equal) will not hold if we have defaulting for multi-member chains. Or rather, it's down to the details of how the defaulting works, and it's not clear to me that there is a design for the defaulting that makes it always work out. Consider the following code:

protocol Monoid {
    static var empty : Self { get }
    func combine(_ other: Self) -> Self
}
extension Int8 : Monoid {
    static var empty : Self { 0 }
    func combine(_ other: Self) -> Self { self + other }
}
extension Int : Monoid {
    static var empty : Self { 0 }
    func combine(_ other: Self) -> Self { self + other }
}
struct Newtype<T> {
    let value : T
}
extension Newtype : Monoid where T : Monoid {
    static var empty : Self { Newtype.init(value: .empty) }
    func combine(_ other: Self) -> Self {
        Newtype.init(value: self.value.combine(other.value))
    }
    func combineViaT(_ other: T) -> Self {
        Newtype.init(value: self.value.combine(other))
    }
    func reset<U: Monoid>() -> Newtype<U> { .empty }
}

let _ = .empty.combineViaT(127).combineViaT(127).reset() as Newtype<Int8>

The question becomes: will this program trap (because we added 127 + 127 as Int8, which overflows) or will it not trap (we used Int based on defaulting for the literals)? This depends on how we rank the defaulting rules:

  1. The head-tail defaulting wins. This means the program will trap.
  2. The ExpressibleByIntegerLiteral rule wins. This means the program will succeed. This would match the behavior as-if the head-tail defaulting rule were not present for multi-member chains.
  3. Neither rule wins. This would lead to an ambiguity error.

I haven't thought too hard about whether one can contrive an example where ranking things the other way might lead to the as-if semantics.


On a somewhat related, somewhat tangential note, there's a paper Kind Inference for Datatypes which is about (well) kind inference instead of type inference and is more about formalization, but it shows a similar flavor of problem, where defaulting (and the order of it) leads to undesirable behavior. Here's the relevant bit:

We formalize Haskell98’s datatype declarations, providing both a declarative specification and syntax-driven algorithm for kind inference. We prove that the algorithm is sound and observe how Haskell98’s technique of defaulting unconstrained kinds to ⋆ leads to incompleteness.

Essentially, what ends up happening there is that kind inference behaves surprisingly because of the defaulting rule, so seemingly similar programs have different static semantics. (One compiles and the other one doesn't; Section 4.3 Defaulting has more details.)


The last piece of the puzzle is how to handle this: [..]

It's apparent why this happens, since with the generic parameters unbound the type checker can't choose between ImplicitGeneric<Int>.implicitString and ImplicitGeneric<String>.implicitString, but unfortunately it is a regression (since with the production compiler, ImplicitGeneric<String> is the only option).

The system is actually able to find both solutions here, so I think the appropriate way to solve this is to increase the score of a solution which applies the "wrong" type to a defaultable generic argument constraint at the tail of an implicit member chain.

I'm somewhat confused by this example. If we have the defaulting rule for single-member chains, wouldn't that make this example work without anything additional? I don't get why there is scoring involved here, or what you mean by applying the "wrong" type.

I think your analogy makes total sense if we view the relationship between generic parameters at the head and tail as totally unbound. I just think about it a little bit differently such that the "default to the contextual parameters" idea makes more sense to me. Note that this idea (in my mind) only applies to defaulting generic parameters at the head of the chain, so with your example:

extension X {
  func clone<U>() -> X<U> { X<U>() }
}

let _: X<Int> = .new().clone()

This should continue to be an error in a world with multi-member chains. The new inference rules would get us as far as:

(((X<Int>).new() as X<_t>).clone() as X<Int>)

But there are no additional rules for resolving generic parameters along the chain.

Yeah, it's definitely getting into very subjective territory about what feels "too magic" when it comes to inferring these parameters. I'm going to try to rephrase my thinking as to why, if we're going to support these different generic parameters at all, we should be as aggressive as possible about inferring the generic parameters at the base from the tail.

In a simple world with no overloading, single member chains are basically a simple lexical substitution. In a declaration like:

let _: Int = .zero

or

let _ = 3 + .zero

you simply have to find the declaration that gives the contextual type and "copy-paste" it to the base of the implicit member reference. If we move to multi-member chains, there's really no difference—the result at the end of the chain is entirely determined by the specified members as part of the chain, and we can again perform a simple lexical substitution: find the declaration(s) that give us the contextual type, and copy-paste that type to the head of the chain.
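
Spelled out, the substitution reading of those two declarations is just:

// The "copy-paste" reading of the examples above.
let _: Int = Int.zero // contextual type Int pasted in at the base
let _ = 3 + Int.zero  // Int comes from '+', then gets pasted in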

Introducing overloads complicates this model a little, but not too much—the members along the chain can actually help us resolve ambiguous overloads, but this again feels like a straightforward extension from single member chains, where this compiles today:

func f(x: Int) -> Int { return 0 }
func f(x: String) -> String { return "" }

let _ = f(x: .zero)

However, the simple case (where we're referencing only un-overloaded declarations and the type can be entirely determined from context outside the chain) continues to work exactly as expected.

Introducing generic parameter inference to the story again complicates the model a little, allowing for even more "advanced" usage of implicit member expressions. Now, this works:

struct S<T> {}
extension S where T == Int {
    static var sInt: S<Int> { S<Int>() }
}

func f<T>(s: S<T>) {}

f(s: .sInt) // ok!

Again, this extra inference power does not interfere with the simple case, for single or multi-member chains.

My ideal model for this "different generic parameters allowed" rule is that it's just another step along this path. That is, it functions as an "advanced inference" feature for users who are using heavily generic code and many chained methods/properties. It shouldn't, however, interfere with the "simple case" by breaking the "just lexical substitution" model.

Completely breaking the link between the generic parameters of the head and tail, though, does break that model:

let _: Result<Int, SomeError> = .success(0).mapError { SomeError($0.code + 1) }

A user with the "simple model" in their mind would be surprised at this being an error, but without some sort of link between the generic parameters at the head and tail, there's no way to type-check this. As you've mentioned, we need to support it at least for the single-member-chain case, and that results in multi-member chains no longer being an "intuitive" extension of single-member chains.
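
For comparison, here's the explicit spelling the "simple model" reading corresponds to; it compiles today (again assuming a hypothetical SomeError with an Int code):

struct SomeError: Error { let code: Int; init(_ code: Int) { self.code = code } }

// The base pasted in by hand, exactly as the lexical-substitution model predicts.
let _: Result<Int, SomeError> = Result<Int, SomeError>
    .success(0)
    .mapError { SomeError($0.code + 1) }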

Now that you've helped me with generating some potentially problematic examples for this "different generic parameters" rule, I'll probably take it back to the pitch thread and see what other members of the community think about them.

This is a really good illustration of where the defaulting rule starts to break down, since it requires us to make a value judgement about the two different kinds of defaulting. It's easy enough to construct an example for single-member chains as well:

struct Adder<T: AdditiveArithmetic> {
    init(x: T, y: T) {}
    static func add<U: AdditiveArithmetic>(x: T, y: T) -> Adder<U> {
        print(x + y)
        return Adder<U>(x: .zero, y: .zero)
    }
}

let _: Adder<Int8> = .add(x: 127, y: 127) // overflow, or no? (today, the answer is overflow)

In order to maintain full source compatibility, the rule has to be that generic parameter defaulting always beats out literal defaulting, even though one solution relies on literal defaulting twice while the other relies on generic-param defaulting only once.

I'm still curious how you were imagining modeling the "equal up to generic arguments" constraint. As-is, just dropping in an equality constraint is too aggressive. Depending on the order in which constraints are resolved, the equivalence classes for the head and tail might get merged before there are any generic arguments to pull out of the tail to create the defaultable constraint. My hack right now involves getting really specific about how this particular equality constraint gets resolved, which does not seem like the right way to do things. E.g., in matchTypes (:nauseated_face:):

        // Instead of merging the two type variables at the head and tail of
        // an implicit member chain, let the tail get resolved to a concrete
        // type first. This way, if it's a generic type, we can defer
        // binding the type params to one another, allowing for different
        // params at the head and tail of the chain.
        SmallVector<LocatorPathElt, 4> path;
        auto anchor = locator.getLocatorParts(path);
        if (auto *UME = getAsExpr<UnresolvedMemberExpr>(anchor))
          if (path.size() == 1)
            if (path.back().getKind() == ConstraintLocator::MemberRefBase)
                return formUnsolvedResult();

What I want is sort of like a OneWayEqual constraint, except that I only want to wait until the "outer layer" is resolved (rather than the entire type), and I want the handling upon resolution to be different than just inserting a fixed binding.

The defaulting rule helps us bind free type variables, but if the lookup-based generic parameter inference can find two different results that are equally good, then defaulting doesn't do much for us. I'm guessing this is the same reason that we have a score kind for "non default literal" which is what's used to properly type-check a program like this:

func f(x: Int) -> Int { return 0 }
func f(x: UInt) -> UInt { return 0 }

let x = f(x: 1)

If $T1 is the type of the literal, then as soon as we attempt $T1 := UInt, we increase the score, which resolves the ambiguity between the overloaded fs (the Int solution has the lower score and wins). For member chains, the scoring would similarly say "if there's ambiguity, the programmer probably meant for the generic parameters to match the contextual type as closely as possible".

Thanks for the paper! I'll give that a read. And again, thanks for your thoughtful engagement on this topic. You've really helped me organize my thoughts in a way that I think will make taking these questions back to the pitch thread much more productive.

My only remaining question is about how best to model the "equal up to generic parameters" concept (and whether a new constraint kind is appropriate), so if you have any thoughts on that I'd love to hear!

Argh, the further I go down this defaultable binding path, the more it becomes clear that it's not quiiiiite the right tool for the job. Literal defaults are considered preferable when deciding between, e.g., two different overloads in a disjunction, since the score is used to differentiate the otherwise-ambiguous solutions.

However, within a particular branch, defaultable constraints are essentially constraints of last resort—if we can solve the system without relying on them, we should. This bleeds into building our list of bindings, selecting a particular PotentialBindings to attempt next, and potentially elsewhere that I haven't found yet.

So, for literal defaults the rule is basically:

Avoid using defaultable constraints unless absolutely necessary, but if needed to disambiguate two solutions, prefer those which match defaults.

Here, because of the Adder example, the rule for these generic argument constraints needs to be:

Prefer the defaultable generic constraint, and only rely on the chain to resolve the generic base if the defaultable constraint does not yield a solution.

I see where you're going with this line of reasoning, I think the only question is what is a good stopping point, which I think we both agree is subjective to some extent. Good thing you brought it up in the pitch thread, so we can see if there is some broad agreement here.

Thanks. I was kinda' struggling to come up with a smaller example without having it be silly, this is a good one. In hindsight, it does make sense that you can concoct a similar example with single-member chains; that kinda' forces our hand design-wise so one less decision to make.

The type checker is not really my area of expertise. At this stage, I think you almost certainly know more about the constraint system than I do. :slightly_smiling_face: For example, you could talk to Pavel (xedin) or Robert (CodaFi) (or generally just other people who work more on type checking), they might have some suggestions for you.

Conceptually though, using an equality does seem a bit suspect. If we had the base be Foo<$T0> and the contextual type be Foo<X> (where X is some concrete type), what needs to happen is that we need something like a $T0 defaults to X. Doing this eagerly might not work, because we might immediately default if we type checked part of the chain and didn't find any information for $T0. However, the defaulting may need to trigger in the middle of type checking the chain (e.g. we need to access some members as you showed earlier); we can't necessarily do it at the end of processing the chain (I suppose theoretically we can do that but I'm not sure if it's desirable/possible in practice or if it'd be backwards compatible).

You don't need to necessarily go read it :sweat_smile:, it's more of a fun, slightly relevant diversion rather than something directly applicable to what you're trying to do, but I figured it might be interesting (or someone else reading the thread).

Thanks for the discussion as well. I think forming a solid mental model is really important for things like this. For example, I had the wrong mental model for 9 months on how single-member chains worked.


Oh well. IDK what the next steps should be. :neutral_face: I suspect using disjunctions might prove to be problematic perf-wise with long chains. Maybe someone else has ideas on how the defaulting could be made to work.


Yeah this is the crux of the issue. We essentially end up with a two-way dependency between generic arguments from the chain and arguments from the contextual type, where we can't bind either of them without the other being resolved first. In examples such as these two:

let _: Result<String, SomeError> = .success(0).map { "\($0)" }
let _: Adder<Int8> = .add(x: 127, y: 127)

We essentially want opposite things. If $T0 is a type variable for (one of) the literal(s) in the first call and $T1 is the (first) generic argument to the base of the implicit member expression we reach a point in both examples where we have:

$T0 potential bindings = {subtypes of Int (from ExpressibleByIntegerLiteral)}
$T1 potential bindings = {subtypes of <default generic argument from chain tail (String/Int8)>}

$T0 converts to $T1
$T0 conforms to ExpressibleByIntegerLiteral
$T1 defaults to <default generic argument from chain tail (String/Int8)>

If we let $T1 get bound first, then the Adder example works as required for source compatibility. The literal types are inferred as Int8, and they produce an overflow error. However, in the Result example, binding $T1 first fails to compile, since String doesn't conform to ExpressibleByIntegerLiteral.

OTOH, binding $T0 first lets the Result example compile since we do away with the "default to String" constraint as soon as $T1 is inferred as Int. However, this same logic infers the base generic argument as Int rather than Int8 in the Adder example, violating source compatibility.

I think what I'd like to happen is for the system to look through the $T0 converts to $T1 constraint and identify the additional potential binding(s) for $T1 via the default literal bindings attached to $T0 before it enters the binding step for $T1. Then, after attempting the default generic argument from the tail of the chain (and finding the Int8 solution for the Adder example), the Result example would continue on after the String binding failed, and attempt binding $T1 to Int, finding that solution.
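
As a sketch of that attempt order (purely illustrative; all names hypothetical, nothing from the actual solver):

// Try the tail-derived default first, then the bindings transferred through
// the conversion constraint from the literal type variable.
func attemptBindings(tailDefault: String, literalDefaults: [String],
                     solves: (String) -> Bool) -> String? {
    for candidate in [tailDefault] + literalDefaults where solves(candidate) {
        return candidate
    }
    return nil
}

// Adder: Int8 (the tail default) solves first, so the literals become Int8 and
// the overflow behavior is preserved. Result: String fails the literal
// conformance check, so we move on and find Int.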

Feel free to tap out of the discussion, I'm mostly just thinking out loud at this point :sweat_smile:. I'll reach out to Pavel or Robert and see if either of them can spare the time to advise about how to approach this. Thanks again for all your help!
