Matching behavior with expectations for associated types and typealiases in constrained protocol extensions

This thread is an attempt to see if the following can be worked out (emphasis mine):

There's no need to look into the quoted thread for the purpose of this thread, which is to see if people share the same expectations about what the behavior should be for this program:

protocol P {
  associatedtype A = Bool
}

extension P where Self: BinaryFloatingPoint {
  typealias A = Float
}

func printA<T: P>(of: T.Type) {
  print(T.A.self)
}

extension Double: P {}
extension String: P {}

printA(of: Double.self)
printA(of: String.self)
print(Double.A.self)
print(String.A.self)

Questions:

According to your expectations, do you think this program should compile?

   If you think it should, what (four lines) should it print?

   If you think it shouldn't, what should the error(s) be?

(EDIT: Clarified the mutual exclusiveness of the sub-questions.)

  • Should this program compile?
    Theoretically, yes.

  • If it should, what (four lines) should it print?
    My natural expectation:

Float
Bool
Float
Bool

Actual result, because of how associated type inference works and how the where clause is ignored in that regard:

Float
Float
Float
Float
  • If not, what should the error(s) be?
    I wish it would compile, but only if the result is also as expected and not messed up by associated type inference. Otherwise, what's the point of making this compile in the first place?

Just to clarify, the current actual result (in Swift 5.2 and 5.3) is that the last line will not compile, reporting the error message Type 'String' does not conform to protocol 'BinaryFloatingPoint'. If the last line is removed though, it will compile and print:

Float
Float
Float

I know, that's why I'm asking: why would we want to make the last line compile if associated type inference is going to mess up the result again, from Bool to Float, as it already does for the second line?

I have no idea why anyone would want that :-) , perhaps someone thinks that the program should not compile because of some other line, I'm just interested to see if most (if not all) people seem to agree on what the future/fixed behavior should be.


I think this should print

Float
Bool
Float
Bool

And the current behavior should be reported as a bug.


By the way, if we're going to solve this issue somehow, it must go through the evolution process, as noted by @Ben_Cohen in this issue I previously filed. A large breaking change is likely involved here, so it's no longer 'just a bug'.

Jens (Jens Persson), June 1:

This thread is an attempt to see if we can address the following (emphasis mine):

How does associatedtype inference work

Oh, and my conclusion is that no one should use typealiases in constrained protocol extensions until this gets worked out, i.e. until there's an actual design. But that's just my opinion.

No need to look into that thread, suffices to look at this little example program and answer the questions (without compiling the program first!):

protocol P {
  associatedtype A = Bool
}

extension P where Self: BinaryFloatingPoint {
  typealias A = Float
}

func foo<T: P>(_: T.Type) {
  print(T.A.self)
}

extension Double: P {}
extension String: P {}

foo(Double.self)
foo(String.self)
print(Double.A.self)
print(String.A.self)

Questions:

Should this program compile?

Yes

If it should, what (four lines) should it print?

I expect after thinking about it for a little while and skimming the thread but not compiling or running the code and only having the first cup of coffee:

Float
Bool
Float
Bool

(Here the extension of Double to P picks up the extension P where Self: BinaryFloatingPoint, because Double conforms to BinaryFloatingPoint and the typealias A matches up with the associatedtype A in the protocol P.) String does not conform to BinaryFloatingPoint, so it does not pick up the extension and just uses the default for the associatedtype in protocol P.

I think that’s what you are also expecting by posing the question but peoples’ intuitions or expectations vary kind of a lot in this area.

Couple of things, some of which are introspecting my intuitions and may or may not be what you’re looking for, and are on the first cup of coffee of the day:

  1. It is not that obvious that the typealias with the same name replaces/overrides/whatever the associatedtype. It’s kind of a how-else-would-you-do-it kind of thing. It might be clearer if you put ‘associatedtype’ instead of ‘typealias’ in the extension and explained that it was replacing or implementing the one in the protocol. Or something like that. It feels like a rough edge in there. This is independent of generalized existentials, I think; it’s just that the syntax (with some semantics) has a specific rule here that you just have to know (where a typealias in an extension goes to the associatedtype with the same name in the protocol).

  2. The extension … where for a conditional conformance (maybe not the right terms exactly) has always felt a little odd to me also. My initial intuition on conditional conformances was not that they were conditional, maybe because the where is after the extension. It felt like the extension is always applied (the way it would be if there were no ‘where’) but you get something extra if the ‘where’ applies. Which does not really make sense but it’s where my mind went at first. If the syntax were something like “if Self : BinaryFloatingPoint then extension P …” it would be gross but more initially intuitive. Once your intuition is guided by experience it’s clear.

  3. I think ‘x where y’ is similar to ‘if y then x’ in type terms (maybe identical) but that’s not a fully-formed thought. If that or something like it were set as a core syntactic principle of the language it might sit clearer.

  4. My intuition wants everything to be like class inheritance where same name in subclass means use it instead (to use short words) but for extensions, conformances and protocols the rules are not as built in. For the associatedtype and typealias thing, the correct rule is that same name replaces but the label next to it is different so you think that the same-name rule may not apply. For conditional conformances same-name extends but only under conditions in the where clause.

Now that I wrote all that down extension … where seems a lot more intuitive to me, which shows how much intuition can vary, since I just started thinking about this at all in the last half-hour or so.

I wrote a lot there, hope that’s OK and that some of it helps.

—Dan

Sure! Though I guess @DevAndArtist could be right here:

According to your expectations, do you think this program should compile?

It's not obvious that it should, because that would let you "check" whether something conforms to a protocol or not. If, say, the stdlib later added an extension String: BinaryFloatingPoint (unlikely here, but not unlikely for some arbitrary type T and protocol P, especially where both T and P were not defined in the current module), the behavior of a compiling program would change without any warning.

I can see it going either way; it really depends on the design priorities.
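The "check" concern can be seen in a smaller setting that has nothing to do with associated types. The following is a minimal sketch, not the program from the original post; `Marker`, `Box`, `Plain`, and `Tagged` are hypothetical names chosen for illustration. It relies only on the usual rule that overload resolution on a concrete type prefers the more constrained extension member when its constraints are satisfied.

```swift
// Hypothetical names throughout: Marker, Box, Plain, Tagged.
protocol Marker {}

struct Box<T> {}

extension Box {
    // Available for every T.
    func describe() -> String { "unconstrained" }
}

extension Box where T: Marker {
    // Preferred by overload resolution when T is statically known to conform to Marker.
    func describe() -> String { "constrained" }
}

struct Plain {}            // does not conform to Marker
struct Tagged: Marker {}   // conforms to Marker

print(Box<Plain>().describe())   // prints "unconstrained"
print(Box<Tagged>().describe())  // prints "constrained"
```

If some module later added `extension Plain: Marker {}`, the first call would start resolving to the constrained member after a recompile, with no change to the calling code. That is the kind of silent behavior change described above.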

Also, if it does compile, the behavior comes down to a question of: do we penalize all generic code with more lookups at run time? Or do we avoid that in this case? Again, there's no "right answer". It depends on design priorities and the costs involved.

I don’t think anybody so far has asked for this behavior to be dynamic, nor does it have to be for this to compile. IMO the compiler should be able to sort out the constraints and conformances at compile time and give the expected behavior statically.

Changes to dependencies after compilation shouldn’t have any effect in this case, just like adding retroactive conformances in your app has no effect on pre-compiled foundation code.


If you comment out the last line so this actually compiles today, I wonder if it’s really the associated type that’s the problem here, or is it the conditional conformance used in a generic context? The fact that line 2 doesn’t print “Bool” sounds a lot like this problem:

...but I don’t really understand the generics system well enough to be sure.

I don’t think anybody so far has asked for this behavior to be dynamic, nor does it have to be for this to compile. IMO the compiler should be able to sort out the constraints and conformances at compile time and give the expected behavior statically.

I don't know how that could work. Consider only the printA statements (ignoring the print statements). Here are some options for compiling printA:

  1. Static + Constant: Obtain a way of resolving T.A to a single type at compile time that works for any choice of T conforming to P. In this case, it gets resolved to Float, hence you get the behavior that you get today.
  2. Static + Varying: For every T that printA gets instantiated with, try to resolve T.A separately and create a separate copy of printA. This breaks the single representation requirement which we need for modularity. We can't have monomorphization as a requirement for all constrained generic functions, only as an optimization.
  3. Dynamic + Varying: Create a way of resolving T.A at run time (hence it may be different for different T's that are passed in). This requires some additional runtime lookup.

Do you have any suggestion on how it could work purely statically while giving different answers for Double and String without resorting to monomorphization?

To be clear, I'm not saying it can't work statically. It certainly can, it does work today (ignoring the last two lines). I'm saying that I don't see how it can work statically while still giving different results and having the "generic functions must compile to a single representation" constraint.
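The difference between a statically resolved answer and one looked up through the conformance already exists for methods, and a small sketch may make the options above more concrete. This is an illustration under assumptions, not the original program: `Shape`, `Circle`, and `describe` are hypothetical names. `name()` is a protocol requirement, so a call on a generic `T` goes through the conformance; `kind()` exists only in the protocol extension, so inside the generic function it is resolved once, at compile time, for every `T`.

```swift
// Hypothetical names: Shape, Circle, describe.
protocol Shape {
    // A requirement: calls on a generic T are dispatched through the conformance.
    func name() -> String
}

extension Shape {
    func name() -> String { "shape" }    // default implementation of the requirement
    func kind() -> String { "generic" }  // not a requirement: resolved statically
}

struct Circle: Shape {
    func name() -> String { "circle" }   // witness for the requirement
    func kind() -> String { "round" }    // only visible when the concrete type is known
}

func describe<T: Shape>(_ value: T) -> String {
    // name() varies per T; kind() is fixed to the extension's version.
    "\(value.name()) / \(value.kind())"
}

print(describe(Circle()))  // prints "circle / generic"
print(Circle().kind())     // prints "round"
```

The `kind()` call behaves like the "Static + Constant" option, and the `name()` call behaves like the "Dynamic + Varying" option. The split between the generic and the concrete result for `kind()` resembles the Float/Float versus Float/Bool split discussed in this thread, although the sketch says nothing about how associated type inference itself is implemented.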

Fun fact:

If you change the extension from this:

extension P where Self: BinaryFloatingPoint {
  typealias A = Float
}

to this:

extension BinaryFloatingPoint where Self: P {
  typealias A = Float
}

Then the example in the original post compiles as written and prints:

Float
Bool
Float
Bool

So here I’m going to show my total lack of understanding of how the compiler works:
Types that conform to protocols carry with them a witness table that describes the details of how they conform. This is how, for example, you can pass either an Int64 or an Int8 to a function that is generic over FixedWidthInteger and it will do the right thing.

In this case, part of “doing the right thing” involves knowing what is the type of A, so couldn’t that information also be added to the witness tables for everything that conforms to P?

That corresponds to what I'm saying in option 3. If you pass that information as part of the witness table (witness tables are passed as "hidden arguments" at run time), sure, you can have that information. But then the code generated for printA needs to go look at the witness table when it wants to retrieve the type information. Unless I'm misunderstanding something, that's an additional lookup we don't have today. This is what I was referring to in my original comment when I said "do we penalize all generic code with more lookups at run time".
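For comparison, here is a minimal sketch of the case where each conforming type states its associated type explicitly (or takes the default) and no constrained protocol extension is involved. `Container`, `IntBox`, `DefaultBox`, and `elementName` are hypothetical names. In this simpler case a single generic function does report a different associated type per conforming type; the sketch makes no claim about what that costs at run time or about how the compiler implements it.

```swift
// Hypothetical names: Container, IntBox, DefaultBox, elementName.
protocol Container {
    associatedtype Element = Bool
}

struct IntBox: Container {
    typealias Element = Int      // explicit choice for the associated type
}

struct DefaultBox: Container {}  // falls back to the default, Bool

func elementName<T: Container>(of _: T.Type) -> String {
    // T.Element depends on which conforming type T is.
    String(describing: T.Element.self)
}

print(elementName(of: IntBox.self))      // prints "Int"
print(elementName(of: DefaultBox.self))  // prints "Bool"
```

What differs in the original post is not the generic function but how the associated type for Double and String gets chosen when a typealias in a constrained protocol extension is also visible.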

If we don’t have that, how do we explain @Nevin’s example above?

Nevin's example works because the types are statically determined to be Float and Bool respectively. In the example posted at the beginning of the thread, the types are statically determined to be Float and Float respectively because of the way associated type inference works. You can check this because the following fails to compile:

protocol P {
  associatedtype A = Bool
}
extension P where Self: BinaryFloatingPoint {
  typealias A = Float
}
extension String: P { // error: String does not conform to P because of multiple matches A = String and A = Float
  typealias A = String
}

I think the argument you are making is that "why not have the code in the original post have the same behavior as in Nevin's post". Apart from backwards compatibility, one of the other reasons is what I spoke about earlier:

It's not obvious that it should, because that would let you "check" whether something conforms to a protocol or not. If, say, the stdlib later added an extension String: BinaryFloatingPoint (unlikely here, but not unlikely for some arbitrary type T and protocol P, especially where both T and P were not defined in the current module), the behavior of a compiling program would change without any warning.

When I say "compiling program", what I mean is: if you compiled a program against the stdlib relying on this (i.e. statically resolving the types to be Float and Bool), then a new version of the stdlib came out with an extension String : BinaryFloatingPoint, and you recompiled the program against this new stdlib without changing it, then the behavior of your newly compiled program is different from that of your old program even though you didn't change your code.

This is already true with Nevin's example today. I don't see that as a good reason for having more programs change semantics under you when you bump your dependencies but don't change the program itself.

The more trouble adding extensions to pre-existing types creates for downstream clients, the more difficult it becomes to evolve APIs.

Indeed. If you can reverse the constraint and make it work as expected then the current behavior certainly seems like a bug, or at least it is extremely confusing.

I don’t understand this argument. Doesn’t Nevin’s example allow the same check? In other words, if I want to use this trick to check whether something conforms to a protocol, I just have to write it like Nevin’s example and not like the example in the OP. But this distinction makes no sense from my perspective as a user, because in either case it looks like I’m defining A whenever I have a type that conforms to BinaryFloatingPoint & P, and that’s apparently not the case.
