Improved operator lookup semantics

rudkx · May 22, 2018, 11:36pm

I wrote up a pitch for an idea which I believe if implemented will make it possible to eliminate exponential type checking times in virtually all cases for expressions composed primarily of operators (as opposed to say large collection literals).

Actually achieving the intended effect will also require making use of this feature within the standard library, which is outside the scope of this pitch and will be a follow-on pitch.

I wanted to start getting some feedback and discussion of this idea soon in order to improve the quality of the write-up and address any obvious shortcomings, and then start looking at what the changes to the standard library will end up looking like.

https://github.com/rudkx/swift-evolution/blob/improved-operator-lookup-semantics/proposals/0000-improved-operator-lookup-semantics.md

Chris_Lattner3 · May 23, 2018, 3:24am

From a quick read: +1 to the idea. I'm concerned that a strict interpretation of "This lookup behaves as if it were member lookup on the metatype" because in many cases you can't infer the type of an expression without contextual result information:

let x: Int8 = 1 + 1

You can't wait for the LHS to get resolved to resolve the +, you have to use the result type information of Int8 and propagate that up. More generally though, assuming you have an answer for this, this seems like the sensible start to the proposal.

I see that you're keen to split the actual stdlib changes out from the proposal, and that may be fine from a formalism standpoint, but I'd be opposed to running this review without having both proposals ready to evaluate. Learning from SE-0091, we should be wary about taking language changes with the promise of compile time improvement without an implementation that 1) demonstrates those improvements, and 2) is source compatible enough to be possible to take in practice.

Thank you for pushing this forward, this will be a huge step forward!!
-Chris

rudkx · May 23, 2018, 3:55am

Chris_Lattner3:

I'm concerned that a strict interpretation of "This lookup behaves as if it were member lookup on the metatype" because in many cases you can't infer the type of an expression without contextual result information:
let x: Int8 = 1 + 1

This example would fall under the category of expressions with a leading literal, covered by the section Literal operator lookup protocols, so the lookup in this case would happen on the protocol associated with leading literal arguments for +, which in the stdlib implementation I imagine would yield an operator with type (T, T) -> T, so the context would naturally be taken into account (it would be the only binding we would try when solving the expression).

I totally agree that we should not run the proposal in isolation without the stdlib proposal fleshed out and some prototype implementation that demonstrates that we think we can really achieve the expected benefit.

Chris_Lattner3 · May 23, 2018, 4:00am

I see, great!

xwu · May 29, 2018, 4:06pm

This is certainly an issue that needs work in Swift, so I’m glad you’re tackling it. I had some shower thoughts on the topic a while back, but I can’t recall if I jotted them down somewhere. One issue that I do see with the proposal draft is the intention to base the “target” on associativity:

For one, we have some non-associative operators in Swift.

But more saliently, for right-associative operators, it is not actually intuitive that the right-hand side determines the target type. It turns out that using the left-hand side to determine the target type works out more predictably. At some point, I had many concrete examples of this, but I can’t recall what all of those are today—although it did not take too much effort to arrive at this conclusion and I imagine you will start to see it too. One rationalization of that empiric finding is that we read code left-to-right. Therefore, it would be more problematic if “x ** 2” doesn’t typecheck than if “2 ** x” doesn’t typecheck. Always using the left-hand side as the target type also more closely parallels the syntax we would use if the operation were spelled as a typical instance method (i.e., “lhs.method(rhs)”). If operators were declared within types not as static methods but instance methods (e.g., “func ** (rhs: Self) -> Self”), then using the left-hand side always as the target type would be very much become the only natural behavior, and perhaps we could consider that syntax change as well.

hlovatt · May 29, 2018, 9:13pm

Good suggestion. Also good idea to type on LHS always, not based on binding.

nuclearace · May 29, 2018, 9:16pm

If we do this, I would also like to see something akin to Scalas operator syntax. Where you can call methods without . and () if the method takes one argument.

rudkx · May 30, 2018, 11:26pm

Yes, good point. I hadn't realized we were allowing this as long as they weren't used in a sequence.

In retrospect I don't recall why I thought this would make sense, and I don't have an example in mind where I think it helps or clarifies anything, so I'll remove it and specify that the type of the left operand determines where lookup will occur.

This is exactly where I started with this idea when it first occurred to me a couple months ago. I first imagined it as a straightforward desugaring from infix to member syntax, like:

  x + y

becomes

  x.+(y)

I realized that the same effect (in terms of helping compile-time performance) could be achieved without actually adding instance member operators and support for direct invocation of these member functions by instead just changing the lookup rules. I mentioned in the pitch that I'd really like to keep that as a separate idea to consider at a future date.

jrose · June 1, 2018, 5:45pm

This is great. Thanks for following through on it, Mark.

The literal-on-the-left situation still bugs me, particularly because 3 + x and "abc" + x really shouldn't share a protocol. What would the effects be if it turned into a contextual lookup, like we do for prefix dot syntax? That is, 3 + x would be turned into .+(3, x), which requires that we know the result type like .init() does? (And would propagate the default Int outward, I guess.)

rudkx · June 1, 2018, 8:19pm

I'm curious what your reasoning is for thinking these shouldn't share a protocol.

The protocol describes a capability that is shared by multiple types. I don't see this as being any different than the use of protocols in non-operator situations.

I think that's a very interesting idea, but it's not clear to me what we would do when we had no context, e.g.:

  let x = 1 + 2 // aka .+(1, 2)
  let y = 1 + 2 + 3 // aka .+(.+(1,2), 3)
  let z = (1 + 2) + (3 + 4) // aka .+(.+(1,2), .+(3,4))
  let s = S()
  let tmp = 1 + s // perhaps S is ExpressibleByIntegerLiteral or we have a
                  // +(Int, S) and +(S, Int) defined

I'm not sure what you mean here, but perhaps you mean that for expressions like those above we would try to use the default literal types of all the direct literal operands of a given subexpression as the context?

jrose · June 1, 2018, 8:33pm

The protocol describes a capability that is shared by multiple types. I don't see this as being any different than the use of protocols in non-operator situations.

Protocols define a semantic capability that is shared by multiple types, not just a syntactic one. We don't have separate protocols for + and -, for example, but - only works on Numeric types, not Collections.

I'm not sure what you mean here, but perhaps you mean that for expressions like those above we would try to use the default literal types of all the direct literal operands of a given subexpression as the context?

That's correct. let tmp = 1 + s would become .+(1, s) with a fallback result type of Int. We must already have a way to deal with ambiguous fallback types because of the following example:

func f(_ x: Int, _ y: Character) { print("Int, Character") }
func f(_ x: Double, _ y: String) { print("Double, String") }
f(1, "a") // error: ambiguous use of 'f'

griotspeak · June 2, 2018, 8:20am

If we're actually considering a significant change to operators, it would be nice to consider eliminating top level operators.

xwu · June 2, 2018, 8:41pm

How would you then declare an operator that takes two tuple operands?

felix91gr · June 2, 2018, 9:27pm

Wait. Are top-level operators... global operators? As in, not bound to a type?

griotspeak · June 2, 2018, 9:32pm

/me mumbles about non-nominal types ruining everything.

Well, perhaps on all nominal types. The presence of top-level operators causes issues for compilation performance that we might be able to work around without much pain. @Graydon_Hoare explained it better

felix91gr · June 2, 2018, 9:37pm

So those would be types that are present in the compiler, but not the language itself? Like tuples.

rxwei · June 3, 2018, 1:46am

+1!

Earlier we added a protocol that has a + operator in the tensorflow fork of the standard library, and that protocol alone (without any conformance in stdlib) caused some existing sema tests to fail. So the current operator lookup behavior is pretty fragile.

rudkx · June 3, 2018, 4:08am

You run into the same issue with fully generic functions, e.g. if you want to define a pipe-forward operator for Swift:

  func |> <D,R>(arg: D, fn: (D) -> R) -> R { return fn(arg) }

You need to be able to look this up for all types.

rudkx · June 3, 2018, 4:11am

Yes, this is one of the really unfortunate aspects of the current behavior that I want to fix with this pitch. Currently simply importing other modules can result in code that previously compiled in a reasonable period of time to compile much more slowly or hit the "too complex" error.

rudkx · June 3, 2018, 4:23am

I'm not sure if this is the point you're trying to get across, but what I would say is that the pitch is taking an existing mechanism and using it both for its originally intended purpose and additionally as a convenient mechanism to reduce the scope lookup for a special case (subexpressions with leading literals).

I don't follow how that would work for something like

  let x = 1.0
  let y = 1 + x

The type we want for y is Double, not Int, so looking the operator up in Int wouldn't work.

There are cases where you have heterogeneous operations that yield results of types that are completely different from their source types, too. For example consider a type T for which the natural type of the result of computing the difference of two Ts (done by perhaps the - operator) is an Int rather than another T. It doesn't seem more natural to me to define these operations on T in extensions of Int.