Why do open_existential instructions use a unique type per call site?

Hi,

I am looking into a performance issue that Mark Lacey identified yesterday. He pointed out that the implementation of calls that go through protocols could be optimized by CSE-ing the method lookup. Before each call we emit code for loading the pointer of the witness table from the existential container, and then we perform another lookup in the witness table to find the method pointer. In some cases we can prove that the witness table is not changed by the program and we could reuse the method pointer and eliminate two loads.

Consider the code below:

protocol Pingable { func ping() }

func foo(x : Pingable) {
  x.ping()
  x.ping()
}

The Swift code above is translated into the abridged SIL function below:

sil hidden @_TF4main3fooFPS_8Pingable_T_ : $@convention(thin) (@in Pingable) -> () {
bb0(%0 : $*Pingable):

// "CE4" call
  %2 = open_existential_addr %0 : $*Pingable to $*@opened(“8DF56CE4-...") Pingable
  %3 = witness_method $@opened(“8DF56CE4...") Pingable, #Pingable.ping!1, %2 : $*@opened(“8DF56CE4-..") Pingable : ..
  %4 = apply %3<@opened(“8DF56CE4-...") Pingable>(%2) : …

// “4B4" call
  %5 = open_existential_addr %0 : $*Pingable to $*@opened("8DF574B4-..” )Pingable
  %6 = witness_method $@opened(“8DF574B4...") Pingable, #Pingable.ping!1, %5 : $*@opened("8DF574B4-") Singable : …
  %7 = apply %6<@opened("8DF574B4-") Pingable>(%5) : …

destroy_addr %0 : $*Pingable
  %9 = tuple ()
  return %9 : $()
}

The three instructions open_existential_addr,witness_method and the apply are paired together with a unique type id. This means that the optimizer can’t replace the method pointer that was computed by the first call and use it for the second call. In the example above, we can’t use %3 for making the call in %7. Notice that the ID of the first call ends with CE4 while the second ID ends with 4B4.

My question is, why are the open_existential_addr+witness_method+apply instructions tied together with a unique type? Can we come up with a different representation that will allow us to optimize the witness method lookups?

Thanks
-Nadav

It uses a unique ID because every single opening site might find a different type. Even on a single address, we don’t know that the type hasn’t been changed between opening sites. And representationally, we aren’t set up to do anything like have a type expressed in terms of a SIL value reference that then becomes canonically the same type as another if you CSE the value reference; our type representation expects canonical type equality to be pointer equality.

It should, however, be very straightforward to write a utility that clones (or rewrites) an instruction’s opened-existential uses. The limitation there is that we don’t currently have a way to find all the occurrences of an opened existential type. However, Mark and I were talking several weeks ago about adding implicit extra use tracking to instructions that are expressed in terms of a type other than the type of a value operand, so that, if those type(s) happen to contain opened archetypes, we can mark them as using the opening site.

Every instruction that has a Substitution or bare Type in its representation would have a variadic set of these extra uses, and they would be implicitly set up when creating the instruction by inspecting those types, finding the opened archetypes, and asking the SILFunction where those archetypes were introduced. (It’s efficient to find out whether a type is expressed in terms of any opened archetypes.)

John.

···

On Feb 18, 2016, at 9:46 AM, Nadav Rotem via swift-dev <swift-dev@swift.org> wrote:
Hi,

I am looking into a performance issue that Mark Lacey identified yesterday. He pointed out that the implementation of calls that go through protocols could be optimized by CSE-ing the method lookup. Before each call we emit code for loading the pointer of the witness table from the existential container, and then we perform another lookup in the witness table to find the method pointer. In some cases we can prove that the witness table is not changed by the program and we could reuse the method pointer and eliminate two loads.

Consider the code below:

protocol Pingable { func ping() }

func foo(x : Pingable) {
x.ping()
x.ping()
}

The Swift code above is translated into the abridged SIL function below:

sil hidden @_TF4main3fooFPS_8Pingable_T_ : $@convention(thin) (@in Pingable) -> () {
bb0(%0 : $*Pingable):

// "CE4" call
%2 = open_existential_addr %0 : $*Pingable to $*@opened(“8DF56CE4-...") Pingable
%3 = witness_method $@opened(“8DF56CE4...") Pingable, #Pingable.ping!1, %2 : $*@opened(“8DF56CE4-..") Pingable : ..
%4 = apply %3<@opened(“8DF56CE4-...") Pingable>(%2) : …

// “4B4" call
%5 = open_existential_addr %0 : $*Pingable to $*@opened("8DF574B4-..” )Pingable
%6 = witness_method $@opened(“8DF574B4...") Pingable, #Pingable.ping!1, %5 : $*@opened("8DF574B4-") Singable : …
%7 = apply %6<@opened("8DF574B4-") Pingable>(%5) : …

destroy_addr %0 : $*Pingable
%9 = tuple ()
return %9 : $()
}

The three instructions open_existential_addr,witness_method and the apply are paired together with a unique type id. This means that the optimizer can’t replace the method pointer that was computed by the first call and use it for the second call. In the example above, we can’t use %3 for making the call in %7. Notice that the ID of the first call ends with CE4 while the second ID ends with 4B4.

My question is, why are the open_existential_addr+witness_method+apply instructions tied together with a unique type? Can we come up with a different representation that will allow us to optimize the witness method lookups?