SIL level substitutions and generic arguments

Hi All (and in particular, SIL experts like @Slava_Pestov, @John_McCall, @Joe_Groff, and @Erik_Eckstein),

I'm working on the interpreter for the constexpr evaluator and am having trouble understanding how to propagate substitutions reliably between callers and callees. Our branch is still a couple months old (the merge will hopefully will be done soon), so we have an older version of the generics system than is currently on master, and notably still has SubstitutionList.

I have something that works in many many cases, but doesn't handle "non-generic substitutions" (I say that, but I don't really understand what a non-generic substitution is). Doug provided a nice testcase which I've further reduced to the code at the end of this post. The issue I'm facing is that when resolving witness table members, sometimes I resolve it down to a SILFunction that has fewer requirements, than the apply that is invoking it. Specifically, in the testcase below, I have this call to T(something:):

func substitutionsF<T: substitutionsP>(_: T.Type) -> T {
  return T(something: substitutionsY())
}

Which is lowered into this SIL code:

  %9 = witness_method $T, #substitutionsP.init!allocator.1 : <Self where Self : substitutionsP><T where T : substitutionsP> (Self.Type) -> (T) -> Self : $@convention(witness_method: substitutionsP) <τ_0_0 where τ_0_0 : substitutionsP><τ_1_0 where τ_1_0 : substitutionsP> (@in τ_1_0, @thick τ_0_0.Type) -> @out τ_0_0 // user: %10
  %10 = apply %9<T, substitutionsY>(%0, %7, %3) : $@convention(witness_method: substitutionsP) <τ_0_0 where τ_0_0 : substitutionsP><τ_1_0 where τ_1_0 : substitutionsP> (@in τ_1_0, @thick τ_0_0.Type) -> @out τ_0_0

Note that the apply has two substitutions present. They alook like this:

T
  (abstract_conformance protocol=substitutionsP)

Y
  (normal_conformance type=substitutionsY protocol=substitutionsP
    (value req=init(something:) witness=pound_assert.(file).substitutionsY.init(something:)@/Users/clattner/swift-base/swift/test/SILOptimizer/pound_assert.swift:93:3)
    (value req=get() witness=pound_assert.(file).substitutionsY.get()@/Users/clattner/swift-base/swift/test/SILOptimizer/pound_assert.(lldb) 6:8))(lldb) 

The problem is that this is getting resolved to this witness table thunk:

// protocol witness for substitutionsP.init<A>(something:) in conformance substitutionsX
sil private [transparent] [thunk] @$S12pound_assert14substitutionsXVAA0C1PA2aDP9somethingxqd___tcAaDRd__lufCTW : $@convention(witness_method: substitutionsP) <τ_0_0 where τ_0_0 : substitutionsP> (@in τ_0_0, @thick substitutionsX.Type) -> @out substitutionsX {

... but that this thunk only has one requirement. I attempt to build a SubstitutionMap using the substitution list from the apply instruction, using this (effectively) this call:

        auto signature = fn->getLoweredFunctionType()->getGenericSignature();
        if (signature)
          substitutionMap = signature->getSubstitutionMap(substitutions);

However, this dies in the end of getSubstitutionMap() with this assert:

Assertion failed: (subs.empty() && "did not use all substitutions?!"), function getSubstitutionMap, ...

Beyond the assertion, I'm concerned that I'm not passing these things in the right order.

Ok, so coming back to the actual question :-), how does one go about matching up the substitutions in an apply instruction to the requirements expected by a SubstitutionMap? I need a SubstitutionMap whose keys are the generic types in the callee, but whose values are the constant propagated types that the constexpr evaluator is pushing around.

Thank you so much for any help on this, I've been banging my head against it for a long time...

-Chris

Here's the full testcase:

protocol substitutionsP {
  init<T: substitutionsP>(something: T)
  func get() -> Int
}

struct substitutionsX : substitutionsP {
  var state : Int
  init<T: substitutionsP>(something: T) {
    state = something.get()
  }

  func get() -> Int {
    return state
  }
}

struct substitutionsY : substitutionsP {
  init() {}
  init<T: substitutionsP>(something: T) {
  }

  func get() -> Int {
    return 123
  }
}
func substitutionsF<T: substitutionsP>(_: T.Type) -> T {
  return T(something: substitutionsY())
}

func testProto() {
  #assert(substitutionsF(substitutionsX.self).get() == 123)
}

Ok, I spent all day on this, and I now know a lot more about Substitution map and Generic signatures. I have a lot of cases working, using this fairly gross code, which maps from an apply instruction + resolved callee to a substitutionMap that makes sense in the callee:

  SubstitutionMap calleeSubMap;

  auto calleeFnType = callee->getLoweredFunctionType();
  if (auto signature = calleeFnType->getGenericSignature()){
    ApplySite AI(apply);

    // Get the substitution map of the call.  This maps from the callee's space
    // into the caller's world.
    auto requirementSig = AI.getOrigCalleeType()->getGenericSignature();
    auto callSubMap =
      requirementSig->getSubstitutionMap(apply->getSubstitutions());

    // If we are invoking a witness method, then strip off one level of generic
    // type parameter's index space, since Self will be bound to a concrete
    // type.
    if (calleeFnType->getRepresentation() ==
        SILFunctionType::Representation::WitnessMethod) {
      callSubMap = SubstitutionMap::combineSubstitutionMaps
        (callSubMap, callSubMap, CombineSubstitutionMaps::AtDepth,
         0, 1, calleeFnType->getGenericSignature());
    }
    // The substitution map for the callee is the composition of the callers
    // substitution map (which is always type/conformance to a concrete type
    // or conformance, with the mapping introduced by the call itself.  This
    // ensures that the callee's substitution map can map from its type
    // namespace back to concrete types and conformances.
    calleeSubMap = callSubMap.subst(substitutionMap);
  }

This seems to work in a lot of cases, but the blob of code for witness methods seems really unprincipled and weird to me. This (or something better than it!) is required because witness methods on the callers pass in a self requirement and the callee's generally know what self is.

This works in a lot of cases, but fails in at least one case I've seen where the caller is a default implementation in a conditional conformance. The case in question is in Collection<>.makeIterator() which is calling Collection.startIndex, and I know that Self == ClosedRange<Int>. Here is the call and the signature of the callee (full code below), and the state of callSubMap at the time that I'm trying to do this transformation:

  // call in Collection.makeIterator
  %8 = witness_method $Self, #Collection.startIndex!getter.1 : <Self where Self : Collection> (Self) -> () -> Self.Index : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Collection> (@in_guaranteed τ_0_0) -> @out τ_0_0.Index // user: %9
  %9 = apply %8<Self>(%7, %2) : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Collection> (@in_guaranteed τ_0_0) -> @out τ_0_0.Index

// Resolved callee.
// protocol witness for Collection.startIndex.getter in conformance <> ClosedRange<A>
sil public_external [transparent] [serialized] [thunk] [canonical] @$Ss11ClosedRangeVyxGs10Collectionss10StrideableRzs13SignedInteger6StrideRpzrlsADP10startIndex0I0QzvgTW : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Strideable, τ_0_0.Stride : SignedInteger> (@in_guaranteed ClosedRange<τ_0_0>) -> @out ClosedRange<τ_0_0>.Index {

// This is in LLDB
$ call callSubMap.dump()
Generic signature: <τ_0_0 where τ_0_0 : Collection>
Substitutions:
  τ_0_0 -> Self

Conformance map:
  τ_0_0 -> [(abstract_conformance protocol=Collection)

$ call signature->dump()
<τ_0_0 where τ_0_0 : Strideable, τ_0_0.Stride : SignedInteger>

The confusing thing to me about this is that I've missed a leap somewhere. The call is talking about Collection, but the callee is thinking that t_0_0 is the Index type of the collection. Where did the mapping happen? Needless to say, my little witness table mapping thing above does not handle this case correctly :-)

-Chris

// Full IR for caller and callee:

(lldb) call fn->dump()
// Collection<>.makeIterator()
sil public_external [serialized] [always_inline] [canonical] @$Ss10CollectionPss16IndexingIteratorVyxG0C0RtzrlE04makeC0AEyF : $@convention(method) <Self where Self : Collection, Self.Iterator == IndexingIterator<Self>> (@in_guaranteed Self) -> @out IndexingIterator<Self> {
// %0                                             // user: %13
// %1                                             // user: %3
bb0(%0 : $*IndexingIterator<Self>, %1 : $*Self):
  %2 = alloc_stack $Self                          // users: %17, %14, %9, %6, %3
  copy_addr %1 to [initialization] %2 : $*Self    // id: %3
  %4 = alloc_stack $IndexingIterator<Self>        // users: %16, %15, %13, %10, %5
  %5 = struct_element_addr %4 : $*IndexingIterator<Self>, #IndexingIterator._elements // user: %6
  copy_addr %2 to [initialization] %5 : $*Self    // id: %6
  %7 = alloc_stack $Self.Index                    // users: %12, %11, %9
  %8 = witness_method $Self, #Collection.startIndex!getter.1 : <Self where Self : Collection> (Self) -> () -> Self.Index : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Collection> (@in_guaranteed τ_0_0) -> @out τ_0_0.Index // user: %9
  %9 = apply %8<Self>(%7, %2) : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Collection> (@in_guaranteed τ_0_0) -> @out τ_0_0.Index
  %10 = struct_element_addr %4 : $*IndexingIterator<Self>, #IndexingIterator._position // user: %11
  copy_addr [take] %7 to [initialization] %10 : $*Self.Index // id: %11
  dealloc_stack %7 : $*Self.Index                 // id: %12
  copy_addr %4 to [initialization] %0 : $*IndexingIterator<Self> // id: %13
  destroy_addr %2 : $*Self                        // id: %14
  destroy_addr %4 : $*IndexingIterator<Self>      // id: %15
  dealloc_stack %4 : $*IndexingIterator<Self>     // id: %16
  dealloc_stack %2 : $*Self                       // id: %17
  %18 = tuple ()                                  // user: %19
  return %18 : $()                                // id: %19
} // end sil function '$Ss10CollectionPss16IndexingIteratorVyxG0C0RtzrlE04makeC0AEyF'



(lldb) call callee->dump()
// protocol witness for Collection.startIndex.getter in conformance <> ClosedRange<A>
sil public_external [transparent] [serialized] [thunk] [canonical] @$Ss11ClosedRangeVyxGs10Collectionss10StrideableRzs13SignedInteger6StrideRpzrlsADP10startIndex0I0QzvgTW : $@convention(witness_method: Collection) <τ_0_0 where τ_0_0 : Strideable, τ_0_0.Stride : SignedInteger> (@in_guaranteed ClosedRange<τ_0_0>) -> @out ClosedRange<τ_0_0>.Index {
// %0                                             // user: %3
// %1                                             // user: %3
bb0(%0 : $*ClosedRange<τ_0_0>.Index, %1 : $*ClosedRange<τ_0_0>):
  // function_ref ClosedRange<>.startIndex.getter
  %2 = function_ref @$Ss11ClosedRangeVss10StrideableRzs13SignedInteger6StrideRpzrlE10startIndexABssACRzsAdFRQrlE0H0Oyx_Gvg : $@convention(method) <τ_0_0 where τ_0_0 : Strideable, τ_0_0.Stride : SignedInteger> (@in_guaranteed ClosedRange<τ_0_0>) -> @out ClosedRange<τ_0_0>.Index // user: %3
  %3 = apply %2<τ_0_0, τ_0_0.Stride>(%0, %1) : $@convention(method) <τ_0_0 where τ_0_0 : Strideable, τ_0_0.Stride : SignedInteger> (@in_guaranteed ClosedRange<τ_0_0>) -> @out ClosedRange<τ_0_0>.Index
  %4 = tuple ()                                   // user: %5
  return %4 : $()                                 // id: %5
} // end sil function '$Ss11ClosedRangeVyxGs10Collectionss10StrideableRzs13SignedInteger6StrideRpzrlsADP10startIndex0I0QzvgTW'
1 Like

incidentally, in addition to this nice short code, this seems to work equally well/poorly for my testcases. It seems like it would result in exactly the substitutionMap needed by the genericSignature, whereas the short code above computes a superset of the mapping. I'm curious to know if this is better/worse, or if it is all wrong and I should do something else entirely :-)

    calleeSubMap = signature->getSubstitutionMap(
    [&](SubstitutableType *type) -> Type {
      // First map the callee type back to the caller's world.
      auto result = Type(type).subst(callSubMap);
      // Then map from the caller's world onto a concrete type.
      return simplifyType(result);
    }, [&](CanType type, Type replacement, ProtocolType *proto)
        -> Optional<ProtocolConformanceRef> {
      auto confResult = callSubMap.lookupConformance(type, proto->getDecl());

      assert(confResult.hasValue() &&
            "ConstExpr evaluation should always resolve substitutions");
      // If we resolved this all the way down to a concrete conformance, then
      // we are done.
      auto result = confResult.getValue();
      if (result.isConcrete()) return result;

      // Otherwise, we can use the callers substitution map to resolve it.
      assert(!substitutionMap.empty() && "Should have submap for caller");
      auto protocolDecl = result.getAbstract();

      // Map the callee's type back to the caller's namespace.
      auto callerType = Type(type).subst(callSubMap)->getCanonicalType();

      // Look up the conformance in the caller's space.
      confResult = substitutionMap.lookupConformance(callerType, protocolDecl);
      assert(confResult.hasValue() && confResult.getValue().isConcrete() &&
             "ConstExpr evaluation should always resolve substitutions");
      return confResult.getValue();
    });

Thanks,

-Chris

For more context, this patch is included in PR17298

This is one of the long-standing messy issues with generic representation in SIL. @John_McCall has long had ideas for how to better represented "substituted generic environments", which would give us a way to represent the abstract generic environment of an unknown protocol witness along with its concretization for a specific conformance, <Self: Protocol with <T, U> Self == ConformingGenericType<T, U>> or something like that. Absent that, there's an implicit contract that a concrete witness_method has all of its type-level generic arguments supplied by the type in Self position, so for a witness applying to conforming type Foo<T, U> you want to turn the Self-oriented <t_0_0: P, t_1_0, t_1_...> signature into <T, U, t_1_0, t_1_...>. The specializer and devirtualizer have to contend with this mess already; maybe there are helpers you can extract from those passes to do this manipulation.

Can you please elaborate on this implicit contract? I'm particularly curious about the makeIterator -> Collection.startIndex jump in the second comment there. I feel like I'm missing a critical piece of this puzzle.

-Chris

Hi Chris,

You can find code for mapping substitutions from the protocol requirement's generic signature to the witness thunk's generic signature in the devirtualizer. Look at lib/SILOptimizer/Utils/Devirtualizer.cpp, getSubstitutionsForCallee() and getWitnessMethodSubstitutions().

Feel free to refactor the code to move these functions to a common location. Please don't roll your own, because there are multiple tricky corner cases that they were all painfully extended to handle over time.

The code in this file also handles class methods because you can have similar mismatches. For example, imagine you have this code:

class Base<T, U> {
  func foo() {}
}

class Derived<T> : Base<[T], Int> {
  override func foo() {}
}

let b: Base<[String], Int> = Derived()
b.foo()

Here I'm calling foo() on a value of type Base so the substitutions are in terms of the generic signature of Base, but in reality the method is implemented on Derived, so again they have to be translated.

As @Joe_Groff said, we need better abstractions to express this eventually. For now, it's kind of messy, so let's try to at least only have one implementation of this remapping. :)

1 Like

Thank you so much for the help. I manufactured a testcase that the devirtualizer could work on to see how it is handling my case. It appears that the missing link is that I'm not using the substitution map from the conformance produced by auto baseSubMap = conformance->getSubstitutions(mod);.

This is a bit weird for me, and I think this is a pretty significant problem for SIL: my symbolic value representation for a function pointer is naturally a SILFunction*: resolving a witness_method_inst always produces one of these (given that I always have a concrete type coming in), and when I propagate this to an apply_inst, I want to be able to resolve the substitution map. Given that the actual conformance is also required to resolve the mapping, I think this means I need to expand my symbolic value representation to be a "SILFunction* + Conformance" so I have the information necessary when I get to the apply. I will give this a try.

That said, I think that this is a pretty big problem for SIL itself as well - is IRGen able to properly lower witness_method_inst and apply_inst separately? E.g. if they end up in different functions because of outlining or something?

If I understand the situation correctly, it seems that this mapping information that is currently in the conformance should actually be represented in the "substitution list" thing on the apply_inst.

-Chris

Indeed, this looks like it might be working! Thank you Slava and Joe!

I'm dealing with a somewhat different but related problem here:
Rewrite SILCombiner::propagateConcreteTypeOfInitExistential. #17315

In this case, I need to dig all of an opened existential's conformances out of an earlier init_existential before performing concrete type substitution on an apply. I was given the explanation that the optimizer should not be in the business of conformance lookup so we can guarantee that conformances are fully evaluated by the typechecker. That makes sense to me.

I only understand enough of this stuff to hurt myself with, but I think there is a serious representational problem. Things with the witness_method calling convention have the type of the conformance, but it seems like it should have the full conformance information in the SILFunctionType.

A SILFunctionType with witness_method convention stores a conformance already. So if you start with a generic SILFunctionType and substitute Self with a concrete type using a substitution map, it will have a concrete conformance inside of it.

I'm not sure what you mean with witness_method and apply instructions being in different functions due to outlining. When we apply to call something generic, we store the conformance in the substitution list that is passed in to the callee, regardless of what kind of callee it is (static function_ref, a witness_method, a closure we found somewhere, etc).

Aha, that was the missing link that I didn't understand. It took a bit of working with it to figure out, but this looks like exactly what I needed. Thank you so much for the help @Slava_Pestov. My code is nicely simplified in PR 17461.

-Chris

1 Like