Need Help Understanding Protocols and Generics

jrose · June 18, 2020, 6:53pm

Wow, you weren't kidding. :-) I'll try to answer these as best I can, though in the interest of time I'm not going to try to verify all the little statements in the details disclosures. (I noticed a few are a little off but they seem mostly correct.)

(Question E) I do not see any metadata that would capture the fact that Q inherits from P . How is that relationship tracked?

E) Inheriting from another protocol is essentially the same as saying where Self: P from the runtime's perspective, so I believe that's part of the requirement signature in a protocol descriptor.

(Question F) I do not see any metadata to track non-required implementations provided in a protocol extension, such as P.id2 . I assume that is because those implementations are dispatched statically at compile time, so there is no need to reference them in metadata intended for use by the runtime. Is that correct?

F) It's correct that protocol descriptors do not include protocol extension members. In general, it's not possible to include all protocol extension members, because they might be defined in downstream modules. But it's certainly technically feasible to give protocol extension members in the same module some kind of special privileges; doing so for arbitrary protocol extension members would require some more thought on implementation. (See also a very old pitch of mine: [Pitch] Overridable Members in Extensions)

When a Protocol Conformance Record is created, a Protocol Witness Table is created.

This one's off enough that I do have to call it out for further answers to make sense. A single conformance record is created for Y: Q, but each instantiation of Y is going to get its own witness table: Y<Int>: Q, Y<String>: Q, and so on. That has to happen at run time for a generic type.

(Question G) It seems that the first PWT, for Y: P , will be unconditional in nature, and will have just one requirement, P.id . Will the entry for that requirement point at both the default implementation provided in P and the default implementation provided in Q ? (If not, please see my notes hidden under the PWT reveal, which explain why I so surmise.)

G) No, a witness table contains a single "witness" reference for each requirement. (With the caveat that an associated type requirement can have more than one datum associated with a witness: the type itself, plus references to witness tables for protocols that are specified as constraints.) All your PWT notes seem correct to me, so I'm not sure why you'd come to this conclusion; today, the compiler's going to pick exactly one witness based on an overload-resolution-like process and that's what's going to go into the conformance record, which is used to generate the witness table.

(Question H) The second PWT, for Y: Q , will be conditional in nature. If I understand this correctly, this table will not have any entries, because Q has no requirements of its own. The id requirement belongs to P. Is that correct? It doesn't feel right...

H) That's correct…nearly. In practice, because of what I said before about inheritance, the protocol witness table for Y<T>: Q will have one entry, which is a reference to the protocol witness table for Y<T>: P. (Arguably the compiler could optimize away "marker" protocols like this, where there's only one parent and no requirements, but that would make things more complicated, so there's no such optimization implemented today.)
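
To make the shape described in (G) and (H) a bit more concrete, here is a toy model in plain Swift. It is only a sketch: real witness tables are opaque, compiler-emitted structures, not Swift structs, and the names ToyPTable, ToyQTable, makeToyPTable and makeToyQTable are made up for illustration. The declarations of P, Q and Y are my reconstruction of the example under discussion, not a quote from it.

```swift
// Assumed reconstruction of the example from the thread.
protocol P { var id: String { get } }
extension P { var id: String { "P" } }
protocol Q: P {}
extension Q { var id: String { "Q" } }

struct Y<T> {}
extension Y: P {}
extension Y: Q where T: Equatable {}

// Hypothetical stand-in for the table of Y<T>: P.
// One slot per requirement, and each slot holds exactly one witness.
struct ToyPTable<T> {
    let id: (Y<T>) -> String
}

// Hypothetical stand-in for the table of Y<T>: Q.
// Q has no requirements of its own, so the only entry is the
// reference to the inherited conformance Y<T>: P.
struct ToyQTable<T> {
    let base: ToyPTable<T>
}

func makeToyPTable<T>(for _: T.Type) -> ToyPTable<T> {
    // T is unconstrained here, so `value.id` can only resolve to the
    // default in the extension of P, never the one in the extension of Q.
    ToyPTable(id: { value in value.id })
}

func makeToyQTable<T: Equatable>(for type: T.Type) -> ToyQTable<T> {
    ToyQTable(base: makeToyPTable(for: type))
}

let qTable = makeToyQTable(for: Int.self)
print(qTable.base.id(Y<Int>()))  // prints "P"
```

The point of the model is only that the single `id` slot is filled once, in a context that knows nothing about T being Equatable.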

(Question I) I find virtually no documentation for PWTs. Aside from the source code and pull requests, does any documentation exist? Are there specific source files that would be useful to review?

I) Good question. It doesn't help that we talk about "witness tables" and "conformances" at different parts of the compiler stack, and then in the runtime you have to think about both. Someone else might be able to talk about this better, but you can look at FragileWitnessTableBuilder and ProtocolConformanceDescriptorBuilder in GenProto.cpp to get the general idea. There's not much documentation because it's treated as an opaque structure by the runtime; any time you need something from a protocol witness table, you're usually just selecting one item at a known offset (say, the witness for P.id). The protocol conformance descriptor has a little more structure, but only enough to understand how to instantiate witness tables.

(Question J) For a conditional conformance, it seems that the PWT is provided via an accessor, which I take to be a function. Is that correct?

J) I'm not sure, and part of that is that I think you're conflating references to protocol witness tables in code with references in data. In data sections, you'd want to reference a witness table by its symbol, but if it needs instantiation (like for a generic type), you'd have to call a function instead. I'd expect the PWT-by-accessor to happen for any generic type, but maybe the compiler is smart enough to realize that Y: P doesn't actually depend on T. (I'm also not sure if the accessors are generated once-per-protocol, or once-per-use-site, since they'll be small calls to a runtime function anyway.)

(Question K) Does the runtime use the PWT accessor to determine whether a given instance of a generic type satisfies a conditional conformance?

K) It looks like the checking of requirements happens in the runtime function ProtocolConformanceDescriptor::getWitnessTable and a (complex) helper swift::_checkGenericRequirements. I'm not quite sure how this relates to your question, though, because when a witness table is referenced in code I'm not sure it goes through the accessor functions I talked about in (J).
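
One place where a run-time conformance check is observable from source is a dynamic cast. The sketch below assumes the same P, Q and Y declarations as the example in this thread (reconstructed, not quoted); NotEquatable and conformsToQ are hypothetical names. I'm assuming, without having verified it, that the cast ends up in the requirement-checking code mentioned above.

```swift
protocol P { var id: String { get } }
extension P { var id: String { "P" } }
protocol Q: P {}

struct Y<T> {}
extension Y: P {}
extension Y: Q where T: Equatable {}

// Hypothetical type that deliberately does not conform to Equatable.
struct NotEquatable {}

// Erasing to Any means the compiler cannot answer the question statically;
// whether the value conforms to Q has to be decided at run time.
func conformsToQ(_ value: Any) -> Bool {
    value is Q
}

print(conformsToQ(Y<Int>()))           // true: Int is Equatable
print(conformsToQ(Y<NotEquatable>()))  // false: the conditional requirement fails
```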

(Question L) What does the signature of the PWT accessor look like? How does it behave, and how is it used?

L) I'm gonna pass on answering this one because I'm not sure which accessors you're talking about. :-(

(Question M) I assume the "template" to which @jrose referred, upthread, is used solely by the compiler, and so is not persisted in the metadata. Is that correct?

M) That's incorrect. It's called the witness table "pattern" in the runtime; I called it a template because it's used to fill out parts of an instantiated witness table that won't change. Some conformances seem to put everything in the instantiation function, though.

(Question N) The documentation loosely refers to a Generic Argument Vector and a Generic Parameter Vector. Are those one and the same?

N) Not sure but probably.

(Question O) The documentation refers to a generic parameter descriptor (as part of the Nominal Type Descriptor). Is that descriptor describing the layout of the Generic Argument Vector?

O) Yes.

(Question P) In what ways is the Generic Argument Vector used by the runtime?

P) As a whole unit, not much. But this is where generic arguments and any associated conformances are stored, and so if a method on a generic type wants to reference a generic argument, that's how it'll do it: by going to the type metadata and looking at the right offset.
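
A small illustration of a method on a generic type referring to its generic argument. Box and elementTypeName are made-up names, and the comments describe my understanding of what happens underneath rather than anything visible in the source.

```swift
struct Box<T> {
    var value: T

    // No type is passed explicitly; the method recovers T at run time
    // (as I understand it, from the type metadata for Box<T>).
    func elementTypeName() -> String {
        String(describing: T.self)
    }
}

print(Box(value: 1).elementTypeName())    // Int
print(Box(value: "a").elementTypeName())  // String
```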

(Question Q) To select a witness for y.id , are the PWTs for Y: P and Y: Q both accessed (and why or why not)? What does each PWT report back?

Q) The compiler chooses what to call for y.id using overload resolution; it finds a requirement on P, a method on an extension on P, and a method on an extension on Q. The last is the most specific, so it gets chosen. Nothing ever looks at the contents of any witness tables for this to happen.
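
Here is a minimal sketch of that compile-time choice, using my reconstruction of the declarations from this thread (the exact originals may differ; NotEquatable is a hypothetical helper type).

```swift
protocol P { var id: String { get } }
extension P { var id: String { "P" } }
protocol Q: P {}
extension Q { var id: String { "Q" } }

struct Y<T> {}
extension Y: P {}
extension Y: Q where T: Equatable {}

struct NotEquatable {}

// Y<Int> conforms to Q, so the extension on Q is visible and more specific.
print(Y<Int>().id)           // Q

// Y<NotEquatable> does not conform to Q, so only the members on P are candidates.
print(Y<NotEquatable>().id)  // P
```

Both calls are resolved statically from the concrete type written at the call site.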

(Question R) How does the runtime determine that Q is more specialized than P ?

R) N/A

(Question S) The call to y.id2 was dispatched statically, at compile time. The call goes to the implementation at P.id2 . Inside that getter, a call is made to id . That call is to a protocol requirement, so it is dispatched dynamically, correct?

S) Correct. Inside P.id2, there are only two options: the requirement on P, and the extension method on P. I can't remember the specific rules that make the requirement better than the extension method; in particular, I can't remember if a constrained extension method on P will ever be chosen when calling from an identically-constrained extension method on P.
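
A quick way to see the dynamic dispatch is to give one conforming type its own id. UsesDefault and Custom are hypothetical types added for this sketch.

```swift
protocol P { var id: String { get } }
extension P {
    var id: String { "P" }
    // `id` here refers to the requirement, so the call goes through
    // whatever witness the conforming type supplied.
    var id2: String { id }
}

struct UsesDefault: P {}
struct Custom: P {
    var id: String { "Custom" }
}

print(UsesDefault().id2)  // P
print(Custom().id2)       // Custom
```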

(Question T) To select a witness for the call to id inside of P.id2 , which PWTs are accessed, and why? What is reported back by the consulted PWT(s)?

T) Protocol extension methods are basically generic methods in disguise, so at run time, P.id2 has two additional hidden parameters: the concrete type it's being called on, and the conformance of that type to P (i.e. the witness table). This means it's the caller that chooses what witness table to pass when calling P.id2; the compiler sees that the concrete type is Y<Int> and the method in question is on an extension of P, so it's going to pass Y<Int>: P. At this point there's information about the generic arguments of Y, but no information about overloads of id; the one-size-fits-all implementation has already been chosen by the compiler back when we said Y: P.
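
The "generic method in disguise" view can be written out by hand. In this sketch id2Equivalent is a hypothetical free function that spells out what P.id2 looks like with its hidden parameters made explicit in source; the declarations are again my reconstruction of the thread's example.

```swift
protocol P { var id: String { get } }
extension P {
    var id: String { "P" }
    var id2: String { id }
}
protocol Q: P {}
extension Q { var id: String { "Q" } }

struct Y<T> {}
extension Y: P {}
extension Y: Q where T: Equatable {}

// Hypothetical equivalent of P.id2: the generic parameter S and its
// conformance to P play the role of the two hidden parameters.
func id2Equivalent<S: P>(_ value: S) -> String {
    value.id
}

let y = Y<Int>()
print(y.id2)             // P
print(id2Equivalent(y))  // P
```

Both calls are handed the conformance Y<Int>: P, whose witness for id was fixed when Y: P was declared, so neither can see the implementation in the extension on Q.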

I could use double-checking from someone else on all of these, but especially I-N and P.