Reconsidering the global uniqueness of type metadata and protocol conformance instances

The Swift runtime currently maintains globally unique pointer identities for type metadata and protocol conformances. This makes checking type equivalence a trivial pointer equality comparison, but most operations on generic values do not really care about exact type identity and only need to invoke value or protocol witness methods or consult other data in the type metadata structure. I think it's worth reevaluating whether having globally unique type metadata objects is the correct design choice. Maintaining global uniqueness of metadata instances carries a number of costs. Any code that wants type metadata for an instance of a generic type, even a fully concrete one, must make a potentially expensive runtime call to get the canonical metadata instance. This also greatly complicates our ability to emit specializations of type metadata, value witness tables, or protocol witness tables for concrete instances of generic types, since specializations would need to be registered with the runtime as canonical metadata objects, and it would be difficult to do this lazily and still reliably favor specializations over more generic witnesses. The lack of witness table specializations leaves an obnoxious performance cliff for instances of generic types that end up inside existential containers or cross into unspecialized code. The runtime also obligates binaries to provide the canonical metadata for all of their public types, along with all the dependent value witnesses, class methods, and protocol witness tables, meaning a type abstraction can never be completely "zero-cost" across modules.
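
To make the first cost concrete, here is a sketch (Box is a hypothetical stand-in for any generic value type; the runtime entry point named in the comment is the existing one):

struct Box<T> {
    var value: T
}

func erase(_ box: Box<Int>) -> Any {
    // Forming the existential needs type metadata for Box<Int>. Today the
    // compiler calls a metadata accessor (ultimately swift_getGenericMetadata)
    // to fetch the one canonical instance, even though Box<Int> is fully
    // concrete at this call site.
    return box
}

print(erase(Box(value: 42)))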

On the other hand, if type metadata did not need to be unique, then the compiler would be free to emit specialized type metadata and protocol witness tables for fully concrete instances of generic value types without consulting the runtime. This would let us avoid runtime calls to fetch metadata in specialized code, and would make it much easier for us to implement witness specialization. It would also give us the ability to potentially extend the "inlinable" concept to public fragile types, making it a client's responsibility to emit metadata for the type when needed and keeping the type from affecting its home module's ABI. This could significantly reduce the size and ABI surface area of the standard library, since the standard library contains a lot of generic lightweight adapter types for collections and other abstractions that are intended to be optimized away in most use cases.

There are of course benefits to globally unique metadata objects that we would lose if we gave up uniqueness. Operations that do check type identity, such as comparison, hashing, and dynamic casting, would have to perform more expensive checks, and nonunique metadata objects would need to carry additional information to enable those checks. It is likely that class objects would have to remain globally unique, if for no other reason than that the Objective-C runtime requires it on Apple platforms. Having multiple equivalent copies of type metadata has the potential to increase the working set of an app in some situations, although it's likely that redundant compiler-emitted copies of value type metadata would at least be able to live in constant pages mapped from disk instead of getting dynamically instantiated by the runtime like everything is today. There could also be subtle source-breaking behavior for code that bitcasts metatype values to integers or pointers and expects bit-level equality to indicate type equality. It seems unlikely to me that giving up uniqueness would buy us any simplification to the runtime, since the runtime would still need to be able to instantiate metadata for unspecialized code, and we would still want to unique runtime-instantiated metadata objects as an optimization.

Overall, my intuition is that the tradeoffs come out in favor of nonunique metadata objects, but what do you all think? Is there anything I'm missing?

-Joe

The Swift runtime currently maintains globally unique pointer identities for type metadata and protocol conformances. This makes checking type equivalence a trivial pointer equality comparison, but most operations on generic values do not really care about exact type identity and only need to invoke value or protocol witness methods or consult other data in the type metadata structure. I think it's worth reevaluating whether having globally unique type metadata objects is the correct design choice. Maintaining global uniqueness of metadata instances carries a number of costs. Any code that wants type metadata for an instance of a generic type, even a fully concrete one, must make a potentially expensive runtime call to get the canonical metadata instance. This also greatly complicates our ability to emit specializations of type metadata, value witness tables, or protocol witness tables for concrete instances of generic types, since specializations would need to be registered with the runtime as canonical metadata objects,

This seems pretty compelling

and it would be difficult to do this lazily and still reliably favor specializations over more generic witnesses.

What do you mean by doing this lazily and favoring the specializations?

The lack of witness table specializations leaves an obnoxious performance cliff for instances of generic types that end up inside existential containers or cross into unspecialized code. The runtime also obligates binaries to provide the canonical metadata for all of their public types, along with all the dependent value witnesses, class methods, and protocol witness tables, meaning a type abstraction can never be completely "zero-cost" across modules.

Do you have some examples here to illustrate? E.g. if I pass an instance of a concrete type to something taking a T: Hashable, how does that currently work vs. how it would work with this change? Would this mean that I can just pass off a function pointer to the hashing function? Do I need some kind of id scheme if the callee might want to cast or do something else with it?

On the other hand, if type metadata did not need to be unique, then the compiler would be free to emit specialized type metadata and protocol witness tables for fully concrete instances of generic value types without consulting the runtime. This would let us avoid runtime calls to fetch metadata in specialized code, and would make it much easier for us to implement witness specialization. It would also give us the ability to potentially extend the "inlinable" concept to public fragile types, making it a client's responsibility to emit metadata for the type when needed and keeping the type from affecting its home module's ABI. This could significantly reduce the size and ABI surface area of the standard library, since the standard library contains a lot of generic lightweight adapter types for collections and other abstractions that are intended to be optimized away in most use cases.

There are of course benefits to globally unique metadata objects that we would lose if we gave up uniqueness. Operations that do check type identity, such as comparison, hashing, and dynamic casting, would have to perform more expensive checks, and nonunique metadata objects would need to carry additional information to enable those checks.

How do you think this will work? Do you think we will want a way to go from a non-unique type metadata to some kind of canonical, uniqued type metadata? Would it make sense to key this off of a mangled name?

It is likely that class objects would have to remain globally unique, if for no other reason than that the Objective-C runtime requires it on Apple platforms. Having multiple equivalent copies of type metadata has the potential to increase the working set of an app in some situations, although it's likely that redundant compiler-emitted copies of value type metadata would at least be able to live in constant pages mapped from disk instead of getting dynamically instantiated by the runtime like everything is today. There could also be subtle source-breaking behavior for code that bitcasts metatype values to integers or pointers and expects bit-level equality to indicate type equality.

This sounds very niche, and I don’t think we have promised any kind of stability or compatibility in that area.

···

On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


In a premature proposal two years ago, we agreed to ditch unique protocol conformances but install the canonical address as the first entry in each specialized table. That would mitigate the disadvantages that you pointed to. But, we would also lose the ability to emit specialized metadata/conformances in constant pages. How do you feel about that tradeoff?

-Andy

···

On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


Is there a way to split the difference?

You would still have a repository of canonical metadata, but you could have another non-unique structure which can be used for most cases without having to go get the canonical version. You could use the non-unique structure to look up the canonical version when desired, and the structure would contain enough information to create a canonical version lazily the first time it is needed.

That way, you could just use the non-unique structures… but they could be exchanged for a unique/canonical version (or one can be created) when that really is needed.
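
A rough sketch of such a structure (all names hypothetical):

struct NonuniqueTypeHandle {
    // Compiler-emitted and possibly duplicated across binaries; enough
    // for witness dispatch without any runtime call.
    var metadata: UnsafeRawPointer
    // Enough information (e.g. a mangled name) to find or lazily create
    // the canonical version when identity really matters.
    var mangledName: UnsafePointer<CChar>
    // Cache for the canonical pointer once it has been resolved.
    var cachedCanonical: UnsafeRawPointer?
}

func canonicalMetadata(for handle: inout NonuniqueTypeHandle,
                       resolve: (UnsafePointer<CChar>) -> UnsafeRawPointer)
    -> UnsafeRawPointer {
    if let canonical = handle.cachedCanonical { return canonical }
    let canonical = resolve(handle.mangledName)  // runtime lookup or creation
    handle.cachedCanonical = canonical
    return canonical
}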

I am a bit out of my depth here, so hopefully this makes at least a little sense...

Thanks,
Jon

···

On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


I think your proposal makes sense, particularly when we start caring about metadata/conformances for non-nominal types, which don’t have a declaration location. They are a bit over the horizon right now, but we need to support making tuples conform to protocols someday. Eliminating the requirement for them to be uniquely emitted across the entire program would make that much simpler, because otherwise you’re in the land of weak symbols or something.

-Chris

···

On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


Would it be possible, whenever a specialized class is instantiated and its metadata already exists with a generic vtable, to just overwrite the vtable pointer in the metadata with the specialized version?
I didn’t think that through, but maybe the same could be done for witness tables?

···

On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


The Swift runtime currently maintains globally unique pointer identities for type metadata and protocol conformances. This makes checking type equivalence a trivial pointer equality comparison, but most operations on generic values do not really care about exact type identity and only need to invoke value or protocol witness methods or consult other data in the type metadata structure. I think it's worth reevaluating whether having globally unique type metadata objects is the correct design choice. Maintaining global uniqueness of metadata instances carries a number of costs. Any code that wants type metadata for an instance of a generic type, even a fully concrete one, must make a potentially expensive runtime call to get the canonical metadata instance. This also greatly complicates our ability to emit specializations of type metadata, value witness tables, or protocol witness tables for concrete instances of generic types, since specializations would need to be registered with the runtime as canonical metadata objects,

This seems pretty compelling

and it would be difficult to do this lazily and still reliably favor specializations over more generic witnesses.

What do you mean by doing this lazily and favoring the specializations?

We've generally tried to make metadata instantiation as lazy as possible. If we have runtime-mediated unique metadata but the compiler can also generate specialized metadata candidates for concrete instances, then you either have to eagerly scan for and register specialized instances the first time you try to instantiate any instance of a generic type, or you keep it lazy and let the first metadata the runtime sees win, which runs the risk of an unspecialized, fully runtime-synthesized instance getting anointed as the official instance before any specialization can be registered. To be fair, there's a potential mitigation here too, since the value witness table is independent of the type metadata; we could keep the canonical metadata address stable and redirect the value witness table to a better candidate if we discover one. This is still all a lot more complex than forgoing the need for a canonical instance altogether.

The lack of witness table specializations leaves an obnoxious performance cliff for instances of generic types that end up inside existential containers or cross into unspecialized code. The runtime also obligates binaries to provide the canonical metadata for all of their public types, along with all the dependent value witnesses, class methods, and protocol witness tables, meaning a type abstraction can never be completely "zero-cost" across modules.

Do you have some examples here to illustrate? E.g. if I pass an instance of a concrete type to something taking a T: Hashable, how does that currently work vs. how it would work with this change? Would this mean that I can just pass off a function pointer to the hashing function? Do I need some kind of id scheme if the callee might want to cast or do something else with it?

Today, given:

struct Foo<T>: Hashable { ... }

We'll generate one unspecialized protocol witness table for forall T. Foo<T>: Hashable. The uniqueness of the witness table is significant, since it may be part of the parameterization of a type like Dictionary with a Hashable type parameter. If you pass Foo<Int> as a U: Hashable to an unspecialized generic function, we'll still pass that set of fully unspecialized witnesses, so we get no benefit from knowing T == Int at runtime. If there weren't that uniqueness requirement on the witness table, then the compiler could synthesize a specialized witness table for Foo<Int>.
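
A usage sketch of that cliff (useHashing is a hypothetical stand-in for any unspecialized generic function, and Foo's Hashable conformance is synthesized here for brevity):

struct Foo<T: Hashable>: Hashable {
    var value: T
}

func useHashing<U: Hashable>(_ x: U) -> Int {
    return x.hashValue  // dispatches through whatever witness table was passed in
}

// Today this call passes the single unspecialized witness table for
// forall T. Foo<T>: Hashable, so nothing here benefits from U == Foo<Int>.
let h = useHashing(Foo(value: 42))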

On the other hand, if type metadata did not need to be unique, then the compiler would be free to emit specialized type metadata and protocol witness tables for fully concrete instances of generic value types without consulting the runtime. This would let us avoid runtime calls to fetch metadata in specialized code, and would make it much easier for us to implement witness specialization. It would also give us the ability to potentially extend the "inlinable" concept to public fragile types, making it a client's responsibility to emit metadata for the type when needed and keeping the type from affecting its home module's ABI. This could significantly reduce the size and ABI surface area of the standard library, since the standard library contains a lot of generic lightweight adapter types for collections and other abstractions that are intended to be optimized away in most use cases.

There are of course benefits to globally unique metadata objects that we would lose if we gave up uniqueness. Operations that do check type identity, such as comparison, hashing, and dynamic casting, would have to perform more expensive checks, and nonunique metadata objects would need to carry additional information to enable those checks.

How do you think this will work? Do you think we will want a way to go from a non-unique type metadata to some kind of canonical, uniqued type metadata? Would it make sense to key this off of a mangled name?

Adding a pointer to the mangled type string or some other unique identifier to the witness table seems workable to me.
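
For example (a sketch with hypothetical field names; the real witness table layout differs):

struct HashableWitnessTableHeader {
    var typeIdentifier: UnsafePointer<CChar>  // e.g. the mangled type name
    var hashValueWitness: UnsafeRawPointer    // witness for hashValue
    var equalsWitness: UnsafeRawPointer       // witness for ==
}

Two equivalent specialized tables emitted by different binaries would then compare equal through typeIdentifier even though their addresses differ.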

It is likely that class objects would have to remain globally unique, if for no other reason than that the Objective-C runtime requires it on Apple platforms. Having multiple equivalent copies of type metadata has the potential to increase the working set of an app in some situations, although it's likely that redundant compiler-emitted copies of value type metadata would at least be able to live in constant pages mapped from disk instead of getting dynamically instantiated by the runtime like everything is today. There could also be subtle source-breaking behavior for code that bitcasts metatype values to integers or pointers and expects bit-level equality to indicate type equality.

This sounds very niche, and I don’t think we have promised any kind of stability or compatibility in that area.

Sure, but it is something that people could conceivably rely on that we would break.

-Joe

···

On Jul 28, 2017, at 2:53 PM, Michael Ilseman <milseman@apple.com> wrote:


We could. This is similar to what we already have to do for type metadata for types imported from C. Since there's no canonical place for a C type's metadata to live, each binary that needs the type's metadata emits a candidate with the type name, and the runtime chooses one (currently, the first one that asks) to be the canonical type. We could do something similar with some or all metadata.
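
A simplified sketch of that candidate scheme (hypothetical types; the real runtime uses its own concurrent structures rather than Foundation):

import Foundation

final class CandidateRegistry {
    private var canonical: [String: UnsafeRawPointer] = [:]
    private let lock = NSLock()

    // Each binary that needs the type's metadata offers its own emitted
    // candidate under the type's name; the first registration wins and
    // becomes the canonical instance for the whole process.
    func canonicalize(_ name: String, candidate: UnsafeRawPointer) -> UnsafeRawPointer {
        lock.lock()
        defer { lock.unlock() }
        if let winner = canonical[name] { return winner }
        canonical[name] = candidate
        return candidate
    }
}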

-Joe

···

On Jul 28, 2017, at 4:27 PM, Jonathan Hull <jhull@gbis.com> wrote:


In a premature proposal two years ago, we agreed to ditch unique protocol conformances but install the canonical address as the first entry in each specialized table.

This would be a reference to (unique) global data about the conformance, not a reference to some canonical version of the protocol witness table. We do not rely on having a canonical protocol witness table. The only reason we unique them (when we do need to instantiate) is because we don't want to track their lifetimes.

That would mitigate the disadvantages that you pointed to. But, we would also lose the ability to emit specialized metadata/conformances in constant pages. How do you feel about that tradeoff?

Note that, per above, it's only specialized constant type metadata that we would lose.

I continue to feel that having to do structural equality tests on type metadata would be a huge loss.

John.

···

On Jul 28, 2017, at 6:02 PM, Andrew Trick via swift-dev <swift-dev@swift.org> wrote:


Not really, because the conformance is presumably still declared somewhere in Swift and therefore has a natural unique definition point even if the type doesn't.

John.

···

On Jul 29, 2017, at 4:24 PM, Chris Lattner via swift-dev <swift-dev@swift.org> wrote:


Sure, we could redirect tables at runtime. It would require more runtime infrastructure, though we can't really get around uniquing vtables for classes, so that might be unavoidable.

-Joe

···

On Jul 31, 2017, at 9:21 AM, Erik Eckstein <eeckstein@apple.com> wrote:


I don't think it necessarily needs to be deep structural equality. If the type metadata object or value witness table had a pointer to a mangled type name string, we could strcmp those strings to compare equality, which doesn't seem terribly onerous to me, though if it were we could perhaps use the string to lazily resolve the canonical type metadata pointer, sort of like we do with type metadata for imported C types today.
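
A minimal sketch of that check (hypothetical layout; import Glibc instead of Darwin on Linux):

import Darwin  // for strcmp

struct NonuniqueMetadata {
    var mangledName: UnsafePointer<CChar>  // unique identifier for the type
    // ... witnesses and the rest of the layout follow
}

func sameType(_ a: UnsafePointer<NonuniqueMetadata>,
              _ b: UnsafePointer<NonuniqueMetadata>) -> Bool {
    if a == b { return true }  // fast path: literally the same copy
    // Equivalent copies emitted by different binaries compare equal by name.
    return strcmp(a.pointee.mangledName, b.pointee.mangledName) == 0
}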

-Joe

···

On Jul 28, 2017, at 3:15 PM, John McCall <rjmccall@apple.com> wrote:


So generic code to instantiate type metadata would have to construct these mangled strings eagerly?

John.

···

On Jul 28, 2017, at 6:24 PM, Joe Groff <jgroff@apple.com> wrote:


My question was really: are we going to runtime-initialize the specialized metadata and specialized witness tables in order to install the unique identifier, rather than requiring a runtime call whenever we need the unique ID? I think the answer is “yes”, we want to install the ID at initialization time for fast type comparison, hashing, and casting.

-Andy

···

On Jul 28, 2017, at 3:15 PM, John McCall <rjmccall@apple.com> wrote:


Ok, so you’re suggesting that the stdlib would have the “automatically provided” conditional conformances for things like Equatable, then each module that actually uses one gets a specialization?

-Chris

···

On Jul 29, 2017, at 1:32 PM, John McCall <rjmccall@apple.com> wrote:


Well, presumably the stdlib's generic conformance would actually be usable itself, but yes, other modules could of course emit specialized witness tables if they want.

John.

···

On Jul 29, 2017, at 4:33 PM, Chris Lattner <clattner@nondot.org> wrote:


We already do exactly that for the ObjC runtime name of generic class instantiations, for what it's worth, but it could conceivably be lazy as well, at the cost of making the comparison yet more expensive. There aren't that many runtime operations that need to do type comparison, though—the ones I can think of are casting and the equality/hashing operations on Any.Type—so how important is efficient type comparison?

-Joe

···

On Jul 28, 2017, at 15:35, Joe Groff via swift-dev <swift-dev@swift.org> wrote:

On Jul 28, 2017, at 3:30 PM, John McCall <rjmccall@apple.com> wrote:

On Jul 28, 2017, at 6:24 PM, Joe Groff <jgroff@apple.com> wrote:

I don't think it necessarily needs to be deep structural equality. If the type metadata object or value witness table had a pointer to a mangled type name string, we could strcmp those strings to check equality, which doesn't seem terribly onerous to me, though if it were, we could perhaps use the string to lazily resolve the canonical type metadata pointer, sort of like we do with type metadata for imported C types today.

So generic code to instantiate type metadata would have to construct these mangled strings eagerly?

We already do exactly that for the ObjC runtime name of generic class instantiations, for what it's worth, but it could conceivably be lazy as well, at the cost of making the comparison yet more expensive. There aren't that many runtime operations that need to do type comparison, though—the ones I can think of are casting and the equality/hashing operations on Any.Type—so how important is efficient type comparison?

I'm still strongly against any feature that relies on type names being present at runtime. I think we should be able to omit those for both code size and secrecy reasons when the type isn't an @objc class or protocol.

For the actual feature, I'd be interested in how it interacts with conflicting protocol conformances in different frameworks, including both two different modules trying to do retroactive modeling and a library adding a conformance in a later release that a client may have also added.

Jordan
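For concreteness, a minimal C sketch of the comparison Joe floats above; the names are hypothetical, and a real implementation could also cache a lazily resolved canonical pointer as he suggests. Pointer equality stays the fast path, and equivalent nonunique copies fall back to comparing their mangled-name strings:

#include <stdbool.h>
#include <string.h>

/* Hypothetical nonunique metadata header: every copy of the metadata
   for a given type carries a pointer to that type's mangled name. */
typedef struct TypeMetadata {
    const char *mangledName;
    /* value witness table, stored-property layout, etc. would follow */
} TypeMetadata;

/* Type identity without globally unique addresses. */
static bool sameType(const TypeMetadata *a, const TypeMetadata *b) {
    if (a == b)
        return true;            /* same copy: the common fast path */
    return strcmp(a->mangledName, b->mangledName) == 0;
}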

···

On Jul 28, 2017, at 3:59 PM, Jordan Rose <jordan_rose@apple.com> wrote:

I'm still strongly against any feature that relies on type names being present at runtime. I think we should be able to omit those for both code size and secrecy reasons when the type isn't an @objc class or protocol.

I think nonuniquing type metadata gives us a bit more leeway to discard runtime info about types, at least public ones, since it's no longer the owning binary's sole responsibility to provide canonical metadata.

For the actual feature, I'd be interested in how it interacts with conflicting protocol conformances in different frameworks, including both two different modules trying to do retroactive modeling and a library adding a conformance in a later release that a client may have also added.

We could still have some way of uniquely identifying a conformance to prevent these collisions.

-Joe
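One purely illustrative shape that "some way of uniquely identifying a conformance" could take; none of this is settled design, and the names are invented. If each conformance record carried the identities of the type, the protocol, and the module that declared the conformance, the runtime (or a static tool) could at least detect when two binaries supply conflicting conformances for the same type/protocol pair:

#include <stdbool.h>
#include <string.h>

/* Invented for illustration: identify a conformance by what conforms,
   to what, and where the conformance was declared. */
typedef struct ConformanceKey {
    const char *typeIdentity;      /* e.g. a mangled type name */
    const char *protocolIdentity;  /* e.g. a mangled protocol name */
    const char *definingModule;
} ConformanceKey;

/* The collision Jordan describes: the same (type, protocol) pair
   registered by two different modules, e.g. two frameworks doing
   retroactive modeling, or a library and its client both adding
   the conformance. */
static bool conformancesCollide(const ConformanceKey *a,
                                const ConformanceKey *b) {
    return strcmp(a->typeIdentity, b->typeIdentity) == 0
        && strcmp(a->protocolIdentity, b->protocolIdentity) == 0
        && strcmp(a->definingModule, b->definingModule) != 0;
}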

···

On Jul 28, 2017, at 16:03, Joe Groff <jgroff@apple.com> wrote:

I think nonuniquing type metadata gives us a bit more leeway to discard runtime info about types, at least public ones, since it's no longer the owning binary's sole responsibility to provide canonical metadata.

My point is we can't use our usual mangling in the string, because that contains type names.

For the actual feature, I'd be interested in how it interacts with conflicting protocol conformances in different frameworks, including both two different modules trying to do retroactive modeling and a library adding a conformance in a later release that a client may have also added.

We could still have some way of uniquely identifying a conformance to prevent these collisions.

To be clear, we don't have an answer for this today. It's something we need to discuss along with library evolution. :-)

Jordan
