Pitch #3: Opt-in Reflection metadata

I like the idea of overloading the APIs, but in Swift 6 we propose to deprecate the -disable-reflection-metadata and -reflection-metadata-for-debugger-only options to eliminate the possibility of the compiler emitting incorrect code, which will leave us with opt-in and full-emission modes only.

I believe the question here is which one to make a default behaviour in Swift 6, based on the tradeoffs we are ready to accept.

  1. The initial idea was to enable Opt-in mode by default in Swift 6. Yes, some apps will break, but there will be a binary size win for everybody. (Swift 6 is going to break compatibility anyway, so we were ready to accept it as collateral damage.)

  2. On the other hand, if we make Full-emission mode on by default, the compatibility won't be broken and everything will work as expected until a developer explicitly enables Opt-in mode.

For (1) we could also soften consequences, but I don't think we will be able to fully eliminate them as I mentioned in my post above.

As is, both options should be equally easy to implement. The mitigations for option (1) will bring lots of complications, though, especially if we want them to be backward compatible.

Any thoughts? @ksluder @benpious

We didn't initially have plans to require reflection metadata for using such APIs because not everybody wants to leak implementation details with reflection metadata.

I'm not necessarily suggesting this. But I think if you want to prevent regressions, you need to require that types that are passed to the non-debug functions (String(describing:), string interpolation) implement one of the following:

  • Reflectable
  • CustomStringConvertible
  • TextOutputStreamable

I don't think you need to require this for debugging functions like print, and dump's documentation does imply the use of reflection metadata.

So my suggestion would be to make a new protocol, perhaps named StringConvertible, and have Reflectable, CustomStringConvertible, and TextOutputStreamable be subtypes of it. Then you can modify String(describing:) and String interpolation to only accept types conforming to StringConvertible, and the compiler will be able to tell folks when they are passing something to these functions that will have a different output in Swift 6.
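
To make the shape of that suggestion concrete, here is a minimal sketch. StringConvertibleSketch and ReflectableSketch are hypothetical stand-ins (neither exists in the standard library), and because user code can't make CustomStringConvertible refine a new protocol, the types below opt in explicitly instead.

```swift
// Hypothetical stand-ins: neither protocol exists in the standard library.
protocol StringConvertibleSketch {}
protocol ReflectableSketch: StringConvertibleSketch {}

// In this sketch each type opts in explicitly; the suggestion would instead make
// CustomStringConvertible and TextOutputStreamable refine the new protocol.
struct UserID: CustomStringConvertible, StringConvertibleSketch {
    let raw: Int
    var description: String { "user-\(raw)" }
}

struct Point: ReflectableSketch {
    let x: Int
    let y: Int
}

struct Opaque {
    let secret: Int
}

// Constrained entry point: only opted-in types are accepted.
func describe<T: StringConvertibleSketch>(_ value: T) -> String {
    String(describing: value)
}

let idText = describe(UserID(raw: 7))
let pointText = describe(Point(x: 1, y: 2))
// describe(Opaque(secret: 1))  // would not compile: Opaque does not conform
```

The commented-out call is the interesting part: the compiler, not the runtime, is what tells you a type's output would change.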

Theoretically, we also could just codemod all call-sites of print/dump/String(describing:) to arg as! Reflectable

Let's not lose sight of the goal: to avoid breakages, not to force everyone to implement Reflectable. This will produce broken code in cases where people are relying on CustomStringConvertible instead. In fact, this might break any prints that are using Strings.

Another, unrelated question: what happens if I write

extension String: Reflectable {}

I don't own String, and the Swift stdlib has already been compiled. So presumably this implementation will do nothing, right?

Wouldn't this break ABI compatibility?

Ah, yeah, it would.

Imo continuing to emit full metadata (in all configurations) unless turned off explicitly for the module is the way to go. Having differences in behaviour between debug and release configurations by default sounds like a recipe for confusion to me, and the potential savings in binary size don't look big enough to me to make up for that. Disabling metadata on the default Vapor template reduced the binary (in release mode, Swift 5.6.1) from 32MB to 30MB (plus the version built without metadata failed to run properly).

I also wanna mention that I think the way print and string interpolation result in a useful description of a value, without the need to implement anything or even to conform to any marker protocol, is a very valuable feature of Swift and shouldn't be discarded lightly. Maybe even to the point where it might be worth somehow making that work with reflection metadata disabled.
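
For reference, this is the behaviour being described, as it works today with full metadata. Point is just an illustrative type with no conformances at all.

```swift
// An illustrative type that implements nothing and conforms to nothing.
struct Point {
    let x: Int
    let y: Int
}

let p = Point(x: 1, y: 2)

// Both paths build the description from the type's field metadata.
let viaDescribing = String(describing: p)
let viaInterpolation = "\(p)"

print(viaDescribing)
```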

I agree that it would be best not to make this the default unless it can be made safe, but here’s some devil’s advocate perspective:

Disabling metadata on the default Vapor template reduced the binary (in release mode, Swift 5.6.1) from 32MB to 30MB

The Instagram app is 200mb installed today, according to the App Store. If it was written entirely in Swift, the binary size contribution of type metadata would probably be between 15 and 20mb (based on my anecdotal experience with the Uber apps and your number from Vapor, the contribution of type metadata is probably between 5-10% of the binary). I think it was 10mb for the Uber Rider app last we checked, and we are over 100mb in binary size.

So it’s not a trivial amount, especially if you’re a big company organized around the feature team model. I imagine if you’re trying to use Swift for some more systems oriented purpose it might also be useful to turn this off.

I feel like this pitch is missing a few steps: "what metadata does Swift emit today, what purpose is it used for, what can we omit tomorrow, and what do we have to keep?"

I share some of the concerns of people who are relying on printed representations today, particularly around enums, but I think gating on Swift 6 probably makes that safe enough. Part of updating to Swift 6 will be remembering to add Reflectable where it matters. The one place where that wouldn't be good enough is if a client is relying on a library's type's printed representation, but it's not something the library wanted to commit to.

I want to suggest that if full debugging is enabled (i.e. not "line tables only") the full metadata should be emitted in a debug section no matter what mode you're in (EDIT: unless you're in the "on" mode and it's already in the binary). That way, it generally won't be shipped with the binary, but you don't lose any debugging capabilities you would have otherwise had. That might also simplify the flag story: it could just be "off", "opt-in", "on".

Thank you all for the feedback folks!

I think I have a plan in mind for how safety issues can be addressed:

  1. Deprecate the existing API that might be using reflection metadata in release builds with @available(swift, deprecated: 5.8, message: "Argument should conform to Reflectable") but keep it around for compatibility with apps built with older stdlib versions.

  2. Introduce an overloaded variant of that API with a generic requirement on conformance to the Reflectable protocol.

  3. The compiler will autosynthesize conformance to Reflectable for all types, if reflection metadata mode is "Fully available".

  4. Allow a force cast (as! Reflectable) to silence the warning. (Reflection metadata is still not available in that case.)

In that case, during migration to Swift 6, if such an API is used, a developer will get a compile-time warning pointing out the need to add the conformance to Reflectable if the developer wants reflection capabilities.

A developer will be able to silence the warning by force-casting as! Reflectable or by enabling reflection metadata in full.
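
Steps 1 and 2 could look roughly like the sketch below. ReflectableSketch and dumpFields are hypothetical names made up for illustration; this is not the stdlib API, only the overloading pattern.

```swift
// Hypothetical marker protocol standing in for the pitched Reflectable.
protocol ReflectableSketch {}

// Step 1: the existing unconstrained entry point stays for compatibility
// but is marked deprecated.
@available(swift, deprecated: 5.8, message: "Argument should conform to Reflectable")
func dumpFields(_ value: Any) -> String {
    "legacy: \(value)"
}

// Step 2: a generic overload that requires the conformance.
func dumpFields<T: ReflectableSketch>(_ value: T) -> String {
    let labels = Mirror(reflecting: value).children.compactMap { $0.label }
    return "reflectable: \(labels.joined(separator: ", "))"
}

struct Point: ReflectableSketch {
    let x: Int
    let y: Int
}

struct Opaque {
    let secret: Int
}

// A conforming argument resolves to the constrained overload, with no warning.
let reflected = dumpFields(Point(x: 1, y: 2))

// A non-conforming argument can only match the deprecated overload,
// so this line would compile with the deprecation warning:
// let legacy = dumpFields(Opaque(secret: 3))
```

Overload ranking prefers the generic version for conforming types because passing a concrete value as Any costs an extra conversion.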

I wonder if the polarity is right here. Maybe it would make sense if:

  • All nominal types default to a lazy internal implicit conformance to Reflectable. The type conforms to Reflectable, but only if something in the module requires that conformance. Outside the module, it is not known to conform.
  • The implicit conformance can be explicitly disabled.
  • The implicit conformance can also be explicitly declared, allowing it to be public.

I can imagine wanting to set the default for a whole module, but I definitely don't want to have to track down every type in a module that values code size or secrecy and ensure that I've written !Reflectable or whatever. I do see that we're in a unique situation because at least basic reflection capabilities are currently provided to every type by default, though.

All nominal types default to a lazy internal implicit conformance to Reflectable

Would the compiler be able to statically determine all such cases to add implicit conformance?

Apart from these, it doesn't seem too different from what is being proposed.
However, the distinction between internal and public conformances might be more complicated to comprehend.
Jordan's point also makes a lot of sense because disabling might not be a good solution, and we discussed in the beginning that it should be an opt-in rather than an opt-out mechanism to control the emission of reflection symbols.

We updated the proposal doc and the implementation. A few key points:

  • Introduced Reflectable Casts (as! Reflectable, as? Reflectable, is Reflectable)
  • Opt-in mode is set by default starting from Swift 6.
  • Synthesized conformance to Reflectable for all declarations if reflection metadata is enabled in full.
  • Implicit conversion to Reflectable is forbidden.
  • Better diagnostics.

Proposal - [Proposal] Opt-In Reflection metadata by maxovtsin · Pull Request #1203 · apple/swift-evolution · GitHub
Implementation - Swift Opt-In Reflection metadata by maxovtsin · Pull Request #34199 · apple/swift · GitHub
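
As a rough illustration of how a consumer might use the casts, here is a sketch with an ordinary marker protocol. ReflectableSketch and fieldLabels(of:) are hypothetical; the real Reflectable would be compiler-known, and an ordinary protocol only imitates the cast syntax.

```swift
// Hypothetical marker protocol standing in for the proposed Reflectable.
protocol ReflectableSketch {}

struct Point: ReflectableSketch {
    let x: Int
    let y: Int
}

struct Opaque {
    let secret: Int
}

// Returns field labels only for values that opted in;
// an empty list is the fallback for everything else.
func fieldLabels(of value: Any) -> [String] {
    guard value is ReflectableSketch else { return [] }
    return Mirror(reflecting: value).children.compactMap { $0.label }
}

let visible = fieldLabels(of: Point(x: 1, y: 2))
let hidden = fieldLabels(of: Opaque(secret: 3))
```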

I'm glad to see continued progress here, but my previous feedback still stands:

  • What metadata does Swift 5 currently include for all types that Swift 6 will make opt-in? The existing flags are not documented and not very well known, so this is knowledge you can't expect reviewers to have. (In particular, I would hope that the names of non-Reflectable types would not appear in the final binary, and this proposal does not tell me if that is the case.)

  • What stdlib APIs (and perhaps Foundation APIs, as part of corelibs) will behave differently on non-Reflectable types? In particular, if a type is used with NSCoding, even as a generic parameter, it must be findable by name later, which would be a migration hazard going from Swift 5 to Swift 6 that could result in the loss of user data.

  • What happens when I compile in release mode with full debug info? Does my debugging experience suffer? (more than it does today)


I also think not supporting the Reflectable casts on older OSs makes this tricky to adopt on the consumer side, but maybe it's okay because you're not proposing to add bounds to existing stdlib APIs, which will have some sensible fallback (like "no children") for non-Reflectable types on new and old OSs. Still, I know our dynamic cast system is hookable, and it may be that on older OSs you can use something like "has no name" as a proxy for non-Reflectable.


A thought I've just had now: what reflection metadata is generated for imported types? Do we have any hope of controlling that?

Apologies for not addressing these concerns earlier.
Let me try to do that now, and if the explanation makes sense I'll update the proposal accordingly.

What metadata does Swift 5 currently include for all types that Swift 6 will make opt-in? The existing flags are not documented and not very well known, so this is knowledge you can't expect reviewers to have.

I perhaps need to add more details to the proposal explaining different kinds of metadata and what information reflection metadata contains.
But in general, there are two levels of metadata:

  1. Core metadata, such as the type metadata record, nominal type descriptor, etc.
  2. Reflection metadata, which contains information about fields' types and their names (data from the swift5_fieldmd section of a binary).

Core metadata will be emitted in full and is not affected by this proposal, while Reflection metadata will be emitted only for types that conform to the Reflectable protocol or for debug builds.

(In particular, I would hope that the names of non-Reflectable types would not appear in the final binary, and this proposal does not tell me if that is the case.)

Type names are kept in the nominal type descriptor, which isn't part of this proposal.
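
A small example of that split as seen from user code today. Account is just an illustrative type; the comments map each query to the metadata level described above.

```swift
struct Account {
    let owner: String
    let balance: Int
}

let account = Account(owner: "Ada", balance: 42)

// Field names and field types are read through Mirror,
// which relies on reflection (field) metadata.
let fields = Mirror(reflecting: account).children.map { child in
    "\(child.label ?? "?"): \(type(of: child.value))"
}

// The type name comes from core metadata (the nominal type descriptor).
let typeName = String(describing: type(of: account))
```

Under the proposal, only the first query would depend on Account being Reflectable.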

What stdlib APIs (and perhaps Foundation APIs, as part of corelibs) will behave differently on non-Reflectable types? In particular, if a type is used with NSCoding, even as a generic parameter, it must be findable by name later, which would be a migration hazard going from Swift 5 to Swift 6 that could result in the loss of user data.

We decided not to include stdlib changes in the proposal, because all current APIs that consume reflection are kinda for debug purposes and developers shouldn't rely on the output of those APIs. (They might be proposed separately, though.)
Foundation, as far as I am aware, doesn't know about Swift's reflection and won't be affected.

What happens when I compile in release mode with full debug info? Does my debugging experience suffer? (more than it does today)

By full debug info, did you mean an argument from the -g family?
I didn't consider that case, and probably it makes sense to keep reflection emission enabled for at least ASTTypes and DwarfTypes debug options.

I also think not supporting the Reflectable casts on older OSs makes this tricky to adopt on the consumer side, but maybe it's okay because you're not proposing to add bounds to existing stdlib APIs, which will have some sensible fallback (like "no children") for non-Reflectable types on new and old OSs. Still, I know our dynamic cast system is hookable, and it may be that on older OSs you can use something like "has no name" as a proxy for non-Reflectable.

We considered backporting the Reflectable casts, but we wouldn't want to introduce a compatibility library only for that case. I also don't think this feature will be critical since not many libraries consume reflection nowadays.

A thought I've just had now: what reflection metadata is generated for imported types? Do we have any hope of controlling that?

No, that topic was raised already in the thread and it doesn't seem reasonable to generate reflection for imported types.
I'll mention explicitly in the proposal that conformance to Reflectable is allowed only at the type declaration level, not at the extension level.

Thanks for clarifying! The motivation to remove type names is for secrecy reasons, so that someone can’t search the binary for e.g. “PasswordValidationState” to find out how that state is represented, or “ShinyNewFeatureConfig” to confirm that an app developer is working on a new feature. Maybe that’s just more “walls and ladders” obfuscation, but it still seems relevant.

It sounds like this proposal is planning to change the behavior for field metadata only. I don’t know if we’ll get other types of reflection in the future (invoking methods, listing computed properties along with stored ones like ObjC does, etc), but I think calling the capability (and the protocol) Reflectable makes sense. The proposal, however, would feel a lot more approachable if it were in terms of field metadata, with other kinds of reflection mentioned in Future Directions.

So what other reflection do we have today?

  • Custom mirrors. This is opt-in already.
  • Getting the name of a type. This is currently supported by all types, and it sounds like that won’t change for the time being.
  • Looking up a type by name. Ditto.
  • Getting the name of an enum case. Will this be affected by this proposal?
  • Dynamic casts. A big enough deal that they deserve their own proposal, and it makes sense that they’re not covered by Reflectable.
  • probably anything else that’s in a custom section, if we haven’t hit them all already
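
A few of the items above, shown with today's standard library so it's clear what each one looks like from user code. The type names are illustrative; looking up a type by name is left out because it needs Foundation or runtime-internal entry points.

```swift
protocol Shape {}

struct Circle: Shape {
    let radius: Double
}

enum Direction {
    case north, south
}

// Custom mirrors: opt-in through CustomReflectable.
struct Token: CustomReflectable {
    let secret: String
    var customMirror: Mirror {
        Mirror(self, children: ["redacted": true])
    }
}

// Getting the name of a type.
let typeName = String(describing: Circle.self)

// Getting the name of an enum case.
let caseName = String(describing: Direction.south)

// Dynamic cast to a protocol type, resolved through conformance metadata.
let anyValue: Any = Circle(radius: 1.5)
let isShape = anyValue is Shape

// The custom mirror replaces the default field listing.
let tokenLabels = Mirror(reflecting: Token(secret: "s")).children.compactMap { $0.label }
```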

Why is this list important? For any sort of resource-constrained environments where we’d like to have no metadata at all if it’s never used, but where it’s really hard to prove that. (For instance, if there are no dynamic casts to a protocol type, then the conformance metadata for that protocol only needs to be present if it’s actually used, after optimizations.)

I’m not saying we need a switch for every single thing, and of course I hope our optimization continues to improve. But this is why I’m pressing on this: the current API surface of Mirror does not represent everything the runtime does with “metadata” that could be considered “reflection”.

The motivation to remove type names is for secrecy reasons, so that someone can’t search the binary for e.g. “PasswordValidationState” to find out how that state is represented, or “ShinyNewFeatureConfig” to confirm that an app developer is working on a new feature. Maybe that’s just more “walls and ladders” obfuscation, but it still seems relevant.

Secrecy isn't our primary goal, even though it might be improved. Hiding or removing strings from the binary might be pretty challenging, since many features depend on them, and it would require changes in Core metadata, which isn't part of this proposal.

It sounds like this proposal is planning to change the behavior for field metadata only. I don’t know if we’ll get other types of reflection in the future (invoking methods, listing computed properties along with stored ones like ObjC does, etc), but I think calling the capability (and the protocol) Reflectable makes sense. The proposal, however, would feel a lot more approachable if it were in terms of field metadata, with other kinds of reflection mentioned in Future Directions.

This is a good point: the proposal currently affects only field metadata. I will emphasize that in the document and add to the "Future Directions" section that any reflection metadata added in the future might also be covered by the proposal.

Why is this list important? For any sort of resource-constrained environments where we’d like to have no metadata at all if it’s never used, but where it’s really hard to prove that. (For instance, if there are no dynamic casts to a protocol type, then the conformance metadata for that protocol only needs to be present if it’s actually used, after optimizations.)

I’m not saying we need a switch for every single thing, and of course I hope our optimization continues to improve. But this is why I’m pressing on this: the current API surface of Mirror does not represent everything the runtime does with “metadata” that could be considered “reflection”.

All emitted metadata might be considered Reflection, but we try to distinguish between required Core metadata and optional Reflection metadata. Some of the metadata you mentioned in the list we consider Core metadata, and to handle it the compiler needs a different approach than simply not emitting it. There are a bunch of optimizations that improve the dead-strippability of such metadata when it is provably unused.
To limit the scope of this proposal, we concentrated only on Reflection metadata.

Realm Swift currently does the following thing: we call objc_copyClassList(), filter the classes to ones inheriting from a base class we define, then use Mirror(reflecting:) to read the property names and types for each of those subclasses.
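
The Mirror half of that can be sketched as below. This is not Realm's actual code: ModelBase, Person and schema(of:) are hypothetical names, the base class is plain Swift rather than obj-c, and the objc_copyClassList() step is omitted because it is only available on Apple platforms.

```swift
// Stand-in for the library-defined base class.
class ModelBase {}

// A user-defined model subclass.
final class Person: ModelBase {
    var name: String = ""
    var age: Int = 0
}

// Collects "name: Type" pairs for the stored properties declared
// on the instance's own class, using field metadata via Mirror.
func schema(of object: ModelBase) -> [String] {
    Mirror(reflecting: object).children.compactMap { child -> String? in
        guard let label = child.label else { return nil }
        return "\(label): \(type(of: child.value))"
    }
}

let personSchema = schema(of: Person())
```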

It appears that with this proposal what we're doing would almost still work with no changes for our users, and would no longer require that users build their app with full reflection metadata enabled. The problem is that while subclasses inherit reflectability, it sounds like we won't be able to mark our base class as Reflectable, because it is defined in obj-c (and has to be, to work around FB7201126) and thus we can only declare protocol conformances in extensions.

We could require users to explicitly mark each of their subclasses as Reflectable, but that's clunky and error prone, especially if we can't check at runtime for that specifically due to casts not being backdeployable (which otherwise doesn't sound like a problem for us).

I don't see an easy workaround for your use case. Would you be able to expose a Swift class that would inherit from your ObjC base class and conform to Reflectable at the same time?

(Since RealmSwift is consumed by Swift code, it should be fine, but probably would break source compatibility)
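
Something along these lines, sketched with an ordinary marker protocol. ReflectableSketch, LegacyBase and ReflectableModel are hypothetical names, and LegacyBase is a plain Swift class standing in for the ObjC-defined base.

```swift
// Hypothetical marker protocol standing in for the proposed Reflectable.
protocol ReflectableSketch {}

// Stand-in for the base class defined in ObjC.
class LegacyBase {}

// Swift shim that inherits from the base and declares the conformance
// at the type declaration, not in an extension.
class ReflectableModel: LegacyBase, ReflectableSketch {}

// User models inherit from the shim and pick up the conformance.
final class Person: ReflectableModel {
    var name = ""
}

// The cast goes through Any so the check happens at runtime.
let inherited = (Person() as Any) is ReflectableSketch
```

Users would have to change their superclass to the shim, which is where the source break comes from.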

The base class used to be defined in Swift, but because of FB7201126 we need it to not be a resilient type even when our Swift library is compiled in evolution mode. Defining it in obj-c instead was the only way I could find to do that.

Can you attach a link to that bug? (I assumed that it was an openradar issue, but it wasn't)