StoredPropertyIterable

Torust · April 27, 2019, 11:11pm

One question I have: does using runtime metadata limit potential compiler optimisations? For example, in the SOA<T> use-case mentioned before, the implementation may need to iterate through all key-paths until it finds one that matches a query key-path (calculating an offset in the process). That should ideally be constant-folded for efficient access; would that still be possible when using a runtime metadata based solution?
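A minimal sketch of the lookup described above, to make the cost concrete. `Particle`, `particleKeyPaths`, and `storedOffset(of:)` are hypothetical names introduced for illustration, and the hand-written key path list is only a stand-in for whatever a schema API would supply; `MemoryLayout.offset(of:)` is the existing standard library call.

```swift
// Hypothetical element type for an SOA-style container.
struct Particle {
    var x: Float
    var y: Float
    var mass: Double
}

// Hand-written stand-in for a schema; a real implementation would
// obtain this list from derived conformances or runtime metadata.
let particleKeyPaths: [PartialKeyPath<Particle>] = [
    \Particle.x, \Particle.y, \Particle.mass,
]

/// Linearly scans the schema for `query` and returns the byte offset
/// of the matching stored property, or `nil` if it is not listed.
func storedOffset(of query: PartialKeyPath<Particle>) -> Int? {
    for keyPath in particleKeyPaths where keyPath == query {
        return MemoryLayout<Particle>.offset(of: keyPath)
    }
    return nil
}

let yOffset = storedOffset(of: \Particle.y)
let massOffset = storedOffset(of: \Particle.mass)
print(yOffset as Any, massOffset as Any)
```

Whether the optimizer can fold such a scan into a constant is exactly the open question; the sketch only shows the shape of the work involved.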

dan-zheng · April 28, 2019, 2:41am

If T is known at compile time, optimization should be possible:

clayellis · April 28, 2019, 4:47pm

I know this isn’t substantial to functionality, but could recursivelyAllStoredProperties be spelled allRecursivelyStoredProperties or allRecursiveStoredProperties? The way it’s spelled currently feels odd.

rxwei · April 28, 2019, 7:00pm

One key reason we propose to introduce a protocol instead of using general reflection is to enable custom behavior, that is, users can define a custom schema for their type. The title StoredPropertyIterable itself might be a confusing name. We ended up using a single protocol named KeyPathIterable as described in this document.

In this design, we conform Array to KeyPathIterable so that custom key-path-iterable elements in arrays can also be iterated over. They are not stored properties.

extension Array: KeyPathIterable {
    public typealias AllKeyPaths = [PartialKeyPath<Array>]
    public var allKeyPaths: [PartialKeyPath<Array>] {
        return indices.map { \Array[$0] }
    }
}
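
To try the idea outside the tensorflow branch, here is a self-contained sketch. `KeyPathListing` is a hypothetical, simplified stand-in for `KeyPathIterable` (it is not the real protocol), and the subscript key paths are written in the `\Array<Element>.[index]` spelling.

```swift
// Hypothetical, simplified stand-in for `KeyPathIterable`.
protocol KeyPathListing {
    var allKeyPaths: [PartialKeyPath<Self>] { get }
}

extension Array: KeyPathListing {
    // One key path per element. These depend on the indices of the
    // current instance, so they are not stored properties.
    var allKeyPaths: [PartialKeyPath<Array>] {
        return indices.map { \Array<Element>.[$0] }
    }
}

let values = [10, 20, 30]
// Reading through a `PartialKeyPath` yields `Any`, so cast back.
let elements = values.allKeyPaths.map { values[keyPath: $0] as? Int }
print(elements)
```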

Also, we use "recursively" to modify the adjective "all" to convey the accurate meaning, so it's better for "recursively" to come before "all".

Joe_Groff · April 29, 2019, 4:33pm

To be clear, these aren't mutually exclusive. The protocol can still have a default implementation built on a shared reflection-based implementation. Conversely, reflection APIs could use the protocol to allow types to override their default reflection behavior, like Mirror does with CustomReflectable today.
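
For reference, this is roughly how the Mirror and CustomReflectable pairing behaves today. The `Plain` and `Account` types and the `childLabels(of:)` helper are hypothetical names used only for this sketch.

```swift
// Uses the default, metadata-driven reflection.
struct Plain {
    var name: String
    var token: String
}

// Opts in to customizing what reflection exposes.
struct Account: CustomReflectable {
    var name: String
    var token: String

    // Expose only `name`; `token` is hidden from reflection.
    var customMirror: Mirror {
        return Mirror(self, children: ["name": name])
    }
}

/// Collects the labels of the direct children that Mirror reports.
func childLabels(of value: Any) -> [String] {
    return Mirror(reflecting: value).children.compactMap { $0.label }
}

let defaultLabels = childLabels(of: Plain(name: "a", token: "t"))
let customLabels = childLabels(of: Account(name: "a", token: "t"))
print(defaultLabels, customLabels)
```

A key path schema protocol could follow the same pattern: a default for every type, with an opt-in override.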

Troy_Harvey · May 12, 2019, 2:50am

I'd love to see KeyPathIterable make its way upstream! Thanks to Richard and the tensorflow team for their work. We are using KeyPathIterable in our system, but the management of off-main-branch breaking code and toolchain incompatibilities is causing lots of hair to be pulled.

The ability to dynamically query lenses from objects is important to a wide variety of problems, and obviously ML. Without it, you have to use template meta-programming, which gets kind of viral and gross.

There are currently few high-performance compiled systems languages that also support deep introspection of member accessors. Swift is headed down an exciting path.

JohnEstropia · August 20, 2019, 3:38am

Please make this happen
ORMs and basically every modelling library (JSON etc) will benefit from this performance boost

rxwei · August 20, 2019, 4:22am

We haven't got a chance to address @Joe_Groff's comments. If anyone wants to take a stab at re-implementing KeyPathIterable using runtime metadata, please go ahead!

JohnEstropia · September 30, 2019, 5:05am

IMO, StoredPropertyIterable doesn't need to replace Mirror. Sure, it would be nice to have KeyPath utilities in Mirror, but I think that can be a separate feature from the original proposal. If I understood correctly, StoredPropertyIterable will be mainly compile-time generated code, while Mirror is mostly runtime introspection.

dan-zheng · January 8, 2020, 7:04pm

Bump: apple/swift Pull Request #29042, "[stdlib] Add _forEachField(of:options:body:) function" by @nnnnnnnn (natecook1000 on GitHub), implements a _forEachField(of:options:body:) function using runtime metadata:

This function walks all the fields of a struct, class, or tuple type, and calls body with the name, offset, and type of each field. body can perform any required work or validation, returning true to continue walking fields or false to stop immediately.

Perhaps the runtime metadata support added in that PR can be adapted to replace the current implementation of KeyPathIterable.allKeyPaths using derived conformances (limited to struct types).

@nnnnnnnn: I wonder if apple/swift Pull Request #29042 ("[stdlib] Add _forEachField(of:options:body:) function") was added with a use case in mind?

dan-zheng · January 15, 2020, 4:01pm

Follow-up: TF-1102 tracks reimplementing allKeyPaths using runtime metadata instead of derived conformances. @shabalind plans to look into it soon.

With a runtime metadata implementation, allKeyPaths should become available for values of any type (including Any), not just KeyPathIterable-conforming types. Thus, KeyPathIterable should be deleted, and we need an API that works with Any.

This means we're back to the API design drawing board! Let's chat about what a new API based on runtime metadata could look like. I'll reply with some ideas below.

dan-zheng · January 15, 2020, 4:03pm

Regarding static key path schemas (StoredPropertyIterable), @Douglas_Gregor's reply above suggests defining APIs on MemoryLayout:

Douglas_Gregor:

extension MemoryLayout {
  static var storedProperties: _StoredPropertiesCollection<T>?
}
My goal here would be that the Element type had the property name, key path, and other metadata flags (e.g., whether it is indirect or weak, currently described by FieldType in the C++ part of the runtime: https://github.com/apple/swift/blob/master/include/swift/ABI/MetadataValues.h#L918).

static var storedProperties can only provide a static schema: key paths to the stored properties that are known from the type alone. However, note that instance-based key path schemas are necessary to support key paths to array elements and dictionary values:

Array.allKeyPaths: [WritableKeyPath<Self, Element>]: key paths to elements.
Dictionary.allKeyPaths: [WritableKeyPath<Self, Value>]: key paths to values.
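
A small sketch of why the dictionary case needs an instance: the keys are data, not part of the type, so the key paths can only be built from a value. The `scores` dictionary is a made-up example, and the force-unwrap component (`!`) assumes every key path is used only while its key is still present.

```swift
var scores: [String: Float] = ["a": 1, "b": 2]

// Build one writable key path per value from the current keys.
// The `!` component turns `Float?` into `Float`.
let valueKeyPaths: [WritableKeyPath<[String: Float], Float>] =
    scores.keys.map { \Dictionary<String, Float>.[$0]! }

// Mutate every value in place through its key path.
for keyPath in valueKeyPaths {
    scores[keyPath: keyPath] += 1
}
print(scores)
```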

We could expose both static and instance-based key path schema APIs, which was the original intention of the pitch (StoredPropertyIterable vs CustomKeyPathIterable) and was suggested in some replies (like this one from @Joe_Groff). I'd like to focus on what an instance-based key path schema API could look like, since that's the more general API.

Regarding instance-based key path schemas, we could try something like:

/// A type that explicitly defines its own key path schema.
// Note: this is similar to `CustomReflectable`.
// Conforming types include `Array` and `Dictionary`.
protocol CustomKeyPathSchema {
  /// A collection of all custom key paths of this value.
  var allKeyPaths: [PartialKeyPath<Self>] { get }
}

// Extending `Any` with `var allKeyPaths: [PartialKeyPath<Self>]` is not
// possible in Swift code. Instead, we can create an API that takes `Any`
// as an argument and provides `var allKeyPaths: [PartialKeyPath<Self>]`.

struct KeyPathSchema<T> {
  var value: T
  init(_ value: T) {
    self.value = value
  }

  var allKeyPaths: [PartialKeyPath<T>] {
    // Note: we need a `_CustomKeyPathSchema` implementation detail
    // similar to `_KeyPathIterable` to work around PAT limitations.
    if let customSchemaValue = value as? _CustomKeyPathSchema {
      return customSchemaValue._allKeyPathsTypeErased.compactMap { kp in
        kp as? PartialKeyPath<T>
      }
    }
    // Fallback: use runtime metadata to get all key paths to:
    // - Structs and classes: stored properties.
    // - Enums: associated values of the current enum case.
    // - Tuples: elements.
    ...
  }

  // Include existing `KeyPathIterable` default implementation utilities:
  // https://github.com/apple/swift/blob/tensorflow/stdlib/public/core/KeyPathIterable.swift

  /// An array of all custom key paths of this value and any custom key paths
  /// nested within each of what this value's key paths refers to.
  var recursivelyAllKeyPaths: [PartialKeyPath<T>] { ... }

  /// Returns an array of all custom key paths of this value, to the specified
  /// type.
  // `U` avoids shadowing the struct's own generic parameter `T`.
  func allKeyPaths<U>(to _: U.Type) -> [KeyPath<T, U>] {
    return allKeyPaths.compactMap { $0 as? KeyPath<T, U> }
  }

  ...
}

// Usage:
struct Wrapper<T> {
  var item: T
  var array: [T]
}
var x = Wrapper<Float>(item: 0, array: [1, 2, 3]) // `var`: mutated below
for kp in KeyPathSchema(x).recursivelyAllWritableKeyPaths(to: Float.self) {
  x[keyPath: kp] += 1
}
print(x) // Wrapper<Float>(item: 1, array: [2, 3, 4])
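
The type-filtering utility can be tried in isolation today. The following is a minimal sketch, assuming a hand-written key path list in place of a metadata-derived `allKeyPaths`; `Model` and `writableKeyPaths(in:to:)` are hypothetical names, not part of any existing API.

```swift
struct Model {
    var weight: Float
    var bias: Float
    var name: String
}

// Hand-written stand-in for `KeyPathSchema(x).allKeyPaths`.
let allKeyPaths: [PartialKeyPath<Model>] = [
    \Model.weight, \Model.bias, \Model.name,
]

/// Keeps only the writable key paths whose value type is `Value`.
func writableKeyPaths<Root, Value>(
    in keyPaths: [PartialKeyPath<Root>], to _: Value.Type
) -> [WritableKeyPath<Root, Value>] {
    return keyPaths.compactMap { $0 as? WritableKeyPath<Root, Value> }
}

var model = Model(weight: 1, bias: 2, name: "layer")
// Only the two `Float` properties survive the filter.
for keyPath in writableKeyPaths(in: allKeyPaths, to: Float.self) {
    model[keyPath: keyPath] += 1
}
print(model.weight, model.bias, model.name)
```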

Any thoughts?

I think some "key path view" wrapper abstraction is more natural than adding top-level functions like _forEachField to the global namespace.

Joe_Groff · January 15, 2020, 6:12pm

I agree, the way Mirror works would be a good model to follow: provide an API that works for every type using runtime metadata, but which allows for customization if types opt in to implementing Custom* protocols. Ideally, these key path iteration APIs could supersede Mirror entirely, since they ought to be strictly more expressive in what they enable; this design solves many of the inherent problems with Mirror, such as the lack of mutation support and the inability to get the static schema of a type independent of an instance. There's also an opportunity for these APIs to eventually be made compiler-evaluable, since the compiler also knows the layouts of types, which could allow for "constexpr" code to process the schemas of types at compile time.

Doug's idea of adding storedProperties to MemoryLayout is an interesting approach to namespacing this functionality, but I'm not sure MemoryLayout is the best place to put this. To me, the existing memory layout APIs seem like fairly low-level memory management concerns, whereas the schema of a type seems like a generally useful API. Maybe we could make a new Reflection namespacey type to contain these APIs instead of MemoryLayout.

mattpolzin · January 15, 2020, 6:26pm

This is a very exciting prospect as someone who wrangles with Mirror quite often because it is both very powerful and simultaneously frustratingly limited.

Makes sense to me to separate concerns here. I like Reflection and would not vote to get closer in name to MemoryLayout than perhaps TypeLayout.

dan-zheng · January 15, 2020, 6:32pm

A Reflection namespace sounds nice to me. On the topic of where to define var allKeyPaths: [PartialKeyPath<T>] (and related utilities): we could have Reflection.KeyPathSchema(x).allKeyPaths (adapting the example above).

Joe_Groff · January 15, 2020, 6:37pm

Reflection.allKeyPaths(for: x) could be another possibility. I don't have a strong opinion as to the exact name.

So one thing Mirror does today, that a flat collection wouldn't naively be able to do, is that it gives you both keyed and indexed access to the schema. For dictionaries, structs, and classes, the mirror lets you ask for the elements by their names in addition to iterating them through their indices. We might want something like that for these key path iteration APIs as well; in addition to allKeyPaths: [PartialKeyPath<T>], maybe you could also have allNamedKeyPaths: [String: PartialKeyPath<T>].
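
For comparison, this is the keyed and indexed access that Mirror offers today; `Point` is a hypothetical example type.

```swift
struct Point {
    var x: Int
    var y: Int
}

let mirror = Mirror(reflecting: Point(x: 3, y: 4))

// Indexed access: walk the children in declaration order.
let labels = mirror.children.map { $0.label ?? "<unlabeled>" }

// Keyed and positional lookup through `descendant`.
let yByName = mirror.descendant("y") as? Int
let xByIndex = mirror.descendant(0) as? Int
print(labels, yByName as Any, xByIndex as Any)
```

Both lookups return values typed as `Any` and are read-only, which is where a key-path-based equivalent would add something.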

dan-zheng · January 15, 2020, 6:39pm

Enumerating key paths along with name information would be huge! Could you please share some pointers on how to implement this (using runtime metadata)?

Joe_Groff · January 15, 2020, 6:41pm

In the fields metadata, the names and types of the fields ought to be right next to each other, so however you iterate through the field metadata to gather the offsets for key paths, you should also be able to collect the corresponding names.

DevAndArtist · January 15, 2020, 6:54pm

Would be cool to have the ability to gather not only stored property key-paths but also their superset, including computed properties.

Reflection.properties(for: x)
Reflection.storedProperties(for: x) // returns Reflection.Property 

extension Reflection {
  public struct Property: Equatable {
    public let label: String
    public let value: Any
    public let keyPath: PartialKeyPath<Value>
    ...
  }
}

All this kinda screams for some good Reflection API.

dan-zheng · January 15, 2020, 7:02pm

Thanks for the info! It sounds like supporting var allNamedKeyPaths: [String: PartialKeyPath<T>] shouldn't be too much additional work after reimplementing var allKeyPaths: [PartialKeyPath<T>] using runtime metadata.

Minor: I suspect var allNamedKeyPaths: KeyValuePairs<String, PartialKeyPath<T>> is more desirable to preserve ordering of properties/elements.

I'd also like to reiterate that "getting property/element name from key path" is probably the most hotly-requested extension to KeyPathIterable! The common impression seems to be that "getting string names from key paths is generally impossible" ([1], [2]), so jointly iterating over names and key paths is a clever workaround. This feature would've personally saved me a few hours of debugging an incorrect key path: the error would've been obvious if I could just print the key path name.
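
A rough sketch of what joint iteration could give us, assuming a hand-written `KeyValuePairs` in place of a metadata-derived `allNamedKeyPaths`. `Point`, `namedKeyPaths`, and `name(of:)` are hypothetical names for illustration only.

```swift
struct Point {
    var x: Int
    var y: Int
}

// Hand-written stand-in for a metadata-derived `allNamedKeyPaths`.
// `KeyValuePairs` keeps the pairs in the order they are written.
let namedKeyPaths: KeyValuePairs<String, PartialKeyPath<Point>> = [
    "x": \Point.x,
    "y": \Point.y,
]

/// Looks up the name paired with `keyPath`, if it is in the schema.
func name(of keyPath: PartialKeyPath<Point>) -> String? {
    return namedKeyPaths.first(where: { $0.value == keyPath })?.key
}

let orderedNames = namedKeyPaths.map { $0.key }
let yName = name(of: \Point.y)
print(orderedNames, yName as Any)
```

The lookup is a linear scan over the pairs, which seems acceptable for debugging output.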