StoredPropertyIterable

rxwei · January 3, 2019, 2:07am

Hi all,

@dan-zheng and I have been experimenting with a protocol that helps you iterate over all stored properties at runtime.

Here's the idea:

protocol StoredPropertyIterable {
    associatedtype AllStoredProperties : Collection
        where AllStoredProperties.Element == PartialKeyPath<Self>
    static var allStoredProperties: AllStoredProperties { get }
}

By conforming to StoredPropertyIterable, you get a type property that represents a collection of key paths to all stored properties defined in the conforming type. The conformance is compiler-derived.

struct Foo : StoredPropertyIterable {
    var x: Int
    var y: String

    // Compiler-synthesized:
    static var allStoredProperties: [PartialKeyPath<Foo>] {
        return [\.x, \.y]
    }
}
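For anyone who wants to try the shape of this today: no shipping compiler derives the conformance, so the sketch below writes it by hand and then reads every property through its key path. The instance `foo` and the `values` array are illustrative names only, not part of the pitch.

```swift
protocol StoredPropertyIterable {
    associatedtype AllStoredProperties: Collection
        where AllStoredProperties.Element == PartialKeyPath<Self>
    static var allStoredProperties: AllStoredProperties { get }
}

struct Foo: StoredPropertyIterable {
    var x: Int
    var y: String

    // Hand-written stand-in for what the pitch proposes the compiler derive.
    static var allStoredProperties: [PartialKeyPath<Foo>] {
        return [\Foo.x, \Foo.y]
    }
}

let foo = Foo(x: 1, y: "hello")

// Subscripting with a PartialKeyPath yields `Any`, so values come back type-erased.
let values: [Any] = Foo.allStoredProperties.map { foo[keyPath: $0] }
for (keyPath, value) in zip(Foo.allStoredProperties, values) {
    print(keyPath, "=", value)
}
```

Because the element type is `PartialKeyPath<Foo>`, callers that need a typed value have to cast the result or downcast the key path itself.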

A protocol extension can provide a recursivelyAllStoredProperties computed property that returns an array of key paths to stored properties of this type and key paths to any nested stored properties whose parent also conforms to StoredPropertyIterable. This can also be a lazy collection based on allStoredProperties, of course.

extension StoredPropertyIterable {
    // Signature only; the body would be derived from allStoredProperties.
    static var recursivelyAllStoredProperties: [PartialKeyPath<Self>]
}

struct Bar : StoredPropertyIterable {
    var foo: Foo 
    var z: Int
    var w: String
}
Bar.allStoredProperties
// => [\Bar.foo, \Bar.z, \Bar.w]
Bar.recursivelyAllStoredProperties
// => [\Bar.foo, \Bar.foo.x, \Bar.foo.y, \Bar.z, \Bar.w]
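One possible way to write the recursive property by hand, as a rough sketch rather than the implementation in the gist: it assumes a simplified protocol that returns a plain array, plus a hypothetical Self-free base protocol (`AnyStoredPropertyIterable`, with `erasedRecursiveProperties`) so that a metatype discovered at runtime can be asked for its key paths.

```swift
// Hypothetical Self-free base protocol, so that a metatype found at runtime
// can be asked for its key paths without knowing Self statically.
protocol AnyStoredPropertyIterable {
    static var erasedRecursiveProperties: [AnyKeyPath] { get }
}

// Simplified: a plain array instead of an associated collection type.
protocol StoredPropertyIterable: AnyStoredPropertyIterable {
    static var allStoredProperties: [PartialKeyPath<Self>] { get }
}

extension StoredPropertyIterable {
    static var recursivelyAllStoredProperties: [PartialKeyPath<Self>] {
        var result: [PartialKeyPath<Self>] = []
        for keyPath in allStoredProperties {
            result.append(keyPath)
            // The dynamic key path class knows the type of the property it points at.
            guard let nested = type(of: keyPath).valueType as? AnyStoredPropertyIterable.Type else {
                continue
            }
            for child in nested.erasedRecursiveProperties {
                // Appending returns nil only if the root of `child` does not
                // match the value type of `keyPath`.
                let joined: PartialKeyPath<Self>? = keyPath.appending(path: child)
                if let joined = joined { result.append(joined) }
            }
        }
        return result
    }

    static var erasedRecursiveProperties: [AnyKeyPath] {
        return recursivelyAllStoredProperties
    }
}

struct Foo: StoredPropertyIterable {
    var x: Int
    var y: String
    static var allStoredProperties: [PartialKeyPath<Foo>] { return [\Foo.x, \Foo.y] }
}

struct Bar: StoredPropertyIterable {
    var foo: Foo
    var z: Int
    var w: String
    static var allStoredProperties: [PartialKeyPath<Bar>] { return [\Bar.foo, \Bar.z, \Bar.w] }
}

let bar = Bar(foo: Foo(x: 1, y: "a"), z: 2, w: "b")
let paths = Bar.recursivelyAllStoredProperties
let recursiveValues = paths.map { bar[keyPath: $0] }
```

The traversal is depth-first, so the nested paths follow their parent, in the same order as the listing above.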

Why do we believe StoredPropertyIterable should be added to the standard library? It provides a canonical API for accessing all stored properties of a struct, and can define away the existing compiler synthesis for derived Hashable conformances. Here's @Joe_Groff's earlier idea: keypath-hashable.swift · GitHub.

Advanced use cases arise in many fields; the one I'm most familiar with is machine learning. ML optimizers operate on bags of parameters (sometimes defined as stored properties, sometimes as collections) and apply an update algorithm to every parameter. Since this use case is more complex than stored properties and requires key paths to nested parameters, I won't go into details for now. If you are interested, you can look at CustomKeyPathIterable in the gist below.

The following gist demonstrates a simple StoredPropertyIterable and a more advanced protocol called CustomKeyPathIterable that lets you define key paths to non-compile-time elements. Comments are welcome!

https://gist.github.com/rxwei/e316f7114b723ad69a30a3aba224ccb3

property-key-paths.swift

//============================================================================//
// Part 1. StoredPropertyIterable
// This models the purely static layout of a struct.
//============================================================================//

// This is an implementation detail that is required before PAT existentials are
// possible.
protocol _StoredPropertyIterableBase {
  static var _allStoredPropertiesTypeErased: [AnyKeyPath] { get }
  static var _recursivelyAllStoredPropertiesTypeErased: [AnyKeyPath] { get }

This file has been truncated.

Lantua · January 3, 2019, 2:35am

I wonder if it would be better to incorporate this into Mirror, especially since it mostly concerns stored properties. Their capabilities seem eerily similar to me.

rxwei · January 3, 2019, 2:40am

That could be interesting! We intended to align our pitch with CaseIterable because we feel it's more approachable than reflection APIs.

Joe_Groff · January 3, 2019, 2:53am

Having a way to get a collection of key paths for a type makes a lot of sense. I think it makes sense not to make this about stored properties per se, but about the set of properties that make up the logical "schema" of the type; the set of stored properties makes sense as the default schema for a struct, but it would be useful to be able to override that default with a set of computed properties or subscripts that cover the type.

For collections, there's also the issue that the schema is value-dependent rather than uniform for all instances of a type, so you might want to have separate static and instance properties in the protocol to model this:

protocol KeyPathSchema {
  static var typeSchema: [PartialKeyPath<Self>] { get }
  var valueSchema: [PartialKeyPath<Self>] { get }
}

let x = [1, 2, 3]
type(of: x).typeSchema // []
x.valueSchema // [\.[0], \.[1], \.[2]]
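As a rough illustration of the static/instance split, here is a sketch that can be compiled today. The conformances are hand-written, and the `Point` type is a made-up example of a fixed-layout struct whose value schema simply equals its type schema.

```swift
protocol KeyPathSchema {
    static var typeSchema: [PartialKeyPath<Self>] { get }
    var valueSchema: [PartialKeyPath<Self>] { get }
}

extension Array: KeyPathSchema {
    // Nothing about an array's elements is known from the type alone.
    static var typeSchema: [PartialKeyPath<[Element]>] { return [] }

    // One subscript key path per index of this particular value.
    var valueSchema: [PartialKeyPath<[Element]>] {
        return indices.map { index -> PartialKeyPath<[Element]> in
            \[Element].[index]
        }
    }
}

// Hypothetical fixed-layout type: the schema is the same for every instance.
struct Point: KeyPathSchema {
    var x: Double
    var y: Double
    static var typeSchema: [PartialKeyPath<Point>] { return [\Point.x, \Point.y] }
    var valueSchema: [PartialKeyPath<Point>] { return Point.typeSchema }
}

let xs = [10, 20, 30]
let elements = xs.valueSchema.map { xs[keyPath: $0] }
```

A default implementation of `valueSchema` that returns `Self.typeSchema` would remove the boilerplate for structs; it is spelled out here only to keep the example explicit.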

I could see this all being particularly useful with compile-time evaluation, since you could use the statically-known layout of types to generate per-field logic based on those layouts. However, with the limited closed hierarchy of key paths that exist now, it isn't ideal to use key paths as a seed for generating things like Hashable conformance, since there's no way to state the requirement that all the fields in a schema must themselves be Hashable for the default implementation to be viable. Eventually, if we had "protocol-oriented" keypaths and generalized existentials, it'd be nice to be able to express this as conditional constraints on the key path collection:

protocol KeyPathSchema {
  associatedtype Schema: Collection where Schema.Element: KeyPath
}

extension Hashable where Self: KeyPathSchema, Self.Schema.Element.Value: Hashable {
  ...
}
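To make the limitation concrete, here is a sketch of what is expressible today. Without a way to constrain the fields statically, a generic helper has to fall back on a runtime cast to `AnyHashable`. The helper name `combineFields` and the `schema` property are assumptions for illustration, not proposed API.

```swift
// Hypothetical helper: hashes every field reachable through `schema`.
// Fields whose dynamic type is not Hashable are silently skipped, which is
// the gap that a static constraint on the key path collection would close.
func combineFields<Root>(of value: Root,
                         schema: [PartialKeyPath<Root>],
                         into hasher: inout Hasher) {
    for keyPath in schema {
        if let field = value[keyPath: keyPath] as? AnyHashable {
            hasher.combine(field)
        }
    }
}

struct Point: Hashable {
    var x: Int
    var y: Int

    // Hand-written schema; a synthesized one is what the thread is discussing.
    static var schema: [PartialKeyPath<Point>] { return [\Point.x, \Point.y] }

    func hash(into hasher: inout Hasher) {
        combineFields(of: self, schema: Point.schema, into: &hasher)
    }
}

let unique = Set([Point(x: 1, y: 2), Point(x: 1, y: 2), Point(x: 3, y: 4)])
```

The cast succeeds here because both fields are `Int`, but nothing stops a non-Hashable field from being added later, and the compiler would not complain.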

rxwei · January 3, 2019, 6:59am

I agree, though I feel the name "schema" is a little unintuitive. Any other naming suggestions?

I like the idea of having both a type property and an instance property. The functionality is like a combination of StoredPropertyIterable and CustomKeyPathIterable in that gist. I definitely think having a single protocol is better if the protocol is defined around the concept of key paths instead of properties.

Great point!

GalCohen · January 4, 2019, 1:37am

Neat! This is exactly what I wished existed the other day. Currently solving the problem using Mirror, but this would be much nicer and more performant for my use case.

CTMacUser · January 5, 2019, 1:25am

rxwei:

struct Bar : StoredPropertyIterable {
    var foo: Foo 
    var z: Int
    var w: String
}
Bar.allStoredProperties
// => [\Bar.foo, \Bar.z, \Bar.w]
Bar.recursivelyAllStoredProperties
// => [\Bar.foo, \Bar.foo.x, \Bar.foo.y, \Bar.z, \Bar.w]

I wonder if the recursive property list should include both the direct properties and second-level properties. If you walk the list, each second-level property could get touched twice (once directly and once as part of whatever function you apply on its direct container). It gets even worse once deeper levels get involved. Maybe we need a third property list, for all the direct and indirect properties that can't be broken down any further.
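A sketch of what such a third list might look like, written by hand under assumptions: the names `leafStoredProperties`, `erasedLeafProperties`, and `AnyStoredPropertyIterable` are made up for this example, and the protocol is simplified to return a plain array.

```swift
// Hypothetical Self-free base protocol used for the runtime conformance check.
protocol AnyStoredPropertyIterable {
    static var erasedLeafProperties: [AnyKeyPath] { get }
}

protocol StoredPropertyIterable: AnyStoredPropertyIterable {
    static var allStoredProperties: [PartialKeyPath<Self>] { get }
}

extension StoredPropertyIterable {
    // Only key paths whose value type is not itself StoredPropertyIterable.
    static var leafStoredProperties: [PartialKeyPath<Self>] {
        var leaves: [PartialKeyPath<Self>] = []
        for keyPath in allStoredProperties {
            guard let nested = type(of: keyPath).valueType as? AnyStoredPropertyIterable.Type else {
                leaves.append(keyPath)  // cannot be broken down any further
                continue
            }
            // The container itself is skipped; only its leaves are kept.
            for child in nested.erasedLeafProperties {
                let joined: PartialKeyPath<Self>? = keyPath.appending(path: child)
                if let joined = joined { leaves.append(joined) }
            }
        }
        return leaves
    }

    static var erasedLeafProperties: [AnyKeyPath] { return leafStoredProperties }
}

struct Foo: StoredPropertyIterable {
    var x: Int
    var y: String
    static var allStoredProperties: [PartialKeyPath<Foo>] { return [\Foo.x, \Foo.y] }
}

struct Bar: StoredPropertyIterable {
    var foo: Foo
    var z: Int
    var w: String
    static var allStoredProperties: [PartialKeyPath<Bar>] { return [\Bar.foo, \Bar.z, \Bar.w] }
}

let bar = Bar(foo: Foo(x: 1, y: "a"), z: 2, w: "b")
let leafValues = Bar.leafStoredProperties.map { bar[keyPath: $0] }
```

With this list each value is visited exactly once, at the cost of never seeing `\Bar.foo` as a whole.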

mpangburn · January 5, 2019, 3:25am

Related idea—wouldn't need to be tied to this pitch, but just thought I'd throw it out for future consideration.

It'd be cool to have a compiler-synthesized failable initializer that takes a dictionary of partial keypaths to property values:

struct Person: StoredPropertyInitializable {
    var firstName: String
    var age: Int

    // compiler-synthesized:
    init?(propertyValues: [PartialKeyPath<Person>: Any]) {
        guard
            let firstName = propertyValues[\.firstName] as? String,
            let age = propertyValues[\.age] as? Int
        else {
            return nil
        }

        self.firstName = firstName
        self.age = age
    }
}

Such a feature would enable, for example, the creation of generic builder types.
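For instance, a builder along these lines becomes possible. This is only a sketch: the initializer is written by hand since nothing synthesizes it, and `Builder`, `set(_:to:)`, and `build()` are hypothetical names.

```swift
protocol StoredPropertyInitializable {
    init?(propertyValues: [PartialKeyPath<Self>: Any])
}

struct Person: StoredPropertyInitializable {
    var firstName: String
    var age: Int

    // Hand-written here; the idea is that the compiler would synthesize it.
    init?(propertyValues: [PartialKeyPath<Person>: Any]) {
        guard
            let firstName = propertyValues[\Person.firstName] as? String,
            let age = propertyValues[\Person.age] as? Int
        else { return nil }
        self.firstName = firstName
        self.age = age
    }
}

// Hypothetical generic builder that collects values one key path at a time.
struct Builder<T: StoredPropertyInitializable> {
    private var values: [PartialKeyPath<T>: Any] = [:]

    init() {}

    // The typed key path keeps each value's type checked at the call site.
    mutating func set<V>(_ keyPath: KeyPath<T, V>, to value: V) {
        values[keyPath] = value
    }

    func build() -> T? {
        return T(propertyValues: values)
    }
}

var builder = Builder<Person>()
builder.set(\.firstName, to: "Ada")
let incomplete = builder.build()  // age has not been set yet
builder.set(\.age, to: 36)
let person = builder.build()
```

The dictionary itself is type-erased, so the safety comes entirely from the generic `set` method; passing a hand-built dictionary to the initializer can still fail at runtime.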

rxwei · March 6, 2019, 12:17pm

@dan-zheng wrote a document on our current design and implementation of KeyPathIterable and its conformance synthesis.

KeyPathIterable is released as part of Swift for TensorFlow v0.2, and is one of the core building blocks of the Swift for TensorFlow Deep Learning Library.

Please have a read and give us feedback!

github.com

tensorflow/swift/blob/main/docs/DynamicPropertyIteration.md

# Dynamic property iteration using key paths

[Richard Wei](https://github.com/rxwei), [Dan Zheng](https://github.com/dan-zheng)

Last updated: March 2019

> #### Experimental
>
> `KeyPathIterable` is being incubated in the
> ['tensorflow' branch of apple/swift](https://github.com/apple/swift/tree/tensorflow)
> and released as part of the
> [Swift for TensorFlow toolchains](https://github.com/tensorflow/swift#getting-started),
> which you can play with. The authors will propose this feature through
> [Swift Evolution](https://forums.swift.org/c/evolution) in 2019. Updates will be posted
> on the [initial pitch thread](https://forums.swift.org/t/storedpropertyiterable/19218).

## Background and motivation

The ability to iterate over the properties of a type is a powerful reflection
technique. It enables algorithms that can abstract over types with arbitrary

This file has been truncated.

GalCohen · March 6, 2019, 1:20pm

I’m curious, because I’m interested in this feature but not for TensorFlow. How does work done on Swift for TensorFlow get merged back into the main Swift branch? Does it?

What happens if something like KeyPathIterable is rejected during the Swift Evolution process... doesn’t it mean the two repos diverge over time?

Chris_Lattner3 · March 6, 2019, 5:56pm

Yes, our intention is to merge back all language changes, but that is subject to community review and the normal swift-evolution process (which we are committed to following). If you prefer, you can think of the S4TF branch as an incubator for the work we need, but our goal is to drive the diff to zero over time.

Joe_Groff · March 6, 2019, 9:01pm

This is a good start. Some comments:

For code size, the default implementation might be best implemented using the runtime instead of by compiler codegen. We could use @_semantics to allow the SIL optimizer and constant evaluator to expand the default implementation into the list of stored properties when known at compile time.
In order for this to sufficiently deprecate Mirror, there should be a universal function that can get the key path collection from any value. KeyPathIterable could be used to customize the behavior (and as a signal that code is actively relying on this type being key-path-iterable), and the runtime could fall back to traversing metadata, similar to how Mirror works today.
One of the most common requests for Mirror is the ability to get the fixed keys from a type independent of any instance. Your design is great because it can do the right thing for collections which have dynamic sets of keys, but it'd be nice to be able to address the use case for fixed-layout types like structs too. The proposal mentions that you had explored having two separate protocols for these two purposes. Having one protocol seems to me like it could work too (but I don't have a strong opinion one way or the other).
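For comparison, this is roughly what Mirror offers today: it needs an instance, and it yields labels and type-erased values rather than key paths. `Mirror` is the existing standard library API; the helper name `storedPropertyLabels` and the `Account` type are made up for the example.

```swift
struct Account {
    var name: String
    private var token: String = "hidden"

    init(name: String) { self.name = name }
}

// Hypothetical helper: lists the child labels that Mirror reports for a value.
func storedPropertyLabels(of value: Any) -> [String] {
    return Mirror(reflecting: value).children.compactMap { $0.label }
}

// An instance is required; there is no way to ask the type alone.
let labels = storedPropertyLabels(of: Account(name: "demo"))
print(labels)
```

Mirror reports private stored properties as well, and it gives no way to write a value back, which is part of what key paths would add.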

anandabits · March 6, 2019, 9:32pm

The latest design looks really nice overall.

Additionally, conformances to KeyPathIterable for Array and Dictionary are provided in the standard library: Array.allKeyPaths returns key paths to all elements and Dictionary.allKeyPaths returns key paths to all values. These enable recursivelyAllKeyPaths to recurse through the elements/values of these collections.

If we're going to synthesize conformances for collections and support deep recursion would it make sense to have an associated type for the key path collection instead of hard-coding it as an array? That might enable a lazier approach to generating the individual key paths and avoid allocating an array when the key paths are accessed. Users could still create an array explicitly if desired.
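To sketch what I mean, under assumptions: the protocol below is a simplified stand-in for the one in the document (only `allKeyPaths`), and the choice of `LazyMapCollection` is just one option for the associated type.

```swift
// Simplified stand-in for the protocol under discussion (assumed shape).
protocol KeyPathIterable {
    associatedtype AllKeyPaths: Collection
        where AllKeyPaths.Element == PartialKeyPath<Self>
    var allKeyPaths: AllKeyPaths { get }
}

extension Array: KeyPathIterable {
    // No array of key paths is allocated up front; each key path is formed
    // when the corresponding element of the lazy collection is accessed.
    var allKeyPaths: LazyMapCollection<Range<Int>, PartialKeyPath<[Element]>> {
        return indices.lazy.map { index -> PartialKeyPath<[Element]> in
            \[Element].[index]
        }
    }
}

let xs = [10, 20, 30]
let lazyPaths = xs.allKeyPaths

// Users who want an array can still ask for one explicitly.
let eagerPaths = Array(lazyPaths)
```

The trade-off is that the concrete collection type becomes part of the conformance's interface, so changing it later would be source-breaking for code that names it.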

I also have a couple of questions about the synthesis the proposal includes. It isn't stated explicitly, but I assume the synthesis is only available when the conformance is declared in the same file as the type. Is that correct?

Secondarily, I assume that the synthesized implementation will "leak" key paths to private stored properties. This is not necessarily an issue as it is possible to write code that does this manually, I'm only asking to confirm my understanding of the design. Is this correct as well?

Finally, one future enhancement that might be interesting is to also support a synthesized conformance for enums with associated values if / when enums receive property synthesis.

rxwei · March 6, 2019, 10:24pm

The associated type is already in there, defined in the document. Were you looking for this?

    associatedtype AllKeyPaths: Collection
        where AllKeyPaths.Element == PartialKeyPath<Self>

Yes, just like other synthesized conformances in stdlib. But @Joe_Groff pointed out that defining the default implementation by accessing the runtime would be better than synthesis, so I think the same-file restriction can be lifted.

Yes. What we have is a prototype, and we haven't really thought carefully about this. Will definitely address this issue when it becomes a formal pitch/proposal.

anandabits · March 6, 2019, 10:33pm

Yes, somehow that didn't register and I was looking at the proposed conformances for Array and Dictionary, which both used Array as their AllKeyPaths type. Did you give any thought to taking advantage of the associated type to make that a lazy collection of element key paths?

Lifting that restriction would violate access control when there are private or fileprivate stored properties. Code outside the file could declare conformance and then receive a key path to one of those properties without the key path having been vended by the file declaring the property. I think you should keep the restriction. If you do that, synthesis won't do anything that couldn't be written manually at the site of the conformance declaration.

The design looks good to me modulo the question about the concrete AllKeyPaths type used by collection types.

Joe_Groff · March 6, 2019, 10:36pm

It isn't violating access control if the implementer chooses to offer up references to private things. It may or may not be the right default behavior for a compiler-synthesized implementation, though it would match what Mirror currently gives you.

anandabits · March 6, 2019, 11:08pm

I agree, that's why I don't have a problem with it as long as the conformance is declared in the same file as the type (and its stored properties). However, if the conformance is declared in a different file then the site of the conformance cannot see private properties at all so code in this location is unable to form a key path to them. IMO, the synthesized conformance should not be allowed to behave differently in this respect.

That said, I suppose it would be fine to lift the same-file restriction for types with no stored private or fileprivate properties (or internal properties if the conformance is declared in a different module).

GalCohen · March 7, 2019, 10:11am

But what happens if S4TF adds a feature to Swift that is later rejected by the Swift Proposal process?

rxwei · March 8, 2019, 1:57am

Then we would address review feedback, try to come up with something better, and pitch again.

Again, we are not trying to create a dialect (either in the language or in the standard library), and we do not want the tensorflow branch to become a dumping ground for arbitrary niche features. If there's an existing language feature that solves our problem, we will use it; if not, we will build one, make it general for all Swift users (not just for machine learning), and pitch it via Swift Evolution.

It is not controversial that stored property iteration is a commonly requested feature. It also turns out that stored property iteration is an integral part of machine learning use cases. So, we gave it a try.

GalCohen · March 8, 2019, 4:27am

Thanks! Looking forward to seeing this integrated. I have a very different use case for it.