Getting KeyPaths to members automatically using Mirror

(Porter Child) #1

I've discovered an ability of Swift but I'm not sure it's safe.

After modifying this StackOverflow answer, I've got a protocol that will give you, for any conforming object, a dictionary of [member variable name : its keyPath]. This functionality seems pretty magical to me:

protocol KeyPathListable {
    associatedtype AnyOldObject
    // Require an empty init because the implementation uses the mirroring API, which
    // must be used on an instance, so we need to be able to create a new instance of
    // the type.
    init()

    var keyPathReadableFormat: [String: Any] { get }
    var allKeyPaths: [String:KeyPath<AnyOldObject, Any?>] { get }
}

extension KeyPathListable {

    var keyPathReadableFormat: [String: Any] {
        var description: [String: Any] = [:]
        let mirror = Mirror(reflecting: self)
        for case let (label?, value) in mirror.children {
            description[label] = value
        }
        return description
    }

    var allKeyPaths: [String:KeyPath<Self, Any?>] {
        var membersTokeyPaths: [String:KeyPath<Self, Any?>] = [:]
        let instance = Self() // throwaway instance, used only to discover member names
        for (key, _) in instance.keyPathReadableFormat {
            membersTokeyPaths[key] = \Self.keyPathReadableFormat[key]
        }
        return membersTokeyPaths
    }

}

So if Tree conforms (there's some weird init() business that has to happen):

struct Tree: KeyPathListable {
    var diameter: Double
    var circumference: Double
    var barkThickness: Double
}
extension Tree{
    // Custom init inside an extension to keep the auto-generated
    // `init(diameter:circumference:barkThickness:)`
    init() { // used by allKeyPaths to create a throwaway instance
        self.diameter = 10
        self.circumference = 31.415
        self.barkThickness = 0.5
    }
}

Then I can call tree.allKeyPaths on an instance and get the magical dictionary I referred to:

var tree = Tree(diameter: 10, circumference: 31.415, barkThickness: 0.5)

for entry in tree.allKeyPaths{
    print("member: ", entry.key, " / value: ", tree[keyPath: entry.value]!)
}

//prints (dictionary order is not guaranteed)
//
//member:  barkThickness  / value:  0.5
//member:  circumference  / value:  31.415
//member:  diameter  / value:  10.0

This functionality seems super cool to our team and we want to use it in some fundamental parts of our code.
One application is navigating a graph of objects and returning KeyPaths to interesting member variables. We will then use these KeyPaths to probe the member variables while training a machine learning model that uses the graph to produce predictions.

However, the original answerer on the Stack Overflow question thought it was excessively hacky and "might contain a lot of bugs/strange behaviors".

Before we use this, I wanted to ask the experts:

  1. Is this exploiting some part of the language that is unsafe?
  2. Are we going to be sorry for using it?
  3. Am I trying too hard to make a static language act dynamic, or something like that?
(Zachary Waldowski) #2

This is fun!

You could get the protocol down to only allKeyPaths and save it from making a dictionary from all the values on every access:

protocol KeyPathListable {
    var allKeyPaths: [String: PartialKeyPath<Self>] { get }
}

extension KeyPathListable {

    private subscript(checkedMirrorDescendant key: String) -> Any {
        return Mirror(reflecting: self).descendant(key)!
    }

    var allKeyPaths: [String: PartialKeyPath<Self>] {
        var membersTokeyPaths = [String: PartialKeyPath<Self>]()
        let mirror = Mirror(reflecting: self)
        for case (let key?, _) in mirror.children {
            membersTokeyPaths[key] = \Self.[checkedMirrorDescendant: key] as PartialKeyPath
        }
        return membersTokeyPaths
    }

}

Neither version is relying on any bugs or anything, per se. But the results might have unexpected quirks, like that none of the returned key-paths will be == \Tree.diameter, which may lead to downstream weirdness if you're expecting that from Dictionary or Set of these key-paths. That would be a deal-breaker for me personally; I'm getting hives thinking about walking a junior dev through it all.
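
To make the equality quirk concrete, here is a minimal sketch. It assumes a cut-down Tree with a single property and puts the checkedMirrorDescendant subscript directly on the struct; the variable names are only for illustration.

```swift
struct Tree {
    var diameter: Double

    // Same idea as the subscript above: look the member up by name via Mirror.
    subscript(checkedMirrorDescendant key: String) -> Any {
        return Mirror(reflecting: self).descendant(key)!
    }
}

let tree = Tree(diameter: 10)

// A key path to the stored property, and a key path that goes through the subscript.
let storedPath: PartialKeyPath<Tree> = \Tree.diameter
let mirrorPath: PartialKeyPath<Tree> = \Tree.[checkedMirrorDescendant: "diameter"]

// Both read the same value from the instance...
let storedValue = tree[keyPath: storedPath] as? Double
let mirrorValue = tree[keyPath: mirrorPath] as? Double

// ...but they are different key paths, so == and hashing treat them as distinct.
let pathsAreEqual = (storedPath == mirrorPath)
let distinctPaths = Set([storedPath, mirrorPath]).count
```

So a Set or Dictionary keyed by these key paths would hold two entries for what looks like the same member.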

It's important to note for both versions that Mirror(reflecting:) isn't cheap, and both versions do it initially then on every subsequent property access. Accessing everything through a new Mirror every time is several orders of magnitude slower than simple property access.

This isn't necessarily an indictment of using reflection for your use case. The performance implications matter for an ML use case, I think, but maybe I'm talking out of my [redacted]. The cost is basically linear in nature, so you might be able to live with it?

What I would personally recommend is allKeyPaths being static and user-generated, possibly leaning on code generation (like Sourcery).

All four ways I've mentioned are graphed below. It's an imperfect test because it's just using different amounts of the same Tree struct. Varying the number of properties would be a way better test, but it got some meaningful data all the same.

For further reading, you may be interested in the video Swift’s Reflective Underpinnings. With Swift 5's stable ABI, it would be theoretically possible to get the default implementation of allKeyPaths a lot closer to the green line above. (Although I don't know if the recipe for making KeyPaths themselves is ABI, so I might again be talking out of my [redacted].) But that's even less maintainable, IMHO, than either Mirror-based approach, or at least presents a significant bus factor. :stuck_out_tongue_winking_eye:

(Joanna Carter) #3

I use a simpler method to give me a list of KeyPaths. Granted it means implementing a single static method on each derived class or struct, but it's more type safe than returning Any for the value types.

public protocol KeyPathDeclaration
{
  static var keyPaths: [String : PartialKeyPath<Self>] { get }
}  

struct Tree: KeyPathDeclaration
{
  var diameter: Double
  
  var circumference: Double
  
  var barkThickness: Double
  
  init()
  {
    diameter = 0.0
    
    circumference = 0.0
    
    barkThickness = 0.0
  }
  
  static var keyPaths: [String : PartialKeyPath<Tree>]
  {
    return ["diameter" : \Tree.diameter, "circumference" : \Tree.circumference, "barkThickness" : \Tree.barkThickness]
  }
}

This then allows the following code:

    let tree = Tree()
    
    for keyPath in Tree.keyPaths
    {
      print("member: \(keyPath.key) / value \(tree[keyPath: keyPath.value])")
    }

But, strictly, KeyPaths were designed to avoid the use of strings, which are prone to spelling errors and the like. You really shouldn't use strings to describe members of a type; that is very much the mentality of Objective-C, not Swift. You can (and should) simplify this by doing something like this:

struct Tree
{
  var diameter: Double
  
  var circumference: Double
  
  var barkThickness: Double
  
  init()
  {
    diameter = 0.0
    
    circumference = 0.0
    
    barkThickness = 0.0
  }
}
    let tree = Tree()
    
    print("diameter \(tree[keyPath: \Tree.diameter])")
    
    print("circumference \(tree[keyPath: \Tree.circumference])")
    
    print("barkThickness \(tree[keyPath: \Tree.barkThickness])")

Can I ask why you want to use this idea of string-addressed members?

(Joanna Carter) #4

Hi Zachary

Here's an "improved" version of your code that ensures that the dictionary only ever gets created once:

protocol DefaultValueProvider
{
  init()
}

protocol KeyPathListable : DefaultValueProvider
{
  static var allKeyPaths: [String : AnyKeyPath] { get }
}

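// Cache keyed by type, so each conforming type gets its own dictionary
// (a single shared dictionary would hand the first type's key paths to every other type)
fileprivate var _membersToKeyPaths: [ObjectIdentifier : [String : AnyKeyPath]] = [:]

extension KeyPathListable
{
  private subscript(checkedMirrorDescendant key: String) -> Any
  {
    return Mirror(reflecting: self).descendant(key)!
  }
  
  static var allKeyPaths: [String : AnyKeyPath]
  {
    let typeID = ObjectIdentifier(Self.self)
    
    // Reuse the dictionary if it has already been built for this type
    if let cached = _membersToKeyPaths[typeID]
    {
      return cached
    }
    
    var keyPaths = [String : AnyKeyPath]()
    
    // Mirror a default instance once, only to discover the member names
    let mirror = Mirror(reflecting: Self())
    
    for case (let key?, _) in mirror.children
    {
      keyPaths[key] = \Self.[checkedMirrorDescendant: key] as PartialKeyPath
    }
    
    _membersToKeyPaths[typeID] = keyPaths
    
    return keyPaths
  }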
}
struct Tree : KeyPathListable
{
  var diameter: Double
  
  var circumference: Double
  
  var barkThickness: Double
  
  init(diameter: Double, circumference: Double, barkThickness: Double)
  {
    self.diameter = diameter
    
    self.circumference = circumference
    
    self.barkThickness = barkThickness
  }
  
  init()
  {
    self.init(diameter: 0.0, circumference: 0.0, barkThickness: 0.0)
  }
}
    let tree = Tree()
    
    let tree2 = Tree(diameter: 1.0, circumference: 2.0, barkThickness: 3.0)
    
    for x in Tree.allKeyPaths
    {
      let key = x.key
      
      let value = x.value
      
      print("tree - member: \(key) / value: \(tree[keyPath: value])")
      
      print("tree2 - member: \(key) / value: \(tree2[keyPath: value])")
    }
(Porter Child) #5

Thanks for your feedback!

  1. In the talk you linked, it looks like Joe referred to exactly what we're doing at about the 22 minute mark. Any idea when that will surface as a native feature of Swift?

  2. Your code looks a lot cleaner, I'm trying it out but getting the

    Key path of type PartialKeyPath<Tree> cannot be applied to a base of type Tree

    error when running this:

    for entry in tree.allKeyPaths {
     print("member: ", entry.key,  "/ value: ", tree[keyPath: entry.value]!)
    }
    

    which seems like it shouldn't be happening. When I change PartialKeyPath everywhere in your code to AnyKeyPath, it works. Any idea why?

  3. "Neither version is relying on any bugs or anything, per se. But the results might have unexpected quirks, like that none of the returned key-paths will be == \Tree.diameter, which may lead to downstream weirdness if you're expecting that from Dictionary or Set of these key-paths."

    So here are you just referring to the inability to use == \Tree.diameter in an if statement, for example? Junior dev here, causing hives since obviously not long enough. :stuck_out_tongue_winking_eye:

  4. "It's important to note for both versions that Mirror(reflecting:) isn't cheap, and both versions do it initially then on every subsequent property access. Accessing everything through a new Mirror every time is several orders of magnitude slower than simple property access."

    For our application, using the KeyPaths would happen very rarely compared to how long it takes to get predictions from the object graph. So I think Mirror lethargy won't hurt us too much. Thanks for the graphs!

  5. "What I would personally recommend is allKeyPaths being static and user-generated, possibly leaning on code generation (like Sourcery)."

    That was my first implementation; it just seems like more work and more code to monitor. The ease of getting the KeyPaths automatically appeals to us for those reasons. We might end up going that way though, given the speed, predictability...
    Going to take a look at Joanna's version of this tomorrow.
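
On the PartialKeyPath question in item 2, one plausible cause (my assumption, not something confirmed in this thread) is the trailing `!`: subscripting with a PartialKeyPath gives back a plain Any, while subscripting with an AnyKeyPath gives back an Any?, so only the second has anything to unwrap. A minimal sketch of the two result types, using a hypothetical one-property Tree:

```swift
struct Tree {
    var diameter: Double = 10
}

let tree = Tree()

// The same key path, seen through two levels of type erasure.
let partial: PartialKeyPath<Tree> = \Tree.diameter
let erased: AnyKeyPath = partial

// PartialKeyPath knows the root type, so the lookup cannot fail: the result is Any.
let viaPartial: Any = tree[keyPath: partial]

// AnyKeyPath does not know the root type, so the lookup can fail: the result is Any?.
let viaErased: Any? = tree[keyPath: erased]

// Applying the erased key path to a value of the wrong type yields nil.
let wrongRoot: Any? = "not a tree"[keyPath: erased]
```

If that is the cause, dropping the `!` should let the PartialKeyPath version compile.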

(Porter Child) #6

Thanks for the ideas!

String-addressed just because in our application we want a function sort of like:

func navigateObjectStructureAndGetKeyPaths(to members: [String], startingAt root: Object) -> [AnyKeyPath]

where the list of strings represents the members we want to be able to probe while tuning the machine learning model. It's just the first way I thought of to specify the desired KeyPaths in the object structure; I haven't given much thought to a safer way to do this. Suggestions welcome.
Going to take a look at your improvement of Zachary's code tomorrow.
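
One building block such a function could use for walking nested objects is key path composition with appending(path:). This is only a sketch: Forest and Tree here are hypothetical stand-ins for our real object graph, and the traversal itself is not shown.

```swift
struct Tree {
    var diameter: Double
}

struct Forest {
    var oldest: Tree
}

let forest = Forest(oldest: Tree(diameter: 10))

// Fully typed composition: the compiler checks that the two pieces fit together.
let typedPath = (\Forest.oldest).appending(path: \Tree.diameter)
let typedValue = forest[keyPath: typedPath]

// Type-erased composition: the fit is checked at runtime, so the result is optional.
let first: AnyKeyPath = \Forest.oldest
let second: AnyKeyPath = \Tree.diameter
let erasedPath = first.appending(path: second)

var erasedValue: Double? = nil
if let path = erasedPath {
    erasedValue = forest[keyPath: path] as? Double
}

// Appending in the wrong order cannot be joined, so this comes back as nil.
let mismatched = second.appending(path: first)
```

The erased form is the one that matches string-driven lookup, since the member types are not known at compile time.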

(Porter Child) #7

Hey look the Swift for TensorFlow team has already solved this problem!
https://github.com/tensorflow/swift/blob/master/docs/DynamicPropertyIteration.md

(Porter Child) #8

Here's an extension to Swift for TensorFlow's KeyPathIterable that gives the functionality of my original question. (get KeyPath based on member string name)

It relies on mirror.children giving you the members in the same order that KeyPathIterable gives them to you. As far as I have tested it, it does.

I couldn't figure out how to use .enumerated() to get rid of using a manual counter i variable, improvements welcome.

import TensorFlow

extension KeyPathIterable{

    var membersToKeyPaths: [String: PartialKeyPath<Self>]{
        let keyPaths = self.allKeyPaths as! [PartialKeyPath<Self>]
        let mirror = Mirror(reflecting: self)
    
        var membersToKeyPaths: [String: PartialKeyPath<Self>] = [:]
        var i = 0
        for case (let member?, _) in mirror.children{
            membersToKeyPaths[member] = keyPaths[i]
            i += 1
        }
        return membersToKeyPaths
    }
}
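
A possible way to drop the manual counter is zip instead of .enumerated(). The sketch below does not import TensorFlow: orderedKeyPaths is a hand-written stand-in for what KeyPathIterable's allKeyPaths would return, and the free function membersToKeyPaths(of:keyPaths:) is a hypothetical helper. It keeps the same assumption as above, that mirror.children and the key paths come in the same order.

```swift
struct Tree {
    var diameter: Double
    var circumference: Double
    var barkThickness: Double
}

// Stand-in for KeyPathIterable's allKeyPaths (assumption: declaration order).
let orderedKeyPaths: [PartialKeyPath<Tree>] = [
    \Tree.diameter, \Tree.circumference, \Tree.barkThickness
]

// Hypothetical helper: pairs each labelled Mirror child with the key path at the same position.
func membersToKeyPaths<Root>(
    of value: Root,
    keyPaths: [PartialKeyPath<Root>]
) -> [String: PartialKeyPath<Root>] {
    // Children without a label are skipped, as in the `for case` loop above.
    let labels = Mirror(reflecting: value).children.compactMap { $0.label }
    // zip stops at the shorter sequence, so a length mismatch cannot crash here.
    return Dictionary(zip(labels, keyPaths), uniquingKeysWith: { first, _ in first })
}

let tree = Tree(diameter: 10, circumference: 31.415, barkThickness: 0.5)
let lookup = membersToKeyPaths(of: tree, keyPaths: orderedKeyPaths)
let bark = lookup["barkThickness"].map { tree[keyPath: $0] } as? Double
```

Unlike the subscript-based versions earlier in the thread, the values here are the real stored-property key paths, so they compare equal to \Tree.diameter and friends.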
(Joanna Carter) #9

Well, it's a bit more complicated to write the "framework" code but, in the end, this solution only creates a Mirror once for each type.

Let's start with a couple of helper types

protocol DefaultValueProvider
{
  init()
}

public struct HashedType : Hashable
{
  public let hashValue: Int
  
  public init(_ type: Any.Type)
  {
    hashValue = unsafeBitCast(type, to: Int.self)
  }
  
  public init<T>(_ pointer: UnsafePointer<T>)
  {
    hashValue = pointer.hashValue
  }
  
  public static func == (lhs: HashedType, rhs: HashedType) -> Bool
  {
    return lhs.hashValue == rhs.hashValue
  }
}

Then we declare a version of @zwaldowski's protocol, to which all types that want to use strings to access property values must conform, along with a "related" static class, where all the magic happens and the mirrors for each type are cached

protocol Reflectable : DefaultValueProvider { }

extension Reflectable
{
  static var keyPaths: [String : AnyKeyPath]?
  {
    return KeyPathCache.keyPaths(for: Self.self)
  }
  
  fileprivate subscript(checkedMirrorDescendant key: String) -> Any
  {
    let hashedType = HashedType(type(of: self))
    
    return KeyPathCache.mirrors[hashedType]!.descendant(key)!
  }
}

class KeyPathCache
{
  fileprivate static var mirrors: [HashedType : Mirror] = .init()
  
  private static var items: [HashedType : [String : AnyKeyPath]] = .init()
  
  static func keyPaths<typeT : Reflectable>(for type: typeT.Type) -> [String : AnyKeyPath]?
  {
    let hashedType = HashedType(type)
    
    return items[hashedType]
  }
  
  static func register<typeT : Reflectable>(type: typeT.Type)
  {
    let hashedType = HashedType(type)
    
    if mirrors.keys.contains(hashedType)
    {
      return
    }
    
    let mirror = Mirror(reflecting: typeT())
    
    mirrors[hashedType] = mirror
    
    var keyPathsDictionary: [String : AnyKeyPath] = .init()
    
    for case (let key?, _) in mirror.children
    {
      keyPathsDictionary[key] = \typeT.[checkedMirrorDescendant: key] as PartialKeyPath
    }
    
    items[hashedType] = keyPathsDictionary
  }
}

Test code looks like this

class Person : Reflectable
{
  var name: String = "Joanna"
  
  var age: Int = 21 // yeah, it's been that way for years now :slight_smile: 
}
  func testPersonKeyPaths() // hypothetical wrapper so the snippet can be compiled and called
  {
    KeyPathCache.register(type: Person.self)
    
    let person = Person()
    
    guard let personKeyPaths = Person.keyPaths,
          let nameKeyPath = personKeyPaths["name"] else
    {
      return
    }
    
    if let name = person[keyPath: nameKeyPath] as? String
    {
      print(name)
    }
  }
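
One thing worth checking with a cached Mirror: a Mirror reads from the instance it was created from, not from whichever instance is later used with the key path. The sketch below is a minimal check of that behaviour with a hypothetical one-property Tree struct; it does not use the cache class above.

```swift
struct Tree {
    var diameter: Double = 0
}

// A mirror made once from a default instance, as a type-level cache would hold.
let defaultTree = Tree()
let cachedMirror = Mirror(reflecting: defaultTree)

// A different instance with a different value.
let bigTree = Tree(diameter: 42)

// The cached mirror still reports the default instance's value...
let fromCached = cachedMirror.descendant("diameter") as? Double

// ...while a mirror made from the instance in question reports that instance's value.
let fromFresh = Mirror(reflecting: bigTree).descendant("diameter") as? Double
```

With default-valued properties, as in the Person test, the two results happen to coincide, so it is worth trying an instance whose values differ from the defaults.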

Well, what do you think?