Introspection of KeyPaths

Pampel · May 26, 2020, 1:11pm

One bane of programming is 'stringly typed' APIs, where strings are used to convey some sort of meaning that would probably be better conveyed with types, but for whatever reason, we can't or don't. A great example of this is code that generates queries against databases.

KeyPaths seem like they should solve this issue, but currently there's no way to examine a KeyPath and figure out what the path is or what it points to - at the moment all you can do with them is use them to access properties.

If KeyPath had an API to expose the path it expresses, we could do neat stuff...

struct Person {
    ...
    let name: String
    let age: Int
    ...
}

struct Database {
    func select<Table, Property, OrderBy>(_ source: KeyPath<Table, Property>, orderBy: KeyPath<Table, OrderBy>) -> [Property] {
        let query = "SELECT \(source.pathComponents.first!) FROM \(Table.self) t ORDER BY t.\(orderBy.pathComponents.joined(separator: "."))"
        return execute(query)
    }
}

let database = Database()

let names = database.select(\Person.name, orderBy: \.age)

That's a pretty convoluted and definitely non-production ready example, but hopefully illustrates the point (and actually compiles if you add stubbed extensions to KeyPath). I'm not sure what shape the API should take, but it could enable some really interesting and powerful type-safe APIs for query generation, configuration, testing etc.
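Until such an API exists, the idea can be approximated with a hand-written lookup table. The following is a minimal sketch, not a proposed design: the `ColumnNaming` protocol, its `columnNames` table and the `selectSQL` function are all hypothetical names, and the function only builds the query text rather than executing it.

```swift
// Hypothetical stand-in for the missing introspection API:
// each type lists the names of its own key paths by hand.
protocol ColumnNaming {
    static var columnNames: [PartialKeyPath<Self>: String] { get }
}

struct Person: ColumnNaming {
    let name: String
    let age: Int

    static var columnNames: [PartialKeyPath<Person>: String] {
        [\Person.name: "name", \Person.age: "age"]
    }
}

// Builds the query text only; returns nil when a key path has no registered name.
func selectSQL<Table: ColumnNaming, Property, OrderBy>(
    _ source: KeyPath<Table, Property>,
    orderBy: KeyPath<Table, OrderBy>
) -> String? {
    guard let column = Table.columnNames[source],
          let order = Table.columnNames[orderBy] else { return nil }
    return "SELECT \(column) FROM \(Table.self) t ORDER BY t.\(order)"
}

let sql = selectSQL(\Person.name, orderBy: \.age)
```

The duplication between the property list and the name table is exactly what built-in introspection would remove.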

Expressing database schema details and constraints
As well as generating queries, KeyPaths could also be used to generate or validate schemas or schema-like structures, and allow frameworks to enforce constraints before a request makes it to the database.

Code

struct PersonMappingConfiguration {
    func map() {
        mapper.hasMany(\.pets)
        mapper.makeUnique(\.nickName)
        mapper.useColumnName("firstName", forProperty: \.name).makeReadonly()
    }
}
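Part of this already works without introspection, because key paths are Hashable and carry their value type. Here is a rough sketch of a mapper that records constraints per key path. The `Constraint` enum, the `Mapper` class and the `Pet` type are assumptions made for illustration, and chaining such as `.makeReadonly()` is left out.

```swift
struct Pet { var name: String }

struct Person {
    var name: String
    var nickName: String
    var pets: [Pet]
}

// Hypothetical constraint vocabulary, loosely following the calls in the post.
enum Constraint: Equatable {
    case hasMany
    case unique
    case column(String)
}

final class Mapper<Model> {
    // Key paths are Hashable, so they can index the collected constraints.
    private(set) var constraints: [PartialKeyPath<Model>: [Constraint]] = [:]

    private func add(_ constraint: Constraint, to keyPath: PartialKeyPath<Model>) {
        constraints[keyPath, default: []].append(constraint)
    }

    // Only array-valued properties are accepted here.
    func hasMany<Element>(_ keyPath: KeyPath<Model, [Element]>) {
        add(.hasMany, to: keyPath)
    }

    // Only Hashable values can be declared unique in this sketch.
    func makeUnique<Value: Hashable>(_ keyPath: KeyPath<Model, Value>) {
        add(.unique, to: keyPath)
    }

    func useColumnName<Value>(_ name: String, forProperty keyPath: KeyPath<Model, Value>) {
        add(.column(name), to: keyPath)
    }
}

let mapper = Mapper<Person>()
mapper.hasMany(\.pets)
mapper.makeUnique(\.nickName)
mapper.useColumnName("firstName", forProperty: \.name)
```

With these signatures, a call like `mapper.hasMany(\.name)` does not compile, since `String` is not an array. What is still missing is a way to turn each recorded key path into a column name automatically.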

Validation
Similarly, KeyPath introspection could be used to create validators. In this case, the validator code becomes stateless.

Code

struct MemberValidation {
    let validation = Validator<Member>()
    
    func configure() {
        validation.of(\.name).required().maxLength(32)
        validation.of(\.age).minValue(1).withErrorMessage(message: { age in "Age \(age) is not valid!"})
        validation.of(\.nickName).required().withErrorMessage(message: { member, nickName in "\(member.name)'s nickname '\(nickName)' is too short!" })
    }
}

struct Validator<Type> {
    func of<Prop>(_ prop: KeyPath<Type, Prop>) -> Validation<Type, Prop> { return Validation<Type, Prop>() }
}

struct Validation<Type, Prop> {
    func required() -> Self { ... }
    func maxLength(_ maxLength: Int) -> Self { ... }
    func minLength(_ minLength: Int) -> Self { ... }
    func minValue(_ minValue: Int) -> Self { ... }
    func withErrorMessage(message: (Prop) -> String) -> Self { ... }
    func withErrorMessage(message: (Type, Prop) -> String) -> Self { ... }
}
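As a point of comparison, a validator like this can be written today if the caller supplies the property name by hand. This is a simplified sketch under that assumption. The `add(_:label:message:where:)` method and the `label` parameter are hypothetical and do not match the fluent API above.

```swift
struct Member {
    var name: String
    var age: Int
}

struct Validator<Root> {
    // Each rule returns an error message, or nil when the value passes.
    private var rules: [(Root) -> String?] = []

    // `label` is supplied by hand because a key path cannot report its own name today.
    mutating func add<Value>(
        _ keyPath: KeyPath<Root, Value>,
        label: String,
        message: String,
        where isValid: @escaping (Value) -> Bool
    ) {
        rules.append { root in
            isValid(root[keyPath: keyPath]) ? nil : "\(label) \(message)"
        }
    }

    // Runs every rule and collects the messages of those that failed.
    func validate(_ root: Root) -> [String] {
        rules.compactMap { $0(root) }
    }
}

var validator = Validator<Member>()
validator.add(\.name, label: "name", message: "is required") { !$0.isEmpty }
validator.add(\.name, label: "name", message: "must be at most 32 characters") { $0.count <= 32 }
validator.add(\.age, label: "age", message: "must be at least 1") { $0 >= 1 }

let errors = validator.validate(Member(name: "", age: 0))
// errors == ["name is required", "age must be at least 1"]
```

With key path introspection, the `label` argument could be derived from the key path instead of being repeated as a string.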

Generating data for tests

Code

struct FooTests {
    func BarTest() {
        let names = ["Bob", "John", "Sue"]
        let memberBuilder = Builder<Member>()
            .with(\.name, value: names.randomElement()!)
            .with(\.userId, value: UUID())
        
        let member = memberBuilder.build()

        ...
    }
}

struct Builder<Type> {
    func with<Prop>(_ property: KeyPath<Type, Prop>, value: @autoclosure () -> Prop) -> Self { ... }
    func build() -> Type { ... }
}
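A builder along these lines is possible today with two adjustments, sketched below as assumptions rather than as the intended design: it takes `WritableKeyPath` so the value can be assigned, and it starts from a template instance because a key path cannot construct a value. The `startingFrom` initializer is a hypothetical name, and `userId` is an `Int` here to keep the example free of imports.

```swift
struct Member {
    var name: String
    var userId: Int
}

struct Builder<Model> {
    private var model: Model

    // A key path cannot create an instance, so the builder starts from a template value.
    init(startingFrom template: Model) {
        model = template
    }

    // WritableKeyPath (not plain KeyPath) is what permits assignment.
    func with<Value>(
        _ keyPath: WritableKeyPath<Model, Value>,
        value: @autoclosure () -> Value
    ) -> Builder {
        var copy = self
        copy.model[keyPath: keyPath] = value()
        return copy
    }

    func build() -> Model { model }
}

let member = Builder(startingFrom: Member(name: "", userId: 0))
    .with(\.name, value: "Sue")
    .with(\.userId, value: 42)
    .build()
```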

Paul_Cantrell · May 26, 2020, 2:53pm

I would appreciate something along these lines. Gathering good use cases would be a good place to start.

filip-sakel · May 26, 2020, 4:08pm

I’m by no means an expert in how key paths work, but how would that affect the performance and memory footprint of key paths? If these concerns are addressed, I am definitely in favor of this proposal. I have encountered cases where some identifier derived from key paths would create a more intuitive API, and I had to resort to the more “hacky” solution of using the hashValue, which is unreliable.

gwendal.roue · May 26, 2020, 4:20pm

@Pampel You might want to look at the KeyPath-to-String conversion performed by vapor/core on GitHub. See for example https://github.com/vapor/core/blob/master/Tests/CoreTests/ReflectableTests.swift

Now, personally, as the author of GRDB (an SQLite database library that quite a few users love to use), I have never wanted to provide KeyPath-to-column automatic conversion.

The reason is that I think the intimate details of the relationship between a record type and the database should remain private. Column names are such details. The fact that the synthesized CodingKeys are private as well gives a good precedent.

By not relying on key paths, you can encapsulate your record exactly how you need it.

For example, the record below hides its latitude/longitude. There is no available key path which can talk to those database columns.

struct Place: Codable {
    var id: UUID
    var title: String
    private var latitude: CLLocationDegrees
    private var longitude: CLLocationDegrees
    var coordinate: CLLocationCoordinate2D {
        get { ... }
        set { ... }
    }
}

This ability to cleanly distinguish the private inner details and the internal/public facet of your records, the ability to refactor and migrate your database with minimal-to-zero impact on clients of those types, the ability to have complex records behave just the same as simple trivial ones, those are advantages that would be instantly ruined if key paths were publicly fostered as column proxies.

When such guts are unfortunately exposed, and when you realize that you really need to hide those guts because they are impractical to work with, you have to build a second layer of models that wrap the first ones. Not only is this second layer a chore to build, but it is likely that not all of your records need one. You end up with an inconsistent database facade, with a mix of low-level and high-level types, without any clear reason why this mess has started. As a matter of fact, it's pretty clear: that's because of the KeyPath-to-string "convenience" conversion.

gwendal.roue · May 26, 2020, 4:50pm

Now of course I wonder if Fluent users would confirm this prediction. I would enjoy a reality check :-)

Alejandro · May 26, 2020, 5:31pm

This is certainly possible right now with KeyPath’s current internals. It would require some more thought as to what the path component type looks like from an API point of view, but things like component type and name are all there (you’ll have to piece together offsets, but doable).

Jean-Daniel · May 26, 2020, 10:42pm

IIRC, the only time I miss such a capability is when I have to work with Obj-C APIs that must take a string key path and don't have a KeyPath-based equivalent.
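For that Obj-C case specifically, two existing routes may help on Apple platforms. This is a small sketch, assuming an `NSObject` subclass with `@objc` properties. The `Person` class and the `stringKeyPaths` array are illustrative names, and neither route works for plain Swift structs.

```swift
import Foundation

// String key paths collected on platforms with the Objective-C runtime; empty elsewhere.
var stringKeyPaths: [String] = []

#if canImport(ObjectiveC)
final class Person: NSObject {
    @objc var name = ""
}

// #keyPath yields a compile-time checked string for an @objc property.
stringKeyPaths.append(#keyPath(Person.name))

// NSExpression accepts a Swift key path and exposes the KVC string it maps to.
stringKeyPaths.append(NSExpression(forKeyPath: \Person.name).keyPath)
#endif
```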

Pampel · May 28, 2020, 12:32pm

I totally agree with you in principle, although I have actually worked on projects that used ORMs with fluent configuration that do exactly this and found them an absolute pleasure to work with - there are plenty of areas where code driving a schema and queries against it is totally fine, and in those cases, this might be a good fit. Personally, I'd be cautious about using it for a major client project, but I'd love to be able to use it to spike out PoCs for personal stuff.

Using KeyPaths in the configuration of database-adjacent code can really help, if not to generate the schema, then at least to validate that your code is compatible with it, e.g.:

struct PersonMappingConfiguration {
    func map() {
        mapper.hasMany(\.pets)
        mapper.makeUnique(\.nickName)
        mapper.useColumnName("firstName", forProperty: \.name).makeReadonly()
    }
}

There are all sorts of opportunities in there to use the type system to validate the schema, isolate the 'front end' of the type from its db representation, and prevent clients of your API from doing bad things. Again, not always the right thing, but a great tool for where you want something similar, and one that is proven to work.

But, this isn't just about SQL.

I've used similar features for building and validating configuration, generating documents, improving testing and debugging, validating user input, and other things I can't remember. It's not a tool I use often, but when I need something similar, it's the perfect thing for the job.

Max_Desiatov · May 28, 2020, 12:50pm

Here's a use case for KeyPath introspection with CRDT: "Query into dynamic data using static key paths"

gwendal.roue · May 28, 2020, 1:02pm

Thanks for your answer, @Pampel. Yes, Fluent users are generally delighted. Maybe the trouble I envision does not bother them. Or maybe many servers mainly perform CRUD operations, and don't use the database models much, avoiding the need to hide database details.

Pampel · May 28, 2020, 1:05pm

For sure, there are plenty of circumstances where this use case won't be appropriate. Options are good, though; we shouldn't be too judgemental about what people might use this for.

gwendal.roue · May 28, 2020, 1:14pm

You are right. Now, it's also useful to freely explore the consequences of some practices, form hypotheses, and compare them with one's own past experience and the experience of others. To this end, those hypotheses have to be expressed. I don't think it was "judgemental" to express that, IMHO, key path-to-string conversion can create trouble, while its absence fosters more robust practices.

I even provided a link to an implementation of path-to-string conversion; see how I don't prevent anyone from doing anything.

Pampel · May 28, 2020, 1:18pm

Perhaps 'judgemental' was a little strong!

That said, code based off KeyPath introspection could cause the trouble you mentioned, but it can also rid code of string typing, so it also fosters more robust practices.

gwendal.roue · May 28, 2020, 1:20pm

Yes. In order to avoid string typing, GRDB fosters relying on the CodingKeys generated by Codable synthesis. They have a built-in stringValue property, and don't require much fuss.
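To illustrate the CodingKeys approach in isolation (this is not GRDB's API), here is a minimal sketch. The `Player` type, the `high_score` column name, the `scoreColumn` helper and the `player` table name are assumptions. Synthesized keys can be used the same way from inside the type.

```swift
struct Player: Codable {
    var id: Int
    var score: Int

    // Explicit keys let the stored name differ from the Swift property name.
    private enum CodingKeys: String, CodingKey {
        case id
        case score = "high_score"
    }

    // Expose only what callers need; the keys themselves stay private.
    static var scoreColumn: String { CodingKeys.score.stringValue }
}

let query = "SELECT MAX(\(Player.scoreColumn)) FROM player"
// query == "SELECT MAX(high_score) FROM player"
```

Renaming the `score` property later would not change the generated SQL, which is the encapsulation benefit described earlier in the thread.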

Pampel · May 28, 2020, 1:34pm

I've added some more examples to the body of this discussion.

gwendal.roue · May 28, 2020, 1:37pm

Thanks! You may be interested in SQL Interpolation and Record Protocols, where Swift string interpolation is put to good use.

extension Player {
    static func maximumScore() -> SQLRequest<Int> {
        "SELECT MAX(\(CodingKeys.score)) FROM \(self)"
    }
}

let score = try Player.maximumScore().fetchOne(db) // Int?

mattpolzin · May 28, 2020, 2:32pm

My favorite potential use-case for key path introspection is serialization of key paths as JSON References. Only works given certain structural assumptions but in the context of a well-known schema (like, oh, I don’t know, OpenAPI, as a random example with no personal significance) it could be really nice.

ddddxxx · May 29, 2020, 5:03am

I did an implementation:

https://gist.github.com/ddddxxx/7ba69196b8551efcf4025e7001cefa26

KeyPath+fieldName.swift

// This file is based largely on the Runtime package - https://github.com/wickwirew/Runtime

extension KeyPath {
    
    var fieldName: String? {
        guard let offset = MemoryLayout<Root>.offset(of: self) else {
            return nil
        }
        let typePtr = unsafeBitCast(Root.self, to: UnsafeMutableRawPointer.self)
        let metadata = typePtr.assumingMemoryBound(to: StructMetadata.self)

This file has been truncated.

lightsprint09 · May 29, 2020, 6:31am

While building a CRDT, I considered writing a similar proposal. During research I came up with the following questions.

Computed members
Should one be able to introspect a key path and find out whether the introspected member is computed?
Should the compiler give us a way to require key paths that do not contain computed members?
Subscripts
While members could be represented as strings, how should we represent a subscript?
As far as I know, a subscript can be called with any type.
Is a subscript similar to a computed member or not?
Optionals
Should one be able to find out whether a member is optional?
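On the computed-member question, a partial answer exists today for struct roots: `MemoryLayout.offset(of:)`, the same call the gist above starts from, returns nil unless the key path resolves to directly addressable stored storage. Here is a small sketch. The `Foo` type and the `isStored` helper are illustrative names, and the helper reports false for class roots regardless of how the property is declared.

```swift
struct Foo {
    var stored: Int = 1
    var other: Int? = nil
    var computed: Int { 5 }
}

// offset(of:) yields a byte offset only for key paths that resolve to stored properties.
let storedOffset = MemoryLayout<Foo>.offset(of: \Foo.stored)
let optionalOffset = MemoryLayout<Foo>.offset(of: \Foo.other)
let computedOffset = MemoryLayout<Foo>.offset(of: \Foo.computed)

// Hypothetical helper: true when the key path points at inline stored storage of a struct.
func isStored<Root, Value>(_ keyPath: KeyPath<Root, Value>) -> Bool {
    MemoryLayout<Root>.offset(of: keyPath) != nil
}
```

Here `storedOffset` is 0 and `computedOffset` is nil, while an optional stored property still has an offset, since optionality does not change whether the storage is addressable.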

filip-sakel · May 30, 2020, 7:23am

I don’t see why that’s important. Am I missing something? Couldn’t we just write:

struct Foo { var bar: Int { 5 } }

let name = (\Foo.bar).nameComponents.last!

print(name) // bar

Note: I don’t know what the proposed API would look like, so I used a nameComponents array, as it seems convenient.

I think optionals should be treated just like any other type. If we started making exceptions it would be hard to maintain.

I think that the String “[0]” would be just fine for a subscript, although it’d be interesting to explore other ways.
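To make the trade-offs in this exchange concrete, here is one possible shape for path components, written as a purely hypothetical sketch. Nothing like `PathComponent` or `render` exists in the standard library, and the components are built by hand here rather than extracted from a real key path.

```swift
// Hypothetical component model; nothing like this exists in the standard library.
enum PathComponent: Equatable {
    case member(name: String, isComputed: Bool)
    case subscriptAccess(arguments: [String])
    case optionalChain
}

// Renders components in source-like form, e.g. a subscript as "[0]".
func render(_ components: [PathComponent]) -> String {
    var result = ""
    for component in components {
        switch component {
        case .member(let name, _):
            result += result.isEmpty ? name : "." + name
        case .subscriptAccess(let arguments):
            result += "[" + arguments.joined(separator: ", ") + "]"
        case .optionalChain:
            result += "?"
        }
    }
    return result
}

let rendered = render([
    .member(name: "pets", isComputed: false),
    .subscriptAccess(arguments: ["0"]),
    .member(name: "owner", isComputed: false),
    .optionalChain,
    .member(name: "name", isComputed: true),
])
// rendered == "pets[0].owner?.name"
```

A structured component type like this keeps the string form available while still letting callers ask whether a member is computed or whether a step is a subscript.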