“structs with metadata”: how to do better than `KeyPath`?

i’ve got a bunch of stats counters that look like this:

extension Stats
{
    @frozen public
    struct Decl:Equatable, Sendable
    {
        /// Typealiases.
        public
        var typealiases:Int
        /// Structs and enums.
        public
        var structures:Int
        /// Protocols.
        public
        var protocols:Int
        /// Classes, excluding actors.
        public
        var classes:Int
        /// Actors.
        public
        var actors:Int

        ...

after brainstorming some alternatives, i eventually settled on a protocol, StatsCollection, with a bunch of static requirements:

protocol StatsCollection
{
    static
    var keys:[KeyPath<Self, Int>] { get }

    static
    func id(_ key:KeyPath<Self, Int>) -> String?

    static
    func display(_ key:KeyPath<Self, Int>) -> String?
}

a conformance looks like:

extension Stats.Decl:StatsCollection
{
    static
    var keys:[KeyPath<Self, Int>]
    {
        [
            \.functions,
            \.operators,
            \.constructors,
            \.methods,
            \.subscripts,
            \.functors,
            \.protocols,
            \.requirements,
            \.witnesses,
            \.attachedMacros,
            \.freestandingMacros,
            \.structures,
            \.classes,
            \.actors,
            \.typealiases
        ]
    }

    static
    func id(_ key:KeyPath<Self, Int>) -> String?
    {
        switch key
        {
        case \.functions:           "decl function"
        case \.operators:           "decl operator"
        case \.constructors:        "decl constructor"
        case \.methods:             "decl method"
        case \.subscripts:          "decl subscript"
        case \.functors:            "decl functor"
        case \.protocols:           "decl protocol"
        case \.requirements:        "decl requirement"
        case \.witnesses:           "decl witness"
        case \.attachedMacros:      "decl macro attached"
        case \.freestandingMacros:  "decl macro freestanding"
        case \.structures:          "decl structure"
        case \.classes:             "decl class"
        case \.actors:              "decl actor"
        case \.typealiases:         "decl typealias"
        case _:                     nil
        }
    }

    static
    func display(_ key:KeyPath<Self, Int>) -> String?
    {
        switch key
        {
        case \.functions:           "global functions or variables"
        case \.operators:           "operators"
        case \.constructors:        "initializers, type members, or enum cases"
        case \.methods:             "instance members"
        case \.subscripts:          "instance subscripts"
        case \.functors:            "functors"
        case \.protocols:           "protocols"
        case \.requirements:        "protocol requirements"
        case \.witnesses:           "default implementations"
        case \.attachedMacros:      "attached macros"
        case \.freestandingMacros:  "freestanding macros"
        case \.structures:          "structures"
        case \.classes:             "classes"
        case \.actors:              "actors"
        case \.typealiases:         "typealiases"
        case _:                     nil
        }
    }
}

i decided to use a protocol and not a macro because the Stats.Decl structure is a fundamental data type that’s part of the database schema, while StatsCollection is part of the rendering logic, and i didn’t want to include any rendering logic in the database schema module.

but i find KeyPaths to be an awkward abstraction because they don’t have any concept of exhaustivity, so the mapping functions need to have a case _: nil default clause. this means when i add new fields to the structure, there are a lot of disparate code locations to update and the compiler doesn’t provide a lot of guardrails here.
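as a stopgap, the gap can at least be caught at run time. here is a minimal sketch, using a trimmed-down, hypothetical `Decl` with three fields and a made-up `unmappedKeys` helper (neither is the real schema type). it only reports entries of `keys` that have no `id` or `display` string; it cannot notice a field that was never added to `keys` in the first place.

```swift
// hypothetical, trimmed-down stand-ins for the types in this thread
struct Decl
{
    var classes:Int = 0
    var actors:Int = 0
    var typealiases:Int = 0
}

protocol StatsCollection
{
    static
    var keys:[KeyPath<Self, Int>] { get }

    static
    func id(_ key:KeyPath<Self, Int>) -> String?

    static
    func display(_ key:KeyPath<Self, Int>) -> String?
}

extension StatsCollection
{
    /// positions in `keys` whose key path has no `id` or no `display` string.
    /// (`unmappedKeys` is a made-up helper, not part of the original protocol.)
    static
    var unmappedKeys:[Int]
    {
        keys.indices.filter { id(keys[$0]) == nil || display(keys[$0]) == nil }
    }
}

extension Decl:StatsCollection
{
    static
    var keys:[KeyPath<Decl, Int>] { [\.classes, \.actors, \.typealiases] }

    static
    func id(_ key:KeyPath<Decl, Int>) -> String?
    {
        switch key
        {
        case \.classes:     return "decl class"
        case \.actors:      return "decl actor"
        // `\.typealiases` is deliberately missing; this still compiles.
        default:            return nil
        }
    }

    static
    func display(_ key:KeyPath<Decl, Int>) -> String?
    {
        switch key
        {
        case \.classes:     return "classes"
        case \.actors:      return "actors"
        case \.typealiases: return "typealiases"
        default:            return nil
        }
    }
}

// meant to be run from a unit test or at startup
let unmapped:[Int] = Decl.unmappedKeys
```

here `unmapped` contains the position of `\.typealiases`, which is the kind of omission the compiler stays silent about.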

can we do better than KeyPath?


Just thinking out loud here:

protocol StatsCollectionKey {
  var display: String { get }
  var id: String { get }
}

protocol StatsCollection {
  associatedtype Key: StatsCollectionKey

  subscript(key: Key) -> Int { get /*set*/ }
}

extension Stats.Decl: StatsCollection  {
  enum Key: StatsCollectionKey {
    case // ...

    internal // or fileprivate, etc
    var keyPath: KeyPath<Stats.Decl, Int> {
      // ...
    }
  }

  subscript(key: Key) -> Int {
    get { self[keyPath: key.keyPath] }
    /*set { self[keyPath: key.keyPath] = newValue }*/
  }
}
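To make the idea concrete, here is a filled-in sketch of the same shape. The two-field `Decl` is a hypothetical stand-in for `Stats.Decl`, and the added `CaseIterable` requirement is an assumption of mine so that the keys can be enumerated. Every `switch` over `Key` is exhaustive, so adding a case forces an update to `keyPath`, `id`, and `display`. Adding a stored property to the struct still goes unnoticed unless someone also adds a case.

```swift
protocol StatsCollectionKey:CaseIterable
{
    var id:String { get }
    var display:String { get }
}

protocol StatsCollection
{
    associatedtype Key:StatsCollectionKey

    subscript(key:Key) -> Int { get }
}

// Hypothetical stand-in for the schema type.
struct Decl
{
    var classes:Int = 0
    var actors:Int = 0
}

extension Decl:StatsCollection
{
    enum Key:StatsCollectionKey, CaseIterable
    {
        case classes
        case actors

        // No default clause: a new case will not compile until it is handled.
        var keyPath:KeyPath<Decl, Int>
        {
            switch self
            {
            case .classes:  return \.classes
            case .actors:   return \.actors
            }
        }

        var id:String
        {
            switch self
            {
            case .classes:  return "decl class"
            case .actors:   return "decl actor"
            }
        }

        var display:String
        {
            switch self
            {
            case .classes:  return "classes"
            case .actors:   return "actors"
            }
        }
    }

    subscript(key:Key) -> Int { self[keyPath: key.keyPath] }
}

// Usage: render every counter in declaration order.
let stats:Decl = Decl(classes: 3, actors: 0)
let lines:[String] = Decl.Key.allCases.map { "\($0.display): \(stats[$0])" }
```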

right, i’m actually pursuing something like this right now. the challenge is generating the Key type itself: a macro can’t help here, because it can’t see the fields of the database schema type, since it is defined in an upstream module.

If you know the struct has no stored members other than public Ints, you could potentially generate MemoryLayout<Stats.Decl>.size / MemoryLayout<Int>.stride cases -- but I don't think macros have access to size information. It'd have to be more like a static assert(Key.allCases.count == ...).
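A minimal sketch of that check, assuming a hypothetical three-field `Decl` and a hand-written `Key` enum. It is only meaningful when every stored property is an `Int`; any other field, or padding, would throw the arithmetic off.

```swift
// Hypothetical stand-ins: a struct containing nothing but Int counters,
// and a hand-maintained enum that is supposed to mirror its fields.
struct Decl
{
    var typealiases:Int = 0
    var structures:Int = 0
    var protocols:Int = 0
}

enum Key:CaseIterable
{
    case typealiases
    case structures
    case protocols
}

// Number of Int-sized slots in the struct's in-memory layout.
let fieldCount:Int = MemoryLayout<Decl>.size / MemoryLayout<Int>.stride

// Fails in debug builds if a field was added without a matching case.
assert(fieldCount == Key.allCases.count, "Key is out of sync with Decl")
```

The check detects a count mismatch, not which field is missing, and it cannot tell a renamed field from an unchanged one.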

Have you considered making those keys enum cases and, instead of a struct with individual fields, using either an array or a dictionary to go from key to integer?

aside from the extra heap allocations, i’ve found that Dictionary tends not to compose well with codable things because the key order changes non-deterministically. (one day, i hope that eternal sophomore of a type OrderedDictionary will finally graduate…)

one thing i did start doing instead was re-purposing the CodingKeys types with some of the non-Int cases ignored as a rough approximation of a list of field paths.
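roughly, that looks like the sketch below. the `Decl` here is a hypothetical stand-in with one non-Int field (`name`), and the `counter` property is a made-up name. because the `switch` is over an enum, a new coding key will not compile until someone decides whether it is a counter.

```swift
// hypothetical stand-in: one non-Int field plus two counters
struct Decl:Codable
{
    var name:String = ""
    var classes:Int = 0
    var actors:Int = 0

    // declared explicitly so it can also conform to CaseIterable
    enum CodingKeys:String, CodingKey, CaseIterable
    {
        case name
        case classes
        case actors
    }
}

extension Decl.CodingKeys
{
    /// key path of the counter this key encodes, or nil for non-Int fields
    var counter:KeyPath<Decl, Int>?
    {
        switch self
        {
        case .name:     return nil
        case .classes:  return \.classes
        case .actors:   return \.actors
        }
    }
}

// usage: list every counter by its coding key, skipping ignored fields
let stats:Decl = Decl(name: "example", classes: 2, actors: 1)
let counters:[(String, Int)] = Decl.CodingKeys.allCases.compactMap { key in
    key.counter.map { (key.stringValue, stats[keyPath: $0]) }
}
```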

Array is also an option:

enum Key: Int, CaseIterable {
    case typealiases, structures // ...
}
// One counter slot per key, indexed by the key's raw value.
var counters = [Int](repeating: 0, count: Key.allCases.count)
counters[Key.typealiases.rawValue] += 1

Array means you need to encode an element for every field, even if it is zero. for sparse counters, this can waste a lot of space in the database. it also means you can never remove fields without breaking the schema.
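for contrast, here is a sketch of the sparse encoding that a struct with named fields allows. it uses Foundation’s JSONEncoder purely for illustration (the actual database format is not specified in this thread), and the `Decl` type and its one-letter coding keys are hypothetical. zero counters are skipped on encode and default to zero on decode, so a field can be dropped later without invalidating old records.

```swift
import Foundation

// hypothetical stand-in for the schema type
struct Decl:Equatable
{
    var classes:Int = 0
    var actors:Int = 0
    var typealiases:Int = 0
}

extension Decl:Codable
{
    enum CodingKeys:String, CodingKey
    {
        case classes = "C"
        case actors = "A"
        case typealiases = "T"
    }

    init(from decoder:Decoder) throws
    {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // absent keys are treated as zero counters
        self.classes = try container.decodeIfPresent(Int.self, forKey: .classes) ?? 0
        self.actors = try container.decodeIfPresent(Int.self, forKey: .actors) ?? 0
        self.typealiases = try container.decodeIfPresent(Int.self, forKey: .typealiases) ?? 0
    }

    func encode(to encoder:Encoder) throws
    {
        var container = encoder.container(keyedBy: CodingKeys.self)
        // only non-zero counters are written
        if  self.classes != 0
        {
            try container.encode(self.classes, forKey: .classes)
        }
        if  self.actors != 0
        {
            try container.encode(self.actors, forKey: .actors)
        }
        if  self.typealiases != 0
        {
            try container.encode(self.typealiases, forKey: .typealiases)
        }
    }
}

// usage: a record with a single non-zero counter encodes a single key
let original:Decl = Decl(classes: 2)
let data:Data = try! JSONEncoder().encode(original)
let json:String = String(decoding: data, as: UTF8.self)
let decoded:Decl = try! JSONDecoder().decode(Decl.self, from: data)
```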
