[Pitch] Custom Metadata Attributes

Paul_Cantrell · December 13, 2022, 11:54pm

If so, Jens, then that’s merely frustrating but not broken. That’s the opposite of my reading, however: I take the text to mean that @EditorCommandRecord is indeed silently not applied. Perhaps one of the proposal authors can clarify?

allevato · December 13, 2022, 11:58pm

I haven't fully delved into the details yet, but I really like this so far. swift-format currently does some code generation to collect rules and construct a pipeline to run, and it would be interesting to see if the proposed feature gets close enough to replacing that with runtime reflection. It would also open up the ability for folks to hook their own rules into the pipeline as separate dylibs, instead of having to make them part of the core formatter binary.

How does the proposed design handle images being loaded into the process? Swift's own runtime hooks into dyld so that when a new dylib/framework is loaded, protocol conformances in that library are loaded and recognized. Since there are parallels here:

Does getAllInstances cache its result or scan all loaded images every time it's called? If we want to support dynamic libraries providing metadata-driven attributes, we probably need to do the latter.
Then, if I get fresh data every time getAllInstances is called, let's say the first call returns [A, B] and then a dylib is loaded later that adds C. Presumably a subsequent call to getAllInstances would return [A, B, C]. Do I need to do my own bookkeeping so that I only process C the next time around?

hborla · December 14, 2022, 12:29am

Sorry for the confusion, the intention is that one of these attributes on a protocol is effectively a requirement on conforming types, so the attribute will not be silently unapplied. If the protocol conformance is written in an extension, the original declaration of the conforming type must have the attribute written explicitly. We'll clarify this in the proposal text!

ktoso · December 14, 2022, 1:50am

I'm very excited for the "discover instances" capability, this'll really help in various scenarios that require annoying registration steps today. Hooray!

A few minor quick things before I dive deeper.

Overall shape

I really like how we shape the "what this can be applied on" by using specific initializers, it looks very nice.

Retention policy rather than "runtime" in the name

Since this is pretty Java inspired, we should take a closer look at what's good about Java's annotation system: the runtime retention is a policy, and I can see this being useful for us, since we are going to have both macros and runtime things wanting to inspect those attributes.

Specifically, would it make more sense to phrase this feature as:

@customAttribute struct Example { ... } 
@customAttribute(retention: .runtime) struct Example { ... } 
@customAttribute(retention: .source) struct Example { ... }

Note that I'm not as interested in the naming bikeshedding as I am in the capability to change the retention. For example, it would be very useful for attributes to be able to drive docc decisions, or to tell source generators (like open-api or similar) and macros what source-time things to do to such annotated types; but we don't need to retain this info at runtime.

getAllInstances shape

The shape of this API is very confusing. Why T?? Something more descriptive would be very helpful.

Ability to capture "locations" for code coverage

This might be a little bit "out there" but believe me when I say I've seen and need to solve this in Swift in the real world.

We can imagine some piece of code with complex logic, where we need to ensure that all expected code paths are executed. For example, this can be done by littering the code with "probes", that describe and assert what is happening during the execution. Then, a programmer can inspect the trace of notable events and confirm what went wrong.

Specifically, the need here is about always registering all probes, and being able to query for those that were not hit. Our test function declares three probes, and will always miss one of them (for the purpose of this showcase):

func test(branch: Bool) {
  if branch {
    @CoverageMetadata CoverageProbe("took 'yes' branch").hit() // or however we'd spell this
  } else {
    @CoverageMetadata CoverageProbe("took 'no' branch").hit()
  }
  @CoverageMetadata CoverageProbe("took 'yes' branch").hit()
}

func coverageProbeTestSuite() {
  let registerTrue = CoverageProbeRegister().with { // sets a task-local registry
    test(branch: true)
  }
  registerTrue.assertAllHit() // error: not all probes were hit!
  // missing hit: "took 'no' branch" at file.swift:124
}

The goal is to be able to report "hey, you didn't exercise the code in that branch!". The example is very silly here, but you get the idea.

So, I was hopeful we could make use of runtime metadata to be able to "give me all CoverageProbes" such that this registry can then remove them from the set of "pending" probes expected to be hit during an execution. In some pseudo-code, it might look like this:

@runtimeMetadata
struct CoverageMetadata {
  let function: String
  let fileID: String
  let line: UInt

  init(attachedTo: ???, function: String = #function, fileID: String = #fileID, line: UInt = #line) {
    self.function = function
    self.fileID = fileID
    self.line = line
  }

  func hit() {
    CoverageRegister.current.hit(self) // hit in the current execution registry
  }
}

class CoverageRegister {
  var hits: Set<CoverageProbe> = []
  var missing: Set<CoverageProbe> = []

  init() {
    self.missing = Attribute.allInstancesOf(CoverageProbe.self) // OR SIMILAR?
  }

  static var current: Self {
    // use a task-local
  }

  func hit(_: CoverageProbe) {}
}

So we would need to find during compile time where those probes are, and add them to metadata that the registry can query at runtime.
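To make the bookkeeping side of this concrete, here is a minimal runnable sketch of such a registry in plain Swift. It assumes the discovery step already exists: the `allProbes` array stands in for whatever a runtime metadata query would return, and `Probe` and `ProbeRegister` are hypothetical names used only for illustration, not part of the pitch.

```swift
// Hypothetical stand-in for a probe record; not part of the pitch.
struct Probe: Hashable {
    let name: String
    let fileID: String
    let line: UInt
}

final class ProbeRegister {
    private(set) var hits: Set<Probe> = []
    private(set) var missing: Set<Probe>

    // `allProbes` stands in for the result of a runtime metadata query.
    init(allProbes: [Probe]) {
        missing = Set(allProbes)
    }

    // Move a probe from `missing` to `hits` the first time it fires.
    func hit(_ probe: Probe) {
        if let found = missing.remove(probe) {
            hits.insert(found)
        }
    }
}

let yes = Probe(name: "took 'yes' branch", fileID: "Demo/file.swift", line: 3)
let no = Probe(name: "took 'no' branch", fileID: "Demo/file.swift", line: 5)
let register = ProbeRegister(allProbes: [yes, no])

// Simulate an execution that only takes the 'yes' branch.
register.hit(yes)

let missingNames = register.missing.map(\.name)
print(missingNames)
```

Everything left in `missing` after the run is a probe that exists in the code but was never reached, which is the report the post is asking for.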

This feature seems like it might be able to support this, if we added an init(function: String, fileID: String, line: UInt) initializer perhaps...? We can't solve this with macros alone, since macros won't really be able to "run this code when this piece of code is NOT going to be executed" I think... Or we might actually, with a "whole function body" macro transformation where we detect those probes and annotate the function itself using #detectCodeProbes?

What do you think, is this something that fits into this feature?

Paul_Cantrell · December 14, 2022, 2:01am

Ah, that’s a relief! Thanks for the clarification, Holly.

I’d put in a plug for allowing metadata (and protocols with metadata) on same-module extensions, but that’s not at all a dealbreaker.

xedin · December 14, 2022, 2:06am

It's actually implemented this way, but we haven't mentioned that in the text, which we'll fix... The generator function would be transparent and the init call would happen as if the call is from the declaration, so #function, #line and #column would point to the declaration the attribute is attached to.
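For readers less familiar with the mechanism this relies on: in today's Swift, default arguments such as #function and #line are evaluated at the call site rather than inside the initializer. The sketch below shows only that existing behavior with an ordinary struct; `SourceLocation` and `makeLocation` are hypothetical names, and nothing here uses the pitched attribute syntax.

```swift
struct SourceLocation {
    let function: String
    let line: UInt

    // Default arguments using #function and #line are evaluated at the call site.
    init(function: String = #function, line: UInt = #line) {
        self.function = function
        self.line = line
    }
}

func makeLocation() -> SourceLocation {
    // No arguments are passed, so the defaults describe this call, not the init.
    return SourceLocation()
}

let location = makeLocation()
print(location.function)
```

If the generated init call is treated as coming from the annotated declaration, as described above, the same defaults would report that declaration instead.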

bmoo · December 14, 2022, 4:47pm

The other problem with these kinds of attributes is that your software no longer breaks when it is compiled, but now you have to run it to discover what you did wrong, which is a much longer feedback cycle.

ebg · December 14, 2022, 7:30pm

Can the pitch add a survey of how these are used in practice in other languages? From my Java days, Java annotations were a net negative to the language: reading Java code became a mishmash of custom annotations, and it was nearly impossible to understand anything you were reading without diving into each annotation (which nobody ever did, so they just became magic).

tgoyne · December 14, 2022, 9:29pm

I took a stab at implementing Realm's schema discovery using this. This is 100% drycoded, untested, and I have not yet tried to evaluate if doing any of these specific things would actually be a good idea.

Rough implementation

// A sample class declaration using this
@DefaultSchema
class MyModel: RealmSwift.Object {
  @PrimaryKey
  @Persisted
  var primaryKey: Int

  @Indexed
  @Persisted
  var indexedValue: Int

  @Persisted
  var nonindexedValue: Int

  @MapTo(underlying: "underlying_name_with_underscores")
  @Persisted
  var prettyCamelCaseName: Int
}

// The schema types we want to generate from this
struct ObjectSchema {
    var name: String
    var properties: [Property]
}

struct Property {
    var name: String = ""
    var underlyingName: String?
    var type: PropertyType = .int
    var optional: Bool = false
    var indexed: Bool = false
    var primary: Bool = false
    var keyPath: AnyKeyPath!
}

func objectSchema<T: ObjectBase>(for type: T.Type) -> ObjectSchema {
    // ObjectSchema has no default values, so build it with the memberwise initializer
    return ObjectSchema(
        name: String(describing: type),
        properties: discoverProperties(for: type)
    )
}

// The very gross basic implementation

func discoverProperties(for type: ObjectBase.Type) -> [Property] {
    let obj = type.init()
    var properties = [Property]()
    for property in Attributes.getAllInstances(of: Persisted.self) {
        guard let property else { continue }
        guard case let .metadata(root, initialize) = property.storage else { continue }
        guard root == type else { continue }
        properties.append(initialize(obj, properties.count))
    }

    for prop in Mirror(reflecting: obj).children {
        guard let label = prop.label else { continue }
        guard let value = prop.value as? DiscoverablePersistedProperty else { continue }
        guard let index = value.index else { continue }
        // Property is a struct, so mutate the array element in place rather than a copy
        properties[index].name = label
    }

    // Each attribute mutates the shared property list, so it is passed inout
    for prop in Attributes.getAllInstances(of: Indexed.self) {
        guard let prop, prop.root == type else { continue }
        prop.initialize(obj, &properties)
    }

    for prop in Attributes.getAllInstances(of: PrimaryKey.self) {
        guard let prop, prop.root == type else { continue }
        prop.initialize(obj, &properties)
    }

    for prop in Attributes.getAllInstances(of: MapTo.self) {
        guard let prop, prop.root == type else { continue }
        prop.initialize(obj, &properties)
    }

    return properties
}

@runtimeMetadata
struct DefaultSchema {
    let objectSchema: RLMObjectSchema
    init<T: ObjectBase>(attachedTo: T.Type) {
        objectSchema = RLMObjectSchema(forObjectClass: attachedTo)
    }
}

func defaultSchema() -> [ObjectSchema] {
    return Attributes.getAllInstances(of: DefaultSchema.self).compactMap {
        $0.map(objectSchema(for:))
    }
}

@runtimeMetadata
struct Indexed {
    let root: ObjectBase.Type
    let initialize: (ObjectBase, inout [Property]) -> Void
    init<T: ObjectBase, V: DiscoverablePersistedProperty>(attachedTo kp: KeyPath<T, V>) {
        root = T.self
        initialize = { obj, properties in
            guard let obj = obj as? T else { return }
            guard let index = obj[keyPath: kp].index else { return }
            properties[index].indexed = true
        }
    }
}

@runtimeMetadata
struct PrimaryKey {
    let root: ObjectBase.Type
    let initialize: (ObjectBase, inout [Property]) -> Void
    init<T: ObjectBase, V: DiscoverablePersistedProperty>(attachedTo kp: KeyPath<T, V>) {
        root = T.self
        initialize = { obj, properties in
            guard let obj = obj as? T else { return }
            guard let index = obj[keyPath: kp].index else { return }
            properties[index].primary = true
        }
    }
}

@runtimeMetadata
struct MapTo {
    let root: ObjectBase.Type
    let initialize: (ObjectBase, inout [Property]) -> Void
    init<T: ObjectBase, V: DiscoverablePersistedProperty>(attachedTo kp: KeyPath<T, V>, underlying: String) {
        root = T.self
        initialize = { obj, properties in
            guard let obj = obj as? T else { return }
            guard let index = obj[keyPath: kp].index else { return }
            properties[index].underlyingName = underlying
        }
    }
}

@propertyWrapper
@runtimeMetadata
struct Persisted<Value: _Persistable>: DiscoverablePersistedProperty  {
    private var storage: PropertyStorage<Value>

    @available(*, unavailable, message: "@Persisted can only be used as a property on a Realm object")
    public var wrappedValue: Value {
        get { fatalError("called wrappedValue getter") }
        set { fatalError("called wrappedValue setter") }
    }

    init() {
        storage = .unmanagedNoDefault
    }
    init(wrappedValue value: Value) {
        storage = .unmanaged(value: value)
    }
    init<T: ObjectBase, V: _Persistable>(attachedTo kp: KeyPath<T, V>) {
        storage = .metadata(T.self, { (obj: T, index: Int) -> Property in
            var property = Property()
            property.type = V._rlmType
            property.optional = V._rlmOptional
            property.keyPath = kp
            return property
        })
    }

    public static subscript<EnclosingSelf: ObjectBase>(
        _enclosingInstance observed: EnclosingSelf,
        wrapped wrappedKeyPath: ReferenceWritableKeyPath<EnclosingSelf, Value>,
        storage storageKeyPath: ReferenceWritableKeyPath<EnclosingSelf, Self>
        ) -> Value {
        get {
            return observed[keyPath: storageKeyPath].get(observed)
        }
        set {
            observed[keyPath: storageKeyPath].set(observed, value: newValue)
        }
    }

    internal mutating func get(_ object: ObjectBase) -> Value {
        switch storage {
        case let .unmanaged(value):
            return value
        case .unmanagedNoDefault:
            let value = Value._rlmDefaultValue()
            storage = .unmanaged(value: value)
            return value
        default:
            fatalError()
        }
    }

    internal mutating func set(_ object: ObjectBase, value: Value) {
        switch storage {
        case .unmanaged, .unmanagedNoDefault:
            storage = .unmanaged(value: value)
        default:
            fatalError()
        }
    }
}

My most superficial observation is that I don't understand why getAllInstances() returns a [T?].
What does a nil in this array signify?

The @DefaultSchema part which replaces the use of objc_copyClassList() is very simple and seems like it'd straightforwardly work for us. It also seems like it'd be very easy to fall back to objc_copyClassList() for backwards compatibility.

Property discovery on classes is less rosy.

The first note is that I had to use Mirror to get the names of properties, and the approach I took wouldn't work for something like an @Test annotation for XCTest. Maybe there should be something like init<T, V>(named: String, attachedTo: KeyPath<T, V>) (name before the keypath to disambiguate from a custom parameter)? Maybe some more complex descriptor type with all of the potentially interesting data?

Correlating data between multiple attributes is awkward. Maybe a [AnyKeyPath: Property] dictionary would work better? Does that even work?
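On the narrow question of whether a dictionary keyed by AnyKeyPath works at all: key paths are Hashable in today's Swift, and two key paths formed separately for the same property compare equal, so they can be used to merge records. The sketch below checks just that with a plain struct. `Model` and `PropertyInfo` are hypothetical stand-ins, and it does not settle whether the key paths handed to attachedTo: initializers for a wrapped property would line up the same way.

```swift
struct Model {
    var id = 0
    var score = 0
}

// Hypothetical, trimmed-down stand-in for the Property struct above.
struct PropertyInfo {
    var indexed = false
    var primary = false
}

var table: [AnyKeyPath: PropertyInfo] = [:]

let idPath: AnyKeyPath = \Model.id
let scorePath: AnyKeyPath = \Model.score

// Two different "attributes" record facts about their properties.
table[idPath, default: PropertyInfo()].primary = true
table[scorePath, default: PropertyInfo()].indexed = true

// A key path created separately for the same property hits the same entry.
let idPathAgain: AnyKeyPath = \Model.id
table[idPathAgain, default: PropertyInfo()].indexed = true

print(table.count)
```

With this shape each attribute can contribute its piece independently, without relying on the array index that the implementation above threads through every closure.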

The part of this that doesn't work is that I pretended that Persisted is @propertyWrapper @runtimeMetadata struct Persisted { ... }, i.e. both a property wrapper and a runtime metadata attribute and that this would somehow actually work. We currently get the ivar offset of @Persisted members from the obj-c runtime and then do pointer math, and obviously I'd much rather use a KeyPath. However, we don't want to have to make users declare properties as @Foo @Persisted var value: Int, and instead would maybe want each use of the Persisted property wrapper to be implicitly annotated with a metadata attribute.

xedin · December 14, 2022, 9:37pm

The declaration the attribute is associated with might be less available than the attribute itself. In this case we'll still produce a generator, but it would return nil when the underlying type is unavailable.

hborla · December 14, 2022, 9:38pm

We could also choose to flatten the array in the runtime query so it just produces [T]. The downside is you won't know whether there were any unavailable metadata attributes at runtime, but maybe that's okay.
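As a small illustration of that trade-off, a caller can already do the flattening on their side with compactMap, and can count the nil entries first if they care. The `queried` array below is a made-up stand-in for a result of type [T?]; it is not produced by any real API call.

```swift
// Made-up stand-in for a [T?] query result with one unavailable entry.
let queried: [String?] = ["A", nil, "B"]

// Flattening drops the nil entries...
let flattened = queried.compactMap { $0 }

// ...so the number of unavailable attributes is only recoverable before flattening.
let unavailableCount = queried.count - flattened.count

print(flattened, unavailableCount)
```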

xedin · December 14, 2022, 9:39pm

You can use (attachedTo: ..., name: String = #function), it should work.

John_McCall · December 15, 2022, 2:56am

I'm not sure I understand this point. Swift already has several kinds of custom attributes, and that is not being wholly reinvented here. This kind of attribute is specifically for declaring something that requires runtime discovery.

Jon_Shier · December 15, 2022, 2:57am

Of course they're not fleeing, the vast majority of them are just users and so have no influence over how those features are implemented. The alternative to not using attributes for Unity is... not using Unity (or at least not using it in a supported manner). But those runtime attributes are a huge drain on Unity's runtime performance, and like all attributes, lead to poor developer experience.

I don't see how this would change. Similar to @davedelong's example, you would just express the key difference with a different attribute. Your example is also pretty poor given Codable has many long standing UX issues that could just be fixed rather than needing a whole runtime system to make usable. It seems like your problem could be solved by improvements to Codable, improvements to property wrappers, or another compile time feature. It doesn't seem like this problem needs runtime involvement at all.

This doesn't really make sense to me. If you insist on having all of the metadata exposed, which is not a given, you can do so in a structured way at compile time. There's nothing requiring compile time constructs to be as awkward as Codable.

And in fact, composition issues can be mitigated by moving metadata around, since it then becomes possible to define away the sharp edges of the composition in the first place. Besides which, I was asking how the pitch's actual feature composes with other actual Swift features, which is important to define, so it wasn't a general question with a general answer. There is also the question of how these things interact at runtime, but that's a problem for attribute implementors, as there's little a general runtime system can do to limit those issues.

Would there be a way to see the backing storage? Is visibility impacted by the order of attributes? It seems like people would want to change behavior based on whether a property is wrapped by another type.

ktoso · December 15, 2022, 3:12am

It does, but not to the same extent really. I can't decide to make up a:

/// A lot of docs here
func oldImpl() {}
@sameDocsAs(oldImpl) func newImpl() {} 

// or a compile-time serialization system, that needs user assistance:
// these can be kind of done with a property wrapper but feels off
// @Tag(1) var field: String
// @Tag(2) var age: Int

in today's Swift. The attribute here being just an example, but I can imagine other cases - there's examples in the Java ecosystem we can look at if necessary (typical examples are things like OpenAPI/Swagger source generators). There are forms of custom attributes in Swift today, but they all have some meaning to Swift itself. One notable custom attribute type for example is global actors, but they carry specific meaning to the typechecker and runtime.

The attributes pitched here seem to be more usable by library/tool developers though and, be it at runtime or compile time, don't really matter to Swift itself in any way. That allows their use in various other cases that today's "custom attributes" in Swift don't facilitate.

I'm not aware of a way to make arbitrary custom attributes by library developers in Swift today, unless I'm missing something. There's type wrappers, property wrappers, global actors, but they all have specific meanings to Swift really, not just as a way to associate some meta-information with a field/type/method. Sendable is also an @Attribute but we can't declare such a thing ourselves as library authors, right?

That said, for myself and server and library use cases I'm thinking about, it's not a big deal; but I'm surprised that while "opening up attributes to anyone" we didn't think about ways to not store runtime metadata while at it -- using the same mechanism, as Java does, there's no separate different way to declare "runtime" vs "class" vs "source" retention, it's just a custom annotation, with a scope defined.

It might also be worth checking what source retention is being used for in the real world. A quick GitHub search reveals a lot of use, but I've not dug very deep into them. I can do so if that'd be helpful.

On that same thread... I wonder, IF, we had such source retention, couldn't @available and friends eventually become "normal" attributes, rather their own weird custom thing...? But perhaps that ship has sailed already, and they're way too custom with their own little grammar even.

Consolidating these under the same rules as normal Swift would be nice, but perhaps impossible by now.

To reiterate though: I don't think this is a show stopper nor my primary question here; just something that hit me as being not quite there, when comparing our many ad-hoc compile time attributes plus these runtime attributes with the Java annotations, which handle both cases with the retention policy instead.

Alejandro · December 15, 2022, 10:21pm

Currently what's implemented is the type of the instance that this is on. I.e.

struct A {
  @CustomRuntimeAttribute
  var x: String
}

In this case I can actually provide you a value of (String.self, CustomRuntimeAttribute?), so the API could be edited to look like [(Any.Type, T?)] (or something wrapped in a struct with a proper name).

Yes, querying in the other direction is something a lot of folks have been asking for and I imagine we'll want to alter the implementation a bit to allow for efficient querying the other way. I would love if this interacted with the proposed API in the reflection pitch to allow for say Type.attributes or Field.attributes to get all the attributes on a specific type or a specific field.

These attribute initializers are only called when you request them via the getAllInstances API. Also, only the specific attribute's initializers are called. For example, if you have 2 custom runtime attributes @A and @B and you ask for @B 's instances, we'll only initialize those attributes whose type is B and not any of the A ones.
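The laziness described here can be modeled in ordinary Swift as a table of generator closures keyed by attribute type, where nothing runs until a query asks for that type. This is only a toy model of the behavior, not the actual implementation: `GeneratorTable`, `register`, and `allInstances(of:)` are hypothetical names, and the real feature emits its records at compile time rather than registering closures by hand.

```swift
struct A { let id: Int }
struct B { let id: Int }

// Toy model: generators are stored per attribute type and only run on request.
struct GeneratorTable {
    private var generators: [ObjectIdentifier: [() -> Any]] = [:]

    mutating func register<T>(_ type: T.Type, _ make: @escaping () -> T) {
        let erased: () -> Any = { make() }
        generators[ObjectIdentifier(type), default: []].append(erased)
    }

    func allInstances<T>(of type: T.Type) -> [T] {
        let makers = generators[ObjectIdentifier(type)] ?? []
        return makers.compactMap { $0() as? T }
    }
}

func runDemo() -> (aInits: Int, bInits: Int, bCount: Int) {
    var aInits = 0
    var bInits = 0
    var table = GeneratorTable()
    table.register(A.self) { aInits += 1; return A(id: 1) }
    table.register(B.self) { bInits += 1; return B(id: 2) }

    // Only B's generators run; A's initializer is never invoked.
    let bs = table.allInstances(of: B.self)
    return (aInits, bInits, bs.count)
}

let result = runDemo()
print(result)
```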

The getAllInstances method is provided for you via the proposed Reflection module ([Pitch] Reflection). You are expected to implement your own attachedTo: inits to constrain what type of declarations your attribute can be applied to.

So we do both. On platforms like Linux and Windows, we need to scan each section every time for every attribute. On platforms like Darwin, we can scan each section once, cache the results, and for new images loaded at runtime we'll be notified of new sections and only have to scan those and cache the results.

Yes, the API would return [A, B, C] and if you only wanted to consume C you'd need to do bookkeeping to figure what you've already seen.
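A minimal sketch of that bookkeeping, assuming each instance can be mapped to some stable Hashable identifier (for example a type name or key path). `SeenFilter` is a hypothetical helper, and the string arrays stand in for successive query results before and after a dylib is loaded.

```swift
// Hypothetical helper that remembers which identifiers were already processed.
struct SeenFilter<ID: Hashable> {
    private var seen: Set<ID> = []

    // Returns only the items whose identifier has not been seen before.
    mutating func newItems<T>(in all: [T], id: (T) -> ID) -> [T] {
        var fresh: [T] = []
        for item in all {
            if seen.insert(id(item)).inserted {
                fresh.append(item)
            }
        }
        return fresh
    }
}

var filter = SeenFilter<String>()

// First query returns [A, B]; a later query, after a dylib load, returns [A, B, C].
let firstPass = filter.newItems(in: ["A", "B"]) { $0 }
let secondPass = filter.newItems(in: ["A", "B", "C"]) { $0 }

print(firstPass, secondPass)
```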

Alejandro_Martinez · December 16, 2022, 4:47pm

I guess this answers my question

You folks know way more than me, but are we sure it wouldn't be nice to design these metadata attributes with some plan to be able to access them at compile time with some comptime system? The Java libraries I know that used pervasive annotations switched to using them at compile time, so it feels a bit backwards that Swift is not thinking about that when introducing custom attributes.

I like @ktoso's overall thoughts on this.

John_McCall · December 16, 2022, 5:21pm

We can certainly add static metadata attributes. They’d currently only be useful for external source tools, simply because there’s nothing in the language which can ask arbitrary static questions about declarations. That’s a very likely future direction for procedural macros, though.

Maybe this confusion is coming from the lack of “runtime” in the thread title.

Joe_Groff · December 16, 2022, 6:06pm

As I mentioned upthread, there does seem to be some degree of functionality here that could be consumed at compile time; a module ought to be able to ask itself for the full set of metadata records for its own declarations and get an answer at compile time.

John_McCall · December 16, 2022, 6:12pm

How would it get that answer at compile time, though? Just an assurance that we can constant-fold some particular function call?