Calling through original implementations of autogenerated methods

tera · April 1, 2024, 4:50am

I have an idea for how Swift could be minimally augmented to let user types call through to the original implementations of autogenerated methods. Currently it's "all or nothing": if I override a method to make a minute change on top of what the compiler synthesises, I can't access the synthesised implementation and have to recreate everything from scratch.

I will use Hashable protocol here as an example, but it's applicable to anything else.
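To make the "all or nothing" point concrete, here is a minimal sketch of the situation in today's Swift. The type `Account` and its fields are made up for illustration; the only point is that a custom `hash(into:)` has to list every stored property again.

```swift
// Hypothetical type for illustration; only the hashing behaviour matters.
struct Account: Hashable {
    var id: Int
    var name: String

    // A custom hash(into:) replaces the synthesized one entirely, so every
    // stored property has to be combined again by hand.
    func hash(into hasher: inout Hasher) {
        print("hashing Account")  // the one "minute change" we wanted
        hasher.combine(id)
        hasher.combine(name)
    }
    // == is still synthesized, because it is not overridden here.
}

let a = Account(id: 1, name: "alice")
let b = Account(id: 1, name: "alice")
print(a == b, a.hashValue == b.hashValue)
```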

When I make my type Hashable and the compiler can synthesise "hash(into:)", the proposal is that it would generate this:

struct T: Hashable {
    /*autogenerated*/ func auto_synthesized_hash(into hasher: inout Hasher) {
        // original implementation
    }
    /*autogenerated*/ func hash(into hasher: inout Hasher) {
        auto_synthesized_hash(into: &hasher)
    }
}

If the type overrides hash, it could call through to the synthesized method:

extension T {
    /* user defined "override" */
    func hash(into hasher: inout Hasher) {
        print("calling synthesized")
        auto_synthesized_hash(into: &hasher)
        hasher.combine(extraField)
    }
}

In this case there will be no "autogenerated" func hash.

If the type overrides hash and doesn't call the synthesized hash method anywhere, the synthesized hash method could be stripped.
If the type doesn't override hash, the linker inlines the auto_synthesized_hash body into func hash.

A variation of this idea (albeit one that would require more changes to the language / compiler) is to introduce a keyword similar to super:

    /* user defined "override" */
    func hash(into hasher: inout Hasher) {
        print("calling synthesized")
        auto_synthesized.hash(into: &hasher)
        hasher.combine(extraField)
    }

Is this idea worth exploring?

jeremyabannister · April 1, 2024, 11:43am

I just ran into this yesterday during my continued exploration of improving the ergonomics of serialization. The lack of this feature has pushed me into using macros, which for various reasons I prefer to avoid whenever possible. I would be very happy to see something like this pitched and explored. +1

Pippin · April 1, 2024, 4:15pm

I like the feature, but I feel like the experience would be an annoying game of whack-a-mole, figuring out which protocols have automatic synthesis until you read the documentation or gain enough experience. If this were to exist, I wish there were some attribute on these compiler-supported protocol requirements that could tell us whether they have automatic synthesis.

For example:

// in Swift
public protocol Hashable {

    // ...

    @synthesized
    func hash(into hasher: inout Hasher)

}

Lancelotbronner · April 1, 2024, 5:52pm

I love the idea, it's definitely something I ran into once or twice. I'll throw in another syntax suggestion to the mix:

/* user override */
func hash(into hasher: inout Hasher) {
    print("calling synthesized")
    $hash(into: &hasher)
    hasher.combine(extraField)
}

Since the dollar sign is already reserved for compiler-generated identifiers I think it could fit the job.

Both other options are also fine with me, though I feel like auto isn't necessary because synthesized conveys enough meaning on its own.

synthesized.hash(into: &hasher)

In conclusion for me: excellent idea!

tera · April 2, 2024, 6:52pm

There's one more thing to close the loophole that'd still exist:

class C { /* ... */ }

struct S {
    var x001: Int
    var x002: String
    // ...
    var x999: Double
    var c: C
}

extension S: Hashable, Equatable {} // 🛑 Stored property type 'C' does not conform to protocol 'Equatable', preventing synthesized conformance of 'S' to 'Equatable'

(yes, Hashable does imply Equatable, but I am writing Equatable explicitly deliberately.)

Currently you'd need to either conform C to Equatable / Hashable, or (if that's undesired) recreate the EQ/hash conformances from scratch, which is not ideal, especially for types with many fields.

Possible solutions:

"type level" approach:

extension C: !Hashable, !Equatable {} // 🆕

with the meaning: "exclude" values of type C from automatically generated Hashable & Equatable conformances.

"declaration level" approach:

struct S {
    var x001: Int
    var x002: String
    // ...
    var x999: Double
    @exclude (Hashable, Equatable) var c: C // 🆕
}

with the meaning: "exclude" this specific value from automatically generated Hashable & Equatable conformances.

With such "opt-outs" it would be possible to use the autogenerated EQ/hash conformances:

extension S: Hashable, Equatable {} // ✅

(Here the value c will be ignored in EQ/hash.) Or extend the autogenerated conformances:

extension S: Hashable {
    func hash(into hasher: inout Hasher) {
        synthesized.hash(into: &hasher)
        hasher.combine(ObjectIdentifier(c))
    }
}
extension S: Equatable {
    static func == (lhs: Self, rhs: Self) -> Bool {
        (synthesized.==)(lhs, rhs) &&
            ObjectIdentifier(lhs.c) == ObjectIdentifier(rhs.c)
    }
}
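For comparison, the "declaration level" opt-out can be roughly approximated in today's Swift with a property wrapper whose values always compare equal and contribute nothing to the hash. This is only a sketch: `HashExcluded` is a made-up name, not an existing API, and unlike the proposed `@exclude` it cannot be limited to specific protocols.

```swift
// Hypothetical wrapper: every wrapped value compares equal and adds nothing
// to the hash, so synthesized conformances effectively skip the property.
@propertyWrapper
struct HashExcluded<Value>: Hashable {
    var wrappedValue: Value

    init(wrappedValue: Value) {
        self.wrappedValue = wrappedValue
    }

    static func == (lhs: Self, rhs: Self) -> Bool { true }
    func hash(into hasher: inout Hasher) {}
}

final class C {}

struct S: Hashable {
    var x001: Int
    var x002: String
    var x999: Double
    @HashExcluded var c: C  // ignored by the synthesized == and hash(into:)
}

let s1 = S(x001: 1, x002: "2", x999: 999, c: C())
let s2 = S(x001: 1, x002: "2", x999: 999, c: C())
print(s1 == s2)
```

This covers only the "ignore c" case. Extending the synthesized conformance, for example by also comparing the identity of c, still requires writing == and hash(into:) by hand.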

jeremyabannister · April 2, 2024, 7:21pm

If this already existed I think I would consider it a code-smell if I ever found myself reaching for it and would try to refactor to avoid it.

tera · April 2, 2024, 7:37pm

This is what I'd do in today's Swift:

class C { /* ... */ }

struct S {
    struct Fields: Equatable, Hashable {
        var x001: Int
        var x002: String
        // ...
        var x999: Double
    }
    var fields: Fields
    var c: C
}

extension S: Equatable {
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.fields == rhs.fields && lhs.c === rhs.c
    }
}

extension S: Hashable {
    func hash(into hasher: inout Hasher) {
        hasher.combine(fields)
        hasher.combine(ObjectIdentifier(c))
    }
}

It's not so bad, although the use site becomes:

let s = S(fields: S.Fields(x001: 1, x002: "2", x999: 999), c: C())
print(s.fields.x001)

instead of a simpler:

let s = S(x001: 1, x002: "2", x999: 999, c: C())
print(s.x001)

which, in turn, could be solved by introducing another, more convenient initialiser (for construction) and a dynamic member lookup (for deconstruction), but the latter is not so good performance-wise and both are one step too cumbersome.
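A minimal sketch of that workaround, assuming the same S / Fields layout as above (with the elided fields left out). The flat initialiser and the key-path subscript are the only additions; it compiles in today's Swift but is shown as an illustration rather than a recommendation.

```swift
final class C {}

@dynamicMemberLookup
struct S {
    struct Fields: Equatable, Hashable {
        var x001: Int
        var x002: String
        var x999: Double
    }
    var fields: Fields
    var c: C

    // Convenience initialiser that restores the flat construction syntax.
    init(x001: Int, x002: String, x999: Double, c: C) {
        self.fields = Fields(x001: x001, x002: x002, x999: x999)
        self.c = c
    }

    // Forwards s.x001 to s.fields.x001 through a key path.
    subscript<T>(dynamicMember keyPath: WritableKeyPath<Fields, T>) -> T {
        get { fields[keyPath: keyPath] }
        set { fields[keyPath: keyPath] = newValue }
    }
}

extension S: Equatable {
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.fields == rhs.fields && lhs.c === rhs.c
    }
}

extension S: Hashable {
    func hash(into hasher: inout Hasher) {
        hasher.combine(fields)
        hasher.combine(ObjectIdentifier(c))
    }
}

let shared = C()
let s = S(x001: 1, x002: "2", x999: 999, c: shared)
print(s.x001)
```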

jeremyabannister · April 2, 2024, 7:42pm

I’m suggesting that if a type that conceptually should be Hashable suddenly acquires a definitively non-Hashable property, I expect it to indicate a deeper architectural flaw and I would search for that, rather than trying to shoehorn it into working as-is.

tera · April 2, 2024, 7:47pm

Got you. I've seen it though: e.g. it could be a cache field or something similar, a field that should not affect the semantics of the type and must be ignored during EQ/hash calculations. Array's capacity field would be a semi-appropriate example (this field is hashable though, so it's not an ideal example, although it showcases that approach #2 above is superior to, and more flexible than, approach #1).

tera · April 2, 2024, 8:03pm

Reminded me of this great idea of @itaiferber.

When I conform this type:

struct S {
    var count: Int
    var contents: Data
    var capacity: Int
}

to Hashable / Equatable:

extension S: Equatable, Hashable {}

this is what I am getting by default:

// autogenerated:
extension S {
    static func == (lhs: Self, rhs: Self) -> Bool {
        (lhs.count, lhs.contents, lhs.capacity) == (rhs.count, rhs.contents, rhs.capacity)
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(count)
        hasher.combine(contents)
        hasher.combine(capacity)
    }
}

Unless I redefine "the essence" of the type:

extension S {
    // pseudocode
    var essence = (count, contents) // (count, contents, capacity) by default
}

in which case Equatable, Hashable will use essence:

// autogenerated:
extension S {
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.essence == rhs.essence
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(essence)
    }
}

In fact, essence would be used in the previous example as well; it would just be the autogenerated essence field when it's not customised.

The followup question here is: is "essence" specific to EQ/Hashable? What about other things like Codable, should there be another "essence" field for it?

Another issue that springs to mind with the "essence" approach is that it would be error-prone (you define a type and its essence, and a few months later you add a new field to the type and forget to update the essence).
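The "essence" idea can be emulated by hand in today's Swift, which also makes the maintenance risk visible. In this sketch the nested `Essence` struct is an assumption standing in for the pseudocode tuple (tuples cannot conform to Hashable), and == and hash(into:) are written manually instead of being synthesized from it.

```swift
import Foundation

struct S: Hashable {
    var count: Int
    var contents: Data
    var capacity: Int

    // Hand-written stand-in for the proposed "essence": only these fields
    // take part in equality and hashing. capacity is deliberately left out.
    struct Essence: Hashable {
        var count: Int
        var contents: Data
    }
    var essence: Essence { Essence(count: count, contents: contents) }

    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.essence == rhs.essence
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(essence)
    }
}

let small = S(count: 2, contents: Data([1, 2]), capacity: 4)
let large = S(count: 2, contents: Data([1, 2]), capacity: 64)
print(small == large)
```

A field added to S later takes part in equality and hashing only if Essence is updated too, which is exactly the error-prone step described above.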