Decoding codable classes constrained by a protocol

hgraves · February 22, 2024, 6:55am

Hey everyone,

I have a couple of classes, and some of them (but not all) declare the same properties, so they are indistinguishable from one another once encoded. All of the classes share the same base class, and all of them implement an empty protocol:

protocol MyDecodable: AnyObject, Codable {}

class MyBaseClass: MyDecodable {
    var identifier: UUID {
        UUID()
    }
}
class MyClassOne: MyBaseClass {
    var myString: String?
    var myStrings: [String] = [String]()

}

class MyClassTwo: MyBaseClass {
    var myString: String?
    var myStrings: [String] = [String]()
}

class MyClassThree: MyBaseClass {
    let myString: String
    let myInts: [Int]
    init(myString: String, myInts: [Int]) {
        self.myString = myString
        self.myInts = myInts
        super.init()
    }
    enum MyClassThreeKeys: CodingKey {
        case myString, myInts
    }
    
    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: MyClassThreeKeys.self)
        self.myString = try container.decode(String.self, forKey: .myString)
        self.myInts = try container.decode([Int].self, forKey: .myInts)
        try super.init(from: decoder)
    }
}

I notice that when I have some data which can be decoded, I am having trouble with the objects once they are initialized:

    func test_MyClasses() throws {

        var dict = [String: Any]()

        let firstObj = MyClassOne()
        firstObj.myString = "firstObjProp"
        firstObj.myStrings = ["firstObjProp", "sasdasf", "asdasdasdasd", "adsasdasdasdasdasd"]
        let encoder = JSONEncoder()
        let data = try encoder.encode(firstObj)
        let decoder = JSONDecoder()
        var count: UInt32 = 0
        let classListPtr = objc_copyClassList(&count)
        UnsafeBufferPointer(
          start: classListPtr, count: Int(count)
        ).compactMap({
            if let cod = $0 as? MyDecodable.Type {
                do {
                    let result = try decoder.decode(cod, from: data)

                    if let description = (result as AnyObject).description?.components(separatedBy: ".").last {
                        dict[description] = result
                    }
                } catch {
                    //print(error)
                }
            }
        })
        print(dict.description)
    }
}

I am seeing that inside of dict I have an object of each class type, MyClassOne and MyClassTwo as well as MyBaseClass. Having these objects makes sense because the data from encoding MyClassOne should be able to be decoded as either of the other two.

However, the properties are empty. It seems like the properties are not being initialized with the object.

Can you help me figure out what I have overlooked? I was expecting that I can have my properties set from decoding the data.

Note:
It kind of looks like the synthesized initializer that is being chosen (during the decoding) is the one that corresponds with init() rather than init(myString: String, myStrings: [String]). Would that be the expected behavior for this scenario?

tera · February 22, 2024, 12:45pm

Check your JSON after the encoding step...

Distilling the issue:

class Base1: Codable { var x = 1 }
class Derived1: Base1 { var y = 2 }

class Base2 { var x = 1 }
class Derived2: Base2, Codable { var y = 2 }

class Base3: Codable { var x = 1 }
//class Derived3: Base3, Codable { var y = 2 } // 🛑 Redundant conformance of 'Derived3' to protocol 'Decodable'

var data = try! JSONEncoder().encode(Derived1())
print(String(data: data, encoding: .utf8)!) // {"x":1}
data = try! JSONEncoder().encode(Derived2())
print(String(data: data, encoding: .utf8)!) // {"y":2}

This does look quite dismal: if you state Codable conformance in the base class, you can't restate it in the derived class, and the derived class's own properties are in effect not coded. Or, if you state Codable conformance only in the derived class, the base class's properties are not coded.

In your particular case (base class has no fields to encode) just move the Codable conformance from the base class to derived classes.

itaiferber · February 22, 2024, 1:46pm

The behavior you're seeing here is that derived classes don't have protocol conformances re-synthesized by the compiler — i.e., your subtypes inherit encode(to:) directly from your base class (and never encode their properties), and both MyClassOne and MyClassTwo inherit init(from:) from your base class (and would never attempt to decode those properties, if they were present); MyClassThree doesn't have this latter issue, because it overrides init(from:) to decode the properties itself (though it does have a different problem*).

See:

You'll either need to

Override encode(to:) and init(from:) on your derived classes to encode and decode the properties they expect, or
If you never expect to encode or decode your base class, do as @tera suggests and move the conformances down into your derived types. This does, however, mean that your base class wouldn't be able to conform to your protocol, which might have some lead-on effects
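The first option can be sketched roughly as follows. Animal and Dog are hypothetical stand-ins for the classes in the thread, and this version passes the same decoder up to the base class, which only works here because the two classes both use keyed containers and their key names don't collide:

```swift
import Foundation

// Hypothetical base class: Codable is synthesized here, for `legs` only.
class Animal: Codable {
    var legs = 4
}

// The subclass inherits the conformance but must code its own properties by hand.
class Dog: Animal {
    var name = "Rex"

    private enum DogKeys: String, CodingKey { case name }

    // Declaring init(from:) stops init() from being inherited, so restate it.
    override init() { super.init() }

    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: DogKeys.self)
        name = try container.decode(String.self, forKey: .name)
        // Assumption: sharing the decoder is acceptable because the keys differ.
        try super.init(from: decoder)
    }

    override func encode(to encoder: Encoder) throws {
        try super.encode(to: encoder)
        var container = encoder.container(keyedBy: DogKeys.self)
        try container.encode(name, forKey: .name)
    }
}

let dog = Dog()
dog.name = "Fido"
dog.legs = 3

// Round trip: both the subclass and base class properties should survive.
let dogData = try JSONEncoder().encode(dog)
let dogCopy = try JSONDecoder().decode(Dog.self, from: dogData)
print(dogCopy.name, dogCopy.legs)
```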

*Inside of an init(from:) implementation, calling super.init(from: decoder) is generally not recommended, unless you have very tight coupling between the parent and child class and have special knowledge in the child class about the parent class's implementation, as both the parent and child class attempt to decode out of the same decoding container.

This can be an issue because:

A parent and child class must then use the same container type (e.g., both must be keyed or unkeyed identically), and
If the child and parent class share a field with the same name, the parent class's handling of that field will override the child's

In the general case, it's recommended that you fetch a superEncoder(forKey:)/superDecoder(forKey:) to sequester the parent class into a nested container, then pass that along to the parent.
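A minimal sketch of that nesting, using hypothetical Vehicle and Car classes and a made-up "base" key. The child deliberately reuses the JSON key "name" to show that it no longer collides with the parent's field of the same name:

```swift
import Foundation

// Hypothetical base class with a synthesized conformance.
class Vehicle: Codable {
    var name = "vehicle"
}

class Car: Vehicle {
    var model = "unknown"

    private enum CarKeys: String, CodingKey {
        case model = "name"   // same JSON key as Vehicle.name, on purpose
        case base             // assumed key for the nested parent container
    }

    override init() { super.init() }

    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CarKeys.self)
        model = try container.decode(String.self, forKey: .model)
        // The parent decodes from its own nested container, not the shared one.
        try super.init(from: container.superDecoder(forKey: .base))
    }

    override func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CarKeys.self)
        try container.encode(model, forKey: .model)
        // The parent encodes into the nested container under "base".
        try super.encode(to: container.superEncoder(forKey: .base))
    }
}

let car = Car()
car.name = "fleet-7"
car.model = "hatchback"

let carData = try JSONEncoder().encode(car)
let carCopy = try JSONDecoder().decode(Car.self, from: carData)
print(carCopy.name, carCopy.model)
```

The cost is a slightly deeper JSON structure; in exchange, the parent is free to use any container type and any key names.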

hgraves · February 22, 2024, 3:51pm

In your particular case (base class has no fields to encode) just move the Codable conformance from the base class to derived classes.

If I do that can I still try to decode via protocol conformance? I don’t think I can because when I tried with Codable the compiler started yelling. It also picked up a bunch of other classes that conformed to Codable which defeated the intention of the code.

Ah ok. That’s too bad. It looks like I may not be able to do what I was hoping to do. Unless I’m missing something. I’ll need to explicitly decode each type manually in a switch statement or something.

Edit: Also just a heads up but even if I do get them to decode correctly I still have a dilemma of needing to unwrap them or something like that once they’re in the storage (the dict). Kind of a sorry state of affairs on this one but it makes sense I guess.

tera · February 22, 2024, 4:01pm

That would be a breaking change, but wouldn't it be fairer to not propagate conformances to derived classes? If derived classes want those conformances they could add them explicitly.

Example of current non-intuitive behaviour:

class Fruit: Equatable {
    var weight: Int
    init(weight: Int) { self.weight = weight }
    static func == (lhs: Fruit, rhs: Fruit) -> Bool {
        lhs.weight == rhs.weight
    }
}
class Apple: Fruit {
    enum Colour { case red, yellow, green }
    var colour: Colour
    init(weight: Int, colour: Colour) {
        self.colour = colour
        super.init(weight: weight)
    }
}
let 🍎 = Apple(weight: 1, colour: .red)
let 🍏 = Apple(weight: 1, colour: .green)
precondition(🍎 is any Equatable) // ✅ we can compare apples!
precondition(🍎 != 🍏)            // ❌ oops

wadetregaskis · February 22, 2024, 4:03pm

That breaks the "is a" relationship that's fundamental to classes, though. Anything a parent class can do, a subclass can do. Subclasses extend their parent classes, they don't constrict them.

tera · February 22, 2024, 4:22pm

ok, alternatively:

class Apple: Fruit {
    // 🛑 Error: Must provide its own Equatable
}

itaiferber · February 22, 2024, 4:22pm

This isn't something you can do regardless, for exactly the reason you mention: you can't attempt to decode MyDecodable.self because the compiler needs to know which concrete type to call init(from:) on. Many types can adopt the protocol, so there's no right answer to choose here; you need to specify this statically.

This is typically the right approach, and the solution to the above — you need to have some information in the encoded representation that you can switch on to know which type to decode exactly.

tera · February 22, 2024, 4:26pm

Or you can make each class "unique":

class One {
    var magicOne = 1
    var x = 42
}
class Two {
    var magicTwo = 1
    var x = 24
}

hgraves · February 22, 2024, 5:16pm

for exactly the reason you mention: you can't attempt to decode MyDecodable.self because the compiler needs to know which concrete type to call init(from:) on.

Quick question though - given the above statements I have quoted, why is it that the objects that could decode the data each got initialized and added to dict? Sorry in advance if this was mentioned already or I missed something!

It looks almost like the compiler had enough information to compile because the protocol conformed to Codable.
By this I mean that MyDecodable.Type was totally valid at compile time. And then at run time the underlying type seems to have been selected (MyClassOne, MyClassTwo, etc)

Is the idea that this is merely coincidental, because those classes had not overridden anything in their adoption of Codable? Because, as you mentioned earlier, the conformance was not re-synthesized for those classes?

In any case, as you mentioned, it seems I can’t do this without specifying the concrete type explicitly.

I imagine that I can do something like, add the properties to the protocol, and then when I decode, they’ll be there on the object? Hopefully maybe?

itaiferber · February 22, 2024, 6:45pm

Great question!

Your code works the way that it does because the cod value in decoder.decode(cod, from: data) represents a concrete (specific) type. objc_copyClassList returns to you a list of all classes known to the Objective-C runtime, as objects; within the compactMap, $0 is a metatype, or an object that dynamically represents a type. At the callsite, there is enough type information to make the call: you are passing in a (1) concrete type, which (2) is known to conform to Decodable.

Effectively, the code is attempting every known MyDecodable type: decoder.decode(MyBaseClass.self, ...), decoder.decode(MyClassOne.self, ...), etc. Note that more than one succeeds in decoding, because the structure of the objects in the data matches more than one of the types.

However, you can't write something like try decoder.decode(MyDecodable.self, from: data) to achieve the same thing, because there isn't enough information to go off of: the compiler can't try every possible class like you are, because (besides being horribly wasteful) what if there's more than one result?
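The difference can be illustrated without the Objective-C runtime. This is only a sketch: Payload, Ping and Note are hypothetical names, the candidate list is written out by hand, and it assumes Swift 5.7 or later, where a metatype value such as `any Payload.Type` is implicitly opened to the concrete type it holds:

```swift
import Foundation

// Hypothetical protocol and conforming types.
protocol Payload: Codable {}
struct Ping: Payload { var id: Int }
struct Note: Payload { var text: String }

// Each element is a metatype value that stands for one concrete type.
let candidates: [any Payload.Type] = [Ping.self, Note.self]
let payloadData = Data(#"{"id": 7}"#.utf8)
let payloadDecoder = JSONDecoder()

var matches: [String] = []
for candidate in candidates {
    // Works: `candidate` carries a concrete type at run time.
    if let value = try? payloadDecoder.decode(candidate, from: payloadData) {
        matches.append(String(describing: type(of: value)))
    }
}
print(matches)

// Does not compile: the protocol itself names no concrete type.
// let value = try payloadDecoder.decode(Payload.self, from: payloadData)
```

Only Ping decodes here because Note requires a "text" key. If both types had the same shape, both would succeed, which is the ambiguity described above.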

There's also the matter of MyDecodable requiring AnyObject conformance, and the usage of the Objective-C runtime to list classes. On Darwin platforms (macOS, iOS, etc.), Swift classes are inherently also Objective-C classes, so you can look them up by calling objc_copyClassList; but:

On platforms which don't have Objective-C support (e.g., Linux and Windows), this type of lookup isn't possible
Even if you were to drop the AnyObject requirement, you still wouldn't be able to pick up any struct or enum types that conformed to MyDecodable, only class ones

Theoretically, if you can guarantee that neither of these things will ever be an issue (you never plan to support non-Darwin platforms, and you never plan on having non-object types conform to MyDecodable), then you can keep doing what you're doing here — but as the number of types conforming to MyDecodable grows, this becomes increasingly inefficient: you will be trying every single type on every single decode call (and then possibly trying to figure out what to do if there's more than one successful decode).

Instead, if you're in control of the encoding/decoding format, it's a much better idea to insert some form of marker into your data to indicate what you should do. For example:

[
    { "variant": 1, // decode `data` as `MyClassOne`
      "data": { ... } },
    { "variant": 3, // decode `data` as `MyClassThree`
      "data": { ... } },
]

This is just a naive example of how to include that information, but having some form of reliable way to look at the data and figure out "what concrete type do I decode?" is ideal.
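One possible way to consume such a format is an enum whose init(from:) reads the marker first and then decodes the payload as the matching type. The names VariantOne, VariantThree and AnyVariant are hypothetical, as are the "variant" and "data" keys, which simply mirror the JSON above:

```swift
import Foundation

// Hypothetical payload types; the real ones would be your own classes.
struct VariantOne: Codable { var myString: String }
struct VariantThree: Codable { var myInts: [Int] }

// Wrapper that switches on the marker to pick the concrete type.
enum AnyVariant: Decodable {
    case one(VariantOne)
    case three(VariantThree)

    private enum Keys: String, CodingKey { case variant, data }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: Keys.self)
        switch try container.decode(Int.self, forKey: .variant) {
        case 1:
            self = .one(try container.decode(VariantOne.self, forKey: .data))
        case 3:
            self = .three(try container.decode(VariantThree.self, forKey: .data))
        case let other:
            throw DecodingError.dataCorruptedError(
                forKey: .variant, in: container,
                debugDescription: "Unknown variant \(other)")
        }
    }
}

let variantJSON = Data("""
[
  { "variant": 1, "data": { "myString": "hello" } },
  { "variant": 3, "data": { "myInts": [1, 2, 3] } }
]
""".utf8)

let items = try JSONDecoder().decode([AnyVariant].self, from: variantJSON)
for item in items {
    switch item {
    case .one(let value): print("one:", value.myString)
    case .three(let value): print("three:", value.myInts)
    }
}
```

Because the results are enum cases rather than values in a [String: Any] dictionary, reading them back is an exhaustive switch instead of a series of casts.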

The specifics will depend on your actual use-case.