Retrieving class type dynamically for decoding

Hi everyone,

I'm trying to decode a wrapper that stores the type of an object (as an enum) and the object itself. The enum has a computed property that returns the class of the object, which works great when initializing the wrapper from the type. But the same approach falls apart during decoding. It's probably easier to understand via code, so here is an example:

enum FruitType: Int, Codable {
    case Apple
    case Cherry

    var treeClass: Tree.Type {
        switch self {
        case .Apple:
            return AppleTree.self
        case .Cherry:
            return CherryTree.self
        }
    }
}

class Tree: Codable {
    required init() { }
}

class AppleTree: Tree { }

class CherryTree: Tree { }

struct FruitTreeWrapper: Codable {
    var fruitType: FruitType
    var tree: Tree

    init(fruitType: FruitType) {
        self.fruitType = fruitType
        self.tree = fruitType.treeClass.init()
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        self.fruitType = try container.decode(FruitType.self, forKey: .fruitType)
            
        // Doing the switch here yields the correct class
        switch fruitType {
        case .Apple:
            self.tree = try container.decode(AppleTree.self, forKey: .fruitType)
        case .Cherry:
            self.tree = try container.decode(CherryTree.self, forKey: .fruitType)
        }
    
        // But trying to retrieve the class dynamically will always yield a Tree object, no matter the fruitType
        self.tree = try container.decode(self.fruitType.treeClass, forKey: .fruitType)
    
    }
}

let fruitTreeWrapper = FruitTreeWrapper(fruitType: .Apple)

if let encodedFruitTreeWrapper = try? JSONEncoder().encode(fruitTreeWrapper),
    let decodedFruitTreeWrapper = try? JSONDecoder().decode(FruitTreeWrapper.self, from: encodedFruitTreeWrapper) {
    print(type(of: fruitTreeWrapper.tree))
    print(type(of: decodedFruitTreeWrapper.tree))
}

If the switch statement that determines the class of the object is placed inside init(from decoder: Decoder), everything works fine. But as soon as I try to retrieve the class via the computed property, I always end up with an object of the parent class's type, in this case a Tree.

Even if doing things that way is a bad idea, shouldn't this code still work? And if not, can anyone explain to me why? I'm just super curious.

Thanks,
Klemens

Interesting question; I would have expected this to work, too.

I played around with your code a little bit, and I believe it comes down to

try container.decode(AppleTree.self, forKey: .fruitType)

giving us an AppleTree, while

try container.decode(AppleTree.self as Tree.Type, forKey: .fruitType)

gives us a base Tree instead. It looks like the static type known to the compiler plays a key role. In decode(_:forKey:), the value of type is still AppleTree, but the generic parameter T is now Tree. So far, so good. However, it seems that the standard library throws the concrete type information away during decoding by passing T.self rather than type:

public func decode<T: Decodable>(_ type: T.Type, forKey key: Key) throws -> T {
    return try _box.decode(T.self, forKey: key)
}

I'm no expert here by any means, but maybe this can be fixed (in the standard library). Maybe someone else can clear this up?

The reason for this is that Tree is the actual type that implements Decodable. The subclasses only inherit the Codable implementation from it, and in this case it only works because all the classes have exactly the same structure. If you had any additional properties in the subclasses, the autosynthesis of the Decodable part would fail, because there is no way for the parent class to know about the properties of the subclasses. So even if you were to manually implement init(from decoder: Decoder) in the subclass, you would not have any keys available for its properties.

I know that this is just a toy example, but do you have a real world example of what you are trying to achieve? Maybe there is a more suitable way to do this, without subtyping.

I would actually say that this is a bug. There are places in JSONDecoder where we took special care to use the dynamic type instead of the static type and I think the feature is useful. I think this is worth a bug report; although this could technically be a breaking change, I can’t imagine reasonably passing in a dynamic type and expecting it to decode the static type.

Here is the commit applying this to Foundation decoders; I think not doing the same in the stdlib was an oversight on my part at the time.

This isn't necessarily true, by the way. If a class inherits an init from a superclass, calling that initializer on a dynamic metatype for the subclass constructs the subclass, not the superclass:

protocol Initable {
    init()
}

class Super: Initable {
    required init() {
        print("\(type(of: self)).init()")
    }
}

class Sub: Super {}

func create<T: Initable>(_ type: T.Type) -> T {
    print("\(T.self) <=> \(type)")
    return type.init()
}

let s1Type: Super.Type = Super.self // Statically Super.Type, dynamically Super.Type
let s1: Super = create(s1Type) // Super <=> Super, Super.init()
print(type(of: s1)) // Super
print("---")

let s2Type: Super.Type = Sub.self // Statically Super.Type, dynamically Sub.Type
let s2: Super = create(s2Type) // Super <=> Sub, Sub.init()
print(type(of: s2)) // Sub

If a subclass inherits an implementation of Decodable from its superclass, calling Sub.init(from:) will construct a Sub, not a Super. The issue is that the decode(_:forKey:) method uses the static supertype (T.self) rather than the dynamic subtype. It should forward type instead:

public func decode<T: Decodable>(_ type: T.Type, forKey key: Key) throws -> T {
    return try _box.decode(type, forKey: key)
}
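In the meantime, one possible workaround in user code is to bypass container.decode(_:forKey:) and call the required initializer on the metatype value directly, handing it a decoder obtained from superDecoder(forKey:). This is only a sketch under a few assumptions: the wrapper declares its own CodingKeys with a separate tree key (the original example read both values from .fruitType), and the decoder is JSONDecoder.

```swift
import Foundation

enum FruitType: Int, Codable {
    case Apple
    case Cherry

    var treeClass: Tree.Type {
        switch self {
        case .Apple: return AppleTree.self
        case .Cherry: return CherryTree.self
        }
    }
}

class Tree: Codable {
    required init() { }
}

class AppleTree: Tree { }
class CherryTree: Tree { }

struct FruitTreeWrapper: Codable {
    var fruitType: FruitType
    var tree: Tree

    // Assumption: the tree gets its own key instead of sharing .fruitType.
    private enum CodingKeys: String, CodingKey {
        case fruitType, tree
    }

    init(fruitType: FruitType) {
        self.fruitType = fruitType
        self.tree = fruitType.treeClass.init()
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        fruitType = try container.decode(FruitType.self, forKey: .fruitType)

        // Calling the required init(from:) on the metatype value dispatches
        // dynamically, so the subclass is constructed rather than Tree.
        let nested = try container.superDecoder(forKey: .tree)
        tree = try fruitType.treeClass.init(from: nested)
    }
}

let data = try JSONEncoder().encode(FruitTreeWrapper(fruitType: .Cherry))
let decoded = try JSONDecoder().decode(FruitTreeWrapper.self, from: data)
print(type(of: decoded.tree)) // CherryTree
```

The switch inside treeClass stays the single place that maps a FruitType to a class, which is what the original post was after.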

After reading your comment I've been playing around with this a bit more, and I think you're right. However, adding properties to the subtypes requires manually implementing Encodable and Decodable on the subtypes.
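To make the manual part concrete, here is a rough sketch of a subclass that adds one stored property. The height and appleCount properties are made up for illustration; the pattern is that the subclass declares its own CodingKeys, overrides init(from:) and encode(to:), and calls through to super for the inherited properties.

```swift
import Foundation

class Tree: Codable {
    var height: Int = 1   // hypothetical base-class property
    required init() { }
}

class AppleTree: Tree {
    var appleCount: Int = 0   // hypothetical subclass property

    // Keys for the properties this subclass adds.
    private enum CodingKeys: String, CodingKey {
        case appleCount
    }

    required init() { super.init() }

    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        appleCount = try container.decode(Int.self, forKey: .appleCount)
        // Let the synthesized superclass implementation decode its own keys.
        try super.init(from: decoder)
    }

    override func encode(to encoder: Encoder) throws {
        try super.encode(to: encoder)
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(appleCount, forKey: .appleCount)
    }
}

let original = AppleTree()
original.appleCount = 12
let data = try JSONEncoder().encode(original)
let decoded = try JSONDecoder().decode(AppleTree.self, from: data)
print(decoded.appleCount) // 12
```

Here the superclass and subclass keys share one flat JSON object; nesting the superclass data under its own key via superEncoder()/superDecoder() would be another option.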

Overall I think that using subtyping for this is unwieldy and fragile. I would always prefer to use enums if possible. I know that the enum support in Codable is not where it should be, but that could be fixed fairly easily and with a bit of manual work it is possible to use it today and should provide more safety.
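As a sketch of what the enum-based alternative could look like with manual Codable work: the payload structs, their properties, and the kind/payload key names below are all invented for illustration.

```swift
import Foundation

// Hypothetical value-type payloads replacing the class hierarchy.
struct AppleTree: Codable { var appleCount = 0 }
struct CherryTree: Codable { var cherryCount = 0 }

enum FruitTree: Codable {
    case apple(AppleTree)
    case cherry(CherryTree)

    private enum CodingKeys: String, CodingKey { case kind, payload }
    private enum Kind: String, Codable { case apple, cherry }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // The discriminator decides which payload type to decode.
        switch try container.decode(Kind.self, forKey: .kind) {
        case .apple:
            self = .apple(try container.decode(AppleTree.self, forKey: .payload))
        case .cherry:
            self = .cherry(try container.decode(CherryTree.self, forKey: .payload))
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .apple(let tree):
            try container.encode(Kind.apple, forKey: .kind)
            try container.encode(tree, forKey: .payload)
        case .cherry(let tree):
            try container.encode(Kind.cherry, forKey: .kind)
            try container.encode(tree, forKey: .payload)
        }
    }
}

let data = try JSONEncoder().encode(FruitTree.cherry(CherryTree(cherryCount: 7)))
let decoded = try JSONDecoder().decode(FruitTree.self, from: data)
if case .cherry(let tree) = decoded {
    print(tree.cherryCount) // 7
}
```

Because the switch is exhaustive, adding a new case forces both init(from:) and encode(to:) to be updated, which is where the extra safety comes from.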

Sure, you can fall out of synthesized conformance relatively easily, but I don't think that's necessarily related; even if you need to implement init(from:)/encode(to:) manually, I don't see that as disqualifying this use-case.

This isn't always possible, especially if you've got an existing class hierarchy that you want to bring Codable support to.

I don't disagree that there could be better ways to restructure this depending on concrete needs, but I think it's important for this use-case to work because it's definitely valid. I think consistency is important: it works with some decode(_:...) calls, so it should work across all of them, IMO. Just my 2¢.

Absolutely, but I thought it would be good to point this out.

Also true, that's why I said if possible and asked the OP for a more concrete example of what he is trying to achieve.

Yes, I think we agree that the behavior is incorrect; I was just trying to suggest alternatives :)

Thanks for all the quick and insightful responses! :) So I guess I'll file a bug report then!

Here is my little hobby project: I've implemented multiple two-player board games. There is a struct called BoardGameData, which holds things like the GameType and the GameBoard (which in turn holds the GamePieces and is modified through some GameRules). Mapped onto my example above, the FruitTreeWrapper is the BoardGameData, the fruitType is the GameType, and the Tree is the GameBoard. I can create a new BoardGameData by just handing over a GameType, which generates the correct board in its initializer. I wanted to do the same thing in the decoder, and that's where I hit the problem.

I would love to hear any better suggestions! The code isn't that massive yet, so I wouldn't mind restructuring it. :)

I think the first thing to think about is why GameBoard is a reference type and whether it could be made a value type instead. If you need a reference type, could you store the actual data in a value type inside of that reference type? In general I prefer to have as few reference types as possible. Pure data containers should always be value types.
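A minimal sketch of that idea, assuming a made-up BoardState struct with a pieces array (your real GameBoard will obviously hold something richer): the class keeps its identity, while only the value type inside it takes part in Codable.

```swift
import Foundation

// Hypothetical value type holding the data that needs to be persisted.
struct BoardState: Codable, Equatable {
    var pieces: [String] = []
}

// The reference type is a thin shell around the value type.
final class GameBoard {
    var state: BoardState

    init(state: BoardState = BoardState()) {
        self.state = state
    }
}

let board = GameBoard()
board.state.pieces.append("rook")

// Only the value type is encoded and decoded; no class hierarchy is involved.
let data = try JSONEncoder().encode(board.state)
let restored = GameBoard(state: try JSONDecoder().decode(BoardState.self, from: data))
print(restored.state == board.state) // true
```

With this split, the dynamic-type question from the top of the thread never comes up, because no subclass metatype has to pass through decode(_:forKey:).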
