Making a Codable wrapper for metatypes: will I get into trouble by doing this?

deaton.dg · June 28, 2021, 3:24am

In one of my projects, I am using a dependent dictionary type (a dictionary whose keys are metatypes and whose values have types that depend on the associated types of the keys). In order to save and resume the state of my program, I am attempting to make this type conform to Codable. To make the keys codable, I have devised a solution which I believe is novel.
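For readers unfamiliar with the term, a dependent dictionary of this kind might look roughly like the following minimal sketch. All names here (DependentKey, DependentDictionary, VolumeKey, TitleKey) are hypothetical and are not taken from the project described above; the sketch only illustrates keys that are metatypes and values whose type is chosen by the key's associated type.

```swift
// Hypothetical protocol: each key type names the type of the value stored under it.
protocol DependentKey {
    associatedtype Value
}

struct DependentDictionary {
    // Metatypes are not Hashable, so ObjectIdentifier stands in for the key.
    private var storage: [ObjectIdentifier: Any] = [:]

    init() {}

    // The value type depends on the key type K.
    subscript<K: DependentKey>(key: K.Type) -> K.Value? {
        get { storage[ObjectIdentifier(key)] as? K.Value }
        set {
            if let newValue = newValue {
                storage.updateValue(newValue, forKey: ObjectIdentifier(key))
            } else {
                storage.removeValue(forKey: ObjectIdentifier(key))
            }
        }
    }
}

// Hypothetical key types used only for illustration.
enum VolumeKey: DependentKey { typealias Value = Double }
enum TitleKey: DependentKey { typealias Value = String }

var settings = DependentDictionary()
settings[VolumeKey.self] = 0.5
settings[TitleKey.self] = "Untitled"
```

Because the keys are types rather than values, encoding such a dictionary requires some way to write a type down and recover it later, which is the problem the rest of this post is about.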

The basic idea is as follows: NSClassFromString and NSStringFromClass allow converting class types to and from String. A generic class can statically reference an arbitrary (i.e. possibly non-class) type through its generic parameter. This wrapped type can be exposed through a protocol with a static metatype requirement. The wrapper can be formed from a metatype by using Self in the body of a protocol extension. Since Codable is a typealias for the composition Decodable & Encodable and cannot be extended directly, the extension is written on Decodable where Self: Encodable, although any other protocol would work too. These features can be combined into some very hacky code:

import Foundation

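// Class-constrained so that NSStringFromClass and NSClassFromString can be applied to conforming types.
protocol CodableMetatypeWrapperProtocol: AnyObject {
    static var wrappedType: Codable.Type { get }
}
// If I make this private, NSClassFromString(NSStringFromClass(CodableMetatypeWrapper<TestType>.self)) returns nil.
// The generic parameter carries the wrapped (possibly non-class) type.
class CodableMetatypeWrapper<T: Codable>: CodableMetatypeWrapperProtocol {
    static var wrappedType: Codable.Type { T.self }
}
// Extending Decodable where Self: Encodable covers exactly the Codable types.
extension Decodable where Self: Encodable {
    static var typeWrapper: CodableMetatypeWrapperProtocol.Type { CodableMetatypeWrapper<Self>.self }
}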

struct CodableMetatype: Codable, Hashable {
    let type: Codable.Type
    
    init(_ type: Codable.Type) {
        self.type = type
    }
    
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        let name: String = NSStringFromClass(type.typeWrapper)
        try container.encode(name)
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let name = try container.decode(String.self)
        guard let wrapperClass = NSClassFromString(name) as? CodableMetatypeWrapperProtocol.Type else {
            throw DecodingError.typeMismatch(
                CodableMetatype.self,
                .init(
                    codingPath: decoder.codingPath,
                    debugDescription: "Result of NSClassFromString was nil or was not a CodableMetatypeWrapperProtocol.",
                    underlyingError: nil
                )
            )
        }
        self.type = wrapperClass.wrappedType
    }
    
    static func == (lhs: CodableMetatype, rhs: CodableMetatype) -> Bool {
        ObjectIdentifier(lhs.type) == ObjectIdentifier(rhs.type)
    }
    
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(type))
    }
}

The string written by CodableMetatype.encode(to:) seems to be stable across program executions from my limited testing, and it also seems that CodableMetatypeWrapper does not need to be explicitly loaded into the runtime before decoding. The following example code demonstrates how this can work.

struct TestType: Codable {}

/// Obtained from a previous execution.
/// I ran this from a SwiftPM project called `Scratch`.
/// You might need a different value.
let typeString: String? = "\"_TtGC7Scratch22CodableMetatypeWrapperVS_8TestType_\""
//let typeString: String? = nil

let type: CodableMetatype
if let typeString = typeString {
    type = try! JSONDecoder().decode(CodableMetatype.self, from: typeString.data(using: .utf8)!)
    assert(type == CodableMetatype(TestType.self))
} else {
    type = CodableMetatype(TestType.self)
}
print(type)

let newTypeString = String(data: try! JSONEncoder().encode(type), encoding: .utf8)!
if let oldTypeString = typeString {
    assert(oldTypeString == newTypeString)
}
print(newTypeString)

let newType = try! JSONDecoder().decode(CodableMetatype.self, from: newTypeString.data(using: .utf8)!)
assert(newType == type)

Although everything appears to work fine right now, I do not really trust this solution. Will I get into trouble by using this code? I am especially worried because I will be using dlopen midway through the program, loading additional classes into the runtime.

This is my first post here, so feel free to leave any sort of feedback or let me know if this isn't the right place for this sort of question. Thanks!

Nickolas_Pohilets · June 28, 2021, 2:29pm

Identifiers of private types may be different across runs of the same binary:

// delme.swift
import Foundation
private struct QWERTY {}
public class Wrapper<T> {}
print(NSStringFromClass(Wrapper<QWERTY>.self))

produces:

$ ./delme 
_TtGC5delme7WrapperVS_P10$1029a6ed46QWERTY_
$ ./delme 
_TtGC5delme7WrapperVS_P10$1089dced46QWERTY_
$ ./delme 
_TtGC5delme7WrapperVS_P10$1018c2ed46QWERTY_

jrose · June 28, 2021, 4:56pm

This is basically how NSCoding works, which means it’s viable (for public types) but also that you have to be careful about security implications. If someone modifies a file to specify a class you didn’t expect, will your subsequent use of the class run dangerous code? Read from uninitialized memory?

A reasonable-ish way to protect against this in pure Swift is to require that the class you picked conforms to some particular protocol. That way, you defend against unexpected types and have a set of well-defined operations to use. Still, now those operations need to be safe. (NSCoding goes further by having the decoder pass in a list of all valid base classes, so that it’s even less likely to be something unexpected.)
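Both defenses described above (a required protocol plus an explicit list of acceptable types) can be sketched in pure Swift. This is only an illustration under assumptions: PluginKey, TypeValidationError, validate(_:allowed:), and the example types are hypothetical names, and the candidate type is assumed to have already been recovered from untrusted data by some other means.

```swift
// Hypothetical protocol that every acceptable type must conform to.
protocol PluginKey {
    static var displayName: String { get }
}

// Hypothetical error type for rejected candidates.
enum TypeValidationError: Error {
    case notAPluginKey(String)
    case notAllowed(String)
}

/// Accepts `candidate` only if it conforms to PluginKey and appears in `allowed`.
func validate(_ candidate: Any.Type, allowed: [PluginKey.Type]) throws -> PluginKey.Type {
    // First defense: the type must conform to the expected protocol.
    guard let key = candidate as? PluginKey.Type else {
        throw TypeValidationError.notAPluginKey(String(describing: candidate))
    }
    // Second defense: the type must be one the caller explicitly expects.
    guard allowed.contains(where: { ObjectIdentifier($0) == ObjectIdentifier(key) }) else {
        throw TypeValidationError.notAllowed(String(describing: candidate))
    }
    return key
}

// Hypothetical types used only for illustration.
struct Volume: PluginKey { static let displayName = "Volume" }
struct Title: PluginKey { static let displayName = "Title" }
struct Unrelated {}

let allowed: [PluginKey.Type] = [Volume.self]
```

With this list, Volume passes, Title is rejected because it is not listed, and Unrelated is rejected because it does not conform to the protocol. The operations exposed by the protocol still need to be safe to call on any accepted type.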

This can technically still result in some arbitrary code running, because Objective-C classes can have code that runs on the first use of the class. However, any library you load into your process can already have code that runs on load, so code that runs on first use of a class doesn’t seem like an additional risk to me.

deaton.dg · June 28, 2021, 7:52pm

Good point! It appears that NSClassFromString(NSStringFromClass(Wrapper<QWERTY>.self)) is nil if QWERTY is private, so this method will not work with private types even within a single execution of the program.

deaton.dg · June 28, 2021, 8:00pm

Great, I am glad to hear this is workable. Thanks for your input!

I am going to use the method of making my types conform to a particular protocol. There is already a protocol that my keys conform to anyway, so this is hardly any work.

In my particular use-case, I am dlopening user-supplied plugins and executing user-supplied programs, so I agree that this is not really an additional risk. It's always a good idea to consider these risks though.

deaton.dg · June 28, 2021, 8:11pm

Private types definitely break this, but what is the difference between public and internal in this case?

jrose · June 29, 2021, 2:15am

Good question. I don’t think there’s any promise that internal types have stable names except for those that inherit from NSObject (and are non-generic), for NSCoding compatibility. It would be nice to have it written down one way or another, though.

deaton.dg · June 29, 2021, 6:50pm

Okay, I will stick to public types just to be safe then. Do you know where the promise for public types is written down? I’ve looked for places where I can read about these runtime behaviors, but I couldn’t find anything with much detail. I probably just didn’t know where to look.

dabrahams · November 17, 2022, 1:33am

Unfortunately, this only seems to work on Apple platforms, and there's no workaround I can find.

I guess I need to make a table of all the types I want to serialize.
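Such a table could look roughly like the sketch below. It is one possible shape under assumptions, not an established API: TypeRegistry and its methods are hypothetical names, and the stable names are chosen by hand instead of being derived from the runtime, so nothing here relies on NSClassFromString.

```swift
// Hypothetical registry mapping hand-chosen, stable names to types and back.
struct TypeRegistry {
    private var typesByName: [String: Codable.Type] = [:]
    private var namesByType: [ObjectIdentifier: String] = [:]

    init() {}

    // Registers `type` under `name`; a later registration with the same name replaces the earlier one.
    mutating func register(_ type: Codable.Type, as name: String) {
        typesByName[name] = type
        namesByType[ObjectIdentifier(type)] = name
    }

    // Returns the registered name for `type`, or nil if it was never registered.
    func name(for type: Codable.Type) -> String? {
        namesByType[ObjectIdentifier(type)]
    }

    // Returns the type registered under `name`, or nil if the name is unknown.
    func type(named name: String) -> Codable.Type? {
        typesByName[name]
    }
}

struct TestType: Codable {}

var registry = TypeRegistry()
registry.register(TestType.self, as: "TestType")

// The name is a plain String, so any encoder can store it.
let storedName = registry.name(for: TestType.self)
let recovered = storedName.flatMap { registry.type(named: $0) }
```

The table doubles as an allowlist, since a name that was never registered simply fails to resolve. The cost is that every serializable type has to be registered by hand.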