Optional decoding superpowers

tera · March 11, 2023, 11:39pm

I tried this example:

public enum MyOptional<Wrapped> {
    case none
    case some(Wrapped)
}

extension MyOptional: Decodable where Wrapped: Decodable {
    public init(from decoder: Decoder) throws {
        fatalError() // to see if it gets called
    }
}

struct S: Decodable {
    var field: MyOptional<Int>
}

let v = try! JSONDecoder().decode(S.self, from: "{}".data(using: .utf8)!) // key "field" not found

and while it works with "Optional", when I switched to "MyOptional" it didn't work, and it didn't hit the fatalError in init(from decoder) above. Is it possible to do this somehow, or does Optional have some superpowers that MyOptional can never have? If those are indeed superpowers, are they in JSONDecoder or Codable?

itaiferber · March 12, 2023, 12:51am

What you're seeing here is no inherent magic at the JSONDecoder or Codable levels, but a result of how the compiler synthesizes Decodable conformance. Specifically, the compiler knows about the Optional type, and will use decodeIfPresent(..., forKey: ...) for Optional properties instead of decode(..., forKey: ...), since the value is allowed to be missing.

This means that when you use Optional<Int>, the compiler synthesizes

init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    field = try container.decodeIfPresent(Int.self, forKey: .field)
}

When using MyOptional<Int>, the compiler synthesizes

init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    field = try container.decode(MyOptional<Int>.self, forKey: .field)
}

decodeIfPresent allows the key to be missing, and returns nil if it is; decode, however, will throw an error if the key is missing, before ever getting to MyOptional.init(from:), which is why it never gets called.
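To make the difference concrete, here is a small sketch that calls both methods on the same keyed container for a key that is absent from the JSON. The Probe type and its property names are made up for illustration and are not part of the original example.

```swift
import Foundation

// Hypothetical probe type: calls both container methods for the same missing key.
struct Probe: Decodable {
    var viaIfPresent: Int?
    var missingKeyName: String?

    enum CodingKeys: String, CodingKey { case field }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)

        // A missing key is not an error here; the result is simply nil.
        viaIfPresent = try container.decodeIfPresent(Int.self, forKey: .field)

        // The same missing key makes decode throw DecodingError.keyNotFound
        // before any init(from:) of the requested type could run.
        do {
            _ = try container.decode(Int.self, forKey: .field)
        } catch DecodingError.keyNotFound(let key, _) {
            missingKeyName = key.stringValue
        }
    }
}

let probe = try JSONDecoder().decode(Probe.self, from: Data("{}".utf8))
print(probe.viaIfPresent as Any, probe.missingKeyName as Any)
```

With "{}" as input, viaIfPresent stays nil and missingKeyName records "field".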

You can replicate the Optional behavior by implementing S.init(from:) directly, using decodeIfPresent as applicable.
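A minimal sketch of that manual approach follows. The body of MyOptional.init(from:) is an assumption (the original post only had a fatalError there), and falling back to .none for a missing key is one possible policy, not the only one.

```swift
import Foundation

enum MyOptional<Wrapped> {
    case none
    case some(Wrapped)
}

extension MyOptional: Decodable where Wrapped: Decodable {
    // Assumed implementation: treat JSON null as .none, anything else as .some.
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if container.decodeNil() {
            self = .none
        } else {
            self = .some(try container.decode(Wrapped.self))
        }
    }
}

struct S: Decodable {
    var field: MyOptional<Int>

    private enum CodingKeys: String, CodingKey { case field }

    // Hand-written replacement for the synthesized initializer.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // decodeIfPresent yields nil for a missing key (or a null value),
        // so map that case to MyOptional.none explicitly.
        field = try container.decodeIfPresent(MyOptional<Int>.self, forKey: .field) ?? .none
    }
}

let missing = try JSONDecoder().decode(S.self, from: Data("{}".utf8))
let present = try JSONDecoder().decode(S.self, from: Data(#"{"field": 3}"#.utf8))
```

Here missing.field ends up as .none and present.field as .some(3). The cost is that every struct containing a MyOptional property needs its own hand-written init(from:).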

itaiferber · March 12, 2023, 1:07am

BTW, we've had some discussion about this before in Decoding of optionals missing in json

tera · March 12, 2023, 1:42am

itaiferber:

What you're seeing here is no inherent magic at the JSONDecoder or Codable levels, but a result of how the compiler synthesizes Decodable conformance. Specifically, the compiler knows about the Optional type
...
When using MyOptional<Int>, the compiler synthesizes
init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    field = try container.decode(MyOptional<Int>.self, forKey: .field)
}

Wow, that's in the compiler! I thought it would be in the standard library.

BTW, when I tried an empty project with "-parse-stdlib", the compiler told me "Cannot find type 'Optional' in scope".

tera · March 12, 2023, 1:51am

How exactly does the compiler check that the type is Optional? Is it in the form "value is Optional", and could the check be changed to, say, "value is OptionalValue", where "OptionalValue" is some marker protocol that other types could conform to if needed?

// within compiler or std lib:
public protocol OptionalValue {}
extension Optional: OptionalValue {}

// elsewhere
extension MyOptional: OptionalValue {}

itaiferber · March 12, 2023, 2:19am

The compiler gets the underlying type of the property via getOptionalObjectType(); getOptionalObjectType() checks whether the declaration of the type in question is the declaration for Optional (the underlying declaration for which is in KnownStdlibTypes.def).

In other words, it checks whether the type of the var is Swift.Optional<T> for some T.

tera:

BTW, when I tried an empty project with "-parse-stdlib" compiler told me "Cannot find type 'Optional' in scope".

This would be expected, since the compiler is looking for Swift.Optional specifically, and -parse-stdlib prevents the Swift stdlib (Swift) from being loaded automatically.

Theoretically, the check could be changed to look for a marker protocol, but that would involve adding the protocol to the stdlib and then teaching the compiler about it. Having the behavior change depending on conformance to a protocol also adds a bit of trickiness, because a type can be conformed to a protocol after the fact, which means that the same code will compile differently based on the context it's compiled in (e.g., you import a module which adds a conformance to OptionalValue onto some type, and suddenly your code silently and implicitly compiles differently).

The complexity of the solution may not be worth it.
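As a rough analogy only: the real check happens at compile time inside Codable synthesis, but a runtime conformance test shows the same sensitivity to who has added a conformance. OptionalValue is the hypothetical marker protocol from this thread, and isOptionalLike is a made-up helper.

```swift
// Hypothetical marker protocol; nothing like this exists in the stdlib today.
protocol OptionalValue {}
extension Optional: OptionalValue {}

enum MyOptional<Wrapped> {
    case none
    case some(Wrapped)
}

// Runtime stand-in for the compiler's "is this property optional-like?" question.
func isOptionalLike<T>(_ type: T.Type) -> Bool {
    type is OptionalValue.Type
}

let builtIn = isOptionalLike(Int?.self)            // Optional conforms
let custom = isOptionalLike(MyOptional<Int>.self)  // no conformance visible
let plain = isOptionalLike(Int.self)               // not optional-like at all

// If any module in the program added
//     extension MyOptional: OptionalValue {}
// `custom` would flip to true without a single change at the call site.
print(builtIn, custom, plain)
```

As written, builtIn is true while custom and plain are false. The commented-out extension is the "after the fact" conformance that would silently change the answer.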

What specifically are you hoping to achieve? You may be able, for instance, to get the same results by writing a property wrapper instead which leverages the current compiler synthesis to do what you want.

tera · March 12, 2023, 2:29am

Good to know. AFAIK this is a purely additive change that shouldn't break any existing code (except perhaps code that happens to use the "OptionalValue" bikeshed name).

If there's a notion of protocols that could only be conformed to from the main type body – that could be used in this case to simplify things.

As for what I'm hoping to achieve, one thing would be a "first class" optional type that contains one or more extra cases in addition to none and some (or, alternatively, a value associated with none, some, or both).

Could you show a sketch of the property wrapper idea?

itaiferber · March 20, 2023, 6:37pm

Apologies for the delayed response here!

The half-baked idea I had had in mind when suggesting this solution is that when a property wrapper is Codable, the compiler synthesis uses its underlying wrappedValue for encoding and decoding — which means that if you could convert your MyOptional<Wrapped> into a wrappedValue of type Wrapped?, the compiler would use that instead, and get you the behavior you want. e.g., something like

protocol OptionalConvertible {
    associatedtype Wrapped
    var optionalValue: Wrapped? { get set }
    init(_ optionalValue: Wrapped?)
}

@propertyWrapper
struct OptionalCodable<T: OptionalConvertible>: Codable where T.Wrapped: Codable {
    var wrappedValue: T.Wrapped? {
        get { storage.optionalValue }
        set { storage.optionalValue = newValue }
    }

    var storage: T
    
    init(storage: T) {
        self.storage = storage
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        storage = .init(try container.decode(Optional<T.Wrapped>.self))
    }
    
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(wrappedValue)
    }
}

Then, with a custom type like

enum MyOptional<Wrapped>: OptionalConvertible {
    case none
    case some(Wrapped)
    case fileNotFound
    
    init(_ optionalValue: Wrapped?) {
        switch optionalValue {
        case .none: self = .none
        case .some(let v): self = .some(v)
        }
    }
    
    var optionalValue: Wrapped? {
        get {
            switch self {
            case .none, .fileNotFound: return nil
            case .some(let v): return v
            }
        }
        
        set {
            self = .init(newValue)
        }
    }
}

I envisioned that you could write

struct S {
    @OptionalCodable(storage: MyOptional<Int>.none)
    var field: MyOptional<Int>
}

However, this doesn't work, because the compiler enforces that the type of a property and its property wrapper's wrappedValue must be the same type:

struct S {
    @OptionalCodable(storage: MyOptional<Int>.none)
    var field: MyOptional<Int> // 🛑 Property type 'MyOptional<Int>' does not match 'wrappedValue' type 'MyOptional<Int>.Wrapped?'
}

~~This could work if Codable synthesis looked up whether the property wrapper had a projectedValue and used that, but it doesn't currently, and this might be a behavior-breaking change.~~

Edit: I actually think projectedValue is unlikely to be useful; besides the fact that property wrappers can have projected values completely unrelated to the underlying type, you can get into various ambiguous scenarios with property wrapper composition. If you compose multiple property wrappers, Codable synthesis currently follows wrappedValues all the way down to the base type, but if you stop to look at projectedValue branches along the way, it gets less clear what the end result would be. (And if a client reorders property wrappers, they may unexpectedly get completely different results.)

Someone more clever than I may find a way around this with property wrappers, but this is what I had in mind, at least. Sorry it didn't work out to help you.
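For what it's worth, here is one variant that should compile, sketched under two assumptions that are not from the posts above. First, the property is declared as Int? (matching wrappedValue), and the richer MyOptional value is read through the wrapper's storage. Second, since synthesis appears to decode the wrapper type itself with decode(_:forKey:), a missing key would still throw, so the sketch adds a KeyedDecodingContainer overload for the wrapper type. That overload is a commonly used workaround, not something the compiler provides. Only the Decodable half is shown to keep it short.

```swift
import Foundation

protocol OptionalConvertible {
    associatedtype Wrapped
    var optionalValue: Wrapped? { get set }
    init(_ optionalValue: Wrapped?)
}

enum MyOptional<Wrapped>: OptionalConvertible {
    case none
    case some(Wrapped)
    case fileNotFound

    init(_ optionalValue: Wrapped?) {
        switch optionalValue {
        case .none: self = .none
        case .some(let v): self = .some(v)
        }
    }

    var optionalValue: Wrapped? {
        get { if case .some(let v) = self { return v } else { return nil } }
        set { self = .init(newValue) }
    }
}

@propertyWrapper
struct OptionalCodable<T: OptionalConvertible>: Decodable where T.Wrapped: Decodable {
    var storage: T
    var wrappedValue: T.Wrapped? {
        get { storage.optionalValue }
        set { storage.optionalValue = newValue }
    }

    init(storage: T) { self.storage = storage }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        storage = T(try container.decode(Optional<T.Wrapped>.self))
    }
}

// Assumed workaround: a more specific overload that the synthesized
// init(from:) picks up, turning a missing key into an empty wrapper.
extension KeyedDecodingContainer {
    func decode<T>(_ type: OptionalCodable<T>.Type, forKey key: Key) throws -> OptionalCodable<T> {
        try decodeIfPresent(type, forKey: key) ?? OptionalCodable(storage: T(nil))
    }
}

struct S: Decodable {
    @OptionalCodable(storage: MyOptional<Int>.none)
    var field: Int?                                   // must match wrappedValue's type

    var rich: MyOptional<Int> { _field.storage }      // the custom optional, if needed
}

let missing = try JSONDecoder().decode(S.self, from: Data("{}".utf8))
let present = try JSONDecoder().decode(S.self, from: Data(#"{"field": 7}"#.utf8))
```

Here missing.field is nil and present.field is 7, with present.rich holding .some(7). Note that the extra fileNotFound case can never come out of decoding this way, so this only recovers plain Optional behavior rather than anything richer.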

tera · March 21, 2023, 4:42am

Thank you very much for that analysis, Itai. It feels like the "additive" OptionalValue change suggested above (even in its most limited form, e.g. one which prohibits retroactive conformance to simplify the implementation) is a simpler way to go.

A side question: it seems that Codable functionality could be implemented in Swift using the Mirror API, is that right? Would that be slower, or was there another reason to implement it the way it is implemented, in C++?

itaiferber · March 22, 2023, 12:26am

Quite possibly simpler, though I'd say that there's a pretty high bar to clear for adding new API like this to the stdlib, and especially so for something with a relatively narrow use-case.

What might be interesting to explore is the possibility of expanded customization options were Codable synthesis to be moved out of the compiler, and reimplemented in the form of Swift macros. If it hasn't been stated as such elsewhere yet, I hope that (at least eventually) it is a goal to move synthesized conformances out of the compiler itself and into the stdlib.

When you say that Codable could be implemented using the Mirror API, are you referring to the synthesized conformances, or the library side of the functionality?

Though I guess in either case, the answer is largely the same:

- Then, and now, the Swift reflection APIs as exposed through Mirror and others are insufficient to do the work that Codable requires, and safely at that. The Swift runtime has the capabilities to expose the necessary information, but the public APIs don't exist. (Somewhere on the forums, there have been discussions about revamping reflection APIs altogether, but you might be able to search for those just as well as I.)
- Although Codable is, unfortunately, not known for raw performance, doing the same work at runtime with reflection is significantly slower (tested a while back, at least an order of magnitude slower if not more, though this may have changed), and there doesn't seem to be enough benefit to doing things dynamically at runtime to justify the performance loss.
- Most importantly, the Codable APIs try to be as explicit as possible in implementation, to give maximal control over output when possible. If you want to, you can implement Encodable and Decodable conformance directly to very clearly express (both in code, and behavior) exactly how you want types to be represented. This is important, because archived data exists effectively forever, with little room to fix old mistakes (if you care about backwards compatibility, which most should).

Performing similar work at runtime via dynamic code makes everything seem more "magic": there's no code you can point to that shows what work is being done (and as such, you have much less control over what's actually happening in your code).

In all, although it's possible with greatly expanded reflection work, it doesn't seem to me that these specific APIs (with their current design philosophy, at least) would benefit from being implemented using reflection.
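As a small illustration of the reflection point above: Mirror can enumerate a value's stored properties, which is roughly what an encoder would need, but its public API is read-only, with no way to assign a property or build an instance from a type, which is what decoding would need. Point is a made-up type for this sketch.

```swift
struct Point {
    var x: Int
    var y: Int
}

// Reading works: Mirror exposes each stored property's label and value.
let mirror = Mirror(reflecting: Point(x: 1, y: 2))
let labels = mirror.children.compactMap { $0.label }
let values = mirror.children.compactMap { $0.value as? Int }
print(labels, values)

// There is no public counterpart for the other direction: nothing on Mirror
// lets you set "x" on an existing Point or create a Point from Point.self.
```

This yields the labels x and y with the values 1 and 2, and that is about as far as the public API goes.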