Property wrapper decoding difference between `singleValueContainer` and `init(from: Decoder)`

I ran into an unexpected difference in the decoding behavior of property wrappers that depends on how they implement init(from decoder: Decoder).

If implemented with a singleValueContainer, everything behaves as I'd expect:

extension ContainerWrapper: Decodable where Wrapped: Decodable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        self.wrappedValue = try container.decode(Wrapped.self)
    }
}

If implemented by calling Wrapped(from: decoder), then wrapped dictionaries apply the .convertFromSnakeCase conversion to their String keys (full example code below):

extension DirectWrapper: Decodable where Wrapped: Decodable {
    init(from decoder: Decoder) throws {
        self.wrappedValue = try Wrapped(from: decoder)
    }
}

Can anyone explain why those two implementations would behave differently?

This code demonstrates the issue:

import Foundation

let jsonData = #"{ "dictionary": { "some_key": 1 } }"#.data(using: .utf8)!

// MARK: Property wrappers

@propertyWrapper
struct ContainerWrapper<Wrapped> {
    let wrappedValue: Wrapped
}

extension ContainerWrapper: Decodable where Wrapped: Decodable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        self.wrappedValue = try container.decode(Wrapped.self)
    }
}

@propertyWrapper
struct DirectWrapper<Wrapped> {
    let wrappedValue: Wrapped
}

extension DirectWrapper: Decodable where Wrapped: Decodable {
    init(from decoder: Decoder) throws {
        self.wrappedValue = try Wrapped(from: decoder)
    }
}

// MARK: Subjects

struct ContainerSubject: Decodable {
    @ContainerWrapper
    var dictionary: [String: Int]
}

struct DirectSubject: Decodable {
    @DirectWrapper
    var dictionary: [String: Int]
}

// MARK: Tests

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

// prints: ["some_key": 1]
print(try decoder.decode(ContainerSubject.self, from: jsonData).dictionary)

// prints: ["someKey": 1]
print(try decoder.decode(DirectSubject.self, from: jsonData).dictionary)

We had this very behavior reported by @buscarini for one of our libraries here: Custom date decoding issue · Issue #23 · pointfreeco/swift-tagged · GitHub

I'm not sure if he filed a bug, but it wouldn't hurt to submit one, unless there's some explanation for this behavior being intentional.

When ContainerWrapper asks the decoder to decode a [String: Int], that goes through a special-case code path. It calls this overload of decode:

    public func decode<T : Decodable>(_ type: T.Type) throws -> T {
        try expectNonNull(type)
        return try self.unbox(self.storage.topContainer, as: type)!
    }

which calls this overload of unbox:

    func unbox<T : Decodable>(_ value: Any, as type: T.Type) throws -> T? {
        return try unbox_(value, as: type) as? T
    }

which calls unbox_:

    func unbox_(_ value: Any, as type: Decodable.Type) throws -> Any? {
        if type == Date.self || type == NSDate.self {
            return try self.unbox(value, as: Date.self)
        } else if type == Data.self || type == NSData.self {
            return try self.unbox(value, as: Data.self)
        } else if type == URL.self || type == NSURL.self {
            guard let urlString = try self.unbox(value, as: String.self) else {
                return nil
            }

            guard let url = URL(string: urlString) else {
                throw DecodingError.dataCorrupted(DecodingError.Context(codingPath: self.codingPath,
                                                                        debugDescription: "Invalid URL string."))
            }
            return url
        } else if type == Decimal.self || type == NSDecimalNumber.self {
            return try self.unbox(value, as: Decimal.self)
        } else if let stringKeyedDictType = type as? _JSONStringDictionaryDecodableMarker.Type {
            return try self.unbox(value, as: stringKeyedDictType)
        } else {
            self.storage.push(container: value)
            defer { self.storage.popContainer() }
            return try type.init(from: self)
        }
    }

which detects the conformance of [String: Int] to _JSONStringDictionaryDecodableMarker and so calls this overload of unbox:

    func unbox<T>(_ value: Any, as type: _JSONStringDictionaryDecodableMarker.Type) throws -> T? {
        guard !(value is NSNull) else { return nil }

        var result = [String : Any]()
        guard let dict = value as? NSDictionary else {
            throw DecodingError._typeMismatch(at: self.codingPath, expectation: type, reality: value)
        }
        let elementType = type.elementType
        for (key, value) in dict {
            let key = key as! String
            self.codingPath.append(_JSONKey(stringValue: key, intValue: nil))
            defer { self.codingPath.removeLast() }

            result[key] = try unbox_(value, as: elementType)
        }

        return result as? T
    }

which iterates over the keys of the internal representation (an NSDictionary) without performing snake-case conversion.

When DirectWrapper calls the init(from: Decoder) method of [String: Int], it's calling this initializer:

  public init(from decoder: Decoder) throws {
    self.init()

    if Key.self == String.self {
      // The keys are Strings, so we should be able to expect a keyed container.
      let container = try decoder.container(keyedBy: _DictionaryCodingKey.self)
      for key in container.allKeys {
        let value = try container.decode(Value.self, forKey: key)
        self[key.stringValue as! Key] = value
      }
    } else if Key.self == Int.self {
    ... many more lines omitted ...

which (because Key is String) asks the decoder for a KeyedDecodingContainer by calling this method:

    public func container<Key>(keyedBy type: Key.Type) throws -> KeyedDecodingContainer<Key> {
        guard !(self.storage.topContainer is NSNull) else {
            throw DecodingError.valueNotFound(KeyedDecodingContainer<Key>.self,
                                              DecodingError.Context(codingPath: self.codingPath,
                                                                    debugDescription: "Cannot get keyed decoding container -- found null value instead."))
        }

        guard let topContainer = self.storage.topContainer as? [String : Any] else {
            throw DecodingError._typeMismatch(at: self.codingPath, expectation: [String : Any].self, reality: self.storage.topContainer)
        }

        let container = _JSONKeyedDecodingContainer<Key>(referencing: self, wrapping: topContainer)
        return KeyedDecodingContainer(container)
    }

which calls the init of _JSONKeyedDecodingContainer:

    init(referencing decoder: __JSONDecoder, wrapping container: [String : Any]) {
        self.decoder = decoder
        switch decoder.options.keyDecodingStrategy {
        case .useDefaultKeys:
            self.container = container
        case .convertFromSnakeCase:
            // Convert the snake case keys in the container to camel case.
            // If we hit a duplicate key after conversion, then we'll use the first one we saw. Effectively an undefined behavior with JSON dictionaries.
            self.container = Dictionary(container.map {
                key, value in (JSONDecoder.KeyDecodingStrategy._convertFromSnakeCase(key), value)
            }, uniquingKeysWith: { (first, _) in first })
        case .custom(let converter):
            self.container = Dictionary(container.map {
                key, value in (converter(decoder.codingPath + [_JSONKey(stringValue: key, intValue: nil)]).stringValue, value)
            }, uniquingKeysWith: { (first, _) in first })
        }
        self.codingPath = decoder.codingPath
    }

which performs snake-case conversion. This is the only way to get snake-case conversion while decoding.
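For reference, the key transformation behaves roughly like this (a simplified sketch of _convertFromSnakeCase; the real implementation also preserves leading and trailing underscores, which this version ignores):

```swift
import Foundation

// Simplified sketch of the .convertFromSnakeCase key transformation.
func camelCased(_ key: String) -> String {
    let parts = key.split(separator: "_")
    guard let first = parts.first else { return key }
    // Keep the first component as-is; capitalize each remaining component.
    return parts.dropFirst().reduce(String(first)) { $0 + String($1).capitalized }
}

print(camelCased("some_key"))   // someKey
print(camelCased("player_id"))  // playerId
```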

Since ContainerWrapper doesn't end up going through a keyed container, it doesn't get snake-case conversion.

5 Likes

This isn't a bug, but a consequence of the flow of events when you call methods on a container (owned by a Decoder), as opposed to passing control directly to Wrapped.

  1. When you call try container.decode(Wrapped.self), execution passes to the decode method on the container, which gives the container (and the decoder it belongs to) an opportunity to act on the specific value of the type passed in. __JSONDecoder.decode<T>(_:T.Type), the method being called here, passes T.Type down to __JSONDecoder.unbox(_:as:), which inspects the given type to override certain behaviors. In this case, it looks for a dictionary type (as a _JSONStringDictionaryDecodableMarker) so it can apply the key decoding strategy. Once the strategy is applied, the decoder ends up calling T.init(from:), which in this case is Wrapped.init(from:).
  2. When you call Wrapped.init(from:) directly, you bypass all of this and pass execution over to Wrapped immediately. This means that the container/Decoder never gets a chance to intercept execution and apply any of its own strategies to the underlying data.

In general, it's preferred that you decode directly from containers for this reason (to give the Decoder a chance to massage the data into the form you need), but there's room in certain very niche cases for your second strategy.

6 Likes

Ah, @mayoff beat me to it while I was typing this up. Exactly the right analysis.

3 Likes

Thank you both for the responses!

Is this considered a bug in JSONDecoder, or is calling init(from: Decoder) directly in this case considered a programming error?

I had assumed they were effectively interchangeable. Is there any way to better message that calling init(from: Decoder) directly is probably a mistake?

Does this advice also apply to encoding?

Is it better to prefer

func encode(to encoder: Encoder) throws {
    var container = encoder.singleValueContainer()
    try container.encode(wrappedValue)
}

over

func encode(to encoder: Encoder) throws {
    try wrappedValue.encode(to: encoder)
}

?

I definitely wouldn't consider it a bug, but I also don't know if I'd go so far as to call it "programmer error" necessarily:

  • A bug implies that there's something to correct or fix, but there's nothing technically wrong with the code: in a way, you're getting exactly what you ask for — execution passes to the entity you're calling directly.
    • Even if we call this a bug, there isn't anything terribly actionable: there's no way to turn the code from calling Wrapped.init(from:) into code the Decoder is running without entirely subverting what you've written. But I would consider that equally, if not more, surprising (i.e. I would be very surprised if the compiler somehow transformed try Wrapped(from: decoder) into something that didn't directly call Wrapped.init(from:))
  • It's also not necessarily an error: this is a sort of escape hatch from calling into the Decoder, and I can imagine some cases for calling code this way (potentially around some overlapping/nested objects encoding into the same container as self), but it would be very niche

That being said, it is certainly surprising, and I know I can dig up at least one or two more threads from the past where this has come up.

Short of a warning, I'm not sure how we'd do this, since this is valid code. The initializer requirement has to be publicly exposed so Decoders themselves can call it. The downside to a warning is that in a valid use-case (where you really know what you're doing), there'd be no good way to silence it. (There are schemes that Swift could implement, but that's way out in the weeds, and in general, we prefer warnings to have in-source solutions — there needs to be an obvious "right" way to write the code to disable the warning, but in this case there wouldn't be.)

Yes, for the exact same reasons. Your second code snippet doesn't afford an Encoder the opportunity to intercept the encoding type and apply strategies. So even in the case where you're encoding only a single value, you should use a container in the majority of cases.
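Putting both directions of that advice together, a transparent wrapper would look something like this (CodableWrapper is an illustrative name, not an existing API):

```swift
@propertyWrapper
struct CodableWrapper<Wrapped> {
    var wrappedValue: Wrapped
}

extension CodableWrapper: Decodable where Wrapped: Decodable {
    init(from decoder: Decoder) throws {
        // Decode through a container so the Decoder can apply
        // its strategies to the wrapped type.
        let container = try decoder.singleValueContainer()
        self.wrappedValue = try container.decode(Wrapped.self)
    }
}

extension CodableWrapper: Encodable where Wrapped: Encodable {
    func encode(to encoder: Encoder) throws {
        // Encode through a container for the same reason.
        var container = encoder.singleValueContainer()
        try container.encode(wrappedValue)
    }
}
```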

2 Likes

Perhaps some more documentation is in order?

3 Likes

Yes, the bug here is that the effects of using singleValueContainer are not documented, i.e. what difference does it make to what's actually encoded? Does it provide benefits that directly encoding the stored value does not have? /cc @Tony_Parker @Philippe_Hausler

1 Like

I deleted a previous post which I believe was a misunderstanding of your question. To be clear, are you asking about the difference between

func encode(to encoder: Encoder) throws {
    try wrappedValue.encode(to: encoder)
}

and

func encode(to encoder: Encoder) throws {
    var container = encoder.singleValueContainer()
    try container.encode(wrappedValue)
}

?

If so, functionally: for most types, there is no fundamental difference between these two; at the encoded representation layer, for an arbitrary type T, there will be no difference between the two forms.

However: there is a difference between the sequence of events that happen for these two forms, and in the second case, the container is what gets to call wrappedValue.encode(to: encoder). This means that theoretically, the container (and the underlying Encoder) has the opportunity to run code before and after the request to encode, affording the Encoder the opportunity to inspect wrappedValue, and make decisions about how to encode it in a way that's most appropriate for the format. This means that for specific types, you may see a difference between the two forms.

(FWIW, this behavior is not specific to singleValueContainer — any encode call through any container gives the Encoder the opportunity to inspect the value being encoded. This is also the difference, for instance, between asking for a keyed container and encoding properties one by one, instead of collecting a dictionary of values and calling dict.encode(to: encoder).)


ETA: to hopefully be even clearer, all of this basically boils down to

protocol P {
    func f()
}

struct S {
    static func f(_ p: P) {
        // Maybe some code...
        p.f()
        // Maybe some code...
    }
}

// Some concrete conformance, so this compiles:
struct Concrete: P {
    func f() { /* ... */ }
}

let p: P = Concrete()

p.f()
// vs.
S.f(p)

Calling p.f() doesn't involve S at all, outside of what the implementation of the function might itself do. Calling S.f(p) defers to S immediately, and it may do whatever it wants (but it generally just calls p.f() itself).

As an example, Decimal is encoded wildly differently by JSONEncoder depending on which approach you use:

and Data won't respect .dataEncodingStrategy:
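To make the Data case concrete, here's a small sketch (the type names are made up; the outputs assume JSONEncoder's default base64 dataEncodingStrategy, and the values are wrapped in arrays only to keep the top-level JSON a collection):

```swift
import Foundation

struct ViaContainer: Encodable {
    let payload: Data
    func encode(to encoder: Encoder) throws {
        // The container hands Data to the Encoder, which applies
        // its dataEncodingStrategy (base64 by default for JSONEncoder).
        var container = encoder.singleValueContainer()
        try container.encode(payload)
    }
}

struct Direct: Encodable {
    let payload: Data
    func encode(to encoder: Encoder) throws {
        // Bypasses the Encoder's Data special case entirely, so this
        // falls through to Data's own Codable form: an array of bytes.
        try payload.encode(to: encoder)
    }
}

let data = Data([1, 2, 3])
let encoder = JSONEncoder()
print(String(data: try! encoder.encode([ViaContainer(payload: data)]), encoding: .utf8)!)
// ["AQID"]
print(String(data: try! encoder.encode([Direct(payload: data)]), encoding: .utf8)!)
// [[1,2,3]]
```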

In other words, there is no clear semantic difference between these things, and if singleValueContainer were documented, it could at best be pseudocode for the sequence of events you've laid out. There's no way to know if I should do things one way or the other in general; it depends on lots of factors about which I can't reason locally.

This is not an abstraction, and it's no wonder it's not documented, because the effects are very difficult to describe clearly.

I appreciate your effort to explain it here, but to be clear I wasn't asking for help. I was characterizing the problem.

Ah, understood!

Agreed — documenting this requires striking a delicate balance between sufficiently motivating the use case to not use a container, and a lot of the more esoteric details that might follow from it (especially without using an example from a concrete Encoder).

Fortunately, at least, a guideline on this is at least easier to give: in the 99.999% use case, you should be using a container for all of your encoding. If you find yourself writing x.encode(to: encoder) inside of your own encode(to:), it's much more likely than not that you're not going to get the results you want or expect.

There's definitely room to make this clearer. If there were a way to leverage the compiler to help with this too, that would also be interesting.

2 Likes

Even just putting that guideline in the documentation for singleValueContainer would be a huge help.

2 Likes