[Pitch] JSONDecoder.nullDecodingStrategy

JSONDecoder currently does not differentiate between a payload containing an explicit null and the field being omitted entirely. I would like to propose an additive change that introduces a decoding strategy for these situations:

class JSONDecoder {
    ...

    enum NullDecodingStrategy {
        // Default; today's behavior.
        case implicit

        // If a struct contains "var anInt: Int?", valid JSON payloads must explicitly contain
        // either {"anInt": <integer>} or {"anInt": null}; omitting "anInt" results in a decoding error.
        // If a struct contains "var anInt: Int??", {"anInt": <integer>} decodes as .some(.some(<integer>)),
        // {"anInt": null} decodes as .some(.none), and omitting the field decodes as .none.
        case explicit
    }

    var nullDecodingStrategy: NullDecodingStrategy = .implicit

    ...
}

(DISCLAIMER: Might need some bikeshedding.)


JSONDecoder does actually differentiate between these cases — you can distinguish by calling decode(Int?.self, forKey: ...) (which necessitates that the field be present, but the value may be null) vs. decodeIfPresent(Int.self, forKey: ...) (which allows either the field to be omitted, or the value to be null):

import Foundation

struct Foo : Decodable {
    let a: Int?
    let b: Int?
    
    private enum CodingKeys: String, CodingKey {
        case a, b
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        a = try container.decode(Int?.self, forKey: .a)
        b = try container.decodeIfPresent(Int.self, forKey: .b)
    }
}

let payloads = [
    // Succeeds
    """
    {"a": 5, "b": 10}
    """.data(using: .utf8)!,
    
    // Succeeds
    """
    {"a": null, "b": 10}
    """.data(using: .utf8)!,
    
    // Fails, "a" may not be missing
    """
    {"b": 10}
    """.data(using: .utf8)!,
    
    // Succeeds
    """
    {"a": 5, "b": null}
    """.data(using: .utf8)!,
    
    // Succeeds
    """
    {"a": 5}
    """.data(using: .utf8)!
]

let decoder = JSONDecoder()
for payload in payloads {
    do {
        let foo = try decoder.decode(Foo.self, from: payload)
        print(foo)
    } catch {
        print("Failed to parse '\(String(data: payload, encoding: .utf8)!)': \(error)")
    }
}

One thing to keep in mind when using synthesized conformances is that optional values get decodeIfPresent(...) called for them, rather than decode(...), to be more permissive.
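Concretely, for a struct like Foo above but relying entirely on synthesis, the compiler-generated init(from:) behaves roughly like this hand-written equivalent (a sketch; the name Bar is illustrative):

```swift
import Foundation

// Roughly what the compiler synthesizes for a struct with optional stored
// properties: each one goes through decodeIfPresent, so a missing key and
// an explicit null both decode as nil.
struct Bar: Decodable {
    let a: Int?
    let b: Int?

    private enum CodingKeys: String, CodingKey { case a, b }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        a = try container.decodeIfPresent(Int.self, forKey: .a)
        b = try container.decodeIfPresent(Int.self, forKey: .b)
    }
}

let bar = try JSONDecoder().decode(Bar.self, from: Data(#"{"a": null}"#.utf8))
print(bar.a as Any, bar.b as Any) // a was null, b was missing; both come back nil
```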


This isn't unique to JSONDecoder, BTW; it follows from the requirements of the KeyedDecodingContainerProtocol.
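The relevant requirements look roughly like this (an abridged excerpt, not the full protocol; the real declaration repeats decode/decodeIfPresent for every primitive type and for generic T: Decodable):

```swift
// Abridged excerpt of the standard library's KeyedDecodingContainerProtocol.
public protocol KeyedDecodingContainerProtocol {
    associatedtype Key: CodingKey

    // Whether the key is present at all (an explicit null still counts as present).
    func contains(_ key: Key) -> Bool

    // Requires the key to be present; throws .keyNotFound otherwise.
    func decode(_ type: Int.Type, forKey key: Key) throws -> Int

    // Returns nil if the key is absent or its value is null.
    func decodeIfPresent(_ type: Int.Type, forKey key: Key) throws -> Int?

    // True if the key is present and its value is null; throws if the key is absent.
    func decodeNil(forKey key: Key) throws -> Bool
}
```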

Is there a use case you have in mind which this does not cover?


The problem with the Foo example above is that it still hides from any consumer of Foo whether the value was explicitly null or simply absent; that distinction is more aptly represented via nested Optionals (e.g. Optional<Optional<Int>>). While it's true that this can be implemented manually with additional legwork, the goal of NullDecodingStrategy is to streamline it and let synthesized conformances provide it for free. This is especially useful for data query languages (such as GraphQL) that make a strong semantic distinction between a field in a returned payload being null and that field being omitted altogether.

This would ideally look like:

struct Foo: Codable {
    var a: Int
    var b: Int?
    var c: Int??
}

let payloads = [
    // Succeeds, (a: 5, b: .some(10), c: .some(.some(15)))
    """
    {"a": 5, "b": 10, "c": 15}
    """.data(using: .utf8)!,

    // Succeeds, (a: 5, b: .some(10), c: .some(.none))
    """
    {"a": 5, "b": 10, "c": null}
    """.data(using: .utf8)!,

    // Succeeds, (a: 5, b: .some(10), c: .none)
    """
    {"a": 5, "b": 10}
    """.data(using: .utf8)!,

    // Succeeds, (a: 5, b: .none, c: .none)
    """
    {"a": 5, "b": null}
    """.data(using: .utf8)!,

    // Fails, "a" cannot be null
    """
    {"a": null, "b": 10, "c": 15}
    """.data(using: .utf8)!,

    // Fails, "a" may not be missing
    """
    {"b": 10}
    """.data(using: .utf8)!,

    // Fails, "b" may not be missing
    """
    {"a": 5}
    """.data(using: .utf8)!
]

let decoder = JSONDecoder()
decoder.nullDecodingStrategy = .explicit

for payload in payloads {
    do {
        let foo = try decoder.decode(Foo.self, from: payload)
        print(foo)
    } catch {
        print("Failed to parse '\(String(data: payload, encoding: .utf8)!)': \(error)")
    }
}
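For comparison, approximating these semantics by hand today requires a custom init along these lines (a sketch; Wrapper and its key are illustrative, covering just the Int?? case):

```swift
import Foundation

// Manually reproducing the proposed .explicit semantics for an Int?? field:
// missing key → .none, present null → .some(.none), present value → .some(.some(_)).
struct Wrapper: Decodable {
    let c: Int??

    private enum CodingKeys: String, CodingKey { case c }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        if container.contains(.c) {
            // decode(Int?.self) maps an explicit null to nil, so wrapping in
            // .some yields .some(.none) for null and .some(.some(_)) for a value.
            c = .some(try container.decode(Int?.self, forKey: .c))
        } else {
            c = .none
        }
    }
}

let decoder = JSONDecoder()
let missing = try decoder.decode(Wrapper.self, from: Data(#"{}"#.utf8))
let null = try decoder.decode(Wrapper.self, from: Data(#"{"c": null}"#.utf8))
let value = try decoder.decode(Wrapper.self, from: Data(#"{"c": 15}"#.utf8))
print(missing.c as Any, null.c as Any, value.c as Any)
```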

Having not worked with GraphQL directly, please enlighten me — in what use cases is it important for a consumer of Foo to know whether one of its fields was missing vs. null?

My concern about distinguishing between Optional<T> and Optional<Optional<T>> is threefold:

  1. The current behavior for handling optionals falls out very nicely from a very simple implementation on Optional that follows general rules: encoding an Optional<T> encodes the inner T if it is present, regardless of what T is. T could be a type like Int (in which case Int? unwraps to Int), or it could itself be an Optional: encoding an Int?? encodes an Int?, which encodes an Int. The rule for Optional itself stays very simple. (Decoding is similar here: decoding an Int?? decodes an Int?, which decodes an Int, wrapping the values as we unwind.)

    However, when we give preferential treatment to Optional<T> where T == Optional<U>, we can no longer rely on these general rules; we have to care about the specifics of what T is. Besides the runtime hassle of unwrapping Optionals (we can't statically ask whether T is an Optional, since Optional is generic), we'd have to perform this check on every decode call, which is less efficient.

  2. Giving preferential treatment to such a type also raises more questions — what does Int??? then imply? Int???? (i.e. what makes Optionalⁿ<T> special for n == 2?)

  3. It's not always valid to read into the type annotation in this way. Consider the following:

    struct Field<T> {
        let name: String
        // We might have a T, who knows.
        let value: T?

        init(name: String, value: T? = nil) {
            self.name = name
            self.value = value
        }
    }
    
    struct MyType {
        let foo: Field<Int?>
    }
    

    Through an ostensibly reasonable composition of types, we've ended up with a Field whose value is an Int??. Is it reasonable to read into the meaning of that on its own? Maybe, maybe not; either way, the decoder certainly can't distinguish between what you meant and what it got.

Given this, and the fact that T?? types can be difficult to consume correctly (if you're not careful about how you unwrap, you could end up throwing away information; this is just my personal opinion, though), would it not be nicer to work with a field type other than Optional<Optional<T>> which does actually semantically indicate the difference between the field being missing and the value being null? (You would lose automatic synthesis, yes, but you can add extensions to the container protocols that make it really easy to decode a Field<T> [where Field can be an enum that strongly distinguishes between .none and .missing].)
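As a sketch of that last suggestion (Field, decodeField, and GraphQLResult are illustrative names here, not an existing API):

```swift
import Foundation

// A field type that models all three states explicitly, plus a small
// KeyedDecodingContainer convenience for decoding it.
enum Field<T: Decodable> {
    case missing    // key absent from the payload
    case null       // key present, value is null
    case value(T)   // key present with a concrete value
}

extension KeyedDecodingContainer {
    func decodeField<T: Decodable>(_ type: T.Type, forKey key: Key) throws -> Field<T> {
        guard contains(key) else { return .missing }
        if try decodeNil(forKey: key) { return .null }
        return .value(try decode(T.self, forKey: key))
    }
}

struct GraphQLResult: Decodable {
    let score: Field<Int>

    private enum CodingKeys: String, CodingKey { case score }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        score = try container.decodeField(Int.self, forKey: .score)
    }
}

for json in [#"{}"#, #"{"score": null}"#, #"{"score": 42}"#] {
    let result = try JSONDecoder().decode(GraphQLResult.self, from: Data(json.utf8))
    print(json, "→", result.score)
}
```

You lose automatic synthesis for such a type, but the three states are impossible to conflate at the use site.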
