[Proposal] introducing NonConformingType strategies with Encoder/Decoder

The current Encoder/Decoder has a problem: for most types (everything except Date, Data and Float), there is no way to customize the encoding/decoding strategy.
I want to provide one.

Proposed solution

For example, for Decoder:

struct NonConformingTypeDecodingStrategies {
    subscript<T>(type: T.Type) -> NonConformingTypeDecodingStrategy<T>?
}

public enum NonConformingTypeDecodingStrategy<T> {
    /// Throw upon encountering non-conforming values. This is the default strategy.
    case `throw`

    /// Use the given value in place of a non-conforming value.
    case `default`(T)

    /// Decode as a custom value by the given closure.
    case custom((_ decoder: Decoder) throws -> T)
}

open class JSONDecoder {
    ...
    open var nonConformingTypeDecodingStrategies: NonConformingTypeDecodingStrategies
}

Usage

let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601
decoder.dataDecodingStrategy = .custom(myBase85Decoder)
decoder.nonConformingFloatDecodingStrategy = .convertFromString(positiveInfinity: "INF", negativeInfinity: "-INF", nan: "NaN")
decoder.nonConformingTypeDecodingStrategies[Int.self] = .default(0)
decoder.nonConformingTypeDecodingStrategies[CGPoint.self] = .custom { decoder in
    // By default, CGPoint is decoded from `[1.0, 1.0]`, but I prefer `{"x": 1.0, "y": 1.0}` instead.
    enum CodingKeys: String, CodingKey {
        case x
        case y
    }
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let x = try container.decode(Double.self, forKey: .x)
    let y = try container.decode(Double.self, forKey: .y)
    return CGPoint(x: x, y: y)
}

Detailed design

private enum AnyNonConformingTypeDecodingStrategy {
    /// Throw upon encountering non-conforming values. This is the default strategy.
    case `throw`

    /// Assume the values as the given value.
    case `default`(Any)

    /// Decode as a custom value by the given closure.
    case custom((_ decoder: Decoder) throws -> Any)

    init<T>(original: NonConformingTypeDecodingStrategy<T>) {
        switch original {
        case .throw: self = .throw
        case .default(let t): self = .default(t as Any)
        case .custom(let closure): self = .custom({ return try closure($0) as Any })
        }
    }
}
private extension NonConformingTypeDecodingStrategy {
    // `T` here is the enum's own generic parameter; redeclaring it on the method would shadow it.
    static func from(erased: AnyNonConformingTypeDecodingStrategy) -> NonConformingTypeDecodingStrategy<T> {
        switch erased {
        case .throw: return .throw
        case .default(let any): return .default(any as! T)
        case .custom(let closure): return .custom({ return try closure($0) as! T })
        }
    }
}

struct NonConformingTypeDecodingStrategies {
    private var store: [String: AnyNonConformingTypeDecodingStrategy] = [:]

    subscript<T>(type: T.Type) -> NonConformingTypeDecodingStrategy<T>? {
        get {
            return store[String(describing: type)].map(NonConformingTypeDecodingStrategy<T>.from(erased:))
        }
        set {
            store[String(describing: type)] = newValue.map(AnyNonConformingTypeDecodingStrategy.init(original:))
        }
    }
}
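
The design above does not spell out how a decoder would consult a strategy, so here is a minimal sketch of one possible reading. The helper `decodeValue(from:strategy:)` and the `IntOrZero` type are hypothetical names for illustration only, and the semantics are assumptions: `.throw` decodes normally, `.default` falls back when normal decoding throws, and `.custom` replaces normal decoding entirely.

```swift
import Foundation

enum NonConformingTypeDecodingStrategy<T> {
    case `throw`
    case `default`(T)
    case custom((_ decoder: Decoder) throws -> T)
}

/// Hypothetical helper (not part of Foundation): applies a strategy while decoding a `T`.
func decodeValue<T: Decodable>(
    from decoder: Decoder,
    strategy: NonConformingTypeDecodingStrategy<T>
) throws -> T {
    switch strategy {
    case .throw:
        // Normal decoding; errors propagate to the caller.
        return try decoder.singleValueContainer().decode(T.self)
    case .default(let fallback):
        // Assumption: any decoding error counts as "non-conforming".
        return (try? decoder.singleValueContainer().decode(T.self)) ?? fallback
    case .custom(let closure):
        // Assumption: the closure fully replaces the built-in decoding.
        return try closure(decoder)
    }
}

/// Hypothetical stand-in for "Int with a .default(0) strategy".
struct IntOrZero: Decodable {
    let value: Int
    init(from decoder: Decoder) throws {
        value = try decodeValue(from: decoder, strategy: .default(0))
    }
}

func decodeInts(_ json: String) throws -> [Int] {
    try JSONDecoder().decode([IntOrZero].self, from: Data(json.utf8)).map { $0.value }
}
```

A real implementation would live inside JSONDecoder and look the strategy up by type, rather than requiring a wrapper such as `IntOrZero`.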

Thanks for taking the time to put this together! This is an idea we considered early on in the development of JSONEncoder/JSONDecoder, but ultimately rejected at the time because it felt like it too strongly violated the principles of encapsulation. It's common to include types which are not your own in payloads, and controlling from the top level how their types encode and decode may be a violation of the autonomy of those types. It's possible, too, that they have built-in assumptions about how their types are represented, and when those assumptions are violated, the code no longer behaves as expected.

We did this for Dates and Data specifically because they are

  1. Incredibly common types, which are likely to have the same representation across your entire payload, because
  2. Specific format requirements are likely controlled by whatever software is running on a server, and for those very integral types, the server is likely opinionated/has specific requirements

However, for less common types, the recommended approach — if you have specific format requirements/preferences — is to write an adapter type which does have the format you'd like to use:

struct PointWrapper : Decodable {
    let point: CGPoint
    init(_ point: CGPoint) {
        self.point = point
    }

    private enum CodingKeys : String, CodingKey {
        case x
        case y
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(CGFloat.self, forKey: .x)
        let y = try container.decode(CGFloat.self, forKey: .y)
        self.init(CGPoint(x: x, y: y))
    }
}

You can then either use that type directly, e.g.:

struct Polygon : Decodable {
    let points: [CGPoint]
}

becomes

struct Polygon : Decodable {
    let points: [PointWrapper]
}

or use that type only as an intermediate type during encoding and decoding:

struct Polygon : Decodable {
    let points: [CGPoint]

    private enum CodingKeys : String, CodingKey {
        case points
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        points = try container.decode([PointWrapper].self, forKey: .points).map { $0.point }
    }
}
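
To check the intermediate-type variant end to end, here is a minimal self-contained sketch. The `decodePolygon` helper and the sample JSON are illustrative assumptions, and the wrapper is trimmed to what decoding needs.

```swift
import Foundation

struct PointWrapper: Decodable {
    let point: CGPoint

    private enum CodingKeys: String, CodingKey {
        case x
        case y
    }

    init(from decoder: Decoder) throws {
        // Read the point from a keyed `{"x": ..., "y": ...}` object.
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(CGFloat.self, forKey: .x)
        let y = try container.decode(CGFloat.self, forKey: .y)
        point = CGPoint(x: x, y: y)
    }
}

struct Polygon: Decodable {
    let points: [CGPoint]

    private enum CodingKeys: String, CodingKey {
        case points
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Decode through the wrapper, then unwrap to plain CGPoint values.
        points = try container.decode([PointWrapper].self, forKey: .points).map { $0.point }
    }
}

// Hypothetical convenience for trying the types out.
func decodePolygon(_ json: String) throws -> Polygon {
    try JSONDecoder().decode(Polygon.self, from: Data(json.utf8))
}
```

Calling `decodePolygon` with a payload such as `{"points": [{"x": 1, "y": 2}]}` yields a `Polygon` whose `points` are ordinary `CGPoint` values, so the wrapper never leaks into the rest of the code.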

It's a bit more verbose than what you might have in a .custom closure, but it's less ad-hoc and can lead to less surprising behavior because you can control where and how the custom type is used. Is there a use-case that this approach doesn't meet, though?


Thank you for your reply!

they have built-in assumptions about how their types are represented

Yes, they have assumptions about the payload structure. However, the actual payload sometimes differs from the built-in representation, as in the CGPoint example.

Dates and Data specifically because they are ... likely to have the same representation across your entire payload

I agree. Not only Date and Data, but all types are likely to have the same representation across the entire payload. A type declares its one and only representation by conforming to Codable.
The current problem is that, when the payload differs from that assumption, there is no way to adopt the format that is "really" expected instead of the built-in one.
So I realized that introducing strategies is not the best way to address my motivation. A better way is to

  • remove the built-in Codable conformance from some types (Data, Date, CGPoint and more).
  • let developers define their own extension Foo: Codable that conforms to their "real" format.

My conclusion strays from the topic, so I would like to create another thread to discuss it. Is that OK?


By the way, back to the topic.
Whether or not to extend the strategies might still be worth discussing.

it felt like it too strongly violated the principles of encapsulation

I'm not sure what "the principles of encapsulation" means here.
All types provide some initializers besides init(from decoder: Decoder), and the .custom strategy just uses one of them. That was not guaranteed in the old days, but SE-0189 now ensures it. So the customization doesn't violate encapsulation, does it?

Is there a use-case that this approach doesn’t meet, though?

Wrapping with an adapter type is a good solution, but it can be overkill in some cases.
For example, the code below fails with an error.

struct A: Decodable {
    let url: URL?
}
let data = """
{ "url": "" }
""".data(using: .utf8)!

try JSONDecoder().decode(A.self, from: data)

I want URL? to ignore thrown errors, but to get that I have to either create an adapter type or write my own init(from:). Both seem like too much to me, especially when A has tons of other properties.
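
For comparison, the adapter route for this case can be written once and reused. This is a minimal sketch, assuming a hypothetical generic `Lossy<T>` wrapper (not part of Foundation) that turns any decoding error into nil. `decodeA` is likewise an illustrative helper.

```swift
import Foundation

/// Hypothetical adapter: holds a decoded `T`, or nil if decoding `T` threw.
struct Lossy<T: Decodable>: Decodable {
    let value: T?

    init(from decoder: Decoder) throws {
        // Go through a single-value container so that JSONDecoder's own
        // handling of types such as URL and Date still applies.
        let container = try decoder.singleValueContainer()
        value = try? container.decode(T.self)
    }
}

struct A: Decodable {
    // The key must still be present; only a bad value is tolerated.
    let url: Lossy<URL>
}

func decodeA(_ json: String) throws -> A {
    try JSONDecoder().decode(A.self, from: Data(json.utf8))
}
```

The cost is that callers read `a.url.value` instead of `a.url`, which is the verbosity I would like to avoid, but the other properties of A keep their synthesized decoding.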