How to encode objects of unknown type?

How can I encode objects without knowing their type at compile time because I only hold a reference to a protocol type?

Here is a simple playground example:

/******************/

import Cocoa

protocol AnyFoo: class, Encodable {}

class Foo: AnyFoo {
    var foo = 0

    private enum CodingKeys: String, CodingKey {
        case foo
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(foo, forKey: .foo)
    }
}

let f = Foo()
let a: AnyFoo? = f

let encoder = JSONEncoder()

let data = try! encoder.encode(a)

print(String(data: data, encoding: .utf8)!)

/*********************/

This results in the following error:

generic parameter 'T' could not be inferred
let data = try! encoder.encode(a)

How could I do that?

Thanks

Manfred

You can build a type-erasing wrapper:

struct AnyEncodable: Encodable {

    private let _encode: (Encoder) throws -> Void
    public init<T: Encodable>(_ wrapped: T) {
        _encode = wrapped.encode
    }

    func encode(to encoder: Encoder) throws {
        try _encode(encoder)
    }
}

let sample1 = [1, 2]
let sample2 = ["foo": "bar"]
let wrapped = [AnyEncodable(sample1), AnyEncodable(sample2)]
let data = wrapped.map { try! JSONEncoder().encode($0) }

This hides the concrete type of the wrapped value while keeping a reference to the particular encode method used to encode the value.
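As a rough sketch of what that buys you, the same wrapper lets values of different concrete types share one array and be encoded in a single call. The wrapper is repeated so the snippet stands alone; the name `mixed` and the sample values are only illustrative.

```swift
import Foundation

// Same closure-based wrapper as above, repeated so this snippet is self-contained.
struct AnyEncodable: Encodable {
    private let _encode: (Encoder) throws -> Void

    init<T: Encodable>(_ wrapped: T) {
        // Capture the concrete type's encode(to:) while T is still known.
        _encode = wrapped.encode
    }

    func encode(to encoder: Encoder) throws {
        try _encode(encoder)
    }
}

// Illustrative values: an Int, a String and an [Int] in one homogeneous array.
let mixed = [AnyEncodable(1), AnyEncodable("two"), AnyEncodable([3, 4])]
let json = String(decoding: try JSONEncoder().encode(mixed), as: UTF8.self)
print(json) // [1,"two",[3,4]]
```

Note that the initializer is still generic, so the caller needs a concrete type at the point of wrapping.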

Thanks for the helpful reply.

The AnyEncodable struct works if I wrap a concretely typed object in it. If I wrap a value typed as the protocol (AnyFoo in the playground example), it fails again.

Is it wrong to use a protocol for type erasure purposes?

I don’t know if I’m the right person to explain that. Take a look at JSONEncoder.encode:

open func encode<T>(_ value: T) throws -> Data where T : Encodable

This says “give me a concrete value that conforms to Encodable”. Which means you can’t do this:

// error: cannot invoke 'encode' with an argument list of type '(Encodable)'
let e: Encodable = [1, 2]
let data = try! JSONEncoder().encode(e)

Which looks confusing – as far as I understand it, the catch is that protocols don’t conform to themselves, so e: Encodable here doesn’t fit the “concrete value that conforms to Encodable” type constraint.

These limitations are not cast in stone, they might be worked around in future versions of Swift.
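For later readers: as far as I know, Swift 5.7 and newer open existentials implicitly when they are passed to a generic parameter (SE-0352), so the snippet above is accepted by those compilers. A minimal sketch, assuming such a toolchain:

```swift
import Foundation

let e: Encodable = [1, 2]

// Assumption: Swift 5.7 or later. The compiler opens `e` and binds T to the
// underlying [Int]; older compilers reject this line with the error quoted above.
let data = try JSONEncoder().encode(e)
print(String(decoding: data, as: UTF8.self)) // [1,2]
```

On older toolchains a wrapper type is still needed.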

Indeed, the problem is that protocols don't always conform to themselves – AnyFoo cannot satisfy a generic placeholder T : Encodable as AnyFoo doesn't (currently) conform to Encodable.

One rather sneaky workaround in your case would be to use a protocol extension on Encodable:

extension Encodable {
  fileprivate func encode(to container: inout SingleValueEncodingContainer) throws {
    try container.encode(self)
  }
}

struct AnyEncodable : Encodable {
  var value: Encodable
  init(_ value: Encodable) {
    self.value = value
  }
  func encode(to encoder: Encoder) throws {
    var container = encoder.singleValueContainer()
    try value.encode(to: &container)
  }
}

let a: AnyFoo = Foo()
do {
  let data = try JSONEncoder().encode(AnyEncodable(a))
  print(String(decoding: data, as: UTF8.self))
} catch {
  print(error)
}

// {"foo":0}

This exploits the fact that existentials (protocol-typed values) are "opened" when you call a method on them, which gives the extension implementation access to the underlying concrete type, which can then be used to satisfy the T : Encodable placeholder for the single value container's encode(_:) method.
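To see the opening in isolation, here is a small sketch with no encoder involved. `genericTypeName` and `openedTypeName` are hypothetical helpers made up for this illustration; the generic function stands in for the container's encode(_:).

```swift
import Foundation

// Hypothetical generic function: it needs a concrete T: Encodable,
// just like the single value container's encode(_:).
func genericTypeName<T: Encodable>(_ value: T) -> String {
    return String(describing: T.self)
}

extension Encodable {
    // Inside the extension, Self is the concrete type behind the existential,
    // so `self` can satisfy the T: Encodable placeholder.
    func openedTypeName() -> String {
        return genericTypeName(self)
    }
}

let doubles: [Double] = [1.5, 2.5]
let values: [Encodable] = [42, "hello", doubles]

// Each call opens the existential and forwards the concrete value.
let names = values.map { $0.openedTypeName() }
print(names) // ["Int", "String", "Array<Double>"]
```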

This is a clever implementation of AnyEncodable. Most implementations write

func encode(to encoder: Encoder) throws {
    try self.value.encode(to: encoder)
}

which is incorrect as it doesn't give the encoder the opportunity to intercept value's type in order to apply things like encoding strategies. This actually ends up giving access to the inner type and preserving it correctly.

I guess this is the first implementation I've seen that I can largely endorse as Doing The Right Thing™. :sweat_smile:
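A sketch of the difference, using Date and JSONEncoder's dateEncodingStrategy. The names `ContainerWrapper`, `ForwardingWrapper` and `encodeInto` are made up for this comparison; they are not from any library.

```swift
import Foundation

extension Encodable {
    // Opens the existential so the container receives the concrete type.
    fileprivate func encodeInto(_ container: inout SingleValueEncodingContainer) throws {
        try container.encode(self)
    }
}

// Hypothetical wrapper: routes the value through a single-value container.
struct ContainerWrapper: Encodable {
    let value: Encodable
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try value.encodeInto(&container)
    }
}

// Hypothetical wrapper: forwards straight to the value's own encode(to:).
struct ForwardingWrapper: Encodable {
    let value: Encodable
    func encode(to encoder: Encoder) throws {
        try value.encode(to: encoder)
    }
}

let epoch = Date(timeIntervalSince1970: 0)
let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601

let viaContainer = String(decoding: try encoder.encode([ContainerWrapper(value: epoch)]), as: UTF8.self)
let viaForwarding = String(decoding: try encoder.encode([ForwardingWrapper(value: epoch)]), as: UTF8.self)

print(viaContainer)  // ["1970-01-01T00:00:00Z"] (the strategy is applied)
print(viaForwarding) // a bare number: Date's own representation, strategy ignored
```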

Now, I am curious. Why is encode even a generic method and doesn't simply take an Encodable (protocol) parameter?

I assume the encoding container is parameterised so that the compiler can type-check the coding keys at compile time (not for some intrinsic reason). But why is the encode method of the Encoder parameterised when all it does is call the encode method defined in the protocol?

Thanks a lot, this works now. I have to admit though, that I don't understand why and how.

Is there a place where I can read up on existentials and this topic?

This was a conscious design decision we made to help prevent folks from encoding type-erased values which they would not be able to decode, without at least thinking about it.

From long-standing experience in this area — we wanted to help prevent more cases of "well, my app only ever needs to encode values, so my encoding scheme is fine" from becoming "design requirements have changed and now we need to decode the data, but because we never had the types, we can't decode any of it". Archived data is forever, and unless you're aware of the tradeoffs you might be making by encoding type-erased values, you can write yourself into a hole.

Of course, when you really need to, it's always possible to write an AnyEncodable type as @hamishknight's answer shows (and perhaps we should include one in the stdlib if we can help prevent folks from shooting themselves in the foot with it), but it's meant as a "pit of correctness".

Hello @itaiferber. I marked this post of yours until I could give it a second read, because I have a running SQL decoder/encoder and I wanted it to do The Right Thing.

So you say that this code is correct (although it does not compile):

  func encode(to encoder: Encoder) throws {
    var container = encoder.singleValueContainer()
    try value.encode(to: &container) // isn't it container.encode(value) here, instead?
  }

And that this code is incorrect:

func encode(to encoder: Encoder) throws {
    try self.value.encode(to: encoder)
}

But how is anyone supposed to know, in general, which container to create? singleValueContainer()? unkeyedContainer()? container(keyedBy:)? And if the correct choice is container(keyedBy:), what is the CodingKey type?

See, for an example of "incorrect" use: https://github.com/groue/GRDB.swift/blob/1a91020a8406b031141a17883a47293a2ca62528/GRDB/Record/PersistableRecord%2BEncodable.swift#L205-L206 I know that container(keyedBy:) is needed here, but I don't know the correct CodingKey type.

Thoughts?

Sure, happy to elaborate!

Did you include the necessary extension on Encodable?

extension Encodable {
    fileprivate func encode(to container: inout SingleValueEncodingContainer) throws {
        try container.encode(self)
    }
}

If you did and this still isn't compiling for you, can you show your full code? (And what version of Swift are you using?)

To elaborate on this point: when you execute value.encode(to: encoder), you are directly invoking value's encode(to:) method, which directly allows it to request containers and encode in its preferred representation. For the vast majority of types out there, this behaves as expected.

This breaks down, though, for any type T whose preferred representation differs from the Encoder's preferred representation for the type. Let's use URL as an example, whose encode(to:) reads as follows:

public func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    try container.encode(self.relativeString, forKey: .relative)
    if let base = self.baseURL {
        try container.encode(base, forKey: .base)
    }
}

URL itself prefers a keyed container which allows it to encode its base and relative string separately, to preserve them. This representation makes sense for URL itself, but it would be pretty surprising to encode a URL with JSONEncoder and get out a dictionary (e.g. {"relative": "http://swift.org"}) rather than a string representation ("http://swift.org"). JSONEncoder, then, intercepts the URL type when it is encoded to encode the preferred string representation:

fileprivate func box_<T : Encodable>(_ value: T) throws -> NSObject? {
    // ...
    if T.self == URL.self || T.self == NSURL.self {
        // Encode URLs as single strings.
        return self.box((value as! URL).absoluteString)
    }

    // ...
}

This works, of course, only when the encoder gets a chance to see that what's being encoded is a URL.

So the key difference is that myURL.encode(to: encoder) asks URL to encode directly into the encoder by requesting a keyed container, while encoder.singleValueContainer().encode(myURL) gives the encoder a chance (via the single-value container) to see that what's actually being encoded is a URL instead of just seeing its contents.

The code above offers the latter option via try container.encode(self) rather than the simpler self.encode(to: encoder). You can see the effect of this in the following:

import Foundation

extension Encodable {
    fileprivate func encode(to container: inout SingleValueEncodingContainer) throws {
        try container.encode(self)
    }
}

struct AnyEncodable1 : Encodable {
    var value: Encodable
    init(_ value: Encodable) {
        self.value = value
    }
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try value.encode(to: &container)
    }
}

struct AnyEncodable2 : Encodable {
    var value: Encodable
    init(_ value: Encodable) {
        self.value = value
    }
    func encode(to encoder: Encoder) throws {
        try value.encode(to: encoder)
    }
}

struct MyThing : Encodable {
    let myURL = AnyEncodable1(URL(string: "http://swift.org")!)
}

let url = URL(string: "http://swift.org")
let encoder = JSONEncoder()
var data = try encoder.encode(MyThing())
print(String(data: data, encoding: .utf8)!) // => {"myURL":"http:\/\/swift.org"}

Changing the above to

struct MyThing : Encodable {
    let myURL = AnyEncodable2(URL(string: "http://swift.org")!)
}

produces {"myURL":{"relative":"http:\/\/swift.org"}}.

The container should always be a singleValueContainer(). Encoding a value into a single-value container is equivalent to encoding the value directly into the encoder, with the primary difference being the above: encoding into the encoder writes the contents of a type into the encoder, while encoding to a single-value container gives the encoder a chance to intercept the type as a whole.

(I can elaborate on this point if it would help.)
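To illustrate that the single-value container is enough regardless of the wrapped type's own representation, here is a sketch that wraps a keyed type, an array and a plain string with the same wrapper. `Point` is a hypothetical sample type, and .sortedKeys is set only to make the printed output deterministic.

```swift
import Foundation

extension Encodable {
    fileprivate func encode(to container: inout SingleValueEncodingContainer) throws {
        try container.encode(self)
    }
}

struct AnyEncodable: Encodable {
    var value: Encodable
    init(_ value: Encodable) {
        self.value = value
    }
    func encode(to encoder: Encoder) throws {
        // Always a single-value container; the wrapped value picks its own
        // keyed or unkeyed container once the encoder has had a look at it.
        var container = encoder.singleValueContainer()
        try value.encode(to: &container)
    }
}

// Hypothetical sample type with a synthesized keyed representation.
struct Point: Encodable {
    var x: Int
    var y: Int
}

let encoder = JSONEncoder()
encoder.outputFormatting = .sortedKeys

let items = [AnyEncodable(Point(x: 1, y: 2)), AnyEncodable([1, 2, 3]), AnyEncodable("text")]
let json = String(decoding: try encoder.encode(items), as: UTF8.self)
print(json) // [{"x":1,"y":2},[1,2,3],"text"]
```

The wrapper never has to know the CodingKey type: the wrapped value requests its own container when the encoder calls its encode(to:).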

Thank you very much @itaiferber! That's good food for thought. Please let me digest this information, now ;-)
