Custom Encoder / Decoder supporting RawRepresentable

I give up figuring this out myself.

A struct that conforms to both RawRepresentable and Codable, and whose rawValue conforms to Codable, should get a default implementation for Codable.

So this

struct MyRawType
{
   let rawValue: Int
}

extension MyRawType: RawRepresentable {}
extension MyRawType: Codable {}

should just work with my custom encoder. Shouldn't it?
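For comparison, here is a minimal sketch of how such a type behaves with Foundation's JSONEncoder. The standard library provides default encode(to:) and init(from:) implementations for RawRepresentable types with common raw value types such as Int, and those go through a single-value container. ManualRawType is a hypothetical name, added only to show roughly what that default amounts to when written by hand.

```swift
import Foundation

// Same shape as the type in the question: the raw value is an Int.
struct MyRawType: RawRepresentable, Codable {
    let rawValue: Int
}

// Hypothetical hand-written counterpart of the default encode(to:):
// it asks for a single-value container and encodes only the raw value.
struct ManualRawType: Encodable {
    let rawValue: Int

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(rawValue)
    }
}

struct Animal: Codable {
    let id: MyRawType
    let name: String
}

let jsonEncoder = JSONEncoder()
jsonEncoder.outputFormatting = .sortedKeys // stable key order for the comparison below

let tigerData = try jsonEncoder.encode(Animal(id: MyRawType(rawValue: 42), name: "Tiger"))
let tigerJSON = String(decoding: tigerData, as: UTF8.self)
print(tigerJSON) // {"id":42,"name":"Tiger"}

let manualData = try jsonEncoder.encode([ManualRawType(rawValue: 42)])
let manualJSON = String(decoding: manualData, as: UTF8.self)
print(manualJSON) // [42]
```

So the flattening is not something the encoder does by itself: the type's own encode(to:) asks for a single-value container, and the encoder only has to call into it.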

If I step through I always end up in

mutating func encode<T>(_ value: T, forKey key: Key) throws 
where T: Encodable { ... }

My custom encoder can encode Int and structs like

private struct Animal : Codable, Equatable
{
	let id: ...//Int -> Ok. MyRawType -> not Ok
	let name: String
}

My encoder does work when I encode MyRawType directly as in:

encoder.encode(MyRawType(rawValue: 42))

Where is the magic supposed to happen?

I'm sure a lot of hard work went into Codable and I am probably missing a lot of nuances/edge cases/..., but jeez, writing custom encoders/decoders is far more difficult than it has any right to be. :face_with_head_bandage:

To be clear, if you have an Animal (using id: MyRawType) like

let tiger = Animal(id: MyRawType(rawValue: 42), name: "Tiger")

is the output you're seeing equivalent to

["id": ["rawValue": 42], "name": "Tiger"]

? And when you say that you always end up in encode<T>(_:forKey:), do you mean that this method is called inside of Animal.encode(to:) to encode id, or inside of MyRawType.encode(to:) itself? (Because you would expect this method to be called when encoding Animal, for each of its properties.)

If my understanding of the issue is correct, are you able to share some of your Encoder implementation so we can get a minimal reproducing example? In particular, the fact that encoding MyRawType directly with encoder.encode(MyRawType(rawValue: 42)) works sounds very suspicious.

If encoder here is your top-level encoder, then it sounds possible that somewhere in your implementation of keyed containers, you may (unintentionally) not be dispatching to the right encode(to:). That is, if MyRawType only encodes incorrectly when contained inside of another type, then I'd start looking into how Animal itself is encoded into keyed containers, and how that leads to id encoding.

With a bit more info, we should be able to figure out what's going on here.

(Also, just for confirmation — which version of Swift are you using?)

Swift 5.6, aiming at Swift 6.0. So constantly refactoring and trying things out.

The user defines a schema.
That schema is stored within the database.
The database does its thing, like optimising column order or whatever.
The user can provide a dictionary of key-value pairs (or that is the result of a SQL statement):

["id": RowID(rawValue: 42), "flag": true, "name" : "kitty"]

This dictionary can pass through the encoder.
The user can provide a struct like Animal which passes through the encoder.
The encoder returns a dictionary of [String: Element] pairs.
Element is something that can be stored within the DB.
It's encoded once, written many times: the row itself, indices, views, whatever.
These are all containers that together form a table.

Are keys always processed in the same order by Codable? I genuinely don't know.
But it doesn't really matter; I reserve the right to store the columns on disk as I see fit.

It has to be a struct because it has to be Sendable.
Nested structs? Make a graph. Maybe at some point this can be relaxed and a nested struct just gets its own container. But that depends on how well I get a handle on encoders.

Point being: I want this encoder to turn a lovely Swift struct into [String: Element] which I can process.

So kitty becomes:

["id": Element(.numeric(8), .numeric(42)),
 "flag": Element(.flag(true), .none),
 "name": Element(.text(5), .text("kitty"))]

The database row is flat, so id: ["rawValue": 42] wouldn't work.

struct RowID : Sendable
{
	let rawValue: Int
}

extension RowID : RawRepresentable
{
}

extension RowID : Codable
{
}

private struct Animal : Codable, Equatable
{
    let id: RowID
    let flag: Bool
    let name: String
}
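// RowID is not ExpressibleByIntegerLiteral, so the raw value has to be wrapped explicitly
private let cat = Animal(id: RowID(rawValue: 42), flag: true, name: "kitty")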

func testRowID() throws
{
	let value = RowID(rawValue: 42)
	XCTAssertNotNil(try encoder.encode(value))
	XCTAssertEqual(try encoder.encode(value), ValueEncoder.Element(value: 42))
}

func testValue() throws
{
	let encoded = try encoder.encode(cat)

	XCTAssertEqual(encoded["id"], .init(value: cat.id.rawValue))
	XCTAssertEqual(encoded["name"], .init(value: cat.name))
}

func testOptionalValue() throws
{
	let value : Animal? = cat
	let encoded : [String: ValueEncoder.Element] = try encoder.encode(value)

	XCTAssertEqual(encoded["id"], .init(value: cat.id.rawValue))
	XCTAssertEqual(encoded["name"], .init(value: cat.name))
}

final class ValueEncoder
{
	//MARK: definitions
	typealias Element = PackageWriter.Element

	//MARK: properties
	fileprivate var stack: Element? = nil
//	fileprivate var array: [Element]? = nil
	fileprivate var content: [String : Element] = [:]

	/* required */
	let codingPath: [CodingKey] = [] //might be useful for nested structs?
	let userInfo: [CodingUserInfoKey : Any] = [:] //might be useful for encoding date strategies or so?
}

extension ValueEncoder
{
	//a standalone value has no key, so just return the element
	func encode(_ value: (any Encodable)?) throws -> Element
	{
		defer{ stack = nil }
		try value?.encode(to: self)
		guard let element = stack else { return .init() }
		return element
	}

	func encode(_ value: any Encodable) throws -> [String: Element]
	{
		try value.encode(to: self)
		return content
	}

//	func encode<T>(_ value: T) throws -> Element
//	where T: Encodable & RawRepresentable, T.RawValue : Encodable
//	{
//		try encode(value.rawValue)
//	}
}

extension ValueEncoder : Encoder
{
	func container<Key>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key>
	where Key : CodingKey
	{
		.init(ValueEncoderContainer(encoder: self))
	}

	func unkeyedContainer() -> UnkeyedEncodingContainer { fatalError() } // { UnkeyedValueEncoderContainer(encoder: self) }
	func singleValueContainer() -> SingleValueEncodingContainer { SingleValueEncoderContainer(encoder: self) }
}

private struct ValueEncoderContainer<Key>
where Key: CodingKey
{
	//MARK: properties
	let encoder: ValueEncoder
	var singular: SingleValueEncoderContainer

	/* required */
	let codingPath: [CodingKey] = []

	//MARK: init
	fileprivate init(encoder: ValueEncoder)
	{
		self.encoder = encoder
		self.singular = .init(encoder: encoder)
	}
}

extension ValueEncoderContainer : KeyedEncodingContainerProtocol
{
	mutating func superEncoder() -> Encoder { encoder }
	mutating func superEncoder(forKey key: Key) -> Encoder { encoder }
	mutating func nestedUnkeyedContainer(forKey key: Key) -> UnkeyedEncodingContainer { fatalError() }

	mutating func nestedContainer<NestedKey>(keyedBy keyType: NestedKey.Type, forKey key: Key) -> KeyedEncodingContainer<NestedKey>
	where NestedKey: CodingKey
	{
		encoder.container(keyedBy: keyType)
	}
}

extension ValueEncoderContainer
{
	/* optionals */
	mutating func encodeNil(forKey key: Key) throws
	{
		try self.singular.encodeNil()
		try wrap(key: key)
	}

	/* booleans */
	mutating func encode(_ value: Bool, forKey key: Key) throws
	{
		try self.singular.encode(value)
		try wrap(key: key)
	}

	/* strings */
	mutating func encode(_ value: String, forKey key: Key) throws
	{
		try self.singular.encode(value)
		try wrap(key: key)
	}

	/* integers */
	mutating func encode<T>(_ value: T, forKey key: Key) throws
	where T: Encodable & FixedWidthInteger
	{
		try self.singular.encode(value)
		try wrap(key: key)
	}

	/* floating points */
	mutating func encode(_ value: Double, forKey key: Key) throws
	{
		try encode(value.bitPattern, forKey: key)
	}

//	mutating func encode(_ value: Date, forKey key: Key) throws
//	{
//		try encode(value.timeIntervalSinceReferenceDate, forKey: key)
//	}

	/* generic */
//	mutating func encode<T>(_ value: T, forKey key: Key) throws
//	where T: Encodable & RawRepresentable, T.RawValue: Encodable & FixedWidthInteger
//	{
//		try encode(value.rawValue, forKey: key)
//	}

	mutating func encode<T>(_ value: T, forKey key: Key) throws
	where T: Encodable
	{
		switch value
		{
//			case let date as Date: try encode(date, forKey: key)
//			case let rowID as RowID: try encode(rowID.rawValue, forKey: key)
			default:
				throw Error.unsupported(String(describing: type(of: value)))
		}
	}
}

private extension ValueEncoderContainer
{
	func wrap(key: Key) throws
	{
		guard let element = self.encoder.stack else { throw KeyError.unknown(key) }
		encoder.content[key.stringValue] = element
		encoder.stack = nil
	}
}

private struct SingleValueEncoderContainer
{
	//MARK: properties
	let encoder : ValueEncoder

	/* required */
	let codingPath: [CodingKey] = []

	//MARK: init
	fileprivate init(encoder: ValueEncoder)
	{
		self.encoder = encoder
	}
}

extension SingleValueEncoderContainer : SingleValueEncodingContainer
{
	/* optionals */
	mutating func encodeNil() throws
	{
		encoder.stack = .init()
	}

	/* booleans */
	mutating func encode(_ value: Bool) throws
	{
		encoder.stack = .init(value: value)
	}

	/* strings */
	mutating func encode(_ value: String) throws
	{
		encoder.stack = .init(value: value)
	}

	/* integers */
	mutating func encode<T>(_ value: T) throws
	where T: Encodable & FixedWidthInteger
	{
		encoder.stack = .init(value: value)
	}

	/* floating points */
	mutating func encode(_ value: Float32) throws
	{
		try encode(value.bitPattern)
	}

	mutating func encode(_ value: Float64) throws //repetitive :( 
	{
		try encode(value.bitPattern)
	}

//	mutating func encode(_ value: Date) throws
//	{
//		try encode(value.timeIntervalSinceReferenceDate)
//	}

	/* generic */
	mutating func encode<T>(_ value: T) throws
	where T : Encodable
	{
		try value.encode(to: encoder)
	}
}

I got to this point thanks to NSBlog and JSONEncoder.
Without those examples, it's very unclear that you need encode functions in ValueEncoder to kick off this thing. Naively you start with ValueEncoder conforming to those protocols.

Are the encodeIfPresent functions required, actually?

Yes, I can get it to work for RowID specifically, but if the user has some custom RawRepresentable value then the schema would just be "custom" and not "custom.rawValue" or whatever. It should be flattened in the case of RawRepresentable.

Thanks for sharing this, though unfortunately I think I'm still not 100% clear on what the specific issue is — so please let me know if my understanding is incorrect.

Taking your (rough) example of

let cat = Animal(id: RowID(rawValue: 42), flag: true, name: "kitty")
let encoder = ValueEncoder()
let encoded = try encoder.encode(cat)

, I see we end up throwing an unsupported error in ValueEncoderContainer.encode<T>(_:forKey:), as if we weren't supposed to end up there. This is where my understanding of your expectations might be breaking down, because I do expect us to end up here: we're encoding Animal, which will call this method for each one of its keys.

In general, I expect the encoding tree to look something like this:

// encoder.encode(cat)
ValueEncoder.encode(_:) -> [String: Element] // top level
└ Animal.encode(to:) // grabs a ValueEncoderContainer to encode properties
  ├ ValueEncoderContainer.encode<T>(_:forKey:) // encode row
  │ └ RowID.encode(to:) // grabs a SingleValueEncoderContainer to encode raw value
  │   └ SingleValueEncoderContainer.encode<T: FixedWidthInteger>(_:)
  ├ ValueEncoderContainer.encode(_: Bool, forKey:) // encode flag
  └ ValueEncoderContainer.encode(_: String, forKey:) // encode name

For each property on Animal, we'll be calling one of the encode(_:forKey:) methods on ValueEncoderContainer, and in the structural case (encode<T>, where T isn't a known Element type), we need to further dispatch into the type's encode(to:) to get it to splat out its contents into the containers.

If instead of throwing inside of ValueEncoderContainer.encode<T>(_:forKey:), I replace the implementation with

mutating func encode<T: Encodable>(_ value: T, forKey key: Key) throws {
	try value.encode(to: self.encoder)
	try wrap(key: key)
}

then encoding happens exactly as I understand your expected results to be, and I get

["flag": Element.boolean(true), "id": Element.numeric(42), "name": Element.text("kitty")]

(Because you don't share the structure of Element I had to make up my own definition here, but this looks approximately like what you show.)
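To make the dispatch concrete, here is a self-contained sketch of the same idea. It is not the code from this thread: Element is a made-up enum, and FlatEncoder, Keyed, Single and unsupported(_:at:) are hypothetical names. It relies on the generic encode methods acting as witnesses for all the primitive overloads of the container protocols, and it only handles Bool, Int and String.

```swift
// Hypothetical stand-in for the thread's Element type (its real definition was not shared).
enum Element: Equatable {
    case null
    case flag(Bool)
    case numeric(Int)
    case text(String)
}

// Hypothetical helper that builds the error thrown for anything this sketch cannot flatten.
func unsupported(_ value: Any, at path: [CodingKey]) -> EncodingError {
    .invalidValue(value, .init(codingPath: path, debugDescription: "cannot flatten \(type(of: value))"))
}

final class FlatEncoder: Encoder {
    let codingPath: [CodingKey] = []
    let userInfo: [CodingUserInfoKey: Any] = [:]
    var pending: Element?               // last value written through a single-value container
    var row: [String: Element] = [:]    // the flat result

    func encode<T: Encodable>(_ value: T) throws -> [String: Element] {
        row = [:]
        try value.encode(to: self)
        return row
    }

    func container<Key: CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key> {
        KeyedEncodingContainer(Keyed<Key>(encoder: self))
    }
    func unkeyedContainer() -> UnkeyedEncodingContainer { fatalError("arrays are out of scope here") }
    func singleValueContainer() -> SingleValueEncodingContainer { Single(encoder: self) }
}

struct Keyed<Key: CodingKey>: KeyedEncodingContainerProtocol {
    let encoder: FlatEncoder
    var codingPath: [CodingKey] { encoder.codingPath }

    mutating func encodeNil(forKey key: Key) throws { encoder.row[key.stringValue] = Element.null }

    // Every property ends up here: let the value encode itself, then file the result under its key.
    mutating func encode<T: Encodable>(_ value: T, forKey key: Key) throws {
        encoder.pending = nil
        try value.encode(to: encoder)
        guard let element = encoder.pending else { throw unsupported(value, at: codingPath) }
        encoder.row[key.stringValue] = element
        encoder.pending = nil
    }

    mutating func nestedContainer<NestedKey: CodingKey>(keyedBy keyType: NestedKey.Type, forKey key: Key) -> KeyedEncodingContainer<NestedKey> { fatalError("nesting is out of scope here") }
    mutating func nestedUnkeyedContainer(forKey key: Key) -> UnkeyedEncodingContainer { fatalError("nesting is out of scope here") }
    mutating func superEncoder() -> Encoder { encoder }
    mutating func superEncoder(forKey key: Key) -> Encoder { encoder }
}

struct Single: SingleValueEncodingContainer {
    let encoder: FlatEncoder
    var codingPath: [CodingKey] { encoder.codingPath }

    mutating func encodeNil() throws { encoder.pending = Element.null }

    // Primitives stop the recursion here; anything else is rejected in this sketch.
    mutating func encode<T: Encodable>(_ value: T) throws {
        switch value {
        case let flag as Bool: encoder.pending = .flag(flag)
        case let number as Int: encoder.pending = .numeric(number)
        case let text as String: encoder.pending = .text(text)
        default: throw unsupported(value, at: codingPath)
        }
    }
}

struct RowID: RawRepresentable, Codable { let rawValue: Int }
struct Animal: Codable { let id: RowID; let flag: Bool; let name: String }

let flatRow = try FlatEncoder().encode(Animal(id: RowID(rawValue: 42), flag: true, name: "kitty"))
```

With these definitions flatRow should hold .numeric(42) under "id", .flag(true) under "flag" and .text("kitty") under "name". There is no RowID-specific code anywhere: the id column is flattened because RowID's own encode(to:) asks for a single-value container and hands over its raw value.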

Is this the result you're looking for, or am I missing something?


To answer some of the other questions you bring up:

Keys are handed to the encoder in the order laid out in the code:

  • If encode(to:) is implemented by hand, it'll be in whatever order the developer chose to encode
  • If encode(to:) is synthesized by the compiler, it'll generally be in the order of the keys in the CodingKeys enum (which is in turn generally in the order of the properties on the type) — but this isn't guaranteed

In either case, it's up to the Encoder to decide how it wants to lay out the data for output. So you're well within your right to store things as you wish.

You don't need to provide implementations for the encodeIfPresent methods unless you have a specific reason to — default implementations are provided in terms of the regular encode overloads.
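A small way to see the default encodeIfPresent behaviour, using JSONEncoder rather than a custom encoder: the synthesized encode(to:) uses encodeIfPresent for optional properties, so a nil value is skipped and a present value is forwarded to the regular encode overload. Profile is a hypothetical type used only for this illustration.

```swift
import Foundation

// Hypothetical example type with one optional property.
struct Profile: Codable {
    let name: String
    let nickname: String? // encoded via encodeIfPresent by the synthesized encode(to:)
}

let profileEncoder = JSONEncoder()
profileEncoder.outputFormatting = .sortedKeys // stable key order for comparison

let withoutNickname = try String(
    decoding: profileEncoder.encode(Profile(name: "kitty", nickname: nil)),
    as: UTF8.self
)
let withNickname = try String(
    decoding: profileEncoder.encode(Profile(name: "kitty", nickname: "kit")),
    as: UTF8.self
)

print(withoutNickname) // {"name":"kitty"}
print(withNickname)    // {"name":"kitty","nickname":"kit"}
```

A custom keyed container gets the same behaviour without implementing any encodeIfPresent method itself, as long as its regular encode overloads work.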


Thank you! It works now.

My implementation and/or understanding of

mutating func encode<T: Encodable>(_ value: T, forKey key: Key) throws

was just wrong.

For some reason I thought that once I get in this function I have to deal with whatever value I'm given.
In other words, that the compiler already took care of splatting the type's content.
Calling encode on the value itself seemed like infinite recursion.
That is the only explanation I can offer for my faulty code.

I assume it's the same for the decoder.

func decode<T>(_ type: T.Type) throws -> T
where T : Decodable
{
	try T.init(from: self.decoder)
}

I'll probably refactor my decoder but at the moment it works too. :smiley:

If you've got any tips on further improvements or criticism of this encoder implementation, please don't hold back. That I'm currently not using codingPath: [CodingKey] makes me wonder which improvements I'm overlooking.

the order of the keys in the CodingKeys enum (which is in turn generally in the order of the properties on the type), but this isn't guaranteed

I could have gone with using the layout of a struct as the order of columns, thus bypassing the need for a schema. But that seemed fragile. The user might re-arrange things in their code and I was never sure about the generated order. Good thing I didn't go down this road. :slight_smile:

Excellent, glad I could help a bit!

One major thing to keep in mind with Codable (esp. when you're implementing an Encoder/Decoder) is that there's really no "magic" — even for the default Codable synthesis for types, the compiler doesn't do anything but generate boilerplate that you could have written by hand; so in that sense, there's nothing special about these APIs (which means you'll also need to handle it all yourself).
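As an illustration of the "no magic" point, this is roughly the boilerplate the compiler synthesizes for a simple type, written out by hand. Pet is a hypothetical type; the sketch assumes the usual shape of synthesized conformances (a CodingKeys enum plus one decode/encode call per property).

```swift
import Foundation

// Hypothetical type, used only to show the boilerplate.
struct Pet: Equatable {
    let id: Int
    let name: String
}

extension Pet: Codable {
    // One key per stored property.
    enum CodingKeys: String, CodingKey {
        case id, name
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        id = try container.decode(Int.self, forKey: .id)
        name = try container.decode(String.self, forKey: .name)
    }

    func encode(to encoder: Encoder) throws {
        // Each property becomes one encode(_:forKey:) call on the encoder's keyed container.
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(id, forKey: .id)
        try container.encode(name, forKey: .name)
    }
}

let originalPet = Pet(id: 42, name: "Tiger")
let roundTripped = try JSONDecoder().decode(Pet.self, from: JSONEncoder().encode(originalPet))
print(roundTripped == originalPet) // true
```

Everything past the container calls is up to the Encoder or Decoder, which is why a custom encoder has to call value.encode(to:) itself for types it does not treat as primitives.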

Yep, the decoder works the same way! decode<T>(_:) largely gives you the opportunity to do something special for known Ts, but otherwise, you just defer to the type's own init(from:). This is largely what JSONDecoder does too.

I think there's room for improvement here, and a lot of it will come through completing the implementation and getting a better intuitive understanding of what's going on — so not too much specific to say beyond things like "I found it a bit confusing that you have a storage variable named stack which isn't really a stack, and only holds a single value; and depending on how you encode, you can forget to write it back to storage, causing it to get overwritten".

On a higher level, I recommend trying to read through other Encoder/Decoder implementations out there, as JSONEncoder/JSONDecoder are pretty verbose, and it's easy to get lost in there.

  • The JSONEncoder/JSONDecoder used on Darwin is pretty dense, but the swift-corelibs-foundation library also includes a different implementation for use on non-Darwin platforms (Windows, Linux, etc.) that's broken out into separate files (JSONEncoder, JSONDecoder) and simplified pretty significantly; if you want to stick to an "official" example, I'd consider this version
  • The Yams library YAML Encoder and Decoder are significantly easier to understand in smaller pieces, IMO, and follow the same general layout
  • The GRDB SQLite library has also largely solved the problem you're looking to solve, and also has an Encoder and Decoder which persist data into DB columns

These may be illuminating.

I also wish there were official documentation out there on how to write all of this, but in its absence, there's also the Flight School Guide to Swift Codable, which has an annotated implementation of a MessagePack encoder, which can also be very instructive.

If you run into other issues, though, happy to help further!
