Serialization in Swift

Is the core team interested in reviewing those sorts of proposals that build on top of existing Codable infrastructure? I've had a proposal PR (with a compiler implementation) up since March 2020 and never heard from the core team about it.

3 Likes

The great thing about Codable for simple types is that no code is required. But the slightest need for something not simple requires a lot of boilerplate code. Being able to accomplish less simple things with no code should be a goal. Here are a few pain points for me:

Dates! Decoding Date from a number of seconds by default isn't sensible for JSON. The lack of out-of-the-box support for fractional seconds has been a problem for me. It is possible to have an internet date parser that accepts all forms of international dates. I use SwiftDate for this but it requires custom code.
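For reference, here is a minimal sketch of the kind of custom code this takes with Foundation alone, assuming the server sends ISO 8601 strings with or without fractional seconds. The Event type and the parseInternetDate helper are hypothetical names made up for the example.

```swift
import Foundation

// Hypothetical model, for illustration only.
struct Event: Decodable {
    let name: String
    let date: Date
}

// Hypothetical helper: try ISO 8601 with fractional seconds first, then without.
func parseInternetDate(_ text: String) -> Date? {
    let formatter = ISO8601DateFormatter()
    formatter.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
    if let date = formatter.date(from: text) { return date }
    formatter.formatOptions = [.withInternetDateTime]
    return formatter.date(from: text)
}

let jsonDecoder = JSONDecoder()
jsonDecoder.dateDecodingStrategy = .custom { decoder in
    let container = try decoder.singleValueContainer()
    let text = try container.decode(String.self)
    guard let date = parseInternetDate(text) else {
        throw DecodingError.dataCorruptedError(
            in: container, debugDescription: "Unrecognized date: \(text)")
    }
    return date
}

let eventJSON = #"[{"name":"a","date":"2021-03-21T10:39:59.446Z"},{"name":"b","date":"2021-03-21T10:39:59Z"}]"#
let events = try jsonDecoder.decode([Event].self, from: Data(eventJSON.utf8))
```

Creating a formatter on every call is wasteful; a real implementation would probably cache it, which is left out here to keep the sketch short.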

Default values. I guess there are property wrappers for this but I haven't investigated them. It should be possible to supply default values directly to the decode() methods. It should also be possible to provide the default values in a more declarative way, so no init method.
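A rough sketch of what a default-taking decode() could look like if written as a user-side extension today. The decode(_:forKey:default:) overload is hypothetical and not part of the standard library, and Settings is a made-up model. Note that this version still needs a hand-written init, which is exactly the boilerplate a declarative solution would remove.

```swift
import Foundation

extension KeyedDecodingContainer {
    // Hypothetical helper: falls back to `defaultValue` when the key is
    // missing or its value is null.
    func decode<T: Decodable>(
        _ type: T.Type,
        forKey key: Key,
        default defaultValue: @autoclosure () -> T
    ) throws -> T {
        try decodeIfPresent(type, forKey: key) ?? defaultValue()
    }
}

// Hypothetical model, for illustration only.
struct Settings: Decodable {
    let title: String
    let pageSize: Int

    private enum CodingKeys: String, CodingKey { case title, pageSize }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        pageSize = try container.decode(Int.self, forKey: .pageSize, default: 20)
    }
}

// "pageSize" is absent from the payload, so the default is used.
let settings = try JSONDecoder().decode(
    Settings.self, from: Data(#"{"title":"Inbox"}"#.utf8))
```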

Future proofing enums. I commonly have simple enums, either String or Int based, where the server API adds new cases from time to time. Any given version of my app only knows about the current cases. The only way I deal with this is try/catch with decoding in the try block and assigning a default value in the catch block. If I forget to add this then parsing fails if there's a new case.
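One way to avoid the try/catch at every call site is to put the fallback into the enum itself. This is only a sketch: OrderStatus and its cases are invented for the example, and it assumes that mapping every unrecognized raw value to a single unknown case is acceptable.

```swift
import Foundation

// Hypothetical String-backed enum with an explicit fallback case.
enum OrderStatus: String, Decodable {
    case pending
    case shipped
    case unknown

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(String.self)
        // Raw values this app version does not know about become .unknown
        // instead of failing the whole decode.
        self = OrderStatus(rawValue: raw) ?? .unknown
    }
}

// "refunded" stands in for a case the server added after this app shipped.
let statuses = try JSONDecoder().decode(
    [OrderStatus].self, from: Data(#"["pending","refunded"]"#.utf8))
```

The init still has to be written once per enum, so forgetting it remains possible.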

5 Likes

The syntax that property wrappers provide is really nice. One minor issue is that with property wrappers around the underlying data types, it's difficult to do more abstraction over a model type (or I have to put the property wrapper into common model protocols). Thanks for the direction!

One question about the existing encoding/decoding architecture (and the proposal you mentioned): why do we need the CodingKeys stuff at all (putting implementation limitations aside)? I think it's quite natural that when I define a property like let date: Date? inside a model type, I just want to extract the date field from a JSON object; and if I define the property as (ignoring how we implement it) @field("outer.inner.date") let date: Date?, I want to descend through two nested levels of the JSON object to find the date field.

With all the encoding/decoding logic defined alongside a single property itself, we could have several benefits:

  1. It's easier for other engineers to read/understand the model type definition. Eyes won't need to be targeted at different areas of a source file interchangeably.
  2. It's also easier to add/remove/modify the fields, since we don't need to maintain the extra CodingKeys or NestedKeyDecodingStrategy stuff.
  3. If we want the same property in another model type, we just need to copy the single definition of the property into the new model type. It's that simple!
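For comparison, this is roughly what reaching outer.inner.date takes with the current API. It is a sketch: Model and the three key enums are made-up names, and the timestamp is assumed to be seconds since 1970.

```swift
import Foundation

// Hypothetical model with one property that lives two levels deep in the JSON.
struct Model: Decodable {
    let date: Date?

    // One key enum per nesting level, all defined away from the property.
    private enum RootKeys: String, CodingKey { case outer }
    private enum OuterKeys: String, CodingKey { case inner }
    private enum InnerKeys: String, CodingKey { case date }

    init(from decoder: Decoder) throws {
        let root = try decoder.container(keyedBy: RootKeys.self)
        let outer = try root.nestedContainer(keyedBy: OuterKeys.self, forKey: .outer)
        let inner = try outer.nestedContainer(keyedBy: InnerKeys.self, forKey: .inner)
        date = try inner.decodeIfPresent(Date.self, forKey: .date)
    }
}

let nestedDecoder = JSONDecoder()
nestedDecoder.dateDecodingStrategy = .secondsSince1970
let model = try nestedDecoder.decode(
    Model.self, from: Data(#"{"outer":{"inner":{"date":1616323199}}}"#.utf8))
```

Three enums and a hand-written init for a single property is the kind of scattering that the per-property annotation would replace.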

To some extent I agree with @ktoso . Maybe if there were a powerful meta programming interface (like macros in C or the gyb files in the Swift repo, though I would prefer a native macro system in Swift), I could create a JSON encoding/decoding library myself (with the features I mentioned earlier). Because the scenarios and use cases differ a lot (in this thread some people mentioned performance, while I focused more on keeping the logic for a single property in one place), a general macro system may be a good direction.

3 Likes

:+1::+1:

+1 to the idea of a general meta programming system. I would love to be able to walk an AST-like structure and synthesize custom type initializers, essentially lifting the work the compiler does for Codable into libraries. Combined with compile-time asserts, this would allow libraries to design high-performance serializers and deserializers with compile-time error checking.

For example, it could allow swift-argument-parser to turn its runtime checks for duplicate commands into compile-time errors.

5 Likes

+1 from me as well assuming there is good tooling support. I have found it invaluable in working with Sourcery to be able to see the code that gets generated. Beyond this, we commit generated code to our repo which has the benefit of providing a free regression test when working on the template. I don’t know exactly how to best provide those benefits with a built-in language feature but it’s something worth considering.

5 Likes

From the perspective of making Codable more flexible, I'd love a way to work with data that contains typed objects. For example, given this:

JSON
[
    {
        "name": "Doug",
        "type": "person",
        "date": 1616323199446
    },
    {
        "url": "https://swift.org/assets/images/swift.svg",
        "type": "resource",
        "date": 1616323199000
    }
]

I'd like to be able to define something like this:

Swift Types
protocol Response {
    var date: Date { get }
}

struct Person: Response, Codable {
    let name: String
    let date: Date
}

struct Resource: Response, Codable {
    let url: URL
    let date: Date
}

And somehow decode it as an array of Response.
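For context, a sketch of the usual workaround today: a wrapper that reads the type field and dispatches by hand. AnyResponse is a hypothetical name, the protocol and structs are repeated from above so the example is self-contained, and the date values are assumed to be milliseconds since 1970.

```swift
import Foundation

protocol Response { var date: Date { get } }
struct Person: Response, Codable { let name: String; let date: Date }
struct Resource: Response, Codable { let url: URL; let date: Date }

// Hypothetical wrapper that picks the concrete type from the "type" field.
struct AnyResponse: Decodable {
    let value: Response

    private enum CodingKeys: String, CodingKey { case type }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let kind = try container.decode(String.self, forKey: .type)
        switch kind {
        case "person": value = try Person(from: decoder)
        case "resource": value = try Resource(from: decoder)
        default:
            throw DecodingError.dataCorruptedError(
                forKey: .type, in: container,
                debugDescription: "Unknown type: \(kind)")
        }
    }
}

let payload = """
[
    {"name": "Doug", "type": "person", "date": 1616323199446},
    {"url": "https://swift.org/assets/images/swift.svg", "type": "resource", "date": 1616323199000}
]
"""
let typedDecoder = JSONDecoder()
typedDecoder.dateDecodingStrategy = .millisecondsSince1970
let responses: [Response] = try typedDecoder
    .decode([AnyResponse].self, from: Data(payload.utf8))
    .map { $0.value }
```

Every new conforming type means another case in the switch, which is the part that would ideally be handled declaratively.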

5 Likes

As already stated in the thread KeyPaths and Codable, accessing the coding key from a key path is the biggest need I have currently :)

I'll keep the feedback short: one of our needs is the ability to decode and encode a binary layout from and into custom Swift types. It would be a huge benefit if this were possible via Codable or some kind of extension of that feature, as otherwise it requires a lot of potentially error-prone boilerplate code.
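To illustrate the boilerplate involved, here is a small sketch of hand-written decoding for an invented 6-byte layout (a little-endian UInt32 followed by a little-endian UInt16). PacketHeader, ByteReader and BinaryError are hypothetical names and do not come from any library.

```swift
// Hypothetical wire format: UInt32 magic, then UInt16 length, both little-endian.
struct PacketHeader: Equatable {
    var magic: UInt32
    var length: UInt16
}

enum BinaryError: Error { case truncated }

// Hypothetical cursor over a byte buffer.
struct ByteReader {
    private let bytes: [UInt8]
    private var offset = 0

    init(_ bytes: [UInt8]) { self.bytes = bytes }

    // Reads one little-endian integer and advances the cursor.
    mutating func read<T: FixedWidthInteger>(_ type: T.Type) throws -> T {
        let size = MemoryLayout<T>.size
        guard offset + size <= bytes.count else { throw BinaryError.truncated }
        var value: T = 0
        for i in 0..<size {
            value |= T(truncatingIfNeeded: bytes[offset + i]) << (8 * i)
        }
        offset += size
        return value
    }
}

extension PacketHeader {
    // Field order and widths must be kept in sync with the layout by hand.
    init(bytes: [UInt8]) throws {
        var reader = ByteReader(bytes)
        magic = try reader.read(UInt32.self)
        length = try reader.read(UInt16.self)
    }
}

let header = try PacketHeader(bytes: [0x78, 0x56, 0x34, 0x12, 0x10, 0x00])
```

Encoding needs a mirror image of the same code, so each field ends up described three times: in the struct, in the reader and in the writer.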

1 Like

I don’t know enough about codable to comment on that specifically, but I do have a decent amount of experience with systems with high performance requirements for serialization / deserialization (automatic trading systems).

FWIW, on the off chance that not everyone has seen them, and without fully grokking the requirements for this effort, I just wanted to point out Cap'n Proto (https://capnproto.org/) and FlatBuffers (google/flatbuffers on GitHub). I realize these are nowhere near flexible or feature-complete enough, but if performance is a real concern, aiming for zero serialization/deserialization should at least be considered.

3 Likes

In my experience that is quite a common issue, and tooling / support for doing that more easily and efficiently would be great.

I just wanted to mention that this discussion is not about specific implementations of serialization formats, but about the general support for serialization in Swift and how we can improve it to make it easier for developers of serialization libraries to develop flexible and efficient implementations and also for users of those libraries to more efficiently use them.

3 Likes

Might as well drop some talks for inspiration.

A mechanism similar to Scala 3's inline macros / derivation could be a very scalable approach, though I'm not sure how well it would work in Swift. There was a fantastic walkthrough of the feature recently: Generic Derivation is the New Reflection by Alexander Ioffe

Most notably, with such a mechanism it is possible to get code that is just as efficient as hand-written code. It is also much nicer than what we currently do in the compiler, manually weaving ASTs and SIL together, which one cannot print easily and where there's no real type system to help get it right on that level -- things just fail later on in the pipeline.

It is a bit nasty to write those but not too bad, and definitely most powerful. For simple derivation one might want to have quasiquote mechanisms, perhaps we could have both...?

4 Likes

Ok, understood - assuming this was partly directed at my post, as I referenced two specific implementations, I just wanted to clarify my point: a good general support framework for serialisation (and something that allows for efficient implementation of such a serialisation library) should consider that being able to build zero-ser/des implementations would be a good thing. I wanted to provide those pointers as food-for-thought input, since the opening question was fairly open: "The core team would like to initiate this conversation with the community to gather requirements and discuss future designs and their trade-offs".

But if that isn't interesting or is out of scope I'll drop it, and apologies for the noise in that case - it is just that in my experience the pendulum of serialisation solutions often swings far towards flexibility and expressiveness, and the result becomes unusable for most high performance needs because performance was never seriously considered from the beginning.

5 Likes

The concept of generic derivation is absolutely genius! Based on this walkthrough, it requires a few missing links we don't have in Swift: meta-programming capabilities, compile time evaluated macros, tuple composition/decomposition operators, etc. But the result is as efficient as manually written code, and it's flexible enough to achieve any type of serialization needs.
Furthermore, these concepts can all be reused for many other use-cases where we previously required dedicated compiler features (e.g. automatic protocol conformances). I would love to take some of those responsibilities out from the compiler, for the benefit of both the compiler and language pro-users.

4 Likes

Would it be possible to go from

init(from decoder: Decoder)

and

encode(to encoder: Encoder)

to

init<D : Decoder>(from decoder: D)

and

encode<E : Encoder>(to encoder: E)

?

2 Likes

My two cents:
I use Codable in a json context, and I have always loved that feature of Swift. :slight_smile:

For most of the issues that I am facing, I think there are evolutionary steps to make the current system better.

  1. Enum coding: with SE-0295 I will get to reduce a fair bit of boilerplate code. And if any subsequent proposals add more configurability, that will allow me to discard even more boilerplate.

  2. My own pet peeve: using non-String keys for Codable Dictionaries. This is fixable as well and I hope that the Core Team has bandwidth to put my proposal https://github.com/apple/swift-evolution/pull/1288 through review in the near future. :slight_smile:

  3. Key conversion between camel case and snake case is currently two separate conversions, which results in hard-to-explain requirements for the keys. I hope that the general direction of https://github.com/apple/swift/pull/14039 could become a reality some day. It appears that the PR above was first abandoned and later closed due to maintenance, but the issue is still there - and worth solving.
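To make points 2 and 3 concrete, here is a small sketch of both behaviours with JSONEncoder and JSONDecoder as I understand them. Color and Asset are made-up types for the example.

```swift
import Foundation

// Point 2: a dictionary keyed by a String-backed enum is not encoded as a JSON object.
enum Color: String, Codable { case red }

let plainEncoder = JSONEncoder()
let byString = try String(decoding: plainEncoder.encode(["red": 1]), as: UTF8.self)
let byEnum = try String(decoding: plainEncoder.encode([Color.red: 1]), as: UTF8.self)
// byString is {"red":1}
// byEnum is ["red",1], a flat array of alternating keys and values

// Point 3: the snake case conversions are not inverses of each other.
struct Asset: Codable { var imageURL: String }

let snakeEncoder = JSONEncoder()
snakeEncoder.keyEncodingStrategy = .convertToSnakeCase
let assetData = try snakeEncoder.encode(Asset(imageURL: "logo.svg"))
let assetJSON = String(decoding: assetData, as: UTF8.self)
// assetJSON uses the key "image_url"

let snakeDecoder = JSONDecoder()
snakeDecoder.keyDecodingStrategy = .convertFromSnakeCase
// "image_url" converts back to "imageUrl", which does not match "imageURL",
// so decoding the encoder's own output fails.
let roundTrip = try? snakeDecoder.decode(Asset.self, from: assetData)
```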

These are the main issues I have, and they all have solutions.
That said, both performance and meta programming features would of course be awesome to have. :slight_smile:

Specific case studies seem like valuable, if not crucial, input, for deciding directions to take to achieve that goal. How better to "gather requirements and discuss future designs and their trade-offs" (from the opening post) than to look at the things people are actually having trouble doing with the current system?

5 Likes

I’m kinda curious if the core team is thinking more towards something similar to the existing Codable protocols or if they’re looking for a more general code synthesis solution?

While we are studying use cases, I don't think anyone has highlighted swift-argument-parser (apple/swift-argument-parser on GitHub). Argument Parser currently uses Decodable to initialize a parseable command, but the current state of Codable forces this to be pretty hand-wavy and requires quite a few strategic fatalErrors. I think this sort of use case should sit front-and-center in a debate over the future of Codable (and a static structural reflection approach would address this use case nicely).

3 Likes