Future of Codable and JSON-Coders in Swift 6: Hoping for a rework

Codable is a very useful compiler-provided "macro" which has a massive usage rate in the Swift community. While it makes our lives much easier for en/decoding data in formats such as JSON and XML, it does have a decent amount of problems.

In this post I'm trying to list a number of the problems, both for the record and to start a discussion about the future of Codable.

I do realize that JSONDecoder/JSONEncoder are not a direct part of the Codable protocol and one could just use their own Decoder/Encoder implementation, but realistically the majority of the community still uses those two in combination with Codable, so hopefully readers don't mind me mixing the two.

That being said, let's get to the points:

  • Performance: This is an aspect where JSONDecoder has made a ton of progress, both in Swift 5.4 in corelibs-foundation, and in Swift 5.9 in swift-foundation. This is very nice to see. I don't recall much news about JSONEncoder performance improvements, so I hope to see some improvements there as well, if possible.

  • JSON-Coder Type-Level Settings: JSON-Coder types accept pretty much all settings on a per-coder basis.
    For example, if you want to use snake_case keys for properties of a type, you need to configure both JSONDecoder and JSONEncoder to convert the type's coding keys from/to snake_case. Not only does this not make sense (I'll explain why), it can also cause multiple problems:

    • First problem is that you might have a model that is expected to use snake_case keys, but also have another model that is expected to use its coding-keys as-is. Using the convertTo/FromSnakeCase setting of the Coders does not allow this and both models will either be treated with snake_case setting, or both with the use-default-coding-keys setting.
    • Second problem is that 2 different names can have the same snake_case representation, which can result in JSONDecoder not being able to decode what JSONEncoder has encoded. As an example, 'myVAR' and 'myVar' both turn into 'my_var' after a snake_case conversion. JSONDecoder will only look for 'myVar' when decoding and if the original name was 'myVAR', it'll fail to decode a key that a JSONEncoder has encoded.
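
    To make the second problem concrete, here is a minimal sketch of the round trip failing. `Model` is a hypothetical type invented for illustration; the key strategies are the standard JSONEncoder/JSONDecoder ones.

    ```swift
    import Foundation

    // Hypothetical model: the property name ends in an uppercase run.
    struct Model: Codable, Equatable {
        var myVAR: Int
    }

    let encoder = JSONEncoder()
    encoder.keyEncodingStrategy = .convertToSnakeCase
    // The encoded JSON uses the key "my_var".
    let data = try encoder.encode(Model(myVAR: 1))

    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase

    // "my_var" is converted back to "myVar", which does not match `myVAR`,
    // so decoding is expected to fail with DecodingError.keyNotFound.
    var roundTripFailed = false
    do {
        _ = try decoder.decode(Model.self, from: data)
    } catch DecodingError.keyNotFound {
        roundTripFailed = true
    } catch {
        print("Unexpected error: \(error)")
    }
    ```

    The encoder and decoder here are configured symmetrically, which is exactly why the failure is surprising.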

    Why don't Coder-level settings make sense?
    To be able to provide such case conversions, we have 2 choices:

    • Coder-level settings (current way)
    • Model-level settings

    Coder-level settings not only come with the problems mentioned above, but also favor being able to decode the same model with different settings, like the case-coding settings mentioned above, while Model-level settings favor being able to use a model with any Coder and still en/decode the values as expected.

    In my personal experience, I've never been in a situation where I wanted to en/decode a model with different settings in different places. For example, I've never needed to decode a model assuming snake_case keys, then use the same model to decode another value but this time with normal model-defined keys.

    At the same time it does happen from time to time that I have to decode a single model, with different JSONDecoders. A JSONDecoder is sometimes related to another part of the code which does not require special settings such as case-conversion settings, so if you try to decode a model with it which needs specific JSONDecoder settings such as the case-conversion setting, the decoding process would fail.

    This is to say, Coder-level settings do not come with any practical advantage for users over Model-level settings, and a future Codable version should prefer implementing Model-level settings instead, IMO.
    This can be implemented with macro attributes, like some third-party libraries are already doing.

  • Dictionary Coding: It is a known issue that if you want a Swift Dictionary to be en/decoded as a dictionary in the JSON, instead of as an array, the Swift Dictionary's Key type must be exactly String or Int (or, since Swift 5.6, conform to CodingKeyRepresentable). A key type that merely wraps a String, such as a String-backed enum, is not enough on its own. This is due to the underlying implementation of Dictionary's Codable conformance, and is a source of different issues, confusions, and inconveniences. I expect there to be a mechanism to encode literal JSON dictionaries more easily in the next Codable version. Perhaps with another macro attribute.
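
    A small sketch of the behavior described above. `Flag` is a hypothetical String-backed enum used only for illustration.

    ```swift
    import Foundation

    // Hypothetical key type: a String-backed enum, which is not `String` itself.
    enum Flag: String, Codable {
        case beta
    }

    let encoder = JSONEncoder()

    // String keys produce a JSON object.
    let stringKeyed = try encoder.encode(["beta": true])
    let stringKeyedJSON = String(decoding: stringKeyed, as: UTF8.self)
    print(stringKeyedJSON)  // {"beta":true}

    // Enum keys fall back to a flat array of alternating keys and values.
    let enumKeyed = try encoder.encode([Flag.beta: true])
    let enumKeyedJSON = String(decoding: enumKeyed, as: UTF8.self)
    print(enumKeyedJSON)    // ["beta",true]
    ```

    Both dictionaries look equivalent at the call site, yet the JSON shape differs, which is where the confusion usually starts.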

  • Default Values: Another pretty common problem when using Codable is that you can't assign default values to properties for when the value is missing from the container without a ton of hassle, restricted hacks such as property wrappers, or meta-programming. I hope to be able to easily assign default values to fields in a future Codable version.
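
    As a sketch of the "ton of hassle" route, this is what a single defaulted property costs today with a hand-written initializer. `Settings` and its default of 3 are hypothetical.

    ```swift
    import Foundation

    // Hypothetical model where `retries` should default to 3 when the key is absent.
    struct Settings: Decodable {
        var name: String
        var retries: Int

        enum CodingKeys: String, CodingKey {
            case name, retries
        }

        // Writing init(from:) by hand replaces the synthesized one entirely,
        // so every property has to be decoded manually, not just the defaulted one.
        init(from decoder: Decoder) throws {
            let container = try decoder.container(keyedBy: CodingKeys.self)
            name = try container.decode(String.self, forKey: .name)
            retries = try container.decodeIfPresent(Int.self, forKey: .retries) ?? 3
        }
    }

    let settings = try JSONDecoder().decode(Settings.self, from: Data(#"{"name":"job"}"#.utf8))
    print(settings.retries)  // 3
    ```

    The boilerplate grows with every property on the type, even though only one of them needs special handling.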

  • Debugging Experience: Codable comes with a suboptimal debugging experience. A decent number of Codable errors are impossible to fix without taking a peek at the original JSON and/or digging into the codebase. One big problem is that they don't mention which Swift type is throwing the error and from where, which can be hard to find in big codebases with a lot of nested Codable types, some with similar property names. I also wouldn't mind if Codable errors contained the related part of the original JSON, or the whole of it, even if it costs performance. I don't believe in trying to optimize for performance when throwing errors, because that'll just backfire in the form of wasted debugging hours.

  • Sendability: JSON-Coder types cannot conform to Sendable because they are open classes. I personally haven't ever tried to subclass the JSON-Coders, nor have I seen any instances of it. I still give it a chance that the open attribute is of use to some people, in which case I'd propose moving the JSON-Coders to use protocols to achieve the same effect as inheritance for those in need. There can be protocols such as JSONDecoderProtocol which provide the actual implementation of the JSONDecoder: JSONDecoderProtocol type. Then users can conform their own type to JSONDecoderProtocol and tweak the default implementations if needed.

  • Codable Design: While I don't have too much experience in implementing Decoders or Encoders, every knowledgeable person I know who has implemented one or more Decoders/Encoders mentions how hard it is to get right, and complains about its design. That's as much as I know, so I'll leave it at that. I can ask them to describe the issues more specifically, if necessary.

So ... what are your opinions? Which ones do you agree with, and which don't you? Why? I'm curious to know the community's opinion, as well as hopefully some comments from the folks in charge of Codable.

31 Likes

I would just like to +1 all of this. Server-side Swift lives and dies by JSON. We feel all the rough edges of serialization working without abstractions like Swift Data, User Defaults, etc. Macros seem like the perfect antidote to striking a balance between usability and performance.

8 Likes

There are two opposite cases I've met in practice:

  1. Several backends provide the same model but with different property naming conventions. A mobile app can use the same model for all of them.
  2. Encoding a model for different purposes, e.g. for a backend and for persistent storage. Coder-level settings allow using different date decoding strategies for each. Model-level settings would force the models to be duplicated, which then causes naming problems and increases binary size.
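
A minimal sketch of the second case, assuming a backend that sends ISO 8601 strings and a local store that keeps seconds since 1970. `Event` and both formats are assumptions made for illustration.

```swift
import Foundation

// Hypothetical model shared by two data sources.
struct Event: Codable {
    var date: Date
}

// Assumption: the backend sends ISO 8601 strings...
let backendDecoder = JSONDecoder()
backendDecoder.dateDecodingStrategy = .iso8601

// ...while local storage keeps seconds since 1970.
let storageDecoder = JSONDecoder()
storageDecoder.dateDecodingStrategy = .secondsSince1970

let fromBackend = try backendDecoder.decode(
    Event.self, from: Data(#"{"date":"1970-01-01T00:01:00Z"}"#.utf8))
let fromStorage = try storageDecoder.decode(
    Event.self, from: Data(#"{"date":60}"#.utf8))

// Both payloads describe the same instant, decoded into the same model type.
print(fromBackend.date == fromStorage.date)  // true
```

With purely model-level settings, `Event` would need one variant per date format.
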
3 Likes

There are several other threads about Codable, like this one: Serialization in Swift
Most of these drawbacks, along with solutions, have already been discussed, but the final shape is not determined AFAIK.

Not only because it is an open class but also because it is a mutable class.
+1 here, the lack of Sendability is a problem.

+1 here

I would also add the lack of access to raw storage inside coding / decoding functions.

For now, the current Codable implementation is a good default, while making it better requires other language improvements/features and research. My own suggestion is that most of this can be addressed after the Swift 6 release.
So for now we have a reasonable default implementation, but specific needs can be solved:

  • performance / dictionary coding by third-party libraries
  • Sendability by immutable sendable wrapper
  • Default Values by different tricks
    ...
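
One possible shape for the "immutable sendable wrapper" mentioned in the list above. This is only a sketch: `SendableJSONDecoder` and its `DateStrategy` enum are hypothetical names, and only the date strategy is modeled.

```swift
import Foundation

// Hypothetical wrapper: stores only immutable, Sendable configuration and
// creates a fresh JSONDecoder for every call, so no mutable state is shared.
struct SendableJSONDecoder: Sendable {
    // A small Sendable mirror of the built-in strategies used here.
    enum DateStrategy: Sendable {
        case deferredToDate
        case secondsSince1970
        case iso8601
    }

    let dateStrategy: DateStrategy

    func decode<T: Decodable>(_ type: T.Type, from data: Data) throws -> T {
        let decoder = JSONDecoder()
        switch dateStrategy {
        case .deferredToDate: decoder.dateDecodingStrategy = .deferredToDate
        case .secondsSince1970: decoder.dateDecodingStrategy = .secondsSince1970
        case .iso8601: decoder.dateDecodingStrategy = .iso8601
        }
        return try decoder.decode(type, from: data)
    }
}

let shared = SendableJSONDecoder(dateStrategy: .secondsSince1970)
let dates = try shared.decode([Date].self, from: Data("[60]".utf8))
```

Creating a decoder per call trades a little performance for safety; a wrapper that needs the closure-based strategies would have to require `@Sendable` closures as well.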

In my projects I've implemented some error handling solutions that make it possible to know the reason for a coding failure: which type, which property and why, plus the failed object's JSON string, request URL, file, line and other information. This info is sent to a monitoring system, so all unexpected failures are caught and in most cases developers don't even need to open the IDE to understand why the mapping failed.
I hope most of this will come out of the box some day.
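
As a rough idea of the starting point for such a solution (not the implementation described above), the coding path can be pulled out of a DecodingError like this. `describe`, `User`, and `Profile` are hypothetical names.

```swift
import Foundation

// Hypothetical helper: turns a DecodingError into a one-line description that
// includes the coding path, e.g. for sending to a monitoring system.
func describe(_ error: DecodingError) -> String {
    func path(_ context: DecodingError.Context) -> String {
        context.codingPath.map { $0.stringValue }.joined(separator: ".")
    }
    switch error {
    case .keyNotFound(let key, let context):
        return "missing key '\(key.stringValue)' at '\(path(context))'"
    case .typeMismatch(let type, let context):
        return "expected \(type) at '\(path(context))'"
    case .valueNotFound(let type, let context):
        return "null where \(type) was expected at '\(path(context))'"
    case .dataCorrupted(let context):
        return "corrupted data at '\(path(context))': \(context.debugDescription)"
    @unknown default:
        return "unknown decoding error: \(error)"
    }
}

struct User: Decodable { var profile: Profile }
struct Profile: Decodable { var age: Int }

var report = ""
do {
    _ = try JSONDecoder().decode(User.self, from: Data(#"{"profile":{"age":"ten"}}"#.utf8))
} catch let error as DecodingError {
    report = describe(error)
} catch {
    report = "\(error)"
}
print(report)
```

Note that the error carries the key path but not the name of the enclosing Swift type (`Profile` here), so the type, request URL, and raw JSON still have to be attached by the caller.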

3 Likes

snake_case: I never use it even when it would (probably) work. I just use CodingKeys and don't have to worry about something not quite working.
debugging: Yes. Sometimes when I get firebase logs with parse errors I can't even tell for sure what JSON was being decoded and in what part of my code when the error occurred.
Date: I haven't had many problems with Date parsing lately, but fractional seconds are a killer, as mentioned before in Serialization in Swift - #23 by phoneyDev
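
The explicit CodingKeys approach from the snake_case note above can look like the following sketch. `Account` and its keys are hypothetical.

```swift
import Foundation

// Hypothetical model: the snake_case names live in CodingKeys, so any
// JSONEncoder/JSONDecoder with default settings handles it the same way.
struct Account: Codable, Equatable {
    var userID: Int
    var displayName: String

    enum CodingKeys: String, CodingKey {
        case userID = "user_id"
        case displayName = "display_name"
    }
}

let json = Data(#"{"user_id":7,"display_name":"Ann"}"#.utf8)
let account = try JSONDecoder().decode(Account.self, from: json)

// Round trip with untouched, default-configured coders.
let roundTripped = try JSONDecoder().decode(
    Account.self, from: JSONEncoder().encode(account))
```

Because the mapping is spelled out per property, names like `userID` cannot be mangled by an automatic case conversion.
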

4 Likes

Good to know there are users who actually take advantage of the Coder-level settings in a way that Model-level settings would have a hard time providing.

You mention the use-case is to keep conventions, which is good and nice, but is not necessary, especially knowing that these settings only affect the generated JSON/data, not the actual models. The models can keep adhering to the conventions even if this feature doesn't exist.

That's why I'm still inclined to think Model-level settings will be more appropriate considering it solves actual problems/bugs, as opposed to being more of a luxury for the data to look nice.

Of course in a perfect world we could have both Coder-level and Model-level settings but I think that's too much to ask and will complicate things with little gain.

Since Codable is used for user data saved to files, I want to caution proposals to be extremely careful about silent behavior changes. Developer inconvenience is one thing, but users losing data in ways you and they may not notice at first is way worse.

Note that it’s still possible to change defaults by removing them, or by renaming a feature and keeping the old name+behavior around as deprecated.

19 Likes

Is the solution here not to remove Codable from the compiler, make it a macro, and distribute it so people can customise?

2 Likes

Just with my personal "hat" on: it'd definitely be worth a shot to see how far such an implementation would get, and whether it would hit any missing macro features.

1 Like

Codable synthesis today makes use of both type information and largely-unstructured access to the AST for validation and diagnostics, in ways which would be difficult to replicate with today's macro system. As far as I'm aware, these range from plain annoying/difficult to impossible.

In no particular order, synthesis currently requires being able to:

  • Inspect all properties on a type, and differentiate between stored and computed properties
  • Differentiate between Optional and non-Optional properties (and for Optional properties, access the underlying Wrapped type)
  • Tell whether a given property itself conforms to Encodable/Decodable (and trigger synthesis recursively as needed)
  • Look up nested types inside of a type, regardless of location of definition (to find the enum CodingKeys, if defined)
  • Look up properties on a type based on a CodingKeys key name
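
To illustrate why those capabilities matter, here is a hand-written approximation of what synthesis produces. `Article` and `ManualArticle` are hypothetical types, and the manual version is a rough equivalent rather than the compiler's exact output.

```swift
import Foundation

// Hypothetical type that relies on synthesis.
struct Article: Codable {
    var title: String          // stored, non-Optional
    var subtitle: String?      // stored, Optional
    var slug: String {         // computed: ignored by synthesis
        title.lowercased()
    }
}

// Roughly the same conformance written by hand for an otherwise identical type.
struct ManualArticle: Codable {
    var title: String
    var subtitle: String?

    enum CodingKeys: String, CodingKey {
        case title, subtitle
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        // Optional properties use decodeIfPresent, so a missing key becomes nil.
        subtitle = try container.decodeIfPresent(String.self, forKey: .subtitle)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
        try container.encodeIfPresent(subtitle, forKey: .subtitle)
    }
}

let json = Data(#"{"title":"Hello"}"#.utf8)
let synthesized = try JSONDecoder().decode(Article.self, from: json)
let manual = try JSONDecoder().decode(ManualArticle.self, from: json)
```

Choosing `decode` versus `decodeIfPresent` and skipping `slug` both depend on type information, not just on the syntax of the declaration.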

To my knowledge, these aspects make synthesis a better fit for the "Semantic Macros" described in A possible vision for macros in Swift · GitHub than the syntactic macro system we have today.

While a macro system for Codable doesn't need to look anything like it does right now, I suspect that if we'd want to migrate struct MyType: Codable to @Codable struct MyType without any additional work by default on behalf of adopters, we'd need a lot of the same functionality (esp. for reasonable diagnostics and errors).


This is something that I think would be quite interesting to explore, though, especially as a driving force for the introduction of more semantic features to the macro system. (Though like @jrose notes above, we'd need to be very careful about silent or implicit behavior changes, which are a non-negotiable non-starter.)

14 Likes

This sounds like even more reason to try to pull Codable synthesis out into a macro, because it'll drive development of the macro system.

6 Likes