High performance serialization format seriously lacking

Hi, I've recently been involved in a project that requires fast serialization (to & from local storage), and I found myself stuck between a few walls trying to do it in Swift:

  • Use a DB, mapping each property of my data to a column (with or without an ORM like Core Data). This is nice, but becomes cumbersome very quickly once the number of properties or the complexity of the models grows too large.

  • Serialize my structs & classes to a "data" blob and persist it (either in a BLOB DB column, or directly in a file).

The second option led me to try various serialization formats, from JSON to binary plist, to FlatBuffers, to protobufs, and they seem to fall into two categories:
1- The format maps conveniently to Swift types & protocols (Codable) and is easy to use, but is way too slow, by at least 2 orders of magnitude (taking seconds instead of tens of milliseconds on other platforms), like JSON & property lists. OR
2- The format is fast, but requires a lot of plumbing code to map to the wide range of Swift types (eg: enums), with each mapping layer adding the risk of a mistake or a performance penalty (FlatBuffers & protobufs).
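To make category 1 concrete, here is a minimal sketch of the "convenient" path: the same Codable type round-tripped through JSON and through a binary property list. The `Track` model and its fields are hypothetical, made up for illustration; only the Foundation encoders are real API.

```swift
import Foundation

// Hypothetical model, not taken from any real project.
struct Track: Codable, Equatable {
    enum Kind: String, Codable { case song, podcast }
    var id: Int
    var title: String
    var kind: Kind
}

let tracks = (0..<1_000).map {
    Track(id: $0, title: "Track \($0)", kind: $0 % 2 == 0 ? .song : .podcast)
}

// JSON round trip: no mapping code beyond the Codable conformance.
let jsonData = try JSONEncoder().encode(tracks)
let fromJSON = try JSONDecoder().decode([Track].self, from: jsonData)

// Binary property list round trip: same type, different encoder.
let plistEncoder = PropertyListEncoder()
plistEncoder.outputFormat = .binary
let plistData = try plistEncoder.encode(tracks)
let fromPlist = try PropertyListDecoder().decode([Track].self, from: plistData)
```

Both paths need no per-type code, which is the convenience being traded against speed.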

Is there any plan to add a high performance serialization tool to the stdlib that would map decently to the rest of the library?


I hoped there was a plan to maybe make JSONEncoder / JSONDecoder faster? I've read around the net, and it seems like Swift's implementation is notoriously slow (it is too slow in my case, but I haven't compared it with other platforms personally).
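For anyone who wants numbers for their own data before comparing platforms, a rough timing harness could look like the sketch below. `Item` and `measure` are hypothetical names, the payload size is arbitrary, and wall-clock timing with `Date` is coarse, so treat the output as an order-of-magnitude check only (and build in release mode).

```swift
import Foundation

// Hypothetical payload type; replace with your own model.
struct Item: Codable {
    var id: Int
    var name: String
    var score: Double
}

// Hypothetical helper: runs `body`, prints the elapsed wall-clock time, returns the result.
func measure<T>(_ label: String, _ body: () throws -> T) rethrows -> T {
    let start = Date()
    let result = try body()
    let elapsedMs = Date().timeIntervalSince(start) * 1_000
    print("\(label): \(elapsedMs) ms")
    return result
}

let items = (0..<10_000).map { Item(id: $0, name: "item-\($0)", score: Double($0) / 2) }

let encoded = try measure("JSON encode") { try JSONEncoder().encode(items) }
let decoded = try measure("JSON decode") { try JSONDecoder().decode([Item].self, from: encoded) }
```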

Oh, my bad. Thanks a lot for the pointers, I completely missed the fact that it was coming from Foundation.

but requires a lot of plumbing code to map to the wide range of swift types (eg: enums), with each mapping layer

Could you expand on what plumbing code you mean? FlatBuffers would give very cheap serialization and deserialization, and has supported Swift for a year or so.

That's the one I'm actually using for now, but only for simple structs. Once you start going into enums with associated values, generic enums, etc., it obviously starts to fall a bit short (and I'm not blaming them).
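To illustrate the kind of plumbing I mean, here is a hand-written mapping of an enum with associated values to a flat byte layout (one tag byte, then the payload). This is not FlatBuffers code; `Shape`, `encode` and `decode` are hypothetical names, and the layout is an assumption chosen only to show that every case needs its own mapping on both sides.

```swift
import Foundation

// Hypothetical enum with associated values.
enum Shape: Equatable {
    case circle(radius: Double)
    case rect(width: Double, height: Double)
}

// Layout (assumed for this sketch): 1 tag byte, then each Double as 8 little-endian bytes.
func encode(_ shape: Shape) -> [UInt8] {
    func bytes(_ value: Double) -> [UInt8] {
        withUnsafeBytes(of: value.bitPattern.littleEndian) { Array($0) }
    }
    switch shape {
    case .circle(let radius):
        return [0] + bytes(radius)
    case .rect(let width, let height):
        return [1] + bytes(width) + bytes(height)
    }
}

func decode(_ buffer: [UInt8]) -> Shape? {
    // Reads 8 little-endian bytes starting at `offset`, or nil if out of bounds.
    func double(at offset: Int) -> Double? {
        guard offset >= 0, offset + 8 <= buffer.count else { return nil }
        var bits: UInt64 = 0
        for i in 0..<8 {
            bits |= UInt64(buffer[offset + i]) << UInt64(8 * i)
        }
        return Double(bitPattern: bits)
    }
    guard let tag = buffer.first else { return nil }
    switch tag {
    case 0:
        guard let radius = double(at: 1) else { return nil }
        return .circle(radius: radius)
    case 1:
        guard let width = double(at: 1), let height = double(at: 9) else { return nil }
        return .rect(width: width, height: height)
    default:
        return nil // unknown tag
    }
}
```

Adding a case means touching the tag table, the encoder and the decoder, and nothing checks that the three stay in sync.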

They're officially designed to be a serialization format, and they haven't (yet) provided an object-oriented (or protocol-oriented) API, even if the Swift project is doing its best to match Swift's type system.

Eg: if you design your types in an fbs file first, then use them, it's fine. But trying to map an arbitrarily complex Swift structure to a FlatBuffers buffer is clearly more involved than just declaring your struct Codable and doing let data = FlatBufferEncoder().encode(codable)
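For comparison, this is roughly what "just declare it Codable" looks like today for a nested type containing an enum with associated values. It uses JSONEncoder, since a Codable FlatBuffers encoder is the thing I'm wishing for rather than something I know to exist. `Payload` and `Message` are hypothetical, and the synthesized conformance for enums with associated values assumes Swift 5.5 or later.

```swift
import Foundation

// Hypothetical types. Codable synthesis for enums with associated values
// assumes Swift 5.5 or later.
enum Payload: Codable, Equatable {
    case text(String)
    case point(x: Double, y: Double)
}

struct Message: Codable, Equatable {
    var id: Int
    var tags: [String]
    var payload: Payload
    var replies: [Message] // recursive structure, still no manual mapping
}

let message = Message(
    id: 1,
    tags: ["a", "b"],
    payload: .point(x: 1.5, y: 2),
    replies: [Message(id: 2, tags: [], payload: .text("hi"), replies: [])]
)

// One line each way, whatever the shape of the type.
let messageData = try JSONEncoder().encode(message)
let roundTripped = try JSONDecoder().decode(Message.self, from: messageData)
```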


I see. I usually would view arbitrary object graph serialization != high performance as mentioned above... but then, everyone has a different frame of reference on performance matters.