Rearchitecting JSONEncoder to be much faster

Yeah strtod is still much slower (5x+ in many cases IIRC) than it needs to be. Rapidjson's is the best one I've found: http://rapidjson.org/strtod_8h_source.html

RapidJSON is an incredibly good library, and there is a lot to learn from it. My own implementation started off as a research project, and only after completing it did I look at RapidJSON and SAJSON for inspiration. I seem to have taken the same path in my library that RapidJSON did, which is working really well for me and allows for a very powerful set of optimizations, primarily in mutations. I didn't think my design through as thoroughly as RapidJSON did, though, so I'd tell anyone trying to pursue the same path to look in their direction.

I just want to announce that I have taken some time to implement a JSONEncoder and JSONDecoder in pure Swift. That means I used neither third-party libraries nor Foundation. Furthermore, I didn't use any unsafe constructs at all. Encoding and decoding are about 1.5-2x faster on macOS and 8-10x faster on Linux.

Not using Foundation comes with some drawbacks, though. For example, I can't provide custom encoders for Data and Date the way the Foundation implementation does.

I'm sure there are still some performance bottlenecks in the code (especially encoding I guess) and I'd be grateful to get some feedback.

Of course this implementation doesn't reach the speed of any SIMD implementation. But SIMD types are now part of the Swift standard library, so this might be worth looking into. Sadly, I have no idea how to implement the approach presented by Daniel Lemire with the SIMD operations Swift provides. If you can point me in the right direction, please reach out.
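As a starting point for anyone exploring this, here is a minimal sketch of what a vectorized byte scan can look like with the standard library's SIMD types. It is not the simdjson algorithm and not code from any library mentioned in this thread; it only shows one building block (comparing 16 bytes at a time against the quote character). The function name quoteIndices is hypothetical.

```swift
// Hypothetical helper: returns the index of every '"' byte in `bytes`.
func quoteIndices(in bytes: [UInt8]) -> [Int] {
    let quoteByte = UInt8(ascii: "\"")
    let quotes = SIMD16<UInt8>(repeating: quoteByte)
    var result: [Int] = []
    var offset = 0

    // Full 16-byte chunks: one vector comparison per chunk.
    while offset + 16 <= bytes.count {
        let chunk = SIMD16<UInt8>(bytes[offset..<offset + 16])
        let mask = chunk .== quotes  // lane-wise equality, yields a SIMDMask
        for lane in 0..<16 where mask[lane] {
            result.append(offset + lane)
        }
        offset += 16
    }

    // Scalar fallback for the remaining tail of fewer than 16 bytes.
    while offset < bytes.count {
        if bytes[offset] == quoteByte { result.append(offset) }
        offset += 1
    }
    return result
}
```

A real parser would turn the mask into a bitmask and process it without the per-lane loop; the loop here just keeps the sketch short.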


Sounds promising. It might be worth creating a new thread under Related Projects to discuss and get feedback on your package.

@frazer-rbsn If you have any feedback, would you be so kind as to open an issue on GitHub? I don't know if it is reasonable to split the discussion between GitHub issues and the Swift forums.

Have you considered using canImport or a compile-time define for Foundation support?

Otherwise, you can create a protocol (PureJSONCodable?) and have a separate Foundation-supporting module implement support for it on Date/Data.
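To make the two suggestions concrete, here is a rough sketch under stated assumptions. The protocol name PureJSONCodable comes from the suggestion above, but its single requirement, pureJSONString(), is invented for illustration and is not the package's actual API. In a real package the Foundation part would more likely live in a separate module than behind a canImport check.

```swift
// Hypothetical protocol; the requirement is an assumption for illustration.
protocol PureJSONCodable {
    // Returns the JSON text for this value.
    func pureJSONString() -> String
}

// Conformances that need nothing beyond the standard library.
extension Int: PureJSONCodable {
    func pureJSONString() -> String { String(self) }
}

extension Bool: PureJSONCodable {
    func pureJSONString() -> String { self ? "true" : "false" }
}

// Foundation-specific conformances are only compiled when Foundation exists.
#if canImport(Foundation)
import Foundation

extension Date: PureJSONCodable {
    // Encodes the date as seconds since 1970, one of several possible choices.
    func pureJSONString() -> String { String(timeIntervalSince1970) }
}
#endif
```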

Yes! The problem here is that Foundation is always available (probably there are some exceptions, but you get my point) and for this reason will always be imported.

There might be situations though in which one doesn't want to link against Foundation. Ask the SwiftNIO folks.

What do you think about the propertyWrapper based approach further down in the README?

Actually, could you send a summary or reference if you have one? (I didn't see anything immediately from my search terms.)

Previously I thought it was just a poor target to rely on because of inconsistent behavior and missing functionality, not that linking was itself a bad idea.

I built a CBOR package with Codable support (Swift-Cyborg), and now wonder if usage from server apps would be aided by some refactoring of Foundation out.

Just search for "import Foundation" in the SwiftNIO repository. You'll only find it in one module, SwiftNIOFoundationCompat, and in tests.

https://github.com/apple/swift-nio/search?q=import+foundation&unscoped_q=import+foundation

I don't think most developers care. I was unhappy with the JSON implementation on Linux and questioned whether it would be possible to build something without Foundation (it is possible, though some convenience might be missing). Here we are. Use the tool that gets the job done.

What's the performance on iOS?

I haven't tested this yet...

@michaeleisel I just wanted to let you know that after reading this thread we switched to your Zippy library and got a speedup of around 5x in decoding. I would love to try any encoding implementation that is faster than JSONEncoder and supports Codable. :slight_smile:

Typically it should provide a speedup factor of 3+, i.e. it should run in one third the time or less. Have you double-checked that it's being compiled in release mode? If so, feel free to file a GitHub issue. As for encoding, what's your use case?


It went from ~0.99s (JSONDecoder) to ~0.56s (Zippy) on two purposely slow files, so I think it seems reasonable, especially taking into account the nature of these files. I used the Pod library and took these measurements with two debug builds.

The use case is serialization/deserialization of users' project files for our app. The biggest chunk of the information is lists of Codable structures which contain mostly floats. Serialization (which happens multiple times thanks to auto-saving :') ) can take up to 1s, so it is a bit annoying.

We do not really need JSON, and I feel like a binary format would definitely be at least one order of magnitude faster. But the cost of implementing and maintaining a custom serialization specific to our data is too much for now, especially because we do not have a fixed schema, and simply adding one optional to our structures is just an awesome way to add new features fast while reducing the risk of introducing bugs.

So any replacement for JSONEncoder / JSONDecoder that gives better performance is welcome (especially the encoder :D) :)

Edit: I switched to a release build to do some measurements, and JSONEncoder takes 0.934s on a test file whereas your Zippy takes a magnificent 0.187s, so I think everything is fine ;)


A strategy I see for encoding performance is to only save when the user backgrounds the app - only really critical stuff like the user session token being possibly saved during other app functions. And that stuff (security permitting) could just be in the user defaults.

Thanks for the idea. Most interactions are stateful and change the state of what the user is editing (think of Word or Google Keep, for example). If the app crashes or the user kills it, I do not want to risk losing more than the last change they made to their project.

I moved the serialization onto a background thread, so for now it is fine and the user won't notice any latency, but I'll probably get worried again in 6 months when the amount of information increases. :smile:

Speaking of encoding performance: does it matter if one uses a dictionary for a container vs a struct? For example:

let dict1: [String: Any] = [
  "key1": 23,
  "key2": "hello",
  "key3": true
]

struct MyData: Encodable {
  var key1: Int
  var key2: String
  var key3: Bool
}

Are there performance differences when encoding dict1 vs an instance of the MyData struct?
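One thing worth noting before comparing speed: the two values do not go through the same API. A [String: Any] dictionary does not conform to Encodable, so it cannot be passed to JSONEncoder and has to go through JSONSerialization instead. The sketch below shows both paths side by side; the helper names encodeDictionary and encodeStruct are made up for illustration, and it makes no claim about which path is faster.

```swift
import Foundation

struct MyData: Encodable {
    var key1: Int
    var key2: String
    var key3: Bool
}

// Untyped dictionary: [String: Any] is not Encodable, so this path uses
// JSONSerialization rather than JSONEncoder.
func encodeDictionary(_ dictionary: [String: Any]) throws -> Data {
    try JSONSerialization.data(withJSONObject: dictionary)
}

// Typed struct: this path uses the compiler-synthesized Encodable conformance.
func encodeStruct(_ value: MyData) throws -> Data {
    try JSONEncoder().encode(value)
}

let dict1: [String: Any] = ["key1": 23, "key2": "hello", "key3": true]
let myData = MyData(key1: 23, key2: "hello", key3: true)
```

Timing the two functions in a release build on representative data would be the way to answer the question for a specific workload.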

I'm not sure about the encoding performance of Apple's JSONEncoder beyond a high level. It really depends on the internal implementation. You also want to think about the bridging cost if you're getting an NSDictionary.

Hey all,

I definitely haven't done any benchmarking or profiling of ZippyJSON, but I wonder if there's some potential for using a combination of structural generic programming (Automatic Requirement Satisfaction in plain Swift) and simdjson / ZippyJSON that might provide great performance with great ease-of-use.

All the best,
-Brennan

That makes sense. What sort of optimizations is the JSON decoder shown in the benchmarks doing to give it its speed boost? Isn't Swift auto-generating reasonable code for Decodable? Or does this new proposal have new capabilities as well?
