Currently, JSONEncoder can spend a lot of time doing the following things:
- Creating intermediate Swift/ObjC data structures to pass to JSONSerialization
- Creating a new Swift String for each key conversion
- Inside of JSONSerialization, using libc functions to convert numbers to strings
All of these things can be resolved with a different architecture. There may be other good ones, and if so I'd love to hear about them, but here's what I'll pitch:
Whenever an encode method is called, write the JSON out directly, rather than storing anything in dictionaries or delegating any work to JSONSerialization. The one tricky bit is that the encoder may be told to encode something that belongs later in the output, after other pieces of data it hasn't seen yet. For example:
    var container1 = encoder.unkeyedContainer()
    var container2 = encoder.unkeyedContainer()
    container2.encode(...)
    container1.encode(...)
If container2's data is written directly to the final Data that will be output, then we have a problem when container1 tries to write to it. However, this will only happen (correct me if I'm wrong) when the user has implemented their own encoding method and is getting a bit wonky with it. In that case, the encoder could keep a separate buffer per container and paste them together in the right order at the end. There are different ways to do this, and I can flesh out that part, but first I want to gauge interest in this overall proposal.
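To make the buffer-per-container idea concrete, here is a rough sketch. Everything in it is hypothetical: DirectWriter and ArrayBuffer are made-up names, not Foundation types, and it only handles a flat top-level array of Ints and Bools. It also assumes the output should follow container creation order, which is the case I'm worried about above; it is not a claim about what JSONEncoder does today.

```swift
/// One byte buffer per container. Hypothetical type, not Foundation API.
final class ArrayBuffer {
    private(set) var bytes: [UInt8] = []

    // Write a comma before every element except the first.
    private func writeSeparator() {
        if !bytes.isEmpty { bytes.append(UInt8(ascii: ",")) }
    }

    func encode(_ value: Int) {
        writeSeparator()
        // String(value) is used only to keep the sketch short.
        bytes.append(contentsOf: String(value).utf8)
    }

    func encode(_ value: Bool) {
        writeSeparator()
        bytes.append(contentsOf: (value ? "true" : "false").utf8)
    }
}

/// Hands out buffers and concatenates them in creation order at the end.
/// Hypothetical type, not Foundation API.
final class DirectWriter {
    private var buffers: [ArrayBuffer] = []

    func unkeyedContainer() -> ArrayBuffer {
        let buffer = ArrayBuffer()
        buffers.append(buffer)
        return buffer
    }

    func finish() -> [UInt8] {
        var out: [UInt8] = [UInt8(ascii: "[")]
        for buffer in buffers where !buffer.bytes.isEmpty {
            if out.count > 1 { out.append(UInt8(ascii: ",")) }
            out.append(contentsOf: buffer.bytes)
        }
        out.append(UInt8(ascii: "]"))
        return out
    }
}

// The out-of-order case from above: container2 is written before container1.
let writer = DirectWriter()
let container1 = writer.unkeyedContainer()
let container2 = writer.unkeyedContainer()
container2.encode(2)
container1.encode(1)
let json = String(decoding: writer.finish(), as: UTF8.self)
```

In the common case there is only one live container at a time, so a real implementation could write straight into the output and only fall back to extra buffers when a second container shows up.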
Another benefit is that, with Swift handling the whole process and not delegating to JSONSerialization, it will be able to optimize things like the conversion of numbers to strings, which is rarely as fast as possible when using libc functions. It may even be able to write multiple numbers at once in a fast way with vectorized instructions. But this is an optional benefit.
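As a small illustration of doing the number conversion in Swift, here is a sketch that writes an integer's decimal digits straight into a byte buffer, with no libc call and no intermediate String. appendDecimal is a hypothetical helper, not an existing API, and it only covers integers; floating-point formatting is a much harder problem and isn't attempted here. I haven't benchmarked this exact code.

```swift
/// Appends the decimal ASCII form of `value` to `buffer`.
/// Hypothetical helper, not part of Foundation.
func appendDecimal(_ value: Int64, to buffer: inout [UInt8]) {
    if value < 0 { buffer.append(UInt8(ascii: "-")) }

    // Use the unsigned magnitude so Int64.min does not overflow on negation.
    var magnitude = value.magnitude
    let start = buffer.count

    // Emit digits least-significant first, then reverse them in place.
    repeat {
        buffer.append(UInt8(ascii: "0") + UInt8(magnitude % 10))
        magnitude /= 10
    } while magnitude != 0
    buffer[start...].reverse()
}

var numberBytes: [UInt8] = []
appendDecimal(-1203, to: &numberBytes)
let numberText = String(decoding: numberBytes, as: UTF8.self)
```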
I am confident that these changes would make JSONEncoder at least 3x as fast as it currently is. I have applied some of these same ideas and philosophies in ZippyJSON, which decodes at 4-5x the speed of JSONDecoder.