I am making a JSON encoder/decoder library with useful features that
the Foundation one does not have.
It is FineJSON.
To achieve some of these features, I need a JSON parser different from Foundation's.
So I am writing my own JSON parser.
It is RichJSONParser.
At first, my implementation was amazingly slower than Foundation's.
It was over 100x slower.
I did a lot of optimization work while watching the Time Profiler in Instruments.
I referred to the implementation of
JSON.parse and the lexer of the Swift compiler.
In the end, my code became 40x faster than the original version.
But it is still 2.5x slower than Foundation's.
This last wall is very high for me,
and I have no more ideas for optimizing it.
So I am posting this thread here.
Please share any useful information, ideas, or topics that could make it faster.
This is the current implementation.
Current scores (smaller is faster):
mine, Xcode 10.1: 7.193193
mine, Xcode 10.2 beta: 8.888982
Foundation, Xcode 10.1: 2.880308
From the profiler results, I have the following concerns.
String initialization cost.
My code builds a UTF-8 byte stream from a JSON string value if it contains a backslash or multibyte characters. In theory, my implementation always builds a valid UTF-8 sequence. But
Swift.String also validates it in its constructor. Can I cut this cost?
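To make the concern concrete, here is a minimal sketch of the decoding step. The helper name decodeStringValue is hypothetical and is not taken from RichJSONParser. It shows two things: the standard String(decoding:as:) initializer checks its input and repairs ill-formed bytes rather than trusting the caller, and a value without escapes can be decoded straight from a slice of the source buffer without building an intermediate array first.

```swift
// Hypothetical helper for illustration; not part of RichJSONParser.
func decodeStringValue(_ bytes: UnsafeBufferPointer<UInt8>) -> String {
    // String(decoding:as:) validates the bytes and replaces any ill-formed
    // sequence with U+FFFD, so the validation cost is paid here.
    return String(decoding: bytes, as: UTF8.self)
}

// A JSON string token including its surrounding quotes.
let source: [UInt8] = Array("\"héllo\"".utf8)

let value = source.withUnsafeBufferPointer { buffer -> String in
    // No backslash in this sample, so the bytes between the quotes can be
    // decoded in place from the source buffer.
    let body = UnsafeBufferPointer(rebasing: buffer[1..<(buffer.count - 1)])
    return decodeStringValue(body)
}

// An invalid byte is repaired instead of being trusted.
let repaired = String(decoding: [0xFF] as [UInt8], as: UTF8.self)
```

Here value is "héllo" and repaired is a single U+FFFD replacement character, which is why already-valid input still goes through a validation pass.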
String transcoding cost.
Swift 4.2 keeps strings as UTF-16 internally. So the UTF-8 stream I build has to be transcoded to UTF-16, which takes extra work.
Swift 5 is slower than Swift 4.2.
I ran the benchmark in Swift 5 with Xcode 10.2 beta.
I expected it to be faster than Swift 4.2,
because it keeps strings as UTF-8 internally, so the transcoding can be skipped.
Strangely, it is slower.
I am very confused by this result.
Array expanding cost.
According to the Time Profiler, array expansion consumes a noticeable amount of CPU time.
It can happen when a JSON array's elements or an object's key-value pairs exceed the capacity of the array's internal buffer.
I could specify the capacity (
reserveCapacity) using a heuristic prediction about the JSON.
But the number of elements varies widely, from 0 to over 100.
It is a hard tradeoff between time and memory.
I think the gain would be very small.
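For reference, a minimal sketch of the reserveCapacity idea. The guess of 16 elements is an arbitrary assumption for illustration, not a measured value, and the variable names are hypothetical.

```swift
// Arbitrary guess for illustration; a real heuristic would need tuning.
let guessedElementCount = 16

var elements: [Int] = []
// Allocate storage for the guessed number of elements up front.
elements.reserveCapacity(guessedElementCount)
let capacityAfterReserve = elements.capacity

// Appending up to the reserved count does not need to grow the buffer.
for i in 0..<guessedElementCount {
    elements.append(i)
}
let capacityAfterAppends = elements.capacity
```

If the guess is too small, the array still grows as usual. If it is too large, the extra memory stays allocated for as long as the array lives, which is the time and memory tradeoff described above.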
Source location tracking cost.
My implementation tracks the line number and column number in the JSON source.
This helps to produce human-friendly error messages when a parse error happens.
It needs more operations, on two
Int variables, than tracking only the offset.
But I feel this penalty is very small.
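One alternative to tracking both values on every byte is to track only the offset and compute the line and column when an error is actually reported. This is a sketch of that alternative, not what RichJSONParser does. SourceLocation and location(ofOffset:in:) are hypothetical names, and the column is counted in UTF-8 bytes for simplicity.

```swift
// Hypothetical type for illustration.
struct SourceLocation: Equatable {
    var line: Int    // 1-based
    var column: Int  // 1-based, counted in UTF-8 bytes
}

// Rescan the source up to `offset` to recover line and column.
// This is O(offset), but it only runs on the error path.
func location(ofOffset offset: Int, in bytes: [UInt8]) -> SourceLocation {
    var line = 1
    var lineStart = 0
    var index = 0
    let end = min(offset, bytes.count)
    while index < end {
        if bytes[index] == 0x0A {  // "\n"
            line += 1
            lineStart = index + 1
        }
        index += 1
    }
    return SourceLocation(line: line, column: end - lineStart + 1)
}

let json = Array("{\n  \"a\": ?\n}".utf8)
let errorOffset = json.firstIndex(of: UInt8(ascii: "?"))!
let errorLocation = location(ofOffset: errorOffset, in: json)
// errorLocation is line 2, column 8
```

The hot loop then only advances a single offset, and the rescan cost is paid once per failed parse.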
Even if all these costs were cut, my intuition is that the CPU time saved would be small compared to the gap between mine and Foundation's.
What is the remaining difference?
I am concerned about the heap allocation cost of Swift classes.
I suspect that
Foundation.JSONSerialization has some specialized allocator for JSON. I don't have enough knowledge about this. If that is the case, there is no way to close this gap. Even if I fought with raw allocation or
Unmanaged<T>, I would need to build an additional conversion step to ordinary Swift objects for the library interface.