How to parse concatenated JSON with Foundation.JSONDecoder?

i’m trying to write some benchmarks to compare ss-json with Foundation.JSONDecoder. however, ss-json is designed for network streams that can yield multiple concatenated JSON messages in the same ByteBuffer, and Foundation.JSONDecoder’s default settings only allow parsing a single JSON object.

to get a meaningful sample size, the test data contains many megabytes of JSON captures collected in a text file.

is there a way to get Foundation.JSONDecoder to parse multiple JSON messages?

I don't think there is a way.

Do you have some unique divider between the JSON messages in your text file? If so, just search-and-replace it with "," and wrap the whole thing in [] to make it one big JSON array.
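A minimal sketch of that idea, assuming the captures are separated by a divider that never occurs inside a message. The helper name `wrapInArray`, the `Message` type, and the divider used in the usage note are made up for illustration:

```swift
import Foundation

// Hypothetical payload type, used only for illustration.
struct Message: Decodable {
    let id: Int
}

// Hypothetical helper: turns divider-separated JSON messages into one JSON array.
// Assumes `divider` never appears inside a message.
func wrapInArray(_ text: String, divider: String) -> Data {
    let joined = text
        .components(separatedBy: divider)
        // Drop empty pieces, e.g. from a trailing divider at the end of the file.
        .filter { !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty }
        .joined(separator: ",")
    return Data("[\(joined)]".utf8)
}

// Decodes every message in one JSONDecoder call.
func decodeAll(_ text: String, divider: String) throws -> [Message] {
    try JSONDecoder().decode([Message].self, from: wrapInArray(text, divider: divider))
}
```

For example, `try decodeAll(capture, divider: "\n---\n")` would decode a capture file whose messages are separated by `---` lines.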


well, the idea is to simulate the conditions under which a real application would be receiving JSON, so assuming the JSON text is pre-parsed becomes a bootstrap problem.

the captures come from FTX, which is a good case study of an API that streams a firehose of JSON (> 1MB/sec) and which does not insert delimiters in the text stream. the message is considered complete after a successful parse, which is unambiguous.

JSONSerialization has JSONSerialization.jsonObject(with: InputStream...) but AFAIK it doesn't do what you want: it simply reads the whole stream into a Data and then effectively calls jsonObject(with: Data). JSONDecoder doesn't have even that. Neither API can parse data partially and tell the caller how many bytes were consumed, so that the caller can continue parsing from there. Neither can parse a JSON stream; their only job is to parse a single valid JSON document, and if you think about it, this: { }{ } is not valid JSON, while this is: [{ },{ }]
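To see the { }{ } versus [{ },{ }] difference directly, here is a small check. The helper name `isSingleJSONDocument` is made up; it only reports whether JSONSerialization accepts the bytes as one document:

```swift
import Foundation

// Hypothetical helper: true if Foundation accepts `data` as one JSON document.
func isSingleJSONDocument(_ data: Data) -> Bool {
    // jsonObject(with:options:) throws on invalid input; try? turns that into nil.
    (try? JSONSerialization.jsonObject(with: data)) != nil
}

let concatenated = Data("{ }{ }".utf8)    // two documents back to back
let wrapped = Data("[{ },{ }]".utf8)      // one array holding two objects

let concatenatedParses = isSingleJSONDocument(concatenated)
let wrappedParses = isSingleJSONDocument(wrapped)
```

The second object in the concatenated input is treated as trailing garbage, so the whole parse fails and the caller gets no information about where the first message ended.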

JSONSerialization / JSONDecoder are not applicable to this task unless you prepare the data manually and either aggregate the whole thing into one big JSON array, or feed the small JSON messages in one by one. I remember dealing with this issue with the Twitter streaming API.

At least in that case it was easy to prepare the data / find message delimiters.
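When there are no delimiters at all, one way to "prepare the data manually" is to pre-scan the buffer for message boundaries by tracking nesting depth. This is only a sketch under assumptions: `splitConcatenatedJSON` is a made-up helper, not a Foundation API, and it assumes every top-level message is an object or array:

```swift
import Foundation

// Hypothetical helper: splits back-to-back JSON objects/arrays into one Data per
// message, and reports how many bytes belong to complete messages.
func splitConcatenatedJSON(_ buffer: Data) -> (messages: [Data], consumed: Int) {
    var messages: [Data] = []
    var depth = 0            // current nesting level of {} and []
    var inString = false     // brackets inside string literals must be ignored
    var escaped = false      // previous byte was a backslash inside a string
    var start = buffer.startIndex
    var end = buffer.startIndex

    for index in buffer.indices {
        let byte = buffer[index]
        if inString {
            if escaped {
                escaped = false
            } else if byte == UInt8(ascii: "\\") {
                escaped = true
            } else if byte == UInt8(ascii: "\"") {
                inString = false
            }
            continue
        }
        switch byte {
        case UInt8(ascii: "\""):
            inString = true
        case UInt8(ascii: "{"), UInt8(ascii: "["):
            if depth == 0 { start = index }
            depth += 1
        case UInt8(ascii: "}"), UInt8(ascii: "]"):
            if depth > 0 {
                depth -= 1
                if depth == 0 {
                    // Copy the slice so the message has zero-based indices.
                    messages.append(Data(buffer[start...index]))
                    end = index + 1
                }
            }
        default:
            break
        }
    }
    return (messages, end - buffer.startIndex)
}
```

Each element of `messages` can then be passed to `JSONDecoder().decode(_:from:)` on its own, and any bytes past `consumed` are an incomplete message to keep for the next read. Note that the pre-scan touches every byte before the decoder does, so a benchmark should count that cost too.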

That would be yet another thing I'd like to see addressed in JSONDecoder/JSONSerialization alternatives.
