[Proposal] SOAR-0010: Support for JSON Lines, JSON Sequence, and Server-sent Events

Hi all,

The proposal SOAR-0010: Support for JSON Lines, JSON Sequence, and Server-sent Events for Swift OpenAPI Generator is now up and In Review:

The review period will run until January 4th, 2024. Please post your feedback either as a reply to this thread, or on the pull request.

Thanks!

cc @Honza_Dvorsky

After something like this is accepted, how fast is adoption expected from the providers? Vapor does not yet support SSE (though it’s pretty much right around the corner) and I suspect other providers will have their own giants to slay.

This proposal contains a full implementation, both for creating and consuming these sequences, and will work with all transports right away. Transport providers don't need to add any support, just like with the multipart feature recently.

You can give the feature a try, it's all implemented on the branches of swift-openapi-runtime (the encoding and decoding async sequences) and swift-openapi-generator (fully working client and server examples), linked from the proposal.

I have to say: I did not expect such a well-rounded event streaming solution in the context of the OpenAPI generator (especially this early in its lifecycle).

Despite my surprise, this looks like a great idea to me ... and it seems to fit in surprisingly well.
Kudos!

The only thing I have a weird feeling about is the fragmentation around "bag of bytes"/"stream of bytes" in the swift ecosystem.

[UInt8], Data, ByteBuffer, ArraySlice<UInt8>, AsyncBufferedByteIterator, AsyncSequence<ArraySlice<UInt8>> ... oh my.

Not saying this is a problem for this proposal, or a problem at all, but this one seems to lock in "stream of ArraySlice" quite firmly.

You're right, the ecosystem is yet to align around a single bag-of-bytes type.

That said, we chose to use an array slice of bytes back when introducing streaming in 0.3.0, and then doubled down when introducing multipart support in 1.0.0-alpha.1. This is a continuation of that trend. However, once a unified byte type comes along, we'd happily add conveniences that make it easy to interop.

Regarding the proposal - thanks for the kind words. Once the lowest level is fully asynchronous and supporting streaming, there are various very cool and powerful use cases, such as event streams, that just naturally compose on top. That's why we carefully designed it in 0.3.0, with all these benefits in mind.

It was the same story for multipart, which also supports full streaming of parts and even individual part contents.

If you have other ideas about where full end-to-end streaming can be beneficial to users, let me know!

Thanks @sliemeobn for your feedback. Agree with @Honza_Dvorsky on both points.

First that, given the right abstractions, these things should compose easily regardless of the transport.

Second, that we are deliberately and openly avoiding getting into the bag-of-bytes business. We wanted to avoid a dependency on NIO, so we didn't want to use ByteBuffer. Instead we wanted to use the most idiomatic thing we have today, which we felt was the async sequence of chunks of bytes, using AsyncSequence<ArraySlice<UInt8>>.

Ideally our runtime library will complement the ecosystem so, if/when the community converges on a solution, we'll definitely consider using it.

We took a similar stance on the HTTP currency types, which we were quick to replace with swift-http-types as soon as they became available.

This might be a novice question, so I’m happy to stand corrected, but this feels like a feature that would be useful in the standard library/Foundation JSON support. Is there any equivalent there? Might there be one day soon?

I’m currently trying to understand how to do something similar: decompressing large archived JSON data and decoding in memory. Am looking at potential overlap in functionality and wondering where else this might be found in Swift.

Yes, potentially. In the future, this could be part of another library in the ecosystem. But it didn't make sense to wait for that before providing the functionality, especially given the bag-of-bytes fragmentation discussed above.

Regarding streaming parsing of JSON, that's orthogonal to the problem we solved here.

The proposal provides 3 sequences for encoding/decoding a stream of events, but the individual events are still parsed using Foundation's JSONDecoder. Parsing an individual event still requires buffering the full line (for example, for JSON Lines).

If you wanted to do streaming parsing of the contents of individual events as well, you'd probably need to write a brand new streaming JSON decoder.
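
To make the buffering point concrete, here is a minimal sketch of line-buffered JSON Lines decoding using only Foundation. It is not the proposal's implementation: JSONLinesBuffer and Greeting are hypothetical names made up for this example. It only shows the general idea that bytes accumulate until a newline arrives, and each complete line is then handed to JSONDecoder in one piece.

```swift
import Foundation

// Hypothetical event type, for illustration only.
struct Greeting: Codable, Equatable {
    var message: String
}

// Hypothetical helper (not the proposal's API): buffers incoming byte chunks
// and decodes each complete line with Foundation's JSONDecoder.
struct JSONLinesBuffer<Event: Decodable> {
    private var pending: [UInt8] = []
    private let decoder = JSONDecoder()

    init() {}

    // Feeds one chunk and returns the events whose lines are now complete.
    mutating func append(_ chunk: ArraySlice<UInt8>) throws -> [Event] {
        var events: [Event] = []
        for byte in chunk {
            if byte == UInt8(ascii: "\n") {
                // The full line is buffered, so it can be decoded as one JSON value.
                if !pending.isEmpty {
                    events.append(try decoder.decode(Event.self, from: Data(pending)))
                }
                pending.removeAll(keepingCapacity: true)
            } else {
                pending.append(byte)
            }
        }
        return events
    }
}

// Two JSON Lines events, delivered as two chunks that split the first line.
let bytes = Array("{\"message\":\"hello\"}\n{\"message\":\"hi\"}\n".utf8)
var buffer = JSONLinesBuffer<Greeting>()
let first = try buffer.append(bytes[..<10])   // no newline seen yet
let second = try buffer.append(bytes[10...])  // both lines complete
```

Here the first call yields no events because the line is still incomplete, and the second call yields both. The memory cost is bounded by the longest line rather than by the whole stream, which is the trade-off described above.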

I think I just didn't explain my use case well: I'm working with large files containing newline delimited JSON, so a giant array of JSON objects that I'd like to parse one line at a time. I read the proposal again, and admit I don't totally understand it, but it sounds really close to what I'd like.

I get that we don't want to wait on Foundation to implement this, but it would be nice to consider how this might be more broadly useful and if it's a substantial use case, if or how it might "graduate" to Foundation at a later date.

Ah gotcha - yes, you can use it to load a large JSON Lines file from disk! You'd just first need to write an AsyncSequence implementation that reads a file from disk and gives you the bytes as ArraySlice.

One example of such an implementation is soon landing in SwiftNIO: Add NIOFilesystem by glbrntt · Pull Request #2615 · apple/swift-nio · GitHub

So you'd open a file for reading using the new NIO code, which gives you an async sequence of byte chunks. You then map each chunk to an array slice of bytes, and use the convenience function from my proposal to parse the JSON Lines stream directly.
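
The shape of that pipeline can be sketched roughly as follows, using only Foundation so it runs on its own. Everything named here is an assumption for illustration: the in-memory AsyncStream stands in for the file reader, and decodeJSONLines and Record are hypothetical, standing in for the proposal's convenience function and your own type.

```swift
import Foundation

// Hypothetical record type, for illustration only.
struct Record: Codable, Equatable {
    var id: Int
}

// Hypothetical helper (not the proposal's API): decodes newline-delimited JSON
// from any async sequence of byte chunks. Results are collected into an array
// only to keep the sketch short.
func decodeJSONLines<Chunks: AsyncSequence, Event: Decodable>(
    from chunks: Chunks,
    as eventType: Event.Type
) async throws -> [Event] where Chunks.Element == ArraySlice<UInt8> {
    let decoder = JSONDecoder()
    var events: [Event] = []
    var line: [UInt8] = []
    for try await chunk in chunks {
        for byte in chunk {
            if byte == UInt8(ascii: "\n") {
                if !line.isEmpty { events.append(try decoder.decode(Event.self, from: Data(line))) }
                line.removeAll(keepingCapacity: true)
            } else {
                line.append(byte)
            }
        }
    }
    // A final line without a trailing newline is still decoded.
    if !line.isEmpty { events.append(try decoder.decode(Event.self, from: Data(line))) }
    return events
}

// Stand-in for a file reader: an in-memory stream of Data chunks.
let fileChunks = AsyncStream<Data> { continuation in
    continuation.yield(Data("{\"id\":1}\n{\"i".utf8))
    continuation.yield(Data("d\":2}\n{\"id\":3}".utf8))
    continuation.finish()
}

// Map each chunk to ArraySlice<UInt8>, then decode line by line.
let records = try await decodeJSONLines(
    from: fileChunks.map { ArraySlice<UInt8>($0) },
    as: Record.self
)
```

For a genuinely large file you would process each record as it arrives instead of collecting them all, which is what an async sequence of decoded events lets you do.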

(Stepping in for the review manager for this proposal, who's out at the moment)

The review period for this proposal has now concluded. Thanks to everyone who provided feedback!

There were no objections to this proposal. Most comments were about ecosystem-wide improvements, e.g. a centralised bag-of-bytes type. While these comments are welcome and useful datapoints supporting the desire for such wider improvements, they are beyond the scope of this proposal, and of the Swift OpenAPI Generator project in general.

This proposal can now move to "Ready for implementation".
