I'm thinking about how implementing a protocol in SwiftNIO would work, where there are a variety of packet types, but one or more may contain a large payload. For myself, this is implementing SFTP; however, HTTP is a good analog as well. An example: a client connects to the server, and sends an HTTP PUT request with a 1 GB body as the file contents.
An HTTP request parser would not want to use a ByteToMessageDecoder that loads the entire body into a ByteBuffer: that would consume a lot of server memory, and it would delay processing of the request until the entire gigabyte has been uploaded. (What if the request is unauthorized? We'd want to send back an HTTP 4xx, or kill the connection, before wasting that bandwidth and those resources.)
I've looked into SwiftNIO's HTTP implementation for insight, but it seems to sit a step below what I'm thinking of. It has a ByteToMessageDecoder that converts the incoming stream of bytes into a single .head part, zero or more .body parts (depending on the type of HTTP request), and a single .end part, where each .body case carries a small chunk of data that was read off of the socket. This is as opposed to a single complete body.
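To make that concrete, here is roughly what consuming those parts looks like. This is only a sketch: `HTTPServerRequestPart` and its `.head`/`.body`/`.end` cases are the real NIOHTTP1 enum, but the handler name is mine.

```swift
import NIO
import NIOHTTP1

// Sketch: a handler sitting after NIOHTTP1's decoder, which delivers the
// request as a head, a series of body chunks, and an end marker.
final class PartLoggingHandler: ChannelInboundHandler {
    typealias InboundIn = HTTPServerRequestPart

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        switch self.unwrapInboundIn(data) {
        case .head(let head):
            // Arrives once, before any body bytes.
            print("request: \(head.method) \(head.uri)")
        case .body(let chunk):
            // Zero or more small chunks, each roughly one socket read.
            print("got \(chunk.readableBytes) body bytes")
        case .end:
            // Arrives once, after the final body chunk (and any trailers).
            print("request complete")
        }
    }
}
```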
Next, I looked at Vapor's implementation on top of SwiftNIO's NIOHTTP1/2.
Vapor does what I'm thinking of, at least partially. It consumes NIOHTTP's head, zero or more body parts, and end (tracking what it should be waiting for via internal state on the handler, which makes sense). When it encounters a body chunk, it writes it to a stream set up on the header object that was passed on for processing. The implementation of that BodyStream, however, stores a contiguous array of BodyStreamResult values, which are similar to NIOHTTP's .body cases in that each holds a chunk of data from the socket connection. Once something begins reading the body stream, any future incoming data immediately invokes the read handler instead of being stored in memory.
My initial apprehension about this is that memory may quickly be allocated to store a large amount of incoming data. If there is some kind of delay before the stream starts being read, or if writing that data to disk or a database is slower than it arrives on the connection, wouldn't that consume too much RAM? Could this be a potential denial-of-service vector?
(To be honest, I've been musing on this for a while, and only just considered that, realistically, the delay between storing incoming chunks and the application code doing something with the body stream is likely short, so memory wouldn't be consumed too badly, at least in an HTTP server scenario. But consider another scenario, where the application waits on user input before reading the body. For example: Chrome or Safari asking whether you want to allow a website to download a file to your storage. The browser has received the HTTP header, but only starts reading the body stream after user input.)
There are two solutions, I think, to this problem:
- Stream the incoming data into a temp file in storage if the memory consumption begins to get too big
- Only read data off the socket when it's requested
First, for "Stream the incoming data into a temp file in storage if the memory consumption begins to get too big": I think this one can be somewhat straightforward. One could feasibly take something like Vapor's Request.BodyStream and add code to monitor its memory consumption; if it grows too large, create a temp file and dump all of the memory-stored data into it. A future read operation on the stream would then drain data out of the file (and directly from the socket after that).
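A sketch of the spill-over idea, for concreteness. Every name here is hypothetical (this is not Vapor API); the shape is just "buffer chunks in RAM until a threshold, then dump to a temp file and append there from then on":

```swift
import Foundation
import NIO

// Hypothetical spill-to-disk buffer for solution #1.
final class SpillingBodyBuffer {
    private let threshold: Int
    private var chunks: [ByteBuffer] = []   // in-memory portion
    private var buffered = 0                // total bytes seen so far
    private var spill: FileHandle?          // non-nil once we've spilled
    private let spillURL = FileManager.default.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)

    init(threshold: Int = 1 << 20) { self.threshold = threshold } // 1 MiB, say

    func append(_ chunk: ByteBuffer) throws {
        buffered += chunk.readableBytes
        if spill == nil && buffered <= threshold {
            chunks.append(chunk)            // still small: keep it in RAM
            return
        }
        if spill == nil {                   // crossed the threshold: dump RAM to disk
            FileManager.default.createFile(atPath: spillURL.path, contents: nil)
            spill = try FileHandle(forWritingTo: spillURL)
            for held in chunks {
                try spill!.write(contentsOf: Data(held.readableBytesView))
            }
            chunks.removeAll()
        }
        try spill!.write(contentsOf: Data(chunk.readableBytesView))
    }
}
```

A reader of the stream would then drain the temp file first, and only after that start pulling chunks straight off the live connection.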
Second, for "Only read data off the socket when it's requested", this is where it gets tricky, I think.
The way I understand it, SwiftNIO fires the channel pipeline whenever, and as soon as, any data is available in the socket's read buffer. The handler can read the given ByteBuffer, or it can choose to do nothing with it, but that's its only chance to do so. If the code does nothing with the data, the data is lost, and the next pipeline invocation will deliver a new ByteBuffer. This is as opposed to a ByteToMessageDecoder, which keeps its own ByteBuffer that accumulates incoming data off the socket until enough is present to meet the decoder's needs; the decoder then reads from the buffer, and the buffer can be reused. I think this is correct.
But because of this, suppose the channel pipeline plus a ByteToMessageDecoder produced a header message, and the application called some asynchronous code to analyze and process it. That work would run via a Promise/Future, meaning a new invocation scheduled on the EventLoop. Nothing stops the EventLoop from continuing to invoke the channel pipeline with more data to read and process in the meantime. (I did a lot of research on run loops and event loops, and I think this is correct.) In our example of a large file upload, it is almost as if the incoming data never stops arriving: if the header processor chose to wait 5 seconds before processing the body, that's 5 seconds' worth of data still being read in by SwiftNIO and buffered somewhere.
My question, then, is whether it's possible to temporarily stop waiting on the channel's socket file descriptor, which the EventLoop uses internally in its epoll/kqueue system call. The kernel's receive buffer for the socket would then fill up, and TCP flow control would tell the client to temporarily stop sending new segments, all until the server begins reading again. This way no application-level buffer is needed, DoS resistance is a bit better, and we don't have to worry about temp files and writing to storage or memory.
The header handler would, of course, have to begin reading the body at some point: either because the header was accepted and it wants to stream the data directly to its proper place in storage or in the database (using a much smaller buffer for this), or because the header was rejected and the connection can be killed entirely, or because it just needs to drain the body into /dev/null (the void) in order to respond with an error and process the next incoming request.
The closest thing I can find to this is ChannelHandlerContext's fireChannelInactive, to temporarily stop SwiftNIO from pushing data into the application while we wait, and then fireChannelActive to re-activate it once the stream is being read again. I'm not sure that's what these functions actually do, though; I fear they would kill the connection, or that we aren't really supposed to use them like this during normal read operations.
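While writing this up I also ran across the `autoRead` channel option, which sounds closer to what I'm describing, though I'm not sure this is its intended use either. A sketch of what I imagine (the handler name and the `resume` trigger are mine; `ChannelOptions.autoRead` and the outbound `read(context:)` hook are real SwiftNIO API):

```swift
import NIO

// Sketch: turn off automatic reads, and only call context.read() when the
// application has asked for more body data. With autoRead off, unread data
// backs up in the kernel buffers and TCP flow control pushes back on the client.
final class ManualReadHandler: ChannelDuplexHandler {
    typealias InboundIn = ByteBuffer
    typealias InboundOut = ByteBuffer
    typealias OutboundIn = Never
    typealias OutboundOut = Never

    private var paused = true
    private var readPending = false

    func handlerAdded(context: ChannelHandlerContext) {
        // Stop NIO from issuing read() on our behalf after every channelRead.
        _ = context.channel.setOption(ChannelOptions.autoRead, value: false)
    }

    func read(context: ChannelHandlerContext) {
        // Intercept read requests coming down the pipeline; hold them while paused.
        if paused { readPending = true } else { context.read() }
    }

    // Hypothetical trigger: called when the application starts consuming the body.
    func resume(context: ChannelHandlerContext) {
        paused = false
        if readPending { readPending = false; context.read() }
    }
}
```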
Is this wise to do? Is solution #1 more practical? Or is Vapor's approach good enough for most scenarios? I'm a bit confused about how to proceed, especially after all this research into run loops and how they interact with threads and asynchronous operations.