Is there a length restriction on channelRead?

I created a server and a client using SwiftNIO. They communicate via TCP. To ensure correct communication, I wrap every message from server to client and vice versa in an XML string like this:
<packet ...>...</packet>.

As far as I could see, this works perfectly, and every time channelRead was called I had a complete packet. My question now is: can it happen that one packet is split across two calls of channelRead even though the other side sent just one message?

By the way, my func looks like this:

func channelRead(context: ChannelHandlerContext, data: NIOAny) {
    var byteBuffer = self.unwrapInboundIn(data)

    if let string = byteBuffer.readString(length: byteBuffer.readableBytes) {
        self.messageHandler?(context.channel, string)
    }
}

Yes. TCP is stream-oriented: it does not expose message boundaries, and treats the data as a stream of bytes. Your code does need to guard against the possibility that messages will be split up across packets.

A good thing to do is to write a ByteToMessageDecoder that can process this framing. As an example (it's not super efficient but it's workable):

struct XMLFraming: ByteToMessageDecoder {
    typealias InboundOut = ByteBuffer

    mutating func decode(context: ChannelHandlerContext, buffer: inout ByteBuffer) throws -> DecodingState {
        var view = buffer.readableBytesView

        while view.count > 0, let index = view.firstIndex(of: UInt8(ascii: "<")) {
            view = view[index...]
            if view.starts(with: "</packet>".utf8) {
                // Got a match: deliver everything up to and including the closing tag.
                view = view.dropFirst("</packet>".utf8.count)
                let packet = buffer.readSlice(length: buffer.readableBytes - view.count)!
                context.fireChannelRead(self.wrapInboundOut(packet))
                return .continue
            }
            // Not the closing tag: step past this "<" so the next search makes progress.
            view = view.dropFirst()
        }

        return .needMoreData
    }
}

This will chunk up the messages, sending one ByteBuffer that contains the full framing. If you wanted you could also arrange for this object to strip off the leading <packet ...> and trailing </packet>. You can then create a ByteToMessageHandler from this ByteToMessageDecoder, and insert it into the pipeline before your ChannelHandler.
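As a sketch, wiring the decoder into a server's pipeline could look like the following. Here `group` is your EventLoopGroup and `MessageHandler` is a placeholder name for your own ChannelInboundHandler; neither appears in the thread above, so adapt them to your code.

```swift
// Pipeline configuration sketch: the ByteToMessageHandler drives the
// XMLFraming decoder and only forwards complete <packet ...>...</packet>
// frames to the handler behind it.
let bootstrap = ServerBootstrap(group: group)
    .childChannelInitializer { channel in
        channel.pipeline.addHandlers([
            ByteToMessageHandler(XMLFraming()),
            MessageHandler()   // hypothetical: your existing channelRead logic
        ])
    }
```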

This will guarantee that every channelRead contains one and only one packet that matches your framing.


Thank you very much for your answer, and also for your example!

I checked my code again because I was wondering why it had been working all the time, and then I was a little surprised to realize that I had answered this question for myself a year ago when I wrote that code. I actually solved this problem in a different way:

let packetBegin = "<packet"
let packetEnd = "</packet>"
var dataRead = ""

func handleMessageString(messageString: String) {
    dataRead = dataRead + messageString

    if let startRange = dataRead.range(of: packetBegin), let endRange = dataRead.range(of: packetEnd) {
        let xmlString = String(dataRead[startRange.lowerBound..<endRange.upperBound])
    dataRead = String(dataRead[endRange.upperBound...])   // keep everything after </packet>
        ...
    }
}

This function is called from channelRead every time (see self.messageHandler in my first post).

But I will have a closer look at your solution as well.

You can also use ByteToMessageDecoderVerifier from NIOTestUtils to test your decoder's ability to deal with incomplete messages: it feeds your decoder the input split into chunks of different sizes and verifies the output.

For example:

func testPingThenDisconnectDecoding() {
    // Given
    let channel = EmbeddedChannel()
    var disconnectInput = channel.allocator.buffer(capacity: 4)
    disconnectInput.writeBytes([0b11000000, 0b00000000, 0b11100000, 0b00000000])
    let expectedInOuts = [(disconnectInput, [Packet.ping(.init()), Packet.disconnect(.init())])]
    
    // When, Then
    XCTAssertNoThrow(try ByteToMessageDecoderVerifier.verifyDecoder(inputOutputPairs: expectedInOuts,
                                                                    decoderFactory: { PacketDecoder() }))
}

In the end this turns out to be quite a tricky thing. I discovered a problem with using a UTF-8 String here: if a multi-byte character is split exactly between two messages, my solution no longer works (appending the two Strings gives a different result than decoding the whole data at once). I am using:

func channelRead(context: ChannelHandlerContext, data: NIOAny) {
	var buffer = self.unwrapInboundIn(data)
	if let string = buffer.readString(length: buffer.readableBytes) {
		// pass the message to the messageHandler
		self.messageHandler(string)   // INFO: this will call my handleMessageString
	}
}

@lukasa: I don't understand your solution. Where can I put this?

In the meantime I implemented another solution using only Data, so I append Data and not Strings.

@Lupurus Do you get any length prefix before the actual string, or some end signal so that you know the string has been fully received? You could use those markers to know when you have received the complete string, and only then start reading it.

@fabianfett: Yes, of course I do... as I wrote in my first post:

<packet ...>MYMESSAGE</packet>

But I will have a problem if this one gets split into:

PART 1:
-----
<packet ...>.... some text \u00

AND PART 2:
----------
E4 rest of the message</packet>

This results in a broken string, so handling it as raw data works.
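A sketch of that Data-based approach (the type and names are my own for illustration, not the poster's actual code): bytes are accumulated as Data, and UTF-8 decoding only happens once a complete frame ending in </packet> is present, so a multi-byte character split across two reads cannot break decoding.

```swift
import Foundation

// Accumulates raw bytes and extracts complete <packet ...>...</packet> frames.
struct PacketBuffer {
    private var pending = Data()
    private let end = Data("</packet>".utf8)

    // Append the bytes of one channelRead; returns any complete packets.
    mutating func append(_ chunk: Data) -> [String] {
        pending.append(chunk)
        var packets: [String] = []
        // Extract every complete frame currently in the buffer.
        while let endRange = pending.range(of: end) {
            let frame = pending[..<endRange.upperBound]
            // Decode only whole frames, so split characters are impossible.
            if let text = String(data: frame, encoding: .utf8) {
                packets.append(text)
            }
            pending = Data(pending[endRange.upperBound...])
        }
        return packets
    }
}
```

Splitting a frame in the middle of "ü" (bytes 0xC3 0xBC) then yields nothing on the first append and the intact packet on the second.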

I have to come back to this thread, because I would like to understand some of the background, as I now have another question related to this issue.

How does this work under the hood? I parse the data every time something happens on channelRead in the ChannelInboundHandler. What happens if the network is slow or very fast? According to the German Wikipedia, the maximum size of a TCP packet is 1,500 bytes. The English Wikipedia says that the maximum segment size should be set small enough. Is the size variable depending on network speed, like when I download a file via HTTP?

My intention now is for the client to report more about the transfer progress. So I would like to prepend the size to big data packages, and the client can then show how much data it has already received and how far along the download is. But what if the size of a TCP package is too big? Then the progress indicator would do nothing for a long time, since almost all the data might be transferred in one TCP package.

So this runs into a bunch of networking details, and it's going to be tricky to give a detailed answer, but I can give a high-level overview.

The first is the maximum size of a TCP segment. This is called the MSS (Maximum Segment Size). Importantly, the MSS is not hard-coded to 1500 bytes (usually), and in fact is almost never actually equal to 1500 bytes. Instead, the MSS is derived from a different concept, the MTU (Maximum Transmission Unit): more specifically, the path MTU.

The MTU is the limit of the largest single unit that can be transmitted on the network layer. Typically in most networks this ends up being the largest size of payload in an ethernet frame. Most networks limit the payload of an ethernet frame to 1500 bytes. This means the maximum number of bytes in an IPv4 TCP segment is 1500 bytes minus the smallest possible IPv4 header size (20 bytes) minus the smallest possible size of a TCP header (20 bytes), allowing for 1460 to be the MSS. In some cases you can use "Jumbo Frames" to make the ethernet frame payload larger, which allows a larger MSS, and on loopback the MSS can potentially be huge, but in general the MSS is 1460 bytes for IPv4.

Importantly, this is a hard upper limit. TCP attempts to avoid IP fragmentation at all costs, so each TCP segment will never be larger than 1460 bytes. How do we send more data than 1460 bytes then?

Well, the trick is that TCP has what's called a "send window". This is the maximum number of unacknowledged bytes that the receiver will allow to be in flight. This number changes over time based on congestion control feedback. For fast reliable networks, this number increases over time, allowing the sender to send faster and faster. When the sender has sent as many bytes as the window allowed, it has to wait for acknowledgements before it sends more. This allows the receiver to exert backpressure on the sender, slowing it down.

Thus, the MSS is not variable, but the total number of bytes you can have on the wire is.

Therefore, a TCP packet being "too big" is not going to be a problem. Note also that TCP packets do not arrive slowly: each arrives in a single transfer unit (an ethernet frame). This means they either arrive, or they do not. If they arrive, you can read their bytes; if they don't, you can't.


Thank you very much for that detailed explanation! Maybe one day I will have time to read a little more about the details, but so far I think I understood: if I have data with a size of 5 MB, for example, the data will be split into packets of 1460 bytes. If the client knows that the data will be 5 MB, it can calculate on every channelRead how much data has already been received and how much is still left. I'll give that a try.
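A minimal sketch of that progress idea (the type and its fields are assumptions for illustration; the total size would come from your own length prefix, not from NIO):

```swift
// Tracks download progress across channelRead calls, assuming the sender
// announced the total payload size up front (e.g. in the <packet> header).
struct TransferProgress {
    let totalBytes: Int
    private(set) var receivedBytes = 0

    // Call on every channelRead with the number of bytes just read;
    // returns the fraction complete, clamped to 1.0 in case of overshoot.
    mutating func add(_ count: Int) -> Double {
        receivedBytes += count
        return min(Double(receivedBytes) / Double(totalBytes), 1.0)
    }
}
```

Because each channelRead delivers at most a few segments' worth of bytes (bounded by the send window, not the whole 5 MB), the fraction advances smoothly rather than jumping from 0 to 1.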

Actually, thinking about this leads me to another question. You said that TCP is stream oriented. I googled this and read more about it. As far as I understood, all the data to be sent is queued in the outgoing stream and then arrives at the other side. As big data will be split up as described before, it's good to use delimiters like I did with <packet> and </packet>.

My other question now is: what happens if two big packages are sent at the same time? Can it happen that they get mixed up? Or will the queue look like this:

| Data A ..... | Data B ..... |

and then be split into

| Data A 1/10 | Data A 2/10 | ..... | Data B 1/4 | Data B 2/4 | .... |

In TCP it is not possible to send two packets "at the same time": they will always have an order. The closest you can come is to call write from two threads at the same time (which NIO forbids), but even when doing this your writes will be serialized and written as your example suggests.

Maybe worth noting that packets can arrive in arbitrary order and need to be brought back into order by the TCP stack on the receiving side. But that's all transparent to the user of a TCP socket (e.g. NIO).

Does that also mean that sending data at the same time is not possible? So if I have one big chunk of data to send from the server to the client, will all other data be queued until that one is finished?

If so, how could I achieve multiple transfers at the same time?

You open multiple TCP connections to the server, which is what you would do with HTTP/1.x (e.g. retrieve an HTML page and then all the images within it in parallel).
Or you layer your own protocol on top of TCP that allows multiple concurrent streams, which is what HTTP/2 does.

Thanks for your answer.

I also had a look at HTTP, but I hadn't known that HTTP/2 can use multiple concurrent streams over one connection. That's interesting, but will that work with NIO? lukasa said that I cannot use write from multiple threads.

Having multiple connections like HTTP/1 would lead to a complex rewrite of my code, as I have specific user rights and not every client can access all data, so I would need to ensure that the second connection is authorized somehow.

Sorry if the topic is now drifting away from the original question.

Yes, this works. You can't use the write system call from multiple threads, but that doesn't prevent NIO from offering you that interface. NIO's HTTP/2 implementation will absolutely work.

I am actually still thinking about this. I am not very convinced about using two different channels for the communication between server and client (one direct, permanent TCP connection plus one or more others via HTTP).

But this no longer has anything to do with the original question, so thank you a lot for helping me.