Partial input

(Helge Heß) #1

When hooking up SwiftProtobuf with SwiftNIO, you run into the issue that a chunk of stream data may contain either more or less data than a single Protobuf message.

What is the proper way to deal with that? Incomplete data can be detected using BinaryDecodingError.truncated, but what about data that belongs to the next message? You would need something like func decode() -> ( Message, RemainingData )?

(Johannes Weiss) #2

As long as the buffer has enough bytes to parse one protocol buffer message, read that off the buffer you got (buffer.readData(length: sizeOfTheProtoMessage)) and return .continue. Once the buffer doesn't have enough to decode the message, just return .needMoreData.

ByteToMessageDecoder will automatically call you in a loop and keep a cumulation buffer of bytes available for you.

Pseudocode for a simple protocol that is [32 bits of unsigned big endian int: length of message][message 1][32 bits of unsigned big endian int: length of message][message 2]...:

/* disclaimer: haven't compiled or tested this but should give an idea */
class MyProtoDecoder: ByteToMessageDecoder {
    [...]

    enum State {
        case waitingForLength
        case waitingForMessage(Int)
    }
    private var state = State.waitingForLength

    func decode(ctx: ChannelHandlerContext, buffer: inout ByteBuffer) -> DecodingState {
        switch self.state {
            case .waitingForLength:
                if let len = buffer.readInteger(as: UInt32.self) {
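                    if len > 4 * 1024 * 1024 {
                        /* just close channel if over 4 MB, in real-world code you might want to handle differently */
                        ctx.channel.close(promise: nil)
                        /* stop here so we never wait for an oversized message */
                        return .needMoreData
                    }
                    /* the state stores an Int, so convert the UInt32 length */
                    self.state = .waitingForMessage(Int(len))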
                    return .continue
                } else {
                    return .needMoreData
                }
            case .waitingForMessage(let length):
                if let data = buffer.readData(length: length) {
                    ctx.fireChannelRead(somehowDecodeAMessageFrom(data))
                    self.state = .waitingForLength
                    return .continue
                } else {
                    return .needMoreData
                }
        }
    }
}

does that make sense?

(edit: added a length check just in case someone wants to use code derived from this in real-world apps)

(Helge Heß) #3

That makes a lot of sense (it is handling the BinaryDecodingError.truncated part) but actually doesn't answer my question ;-) The question is specifically about the case when more data is available (i.e. a second message in the receive buffer). What is the API for this in SwiftProtobuf?

(Tim Kientzle) #4

Protobuf is not self-framing. This means you cannot simply append protobuf messages to a stream and then separate them again.

You'll need to add some information to your data stream to help you identify the start/end of each message. Usually, people do this by inserting an integer length value before each message.

SwiftProtobuf does not have an API to do this for you, so you'll have to implement it yourself. Johannes' code shows one approach.
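
For illustration, the writing side of such a length prefix could look roughly like the sketch below. It matches the 32-bit big-endian layout from Johannes' pseudocode. The helper name `frame` is made up for this example and is not part of SwiftProtobuf or SwiftNIO; the payload bytes are assumed to come from something like `try message.serializedData()`.

```swift
// Sketch: prefix an already-serialized message with a 4-byte big-endian length.
// `frame` is a hypothetical helper, not an API of SwiftProtobuf or SwiftNIO.
func frame(_ payload: [UInt8]) -> [UInt8] {
    // Traps if the payload is larger than UInt32.max bytes.
    let length = UInt32(payload.count)
    let header: [UInt8] = [
        UInt8(truncatingIfNeeded: length >> 24),
        UInt8(truncatingIfNeeded: length >> 16),
        UInt8(truncatingIfNeeded: length >> 8),
        UInt8(truncatingIfNeeded: length),
    ]
    return header + payload
}

// Two framed messages can now be appended to the same stream
// and separated again by the reader.
let stream = frame([0x08, 0x96, 0x01]) + frame([0x08, 0x01])
```

A varint length prefix is another common choice; the fixed 4-byte header is used here only because it is what the decoder earlier in the thread expects.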

(Johannes Weiss) #5

If you return .continue it’ll call decode again and form a parsing loop. So that should work just fine or am I missing anything?

(Helge Heß) #6

@johannesweiss Yes, you're missing the case in which more data is available.

(Helge Heß) #7

@tbkka I don't quite get this. When decoding anything, you consume N bytes of data. So there should be API which gives you the data not consumed. I can't see why I need an explicit frame here.

(Tim Kientzle) #8

I should also clarify: BinaryDecodingError.truncated does not detect all cases where a message might be truncated.
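
One way to see why, as a rough sketch: a protobuf message body is just a sequence of key/value fields, so a buffer that is cut exactly between two fields still looks like a complete (shorter) message. The toy parser below is an assumption-laden illustration, not SwiftProtobuf code: it only understands varint fields (wire type 0), and `parseVarintFields` is a made-up name.

```swift
// Toy parser for illustration only: handles just varint fields (wire type 0).
// Returns the decoded (field, value) pairs, or nil if the input ends mid-field.
func parseVarintFields(_ bytes: [UInt8]) -> [(field: Int, value: UInt64)]? {
    var index = 0
    func readVarint() -> UInt64? {
        var result: UInt64 = 0
        var shift: UInt64 = 0
        while index < bytes.count, shift < 64 {
            let byte = bytes[index]
            index += 1
            result |= UInt64(byte & 0x7F) << shift
            if byte & 0x80 == 0 { return result }
            shift += 7
        }
        return nil   // ran out of bytes in the middle of a varint
    }
    var fields: [(field: Int, value: UInt64)] = []
    while index < bytes.count {
        guard let key = readVarint(), key & 0x7 == 0, let value = readVarint() else {
            return nil
        }
        fields.append((field: Int(key >> 3), value: value))
    }
    return fields
}

// Encoded message with field 1 = 150 and field 2 = 2.
let full: [UInt8] = [0x08, 0x96, 0x01, 0x10, 0x02]
// Cut after the first field: parses fine, the loss of field 2 goes unnoticed.
let cutAtBoundary = parseVarintFields(Array(full.prefix(3)))
// Cut after the key of field 2: this truncation is detectable.
let cutMidField = parseVarintFields(Array(full.prefix(4)))
```

Only the second cut can be reported as truncated; the first one is indistinguishable from a sender that simply left field 2 unset.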

(Helge Heß) #9

Oh 🙂

(Helge Heß) #10

OK, I see. You explicitly add framing. Fair enough.

(Johannes Weiss) #11

@Helge_Hess1 the data not consumed will be handed to you in a subsequent call to decode that happens automatically when you return .continue. So you really only need to read one message off the buffer and B2MD (ByteToMessageDecoder) will call you repeatedly until you need more data to proceed.

If for whatever reason you want to pre-access data that decode would deliver on the next call, just check the bytes in buffer. After you read off the first message, you can use the usual API and the bytes between readerIndex and writerIndex are all yours. But really just return .continue and B2MD will call you again immediately (even if there are no further bytes received from the network as .continue indicates the buffer contains at least as much and likely more than needed for one message).

If this still doesn’t answer your question, can you provide example code on what you’d like to do?
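
To make the loop concrete without pulling in NIO, here is a minimal stand-alone sketch of the same idea. `FrameSplitter` is a hypothetical stand-in for the cumulation buffer plus the repeated decode calls, not a NIO type; it assumes the 4-byte big-endian length prefix from the pseudocode above. Returning a frame corresponds to .continue, returning nil to .needMoreData.

```swift
// Hypothetical stand-in for the cumulation buffer and decode loop; not NIO API.
struct FrameSplitter {
    private var pending: [UInt8] = []

    // Called whenever more bytes arrive from the network.
    mutating func append(_ bytes: [UInt8]) {
        pending += bytes
    }

    // Returns one complete payload, or nil if more data is needed.
    mutating func nextFrame() -> [UInt8]? {
        guard pending.count >= 4 else { return nil }
        let length = pending[0..<4].reduce(0) { ($0 << 8) | Int($1) }
        guard pending.count >= 4 + length else { return nil }
        let payload = Array(pending[4..<(4 + length)])
        // Bytes after this frame stay in `pending` for the next call.
        pending.removeFirst(4 + length)
        return payload
    }
}

var splitter = FrameSplitter()
var frames: [[UInt8]] = []
// First chunk: one whole frame plus the beginning of a second one.
splitter.append([0, 0, 0, 3, 0x08, 0x96, 0x01, 0, 0, 0, 2, 0x08])
while let payload = splitter.nextFrame() { frames.append(payload) }
// Second chunk completes the second frame.
splitter.append([0x01])
while let payload = splitter.nextFrame() { frames.append(payload) }
```

Both the "too little" and the "too much" case fall out of the same loop: surplus bytes are simply left in the buffer until a later call can use them.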

(Helge Heß) #12

I'm fine with this, but would you be so kind as to explain why implicit framing doesn't work (w/ Protobuf)?

(Helge Heß) #13

I'm just looking at https://medium.com/@fattywaffles/protocol-buffers-with-swiftnio-69a2804b5ba9 and how to do this properly. He is writing messages without frames, and I'm not entirely sure why frames are necessary.

(Johannes Weiss) #14

@tbkka is the expert but from what I know about protobufs it’s just not a thing the protocol does. You always need to know the message size before attempting to parse. How you do that is up to you. If your wrapping protocol has explicit framing just use that. If it does not (like TCP) you need to create something yourself, like sending the size before the message like in my example.

(Helge Heß) #15

Yeah, but this makes no sense to me. A parser parses and there can be left overs. Just like HTTP upgrade.

(Johannes Weiss) #16

That will need framing. That’s not a NIO problem, it’s just how protobufs work: https://developers.google.com/protocol-buffers/docs/techniques

"If you want to write multiple messages to a single file or stream, it is up to you to keep track of where one message ends and the next begins. The Protocol Buffer wire format is not self-delimiting, so protocol buffer parsers cannot determine where a message ends on their own."
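
A small worked example of what "not self-delimiting" means in practice: the encoding has no end marker, and when a non-repeated scalar field shows up twice, parsers keep the last value. So two messages written back to back read as one valid message. The decoder below is a toy written for this illustration (varint fields with single-byte keys only); `decodeScalars` is a made-up name, not a SwiftProtobuf API.

```swift
// Toy decoder for illustration only: varint fields (wire type 0) with
// single-byte keys, keeping the last value seen for each field number.
func decodeScalars(_ bytes: [UInt8]) -> [Int: UInt64] {
    var fields: [Int: UInt64] = [:]
    var iterator = bytes.makeIterator()
    while let key = iterator.next() {
        var value: UInt64 = 0
        var shift: UInt64 = 0
        while let byte = iterator.next() {
            value |= UInt64(byte & 0x7F) << shift
            shift += 7
            if byte & 0x80 == 0 { break }
        }
        fields[Int(key >> 3)] = value
    }
    return fields
}

let first: [UInt8] = [0x08, 0x96, 0x01]   // message A: field 1 = 150
let second: [UInt8] = [0x08, 0x01]        // message B: field 1 = 1
// Nothing in the bytes marks where A stops, so A + B decodes as a single
// message in which the later value of field 1 replaces the earlier one.
let merged = decodeScalars(first + second)
```

Here `merged` ends up holding only field 1 = 1, and there is no way to tell from the bytes alone that two messages were sent.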

(Helge Heß) #17

I never said it is a NIO issue, which is why I posted that here :-) I just would like to understand why/how the format is not self delimiting. No big deal, I'm also happy to just accept the fact and then use Cap'n Proto ;-)

(Johannes Weiss) #18

Oh sorry! Thought that’s the NIO forum 🙈. Well, it’s not a SwiftProtobuf issue either. You somehow need to provide framing by whatever means, and when you have a full frame, hand it to SwiftProtobuf or any other protobuf implementation.

(Tim Kientzle) #19

Details of protobuf encoding are here:
https://developers.google.com/protocol-buffers/docs/encoding

There's also a Google forum for general protobuf questions:
https://groups.google.com/forum/#!forum/protobuf

(Helge Heß) #20

Yes I can use Google. So what is the part which requires explicit framing?