What NIO+HTTP/2 errors can be safely ignored?

taylorswift · November 1, 2023, 3:00am

ever since adopting the new async APIs, i started logging all networking errors; previously i was blissfully unaware of them as they all disappeared through the default implementation errorCaught.

i find that i am now logging a staggering amount of networking errors, sometimes one every ten seconds or so.

in order from most to least common:

uncleanShutdown
NIOAsyncWriterError.alreadyFinished
outputClosed

which of these errors can be safely ignored?

lukasa · November 1, 2023, 9:32am

uncleanShutdown is the easiest: you can ignore and suppress that one. That error is fired by the TLS stack when a connection is shutdown without first receiving a TLS CLOSE_NOTIFY message. In general this represents a possible truncation attack on a connection.

In the case where a protocol has internal truncation detection, however, this message is redundant. HTTP protocols are (generally) one such case, as except for a very frustrating edge case in HTTP/1.1 all HTTP messages are well-framed.

This has led to the unfortunate property that the vast majority of HTTP implementations do not send a CLOSE_NOTIFY message. That's bad practice, but they get away with it.

Because NIO's TLS implementation is not HTTP specific, we do not suppress this information: we can't tell at the usage site whether this is likely to matter for you. You can feel free to ignore and drop it.

alreadyFinished is probably an indication of a weird timing issue in your code: either we or you have already closed the writer, but you've attempted to write something new. This is probably generally safe to treat as a terminal error. If this one is happening a lot I'd spend some time trying to work out what actually happened here, but otherwise I'd treat it as acceptable noise.

For outputClosed, that one is a bit trickier. It is thrown, unfortunately, in two places: one of them is indicative of a severe bug, one is not.

The one that is not indicative of a severe bug is the one triggered by NIOTypedApplicationProtocolNegotiationHandler. This happens if we haven't completed protocol negotiation before the connection terminates. That should be expected to happen at a baseline rate.

The one that is indicative of a bug is the one that happens if you write on a channel that has already had its output closed. That will mean the writer was finished, and then we saw more writes. I doubt you're seeing that one, as it would be a pretty severe bug in NIOAsyncChannel that really ought to have reared its head before now. But do keep an eye on it.

In closing I'll say that the unfortunate reality of operating a service on a public listening IP is that your network error rate will be disappointingly high. Lots of connections fall over or terminate uncleanly for reasons that are very banal: scanners, scrapers, phones going into tunnels. Once you filter out uncleanShutdown, this rate should level out somewhere fairly reasonable.

taylorswift · November 1, 2023, 10:19pm

i’ve begun seeing this locally a lot as well, usually when a client rejects a mock SSL certificate. i don’t suppose there is a way to discriminate between the two flavors of this error?

i believe this is happening because my timeout implementation is not aware that a query is in progress, it only counts incoming request fragments as “activity” since the atomic Int can’t model a full state machine.

perhaps a state machine is unavoidable, but it really feels like a leap backward to have to go back to channel handlers just to enforce timeout.

lukasa · November 6, 2023, 11:30am

Yes, please file a bug against swift-nio to change the error thrown by NIOTypedApplicationProtocolNegotiationHandler.

You can have a state machine managing this without going all the way to a channel handler I think.

taylorswift · November 8, 2023, 4:26am

could this happen naturally due to a concurrent task enforcing channel.close(promise: nil)?

FranzBusch · November 8, 2023, 10:41am

Yes this can be the reason for it. After you call channel.close the channel will tear everything down and remove all pipeline handlers. The NIOAsyncChannel is adding two handlers (one for the inbound, one for the outbound). When the outbound handler in the pipeline gets removed it finishes the NIOAsyncWriter.Sink which will then transition to a terminal state. If you try to write to the writer afterwards you will get a .alreadyFinished error.

taylorswift · November 9, 2023, 8:22pm

is it feasible to prevent this error from happening in the first place?

FranzBusch · November 10, 2023, 9:24am

Not for all cases. If the underlying connection got closed because your remote went away you will still get this error. In those cases, you should just ignore it. The other case is when you finished the writer yourself and then tried to write afterwards. In those cases, you can avoid the error by just making sure your state machine understands when you finished and doesn't try write afterwards.

taylorswift · December 8, 2023, 1:25am

the errors keep piling up! now i am seeing a ton of

StreamClosed(streamID: HTTP2StreamID(1), errorCode: HTTP2ErrorCode<0x8 Cancel>, location: "NIOHTTP2/HTTP2StreamChannel.swift:865")

read(descriptor:pointer:size:): Connection reset by peer (errno: 104)

invalid HTTP version

invalid constant string

how should i be handling these? (also, why don’t they print their types by default?)

FranzBusch · December 8, 2023, 1:14pm

taylorswift:

StreamClosed(streamID: HTTP2StreamID(1), errorCode: HTTP2ErrorCode<0x8 Cancel>, location: "NIOHTTP2/HTTP2StreamChannel.swift:865")
read(descriptor:pointer:size:): Connection reset by peer (errno: 104)
invalid HTTP version
invalid constant string

The first error means that an outbound RST_STREAM frame was generated and potentially some frames have been dropped due to that since they haven't been flushed out. This might be something you want to look into but it could also be a completely normal thing.

The second error means that the remote just closed their connection which can always happen and can safely be ignored.

The third one means that the H1 case got an invalid HTTP version. This is weird because I assume you are TLS with protocol negotiation but this error can be generated by our C HTTP1 parsing library by NIOHTTP1 if the HTTP version is not 1.0 or higher. I assume somebody tries to connect with HTTP/0.9

Also an error thrown by the C HTTP1 parser and indicates the HTTP version is not correct similar to the above I assume somebody tries HTTP/0.9.

Some of those implement CustomStringConvertible that's why they all have differing levels of information in the description.

taylorswift · December 8, 2023, 8:34pm

what are some reasons we could be generating an outbound RST_STREAM frame in the first place?

we are using TLS with protocol negotiation, but i’ve found many clients completely disregard it. for example, we have only ever offered h2 on swiftinit for the past month or so, but Googlebot continues to crawl the site over HTTP/1.1 anyways.

have you thought about standardizing on printing the type name followed by the error message? as it stands,

catch let error { print(error) }

is not as helpful as it could be.

lukasa · December 19, 2023, 2:13pm

Are you checking the negotiation result in the .handshakeCompleted user event? As the server, you can offer what you want, but if the client doesn't care about the ALPN extension then the negotiation will be ignored and you'll get .default.

taylorswift · December 19, 2023, 9:07pm

no, the server still “supports” HTTP/1.1 in the sense that it supports connecting over HTTP/1.1, it will just refuse to process the request at the application level unless the connection originates from a trusted IP range, or there is something accompanying the request (like a session cookie) that indicates it can be trusted.