How to gracefully close an HTTP/2 connection?

taylorswift · February 23, 2024, 12:16am

i have some code that handles an incoming HTTP/2 connection with the new SwiftNIO APIs:

func handle<Authority>(
    connection:NIOAsyncChannel<
        HTTPPart<HTTPRequestHead, ByteBuffer>,
        HTTPPart<HTTPResponseHead, ByteBuffer>>) async 
{

the code spawns a child task that checks every second if the connection is active, and closes it if no ongoing operations are associated with it.

    await withTaskGroup(of: HTTP.ServerMessage?.self)
    {
        (tasks:inout TaskGroup<HTTP.ServerMessage?>) in

        ...

        let cop:TimeCop = .init()

        tasks.addTask
        {
            try? await cop.start(beat: .milliseconds(1000))
            connection.channel.close(promise: nil)
            return nil
        }

the problem is that, quite frequently, Firefox (and probably other browsers as well) will attempt to reuse HTTP/2 connections, and sometimes this races the timeout enforcement. this happens when Firefox sends a request over the old connection before it learns that the server has closed it.

so probably what the server needs to do is send some sort of notification to the client that the connection is going away, and that it should retry any subsequent requests over a new connection. how do i accomplish this?

georgebarnett · February 23, 2024, 10:06am

Your server should send a GOAWAY frame to the client when the server closes the connection. The GOAWAY frame indicates to the client that it must not open any new streams on that connection.

The GOAWAY frame also includes a “last stream ID”. This is the ID of the last stream initiated by the peer, i.e. the client. However, this leads us back to your original problem: sending the GOAWAY can race with the client opening new streams.

To work around this, the server can send a first GOAWAY frame with the last stream ID set to the maximum value (2^31 - 1) to tell the client to stop creating new streams. It can then wait at least one RTT and send a second GOAWAY frame with the actual last stream ID. One technique is to send a PING immediately after the first GOAWAY, then send the second GOAWAY when the PING ACK arrives.
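The two-step sequence described above can be modelled as a small state machine. The following is only a sketch in plain Swift with no SwiftNIO dependency: `GracefulShutdown`, its `Action` cases, and the ping token are hypothetical names made up for illustration, and stream IDs are represented as bare `Int32` values rather than `HTTP2StreamID`. A real handler would translate each returned action into the corresponding frame write.

```swift
/// Hypothetical model of the two-step GOAWAY handshake. Not a SwiftNIO type.
struct GracefulShutdown {
    /// What the caller should do next, in order.
    enum Action: Equatable {
        case sendGoAway(lastStreamID: Int32)
        case sendPing(opaque: UInt64)
        case closeConnection
    }

    enum State: Equatable {
        case active, awaitingPingAck, draining, closed
    }

    private(set) var state: State = .active
    /// Highest peer-initiated stream ID seen so far.
    private(set) var lastPeerStreamID: Int32 = 0
    private var openStreams: Set<Int32> = []
    /// Arbitrary value used to recognise our own PING ACK (an assumption).
    let pingToken: UInt64 = 0x676F_6177_6179

    mutating func streamOpened(_ id: Int32) {
        openStreams.insert(id)
        lastPeerStreamID = max(lastPeerStreamID, id)
    }

    @discardableResult
    mutating func streamClosed(_ id: Int32) -> [Action] {
        openStreams.remove(id)
        return closeIfDrained()
    }

    /// Step 1: announce the shutdown with the maximum stream ID, then PING.
    mutating func beginShutdown() -> [Action] {
        guard state == .active else { return [] }
        state = .awaitingPingAck
        return [.sendGoAway(lastStreamID: .max), .sendPing(opaque: pingToken)]
    }

    /// Step 2: the PING ACK means the peer has seen the first GOAWAY, so the
    /// real last stream ID is now known and can be sent.
    mutating func pingAcked(opaque: UInt64) -> [Action] {
        guard state == .awaitingPingAck, opaque == pingToken else { return [] }
        state = .draining
        return [.sendGoAway(lastStreamID: lastPeerStreamID)] + closeIfDrained()
    }

    /// Close only after the second GOAWAY has gone out and no streams remain.
    private mutating func closeIfDrained() -> [Action] {
        guard state == .draining, openStreams.isEmpty else { return [] }
        state = .closed
        return [.closeConnection]
    }
}
```

Streams that the client opens between the first GOAWAY and the PING ACK are still recorded by `streamOpened`, so they are covered by the second GOAWAY and served before the connection closes. That is the race the two-step approach is meant to absorb.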

You can receive notifications in the channel pipeline about streams opening and closing by listening for the NIOHTTP2StreamCreatedEvent and StreamClosedEvent user inbound events.

taylorswift · February 23, 2024, 3:58pm

how do i do this with NIOAsyncChannel? i expended great effort in October to rewrite the old channel handler implementation using the new NIO async APIs, and i’d rather not have to turn it back into a channel handler pipeline.

georgebarnett · February 23, 2024, 4:27pm

IIRC the async channel doesn’t offer a stream of inbound events, which seems like a bit of an omission.

The amount of code required to do the graceful shutdown in a channel handler is relatively small, and since it is a protocol-level concern I’d do it in a channel handler. However, if you want to do it in async space, you could write a simple adapter handler which forwards an enum of the events or HTTP/2 frames.

FranzBusch · February 23, 2024, 10:31pm

This was intentional since events are a concept of the channel pipeline and bridging them as a separate async sequence introduces reordering problems.

You can already observe streams opening and closing in async land by iterating the async sequence that the async H2 multiplexer vends.

In general, to gracefully shut down an H2 server you want to close the listening socket, then follow the flow that George described above and inform all open streams about the graceful shutdown.

The current re-work of Hummingbird does this already and could be a good inspiration to look at. Hummingbird uses swift-service-lifecycle which makes graceful shutdown a structured concurrency concept similar to task cancellation.

taylorswift · February 24, 2024, 12:54am

i guess what i am confused by is that i am not trying to shut down the server, but rather i am trying to enforce an idle timeout asynchronously for a single connection.

my listener loop (per connection) looks a bit like:

for try await stream:NIOAsyncChannel<
    HTTP2Frame.FramePayload,
    HTTP2Frame.FramePayload> in streams.inbound
{
    ...
}

and the timeout enforcer can only interrupt this loop by destroying the entire connection. i’m not sure how to fit the more complex procedure @georgebarnett described into this “structured” AsyncSequence iteration.

taylorswift · February 24, 2024, 11:39pm

so, in case this helps anyone, i came up with this solution that enforces timeouts at all applicable levels of the HTTP/2 connection that i could think of. i have tried to add comments to help others understand:

func handle(
    connection:any Channel,
    streams:NIOHTTP2Handler.AsyncStreamMultiplexer<NIOAsyncChannel<
        HTTP2Frame.FramePayload,
        HTTP2Frame.FramePayload>>) async
{
    await withThrowingTaskGroup(
        of: NIOAsyncChannel<HTTP2Frame.FramePayload, HTTP2Frame.FramePayload>?.self)
    {
        (
            tasks:inout ThrowingTaskGroup<NIOAsyncChannel<
                HTTP2Frame.FramePayload,
                HTTP2Frame.FramePayload>?, any Error>
        ) in

        tasks.addTask
        {
            /// This task awaits the first inbound stream from the peer. To enforce rotation
            /// of peers, each peer is only allowed to open one stream per connection.
            /// Otherwise, an attacker could occupy a connection indefinitely, effectively
            /// starving all other peers if the server is near its connection limit.
            for try await stream:NIOAsyncChannel<
                HTTP2Frame.FramePayload,
                HTTP2Frame.FramePayload> in streams.inbound
            {
                return stream
            }

            return nil
        }
        tasks.addTask
        {
            /// This task emits a timeout event while waiting on the first inbound stream.
            try await Task.sleep(for: .seconds(5))
            return nil
        }

        var events:ThrowingTaskGroup<NIOAsyncChannel<
            HTTP2Frame.FramePayload,
            HTTP2Frame.FramePayload>?, any Error>.Iterator = tasks.makeAsyncIterator()

        do
        {
            guard case let stream?? = try await events.next()
            else
            {
                Log[.error] = """
                (HTTP/2: \(address)) Connection timed out before peer initiated any streams.
                """
                return
            }

            /// Before we even process the first stream, we tell the peer that this is the
            /// only stream we will serve on this connection, and to retry any contemporary
            /// requests on a new connection.
            ///
            /// If we do not send this frame, users (or robots) who are browsing a large
            /// number of pages in a short time will not know to retry their requests, and
            /// they will perceive an unrecoverable server failure. Users who are browsing
            /// the production server at a sane pace will never be “bursty”, since secondary
            /// requests will all go to Amazon Cloudfront.
            ///
            /// We have no idea what the stream identifier of the first stream is, so we
            /// can only guess a value of `1`.
            connection.write(HTTP2Frame.init(streamID: 0, payload: .goAway(
                    lastStreamID: 1,
                    errorCode: .noError,
                    opaqueData: nil)),
                promise: nil)

            try await stream.executeThenClose
            {
                (
                    remote:NIOAsyncChannelInboundStream<HTTP2Frame.FramePayload>,
                    writer:NIOAsyncChannelOutboundWriter<HTTP2Frame.FramePayload>
                )   in

                let message:HTTP.ServerMessage
                do
                {
                    message = .init(
                        response: try await self.respond(to: remote),
                        using: stream.channel.allocator)
                }
                catch let error
                {
                    Log[.error] = "(application: \(address)) \(error)"

                    message = .init(
                        redacting: error,
                        using: stream.channel.allocator)
                }

                try await writer.send(message)
            }
        }
        catch let error
        {
            Log[.error] = "(HTTP/2: \(address)) \(error)"
            return
        }
    }
}
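The connection-level pattern above (race the awaited operation against a sleeping task, then act on whichever finishes first) can be factored into a reusable helper. This is a minimal sketch in plain Swift rather than anything SwiftNIO provides: `withTimeout` and `TimedOut` are hypothetical names. Unlike the function above, it cancels the losing child task explicitly so the group does not keep waiting on it.

```swift
/// Hypothetical error thrown when the time limit wins the race.
struct TimedOut: Error {}

/// Runs `operation`, but throws `TimedOut` if it has not finished within `limit`.
func withTimeout<T: Sendable>(
    _ limit: Duration,
    _ operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Child 1: the actual work.
        group.addTask { try await operation() }
        // Child 2: the timer, which can only ever "finish" by throwing.
        group.addTask {
            try await Task.sleep(for: limit)
            throw TimedOut()
        }
        // Whichever child finishes first decides the outcome; cancel the other
        // so the group can exit promptly.
        defer { group.cancelAll() }
        guard let first = try await group.next() else { throw TimedOut() }
        return first
    }
}
```

One caveat: a task group always waits for its children before returning, so the helper only returns promptly if `operation` responds to cancellation. An operation that ignores cancellation will still hold the caller until it completes.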

as HTTP/2 is a “two-level” network protocol, it is also necessary to enforce timeouts at the stream level:

private
func respond(to h2:NIOAsyncChannelInboundStream<HTTP2Frame.FramePayload>) async throws -> HTTP.ServerResponse
{
    /// Do the ``AsyncStream`` ritual dance.
    var c:AsyncThrowingStream<HTTP2Frame.FramePayload?, any Error>.Continuation? = nil
    let s:AsyncThrowingStream<HTTP2Frame.FramePayload?, any Error> = .init { c = $0 }
    guard
    let c:AsyncThrowingStream<HTTP2Frame.FramePayload?, any Error>.Continuation
    else
    {
        fatalError("unreachable")
    }
    /// Launch the task that simply forwards the output of the
    /// ``NIOAsyncChannelInboundStream`` to the combined stream. This seems comically
    /// inefficient, but it is needed in order to add timeout events to an HTTP/2 stream.
    async
    let _:Void =
    {
        for try await frame:HTTP2Frame.FramePayload in $1
        {
            $0.yield(frame)
        }
    } (c, h2)
    /// Launch the task that emits a timeout event after 5 seconds. This doesn’t terminate
    /// the stream because we want to be able to ignore the timeout events after the peer
    /// completes authentication, if applicable. This allows us to accept long-running
    /// uploads from trusted peers.
    async
    let _:Void =
    {
        try await Task.sleep(for: .seconds(5))
        $0.yield(nil)

    } (c)

    var inbound:AsyncThrowingStream<HTTP2Frame.FramePayload?, any Error>.AsyncIterator
    var headers:HPACKHeaders? = nil

    /// Wait for the `HEADERS` frame, which initiates the application-level request. This
    /// frame contains cookies, so after we get it, we will know if we can ignore timeouts.
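    /// The iterator is created once, outside the loop, so that frames are not lost
    /// between iterations.
    inbound = s.makeAsyncIterator()

    waiting:
    while true
    {
        switch try await inbound.next()
        {
        case .headers(let payload)??:
            headers = payload.headers
            break waiting

        case _??:
            /// Ignore any other frame type and keep waiting for the headers.
            continue waiting

        case nil, nil?:
            /// Either the stream ended, or the timeout event (`nil`) arrived first.
            Log[.error] = """
            (HTTP/2: \(address)) Stream timed out before peer sent any headers.
            """

            return .resource("Time limit exceeded", status: 408)
        }
    }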

    ...
}
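The stream-level merging of frames with timeout events can be written without the optional-continuation dance by using `AsyncThrowingStream.makeStream()`, available since Swift 5.9. The sketch below is generic over any `AsyncSequence`, so it does not depend on SwiftNIO. `Event` and `withTimeoutEvents` are hypothetical names, and it uses unstructured `Task`s instead of `async let`, which is a different design choice from the code above and not a drop-in replacement.

```swift
/// Hypothetical wrapper: either an element from the source, or a timeout marker.
enum Event<Element: Sendable>: Sendable {
    case element(Element)
    case timeout
}

/// Forwards `source` into a new stream and injects a single `.timeout` event
/// after `limit`. The timeout does not terminate the stream, so the consumer
/// can choose to ignore it (for example, for an authenticated peer).
func withTimeoutEvents<S: AsyncSequence & Sendable>(
    _ source: S,
    after limit: Duration
) -> AsyncThrowingStream<Event<S.Element>, any Error> where S.Element: Sendable {
    let (stream, continuation) =
        AsyncThrowingStream<Event<S.Element>, any Error>.makeStream()

    // Forward every element, then propagate completion or failure.
    let forwarder = Task {
        do {
            for try await element in source {
                continuation.yield(.element(element))
            }
            continuation.finish()
        } catch {
            continuation.finish(throwing: error)
        }
    }
    // Emit the timeout marker once; cancellation makes the sleep throw instead.
    let timer = Task {
        try await Task.sleep(for: limit)
        continuation.yield(.timeout)
    }
    // Stop both tasks once the consumer is done with the merged stream.
    continuation.onTermination = { _ in
        forwarder.cancel()
        timer.cancel()
    }
    return stream
}
```

A consumer would switch over `.element` and `.timeout` much as the function above switches over a frame and `nil`.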

as i was writing this, i felt that the SwiftNIO async APIs were not really designed with security in mind. its streams don’t have any concept of a time limit, and i had to wrap them in another layer of suspensions that take time limits into account. it is also far too easy to mindlessly await on some event that an attacker could block indefinitely, displacing other activity on the server. in my experience, security is hard to bolt on to something that was not originally designed to be secure, and i hope future versions of SwiftNIO will be more conscious of this issue.

as a side note, there doesn’t seem to be a way to obtain the HTTP2StreamID associated with an incoming stream. this is needed in order to correctly send the GOAWAY frame, and the implementation above is just guessing a value of 1.

update: so definitely don’t paste that code into your server, because it will work with curl but not with Firefox, i’m assuming because curl sends a StreamID of 1 and Firefox does not.

maybe it is just too late at night and i am lacking sleep, but i couldn’t find out how to inspect incoming HTTP2Frames with SwiftNIO, as it only exposes the HTTP2Frame.FramePayload.

update 2: after messing around with it a lot, i found that Firefox for some reason initializes the stream ID to 15 and not 1. moreover, it special-cases Int32.max, so if you just send that and not the real stream ID, it will think the server is broken and not retry any requests. it’s probably a bad idea to assume all browsers are like Firefox and start the count at 15, so i’m not sure how to solve this without some way of obtaining the stream ID. how do you get the stream ID?

lukasa · February 28, 2024, 1:58pm

All HTTP/2 stream channels have a way to get their stream ID using the HTTP2StreamChannelOptions.streamID channel option. This is used in the example server, which doesn't use the new APIs, but the new APIs still make this available so long as you use the underlying Channel that the AsyncChannel holds.

let streamID = try await channel.getOption(HTTP2StreamChannelOptions.streamID).get()

Another option is to listen for inbound user events, which the HTTP/2 handler uses to notify the channel about stream state. However, as noted above, this is not made easily accessible in the AsyncChannel spelling.