Option to make Connect Proxy work with HTTP traffic?

Hello,

I am using SwiftNIO with the Connect Proxy based on the official example (swift-nio-examples/connect-proxy at main · apple/swift-nio-examples · GitHub). I am aware that it is built to handle HTTPS traffic and uses the CONNECT method.

However, I wanted to ask whether it is doable to make some changes so that it also works with HTTP. Preferably without upgrading the HTTP connection to HTTPS, as there are still sites that work only over HTTP.

The HTTP traffic appears to be coming into the ConnectHandler instance into the handleInitialMessage method. Naively I have tried modifying the guard to allow also .GET methods which of course did not work :smiley:

I have tried to look into the source code but I don't have a lot of experience with these frameworks and wasn't exactly sure where to look.

Thanks very much!


While in principle you could modify the CONNECT proxy, you probably want to produce an entirely separate proxy for cleartext HTTP traffic.

When proxying cleartext HTTP, you don't send a special request asking for a tunnelled connection. Instead you reformat the request so that it can be sent to a proxy. To do this you need to compute the request target from the HTTPRequestHead. The pieces are usually communicated in three parts:

  • The scheme is communicated by the fact that you're sending plaintext data, so it'll be http://
  • The host portion of the URI is communicated in the Host header.
  • The path and query portions are in the uri field of the HTTPRequestHead.

You then replace the uri field of the HTTPRequestHead with the computed request target URL. This tells the proxy where it should be forwarding the request to.
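As a rough illustration of that client-side computation, here is a minimal sketch. The function name and parameter names are made up for this example and are not part of SwiftNIO; it only does the string assembly and assumes the Host header value and the original uri are already at hand.

```swift
/// Builds the absolute-form request target a client would send to a
/// cleartext HTTP proxy. Hypothetical helper, not a SwiftNIO API.
/// - host: the value of the Host header (may include a port)
/// - originForm: the path and query, i.e. the original `uri` field
func absoluteRequestTarget(host: String, originForm: String) -> String {
    // An empty path is treated as "/" so the result is always a full URL.
    let path = originForm.isEmpty ? "/" : originForm
    // The scheme is implied by the fact that the traffic is plaintext.
    return "http://" + host + path
}

let target = absoluteRequestTarget(host: "www.apple.com", originForm: "/mac?ref=home")
// target is "http://www.apple.com/mac?ref=home"
```

The result is what you would assign to the uri field of the HTTPRequestHead before writing it to the proxy.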

This is basically completely unlike what you do for the CONNECT proxy. You absolutely could extend it, but in principle it may be easier just to construct a different handler.

Thanks! I am happy to hear that it is possible.

I currently have this basic bootstrap setup

bootstrap = ServerBootstrap(group: group)
            .serverChannelOption(ChannelOptions.socket(SOL_SOCKET, SO_REUSEADDR), value: 1)
            .childChannelOption(ChannelOptions.socket(SOL_SOCKET, SO_REUSEADDR), value: 1)
            .childChannelInitializer { channel in
                channel.pipeline.addHandler(ByteToMessageHandler(HTTPRequestDecoder(leftOverBytesStrategy: .forwardBytes))).flatMap {
                    channel.pipeline.addHandler(HTTPResponseEncoder()).flatMap {
                        channel.pipeline.addHandler(ConnectHandler())
                    }
                }
            }

So the next step would be to create something like HTTPConnectHandler and add it to this pipeline? I guess I also need some other handler to forward HTTP traffic to this new one and HTTPS to the existing ConnectHandler?

Are there parts of the ConnectHandler that I could use as inspiration for the HTTPHandler? Honestly the steps outlined in the previous post seem pretty daunting.

Thanks.

Yes.

You shouldn't need an extra handler. Your new handler should probably be a straightforward transformation. The HTTP/HTTPS choice probably shouldn't be done in a channel handler, as you need to know very early: TLS is connection-level, and you should know ahead of time whether you're using it.

Unlike ConnectHandler, this one should be much simpler. It's ultimately just a transformation on HTTPRequestHead. As a result, a scaffolding might be:

final class HTTPProxyHandler: ChannelOutboundHandler {
    typealias OutboundIn = HTTPClientRequestPart
    typealias OutboundOut = HTTPClientRequestPart

    func write(context: ChannelHandlerContext, data: NIOAny, promise: EventLoopPromise<Void>?) {
        guard case .head(var head) = self.unwrapOutboundIn(data) else {
            // Body and end parts pass through untouched.
            context.write(data, promise: promise)
            return
        }

        // Mutate the request head here, following the instructions above.

        context.write(self.wrapOutboundOut(.head(head)), promise: promise)
    }
}

Thanks. I am not sure if it makes sense what I am trying to achieve. I started building the HTTPProxyHandler and added some logging to the write method but it appears it doesn't receive any connections.

I have modified my bootstrap above and replaced ConnectHandler with the HTTPProxyHandler for testing.

What confuses me a bit is that the ConnectHandler implements the ChannelInboundHandler and HTTPProxyHandler implements ChannelOutboundHandler. Shouldn't I also have inbound part for the HTTP?

The HTTPProxyHandler doesn't really care about what the server sends back: it doesn't need to transform it in any way.

Does the scaffolding you showed me also need other methods implemented? From your hints it looks like this HTTP proxy should be much simpler than the ConnectHandler, but I think I am on the wrong track, because I cannot get my handler to be triggered. (I currently just want to log the head I get, before the transformation.)

This is my temporary bootstrap:

bootstrap = ServerBootstrap(group: group)
            .serverChannelOption(ChannelOptions.socket(SOL_SOCKET, SO_REUSEADDR), value: 1)
            .childChannelOption(ChannelOptions.socket(SOL_SOCKET, SO_REUSEADDR), value: 1)
            .childChannelInitializer { channel in
                channel.pipeline.addHandler(ByteToMessageHandler(HTTPRequestDecoder(leftOverBytesStrategy: .forwardBytes))).flatMap {
                    channel.pipeline.addHandler(HTTPResponseEncoder()).flatMap {
                        channel.pipeline.addHandler(HTTPConnectHandler())
                    }
                }
            }

Are you adding the new handler at all? It seems that you’re only adding the old connect handler.

Oh, also, this is bootstrapping a server, not a client. Do you want a server or a client?

Apologies! Yes, I want a server. Did not cross my mind to specify this as in my mind it was all server :smiley:

Note that I am using the HTTPConnectHandler name for the HTTPProxyHandler from your scaffolding example.

The HTTPS handler is named ConnectHandler.

So in that case you want to invert the transformations I made above, and also invert the handler:

final class HTTPProxyHandler: ChannelInboundHandler {
    typealias InboundIn = HTTPServerRequestPart
    typealias InboundOut = HTTPServerRequestPart

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        guard case .head(var head) = self.unwrapInboundIn(data) else {
            context.fireChannelRead(data)
            return
        }

        // Mutate the request head here, inverting the instructions above.

        context.fireChannelRead(self.wrapInboundOut(.head(head)))
    }
}

Nice! This feels like a step towards the solution as the handler is getting called and I am able to log the uri.

This is for example: http://apple.com. I am sorry to say I am quite clueless about what you mean by "compute target request URL".

From what I can tell the original ConnectHandler from Connect Proxy example parses the host (which would be apple.com and port) and then uses this method:

private func connectTo(host: String, port: Int, context: ChannelHandlerContext) {
        self.logger.info("Connecting to \(host):\(port)")

        let channelFuture = ClientBootstrap(group: context.eventLoop)
            .connect(host: String(host), port: port)

        channelFuture.whenSuccess { channel in
            self.connectSucceeded(channel: channel, context: context)
        }
        channelFuture.whenFailure { error in
            self.connectFailed(error: error, context: context)
        }
    }

Which has some upgrading stuff - this I guess should not be needed in plain HTTP.

So the only thing I was able to come up with is to transform the uri to something like apple.com:80 mimicking the format from Connect proxy but this does not work.

So the client will have performed the steps I suggested above.

This means you need to undo those steps. That means:

  1. Take the uri from the request head, in this case http://apple.com.
  2. Parse it as a URL. If this fails, report an error.
  3. Confirm the scheme is http. If it is not, report an error. (Strictly you don't have to do this, but it is the right thing to do).
  4. Replace the uri from above with the path from the URL we parsed above. If the URL has a query string, append that as well, with a ? to separate them.
  5. Confirm that the Host header in the request is set to the host from the URL we parsed above. If it is not, report an error.
  6. Buffer your modified head.
  7. Connect to the host and port specified in the URL we parsed above.
  8. When the connect completes, forward your buffered data on.

Once the connection is up, you still need to rewrite any further request head you see using steps 1 through 5, but instead of buffering it as you do in 6, you can just forward it on.
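Steps 1 through 5 can be sketched without any SwiftNIO types, using only Foundation. Everything named here (ProxyRewriteError, ProxyTarget, proxyTarget(forURI:hostHeader:)) is hypothetical and exists only for this example. Two assumptions go slightly beyond the steps as listed: the port defaults to 80 when the URL has none, and a Host header of the form host:port is also accepted.

```swift
import Foundation

/// Hypothetical error type for this sketch.
enum ProxyRewriteError: Error, Equatable {
    case invalidURL
    case wrongScheme
    case hostMismatch
}

/// Where to connect, and the request target to send once connected.
struct ProxyTarget: Equatable {
    var host: String
    var port: Int
    var originForm: String
}

func proxyTarget(forURI uri: String, hostHeader: String?) throws -> ProxyTarget {
    // Steps 1 and 2: parse the uri from the request head as a URL.
    guard let components = URLComponents(string: uri),
          let host = components.host, !host.isEmpty else {
        throw ProxyRewriteError.invalidURL
    }
    // Step 3: only cleartext HTTP is handled here.
    guard components.scheme?.lowercased() == "http" else {
        throw ProxyRewriteError.wrongScheme
    }
    // Step 4: path plus optional query; an empty path becomes "/".
    var originForm = components.percentEncodedPath.isEmpty ? "/" : components.percentEncodedPath
    if let query = components.percentEncodedQuery {
        originForm += "?" + query
    }
    // Step 5: the Host header must name the same host.
    // Assumption: port 80 by default, and "host:port" is accepted too.
    let port = components.port ?? 80
    let accepted = [host.lowercased(), "\(host.lowercased()):\(port)"]
    guard let hostHeader = hostHeader, accepted.contains(hostHeader.lowercased()) else {
        throw ProxyRewriteError.hostMismatch
    }
    return ProxyTarget(host: host, port: port, originForm: originForm)
}
```

In a handler you would assign originForm to head.uri, and use host and port for the outbound connect in step 7.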


Thanks! Will try this stuff tomorrow and will report back how it went.

Are there articles/docs you would recommend for me to read to better grasp this? I am happy to learn this stuff but I couldn't find anything applicable for my use case.

It doesn't fully address your use case, but as an introduction to SwiftNIO you might want to check out this article, which isn't entirely wrong: A µTutorial on SwiftNIO 2 – Helge Heß – Software engineer.

Can you please explain a bit the "Buffer your modified head" part? And how do I know later that I can just forward the heads with steps 1-5 without the rest?

So far I have this:

func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        guard case .head(var head) = self.unwrapInboundIn(data) else {
            context.fireChannelRead(data)
            return
        }
        
        // Mutate the request head here, inverting the instructions above.
        
        os_log(.default, log: .default, "Connecting to URI: %{public}s", head.uri as NSString)
        
        guard let parsedUrl = URL(string: head.uri) else {
            context.fireErrorCaught(ConnectError.invalidURL)
            return
        }
        
        os_log(.default, log: .default, "Parsed scheme: %{public}s", (parsedUrl.scheme ?? "no scheme") as NSString)
        
        guard parsedUrl.scheme == "http" else {
            context.fireErrorCaught(ConnectError.wrongScheme)
            return
        }
        
        var targetUrl = parsedUrl.path
        
        if let query = parsedUrl.query {
            targetUrl += "?\(query)"
        }
        
        os_log(.default, log: .default, "Connecting to targetUrl: %{public}s", targetUrl as NSString)
        
        os_log(.default, log: .default, "Headers: %{public}s", head.headers.description as NSString)
        
        guard let host = head.headers.first(name: "Host"), host == parsedUrl.host else {
            os_log(.default, log: .default, "Wrong host")
            context.fireErrorCaught(ConnectError.wrongHost)
            return
        }
        
        head.uri = targetUrl
        
        
        context.fireChannelRead(self.wrapInboundOut(.head(head)))
    }

Although the targetUrl seems wrong after a few tests, as the path appears to be empty. So at the end there should be the buffering and then this:

private func connectTo(host: String, port: Int, context: ChannelHandlerContext) {
        self.logger.info("Connecting to \(host):\(port)")

        let channelFuture = ClientBootstrap(group: context.eventLoop)
            .connect(host: String(host), port: port)

        channelFuture.whenSuccess { channel in
            self.connectSucceeded(channel: channel, context: context)
        }
        channelFuture.whenFailure { error in
            self.connectFailed(error: error, context: context)
        }
    }

As in the original CONNECT proxy but with modifications?

You cannot forward the modified head on until you've made a new outbound connection. You can see the state machine in the connect handler doing this with the pendingBytes object, though in your case you don't need to store NIOAny and can instead store HTTPServerRequestPart. Concretely, you need to hold onto those until the outbound connection has succeeded, when you can then unwrap them and forward them on.

You know you can forward later heads directly because you know you completed the original connection attempt. You need to keep track of this state, which the connect proxy does.

If the path is empty replace it with /.
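To make the buffering and state tracking a bit more concrete, here is a small sketch of the bookkeeping, separate from any channel handler. The type PendingConnectionBuffer and its method names are invented for this example. It is generic so that it compiles without SwiftNIO; in a real handler Part would be HTTPServerRequestPart.

```swift
/// Tracks whether the outbound connection exists yet and holds parts until it does.
/// Hypothetical type for illustration; `Part` stands in for HTTPServerRequestPart.
struct PendingConnectionBuffer<Part> {
    private enum State {
        case idle
        case connecting([Part])
        case connected
    }

    enum Action {
        case startConnecting   // first part seen: it was buffered, begin the connect
        case wait              // connect in flight: the part was buffered
        case forward(Part)     // connection is up: send the part straight on
    }

    private var state = State.idle

    init() {}

    /// Call for every (already rewritten) part that arrives.
    mutating func received(_ part: Part) -> Action {
        switch state {
        case .idle:
            state = .connecting([part])
            return .startConnecting
        case .connecting(var parts):
            parts.append(part)
            state = .connecting(parts)
            return .wait
        case .connected:
            return .forward(part)
        }
    }

    /// Call when the connect succeeds; returns the buffered parts in arrival order.
    mutating func connectSucceeded() -> [Part] {
        guard case .connecting(let parts) = state else { return [] }
        state = .connected
        return parts
    }
}

var buffer = PendingConnectionBuffer<String>()
_ = buffer.received("head")              // triggers the connect
_ = buffer.received("end")               // buffered while connecting
let flushed = buffer.connectSucceeded()  // parts to write to the outbound channel
```

The parts returned by connectSucceeded() are the ones you would write to the new outbound channel; after that, every later part comes back as .forward and can be sent on immediately.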

On one hand I feel like I kind of understand this and on the other it is like "I have no idea what I am doing".

Currently I have this:

private func connectTo(host: String, port: Int, context: ChannelHandlerContext) {
        let channelFuture = ClientBootstrap(group: context.eventLoop)
            .connect(host: host, port: port)

        channelFuture.whenSuccess { channel in
            self.connectSucceeded(channel: channel, context: context)
        }
        channelFuture.whenFailure { error in
            self.connectFailed(error: error, context: context)
        }
    }
    
    private func connectSucceeded(channel: Channel, context: ChannelHandlerContext) {
        os_log(.default, log: .default, "Connect succeeded")

        if case let .pendingConnection(head) = state {
            self.state = .connected
            
            context.fireChannelRead(self.wrapInboundOut(.head(head)))
        }
    }

And I am calling connectTo from the channelRead if current state does not have pending connection. This shows "Connect succeeded" in the log, but I cannot open these http pages in Safari.

If I get uri: http://apple.com my target url looks like this: apple.com/. Is this correct?

For a proxy? No. You'd connect to apple.com and the URL would be /.

E.g. incoming proxy request:

GET http://www.apple.com/ HTTP/1.1
Content-Length: 0

Outgoing request the proxy would send to www.apple.com over HTTP on port 80:

GET / HTTP/1.1
Content-Length: 0

I see. Thanks. :bulb: