Optical
Aberration

Building a HTTP Proxy with Swift

Since writing this post we have released Hummingbird v2, which makes building a proxy considerably simpler. You can find a new version of the sample code here. The swiftonserver.com site has a more up-to-date article about this.

In this article I am going to cover how to build a HTTP proxy server using Swift. We will use the Hummingbird HTTP server framework as the basis for the server and AsyncHTTPClient, the swift-server HTTP client, to forward requests to the target service.

What is a proxy server

A proxy server is a service that sits between a client and another service. It forwards messages from the client to the other service and returns the responses it receives to the client. It may process messages before forwarding them and, similarly, process responses before returning them.
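To make the idea concrete, here is a minimal, dependency-free sketch of that flow. The `Request`, `Response` and `Service` types and the `makeProxy` function are hypothetical names invented for this illustration; they are not part of Hummingbird or AsyncHTTPClient.

```swift
// Hypothetical stand-ins for real HTTP request and response types.
struct Request {
    var path: String
    var headers: [String: String]
}

struct Response {
    var status: Int
    var body: String
}

// Anything that turns a request into a response counts as a service here.
typealias Service = (Request) -> Response

// Wraps `target` in a proxy. By default the proxy passes requests and
// responses through untouched; the two optional closures are the hooks
// where a real proxy would add logging, authentication, caching and so on.
func makeProxy(
    target: @escaping Service,
    processRequest: @escaping (Request) -> Request = { $0 },
    processResponse: @escaping (Response) -> Response = { $0 }
) -> Service {
    return { request in
        let forwarded = processRequest(request)   // optional pre-processing
        let response = target(forwarded)          // forward to the target
        return processResponse(response)          // optional post-processing
    }
}
```

The proxy has the same shape as the service it wraps, which is why a client does not need to know whether it is talking to the proxy or to the target directly.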

Uses

Proxy servers can be used for many purposes: as a gateway between different domains, providing an authentication layer to a service, allowing anonymous access to services, logging API calls, load balancing by passing requests to one of many instances of a service, caching responses, decrypting TLS requests and encrypting responses.

Let’s build one

In this article we are going to build a proxy server that simply forwards HTTP requests to the target service and returns its responses to the client. You can find the sample code for the article here.

Creating the project

We will use the Hummingbird template project as the starting point for our server. We can either clone this repository, or press the "Use this template" button on the GitHub web page for the project and create our own repo. The template project creates a server and starts it, and provides command line options and a place to configure our application. You can find out more about it here.

Add AsyncHTTPClient

We are using AsyncHTTPClient in our project so we need to add it as a dependency in Package.swift.

First add it as a package dependency:

dependencies: [
    ...
    .package(url: "https://github.com/swift-server/async-http-client.git", from: "1.6.0"),
],

Then add it as a target dependency:

targets: [
    .executableTarget(name: "App",
        dependencies: [
            ...
            .product(name: "AsyncHTTPClient", package: "async-http-client"),
        ],

We are going to store the HTTPClient in an extension to HBApplication. This allows us to manage the HTTPClient's lifecycle and call syncShutdown on it before it is deallocated.

extension HBApplication {
    var httpClient: HTTPClient {
        get { self.extensions.get(\.httpClient) }
        set { self.extensions.set(\.httpClient, value: newValue) { httpClient in
            try httpClient.syncShutdown()
        }}
    }
}

The closure at the end of the set function is called on HBApplication shutdown. This also means we have access to the HTTPClient whenever we have a reference to the HBApplication, although we are not going to use this here.
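If the extension mechanism looks like magic, the following dependency-free sketch shows one way such storage could work: values are keyed by key path and an optional shutdown closure is remembered alongside each value. This is a simplified model for illustration only, not Hummingbird's implementation; `ExtensionStorage`, `FakeClient` and `App` are hypothetical names.

```swift
// Simplified model of key-path-keyed extension storage (illustrative only).
final class ExtensionStorage<Parent> {
    private var items: [PartialKeyPath<Parent>: Any] = [:]
    private var shutdownCallbacks: [() throws -> Void] = []

    // Look up the value stored for a key path, if any.
    func get<Value>(_ key: KeyPath<Parent, Value>) -> Value? {
        return items[key] as? Value
    }

    // Store a value and optionally register a closure to run at shutdown.
    func set<Value>(
        _ key: KeyPath<Parent, Value>,
        value: Value,
        shutdown: ((Value) throws -> Void)? = nil
    ) {
        items[key] = value
        if let shutdown = shutdown {
            shutdownCallbacks.append { try shutdown(value) }
        }
    }

    // Run every registered shutdown closure, then drop the stored values.
    func shutdown() throws {
        for callback in shutdownCallbacks { try callback() }
        shutdownCallbacks.removeAll()
        items.removeAll()
    }
}

// Hypothetical stand-in for a client that must be shut down explicitly.
final class FakeClient {
    private(set) var isShutDown = false
    func syncShutdown() { isShutDown = true }
}

// Hypothetical application type that owns the storage.
final class App {
    let extensions = ExtensionStorage<App>()

    var client: FakeClient {
        get { extensions.get(\.client)! }
        set { extensions.set(\.client, value: newValue) { $0.syncShutdown() } }
    }
}
```

The force unwrap in the getter means reading `client` before it has been set is a programmer error, which is a reasonable trade-off for something that is configured once at startup.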

Add middleware

We are going to implement our proxy server as middleware. The middleware will take a request, send it on to the target service and then respond with the response from the target service. Here is our initial version of the middleware. It will require the HTTPClient and the URL of the target service.

struct HBProxyServerMiddleware: HBMiddleware {
    let httpClient: HTTPClient
    let target: String

    func apply(to request: HBRequest, next: HBResponder) -> EventLoopFuture<HBResponse> {
        return httpClient.execute(
            request: request,
            eventLoop: .delegateAndChannel(on: request.eventLoop),
            logger: request.logger
        )
    }
}

Now that we have the HTTPClient and the HBProxyServerMiddleware middleware, we add them to the application in HBApplication.configure. Let's set the target of our proxy to be http://httpbin.org.

func configure(_ args: AppArguments) throws {
    self.httpClient = HTTPClient(eventLoopGroupProvider: .shared(self.eventLoopGroup))
    self.middleware.add(HBProxyServerMiddleware(httpClient: self.httpClient, target: "http://httpbin.org"))
}

Converting types

When we build the above, it fails to compile. This is because we need to convert between Hummingbird and AsyncHTTPClient Request and Response types. Also we need to incorporate the target service URL into the request.

Request conversion

To convert from a Hummingbird HBRequest to an AsyncHTTPClient HTTPClient.Request we first need to collate the HBRequest body, which may still be loading. This makes the conversion process asynchronous, so it needs to return an EventLoopFuture which will be fulfilled with the result of the conversion later on. Let's add a conversion function to HBRequest.

extension HBRequest {
    func ahcRequest(host: String) -> EventLoopFuture<HTTPClient.Request> {
        // consume request body and then construct AHC Request once we have the
        // result. The URL for the request is the target server plus the URI from
        // the `HBRequest`.
        return self.body.consumeBody(on: self.eventLoop).flatMapThrowing { buffer in
            return try HTTPClient.Request(
                url: host + self.uri.description,
                method: self.method,
                headers: self.headers,
                body: buffer.map { .byteBuffer($0) }
            )
        }
    }
}
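Note that the URL is built by plain string concatenation, so a target with a trailing slash (for example http://httpbin.org/) would produce a double slash in the forwarded URL. Below is a small sketch of a helper that guards against this. `joinTarget` is a hypothetical function written for this illustration and is not part of the sample code.

```swift
/// Joins a proxy target and a request URI with exactly one "/" between them.
/// Hypothetical helper for illustration; the code above simply concatenates.
func joinTarget(_ target: String, uri: String) -> String {
    // Strip any trailing slashes from the target.
    var base = Substring(target)
    while base.hasSuffix("/") {
        base = base.dropLast()
    }
    // Make sure the URI starts with a slash.
    let path = uri.hasPrefix("/") ? uri : "/" + uri
    return String(base) + path
}
```

With this in place, `joinTarget("http://httpbin.org/", uri: "/get")` and `joinTarget("http://httpbin.org", uri: "/get")` both give the same URL.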

Response conversion

The conversion from HTTPClient.Response to HBResponse is considerably simpler.

extension HTTPClient.Response {
    var hbResponse: HBResponse {
        return .init(
            status: self.status,
            headers: self.headers,
            body: self.body.map { HBResponseBody.byteBuffer($0) } ?? .empty
        )
    }
}

We can now add these two conversion steps into the apply function of HBProxyServerMiddleware. While we are at it, let's also add some logging.

func apply(to request: HBRequest, next: HBResponder) -> EventLoopFuture<HBResponse> {
    // log request
    request.logger.info("Forwarding \(request.uri.path)")
    // convert to HTTPClient.Request, execute, convert to HBResponse
    return request.ahcRequest(host: target).flatMap { ahcRequest in
        httpClient.execute(
            request: ahcRequest,
            eventLoop: .delegateAndChannel(on: request.eventLoop),
            logger: request.logger
        )
    }.map { response in
        return response.hbResponse
    }
}

Now everything should compile. The middleware will collate the HBRequest body, convert it to a HTTPClient.Request, send this request to the target service using HTTPClient and then convert the response to a HBResponse which is then returned back to the application.

Run the application, open a web browser and type in localhost:8080. We should see the httpbin.org web site which we set as the proxy target earlier on.

Streaming

The setup above is far from optimal. It waits until a request is fully loaded before forwarding it to the target service and, in the other direction, it waits until the response has fully loaded before sending it back to the client. This slows down the forwarding process and can also use a lot of memory if request or response payloads are large.

We can improve this by streaming request and response payloads. We start sending the request to the target service as soon as we have its head and stream the body parts as they are received. Similarly, in the other direction, we start sending the response back once we have its head. Removing the wait for a full request or response will improve the performance of the proxy server.

We can still have memory issues though if communication between the client and the proxy and communication between the proxy and target service run at different speeds. Data will start to back up if we are receiving it faster than we are processing it. To avoid this happening we need to be able to apply back pressure to stop reading in additional data until we have processed enough of the data that is in memory. With this we can keep the amount of memory used by the proxy to a minimum.
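The back pressure idea can be modelled without any networking code. The sketch below tracks how many bytes are waiting to be processed and tells the reader to pause once a high watermark is reached, then resume once the backlog drains below a low watermark. `BackPressureBuffer` and its watermark parameters are hypothetical names for this illustration; Hummingbird and SwiftNIO have their own mechanisms for this.

```swift
// Dependency-free model of back pressure using two watermarks (illustrative).
struct BackPressureBuffer {
    let lowWatermark: Int
    let highWatermark: Int
    // Number of bytes received but not yet processed.
    private(set) var buffered = 0
    // True while the producer should stop reading more data.
    private(set) var readingPaused = false

    init(lowWatermark: Int, highWatermark: Int) {
        self.lowWatermark = lowWatermark
        self.highWatermark = highWatermark
    }

    // Called when new data arrives from the faster side.
    mutating func didReceive(bytes: Int) {
        buffered += bytes
        if buffered >= highWatermark {
            readingPaused = true
        }
    }

    // Called when data has been written out to the slower side.
    mutating func didProcess(bytes: Int) {
        buffered = max(0, buffered - bytes)
        if buffered <= lowWatermark {
            readingPaused = false
        }
    }
}
```

Using two watermarks rather than a single limit avoids rapidly toggling between paused and resumed when the backlog hovers around the limit.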

Streaming requests

Streaming the request payload is a fairly easy process. In fact, it simplifies the construction of the HTTPClient.Request as we don't need to wait for the request to be fully loaded. How we construct the HTTPClient.Request body, though, depends on whether the full HBRequest is already in memory. If we return a streaming request, back pressure is applied automatically as the Hummingbird server framework does this for us. The function below replaces the earlier ahcRequest in the HBRequest extension.

func ahcRequest(host: String, eventLoop: EventLoop) throws -> HTTPClient.Request {
    let body: HTTPClient.Body?

    switch self.body {
    case .byteBuffer(let buffer):
        body = buffer.map { .byteBuffer($0) }
    case .stream(let stream):
        body = .stream { writer in
            // as we consume buffers from `HBRequest` we write them to
            // the `HTTPClient.Request`.
            return stream.consumeAll(on: eventLoop) { byteBuffer in
                writer.write(.byteBuffer(byteBuffer))
            }
        }
    }
    return try HTTPClient.Request(
        url: host + self.uri.description,
        method: self.method,
        headers: self.headers,
        body: body
    )
}

Streaming responses

Streaming responses requires a class conforming to HTTPClientResponseDelegate. This will receive data from the HTTPClient response as soon as it is available. The response body is received as a series of ByteBuffers. We can feed these ByteBuffers to a HBByteBufferStreamer. The HBResponse we return is constructed with this streamer instead of a static ByteBuffer.

If we combine the request streaming with the response streaming code, our final apply function should look like this:

func apply(to request: HBRequest, next: HBResponder) -> EventLoopFuture<HBResponse> {
    do {
        request.logger.info("Forwarding \(request.uri.path)")
        // create request
        let ahcRequest = try request.ahcRequest(host: target, eventLoop: request.eventLoop)
        // create response body streamer. maxSize is the maximum size of object it can process
        // maxStreamingBufferSize is the maximum size of data the streamer is allowed to have
        // in memory at any one time
        let streamer = HBByteBufferStreamer(eventLoop: request.eventLoop, maxSize: 2048*1024, maxStreamingBufferSize: 128*1024)
        // HTTPClientResponseDelegate for streaming bytebuffers from AsyncHTTPClient
        let delegate = StreamingResponseDelegate(on: request.eventLoop, streamer: streamer)
        // execute request
        _ = httpClient.execute(
            request: ahcRequest,
            delegate: delegate,
            eventLoop: .delegateAndChannel(on: request.eventLoop),
            logger: request.logger
        )
        // when delegate receives head then signal completion
        return delegate.responsePromise.futureResult
    } catch {
        return request.failure(error)
    }
}

You'll notice in the code above we don't wait on the result of httpClient.execute. This is because if we did, the function would wait for the whole response body to be in memory before continuing. We want to be processing the response straight away so instead we add a promise to the delegate that is fulfilled with a HBResponse holding the head details and the streamer as soon as we receive the head. The EventLoopFuture of this promise is what we pass back from the apply function.

I haven't included the code for StreamingResponseDelegate here as it is not small, but you can find it in the full sample code.
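To give a feel for the shape of such a delegate, here is a dependency-free model of the flow described above: the response is handed back as soon as the head arrives and body parts are fed into a stream afterwards. This is not the StreamingResponseDelegate from the sample code. `ResponseHead`, `BodyStream` and `ModelStreamingDelegate` are hypothetical types, and the real delegate works with EventLoopFutures and ByteBuffers rather than plain callbacks and byte arrays.

```swift
// Hypothetical stand-in for the head of a HTTP response.
struct ResponseHead {
    let status: Int
    let headers: [(name: String, value: String)]
}

// Hypothetical stand-in for a body streamer.
final class BodyStream {
    private(set) var chunks: [[UInt8]] = []
    private(set) var finished = false
    func feed(_ chunk: [UInt8]) { chunks.append(chunk) }
    func finish() { finished = true }
}

// Models the delegate lifecycle: idle -> streaming -> finished.
final class ModelStreamingDelegate {
    enum State { case idle, streaming, finished }

    private(set) var state = State.idle
    let stream = BodyStream()
    // Plays the role of the promise that is fulfilled with the response.
    private let onResponse: (ResponseHead, BodyStream) -> Void

    init(onResponse: @escaping (ResponseHead, BodyStream) -> Void) {
        self.onResponse = onResponse
    }

    // Head received: hand back the response immediately, before any body data.
    func didReceiveHead(_ head: ResponseHead) {
        guard state == .idle else { return }
        state = .streaming
        onResponse(head, stream)
    }

    // Body part received: feed it to the stream the response already holds.
    func didReceiveBodyPart(_ chunk: [UInt8]) {
        guard state == .streaming else { return }
        stream.feed(chunk)
    }

    // Request finished: mark the stream as complete.
    func didFinish() {
        guard state == .streaming else { return }
        state = .finished
        stream.finish()
    }
}
```

The important property is the ordering: the caller gets the response while the stream is still empty, and the body arrives through the stream afterwards.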

Sample code additions

The sample code has a few changes to what is detailed above.

  1. The default bind address port is 8081 instead of 8080. Most Hummingbird examples run on 8080, so to use the proxy alongside those examples it needs to bind to a different port.
  2. I added a location option which allows us to forward only requests from a particular base URL.
  3. I added command line options for both target and location so these can be changed without rebuilding the application.
  4. I remove the host header from the request so it can be filled out with the correct value.
  5. When converting a streaming request, if the content-length header was provided I pass that along to the HTTPClient streamer to ensure the content-length header is set correctly for the request to the target server.
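Points 2 and 4 can be sketched in a few lines of dependency-free Swift. The `Header` type and the `shouldForward` and `removingHostHeader` functions below are hypothetical and only illustrate the idea; the sample code works with Hummingbird's own request and header types and may differ in detail.

```swift
// Hypothetical header representation for this illustration.
typealias Header = (name: String, value: String)

// Forward only requests whose path starts with the configured location.
// Note this is a plain prefix match, so "/api" also matches "/apiv2".
func shouldForward(path: String, location: String) -> Bool {
    return path.hasPrefix(location)
}

// Drop the client's host header so the HTTP client can set one that
// matches the target service. Header names are case-insensitive.
func removingHostHeader(_ headers: [Header]) -> [Header] {
    return headers.filter { $0.name.lowercased() != "host" }
}
```

Removing the host header matters because the client sent it for the proxy's address (for example localhost:8081), and some target services reject or misroute requests whose host header does not match their own name.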

Alternatives

Instead of using Hummingbird for the proxy server we could use HummingbirdCore. This would provide a bit of extra performance as it would remove extra layers of code, but at the expense of flexibility: adding any extra routing or middleware would require a lot more work. I have example code for a proxy server using only HummingbirdCore here.

Of course the other alternative would be to use Vapor. I would imagine an implementation in Vapor would look very similar to what is described above and shouldn't be too hard. I'll leave that to someone else though.