Building a Backconnect Proxy (2/5)

Proof of Concept

Published on: 2022-02-15

This is the second in a series of 5 posts that will outline how to go from idea to product through the creation of a backconnect/rotating proxy in Golang. Before a product can be made, it will need to be envisioned. At LDG, we specialize in taking these ideas and bringing them to life.

In part 1 of our series on creating a backconnect proxy, we introduced backconnect proxies and the problems we encountered with existing solutions. In this post we will work towards answering the key technical questions that were posed there. In short, we will create a Proof of Concept (PoC) for each of the key features required to build the backconnect proxy.

Which language would fit the job?

Generally when picking which language to use we consider the following:

  • Capabilities of the language
  • Functional needs of the task
  • Experience and skills of the development team

Certain languages lend themselves to certain problems due to language features and existing libraries. In this regard we wanted to pick a language that would allow for a short development cycle and produce fast code without the need for extensive optimizations.

We decided to develop in Golang due to our experience with the language and its general speed. Given our familiarity with it, we were confident we could develop the system in a reasonable amount of time. Likewise, Golang provides CSP-style concurrency (goroutines and channels), which facilitates development of highly concurrent systems. Finally, Golang lends itself to the development of a backconnect proxy, which is heavily centered around networking and system integration. Naturally, if the language lacked the necessary technical capabilities for our task we would need to reassess this choice.

Can we create a proxy service?

We began by investigating the network libraries available in Golang. Starting with the standard library, we searched for the ability to create a proxy service. In Golang the standard library provides most of the tools you would need to create a simple proxy. However, creating a backconnect proxy required more extensibility, so we expanded our search. Eventually we found goproxy, a customizable HTTP proxy library. Below is a simple example from their README.md.


package main

import (
    "github.com/elazarl/goproxy"
    "log"
    "net/http"
)

func main() {
    proxy := goproxy.NewProxyHttpServer()
    proxy.Verbose = true
    log.Fatal(http.ListenAndServe(":8080", proxy))
}

Furthermore, goproxy also facilitates the interception and modification of requests. If one were to add the code below (again from their README.md), each request received by the proxy server would have the header X-GoProxy: yxorPoG-X injected.


proxy.OnRequest().DoFunc(
    func(r *http.Request,ctx *goproxy.ProxyCtx)(*http.Request,*http.Response) {
        r.Header.Set("X-GoProxy","yxorPoG-X")
        return r,nil
    })

Putting this example together we can build and run the proxy with the following:


go run goproxy-example.go

Alongside the proxy on port 8080 we run a second server on port 9090. This simplifies the demo, as the header changes are only visible to the destination. To showcase the change, the second server responds to every request with a raw dump of that request. We can then test the proxy library with the following curl command:


curl -x http://localhost:8080 http://localhost:9090

The response is the destination's dump of the curl request, which now includes the header value injected by the proxy.

The library provides a fair number of useful built-in features, including custom request handling definitions. Unfortunately, this library doesn't support a backconnect proxy directly. The library does, however, provide the necessary functionality to facilitate the creation of a backconnect proxy.

Can we access multiple IP addresses?

The backconnect proxy will need access to a pool of IP addresses in order to function. One could purchase several servers, each with their own IP. A proxy would then facilitate forwarding requests to each node in the network to achieve the desired IP rotation. An alternative would be to bind multiple IP addresses to a single server. The proxy would then rotate between these IP addresses for each different client request.

We chose the latter approach as it was possible (thanks to our hosting provider) and more cost effective. Bulk purchasing of IP addresses is not widely available, but where it is offered it is often cheaper than purchasing many servers. With access to multiple IP addresses solved, we needed to determine how to use the addresses bound to the server.

Can we programmatically change the IP address used for making a HTTP request?

Using multiple bound IP addresses is fairly rare, so the necessary documentation was a bit harder to find. For Golang, the networking standard library (net) provides a multi-step process to change the source IP address of a request. By default, a request in Golang is sent via http.Transport, which is the default implementation of http.RoundTripper, the interface Golang defines for sending a request and receiving a response. One could, if they so desired, write a custom implementation of http.RoundTripper. However, for our purposes http.Transport will suffice, as it can be customized with a custom dialer function. This function controls how a connection is established to the destination. The following code outlines the definition of a custom dialer function. The local address of the connection (represented by LocalAddr) is set to a hardcoded IP address of 1.1.1.1, a placeholder for an address that is actually bound to the server.


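// CustomDialer matches the signature of http.Transport.DialContext and opens
// the outgoing connection from a specific local (source) IP address.
func CustomDialer(ctx context.Context, network string, addr string) (net.Conn, error) {
    altIP := "1.1.1.1" // Custom IP; must be an address bound to this machine
    ipAddress := net.ParseIP(altIP)
    d := net.Dialer{
        LocalAddr: &net.TCPAddr{
            IP: ipAddress,
        },
        Timeout:   30 * time.Second,
        KeepAlive: 30 * time.Second,
    }
    // DialContext (rather than Dial) honors cancellation and deadlines from ctx.
    return d.DialContext(ctx, network, addr)
}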

The custom dialer is then assigned to http.Transport.DialContext. Below is an example using the custom dialer to make a request with a different IP address.


req, err := http.NewRequest(http.MethodGet, "http://ip-api.com/json", nil)
if err != nil {
    panic(err)
}

transport := &http.Transport{
    Proxy: http.ProxyFromEnvironment,
    DialContext: CustomDialer,
}

resp, err := transport.RoundTrip(req)
if err != nil {
    panic(err)
}
defer resp.Body.Close() // release the connection once the response is consumed

With the ability to programmatically switch between bound IP addresses we can provide the rotating portion of the backconnect proxy.
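To illustrate the rotation itself, here is a minimal round-robin sketch built on the same dialer idea. The type ipRotator and its methods are hypothetical names, and the addresses are documentation-range placeholders (RFC 5737) rather than real bound IPs. This is one possible approach under those assumptions, not the final design.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"sync"
	"time"
)

// ipRotator (hypothetical) hands out source IPs in round-robin order and is
// safe for use by multiple goroutines.
type ipRotator struct {
	mu   sync.Mutex
	ips  []net.IP
	next int
}

// newIPRotator parses the given addresses and rejects invalid or empty input.
func newIPRotator(addrs []string) (*ipRotator, error) {
	r := &ipRotator{}
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil {
			return nil, fmt.Errorf("invalid IP address %q", a)
		}
		r.ips = append(r.ips, ip)
	}
	if len(r.ips) == 0 {
		return nil, fmt.Errorf("no IP addresses supplied")
	}
	return r, nil
}

// Next returns the next IP in the list, wrapping around at the end.
func (r *ipRotator) Next() net.IP {
	r.mu.Lock()
	defer r.mu.Unlock()
	ip := r.ips[r.next]
	r.next = (r.next + 1) % len(r.ips)
	return ip
}

// DialContext matches http.Transport.DialContext, so it can be assigned there
// to open each new connection from the next source IP.
func (r *ipRotator) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	d := net.Dialer{
		LocalAddr: &net.TCPAddr{IP: r.Next()},
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}
	return d.DialContext(ctx, network, addr)
}

func main() {
	// Placeholder addresses; a real deployment would use IPs bound to the server.
	r, err := newIPRotator([]string{"192.0.2.10", "192.0.2.11", "192.0.2.12"})
	if err != nil {
		panic(err)
	}
	for i := 0; i < 4; i++ {
		fmt.Println(r.Next()) // the fourth call wraps back to 192.0.2.10
	}
}
```

Because http.Transport reuses idle connections, a dialer like this rotates per new connection rather than strictly per request; setting DisableKeepAlives on the transport is one way to force a fresh connection each time, at some cost in performance.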

Can we minimize downtime for the proxy?

Any system will likely require some downtime. However, as we discussed in part 1, regular downtime lasting several hours conflicted with our client's functional needs. We could afford downtime that was sparse and brief, as the integrity of the collected data would not be greatly impacted.

Similarly, reducing maintenance depends on the type of maintenance the system will require. In examining the backconnect proxy one source of potential downtime would be the need to update or manage the IP addresses used for rotation. Using a statically defined list of the available IP addresses for rotating would be initially easier. However, if that list changes the system would need to be restarted. Alternatively the list of IP addresses could be dynamically maintained from another source. Ideally the source of IP addresses would be either retrieved directly from the server, or a list maintained by us.

We opted to retrieve the list of available IP addresses from the server using the network interface functions in Golang's standard net package. The following code will retrieve all IPv4 addresses bound to the interface with a specific MAC (Media Access Control) address.


targetInterfaceMAC := "xxx"
netInterfaces, err := net.Interfaces()
if err != nil {
    panic(err)
}

var activeIPList []string
for _, netInterface := range netInterfaces {
    if netInterface.HardwareAddr.String() != targetInterfaceMAC {
        continue
    }
    addresses, err := netInterface.Addrs()
    if err != nil {
        continue
    }
    for _, address := range addresses {
        if val, ok := address.(*net.IPNet); ok {
            if val.IP.To4() != nil { // Filters out IPv6 addresses
                activeIPList = append(activeIPList, val.IP.String())
            }
        }
    }
}

Provided just the MAC address we can now retrieve the list of bound IP addresses. This is far more convenient as the MAC address is far less likely to change. IP addresses can be added or removed from the system by binding/unbinding to the server. Once these changes are made on the system the proxy may update itself and those changes will cascade into the proxy's list of available IP addresses. Naturally this update would be performed on some interval which we'll explore in part 4.

Beyond maintenance of available IP addresses, the main reasons for downtime of a backconnect proxy would be the hosting service, code or architecture quality. For the hosting service, finding a stable and reliable hosting service provider is fundamental. The latter two will be our focus in part 3 and part 4.

Can we make this a plug in replacement for existing proxy services?

Yes we can, for the most part. We will aim to use the standard interface for proxies. One foreseeable issue is HTTPS requests (secure HTTP requests). In certain cases we may need to perform Man In The Middle (MITM) attacks on the client's request to make the necessary modifications prior to transmission. We may be able to avoid many of these issues using an HTTP tunnel, which is established with the CONNECT method and lets encrypted HTTPS requests pass through an HTTP proxy. This would allow us to support both HTTP and HTTPS requests through our HTTP proxy.

First, let’s review how Golang supports proxy integration for normal use cases: one can simply define environment variables to set the proxy. Setting HTTP_PROXY or HTTPS_PROXY (for more information, see https://pkg.go.dev/net/http#ProxyFromEnvironment ) routes requests through the given proxy. The default transport will automatically use a proxy if one is defined by these environment variables. To define this manually, configure http.Client with a custom http.Transport and set its Proxy member to http.ProxyFromEnvironment.


// simple_proxy_client.go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"time"
)

func main() {
	address := "http://ip-api.com/json"

	client := &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			Proxy: http.ProxyFromEnvironment,
		},
	}

	req, err := http.NewRequest("GET", address, nil)
	if err != nil {
		panic(err)
	}
	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}

	defer resp.Body.Close()
	dump, err := httputil.DumpResponse(resp, true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", dump)
}

This code can be run with the following (of course a proxy must be running as well):

go build simple_proxy_client.go
HTTP_PROXY=http://example-proxy.com ./simple_proxy_client

Code

Hooray! We were able to answer all the key technical feasibility questions and even build out some working code examples. If you'd like to review the full code examples please see them below:

With our concerns for technical feasibility satisfied, we'll move on to part 3 (coming soon), where we will design and architect the backconnect proxy system.
