Building a Backconnect Proxy (3/5)

Design and Architecture

This is the third in a series of 5 posts that will outline how to go from idea to product through the creation of a backconnect/rotating proxy in Golang. Before a product can be made, it will need to be envisioned. At LDG, we specialize in taking these ideas and bringing them to life.

Our Objective

In the last post we established that building a backconnect proxy in Golang is technically feasible. In this post we will tackle system design and architecture. First, let us restate our objective: to allow clients to make more requests than a site's rate limits permit. This objective can be expanded into a more detailed list of technical requirements:

Accept requests from clients
Rotate between a list of IP addresses for each request
Forward requests to a destination
Receive the response from the destination
Return the response from the destination back to the client
Maintain a list of IP addresses that are available for rotating

Given this more detailed list, we can produce a high level architecture diagram outlining the layout of the system.

Architectural Diagram of a proxy rotator.

The system itself is represented by Proxy, which interacts with two external components: Client and Destination. The Proxy receives requests from a Client and forwards each request to the Destination using the next IP in the rotation. The Proxy then receives the response from the Destination and returns it to the Client. Concurrently, the Proxy maintains the list of IP addresses available for rotation. In this diagram, Client may refer to one or more clients making requests through the Proxy.

Use Cases

With the high level design diagram of the system finished, we now need to dive deeper into the specifics of how the system should behave. One technique for detailing a system's expected behavior is to outline its use cases. Use case definitions help identify both the common flow and the edge cases.

Use Case 1: Proxy a client

Client makes request
Proxy receives request
Proxy picks the next IP (ip_i) to use
next IP is incremented to ip_i+1
IP is used to make the request
The response is returned to the client

Use Case 2: Proxy a client with available IP list wraparound

Client makes request
Proxy receives request
The next IP (ip_i) falls past the end of the list of available IP addresses
Proxy picks the first IP (ip₀) in the list of available IP addresses
next IP is set to the second IP (ip₁) in the list of available IP addresses
ip₀ is used to make the request
The response is returned to the client

Use Case 3: Proxy without any available IP addresses

Client makes request
Proxy receives request
List of available IP addresses is empty
Client receives a response that the request failed due to a configuration error

Use Case 4: Multiple concurrent requests

Client A makes request
Client B makes request
Proxy receives request A
Proxy receives request B
Proxy picks the next IP (ip_i) to use for request A
next IP is incremented to ip_i+1
Proxy picks next IP (ip_i+1) to use for request B
next IP is incremented to ip_i+2
ip_i is used to make request A
ip_i+1 is used to make request B
Response A is returned to client A
Response B is returned to client B

Use Case 5: Updating the list of available IP addresses

On startup, load the list of available IP addresses
Wait a defined interval for the next update
Update the list of available IP addresses
Return to step 2

In order to perform the rotation we will need to keep a list of available IP addresses (ip). Upon each request we need to retrieve the next IP to use. For simple rotation scenarios, the definition of the next IP is simply the IP following the previous IP. Likewise, the initial next IP will be the first IP (ip₀) in the available IP addresses list.


def next_ip(ip : List[IP], i : int) : IP
    if length(ip) = 0
        raise "no ip addresses available"
    else if i >= length(ip)
        i = 0
    end
    cur_ip = ip[i]
    i += 1
    return cur_ip
end

Naturally, we will eventually reach the end of our list of available IP addresses. In this case the next IP will be the first IP (ip₀) in the list of available IP addresses. For example, if the list of available IP addresses were [A B C D], the sequence of the first 6 IP addresses used would be: A, B, C, D, A, B.

The Proxy will also need to handle multiple requests concurrently, so the next IP index must always be read and incremented atomically. For example, given the same list above, if we were to make 3 concurrent requests, each request should use a different IP. A valid assignment of IPs to the 3 requests would therefore be any ordering of A, B, C. Below is an example of using a Mutex to ensure the next IP is atomically incremented.


def next_ip(m : Mutex, ip : List[IP], i : int) : IP
  if length(ip) = 0
    raise "no ip addresses available"
  end
  m.lock()
  if i >= length(ip)
    i = 0
  end
  cur_ip = ip[i]
  i += 1
  m.unlock()
  return cur_ip
end

The list of available IP addresses will play a critical role in the system. This list will need to be created on startup and then periodically updated while the Proxy is running. This process of updating should impact the main Proxy system as little as possible. As outlined in the architectural diagram, updating the list of available IP addresses can be run in an entirely separate thread. Of course, this update will likely require thread-safe access to the list. Finally, this update will need to occur on some interval. The following code outlines retrieving the next IP in a thread-safe manner. Since the available IP list (ip) may be updated concurrently in update_ip_list, the length check on line 3 will need to be synchronized as well.


def next_ip(m : Mutex, ip : List[IP], i : int) : IP
  m.lock()
  if length(ip) = 0
    m.unlock()
    raise "no ip addresses available"
  else if i >= length(ip)
    i = 0
  end
  cur_ip = ip[i]
  i += 1
  m.unlock()
  return cur_ip
end

def update_ip_list(m : Mutex, ip : List[IP], interval : int, macAddr : string)
  Loop indefinitely
    Wait for interval
    new_ip = update_list(macAddr)
    m.lock()
    ip = new_ip
    m.unlock()
  end
end

The choice of update interval deserves some thought. For example, if the update were to occur every second and we happened to add 10 new IP addresses over the course of 10 seconds, the system could require 10 updates in those 10 seconds. If we instead update every 20 seconds, only two updates would be required even in the worst case. The trade-off is that a longer interval means new IP addresses take longer to enter the rotation. One must take these considerations into account and choose an update frequency that is suitable.

At this point, we have covered the design and architecture of the backconnect proxy in enough detail to move forward. We walked through the main use cases and edge cases, and produced pseudocode for the system's core feature, IP rotation. If we have done our job well, building the system should proceed smoothly. In part 4 (coming soon) we will go from PoCs (proofs of concept) and design to a full MVP (Minimum Viable Product).

