Building a Backconnect Proxy (3/5)

Design and Architecture

Published on: 2022-02-24

This is the third in a series of 5 posts that will outline how to go from idea to product through the creation of a backconnect/rotating proxy in Golang. Before a product can be made, it will need to be envisioned. At LDG, we specialize in taking these ideas and bringing them to life.

Our Objective

In the last post we ensured the technical feasibility of creating a backconnect proxy in Golang. In this post we will tackle system design and architecture. First, let us restate our objective: to allow clients to make more requests than permitted by a site's rate limits. This objective can be expanded into a more detailed technical list:

  • Accept requests from clients
  • Rotate between a list of IP addresses for each request
  • Forward requests to a destination
  • Receive the response from the destination
  • Return the response from the destination back to the client
  • Maintain a list of IP addresses that are available for rotating

Given this more detailed list, we can produce a high level architecture diagram outlining the layout of the system.

Architectural Diagram of a proxy rotator.

The system is represented by the Proxy, which interacts with two external components: Client and Destination. The Proxy receives requests from a Client and forwards each request to the Destination using the next IP in the rotation. The Proxy then receives the response from the Destination and returns it to the Client. The Proxy also concurrently maintains the list of IP addresses for rotation. In this diagram, Client may refer to one or more clients making requests through the Proxy.

Use Cases

With the high-level design diagram of the system finished, we now need to dive deeper into the specifics of how the system should behave. One technique for detailing a system's expected behavior is to write out use cases for the software. Use case definitions help identify both the common flow and the edge cases.

Use Case 1: Proxy a client
  1. Client makes request
  2. Proxy receives request
  3. Proxy picks the next IP (ip_i) to use
  4. Next IP is incremented to ip_(i+1)
  5. IP is used to make the request
  6. The response is returned to the client
Use Case 2: Proxy a client with available IP list wraparound
  1. Client makes request
  2. Proxy receives request
  3. The index of the next IP (ip_i) is past the end of the list of available IP addresses
  4. Proxy picks the first IP (ip_0) in the list of available IP addresses
  5. Next IP is set to the second IP (ip_1) in the list of available IP addresses
  6. ip_0 is used to make the request
  7. The response is returned to the client
Use Case 3: Proxy without any available IP addresses
  1. Client makes request
  2. Proxy receives request
  3. List of available IP addresses is empty
  4. Client receives a response that the request failed due to a configuration error
Use Case 4: Multiple concurrent requests
  1. Client A makes request
  2. Client B makes request
  3. Proxy receives request A
  4. Proxy receives request B
  5. Proxy picks the next IP (ip_i) to use for request A
  6. Next IP is incremented to ip_(i+1)
  7. Proxy picks the next IP (ip_(i+1)) to use for request B
  8. Next IP is incremented to ip_(i+2)
  9. ip_i is used to make request A
  10. ip_(i+1) is used to make request B
  11. Response A is returned to client A
  12. Response B is returned to client B
Use Case 5: Updating the list of available IP addresses
  1. On startup, load the list of available IP addresses
  2. Wait a defined interval for the next update
  3. Update the list of available IP addresses
  4. Return to step 2

To perform the rotation we need to keep a list of available IP addresses (ip) along with the index (i) of the next IP to use. Upon each request we retrieve the next IP. For simple rotation scenarios, the next IP is simply the IP following the previous one. Likewise, the initial next IP is the first IP (ip_0) in the list of available IP addresses.

def next_ip(ip : List[IP], i : int) : IP
    # i is the shared next-IP index; its value must persist between calls
    if length(ip) = 0
        raise "no ip addresses available"
    else if i >= length(ip)
        i = 0
    cur_ip = ip[i]
    i += 1
    return cur_ip

Naturally, we will eventually reach the end of our list of available IP addresses. In this case the next IP will be the first IP (ip_0) in the list. For example, if the list of available IP addresses were [A B C D], the sequence of the first 6 IP addresses used would be A, B, C, D, A, B.
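
As a minimal Go sketch of the wraparound behavior just described (the Rotator type and the ErrNoIPs error are hypothetical names, and IPs are represented as plain strings for brevity), the index can be stored next to the list so that it persists between calls:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoIPs is a hypothetical error for the empty-list case (Use Case 3).
var ErrNoIPs = errors.New("no ip addresses available")

// Rotator holds the available IPs and the index of the next one to use.
// This version is NOT safe for concurrent use.
type Rotator struct {
	ips  []string
	next int
}

// Next returns the next IP in the rotation, wrapping back to the first
// IP once the end of the list has been reached.
func (r *Rotator) Next() (string, error) {
	if len(r.ips) == 0 {
		return "", ErrNoIPs
	}
	if r.next >= len(r.ips) {
		r.next = 0
	}
	ip := r.ips[r.next]
	r.next++
	return ip, nil
}

func main() {
	r := &Rotator{ips: []string{"A", "B", "C", "D"}}
	for n := 0; n < 6; n++ {
		ip, err := r.Next()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Print(ip, " ")
	}
	fmt.Println() // prints: A B C D A B
}
```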

The Proxy will also need to handle multiple requests concurrently, so the next IP index must always be incremented atomically. For example, given the same list above, if we were to make 3 concurrent requests, each request should use a different IP. A valid assignment of IPs for the 3 requests would therefore be any permutation of A, B, C. Below is an example of using a Mutex to ensure the next IP is atomically incremented.

def next_ip(m : Mutex, ip : List[IP], i : int) : IP
    m.lock()
    if length(ip) = 0
        m.unlock()
        raise "no ip addresses available"
    if i >= length(ip)
        i = 0
    cur_ip = ip[i]
    i += 1
    m.unlock()
    return cur_ip
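
In Go, the same idea might look like the sketch below. It assumes a hypothetical SafeRotator type that embeds a sync.Mutex next to the list and the index; IPs are again plain strings. The whole read-check-increment sequence runs while the lock is held, so two requests can never observe the same index.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNoIPs is a hypothetical error for the empty-list case.
var ErrNoIPs = errors.New("no ip addresses available")

// SafeRotator guards the list and the next index with a mutex.
type SafeRotator struct {
	mu   sync.Mutex
	ips  []string
	next int
}

// Next picks and advances the rotation as one atomic step.
func (r *SafeRotator) Next() (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock() // released on every return path, including errors
	if len(r.ips) == 0 {
		return "", ErrNoIPs
	}
	if r.next >= len(r.ips) {
		r.next = 0
	}
	ip := r.ips[r.next]
	r.next++
	return ip, nil
}

func main() {
	r := &SafeRotator{ips: []string{"A", "B", "C", "D"}}
	results := make([]string, 3)
	var wg sync.WaitGroup
	for n := 0; n < len(results); n++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			ip, err := r.Next()
			if err != nil {
				return
			}
			results[n] = ip // each goroutine writes its own slot
		}(n)
	}
	wg.Wait()
	fmt.Println(results) // some ordering of A, B and C
}
```

Using defer for the unlock avoids the easy mistake of returning early on the empty-list path while still holding the lock.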

The list of available IP addresses plays a critical role in the system. This list will need to be created on startup and then periodically updated while the Proxy is running. Updating it should impact the main Proxy system as little as possible. As outlined in the architectural diagram, updating the list of available IP addresses can run in an entirely separate thread. Because that thread shares the list with the request path, the update requires thread-safe access. Finally, the update will need to occur on some interval. The following code outlines retrieving the next IP in a thread-safe manner. Since the available IP list (ip) may be updated concurrently in update_ip_list, the length check at the top of next_ip will need to be synchronized as well.

def next_ip(m : Mutex, ip : List[IP], i : int) : IP
    m.lock()
    if length(ip) = 0
        m.unlock()
        raise "no ip addresses available"
    else if i >= length(ip)
        i = 0
    cur_ip = ip[i]
    i += 1
    m.unlock()
    return cur_ip

def update_ip_list(m : Mutex, ip : List[IP], interval : int, macAddr : string)
    Loop indefinitely
        Wait for interval
        # fetch outside the lock so requests are not blocked while waiting
        new_ip = update_list(macAddr)
        m.lock()
        ip = new_ip
        m.unlock()

The choice of interval matters. For example, suppose the update runs every second and 10 new IP addresses are added over the course of 10 seconds. The system could end up applying 10 separate updates in those 10 seconds. If we instead update every 20 seconds, only two updates would be required even in the worst case. One must take these considerations into account and settle on a suitable update frequency.

Diagram of 10 updates vs 2 updates.

At this point, we've covered enough of the design and architecture of the backconnect proxy. We covered various edge cases and produced pseudocode for the system's core feature, IP rotation. If we have done our job well, building the system should proceed smoothly. In part 4 (coming soon) we will go from PoCs (Proofs of Concept) and design to a full MVP (Minimum Viable Product).
