Building a Backconnect Proxy (3/5)
Design and Architecture
Published on: 2022-02-24
This is the third in a series of 5 posts that will outline how to go from idea to product through the creation of a backconnect/rotating proxy in Golang. Before a product can be made, it will need to be envisioned. At LDG, we specialize in taking these ideas and bringing them to life.
Our Objective
In the last post we ensured the technical feasibility of creating a backconnect proxy in Golang. In this post we will tackle system design and architecture. First, let us restate our objective: to allow clients to make more requests than permitted by a site's rate limits. This objective can be expanded into a more detailed technical list:
- Accept requests from clients
- Rotate between a list of IP addresses for each request
- Forward requests to a destination
- Receive the response from the destination
- Return the response from the destination back to the client
- Maintain a list of IP addresses that are available for rotating
Given this more detailed list, we can produce a high level architecture diagram outlining the layout of the system.
The system is described by Proxy, which interacts with two external components: Client and Destination. The Proxy receives requests from a Client and uses the next IP in the rotation to forward each request to the Destination. The Proxy then receives the response from the Destination and returns it to the Client. The Proxy also concurrently maintains a list of IP addresses for rotation. In this diagram, Client may refer to one or more clients making requests through the Proxy.
Use Cases
With the high level design diagram of the system finished, we now need to dive deeper into the specifics of how the system should behave. One technique for detailing a system's expected behavior is to outline use cases of software. Use case definitions help identify both the common flow and edge cases.
Use Case 1: Proxy a client
- Client makes request
- Proxy receives request
- Proxy picks the next IP (ip[i]) to use
- next IP is incremented to ip[i+1]
- IP is used to make the request
- The response is returned to the client
Use Case 2: Proxy a client with available IP list wraparound
- Client makes request
- Proxy receives request
- The next IP (ip[i]) is past the end of the list of available IP addresses
- Proxy picks the first IP (ip[0]) in the list of available IP addresses
- next IP is set to the second IP (ip[1]) in the list of available IP addresses
- The IP is used to make the request
- The response is returned to the client
Use Case 3: Proxy without any available IP addresses
- Client makes request
- Proxy receives request
- List of available IP addresses is empty
- Client receives a response that the request failed due to a configuration error
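Use Case 3 can be sketched as a small Go HTTP handler. This is only an illustration under stated assumptions: the names errNoIPs, proxyHandler, and statusFor are hypothetical, the pick and forward callbacks stand in for rotation and forwarding logic that is not shown here, and the choice of a 500 status code for the configuration error is an assumption rather than part of the design above.

```go
package main

import (
	"errors"
	"net/http"
	"net/http/httptest"
)

// errNoIPs is a hypothetical sentinel error for an empty IP list.
var errNoIPs = errors.New("no ip addresses available")

// proxyHandler picks an outbound IP for each request. Both callbacks are
// hypothetical stand-ins: pick returns the next IP in the rotation and
// forward sends the request to the destination using that IP.
func proxyHandler(
	pick func() (string, error),
	forward func(ip string, w http.ResponseWriter, r *http.Request),
) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ip, err := pick()
		if err != nil {
			// Use Case 3: tell the client the request failed because of a
			// configuration error. The 500 status is an assumption.
			http.Error(w, "proxy configuration error: "+err.Error(), http.StatusInternalServerError)
			return
		}
		forward(ip, w, r)
	}
}

// statusFor runs one in-memory request through the handler and returns the
// status code the client would see.
func statusFor(pick func() (string, error)) int {
	forward := func(ip string, w http.ResponseWriter, r *http.Request) {
		// Stub: a real proxy would contact the destination here.
		w.WriteHeader(http.StatusOK)
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "http://example.com/", nil)
	proxyHandler(pick, forward)(rec, req)
	return rec.Code
}
```

Calling statusFor with a pick function that returns errNoIPs exercises the failure path without opening any network connections.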
Use Case 4: Multiple concurrent requests
- Client A makes request
- Client B makes request
- Proxy receives request A
- Proxy receives request B
- Proxy picks the next IP (ip[i]) to use for request A
- next IP is incremented to ip[i+1]
- Proxy picks the next IP (ip[i+1]) to use for request B
- next IP is incremented to ip[i+2]
- ip[i] is used to make request A
- ip[i+1] is used to make request B
- Response A is returned to client A
- Response B is returned to client B
Use Case 5: Updating the list of available IP addresses
- On startup, load the list of available IP addresses
- Wait a defined interval for the next update
- Update the list of available IP addresses
- Return to step 2
In order to perform the rotation we will need to keep a list of available IP addresses (ip). Upon each request we need to retrieve the next IP to use. For simple rotation scenarios, the next IP is simply the IP following the previous IP. Likewise, the initial next IP will be the first IP (ip[0]) in the available IP addresses list.
def next_ip(ip : List[IP], i : int) : IP
    # i is the shared "next IP" index and must persist between calls
    if length(ip) == 0
        raise "no ip addresses available"
    else if i >= length(ip)
        i = 0  # wrap around to the first IP
    end
    cur_ip = ip[i]
    i += 1
    return cur_ip
end
Naturally, we will eventually reach the end of our list of available IP addresses. In this case the next IP will be the first IP (ip[0]) in the list of available IP addresses. For example, if the list of available IP addresses were [A B C D], the sequence of the first 6 IP addresses used would be: A, B, C, D, A, B.
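The rotation and wraparound described above could look roughly like this in Go. It is a minimal single-threaded sketch, not the series' actual implementation: the Rotator type, its fields, and ErrNoIPs are hypothetical names, and IP addresses are represented as plain strings for simplicity.

```go
package main

import "errors"

// ErrNoIPs is a hypothetical sentinel error for an empty IP list.
var ErrNoIPs = errors.New("no ip addresses available")

// Rotator is a hypothetical type that holds the available IP addresses and
// the index of the next IP to hand out.
type Rotator struct {
	ips  []string
	next int
}

// Next returns the IP at the current index and then advances the index.
// When the index has run past the end of the list it wraps back to 0.
// It is not safe for concurrent use.
func (r *Rotator) Next() (string, error) {
	if len(r.ips) == 0 {
		return "", ErrNoIPs
	}
	if r.next >= len(r.ips) {
		r.next = 0
	}
	ip := r.ips[r.next]
	r.next++
	return ip, nil
}
```

Because the index lives on the struct rather than in a function argument, it persists between calls, which is what makes six successive calls on the list [A B C D] walk through the list and wrap around.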
The Proxy will also need to handle multiple requests concurrently; the next IP should always be incremented atomically. For example, given the same list above, if we were to make 3 concurrent requests, each request should use a different IP. Therefore a valid assignment of IPs for the 3 requests would be any permutation of A, B, C. Below is an example of using a Mutex to ensure the next IP is atomically incremented.
def next_ip(m : Mutex, ip : List[IP], i : int) : IP
    if length(ip) == 0
        raise "no ip addresses available"
    end
    m.lock()  # only one request at a time may read and advance i
    if i >= length(ip)
        i = 0
    end
    cur_ip = ip[i]
    i += 1
    m.unlock()
    return cur_ip
end
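In Go, the same idea might be expressed with sync.Mutex. The sketch below is illustrative only: SafeRotator and pickConcurrently are hypothetical names, IPs are plain strings, and unlike the pseudocode above it holds the lock for the whole call (including the length check), which is a simplification chosen here rather than something the design requires at this stage.

```go
package main

import (
	"errors"
	"sync"
)

// SafeRotator is a hypothetical rotator whose index is guarded by a mutex.
type SafeRotator struct {
	mu   sync.Mutex
	ips  []string
	next int
}

// Next hands out the next IP. The read of the index and its increment
// happen under the same lock, so two callers never receive the same slot
// within one pass over the list.
func (r *SafeRotator) Next() (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.ips) == 0 {
		return "", errors.New("no ip addresses available")
	}
	if r.next >= len(r.ips) {
		r.next = 0
	}
	ip := r.ips[r.next]
	r.next++
	return ip, nil
}

// pickConcurrently requests n IPs from n goroutines at once and returns
// them. Each goroutine writes to its own slot, so the slice needs no lock.
func pickConcurrently(r *SafeRotator, n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(slot int) {
			defer wg.Done()
			if ip, err := r.Next(); err == nil {
				results[slot] = ip
			}
		}(i)
	}
	wg.Wait()
	return results
}
```

With the list [A B C], pickConcurrently(r, 3) returns A, B, and C in some order that depends on goroutine scheduling, matching the "any permutation" requirement.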
The list of available IP addresses will play a critical role in the system. This list will need to be created on startup and then periodically updated while the Proxy is running. This process of updating should impact the main Proxy system as little as possible. As outlined in the architectural diagram, updating the list of available IP addresses can be run in an entirely separate thread. Of course, because the list is shared with request handling, this update will require thread-safe access. Finally, this update will need to occur on some interval. The following code outlines retrieving the next IP in a thread-safe manner.
Since the available IP list (ip) may be updated concurrently in update_ip_list, the length check on line 3 of next_ip will need to be synchronized as well.
def next_ip(m : Mutex, ip : List[IP], i : int) : IP
    m.lock()
    if length(ip) == 0  # checked under the lock: ip may be replaced concurrently
        m.unlock()
        raise "no ip addresses available"
    else if i >= length(ip)
        i = 0
    end
    cur_ip = ip[i]
    i += 1
    m.unlock()
    return cur_ip
end
def update_ip_list(m : Mutex, ip : List[IP], interval : int, macAddr : string)
    loop indefinitely
        wait for interval
        new_ip = update_list(macAddr)  # fetch the current addresses outside the lock
        m.lock()
        ip = new_ip  # replace the shared list while holding the lock
        m.unlock()
    end
end
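A background updater along these lines could be sketched in Go with a goroutine and time.Ticker. The IPList type, its methods, the stop channel, and the fetch callback are all hypothetical; fetch stands in for update_list(macAddr), whose details are not covered here. Keeping the previous list when fetch fails is an assumption of this sketch, not a stated requirement.

```go
package main

import (
	"sync"
	"time"
)

// IPList is a hypothetical holder for the shared list of available IPs.
type IPList struct {
	mu  sync.Mutex
	ips []string
}

// Replace swaps in a new list while holding the lock.
func (l *IPList) Replace(ips []string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.ips = ips
}

// Snapshot returns a copy of the current list.
func (l *IPList) Snapshot() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return append([]string(nil), l.ips...)
}

// Refresh calls fetch once per interval and replaces the list with the
// result until stop is closed. fetch runs outside the lock, so a slow
// lookup does not block request handling.
func (l *IPList) Refresh(stop <-chan struct{}, interval time.Duration, fetch func() ([]string, error)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			ips, err := fetch()
			if err != nil {
				continue // assumption: keep the previous list on failure
			}
			l.Replace(ips)
		}
	}
}
```

A caller would typically load the list once at startup with Replace, then run `go list.Refresh(stop, interval, fetch)` and close stop on shutdown.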
The choice of update interval is a trade-off. For example, if the update occurred every second and we happened to add 10 new IP addresses, the system could require 10 updates over the course of 10 seconds. If we instead update every 20 seconds, only two updates would be required even in the worst case. One must take these considerations into account and choose an update frequency that suits the system.
At this point, we've covered enough of the design and architecture of the backconnect proxy. We covered various edge cases and produced pseudocode for the system's core feature, IP rotation. If we have done our job well, building the system should proceed smoothly. In part 4 (coming soon) we will go from PoCs (proofs of concept) and design to a full MVP (Minimum Viable Product).