☁ CloudFront Origin Router
By Gufeng Shen · Reading time: 3 mins
CloudFront delivers websites to users from the nearest edge locations. Meanwhile, it forwards requests that are not cached, or not permitted to be cached, to your origin servers. However, it is neither flexible nor smart enough: it cannot automatically pick the fastest origin based on the response time from edge nodes to origin servers, cannot differentiate an execution timeout from a connect timeout, and so on.
Lambda × CloudFront
AWS has a service called Lambda, which offers a serverless script execution environment. When you bring it to CloudFront, replicated on every edge node, it is called Lambda@Edge. You can then use a script to amend requests before CloudFront sends them to the origin server.
Lambda@Edge only supports nodejs8.10, nodejs10.x, or python3.7, even though Go and other languages can be uploaded on the setup page. Besides, Lambda@Edge has only one second of execution time.
To start, given that Lambda@Edge is a form of AWS Lambda, you can use the examples illustrated in Building Lambda Functions with Python. It is worth noting that the parameter event has a structure defined here: Lambda@Edge Event Structure.
def handler(event, context):
    request = event['Records'][0]['cf']['request']
    return request
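To try the handler locally, it can help to call it with a hand-built event. The sketch below assumes a trimmed-down origin-request event that carries only the fields used in this post; the URI and host values are made up for illustration, and a real event contains more fields.

```python
def handler(event, context):
    # Take the request object out of the first record of the event.
    request = event['Records'][0]['cf']['request']
    return request


# Hypothetical, trimmed-down origin-request event for local testing only.
sample_event = {
    "Records": [{
        "cf": {
            "request": {
                "method": "GET",
                "uri": "/index.html",
                "headers": {
                    "host": [{"key": "Host", "value": "o1.example.com"}],
                },
                "origin": {
                    "custom": {
                        "domainName": "o1.example.com",
                        "port": 443,
                        "protocol": "https",
                        "path": "",
                    },
                },
            }
        }
    }]
}

# The context argument is unused by this handler, so None is enough here.
result = handler(sample_event, None)
print(result["uri"], result["origin"]["custom"]["domainName"])
```

Returning the request object unchanged tells CloudFront to continue with the original request, which makes this a safe starting point before adding any routing logic.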
Then define a constant list that holds all available origins:
SERVERS = [
    {"host": "o1.example.com", "port": 443},
    {"host": "o2.example.com", "port": 443},
    ...
]
You can make a simple HTTP request to determine the fastest origin by using the following code:
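import http.client

# s: server object, e.g. {"host": "o1.example.com", "port": 443}
def evaluate(s):
    try:
        # Port 443 speaks TLS, so use HTTPSConnection rather than HTTPConnection.
        # The 0.8 s timeout keeps the probe inside the execution time limit.
        connection = http.client.HTTPSConnection(s['host'], s['port'], timeout=0.8)
        connection.request("HEAD", "/")
        connection.getresponse()
    except Exception:
        print("[Down] unreachable host:" + s['host'])
        return
    print("[OK] reachable host:" + s['host'])
    # ... publish result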
Since there is a time limit, it is better to probe the servers in parallel threads:
import threading

for server in SERVERS:
    threading.Thread(target=evaluate, args=(server,)).start()
You may also need a watchdog to avoid hitting the timeout:
import time

def guard():
    time.sleep(0.9)
    # ... publish a default server

threading.Thread(target=guard).start()
After this, you have to implement a listener that only accepts the first result from all threads, and finally, redirect the request to the fastest responder in the handler:
request["origin"]["custom"]["domainName"] = choice["host"]
request["origin"]["custom"]["port"] = choice["port"]
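The "first result wins" listener can be sketched with a thread-safe queue. This is one possible approach, not necessarily the one used in production here: `pick_fastest`, `probe`, and `fake_probe` are hypothetical names, and the probe is injected as a callable so the sketch runs without any network access. The queue timeout plays the role of the watchdog.

```python
import queue
import threading
import time


def pick_fastest(servers, probe, default, deadline=0.9):
    """Return the first server whose probe succeeds, or default after deadline seconds."""
    results = queue.Queue()

    def worker(server):
        # probe(server) is assumed to return True when the origin answered.
        try:
            ok = probe(server)
        except Exception:
            ok = False
        if ok:
            results.put(server)

    for server in servers:
        # Daemon threads do not block interpreter exit if a probe is still running.
        threading.Thread(target=worker, args=(server,), daemon=True).start()

    try:
        # Blocks until the first success arrives; later results are simply ignored.
        return results.get(timeout=deadline)
    except queue.Empty:
        # Nobody answered in time, so fall back to the default server.
        return default


# Simulated probe: sleeps for the server's made-up latency instead of using the network.
def fake_probe(server):
    time.sleep(server["delay"])
    return server["up"]


servers = [
    {"host": "o1.example.com", "port": 443, "delay": 0.20, "up": True},
    {"host": "o2.example.com", "port": 443, "delay": 0.05, "up": True},
    {"host": "o3.example.com", "port": 443, "delay": 0.01, "up": False},
]
choice = pick_fastest(servers, fake_probe, default=servers[0])
print(choice["host"])
```

In the handler, `choice` would then be written into `request["origin"]["custom"]` as shown above. Using a single blocking `get` avoids the need for locks, because the queue itself decides which thread was first.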
You do not have to determine the fastest origin every time CloudFront is about to initiate a connection to an origin server, so make sure that you implement a caching mechanism.
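One simple way to cache the choice is a module-level variable, which Lambda keeps around between invocations for as long as the same execution environment is reused. A minimal sketch, assuming a hypothetical `CACHE_TTL` of 60 seconds and a `pick` callable that stands in for the probing logic; the clock is passed in as a parameter so the expiry behaviour can be checked without waiting:

```python
import time

CACHE_TTL = 60.0  # seconds; hypothetical value, tune it to your traffic
_cache = {"choice": None, "expires": 0.0}


def cached_origin(pick, now=time.monotonic):
    """Return the cached origin, calling pick() only when the entry is missing or expired."""
    t = now()
    if _cache["choice"] is None or t >= _cache["expires"]:
        _cache["choice"] = pick()
        _cache["expires"] = t + CACHE_TTL
    return _cache["choice"]


# Usage with a stand-in for the probing logic that counts how often it runs.
calls = []

def pick():
    calls.append(1)
    return {"host": "o1.example.com", "port": 443}

first = cached_origin(pick, now=lambda: 0.0)    # cache is empty, pick() runs
second = cached_origin(pick, now=lambda: 30.0)  # still fresh, pick() is skipped
third = cached_origin(pick, now=lambda: 61.0)   # expired, pick() runs again
print(len(calls))
```

Each edge location runs its own copies of the function, so a cache like this is local to one execution environment, which suits this use case because the fastest origin can differ from one edge location to another.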
Performance
Once this is deployed you can view logs in the CloudWatch console.
1ms for common situations
800+ms to update the cached origin
900+ms when no origin is reachable (result published by the watchdog)