As states closed their polls and Trump’s campaign surged, the Canadian immigration website crashed repeatedly with a “500 Internal Server Error.” But, in the age of “the cloud,” how could that be?
It wasn’t all that long ago that a spike in network traffic was a network engineer’s worst nightmare. Hours upon hours were spent drawing up traffic projection models. Do we have enough inbound bandwidth? Do we have enough outbound bandwidth? Do we have the hardware at hand to “scale up” if, somehow, our product were featured on Oprah and the floodgates opened at our front door? These were serious endeavors.
At startups, engineers would fight tooth and nail to ensure they had the resources they needed to be ready for a surge. The technical and financial constraints were real, and we all knew there were limits to how well we could insulate ourselves from the risk.
What happened to the Canadian immigration site last night provides a valuable lesson. Immigration spokeswoman Lisa Filipps said the website had become “temporarily inaccessible to users as a result of a significant increase in the volume of traffic.” In our new world of DDoS attacks and viral social posts, it’s not enough to have headroom; you need to have scale. Just like in economics, scale is often the only means of protection.
Solutions like Amazon Web Services (AWS) make achieving scale remarkably painless. With tools like Auto Scaling, one can inherit the scale achievable only via massive infrastructure providers (Amazon, Oracle, etc.) in an on-demand, consumption-driven model (pay for what you use). And it’s not that difficult either.
“Alarms and policies to determine when the conditions for scaling are met. An alarm is an object that watches over a single metric (for example, the average CPU utilization of the EC2 instances in your Auto Scaling group) over a specified time period. When the value of the metric breaches the threshold that you defined, for the number of time periods that you specified, the alarm performs one or more actions (such as sending messages to Auto Scaling). A policy is a set of instructions that tells Auto Scaling how to respond to alarm messages.”
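To make the alarm-and-policy idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the AWS API: the class names, the 70% and 25% CPU thresholds, the two-period window, and the group limits of 2 to 10 instances are all made-up values chosen for illustration. An alarm fires once its metric has breached the threshold for the required number of consecutive periods, and a policy then tells the group how many instances to add or remove.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alarm:
    """Watches one metric and fires after `periods` consecutive breaches."""
    threshold: float
    periods: int
    breach_when_above: bool = True
    _streak: int = field(default=0, init=False)

    def observe(self, value: float) -> bool:
        if self.breach_when_above:
            breached = value > self.threshold
        else:
            breached = value < self.threshold
        # Count consecutive breaching samples; any normal sample resets the count.
        self._streak = self._streak + 1 if breached else 0
        if self._streak >= self.periods:
            # Start counting afresh after firing (a crude stand-in for a cooldown).
            self._streak = 0
            return True
        return False


@dataclass
class Policy:
    """Tells the group how to respond when its alarm fires."""
    adjustment: int  # instances to add (positive) or remove (negative)
    min_size: int
    max_size: int

    def apply(self, current: int) -> int:
        # Never shrink below min_size or grow beyond max_size.
        return max(self.min_size, min(self.max_size, current + self.adjustment))


def simulate(cpu_samples: List[float], start_size: int = 2) -> List[int]:
    """Return the group size after each average-CPU sample (hypothetical values)."""
    high_cpu = Alarm(threshold=70.0, periods=2)
    low_cpu = Alarm(threshold=25.0, periods=2, breach_when_above=False)
    scale_out = Policy(adjustment=2, min_size=2, max_size=10)
    scale_in = Policy(adjustment=-2, min_size=2, max_size=10)

    size = start_size
    sizes = []
    for cpu in cpu_samples:
        # Both alarms see every sample so their streaks stay up to date.
        fire_out = high_cpu.observe(cpu)
        fire_in = low_cpu.observe(cpu)
        if fire_out:
            size = scale_out.apply(size)
        elif fire_in:
            size = scale_in.apply(size)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    surge = [30, 85, 90, 95, 92, 40, 20, 15, 10, 12]
    print(simulate(surge))
```

Fed a surge followed by a lull, the group in this sketch grows while CPU stays high and shrinks again once load drops. A real deployment would express the same rules through the provider's own alarm and scaling-policy configuration rather than hand-written loops.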
For any site with the possibility of spikes in traffic, automatic scaling is the smart play. In an election season with as many headlines as the one we’ve just witnessed, we can’t say the immigration website issues were a surprise. But with smart policies in place, the Canadian immigration site could have automatically scaled up to respond to the surge and then scaled back down once the surge had passed.