Today (8 June 2021), many high-profile websites were unavailable for around an hour because they all relied on a Content Delivery Network (CDN) called Fastly, which itself became unavailable. Some headlines said that this was a cautionary tale about the fragility of the web. But is this true? What exactly happened? And what is a CDN anyway?
What is a Content Delivery Network?
A CDN is primarily a service that someone running a website can use to send you their web pages (and other data) really quickly.
Anyone can host a website. They can do it from a PC in their basement, with the downside being that if there’s a flood or the power goes out, then the website goes offline. Or they can use a few of someone else’s computers in a non-flooding location, with a good power supply, and with specialist experience to run them. Amazon Web Services, Google Cloud and Microsoft Azure are popular hosting services from the tech giants, and there are hundreds of other smaller companies who can do the same.
But no matter where someone hosts a website, they face a challenge if lots of people want to access it quickly. The few computers they have will get swamped trying to serve all the people who want their content. You might be on the other side of the world to the website, where it takes a bit longer for data to reach you, and those small delays can add up to a slow site. It also means the core web server is very open to being targeted by a malicious person who wants to bring it down.
A solution to all these issues is to use a CDN like Fastly.
As an example, when Buzzfeed publishes a listicle of the 100 Best Stand-up Comedians (or whatever) to its web servers, it tells Fastly about these new webpages. Fastly then copies the relevant content to its vast number of computers all around the world that sit on the edge of the internet, much closer to the website’s readers. These locations are known as “Points of Presence” (POPs).
So when you really want to know who the 41st best stand-up is, it’ll come from your nearest POP. And of course it’s important to know quickly because the more content you view, the more ads you might click on, and the more dollars Buzzfeed sees. Those small delays cost businesses money.
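To make the copy-then-serve idea concrete, here is a toy sketch in Python. It is not how Fastly is built; the names ToyCDN, PointOfPresence, publish and serve, the POP names and the latency figures are all invented for illustration. It only shows the two steps described above: content is copied to every POP, and each reader is then served from whichever POP is closest to them.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PointOfPresence:
    """One edge location holding copies of a website's content (hypothetical)."""
    name: str
    content: dict[str, str] = field(default_factory=dict)


@dataclass
class ToyCDN:
    """A deliberately tiny stand-in for a CDN; real ones are far more complex."""
    pops: list[PointOfPresence]

    def publish(self, path: str, body: str) -> None:
        # The website tells the CDN about a new page; the CDN copies it to every POP.
        for pop in self.pops:
            pop.content[path] = body

    def serve(self, path: str, reader_latency_ms: dict[str, float]) -> tuple[str, str]:
        # "Nearest" is taken to mean the POP with the lowest latency to this reader.
        nearest = min(self.pops, key=lambda pop: reader_latency_ms[pop.name])
        return nearest.name, nearest.content[path]


cdn = ToyCDN([PointOfPresence("london"), PointOfPresence("sydney")])
cdn.publish("/best-comedians", "<html>100 Best Stand-up Comedians</html>")

# A reader in the UK: made-up latencies, in milliseconds, to each POP.
pop_name, page = cdn.serve("/best-comedians", {"london": 12.0, "sydney": 280.0})
print(f"Served from {pop_name}")
```

Notice that the website's own servers are not involved at all when the reader asks for the page, which is why a CDN takes both the load and the distance out of the picture.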
Why did Fastly go down?
Well, right now we don’t really know. All we have is this tweet from Fastly and their status page about the incident, neither of which is very illuminating.
But we can make an informed guess. Citing “a service configuration” hints at human error rather than a cyberattack or anything more serious. And this is entirely understandable. Running a CDN isn’t a simple business. Managing thousands of clients uploading millions of pieces of content to their POPs all around the globe is a huge challenge. The way the network is built and configured is complex. So when Fastly want to make a change to that setup (which they do frequently), there’s always a risk that something will go wrong.
Is it a big deal if Fastly goes down?
Well if you just look at today, then yes! Because Fastly is used by a lot of big companies to get their content to you, there can be a lot of disruption if it’s not working.
But over time, it’s not really that bad. A CDN going down is a relatively rare event. It just seems chaotic because it has a big impact on many companies all at once. If centralised CDNs didn’t exist, the alternative would be for everyone to try to run their own. But that would be expensive and, much like trying to run a website out of your basement, certainly wouldn’t be as reliable as using a specialist company.
And although a CDN like Fastly is a complex beast, copying things to be closer to the users is actually one of the simplest parts of running a website. There are plenty of other CDNs available. You have the likes of Amazon, Google and Microsoft, but also specialist companies like Cloudflare and Akamai. All these services tend to operate in a similar way.
If Fastly disappeared tomorrow, most websites could switch to another CDN. And if uptime is really important, companies can make sure they have a backup CDN ready to go. For example, we discovered that the UK government website gov.uk uses Fastly, but it also has Amazon and Google in reserve, which could be up and running within an hour.
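The idea of a backup CDN can be sketched as a simple "try the first provider, then the next" loop. This is an illustration only and makes no claim about how gov.uk or anyone else actually does it; in practice the switch is usually made by repointing DNS rather than per request. The names fetch_with_fallback, CDNUnavailable, primary_cdn and backup_cdn are hypothetical.

```python
from __future__ import annotations

from typing import Callable, Sequence

# A fetcher takes a page path and returns the page body (hypothetical interface).
Fetcher = Callable[[str], str]


class CDNUnavailable(Exception):
    """Raised by a fetcher when its CDN cannot serve the request."""


def fetch_with_fallback(path: str, providers: Sequence[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each provider in order and return (provider name, page body)."""
    failures = []
    for name, fetch in providers:
        try:
            return name, fetch(path)
        except CDNUnavailable as exc:
            # Remember why this provider failed, then move on to the next one.
            failures.append(f"{name}: {exc}")
    raise CDNUnavailable("all providers failed (" + "; ".join(failures) + ")")


def primary_cdn(path: str) -> str:
    # Simulates the main CDN being down.
    raise CDNUnavailable("service configuration error")


def backup_cdn(path: str) -> str:
    return f"content for {path}"


used, body = fetch_with_fallback("/", [("primary", primary_cdn), ("backup", backup_cdn)])
print(f"Served by {used}")
```

The catch is that a backup only helps if it already holds a copy of the content and is tested regularly, which is part of why not every website bothers.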
So is the web vulnerable to one company?
The world wide web and its underlying Internet are technological marvels, but they were not designed in detail like a skyscraper or a car. They have many important basic principles, but have grown much more like a city without much planning. There are far worse horrors buried beneath the streets than a CDN occasionally going down.
A clear single point of failure is that the tech giants now hold entire businesses in their infrastructure. You can imagine a doomsday scenario where all of one provider’s data and websites are deleted, bankrupting thousands of companies. In practice, there are significant commercial, technical and organisational controls to make that very unlikely.
The bigger threats are a little more subtle. For example, in 2016 a single developer managed to break the way thousands of websites are built by deleting a piece of simple code that they were all using.
And because the Internet fundamentally relies on trust to get data between two points, it’s fairly easy (either by accident or design) to hijack the route data takes. Recently there have been high-profile incidents where data destined for somewhere else was redirected to Russia and China.
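One way such a hijack can work is sketched below. Networks announce which blocks of addresses they can reach, and traffic follows the most specific matching announcement without checking who really owns the addresses. This is a heavily simplified model: real route selection weighs many more factors, and the incidents mentioned above may have worked differently. The function choose_route and the labels are hypothetical, and the addresses come from a range reserved for documentation.

```python
from __future__ import annotations

import ipaddress


def choose_route(destination: str, announcements: list[tuple[str, str]]) -> str:
    """Return who receives traffic for `destination`.

    `announcements` is a list of (address block, announcer) pairs. The most
    specific block containing the destination wins, and nothing here verifies
    that the announcer is entitled to that block.
    """
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, announcer in announcements:
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network.prefixlen, announcer))
    if not matches:
        raise LookupError(f"no route to {destination}")
    # A longer prefix length means a smaller, more specific block.
    return max(matches)[1]


announcements = [
    ("203.0.113.0/24", "legitimate network"),
    # A bogus, more specific announcement covering half of the same block.
    ("203.0.113.0/25", "hijacker"),
]

print(choose_route("203.0.113.10", announcements))
```

In this toy model the address 203.0.113.10 falls inside both blocks, so the narrower bogus announcement takes the traffic, while an address in the other half of the block, such as 203.0.113.200, still reaches the legitimate network.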
So yes, in many ways the web and the Internet are fragile. It is possible for an individual, company or nation state to break pieces of them. It’s happened on occasion, and no doubt there will be some more spectacular outages and issues in future.
But again if you step back and look over time, the web has actually been amazingly resilient. It’s a global information and service exchange that has been running in some form for over 30 years and never collapsed.
That said, it is right to ask questions about this amazing technology’s vulnerabilities and who has influence over it. On whatever level each of us can, we should all try to ensure that the choices we make and the services we use move things in the right direction.