Billions of users all over the world experienced a major loss of service from both Facebook and WhatsApp on Monday, 4th October. There were so many searches for information on the outage that the topic was soon trending on the web. National media carried the story on their front pages - but provided little insight into the question of "what's going on here?". Here is a plain English explanation:
Connectivity on the internet relies on a resource called the Domain Name System (or DNS for short). Think of it as a massive address book - it connects a name (like www.facebook.com) to a computer (often referred to as a "server"). That computer has an address (like a telephone number) and the DNS provides the link between the website name and the server's address. It's similar to the way you use your phone: you tap a contact, the phone dials a number, and the phone network uses the number to connect you with the person you want to speak to. So when you click a Facebook link, the DNS looks up the address of a Facebook server, your device connects to that address, and off you go.
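For readers who like to see the idea in code, here is a minimal sketch of the address book in Python. It is a toy, not how the real DNS is built: the names, the addresses and the look_up function are all made up for illustration, and the addresses come from a range reserved for documentation.

```python
# Toy "address book". Every name and address here is hypothetical;
# 192.0.2.x is a range reserved for documentation, not a real service.
ADDRESS_BOOK = {
    "www.example-social.test": "192.0.2.10",
    "chat.example-social.test": "192.0.2.20",
}


def look_up(name):
    """Return the address recorded for a name, like tapping a contact."""
    try:
        return ADDRESS_BOOK[name]
    except KeyError:
        # No entry in the book: we have no number to dial.
        raise LookupError(f"no address found for {name}") from None


print(look_up("www.example-social.test"))
```

On a real computer the equivalent lookup is handled by the operating system; in Python, for example, socket.gethostbyname asks the real DNS for the address behind a name.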
It gets a little more involved on the internet: a company like Facebook has masses of servers and needs to break them down into manageable groups, so it has what are known as subdomains that it operates internally. These subdomains help Facebook find the right place for the type of activity you want to perform (for instance, whether you want to post on Facebook, use Messenger or chat on WhatsApp). If the computer managing the subdomains that connect you to the service you are trying to reach is "down", it's like calling a phone that's switched off - there's no-one there to speak to and you see a message like the one at the top of this post.
That's what happened at Facebook - they messed up their own DNS (apparently through human error, not technical malfunction) and although their computers were not down, they were all effectively "ex-directory".
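The "ex-directory" situation can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the name, the address and the visit function are hypothetical, and the real outage involved far more machinery than a single missing entry.

```python
# Hypothetical name and address, for illustration only.
servers_running = {"192.0.2.10": True}   # the computer itself is fine
address_book = {"www.example-social.test": "192.0.2.10"}


def visit(name):
    """Try to reach a site by name and report what a visitor experiences."""
    address = address_book.get(name)
    if address is None:
        return "cannot find the site"          # no entry: "ex-directory"
    if not servers_running.get(address, False):
        return "site found but not answering"  # like a phone switched off
    return "connected"


before = visit("www.example-social.test")
del address_book["www.example-social.test"]   # the directory entry disappears
after = visit("www.example-social.test")

print(before, "->", after)  # connected -> cannot find the site
```

Note that servers_running never changes: the server is still up, but nobody can find it by name.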
Even an organisation like Facebook - whose business is based around sophisticated technology - can get it wrong. But there are specialists out there who are asking some important questions about how Facebook manages its DNS, and why it did not use several available techniques to build in resilience, which would have prevented the widespread outage.