10 Largest IT Outages in History: Who Pulled the Plug?

July 1, 2025

Modern business continuity hinges on the reliability of technology.

When critical systems go down, the impact isn’t theoretical; it’s operational, financial, and reputational. IT outages have cost companies upwards of $740 million, with ripple effects that stretch far beyond immediate downtime.

When such incidents occur, the response becomes crucial, not just to create a trail of legal evidence supporting your case, but to look after customers and ensure they’re not impacted, or only minimally. Some companies employ incident response software to make strategic decisions amid chaos.

The software platform will provide support; your approach will define how effectively you control the damage caused by any IT outage. To create a practical approach, it’s essential to get to the “why” and “how” of IT outages.

Below are several examples that illustrate the largest IT outages and their impact in detail. These incidents will help you identify the common loopholes that cause IT outages, so you can plan a more realistic approach.

Largest IT outages in history at a glance

Here are the biggest IT outage incidents of the past, with their causes and impact, each of which caused downtime for several popular websites:

Year	Incident	Cause	Impact
2024	CrowdStrike update crash	A faulty security software update with a bug in the kernel driver	Affected 8.5 million Microsoft Windows devices, or less than 1% of all Windows machines.
2022	Southwest Airlines meltdown	Outdated crew scheduling software	59% of Southwest Airlines flights were canceled. The company paid $600 million in reimbursements and $140 million in fines.
2022	Rogers Canada blackout	Internal routing failure	More than 12 million customers lost wireless and wireline services.
2021	Facebook/Meta outage	Faulty network configuration	About 3.5 billion users of its combined services lost access.
2021	Fastly CDN outage	A customer configuration change triggered a software bug	85% of its services returned errors.
2020	Google services outage	Internal storage quota issue in the authentication system	Gmail, YouTube, Maps, and other services went offline globally, and users could not log in.
2019	Verizon BGP route leak	Routing misconfiguration	Cloudflare lost about 15% of its global traffic at peak.
2017	AWS S3 outage	A typo in a server command	Hundreds of websites and apps went down. S&P 500 companies alone lost close to $150 million.
2016	Dyn Domain Name System (DNS) attack	Distributed denial-of-service (DDoS) attack by the Mirai botnet	Major websites like Twitter, Netflix, and CNN were down across the US and EU.
2011	PlayStation Network outage	External hack causing a security breach	An estimated 77 million accounts were affected.

The largest IT outages in history by year

Below is an overview of the IT outages that have made it into the history books. Let’s get to them without wasting a second.

2024: CrowdStrike global IT outage

Cause: A flawed CrowdStrike Falcon Sensor update caused Windows devices to crash to a blue screen on reboot.
Impact: Around 8.5 million Windows systems crashed worldwide on July 19, 2024.

A routine CrowdStrike software update containing a critical logic error triggered a global IT outage in July 2024. The culprit was a misconfigured channel file for Windows. When it was pushed to customer devices globally, it caused any Microsoft Windows machine running Falcon, CrowdStrike’s security agent, to crash immediately. Users saw a blue screen upon reboot.

Within hours, the impact reached 8.5 million Windows PCs and servers. The effect was unprecedented. Several major U.S. airlines grounded flights as a precaution. Institutions including banks, hospitals, and government offices experienced the outage.

Although it’s difficult to put a number on the exact economic impact, Nir Perry, the CEO of cyber insurance risk platform Cyberwrite, said, “the damages could reach tens of billions of dollars.”

Interestingly, this wasn’t a cyber attack but a lapse in software quality control. It exposes the risks associated with centralized software updates and the need for better patch management practices.
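One common way to limit the blast radius of a centralized update is a staged rollout, where the update reaches a small ring of machines first and only continues if that ring stays healthy. The sketch below is illustrative only: the ring names, the `crash_rate_for` telemetry callback, and the 1% threshold are assumptions, not a description of CrowdStrike’s actual release process.

```python
from typing import Callable, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    crash_rate_for: Callable[[Sequence[str]], float],
    max_crash_rate: float = 0.01,  # hypothetical threshold
) -> list[str]:
    """Push an update ring by ring and halt once a ring looks unhealthy.

    Returns every host that received the update.
    """
    updated: list[str] = []
    for ring in rings:
        updated.extend(ring)  # deploy to this ring only
        if crash_rate_for(ring) > max_crash_rate:
            break  # stop here: later rings never receive the update
    return updated


# Hypothetical fleet split into three rings of increasing size.
rings = [
    ["canary-1", "canary-2"],
    ["office-1", "office-2", "office-3"],
    ["fleet-1", "fleet-2"],
]

# Simulated telemetry for a bad update: every updated host crashes.
print(staged_rollout(rings, lambda ring: 1.0))
```

With a bad update, only the two canary hosts are affected; the rest of the fleet is never touched.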

2022: Southwest Airlines meltdown

Cause: A legacy computer system failure in Southwest’s crew scheduling software.
Impact: 59% of Southwest Airlines flights were canceled.

While weather conditions can understandably cause flight delays, the mass cancellations by Southwest weren’t primarily weather-related. Other airlines facing the same winter storm managed to resume normal operations relatively quickly, unlike Southwest.

To illustrate, Southwest canceled 59% of its flights, compared to only 3% canceled by other major carriers. Southwest itself has admitted that the widespread cancellations and delays since December 24 stemmed from internal issues within the airline’s control.

The meltdown affected countless travelers, causing them to miss family gatherings. Many customers were frustrated because they were unable to reach Southwest representatives for help. In the aftermath, Southwest accelerated plans to upgrade its technology.

The company was fined a record $140 million (£110 million) by the US Department of Transportation (DOT). In addition, the company paid around $600 million in reimbursements to passengers.

2022: Rogers Canada blackout

Cause: A maintenance update deleted a routing filter, causing Rogers’ core IP routers to overload and crash.
Impact: More than 12 million customers lost wireless and wireline services.

On July 8, 2022, Canada’s largest telecom provider, Rogers Communications, experienced a catastrophic outage that affected a wide range of services across the country. It began when Rogers was implementing a scheduled update to upgrade its core IP network.

A technician made an error and removed a critical BGP routing filter on the core network distribution routers. This resulted in the complete collapse of the Rogers network. Because Rogers is such a prominent provider, the failure cut off connectivity for a large part of Canada. Even some 911 emergency calls failed, raising concerns about public safety.

A later government report found the company lacked proper network redundancy and had tied both wireless and broadband services to the same core infrastructure, making the failure “extreme”.
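A simple safeguard against this kind of mistake is a pre-deployment check that refuses any change that drops a protected routing filter. The following is a minimal sketch under stated assumptions: the filter names and the `PROTECTED_FILTERS` inventory are hypothetical, and nothing here reflects Rogers’ real tooling.

```python
# Hypothetical filter names; a real network would load these from its own policy inventory.
PROTECTED_FILTERS = frozenset({"core-dist-inbound", "core-dist-outbound"})


def removed_protected_filters(before: set[str], after: set[str]) -> set[str]:
    """Return protected filters that exist before the change but not after it."""
    return (before - after) & PROTECTED_FILTERS


def check_change(before: set[str], after: set[str]) -> None:
    """Raise ValueError if the change would delete a protected filter."""
    removed = removed_protected_filters(before, after)
    if removed:
        raise ValueError(f"change removes protected routing filters: {sorted(removed)}")


current = {"core-dist-inbound", "core-dist-outbound", "lab-test"}

check_change(current, current - {"lab-test"})  # allowed: only an unprotected filter goes away
try:
    check_change(current, current - {"core-dist-inbound"})
except ValueError as err:
    print(err)
```

The check is deliberately dumb: it compares two sets of names and does not try to understand routing policy, which keeps it cheap enough to run on every change.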

2021: Meta/Facebook global outage

Cause: A faulty configuration change on Facebook’s backbone network disconnected Facebook’s data centers from the internet.
Impact: About 3.5 billion users of its combined services lost access.

On October 4, 2021, the social media giant Facebook (now Meta) experienced a historic outage, affecting users worldwide. Facebook’s internal network was undergoing routine maintenance. An engineer issued a command to update the network configuration and unintentionally took down all BGP routes to Facebook’s DNS servers.

The outage lasted about 5.5 hours before the team could manually restore the networking equipment. Close to 3.5 billion Facebook, Instagram, and WhatsApp users were cut off from the platforms.

The cause of this incident was the misconfiguration of backbone routers and the failure of an audit tool that should have caught the error.
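To make the idea of an audit check concrete, here is a minimal sketch of one that blocks a change if it would withdraw too large a share of the currently advertised routes. It is an assumption-laden illustration, not Facebook’s tool: the function name, the 20% threshold, and the example prefixes (taken from the documentation address ranges) are all hypothetical.

```python
def change_is_safe(
    advertised_now: set[str],
    advertised_after: set[str],
    max_withdrawn_fraction: float = 0.2,  # hypothetical policy threshold
) -> bool:
    """Return True if the change withdraws an acceptable share of routes."""
    if not advertised_now:
        return True  # nothing is advertised, so nothing can be withdrawn
    withdrawn = advertised_now - advertised_after
    return len(withdrawn) / len(advertised_now) <= max_withdrawn_fraction


# Example prefixes from the reserved documentation ranges.
dns_prefixes = {
    "192.0.2.0/25",
    "192.0.2.128/25",
    "198.51.100.0/25",
    "198.51.100.128/25",
    "203.0.113.0/24",
}

print(change_is_safe(dns_prefixes, dns_prefixes - {"203.0.113.0/24"}))  # one of five withdrawn
print(change_is_safe(dns_prefixes, set()))  # everything withdrawn
```

The first call passes because one of five prefixes is exactly at the threshold; the second fails because the change would withdraw every route.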

2021: Fastly CDN outage

Cause: Software bug in Fastly’s CDN code, triggered by a customer’s valid configuration change.
Impact: 85% of its services returned errors, taking down high-profile websites like The Guardian, CNN, and some streaming platforms.

A cloud content delivery network (CDN) outage demonstrated how one glitch could take down a large chunk of the web. Fastly, a top CDN provider, experienced a global outage on June 8, 2021. It affected several high-profile websites and some government websites in the UK.

The engineers were able to detect the problem and identify the configuration that caused it. The incident raised awareness of how much of the web relies on a few CDN providers and prompted companies to revisit redundancy for their web infrastructure.

Here are some quick takeaways and learnings from the Fastly CDN outage:

Diversify delivery services. Consider two or more CDNs for optimal delivery. It reduces the impact on you when one CDN faces a service disruption.
Create a backup plan. Ensure visibility into signs of issues and know when to activate backup procedures.
Understand your dependencies. Consider the hidden ones and even the indirect dependencies. If you rely on external services for website or app components, understand dependencies like DNS, hosting, etc.
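The first two takeaways can be sketched as a small failover routine: keep an ordered list of CDNs and serve from the first one that passes a health check. This is a sketch under stated assumptions; the hostnames are placeholders, and `is_healthy` stands in for whatever probe (HTTP check, synthetic monitoring) a team actually uses.

```python
from typing import Callable, Optional, Sequence


def pick_cdn(
    cdn_hosts: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy CDN in priority order, or None if all are down."""
    for host in cdn_hosts:
        if is_healthy(host):
            return host
    return None  # no CDN available: time to activate the backup plan


# Placeholder hostnames in priority order.
cdns = ["cdn-a.example.com", "cdn-b.example.com"]

# Simulated probe: the primary CDN is down.
down = {"cdn-a.example.com"}
print(pick_cdn(cdns, lambda host: host not in down))
```

Passing the health check in as a function keeps the selection logic testable without touching the network.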

There are several other factors to consider when taking steps to address IT outages. Most importantly, how you handle the incident speaks volumes about your commitment to serving your customers.

On the tech side, if you have incident management software onboard, it will help you respond to, report, and investigate digital incidents, and keep things in sync when everything suddenly descends into chaos.

2020: Google service outage

Cause: An authentication system bug caused by an internal storage quota issue.
Impact: Google services that require a login were unreachable worldwide.

On December 14, 2020, a wide range of Google services went down abruptly. Gmail, YouTube, Docs, Maps, Calendar, and even Nest smart home services stopped working. The search engine giant later confirmed the cause of the outage. It was an issue in the central identity management system.

The internal storage quota was exhausted in the system that handles user authentication. As a result, Google’s login and account APIs failed. They were globally inaccessible for roughly 45 minutes. During this time, only the search engine remained up because it didn’t require a login.

Although it lasted under an hour, the outage’s impact was immense due to Google’s ubiquity. It underscored how a hidden single point of failure, in this case an internal quota configuration, could disrupt the daily workflow of billions, from businesses to schools and consumers.
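Quota exhaustion is easier to catch when usage is compared against the limit well before it runs out. The sketch below classifies quota usage into alert levels; the 80% and 95% thresholds and the function name are illustrative assumptions, not how Google monitors its systems.

```python
def quota_status(
    used: float,
    limit: float,
    warn_at: float = 0.80,  # hypothetical warning threshold
    page_at: float = 0.95,  # hypothetical paging threshold
) -> str:
    """Classify quota usage as 'ok', 'warn', or 'page'."""
    if limit <= 0:
        # A zero or negative limit is itself a misconfiguration worth surfacing.
        raise ValueError("quota limit must be positive")
    ratio = used / limit
    if ratio >= page_at:
        return "page"
    if ratio >= warn_at:
        return "warn"
    return "ok"


print(quota_status(used=50, limit=100))
print(quota_status(used=85, limit=100))
print(quota_status(used=99, limit=100))
```

Treating a non-positive limit as an error matters here, because a quota that has been wrongly set to zero would otherwise look like a division problem rather than a configuration problem.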

2019: Verizon BGP Route Leak

Cause: A misconfigured BGP optimizer at a small internet service provider (ISP), compounded by Verizon’s lack of route filters.
Impact: Internet traffic was misrouted, causing outages and slowdowns for Cloudflare, Amazon, Facebook, and others. Cloudflare saw a 15% drop in global traffic during the incident.

On June 24, 2019, a BGP mishap demonstrated how fragile the internet’s routing system can be. It started when a small Pennsylvania ISP, using a BGP optimization tool (Noction), leaked thousands of improper routes to its upstream provider, Verizon.

Verizon, one of the largest internet backbone providers, propagated these routes globally instead of filtering them out. The result was an internet traffic jam: large portions of traffic destined for big services were erroneously routed through the networks of the small ISP (DQE) and Verizon, and then dropped or slowed because those networks couldn’t handle it. Cloudflare reported a 15% loss of its global traffic at its worst point.

The incident lasted several hours on Monday morning before the bad routes were corrected.
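Route filtering of the kind Verizon lacked usually combines two checks: a cap on how many prefixes a neighbor may announce, and an allow-list of the prefixes that neighbor is entitled to announce. The sketch below shows both using Python’s standard `ipaddress` module. The limit of 100, the function name, and the example prefixes (from the documentation ranges) are assumptions for illustration; real routers implement this in their own configuration language.

```python
import ipaddress


def filter_announcements(
    announced: list[str],
    allowed: list[str],
    max_prefixes: int = 100,  # hypothetical per-neighbor limit
) -> list[str]:
    """Accept only announcements that fall inside the neighbor's allowed prefixes."""
    if len(announced) > max_prefixes:
        # A sudden flood of routes is treated as a leak: reject the whole batch.
        raise RuntimeError("max-prefix limit exceeded")
    allowed_nets = [ipaddress.ip_network(p) for p in allowed]
    accepted = []
    for prefix in announced:
        net = ipaddress.ip_network(prefix)
        # subnet_of() requires matching IP versions, so compare versions first.
        if any(net.version == a.version and net.subnet_of(a) for a in allowed_nets):
            accepted.append(prefix)
    return accepted


# The neighbor may only announce routes inside 192.0.2.0/24.
print(filter_announcements(["192.0.2.0/25", "198.51.100.0/24"], ["192.0.2.0/24"]))
```

The second announcement is dropped because it lies outside the allow-list, which is exactly the kind of leaked route that would otherwise be propagated upstream.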

2017: Amazon Web Services S3 Outage

Cause: An AWS engineer mistakenly removed too many servers during a routine procedure.
Impact: Four-hour outage of AWS S3 storage in one region, with cascading failures across many apps.

On February 28, 2017, Amazon’s widely used cloud storage service, Simple Storage Service (S3), went down in the N. Virginia region due to a simple mistake. An AWS team member, while debugging the billing system, ran a maintenance command with the wrong parameter, removing a far larger set of servers than intended. This triggered a cascade: critical S3 index and placement subsystems lost capacity and had to be restarted, a process that took hours.

For about four hours, S3 was unable to serve requests in that region. Popular websites and apps like Quora, Slack, Medium, Trello, Business Insider, and Docker Hub became unavailable or severely degraded. Even AWS’s own status dashboard failed, since its icons were stored on S3. The economic impact was substantial.

One analysis estimated that S&P 500 companies alone lost $150 million due to the incident, not counting the numerous startups and third-party services also affected.
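A mistyped parameter is less dangerous when the tool itself refuses to remove too much capacity at once. Here is a minimal sketch of such a guard, assuming a hypothetical `plan_removal` helper with a per-command batch limit and a minimum number of servers that must stay active. The names and limits are illustrative and are not taken from AWS tooling.

```python
def plan_removal(
    active: set[str],
    requested: set[str],
    min_active: int,
    max_batch: int,
) -> set[str]:
    """Validate a capacity-removal request and return the servers to remove."""
    unknown = requested - active
    if unknown:
        raise ValueError(f"not active servers: {sorted(unknown)}")
    if len(requested) > max_batch:
        raise ValueError(f"refusing to remove {len(requested)} servers at once (limit {max_batch})")
    if len(active) - len(requested) < min_active:
        raise ValueError(f"removal would leave fewer than {min_active} active servers")
    return requested


# Hypothetical fleet of ten index servers.
fleet = {f"index-{n}" for n in range(10)}

print(sorted(plan_removal(fleet, {"index-0", "index-1"}, min_active=6, max_batch=2)))
try:
    plan_removal(fleet, fleet, min_active=6, max_batch=2)  # the "typo" case
except ValueError as err:
    print(err)
```

The guard turns a silent, fleet-wide removal into an immediate error that the operator has to read before doing anything else.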

2016: Dyn DNS Attack

Cause: DDoS attack by the Mirai IoT botnet.
Impact: Major websites, including Twitter, Netflix, and Reddit, went down across the US and Europe.

On October 21, 2016, a DDoS attack on DNS provider Dyn disrupted internet access on a massive scale. Dyn’s role was to translate domain names to IP addresses for many popular sites. Beginning that morning, a Mirai botnet, comprising hundreds of thousands of malware-infected IoT devices, bombarded Dyn’s DNS servers with fake lookup requests, overwhelming them.

Dyn estimated that roughly 100,000 malicious endpoints were targeting its infrastructure, with traffic peaking at 1.2 Tbps, roughly twice the size of any previous DDoS on record at the time. The attack came in waves and knocked major services like Twitter, Netflix, Reddit, PayPal, CNN, and even The Guardian’s website offline for users across the U.S. and Europe.
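A quick back-of-the-envelope calculation shows why so many small devices add up. Dividing the reported peak by the reported endpoint count gives the average traffic each device would need to send. This assumes the load was spread evenly, which real attacks are not, so treat it as a rough illustration only.

```python
def average_mbps_per_endpoint(peak_tbps: float, endpoints: int) -> float:
    """Average per-endpoint traffic in Mbps, assuming an even spread."""
    bits_per_second = peak_tbps * 1e12  # 1 Tbps = 10^12 bits per second
    return bits_per_second / endpoints / 1e6  # convert bits per second to Mbps


avg = average_mbps_per_endpoint(peak_tbps=1.2, endpoints=100_000)
print(f"{avg:.0f} Mbps per endpoint")
```

That works out to about 12 Mbps per device, well within reach of an ordinary home broadband connection.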

2011: PlayStation Network Outage

Cause: External hack and data breach, forcing Sony to shut down PSN servers.
Impact: More than 77 million users had their personal data compromised.

In April 2011, Sony’s PlayStation Network (PSN) suffered one of the longest gaming service outages in history. Hackers infiltrated PSN between April 17 and 19, stealing account data like usernames, passwords, and possibly credit card information from over 77 million users.

In response, Sony completely shut down PSN on April 20 to contain the breach. The network remained down until May 14, leaving PlayStation gamers unable to access online games.

This incident, essentially a massive cyberattack, cost Sony an estimated $171 million in remediation and security enhancements.

A quick takeaway, before your service takes away

These incidents suggest that cyber attacks aren’t the primary cause of the majority of IT outages, no matter how often they’re blamed. You don’t necessarily need malicious attackers when humans and (unintentionally) misconfigured settings can do the job for them.

This highlights the importance of addressing internal security gaps, not just implementing defenses against external threats. Neither takes priority; take both internal and external security equally seriously to keep your systems up.

Learn more about incident response and make security incidents less chaotic.
