Banks and Airlines Disrupted as Mass Outage Hits Windows PCs
CrowdStrike Confirms Faulty Software Update for Falcon Sensor, Is Deploying Fix
Banks, airlines, major media firms and others are experiencing business disruptions due to a mass, global IT outage tied to Windows PCs.
Security and IT experts report that the outage appears to be due to a faulty Falcon software update released by Austin, Texas-based cybersecurity firm CrowdStrike, which leaves Windows systems displaying the dreaded "blue screen of death."
CrowdStrike said it's aware of the outages. "CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts," a spokesman told Information Security Media Group in a statement. "Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed."
Microsoft said the outages appeared to begin around 6 p.m. U.S. Eastern Time on Thursday and that it's taking "mitigation actions," including for Microsoft 365 outages. Numerous other organizations worldwide have also reported disruptions.
CrowdStrike on Friday morning detailed a workaround for the faulty software update while it prepared a more robust fix. The company has advised all customers to stay tuned.
"We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website," the CrowdStrike spokesman said. "We further recommend organizations ensure they're communicating with CrowdStrike representatives through official channels. Our team is fully mobilized to ensure the security and stability of CrowdStrike customers."
"The team at CrowdStrike takes their responsibility as a global cyber defense company seriously. I know people there working hard to restore services and fix this issue," said Ian Thornton-Trump, CISO of Cyjax. "But it does serve to remind everyone that the digital infrastructure we rely on is fragile. As custodians of IT and cybersecurity defenses we need to have conversations about resiliency. I can tell you: If 'bad vendor update' is not part of an incident response playbook, it should be on Monday."
The resulting disruptions have reportedly led to flight delays or cancellations at multiple airports in the U.K., EU, Malaysia, India and beyond. Hong Kong International Airport was left "in chaos" as staff reverted to manual procedures for checking in passengers, the South China Morning Post reported.
The U.S. Federal Aviation Administration said it's "closely monitoring a technical issue impacting IT systems at U.S. airlines," in a post to social platform X. "Several airlines have requested FAA assistance with ground stops until the issue is resolved." Delta, United and American Airlines reported being temporarily grounded, and United and American subsequently resumed some flights, albeit with delays.
In Australia, airport-goers reported that flight information screens at Sydney Airport went blank, and self-service checkout systems at supermarket chains Woolworths and Coles are displaying error messages. National broadcasters ABC and Network Ten reported outages, as did the country's Bendigo and Adelaide Bank.
In Britain, banks, rail operators and doctors' offices reported disruptions, as did Sky News, which was unable "to broadcast live TV this morning" in Britain or Australia, said David Rhodes, its executive chairman, in a post to X. "We are working hard to restore all services."
CrowdStrike Details Workaround
Not all versions of CrowdStrike Falcon are affected. "It is our understanding that any business running versions 7.15 and 7.16 are affected by the outage, but 7.17 seems to be OK," said Ajay Unni, CEO of Australian cybersecurity service firm StickmanCyber.
Multiple IT administrators report receiving this workaround from CrowdStrike's support team:
- Boot Windows into Safe Mode or the Windows Recovery Environment;
- Navigate to the C:/Windows/System32/drivers/CrowdStrike directory;
- Locate the file matching C-00000291*sys and delete it;
- Boot the host normally.
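For readers who want to see the "locate and delete" step spelled out, here is a minimal Python sketch. It is an illustration only, not a CrowdStrike tool. It assumes a Python interpreter is available on the host, which will often not be the case in Safe Mode or the Windows Recovery Environment. The directory and file pattern are copied from the workaround as quoted above. The function names and the `dry_run` flag are hypothetical, and the script defaults to listing matches without deleting anything.

```python
from fnmatch import fnmatch
from pathlib import Path

# Directory and pattern exactly as quoted in the workaround above.
DRIVER_DIR = Path("C:/Windows/System32/drivers/CrowdStrike")
CHANNEL_FILE_PATTERN = "C-00000291*sys"


def find_channel_files(directory: Path, pattern: str = CHANNEL_FILE_PATTERN) -> list[Path]:
    """Return files in `directory` whose names match `pattern`, sorted by name."""
    if not directory.is_dir():
        # Nothing to do if the sensor directory is absent on this host.
        return []
    return sorted(
        p for p in directory.iterdir() if p.is_file() and fnmatch(p.name, pattern)
    )


def remove_channel_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Return the matching files, deleting them only when dry_run is False."""
    matches = find_channel_files(directory)
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Hypothetical usage: report what would be removed, without deleting.
    for path in remove_channel_files(DRIVER_DIR, dry_run=True):
        print(f"would delete: {path}")
```

Even with a script like this, someone still has to get each machine into a recovery state first, and on encrypted disks that requires the recovery key, which is why administrators describe the fix as hands-on work.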
Ireland's National Cyber Security Center said that faulty channel file 291 is the culprit, and that CrowdStrike has replaced it with a nonfaulty version that it's now distributing via automatic update channels.
"We hope that this will mitigate further expansion" of the problem, the NCSC said. "For already crashing systems, some are rebooting to a normal working state and we believe they should pick the new channel file." But other systems "are just loop crashing and might need a manual intervention."
Applying the workaround via manual intervention might be fine in theory, but it remains difficult to implement at scale since it can't be automated, said British cybersecurity expert Kevin Beaumont in a post to Mastodon. It also remains unclear what effect deleting the file might have - for example, whether it would compromise the ability of the Falcon sensor to detect or block malicious code.
"If anybody is wondering the impact of the CrowdStrike thing - it's really bad. Machines don't boot," Beaumont said. "Basically, CrowdStrike will be in very hot water."
"IT security tools are all designed to ensure that companies can continue to operate in the worst-case scenario of a data breach, so to be the root cause of a global IT outage is an unmitigated disaster," StickmanCyber's Unni said.
Incident Response Plans Activated
Multiple IT support teams have reported that they are implementing incident response plans to deal with the outages or at least preparing for serious overtime. Some IT administrators have been taking to message boards seeking advice on workarounds as well as the potential efficacy of any update CrowdStrike might push.
"I have 40% of the Windows Servers and 70% of client computers stuck in boot loop (totaling over 1,000 endpoints), one posted to the r/crowdStrike
subreddit. "I don't think CrowdStrike can fix it, right? Whatever new agent they push out won't be received by those endpoints coz they haven't even finished booting."
"Here in the Philippines, specifically in my employer, it is like Thanos snapped his fingers," another posted. "Half of the entire organization are down due to BSOD loop. Started at 2pm and is still ongoing. What a Friday."
Many report having to immediately update large numbers of PCs, in some cases by sending teams in person. "I'm planning a weekend trip to 15 sites with all the IT staff to bring systems up one by one. Hilarious," one Australian IT administrator, who oversees about 200 Windows PCs, said in a post to Mastodon, adding that every system uses BitLocker whole-disk encryption plus a local administrator password solution.
"We're 100% bitlockered and LAPS'ed, so I have to wake every machine by hand to delete the file, AFAIK," the admin said. "Happy to accept advice on a better way."
Others reported needing to get IT hands on keyboards to deal with affected systems. "All of our work computers use bitlocker for certain government contract requirements (consulting). So no employees can do the official workaround on their own since they won't have the bitlocker recovery key," another IT administrator posted to the r/sysadmin subreddit. "So there goes the weekend I guess."
Security experts said the outage should prompt sharp questions about potential single points of failure in Windows environments.
"I'm just going to throw this out there, but maybe - just maybe - a vendor having the ability to change every one of their kernel drivers in the field at the same time without any approval from IT/end users is a model we need to reconsider ... @CrowdStrike," Jake Williams, faculty member at the Institute for Applied Network Security and a former National Security Agency elite hacking team member said in a post to X.
Update July 19, 2024 10:24 UTC: Added CrowdStrike's statement and analysis from Ireland's NCSC.
Update July 19, 2024 11:09 UTC: Added FAA's statement.