CrowdStrike BSOD Outage: What We Know

What we know about the CrowdStrike BSOD outage.

We've compiled key resources to provide context and help the community answer common questions about the CrowdStrike BSOD outage:

  • What happened?
  • How could CrowdStrike cause a BSOD?
  • Why does Microsoft allow access to the kernel?
  • What was the impact of the outage?
  • What are the steps for remediation?
  • Phishing URLs
  • Tracking Suspicious Domains

What happened?

On July 19th at 00:09 UTC-4, CrowdStrike pushed a detection update to its Falcon sensors on Windows. This content update included a configuration file, known as a “Channel File,” containing logic for behavioral detection. A problematic configuration file, C-00000291-00000000-00000029.sys, caused the Windows operating system to crash, leading to the widely observed blue screen of death (BSOD). Microsoft estimated that 8.5 million devices were down due to the configuration file.

💡
A Microsoft 365 outage occurred around the same time as a result of a configuration change. Early reports conflated the Microsoft outage with outages resulting from CrowdStrike’s faulty update, but the two are unrelated.
Figure 1: An example of a blue screen of death.

CrowdStrike determined that Windows machines running Falcon sensor version 7.11 and above that were online between July 19th, 2024 00:09 UTC-4 and July 19th, 2024 01:27 UTC-4 may have received the erroneous configuration file, leading to the machine crashing.

C-00000291-00000000-00000029.sys was the channel file that caused the system crashes. According to CrowdStrike, this particular update contained information used by the sensor to determine how to evaluate named pipe execution on Windows. Named pipes are used for interprocess communication, allowing processes to share data with each other through a particular pipe. Malicious use of named pipes has been observed in several popular C2 frameworks.
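
To illustrate the mechanism the channel file's detection logic evaluates, below is a minimal sketch of named pipe interprocess communication on Windows using only Python's standard library. The pipe name is hypothetical; EDR sensors watch this kind of activity because C2 frameworks abuse named pipes in the same way legitimate software uses them.

```python
# Minimal illustration of named-pipe IPC on Windows (hypothetical pipe name).
# Run server() and client() in two separate processes.
from multiprocessing.connection import Listener, Client

PIPE_ADDRESS = r"\\.\pipe\example_pipe"  # hypothetical pipe name

def server():
    # Create the named pipe and wait for a client process to connect.
    with Listener(PIPE_ADDRESS, family="AF_PIPE") as listener:
        with listener.accept() as conn:
            print("received:", conn.recv())

def client():
    # Connect to the pipe and send a message to the other process.
    with Client(PIPE_ADDRESS, family="AF_PIPE") as conn:
        conn.send("hello over a named pipe")
```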

💡
Some users shared C-00000291-00000000-00000029.sys files that consisted of null bytes only. However, according to CrowdStrike’s analysis, “the CrowdStrike Falcon® sensor wrote valid (non-zero) data to the file prior to the machine crash. The zero content observed after a reboot is an artefact of the way in which the Windows operating system manages files on disk to satisfy its security design.”
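
For anyone who collected a copy of the channel file after a crash and wants to reproduce the observation above, a short Python check is sketched below; the path matches the CrowdStrike driver directory referenced in the remediation steps later in this post.

```python
# Check whether the on-disk channel file consists only of null bytes.
from pathlib import Path

channel_file = Path(
    r"C:\Windows\System32\drivers\CrowdStrike\C-00000291-00000000-00000029.sys"
)
data = channel_file.read_bytes()
print("all null bytes" if data and set(data) == {0} else "contains non-zero data")
```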

How could CrowdStrike cause a BSOD?

In order to detect malicious kernel drivers and rootkits on Windows systems, CrowdStrike and other EDR agents must be developed as Early Launch Anti-Malware (ELAM) drivers, which run in kernel mode and are initialized before other startup drivers.

The CrowdStrike agent loaded the faulty configuration file during the Windows bootup process, which resulted in an invalid memory access attempt in kernel mode. Whereas user-mode crashes have limited impact on the underlying operating system, unhandled errors in kernel mode force a system stop to prevent possible data corruption, appearing to users as the BSOD. The subsequent bootloop was caused by the CrowdStrike agent attempting to reload the channel file each time the system restarted.

Why does Microsoft allow access to the kernel?

Whereas Apple has restricted third-party macOS software from running at the kernel level, Microsoft is required to provide security vendors with access to the kernel as a result of a 2009 antitrust agreement with the European Commission. Microsoft introduced Kernel Patch Protection in 2005 to prevent kernel “patching” by third-party security software, but made exceptions for its own security software, drawing criticism from antivirus vendors.

Figure 2: Microsoft must allow access to APIs used by Microsoft security software.
💡
Vendors that want to ship an ELAM driver are required to be part of the Microsoft Virus Initiative (MVI). Vendors need to request access to join MVI and must meet the MVI requirements. Once a vendor is part of MVI, the driver must be submitted for verification through the Windows Hardware Quality Labs submission process and must meet all ELAM driver requirements specified by Microsoft.

What was the impact of the outage?

Major organizations around the world in the transportation, financial services, and healthcare sectors reported disruptions starting on July 19th. The widespread impact reflects CrowdStrike's popularity as the EDR of choice for many organizations, with over half of the Fortune 500 companies using it (Reuters). UpGuard has shared a list of organizations that were impacted by the CrowdStrike incident.

Due to the widespread operational and financial impact, US House leaders have called on CrowdStrike CEO George Kurtz to testify before Congress (AP News).

With regard to cyber insurance coverage, Fitch Ratings says it expects insurance companies to avoid major financial impact as a result of the outage. Loretta Worters from the Insurance Information Institute said that cyber insurance does not typically cover downtime due to non-malicious events at service providers (Reuters).

Figure 3: Some of the organizations impacted by the CrowdStrike incident (UpGuard).

What are the steps for remediation?

CrowdStrike shared steps for regaining access to impacted devices, which have been documented by Qualys. These steps are a mixture of reverting to prior snapshots where available or accessing the machine and removing the problematic file.

💡
Before running any of the commands against multiple devices, we recommend testing against a single machine.

Recovery Steps for Windows Machines

1. Boot Windows into Safe Mode or the Windows Recovery Environment.
2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.
3. Locate and delete the file matching “C-00000291*.sys”.
4. Boot the host normally.
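
As a minimal sketch of step 3, assuming the host has been booted into Safe Mode and a Python interpreter is available (in practice the file is typically deleted from an elevated command prompt instead):

```python
# Delete any channel file matching the problematic pattern from the
# CrowdStrike driver directory.
from pathlib import Path

driver_dir = Path(r"C:\Windows\System32\drivers\CrowdStrike")

for channel_file in driver_dir.glob("C-00000291*.sys"):
    print(f"removing {channel_file}")
    channel_file.unlink()
```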

Recovery Steps for Machines on AWS

1. Detach the EBS volume from the impacted EC2 instance.
2. Attach the EBS volume to a new EC2 instance.
3. Fix the CrowdStrike driver folder.
4. Detach the EBS volume from the new EC2 instance.
5. Attach the EBS volume back to the impacted EC2 instance.
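
A rough boto3 sketch of the detach/attach workflow above is shown below. The region, instance IDs, volume ID, and device names are placeholders; step 3 (repairing the CrowdStrike driver folder) happens on the helper instance between the attach and detach calls.

```python
# Move the impacted root volume to a helper instance for repair, then back.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

IMPACTED_INSTANCE = "i-0123456789abcdef0"  # placeholder IDs
HELPER_INSTANCE = "i-0fedcba9876543210"
VOLUME_ID = "vol-0123456789abcdef0"

waiter = ec2.get_waiter("volume_available")

# Steps 1-2: detach the volume from the impacted instance, then attach it
# to the helper instance as a secondary device.
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=IMPACTED_INSTANCE)
waiter.wait(VolumeIds=[VOLUME_ID])
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=HELPER_INSTANCE, Device="/dev/sdf")

# Step 3: mount the volume on the helper instance and delete C-00000291*.sys
# from the CrowdStrike driver folder (done on the helper instance itself).

# Steps 4-5: detach from the helper and reattach to the impacted instance
# as its root device (device name depends on the instance type/AMI).
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=HELPER_INSTANCE)
waiter.wait(VolumeIds=[VOLUME_ID])
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=IMPACTED_INSTANCE, Device="/dev/xvda")
```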

Recovery Steps for Machines on Azure

1. Log in to the Azure console. 
2. Go to Virtual Machines and select the affected VM. 
3. In the upper left of the console, click “Connect”. 
4. Click “More ways to Connect” and then select “Serial Console”. 
5. Once SAC has loaded, type in ‘cmd’ and press Enter. 
6. Type ‘ch -si 1’ and press the space bar. 
7. Enter Administrator credentials. 
8. Type the following commands:
   ‘bcdedit /set {current} safeboot minimal’
   ‘bcdedit /set {current} safeboot network’
9. Restart the VM.
10. To confirm the boot state, run the command: ‘wmic COMPUTERSYSTEM GET BootupState’.

One of the major hurdles impacting recovery is the manual effort required to regain access to affected devices. For devices that are encrypted with BitLocker, the BitLocker recovery key will be needed to recover the device. Microsoft has also released a recovery tool to help users recover impacted devices.

Phishing URLs

While CrowdStrike worked to identify what caused systems to crash and impacted organizations attempted to continue business operations where possible, others were quick to exploit the outage for their own gain. Researchers identified several newly registered domains related to the outage (Twitter Post).

Some sites provided recovery instructions matching those published by CrowdStrike, but also included contact information encouraging impacted users to reach out.

Figure 4: Website advertising recovery support created during the outage (Twitter Post)

Another site also contained donation links through PayPal and Bitcoin.

Figure 5: A website asking for donations through cryptocurrency or PayPal (Twitter Post)
Figure 6: Another website that claims to provide a recovery tool once payment has been made via Bitcoin or Ethereum (Twitter Post)

Tracking Suspicious Domains

Potentially lookalike and suspicious domains related to the CrowdStrike Outage have been identified, aggregated, and added to the Pulsedive platform. This data can be queried in Pulsedive using the Explore query threat="CrowdStrike BSOD Impersonation" and is available for export in multiple formats (CSV, STIX 2.1, JSON).
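
For teams that want to pull this data programmatically rather than through the UI, a hedged sketch using the requests library is below. The endpoint, parameter names, and response fields are assumptions based on Pulsedive's Explore API, and the API key is a placeholder.

```python
# Query Pulsedive Explore for domains tagged with the CrowdStrike threat.
import requests

PULSEDIVE_API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.get(
    "https://pulsedive.com/api/explore.php",  # assumed Explore API endpoint
    params={
        "q": 'threat="CrowdStrike BSOD Impersonation"',  # the Explore query above
        "limit": 100,
        "pretty": 1,
        "key": PULSEDIVE_API_KEY,
    },
    timeout=30,
)
response.raise_for_status()

# Print each returned indicator value (field names assumed).
for result in response.json().get("results", []):
    print(result.get("indicator"))
```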

References