CrowdStrike, the cybersecurity company behind one of the biggest IT outages in history, if not the biggest, has explained where it went wrong after pushing out an update to its anti-malware software that crashed a staggering 8.5 million Windows machines.
The update was released on Friday, July 19, and threw the affected machines into infinite boot loops with blue screens of death, disrupting many parts of everyday life, including airlines, supermarkets, telecommunications, emergency services, and more. Until now, CrowdStrike had been quiet on how the fault slipped past its internal testing. It has now explained that in a new update.
Here's how it works. CrowdStrike's Falcon Sensor, the software that was updated and ultimately led to the global outage, uses what is called "Sensor Content," which defines what Falcon Sensor is capable of. On top of that, the software receives "Rapid Response Content," which is designed to let it detect and collect information on new threats.
Going further into the weeds: Sensor Content relies on what are called "Template Types," which are pieces of code containing pre-defined fields for threat detection engineers to use with Rapid Response Content. The Rapid Response Content, or the new detection information for the software, is delivered in "Template Instances," which can change the behavior of the software, such as by giving it improved detection, identification, and prevention capabilities.
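To make the relationship between the two concrete, here is a minimal Python sketch. It is purely illustrative and is not CrowdStrike's code: the class names, the field names ("pipe_name", "process_name"), and the matching logic are all assumptions invented for this example. The idea it shows is simply that a Template Type fixes which fields exist, while a Template Instance fills some of those fields with values to look for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateType:
    """Hypothetical stand-in: ships with the sensor and fixes the available fields."""
    name: str
    fields: tuple[str, ...]


@dataclass(frozen=True)
class TemplateInstance:
    """Hypothetical stand-in: delivered as content, fills in fields of one type."""
    template_type: TemplateType
    values: dict[str, str]

    def matches(self, event: dict[str, str]) -> bool:
        # An event matches when every value set in this instance is present in it.
        return all(event.get(field) == want for field, want in self.values.items())


# Field names here are made up for illustration only.
ipc_type = TemplateType(name="IPC", fields=("pipe_name", "process_name"))
rule = TemplateInstance(ipc_type, {"pipe_name": "example_pipe"})

suspicious = {"pipe_name": "example_pipe", "process_name": "demo.exe"}
harmless = {"pipe_name": "other_pipe", "process_name": "demo.exe"}
print(rule.matches(suspicious), rule.matches(harmless))
```

In this toy model, shipping a new `TemplateInstance` changes what gets detected without changing the `TemplateType` code itself, which mirrors the split the article describes between Sensor Content and Rapid Response Content.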
CrowdStrike introduced the "InterProcessCommunication (IPC) Template Type" in February 2024. On March 5, it passed testing, and the first IPC Template Instance was released. Three additional IPC Template Instances were released between April 8 and April 24, and two more on July 19. One of those last two contained the faulty data that brought down 8.5 million PCs.
CrowdStrike says the IPC Template Instance containing what it describes as "problematic content data" was rolled out publicly because the problem wasn't picked up internally, due to "a bug in the Content Validator," CrowdStrike's own internal code-testing software.
Unfortunately, CrowdStrike didn't elaborate on what this bug was, or on what the Content Validator even is. We can only assume its role is what its name suggests: validating any content it is given and looking for faults and errors.
In this outage, the Content Validator missed the bad data, and that data caused machines to execute code that triggered an "out-of-bounds memory read," resulting in critical boot failures and 8.5 million blue screens of death.
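For readers unfamiliar with the term, an out-of-bounds read happens when code tries to read past the end of the data it was given. The Python sketch below is a generic illustration under stated assumptions, not a reconstruction of CrowdStrike's bug: the function names, the field counts, and the validation rule are all hypothetical. It shows how a check run before release could reject content that would later be read out of bounds. Python raises an IndexError in this situation, whereas low-level code running inside the operating system can crash the whole machine.

```python
def validate_content(expected_fields: int, values: list[str]) -> bool:
    """Hypothetical pre-release check: content must supply every expected field."""
    return len(values) == expected_fields


def read_field(values: list[str], index: int) -> str:
    """Read one field without any bounds check, as unguarded code would."""
    return values[index]


EXPECTED_FIELDS = 3  # made-up number for illustration

good_content = ["a", "b", "c"]
bad_content = ["a", "b"]  # one field short

for content in (good_content, bad_content):
    if validate_content(EXPECTED_FIELDS, content):
        print("accepted:", read_field(content, EXPECTED_FIELDS - 1))
    else:
        print("rejected before release:", content)

# Skipping validation and reading the last expected field goes out of bounds.
try:
    read_field(bad_content, EXPECTED_FIELDS - 1)
except IndexError as err:
    print("out-of-bounds read:", err)
```

If a validator like this has a bug of its own and wrongly accepts the short content, nothing else stands between the bad data and the unguarded read, which is the general shape of the failure described above.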