The data from Backblaze should not influence a purchasing decision by any consumer, regardless of the type of drive they are purchasing. The innumerable variables, and the lack of documentation, ensure the results are unreliable. Even for the winners, the results aren't good; the failure rates are many times higher than those observed in the real world. One should question whether these companies could survive financially if warranty return rates were this high in real-world scenarios.
We covered some of the most obvious holes in the methodology behind the Backblaze comparisons, but there are many more, such as sample size. With varying numbers of drives for each model, it is possible that some bad batches may have made their way into the sample pool, thus further skewing the numbers.
There is no clearer example of this than their blog post titled "Enterprise Drives: Facts or Fiction?". In that post, Backblaze compared 368 enterprise HDDs, presumably purchased as a single batch, to 14,719 consumer drives. Beyond the possibility that a bad batch could skew the numbers, Backblaze admits it subjected the drives to varying chassis, temperatures, and workloads. This creates data that is essentially worthless for comparative purposes, but when paired with a catchy title, it serves the purpose of attracting attention.
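The sample-size problem can be made concrete with a quick back-of-the-envelope calculation. The sketch below uses the Wilson score interval to estimate how precisely a failure rate can be pinned down at each sample size. The drive counts (368 and 14,719) come from the blog post; the 5% annual failure rate used here is purely a hypothetical illustration, not a figure from Backblaze.

```python
import math

def wilson_interval(failures, drives, z=1.96):
    """95% Wilson score confidence interval for a failure proportion."""
    p = failures / drives
    denom = 1 + z**2 / drives
    center = (p + z**2 / (2 * drives)) / denom
    half = z * math.sqrt(p * (1 - p) / drives + z**2 / (4 * drives**2)) / denom
    return center - half, center + half

# Drive counts from the Backblaze post; the 5% failure rate is a
# hypothetical illustration chosen only to compare interval widths.
for n in (368, 14719):
    lo, hi = wilson_interval(round(n * 0.05), n)
    print(f"n = {n:>6}: 95% CI for failure rate = {lo:.2%} .. {hi:.2%}")
```

With only 368 drives, the interval spans several percentage points either way, so even a sizable observed difference between the two pools could be statistical noise rather than a real gap in reliability.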
The enthusiast in me loves the Backblaze story. They are determined to deliver great value to their customers, and will go to any length to do so. Reading the blog posts about the extreme measures they took was engrossing, and I'm sure they enjoyed rising to the challenge. Their Storage Pod is a compelling design that has been field-tested extensively, and refined to provide a compelling price point per GB of storage.
It is the release of the data, packaged in handy charts and graphs that invite misrepresentation, that brings out the data-storage stickler in me. HDD manufacturers spend billions of dollars on R&D, and their labs are designed to characterize and measure the reliability and endurance of their storage solutions.
The Backblaze environment is the exact opposite. I do not believe I could dream up worse conditions under which to study and compare drive reliability. It's hard to believe anyone plotted this out and convened a meeting to outline a process: buy the cheapest drives imaginable from all manner of ridiculous sources, install them into varying (and sometimes flawed) chassis, then stack them up and subject them to entirely different workloads and environmental conditions... all with the purpose of determining drive reliability.
Of course that wasn't the intention, but that is how some will interpret the data. In my opinion, the intoxicating allure of media coverage overwhelmed common sense, and Backblaze released these numbers under a catchy title that would attract attention. The tech media is to blame as well; many outlets republished the information with little or no scrutiny. Unfortunately, the Backblaze blog post will be copy/pasted innumerable times for years to come as an authoritative source of data, when it is the furthest thing imaginable from a comprehensive study.