Recently Intel invited members of the media, including TweakTown, to their Folsom campus for a deep dive on the inner workings of their SSDs. There was a constant running theme of Intel's commitment to reliability during our visit. Intel considers reliability at every step in the design, engineering, and validation process. This concurrent optimization extends to all levels of SSD design and quality control.
The fruit of this outstanding effort is possibly the lowest field failure rate in the industry. We have to throw in a big disclaimer; no other SSD manufacturer releases official numbers so it is impossible to ascertain who the best really is. Official field reliability data is rare due to industry silence on the matter, and this did not start with SSDs. HDD manufacturers have notoriously kept reliability data secret for years, leading many to believe that the real failure numbers are almost nightmarish.
For years we have relied upon anecdotal failure rate data provided by a European SSD reseller as a guideline, and even with that somewhat inaccurate data, Intel has always had the lowest return rate. The problem with relying upon return rate data is that not all returns are actually due to a failure of the storage device. Many returns are simply the result of user error, which is why the ARR (Annual Return Rate) on the chart is higher than the actual failure rate (AFR, or Annual Failure Rate).
At 1%, SSDs enjoy a much lower failure rate than HDDs, which (unofficially) fall into the 5-6% range. Intel's goal is to fall below 0.73% for client products and 0.45% for datacenter products. The graph above reveals that Intel has achieved much lower failure rates than its own stringent goals: the AFR is less than 0.25% over a sample size of over 2 million drives. Reliability is not something Intel just preaches to others. The company switched all Intel employees from HDDs to SSDs and saw a 5X reduction in failure rates.
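To put those percentages into perspective, here is a minimal sketch that converts an annual failure rate into an expected number of failed drives per year. The function name is our own, and the calculation assumes every drive in the sample is in service for a full year, which is a simplification and not Intel's stated methodology.

```python
def expected_annual_failures(drive_count: int, afr_percent: float) -> float:
    """Expected number of drives failing in one year at a given AFR (in percent)."""
    return drive_count * afr_percent / 100.0


# Sample size quoted in the article; assumes all drives run for the full year.
FLEET_SIZE = 2_000_000

# AFR figures quoted in the article (goals and the measured upper bound).
RATES = [
    ("client goal", 0.73),
    ("datacenter goal", 0.45),
    ("measured upper bound", 0.25),
]

for label, afr in RATES:
    failures = expected_annual_failures(FLEET_SIZE, afr)
    print(f"{label} ({afr}% AFR): about {failures:,.0f} drives per year")
```

Under those assumptions, an AFR below 0.25% across 2 million drives works out to fewer than roughly 5,000 failures per year, compared to the roughly 14,600 that the client goal of 0.73% would allow.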
The failure rate for Intel's professional series of drives also falls well below expectations. Currently Intel is the only SSD manufacturer confident enough to reveal their field reliability data, and we applaud Intel's unprecedented transparency with their failure rate statistics. Do not hold your breath waiting for other manufacturers to come forward with the same data; companies guard field reliability data closely.
If you boil down reliability to its core, it is really all about protecting user data. Intel addresses this with full data path protection through parity, CRC, memory ECC, and LBA tag validation. This ensures that the data read from the drive is the same as the original data written.
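As a rough illustration of two of those mechanisms, the sketch below attaches a CRC and an LBA tag to a block of data on write and checks both on read. This is a generic example, not Intel's implementation: the 12-byte header layout, the use of CRC-32, and the `protect` and `verify` names are all assumptions made for the example.

```python
import struct
import zlib

# Hypothetical header layout: 8-byte LBA tag followed by a 4-byte CRC-32.
HEADER = struct.Struct("<QI")


def protect(lba: int, payload: bytes) -> bytes:
    """Prefix the payload with its LBA tag and a CRC-32 of the payload."""
    return HEADER.pack(lba, zlib.crc32(payload)) + payload


def verify(expected_lba: int, sector: bytes) -> bytes:
    """Return the payload if both the LBA tag and the CRC check out."""
    stored_lba, stored_crc = HEADER.unpack_from(sector)
    payload = sector[HEADER.size:]
    if stored_lba != expected_lba:
        # The data is intact but belongs to a different logical block.
        raise ValueError("LBA tag mismatch")
    if zlib.crc32(payload) != stored_crc:
        # The payload changed somewhere between write and read.
        raise ValueError("CRC mismatch")
    return payload


sector = protect(42, b"user data")
print(verify(42, sector))
```

The LBA tag catches a case a checksum alone cannot: data that is perfectly intact but was returned from the wrong location. The CRC catches bits that changed in transit or at rest.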
Threats to reliably storing data can come from all angles. One notable problem stems from cosmic rays. These rays create soft errors in the controller and DRAM, which can lead to data loss. Luckily, this is not a frequent occurrence, but Intel is dedicated to removing the chance of soft errors on their drives. This is accomplished by bombarding SSDs with neutrons from the Los Alamos Neutron Science Center's particle accelerator. Intel tests the drives until they fail, and then uses failure analysis to determine methods to avoid soft errors. Using this approach Intel has reduced the chance of a soft error to 1E25, or one silent error per billion drives. This brings the failure rate due to soft errors down to a nearly immeasurable 0.029%.
PLI (Power Loss Imminent) Technology protects user data in the event of host power loss. This system detects an unexpected power loss before power is actually gone, and capacitors provide enough power to commit in-flight data to the underlying media. One of the most difficult aspects of incorporating sound PLI technology is testing power events that occur during other operations, such as garbage collection or TRIM.
Intel creates multiple fault scenarios to eliminate all possible data loss during power loss events, and applies XOR to the flushed data for an extra level of protection. Intel conducted over 2 million PLI cycles for the DC S3700 and DC S3500 series. The Intel DC P3700, and its siblings, feature a user-initiated PLI self-test that assures the system is working correctly.
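Intel did not detail how the XOR protection is applied, but the general idea behind XOR parity is simple enough to sketch. In the example below, which uses hypothetical function names and chunk contents of our own, a parity chunk is computed across several equally sized data chunks, and any single lost chunk can be rebuilt from the survivors plus the parity.

```python
from functools import reduce


def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equally sized chunks together, byte by byte."""
    if len({len(chunk) for chunk in chunks}) != 1:
        raise ValueError("all chunks must be the same, non-zero count and size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(survivors: list[bytes], parity: bytes) -> bytes:
    """Recover one lost chunk from the surviving chunks and the parity chunk."""
    return xor_parity(survivors + [parity])


# Illustrative data only; real flushed data would be full flash pages.
chunks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(chunks)

# Pretend the middle chunk was lost during the flush.
recovered = rebuild_missing([chunks[0], chunks[2]], parity)
print(recovered == chunks[1])
```

This works because XOR is its own inverse: XORing the parity with every surviving chunk cancels them out and leaves only the missing one. The scheme can recover one lost chunk per parity group, not several.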
One of the biggest benefits of vertical integration is the close control of all the components used in the product. Intel utilizes their own proprietary ASICs in the datacenter series, and these go through intense internal validation during development. Intel also utilizes their own NAND from the IMFT partnership with Micron, and like every other component, it goes through a rigorous qualification process that exceeds the requirements of JEDEC.