We learned several years ago that the only way to run apples-to-apples comparisons between SSDs was to use a fixed amount of time between each test. SSDs change over time. When a TRIM command is issued, the drive is free to run the actual cleanup, called garbage collection, whenever it wants. In the real world we don't write massive amounts of data to our drives for several hours straight, so why test SSDs that way? For my testing I leave an exact 4-minute window between one test and the next. If the drive chooses to clean itself in that window, it happens, but some drives are more aggressive than others.
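The fixed-window method can be sketched as a simple harness; `run_test` callables here are stand-ins for any benchmark pass, and the names are my own, not from any real test suite.

```python
import time

IDLE_SECONDS = 4 * 60  # the exact 4-minute window between tests

def run_suite(tests, idle_seconds=IDLE_SECONDS, sleep=time.sleep):
    """Run each benchmark pass with an identical idle window in between,
    so every drive gets the same opportunity for background garbage
    collection before the next measurement."""
    results = []
    for i, test in enumerate(tests):
        results.append(test())
        if i < len(tests) - 1:
            sleep(idle_seconds)  # same recovery window for every drive
    return results
```

The `sleep` parameter is only there so the idle step can be stubbed out when dry-running the harness.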
We've talked about high write latency on several Samsung SSDs in past reviews. With early firmware, Samsung SSDs would aggressively work to cleanse the NAND of data we told the drive was unneeded. This would trigger a GC event, and during our next test the latency would be much higher than on most other products we tested the same way.
Many of the early tests we run read and write across the full span of the drive, all of the user-available NAND flash. This works out great for bringing a drive down to a consumer steady state, the performance level most of us actually find our SSDs at. The early tests are a best-case scenario since the flash is fresh, but they're not very realistic for telling you what to expect at home. The tests performed later in the review get us closer to the performance you can expect.
A couple of years ago Jon, Paul and I sat down with Futuremark in California to discuss PCMark 7 and share our ideas on next-generation storage testing. Notes were taken and we all walked away with new ideas. Weeks later a change was made to PCMark 7's storage test, but the largest impact from that meeting came in Futuremark's new PCMark 8.
A little over a week ago we were given the keys to what we think is the best storage benchmark for consumer SSD products to date. The test isn't perfect yet, but it's very close to what we want to show you in our reviews: true consumer SSD performance.
The new tests write a lot of data to the drives and take close to 24 hours to complete. The first test is a performance consistency test, and it represents a consumer worst-case scenario. The methods used are much more appropriate for determining consumer performance than writing over a drive five times with 4K random data. Here is a breakdown of the actions performed.
1. Precondition phase
1. Write the drive sequentially through up to the reported capacity with random data, write size of 256*512=131072 bytes.
2. Write it through a second time (to take care of overprovisioning).
2. Degradation phase
1. Run writes of random size between 8*512 and 2048*512 bytes on random offsets for 10 minutes.
2. Run performance test (one pass only). The result is stored in secondary results with name prefix degrade_result_X where X is a counter.
3. Repeat 1 and 2 eight times, increasing the duration of the random writes by 5 minutes on each pass.
3. Steady state phase
1. Run writes of random size between 8*512 and 2048*512 bytes on random offsets for final duration achieved in degradation phase.
2. Run performance test (one pass only). The result is stored in secondary results with name prefix steady_result_X where X is a counter.
3. Repeat 1 and 2 five times.
As you can see, the drive is hit very hard, but as I stated, this is a worst-case scenario.
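Under my reading of the breakdown, the degradation writes could be sketched like this. The sector math (8*512 to 2048*512 bytes at random offsets) comes straight from the list above; the drive size, seeding, and generator plumbing are my own assumptions for illustration.

```python
import random

SECTOR = 512                         # bytes per sector
MIN_SECTORS, MAX_SECTORS = 8, 2048   # 4 KiB to 1 MiB per write, per the breakdown

def degradation_writes(drive_sectors, seed=None):
    """Yield (offset_bytes, length_bytes) pairs for the random-write phase:
    writes of random size between 8*512 and 2048*512 bytes at random,
    sector-aligned offsets anywhere on the drive."""
    rng = random.Random(seed)
    while True:
        sectors = rng.randint(MIN_SECTORS, MAX_SECTORS)
        start = rng.randint(0, drive_sectors - sectors)
        yield start * SECTOR, sectors * SECTOR
```

In the real test this stream runs against the raw device for a timed interval; the sketch only produces the offsets and sizes.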
The next test in the cycle, the recovery phase, is what we are more interested in, because none of us writes to our drives over and over without pause. This is where time comes into play, and it's what I wanted to show you today. The second chart above shows the same drives, but after 5 minutes of idle time to recover.
1. Recovery phase
1. Idle for 5 minutes.
2. Run performance test (one pass only). The result is stored in secondary result with name recovery_result_X where X is a counter.
3. Repeat 1 and 2 five times.
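Putting the three timed phases together, the write/idle schedule works out as below. This is just arithmetic from the lists above, not Futuremark's code; the function and tuple layout are mine.

```python
def build_schedule():
    """Return (phase, duration_minutes, action) tuples implied by the
    breakdown above; each timed step is followed by one test pass."""
    schedule = []
    # Degradation: eight passes, starting at 10 minutes, +5 minutes each pass.
    durations = [10 + 5 * i for i in range(8)]  # 10, 15, ..., 45 minutes
    for d in durations:
        schedule.append(("degradation", d, "random writes"))
    # Steady state: five passes at the final degradation duration.
    for _ in range(5):
        schedule.append(("steady", durations[-1], "random writes"))
    # Recovery: five passes, each preceded by 5 minutes of idle.
    for _ in range(5):
        schedule.append(("recovery", 5, "idle"))
    return schedule
```

Summing it up, the random-write periods alone come to 445 minutes before the recovery phase even starts, which is why the full run takes close to a day.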
As you can see in the two charts, all of the drives increased performance after the 5-minute idle time. If you are measuring SSD performance and not putting a fixed amount of time between each test, then your results are invalid, at least the way I see it. With most drives, performance increases with each pass, but we're still not sure how much of this data we should show in the reviews until we have more collected.
I want to add a few notes about the results above. At this time, all of the results displayed use incompressible data, so the LSI SandForce drives (Mushkin Chronos DX / Intel 530) and the new Phison drive (MyDigitalSSD BP4) are penalized for having advanced compression technology. We're not sure yet when we'll migrate the new tests into our SSD reviews; there are still some small issues we are addressing.