Introduction
The line between consumer and enterprise SSDs is becoming blurry in some cases. The cloud computing market is growing quickly, and SSDs offer advantages in both performance and power consumption. It is estimated that the Internet, and the servers powering it, consume 1.5% of the global power output.
This may not seem like a large number, until we take into consideration that this is roughly the output of 30 nuclear power plants, or 30 billion Watts, and leads to a whopping power bill of $8.5 billion annually. SSDs bring tremendous advancements in the IOPS produced per Watt of energy consumed, and in many cases, this is one of the driving factors in the decision to deploy them into an enterprise environment.
As with any device that delivers a substantial performance advantage over a commodity product, SSDs command a much higher price than their HDD counterparts do. The performance advantages of SSDs, together with their enhanced density, lower power requirements, and other factors, combine to make them a cheaper long-term solution than HDD storage. While the lowered TCO is a big magnet drawing prospective buyers into considering SSDs, the up-front expenditure is also the largest inhibitor to deploying SSDs into the datacenter.
This pricing double-edged sword leads many to look for value-oriented solutions. This eventually turns the eye to consumer SSDs and their lower price structure. For those unaware of the differences, the learning curve can be steep and come at a high cost. Users are typically aware that client (consumer) SSDs feature a lower endurance threshold than enterprise products. When comparing the price and performance of consumer SSDs, many mistakenly think that they can overcome the high price premium of enterprise SSDs simply by replacing cheaper consumer SSDs more often (rip-and-replace).
Unfortunately, many users simply look at the specifications of client hardware and immediately think that these radically high specifications will equate to a high level of performance in heavy usage enterprise scenarios, when in fact this could not be further from the truth.
Intoxicating performance specifications for client hardware lead many to make the jump without considering the significant impact that differing hardware specifications can have upon the real performance in an actual deployment. The higher performance specifications of client SSDs are not indicative of long-term sustained performance in typical enterprise usage scenarios. We take a deeper look at the different test methodologies on the following page.
There are several key differentiators between client and enterprise SSDs, with performance, consistency, power efficiency, duty cycles, power loss protection, and data protection features chief amongst them. This is a daunting list, and by no means will we cover all of these facets in today's article.
The focus today is relatively simple: performance variability and latency. Deploying an SSD that does not deliver sustainable performance over the long term can end up significantly affecting TCO. Making significant tradeoffs in endurance to realize a cost saving is a risky gamble with data, but doing so at the expense of overall performance removes the initial motivation of utilizing an SSD to address performance challenges.
Client v Enterprise Specifications
The disparity in performance specifications is a result of the test protocols used for consumer and enterprise SSD performance measurements. The root of the differing test methodology lies in the characteristics of the intended environment.
The JEDEC specification for client SSDs calls for only eight hours of active use per day. Even more importantly, client SSDs are tuned for light workloads in 'bursty' environments. The SSD is at idle, or near idle, for the majority of its use in a consumer environment. This allows the SSD's internal functions to keep performance 'fresh' by utilizing a number of background processes. The internal mechanisms of the SSD are designed and optimized for this type of environment, and thus function accordingly.
Client SSDs utilize extra spare area for internal functions, and are rarely used at full capacity, further boosting their performance. The abundance of free space is complemented by the active use of the TRIM command. The TRIM command allows the SSD to maintain high performance levels.
The relatively stress-free environment of a client SSD allows it to enjoy much higher performance during operation, and this is reflected in the manner in which client SSD performance is measured and marketed. Typically, client SSD performance is measured in FOB (Fresh out of Box) conditions, with no preconditioning, and with very little (typically 8-10GB) of the available LBA range utilized.
JEDEC specifications for enterprise SSDs are much more stringent, calling for 24 hours of active use per day. Utilizing enterprise SSDs at full capacity is commonplace to maximize the ROI of the premium tier of storage. Recording performance metrics after a protracted preconditioning cycle, and with utilization of the entire LBA range, reflects the workload environment. The use of preconditioning and the entire capacity of the SSD alters the result of the testing drastically, leaving us with lower, more realistic performance specifications.
The lack of the TRIM command is also an important consideration in the comparison of client and enterprise hardware. TRIM is not utilized in the majority of enterprise applications and the removal of its use results in markedly lower performance with any SSD.
When users compare the performance of a top-shelf consumer SSD to an enterprise SSD, there appear to be similarities, and in some cases, client hardware appears to have even higher performance than the enterprise hardware. In deployment, the performance of the client SSD will drop dramatically. Some consumer SSDs excel at pure read or write workloads, but the introduction of any mixed workload creates a dramatic drop in performance. In our testing, we will also be able to observe the massive difference in latency distribution.
Adding to the problem, consumer SSDs aren't tested in uniform fashion with industry approved tools. With the help of many major manufacturers, SNIA has set forth separate testing methodologies for both client and enterprise SSDs. While the enterprise portion of the market adheres to these basic tenets of SNIA methodology, the consumer market largely disregards the testing methodology tailored for their use.
This leads to blatantly misleading specifications that are not representative of performance in a typical consumer workload, let alone the harsh realities of enterprise workloads. The fact that the test results are released with full read and write workloads, and no mixed workloads, leads to even more misleading results.
Test System and Methodology
We utilize a new approach to HDD and SSD storage testing for our Enterprise Test Bench, designed specifically to target long-term performance with a high level of granularity.
Many testing methods record peak and average measurements during the test period. These average values give a basic understanding of performance, but fall short in providing the clearest view possible of I/O QoS (Quality of Service).
'Average' results do little to indicate the performance variability experienced during actual deployment. The degree of variability is especially pertinent, as many applications can hang or lag as they wait for I/O requests to complete. This testing methodology illustrates performance variability, and includes average measurements, during the measurement window.
While under load, all storage solutions deliver variable levels of performance. While this fluctuation is normal, the degree of variability is what separates enterprise storage solutions from typical client-side hardware. Providing ongoing measurements from our workloads with one-second reporting intervals illustrates product differentiation in relation to I/O QoS. Scatter charts give readers a basic understanding of I/O latency distribution without directly observing numerous graphs.
Consistent latency is the goal of every storage solution, and measurements such as Maximum Latency only illuminate the single longest I/O received during testing. This can be misleading, as a single 'outlying I/O' can skew the view of an otherwise superb solution. Standard Deviation measurements consider latency distribution, but do not always effectively illustrate I/O distribution with enough granularity to provide a clear picture of system performance. We use histograms to illuminate the latency of every single I/O issued during our test runs.
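As a rough illustration of the histogram approach described above, the following Python sketch sorts per-I/O latencies into fixed ranges and reports the count and percentage in each. The bucket edges are an assumption, chosen only to resemble the ranges quoted in this article (20-40ms, 200-400ms, 800-1000ms and so on); the function names are hypothetical and this is not the tooling used for our charts.

```python
from bisect import bisect_right
from collections import Counter

# Bucket edges in milliseconds. These are illustrative assumptions, picked to
# resemble the latency ranges quoted in the article.
EDGES_MS = [1, 2, 4, 6, 8, 10, 20, 40, 60, 80, 100, 200, 400, 600, 800, 1000]


def bucket_label(latency_ms):
    """Return the label of the half-open bucket [low, high) holding latency_ms."""
    idx = bisect_right(EDGES_MS, latency_ms)  # number of edges <= latency_ms
    if idx == 0:
        return "<%dms" % EDGES_MS[0]
    if idx == len(EDGES_MS):
        return ">=%dms" % EDGES_MS[-1]
    return "%d-%dms" % (EDGES_MS[idx - 1], EDGES_MS[idx])


def latency_histogram(latencies_ms):
    """Map each bucket label to (count, percent of all I/O) for one test run."""
    counts = Counter(bucket_label(x) for x in latencies_ms)
    total = len(latencies_ms)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical latencies, not measured data.
    sample = [0.5, 3, 30, 30, 250, 1500]
    for label, (count, pct) in latency_histogram(sample).items():
        print(label, count, round(pct, 1))
```

Because every I/O lands in exactly one bucket, a single outlier shows up as one count in a high range rather than dominating a maximum-latency figure.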
Our testing regimen follows SNIA principles to ensure consistent, repeatable testing. We attain steady state through a process that brings the device within a performance level that does not vary by more than 20% during the measurement window. Forcing the device to perform a read-modify-write procedure for new I/O triggers all garbage collection and housekeeping algorithms, highlighting the real performance of the solution.
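The 20% criterion above can be sketched in a few lines of Python. This is one possible reading, not the exact procedure: it assumes "range" means the spread between the highest and lowest one-second IOPS sample in the window, divided by the window average. The function names and the sample values are hypothetical.

```python
def is_steady_state(iops_samples, max_range=0.20):
    """True if (max - min) / mean of the window is within max_range.

    Assumption: the 20% figure is read as total spread relative to the
    window average; other definitions are possible.
    """
    if not iops_samples:
        return False
    mean = sum(iops_samples) / len(iops_samples)
    if mean <= 0:
        return False
    spread = max(iops_samples) - min(iops_samples)
    return spread / mean <= max_range


def first_steady_window(iops_samples, window):
    """Index where the first steady window of the given length starts, else None."""
    for start in range(len(iops_samples) - window + 1):
        if is_steady_state(iops_samples[start:start + window]):
            return start
    return None


if __name__ == "__main__":
    # Hypothetical run: performance falls from fresh-out-of-box, then settles.
    run = [200, 150, 100, 101, 99, 100]
    print(first_steady_window(run, window=3))
```

In practice the window would span minutes of one-second samples rather than three points; the short list here only keeps the example readable.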
We measure power consumption during precondition runs. This provides measurements in time-based fashion, with results every second, to illuminate the behavior of power consumption in steady state conditions. Power consumption can cost more over the life of the device than the initial acquisition price of the hardware itself. This significantly affects the TCO of the storage solution. We also present IOPS-to-Watts measurements to highlight the efficiency of the storage solution.
The first page of results will provide the 'key' to understanding and interpreting our new test methodology.
4K Random Read/Write
We precondition the SSDs for 18,000 seconds, or five hours, receiving reports on several parameters of workload performance every second. We then plot this data to illustrate the drives' descent into steady state. This chart consists of 36,000 data points. The dots are IOPS measurements during the preconditioning period. The lines through the data scatter are the average during the test. This type of testing presents standard deviation and maximum/minimum I/O in a visual manner.
We provide histograms for further latency granularity below. This downward slope of performance happens very few times in the lifetime of the device, during the first few hours of use, and we typically only present the precondition results to confirm steady state convergence. In this article, we are combining the preconditioning runs to highlight the performance of consumer products.
The OCZ Vector and the Samsung 840 Pro suffer from significant performance variability. The wide swath of results points to varying performance during the preconditioning period. The SMART Optimus holds steady with very little variability.
Each QD for every parameter tested includes 300 data points (five minutes of one second reports) to illustrate the degree of performance variability. The line for each QD represents the average speed reported during the five-minute interval. 4K random speed measurements are an important metric when comparing drive performance, as the hardest type of file access for any storage solution to master is small-file random. One of the most sought-after performance specifications, 4K random performance is a heavily marketed figure.
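To make the idea of "average plus variability" concrete, here is a minimal Python sketch that summarizes one five-minute interval of one-second IOPS reports. The coefficient of variation (standard deviation divided by the mean) is our own choice of summary figure for this example and is not a number quoted in the charts; the sample data is hypothetical.

```python
import statistics


def summarize_interval(iops_samples):
    """Summarize one interval of per-second IOPS reports (e.g. 300 samples)."""
    mean = statistics.mean(iops_samples)
    stdev = statistics.pstdev(iops_samples)  # population standard deviation
    return {
        "mean": mean,
        "min": min(iops_samples),
        "max": max(iops_samples),
        "stdev": stdev,
        # Coefficient of variation: spread relative to the average.
        "cv": stdev / mean if mean else 0.0,
    }


if __name__ == "__main__":
    # Two hypothetical drives with the same average but very different spread.
    steady = [1000] * 300
    noisy = [500, 1500] * 150
    print(summarize_interval(steady)["cv"])
    print(summarize_interval(noisy)["cv"])
```

Both hypothetical drives would draw the same average line on a chart, which is why the scatter of individual points matters as much as the line through them.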
The 4K random read performance of the OCZ Vector rises above the Optimus, while the 840 Pro resides at a lower value of 70,000 IOPS. The Vector performs admirably in the pure random read environment. However, during actual deployment into an enterprise environment, a pure read workload is rare.
Garbage collection routines are more pronounced in heavy write workloads. This leads to more variability in performance, and the difference between the Optimus and the consumer SSDs is very clear with this test. The client SSDs suffer from tremendous variability, and average well below a third of the performance of the Optimus. Much like pure read environments, a pure write workload is equally rare.
Our write percentage testing illustrates the varying performance of each solution with mixed workloads. The 100% column to the right is a pure write workload of the 4K file size, and 0% represents a pure 4K read workload.
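For readers unfamiliar with mixed workloads, the sketch below shows the basic idea in Python: each operation is chosen to be a read or a write according to the target write percentage. It is a simplified illustration only. The 10% step between columns, the fixed seed, and the function names are assumptions, and a real benchmark tool also chooses block addresses, queue depth, and transfer sizes.

```python
import random


def mixed_workload(num_ops, write_pct, seed=0):
    """Return a list of 'R'/'W' operations with roughly write_pct percent writes."""
    rng = random.Random(seed)  # fixed seed so the sequence can be repeated
    threshold = write_pct / 100.0
    return ["W" if rng.random() < threshold else "R" for _ in range(num_ops)]


def write_fraction(ops):
    """Fraction of operations in the list that are writes."""
    return ops.count("W") / len(ops)


# Sweep from pure read (0%) to pure write (100%).
# The 10% step size is an assumption made for this example.
sweep = {pct: mixed_workload(10_000, pct) for pct in range(0, 101, 10)}

if __name__ == "__main__":
    for pct, ops in sweep.items():
        print(pct, round(write_fraction(ops), 3))
```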
The massive read performance of the Vector is brought down to earth in this testing, illustrating one of the key differences between client and enterprise SSDs. Enterprise hardware is designed to handle heavy write workloads with ease, and here we can see that the client SSDs suffer a drastic reduction in performance with even the slightest of write workloads.
The Vector averages 97,000 IOPS with a pure read workload, but this drops dramatically to 27,000 IOPS with the introduction of only 10% writes into the workload. The 840 Pro suffers a comparable loss, from 70,000 IOPS to 22,000. The Optimus also loses speed as we mix in more writes into the workload, but we can see that the reduction in performance is not near the scale suffered by the client SSDs.
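The scale of those losses is easier to compare as a percentage. The short Python sketch below works through the arithmetic using only the IOPS figures quoted above; the helper name is our own.

```python
def percent_drop(before_iops, after_iops):
    """Percentage of performance lost going from before_iops to after_iops."""
    return 100.0 * (before_iops - after_iops) / before_iops


if __name__ == "__main__":
    # Figures quoted in the text: pure read vs. 10% writes.
    print("Vector:  %.1f%%" % percent_drop(97_000, 27_000))   # prints 72.2%
    print("840 Pro: %.1f%%" % percent_drop(70_000, 22_000))   # prints 68.6%
```

Both client SSDs give up roughly 70% of their pure read performance with only one write in every ten operations.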
Observing the distribution of latency during the test period also brings into focus one of the overlooked aspects of performance. The latency of the requests is extremely important, and one of the key advantages of using an SSD over an HDD.
The massive distribution of latency from the client SSDs is a tangible problem that will have a direct impact upon application performance. Both the Vector and the 840 Pro suffer a significant number of outlying I/O's that take longer than 200-400ms to complete. Our charts only go to 200-400ms, but the Vector had several I/O's land as high as 600-800ms, and the 840 Pro has operations in the 800-1000ms range.
We record the power consumption measurements during our precondition run. We calculate the stated average results during the last five minutes of the test, after the device has settled into steady state.
To the untrained eye, the power requirement of the Optimus is drastically higher than that of the client SSDs. While the Optimus does require more power for operation, it accomplishes much more work per Watt consumed, as covered in the graph below.
IOPS to Watts measurements are generated from data recorded during our precondition run, and the stated average is from the last five minutes of the test.
The IOPS to Watts measurements gauge the efficiency of the SSD during the workload, and calculate the amount of work accomplished per Watt. As noted above, the Optimus does require significantly more power to operate, but the number of IOPS provided per Watt reveals that the Optimus is actually more efficient than the Vector and the 840 Pro.
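The calculation itself is simple, and a minimal Python sketch follows. It assumes one IOPS sample and one power sample per second, and it divides the average IOPS by the average Watts over the last 300 samples (five minutes); whether the published figures are computed this way or as an average of per-second ratios is not stated, so treat that choice as an assumption. The sample numbers are hypothetical and do not correspond to any drive tested here.

```python
def iops_per_watt(iops_samples, watt_samples, window=300):
    """Average IOPS divided by average Watts over the last `window` samples."""
    if not iops_samples or len(iops_samples) != len(watt_samples):
        raise ValueError("need equal-length, non-empty sample lists")
    tail_iops = iops_samples[-window:]
    tail_watts = watt_samples[-window:]
    avg_iops = sum(tail_iops) / len(tail_iops)
    avg_watts = sum(tail_watts) / len(tail_watts)
    return avg_iops / avg_watts


if __name__ == "__main__":
    # Hypothetical drives: A draws twice the power but does four times the work.
    drive_a = iops_per_watt([24_000] * 600, [6.0] * 600)
    drive_b = iops_per_watt([6_000] * 600, [3.0] * 600)
    print(drive_a, drive_b)
```

The hypothetical drive A draws twice the power of drive B yet still comes out ahead, which is the same pattern the charts show for a drive that consumes more power but completes far more work.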
8K Random Read/Write
The Optimus provides much more performance in a tighter range than the Vector and 840 Pro.
8K random read and write speed is a metric that is not tested for consumer use, but for enterprise environments this is an important aspect of performance. With several different workloads relying heavily upon 8K performance, we include this as a standard with each evaluation. Many of our Server Emulations below will also test 8K performance with various mixed read/write workloads.
The Vector truly shines with pure random read workloads, again topping the chart. The 840 Pro performs admirably as well, managing to best the Optimus in this test.
The average 8K random write speed flips the tables on the client SSDs, highlighting an inherent strength of the Optimus.
The relevant results of this battery of tests again fall to our write percentage testing. We can observe the massive drop in performance of the client hardware with the addition of even the slightest of write workloads.
The Vector manages to provide 60% of its I/O during the test at 4ms, much better than the competing SSDs. The numerous results that fall much higher, up to the limit of our recording at 200ms, conspire to remove the latency advantage. The Optimus provides steady performance with the highest results landing at 10ms.
Both the Vector and the 840 Pro had large percentages (10.7% and 8.2%, respectively) fall higher than our chart illustrates, at 200-400ms. More concerning is the smattering of I/O's that fell into the 800-1000ms range for both SSDs.
The Optimus again requires the most power for operation, and the Vector touts an impressively low power consumption.
The Optimus again proves to be more efficient, with an average of 2,800 IOPS per Watt. The Vector comes in second with 1,300 IOPS per Watt. The Samsung 840 Pro comes in a distant third place.
128K Sequential Read/Write
The Samsung 840 Pro excels at sequential write workloads, and here manages to beat the Optimus and the Vector. This seems to be a clear win for the Samsung 840 Pro, but once we look at histogram reports below the picture changes.
The 128K sequential speeds reflect the maximum sequential throughput of the SSD using a realistic file size actually encountered in an enterprise scenario.
The Samsung 840 Pro edges the Optimus by a slight margin during our sequential read testing.
The Samsung 840 Pro rides the top of the charts in the sequential write testing, exhibiting one of its great strengths. Though the 840 Pro has impressive 100% write performance, below we can observe the difference with mixed workloads.
The performance of the consumer SSDs drops quickly with the introduction of a mixed workload, though the Vector manages to best the Optimus in the 50-90% range.
The 840 Pro nudges out the Optimus in sequential write performance, with 100% of I/O falling into the 60-80ms range. The Optimus delivers 95% of I/O in the same range, but 4.85% falls into the 80-100ms range, and a small amount falls in at 100-200ms. The Vector has operations as high as 600-800ms.
The Vector and 840 Pro manage to best the Optimus in performance per Watt in the 100% sequential write testing.
Database/OLTP and Webserver
Database/OLTP
This test emulates Database and On-Line Transaction Processing (OLTP) workloads. OLTP is in essence the processing of transactions such as credit cards and high frequency trading in the financial sector. Enterprise SSDs are uniquely well suited for the financial sector with their low latency and high random workload performance. Databases are the bread and butter of many enterprise deployments. These are demanding 8K random workloads with a 66% read and 33% write distribution that can bring even the highest performing solutions down to earth.
The Optimus leverages its exceptional random write performance to provide a large lead in this test.
The Optimus again gives refined performance in the latency department, with the highest results falling into the 20-40ms range. Client SSDs, in contrast, have many operations with such high latency that they are literally off our chart. The 840 Pro has numerous operations fall above 1000ms, and the Vector tops out with a smattering of I/O in the 600-800ms range.
The SMART Optimus manages to pull off a much higher IOPS-per-Watt rating of 5,171, compared to the client competition, which averages 2,600 IOPS per Watt.
Webserver
All three SSDs perform very closely during preconditioning.
The Webserver profile is a read-only test with a wide range of file sizes. Web servers are responsible for generating content for users to view over the internet, much like the very page you are reading. The speed of the underlying storage system has a massive impact on the speed and responsiveness of the server hosting the websites, and thus the end user experience.
The Samsung 840 Pro has the highest average speed, followed closely by the Vector and SMART Optimus.
In this test, what we see is what we get. There is a nice distribution of latency performance with no outlying I/Os off our chart.
The Optimus loses the IOPS per Watt competition for the first time in this test. The Samsung 840 Pro pulls off a much-needed win, and the Vector follows closely behind.
Fileserver and Emailserver
Fileserver
The File Server profile represents typical file server workloads. This profile tests a wide variety of different file sizes simultaneously, with an 80% read and 20% write distribution.
All three SSDs perform very closely in this testing.
The SSDs all perform closely in this test, with no latency results out of the range of our chart, and the Samsung 840 Pro enjoying a slight lead.
The 840 Pro and the Vector take a large lead in this test.
Emailserver
The Emailserver profile is a very demanding 8K test with a 50% read and 50% write distribution. This application is indicative of the performance of the solution in heavy write workloads.
The heavy write workload provides the Optimus an advantage in this test.
The return of a write workload brings the latency penalties suffered by client SSDs back into the limelight. We can see the large distribution in the upper regions, but we also have a wide range of latency results off our chart. The 840 Pro has a number of results above 1000ms, and the Vector tops out at 800-1000ms. The Optimus tops out in the 20-40ms range.
Final Thoughts
Recent articles from several large websites have highlighted the performance of consumer SSDs in relation to enterprise SSDs. Many times these articles have declared that client SSDs can equal, or beat, enterprise SSDs. The fact that a bit of overprovisioning can produce higher IOPS or bandwidth has nothing to do with the true measurements of performance consistency, and with comprehensive study these theories are debunked when measuring the latency of all I/O's issued during the testing period. The focus needs to be placed back on latency.
We are pleased to see that many of our competitors have embraced our scatter testing, as this provides us a measure of validation from our peers, but it is distressing to see these same techniques used as a means of disseminating incorrect analysis of the results. Some caveats of performance aren't always revealed even when using sophisticated scatter testing.
A key take-away from our testing revolves around the ambitious specifications touted by consumer SSDs. Client SSDs promise massive IOPS in pure read workloads, but once we begin to mix in even the lightest of write workloads, performance plummets. There simply are very few pure read workloads in the vast majority of enterprise deployments, and client SSDs do not handle mixed workloads as well as SSDs designed for enterprise environments.
Enterprise SSDs really come into their own in pure write situations due to a thoroughbred design optimized for heavy usage, but there are also very few pure write workloads. The real performance lies in the middle ground, and in this region enterprise-class hardware is unassailable.
Latency is the primary driving force behind SSD usage and delivers radical performance gains over typical HDDs. When placed in enterprise environments, our testing revealed client SSDs faltered tremendously in latency QoS.
To put this in perspective, a single 15,000 RPM MK01GRRB/R Toshiba HDD had 11,748 I/O's fall into the 600-800ms latency range during the exact same 8K random write test utilized in this product evaluation. In comparison, the 840 Pro had 3,328 I/O's in the 800-1000ms range, and the Vector had 2,777 I/O's fall into the same 800-1000ms range, higher than the 15K HDD.
These outlying I/O's can be crippling to application performance in actual deployment. Even though the percentage of such operations for the client SSDs is very small, the fact that we are experiencing worse latency with some operations in comparison to an enterprise HDD is telling. The client SSDs will be faster overall due to the increased number of overall fast operations, but there are situations where an application waiting on data will wait longer than it would with an HDD. This really drives home the performance gap with client SSDs. The Optimus had zero operations above 40-60ms in the same test. This is the most powerful evidence of the difference between enterprise and consumer SSDs.
Utilizing extra overprovisioning with client SSDs is a common approach to boost steady state performance, but even with hefty overprovisioning, there are still a number of operations that fall into an unacceptably high latency range. Look at TweakTown soon for an upcoming article on the effects of overprovisioning.
Overprovisioning a client SSD also dilutes the dollar per GB advantage. Factoring in the higher price per GB after overprovisioning brings the number of other features provided by enterprise SSDs into focus. Once the price per GB begins to come close to enterprise SSDs, features such as power loss protection and enhanced data protection quickly tip the scales in favor of the enterprise SSD.
Many users forgo the enhanced data protection of enterprise SSDs in favor of utilizing consumer SSDs in RAID arrays, offsetting the heightened potential for data loss and aggregating the performance of multiple SSDs. Unfortunately, the speed of the entire array is constrained to the speed of the slowest I/O. In RAID arrays, numerous client SSDs can deliver simply terrible performance in relation to enterprise SSDs, because a continuous stream of errant I/Os from multiple drives repeatedly holds the array to the speed of its slowest operation.
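A simplified model shows why outliers compound in an array. The Python sketch below assumes that a request is striped across every drive, that it completes only when the slowest member finishes, and that outliers occur independently on each drive with the same probability. All three are simplifying assumptions, and the 1% outlier rate is a hypothetical figure rather than a measurement from this article.

```python
def stripe_latency_ms(member_latencies_ms):
    """A striped request finishes only when its slowest member I/O finishes."""
    return max(member_latencies_ms)


def p_stripe_hits_outlier(p_single, n_drives):
    """Chance that at least one of n independent member I/Os is an outlier."""
    return 1.0 - (1.0 - p_single) ** n_drives


if __name__ == "__main__":
    # Seven fast member I/Os cannot hide one slow one (hypothetical values).
    print(stripe_latency_ms([0.4, 0.5, 0.4, 0.6, 0.5, 0.4, 0.5, 650.0]))

    # Hypothetical 1% outlier rate per drive, for growing array sizes.
    for n in (1, 2, 4, 8):
        print(n, round(100 * p_stripe_hits_outlier(0.01, n), 1))
```

Under these assumptions, an outlier rate of 1% per drive grows to roughly 7.7% of requests in an eight-drive stripe, so adding drives raises aggregate throughput while also increasing how often a request stalls on a slow operation.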
SSDs bring tremendous advancements in the IOPS produced per Watt of energy consumed. Enterprise SSDs typically enjoy large advantages in the IOPS per Watt category, and provide enhanced power efficiency.
There are a number of applications where client SSDs make sense, but they are usually in workstation and consumer applications. The comparison drives tested are the fastest client SSDs on the market and in a consumer environment are hard to beat. It really isn't fair to put them in the ring against a heavyweight like the SMART Optimus, but they were selected for their class-leading status. We required the best SSDs that the client-side had to offer, and the OCZ Vector and the Samsung 840 Pro fit the bill.
In certain situations, we received results worse than a 15K HDD. There simply is a tremendous gulf between the performance of these two solutions, especially when we go beyond the typical 'speeds and feeds' and drill down into the real performance measurements that have a dramatic impact on application performance and TCO.