Note: |
The CacheCade feature has been available since the first half of calendar year 2011.
Note: |
To use CacheCade with a virtual disk, the write policy of the HDD-based virtual disk must be set to Write Back or Force Write Back, and the read policy must be set to Read Ahead or Adaptive Read Ahead.
Measuring Performance
Users may not understand the best methods for testing SSD and CacheCade™ devices in a way that demonstrates the benefits of solid-state storage. This article provides guidance on performance-test settings that apply generically to most performance testing tools.
Getting optimal results from a performance testing tool depends, of course, on how well the user understands how the device under test is supposed to operate.
Block-size: SSD and CacheCade devices behave optimally with small block sizes rather than large ones. When IO is read or written, selecting the active cell is an electronic process and does not depend on physical head movement as with mechanical disks. Solid-state devices can therefore respond very quickly to small-block random IO and may achieve more than 10,000 IOPS, where a mechanical disk would struggle to exceed 200 IOPS.
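To put the IOPS figures above in context, throughput is simply IOPS multiplied by block size, so a very high IOPS rate at a small block size still represents modest bandwidth. This sketch uses the example figures from this article, not measured results:

```python
def throughput_mb_s(iops, block_kb):
    """Bandwidth in MB/sec implied by an IOPS figure at a given block size."""
    return iops * block_kb / 1024.0

# Example figures from the text: an SSD at 10,000 IOPS versus a
# mechanical disk at 200 IOPS, both doing 4KB random IO.
ssd = throughput_mb_s(10_000, 4)   # roughly 39 MB/sec
hdd = throughput_mb_s(200, 4)      # under 1 MB/sec
```

This is why small-block random IO is where solid-state devices show their advantage: the gap appears in IOPS long before it appears in raw MB/sec.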
Queue-depth: SSDs have a deep queue-depth, with most capable of 64 outstanding IOs, significantly more than a standard SAS disk, which typically supports 16 outstanding IOs. This deep queue-depth gives the disk much more flexibility, as it lessens the disk’s dependency on the controller to deliver IOs in a timely manner. The controller can top up the queue when it can, leaving the disk to work through it without waiting on the controller.
As the technology evolves and SSDs perform more tasks in parallel, disk queue-depths are likely to deepen further. Use the performance testing tool to probe for the most effective queue-depth; re-testing with a deeper queue-depth from time to time may yield better figures on different devices.
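The probing suggested above amounts to a simple sweep: run the same test at each candidate queue depth and keep the depth that produced the highest IOPS. In this sketch, `measure_iops` is a hypothetical stand-in for one run of your benchmarking tool, and the sample numbers are invented for illustration:

```python
def best_queue_depth(measure_iops, depths=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Measure IOPS at each candidate queue depth; return (depth, iops) of the best."""
    results = {qd: measure_iops(qd) for qd in depths}
    best = max(results, key=results.get)
    return best, results[best]

# Hypothetical results, e.g. transcribed from benchmark output files.
sample = {1: 9_000, 4: 30_000, 16: 52_000, 64: 61_000, 128: 60_500}
qd, iops = best_queue_depth(sample.get, depths=sample.keys())
# Here queue depth 64 wins: deeper still (128) gains nothing on this device.
```

Note that the sweep must be repeated per device; the most effective depth on one SSD model may not be the most effective on another.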
Cache-bound: It is important that the performance tool is not cache-bound, meaning that all of the IO is serviced by the controller cache. This happens when the test-file size is specified incorrectly and the file fits entirely into the controller cache. The IOs then never reach the disks, and the reported performance is limited only by the speed of the PCI bus, so false figures of more than 3GB/sec can be observed. Always overwhelm the cache by selecting a test-file size greater than the controller cache.
CacheCade
CacheCade must be benchmarked differently from standard SSD drives, as this technology caches read requests only, not writes. This creates a challenge when a user wishes to benchmark a CacheCade solution: the standard methodology of simply reading or writing blocks will not give the expected results unless the cache is prepared first.
To describe this characteristic of CacheCade further, consider a situation where mechanical disks are read-cached only and you wish to run IOMeter to validate that CacheCade can deliver the performance expected of it. IOMeter first creates a test file on the target storage from which it carries out its IO operations; because this file is written to the target, it is not cached by CacheCade. When IOMeter then starts its IO operations, the data is not yet in the cache, so the initial operations are serviced by the mechanical disks. This initial cache-miss (where the requested data is not available in the cache) negatively affects the first part of the performance analysis, so steps need to be taken to eliminate this penalty from the statistics. CacheCade also caches data hot-spots only, meaning data must be accessed frequently before it is cached; this effect also needs to be overcome to measure performance at a practical level.
To achieve this, we need to ensure that the test file is accessed often enough to cause it to be cached. To do so, leave IOMeter running a read test for an extended period. Bear in mind that the size of the test file and the speed of the IO operations in MB/sec determine how long it takes for the file to become cached. The file needs to be read multiple times before it is cached, so aim to read it the equivalent of 5 times: divide the size of the file by the speed in MB/sec and multiply by 5.
For example, a 4GB test file being read at 40MB/sec: 4GB / 40MB/sec = 100 seconds, multiplied by 5 = 500 seconds.
For this example, you would need to leave a READ test running for a minimum of 8.5 minutes for the equivalent of 5 read operations to be carried out over the whole file. This time is called the ‘warm-up time’ for the cache.
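The warm-up calculation above can be captured in a small helper. The 5-pass target and the decimal units (1GB = 1000MB) follow the worked example in this article; they are rules of thumb, not CacheCade requirements:

```python
def warmup_seconds(file_size_gb, read_mb_per_sec, passes=5):
    """Minimum read-test duration for the file to be read `passes` times.

    Uses decimal units (1GB = 1000MB) to match the article's worked example.
    """
    one_pass = file_size_gb * 1000 / read_mb_per_sec
    return one_pass * passes

# Worked example from the text: a 4GB file read at 40MB/sec.
secs = warmup_seconds(4, 40)   # 500 seconds, a little over 8 minutes
```

In practice, rounding the result up (as the article does, to 8.5 minutes) costs nothing and guarantees the full file has been read the target number of times.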
After completing more than 8.5 minutes of warm-up, terminate the performance test. IOMeter’s test target file will remain cached, as no process flushes the data from CacheCade and the file is retained after the application is closed. Then restart the same performance application and select the same target drives. When IOMeter now reads from the file, the data will already be in cache (a cache-hit), and the performance should resemble that of CacheCade in an optimised state.
Key points:
When running other performance measurement tools, there are some configuration recommendations that should be followed.
For SSD and CacheCade: