MSST 2014 Speaker
Ethan Miller, University of California, Santa Cruz
Inside the Pure Storage Flash Array: Building a High Performance, Data Reducing Storage System from Commodity SSDs
The storage industry is currently in the midst of a flash revolution. Today's smartphones, cameras, and many laptops all use flash storage, but the $30 billion a year enterprise storage market is still dominated by spinning disk. Flash has large advantages in speed and power consumption, but its disadvantages (price, limited overwrites, large erase block size) have prevented it from being a drop-in replacement for disk in a storage array. This talk will describe the techniques that we've developed at Pure Storage to overcome these obstacles in creating a high-performance flash storage array using commodity SSDs.
We will first explain what an enterprise storage array is and how it's used. We then describe the design of the Pure FlashArray, an enterprise storage array built from the ground up from relatively inexpensive consumer flash storage. The array and its software, Purity, leverage the advantages of flash while minimizing the downsides. Purity performs all writes to flash in multiples of the erase block size, and keeps data in a key-value store that persists approximate answers to further reduce writes at the cost of extra (cheap) reads. Our key-value store, which includes a key range invalidation table, provides other advantages, such as the ability to take nearly instantaneous, zero-overhead snapshots and the ability to bound the size of our metadata structures despite using monotonically increasing unique identifiers for many purposes. Purity also reduces the amount of user data stored on flash through a range of techniques, including compression, deduplication, and thin provisioning. The system relies upon RAID both for reliability and for performance consistency: by avoiding reads to devices that are being written, we ensure more efficient writes and eliminate long-latency reads. The net result is a flash array that delivers sustained read-write performance of over 400,000 4KB I/O requests per second while maintaining uniform sub-millisecond latency and providing a data reduction rate in excess of 6x, averaged across installed systems.
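As a rough illustration of how a RAID layout can serve a read without touching a device that is busy with a write, the sketch below rebuilds the requested block from the other members of the stripe. The abstract does not say which RAID scheme Purity uses, so the single XOR parity block here is an assumption, and the names `xor_blocks`, `make_stripe`, `read_block`, and `busy` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block (assumed layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_block(stripe, index, busy):
    """Read block `index` without issuing reads to any device listed in `busy`."""
    if index not in busy:
        return stripe[index]  # target device is idle: read it directly
    others = [i for i in range(len(stripe)) if i != index]
    if any(i in busy for i in others):
        raise RuntimeError("single parity cannot route around two busy devices")
    # Rebuild the block from the remaining data blocks plus parity.
    return xor_blocks([stripe[i] for i in others])


stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
direct = read_block(stripe, 1, busy=set())
rebuilt = read_block(stripe, 1, busy={1})
```

In this toy model each list element stands in for one device. The reconstructed read costs several extra reads on idle devices, which matches the trade described above: reads are cheap on flash, so spending a few more of them is preferable to waiting behind a long-running write.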