The hows and whys of SSDs
Harnessing flash cells for mass storage
NAND differs from memory technologies like NOR and DRAM in that it is not byte-addressable: it cannot read or write one byte at a time. Because the wiring complexity of byte-addressable memory scales with capacity, NAND’s destiny as mass storage called for a different approach. To work within this addressing restriction, NAND cells are grouped by the hundreds into pages, and the pages into blocks. The cells of each page share a common set of word and bit lines, and the pages are organized into blocks in four common configurations:
- 32 pages of 512 bytes for a 16 KiB block
- 64 pages of 2,048 bytes for a 128 KiB block
- 64 pages of 4,096 bytes for a 256 KiB block
- 128 pages of 4,096 bytes for a 512 KiB block
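The block sizes in the list above follow directly from multiplying the page count by the page size. A minimal Python sketch of that arithmetic (the names `GEOMETRIES` and `block_size_kib` are illustrative only and do not come from any flash specification):

```python
# (pages per block, bytes per page) for the four configurations listed above
GEOMETRIES = [(32, 512), (64, 2048), (64, 4096), (128, 4096)]

def block_size_kib(pages: int, page_bytes: int) -> int:
    """Return the block size in KiB for a given page count and page size."""
    return pages * page_bytes // 1024

for pages, page_bytes in GEOMETRIES:
    print(f"{pages} pages x {page_bytes} B = {block_size_kib(pages, page_bytes)} KiB")
```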
Although data can be read and written a page at a time, as much as an entire block can be read or written in a single pass. This means that a 4 KiB file could consume an entire block, which may be up to 128 times the size of the file. Such wasted space, called slack space, goes unused until the file is replaced by data that uses the space more efficiently.
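To put a number on the worst case described above, the following sketch computes the slack left when a 4 KiB file occupies a whole 512 KiB block. It assumes, as the paragraph does, that the file is stored in whole allocation units; the helper name `slack_bytes` is hypothetical.

```python
KIB = 1024

def slack_bytes(file_bytes: int, unit_bytes: int) -> int:
    """Bytes left unused when a file is stored in whole allocation units."""
    units = -(-file_bytes // unit_bytes)  # ceiling division
    return units * unit_bytes - file_bytes

file_size = 4 * KIB      # the 4 KiB file from the text
block_size = 512 * KIB   # the largest block configuration listed above

print(block_size // file_size)                      # how many times larger the block is
print(slack_bytes(file_size, block_size) // KIB)    # slack in KiB
```

Under those assumptions the block is 128 times the size of the file, and 508 KiB of it sits idle.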
Writing data
While solid state disks are quite adept at locating blocks for I/O, NAND cells can’t write directly to those located blocks. Instead, data is first written to an erase block and then merged with the existing contents of the drive to complete the write sequence. This merger process is rated with a write coefficient that compares the amount of data managed in DRAM, on the ATA bus, and in the drive’s buffers against the size of the actual write. Many of today’s solid state drives have a write coefficient of about 20:1, meaning 1GiB of written data forces the computer to manage 20GiB before the write can ever happen.
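The write coefficient is simply a ratio, so the 20:1 figure quoted above can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names are made up for illustration and do not model how any real controller accounts for its traffic.

```python
GIB = 1024 ** 3

def write_coefficient(managed_bytes: int, written_bytes: int) -> float:
    """Ratio of data shuffled around to data the user actually asked to write."""
    return managed_bytes / written_bytes

def data_managed(written_bytes: int, coefficient: float) -> float:
    """Total bytes the system must handle for a write of the given size."""
    return written_bytes * coefficient

# A 1 GiB write at a 20:1 coefficient
print(data_managed(1 * GIB, 20) / GIB)  # GiB handled before the write completes
```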
The erase blocks set aside for write merging average 1MiB in size, meaning that a file must be written in whole 1MiB increments to achieve optimal write performance. The difference in write sizes can have staggering implications: transferring 32MiB of data in 1MiB chunks can exceed 80MiBps, or three times the performance of that same 32MiB written in 4KiB chunks. As the average write is less than 50KiB, users are often underwhelmed by their drive’s I/O.
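One way to see why chunk size matters is to count how many separate writes the same 32MiB transfer requires at each size. The sketch below does only that counting; it assumes the 1MiB average erase block quoted above and does not attempt to model throughput. The helper names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
ERASE_BLOCK = 1 * MIB  # average erase block size quoted in the text

def write_count(total_bytes: int, chunk_bytes: int) -> int:
    """Number of writes needed to move total_bytes in chunk_bytes pieces."""
    return -(-total_bytes // chunk_bytes)  # ceiling division

def is_block_aligned(size_bytes: int, block_bytes: int = ERASE_BLOCK) -> bool:
    """True when a write is a whole multiple of the erase block size."""
    return size_bytes > 0 and size_bytes % block_bytes == 0

transfer = 32 * MIB
print(write_count(transfer, 1 * MIB))   # writes needed in 1MiB chunks
print(write_count(transfer, 4 * KIB))   # writes needed in 4KiB chunks
print(is_block_aligned(50 * KIB))       # a typical small write
```

The same transfer takes 32 writes in 1MiB chunks but 8,192 writes in 4KiB chunks, and a typical 50KiB write never lines up with an erase block at all.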
Researchers hope to refine NAND so small block performance eclipses that of mechanical drives, but the reality of NAND is that the tremendous jumps in speed will continue to grace the big block writes.
Reading data
Today’s solid state disks have similar initial read and write speeds, but read throughput can be over 25 percent faster with a proper block size. More impressively, the maximum write speeds are still climbing, having approached the 200MiBps barrier in the space of just nine months.
NAND’s position as mass storage means that it abandons some of the features that make NOR faster in reads. The biggest loss is eXecute in Place (XIP) technology, which allows code to be executed directly in flash space. Instead, NAND must copy requested data to system RAM before it can be run.
It is plain that moving data in order to manipulate it is inefficient, but the technique is not as devastating to read performance as it is to write performance. This is because reads rely on the tremendous speed of DRAM to achieve their scores. Though DDR2-SDRAM latency is a shade slower than NAND at 60ns, DRAM posts throughput surpassing 1100MiBps as today’s flash disks approach the 200MiBps barrier.
While this seems fast, sustained read performance suffers because every block of data must be read in its entirety, and in sequence. In just the same way that humans can find a page in a book much faster than they can read the page, solid state disks can locate data far faster than they can read through it.
Random read performance is significantly better. Drives can access over a thousand files per second with a 0.1ms seek time, and burst throughput is at 250MiBps and climbing. This means that tasks like opening programs, opening small files and even on-demand file loads inside a game can be quite quick.
NAND’s true enemies
While it is easy to believe that design trade-offs have stunted NAND, we have only begun to scratch the surface of its potential. Flash memory’s biggest opponent is the present hard drive ecosystem, which was caught entirely unprepared for a technology of flash’s nature. The way in which we talk to hard drives has gone largely unchanged since the early nineties; the advent of Serial ATA and ever-increasing magnetic capacities have done little to alter it.
Clustering
Today’s approach to mass storage assumes a mechanical drive which, without aid, is a single undivided mass of bits, unlike NAND with its blocks. Operating systems rely on an abstraction layer known as a file system to logically divide this contiguous swath of data into smaller manageable pieces known as clusters. Operating systems are so reliant on file systems that stored data is simply unreadable without one.
While there are many prevalent file systems, the 1996 introduction of FAT32 with Windows 95 OSR2 was a great success that practically institutionalized the 4KiB cluster size. Such a size was chosen to alleviate the tremendous slack space suffered by FAT16’s 32KiB cluster size in an age of tiny files. This 4KiB cluster size has been carried forward into NTFS, the file system of choice for Windows 2000, XP and Vista.
We now live in a world where most would not bat a lash at a file size that may not have fit on a hard drive from the earliest years of FAT32’s life. Though it’s true that such large files would fit neatly into NAND’s 4KiB pages when clustered, this obscures the larger point that flash-based devices can manipulate 1MiB of data as quickly as mechanical hard drives do 4KiB.
The solution to the problem is to increase the cluster size, for which there are several advantages:
- Reduced file system complexity; fewer clusters mean less to organize.
- Increased read and write speed as cluster size approaches parity with block size.
- Decreased slack space if the system is primarily composed of large files.
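The trade-off behind these points can be sketched by counting clusters and slack for one large and one small file at two cluster sizes. The file sizes below are hypothetical examples chosen for illustration, and the helper names are not taken from any file system API.

```python
KIB = 1024
MIB = 1024 * KIB

def clusters_needed(file_bytes: int, cluster_bytes: int) -> int:
    """Whole clusters required to hold a file."""
    return -(-file_bytes // cluster_bytes)  # ceiling division

def slack(file_bytes: int, cluster_bytes: int) -> int:
    """Bytes allocated to the file but not used by it."""
    return clusters_needed(file_bytes, cluster_bytes) * cluster_bytes - file_bytes

large_file = 700 * MIB  # hypothetical movie-sized file
small_file = 10 * KIB   # hypothetical small system file

for cluster in (4 * KIB, 1 * MIB):
    print(cluster // KIB, "KiB clusters:",
          clusters_needed(large_file, cluster), "clusters for the large file,",
          slack(small_file, cluster) // KIB, "KiB slack for the small file")
```

With 4KiB clusters the large file needs 179,200 clusters to track; with 1MiB clusters it needs only 700. The small file tells the opposite story, wasting 2 KiB at the small cluster size and 1,014 KiB at the large one.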
Yet increased cluster size is not a magic bullet for solid state disks, as most people have a mix of information. Games often contain a myriad of small files and operating systems are the sum of small files almost as a rule; yet movies, music, archives and MMOs are perfect candidates for enlarged cluster sizes. More frustrating than the anchor of small clusters is the complicated process to get larger clusters under modern Windows operating systems. Such a feat requires premeditated use of programs like Acronis Disk Director which can increase cluster sizes prior to the installation of Windows. It is also possible to resize existing clusters, but such a procedure is accomplished with a frighteningly varied degree of success.
Hard Drive Controllers
Today’s drive controllers, like cluster sizes, were built for the relatively simple mechanical drive. They assume that the operating system continues to manage disk I/O and that data operations can be performed directly within the disk space. This approach ignores that flash drives do considerable self-management and are forced to make monumental exchanges of data due to the write coefficient.
Various approaches have managed to improve the bleak outlook on solid state drive control. Intel reports that its controller has reduced the write coefficient, which it calls write amplification, to just 1.1 times the size of the intended write. This approach alleviates the burden on the SATA bus, the DRAM subsystem, and the drive’s own techniques for placing clusters into storage.
Operating Systems
But hardware controllers are only half of the equation. Windows, Linux and other operating systems are ultimately responsible for how the data gets to the controller for management, and most are not yet optimized for flash storage. Microsoft Windows is especially ill-equipped to communicate intelligently with today’s flash drives, much less their successors. Given that the primary test platform for flash disk review has been Windows, one wonders how much the early reputation flash had for poor performance can actually be attributed to the drives.
Not only is Windows guilty of being a poor traffic controller, Windows-based systems are particularly fond of heavy disk access. Fixated with indexing, swapping, buffering, caching and background optimizing, Windows is analogous to torture for today’s flash-based devices. This brand of drive interaction is another clear indicator that today’s drive ecosystem has been built around the radically dissimilar mechanical drive.
Ready to 










