Here Is the Physics Behind the Magic 80% Rule for Media Storage and File Systems
You Shall Never Exceed 80% Usage. Ever. But Why?

We’ve all heard it: the advice of both storage vendors and solutions providers to never use more than 80% of your storage. Otherwise … Otherwise what? What terrible disaster can possibly happen if this rule is ignored?

Only a few obedient, safety-oriented users actually follow the instructions. The majority feels bold enough to disregard the warnings. In their minds, they just want to use the entire capacity of the storage space they’ve purchased. 500 TB should be 500 TB and not 400 TB, they figure. Why lose 100 TB of precious space for unknown reasons?

I feel the urge to write this article because I’ve seen many users, full of optimism, headed straight into disaster. Stunned to see that so many (even admins!) don’t seem to care, I want to spread the word: media storage is not an empty barrel that can be filled until it spills. You will run into a whole lot of problems trying to fill that barrel, long before it runs over.

So Why Is There an Invisible Limit Assigned?

Mostly because the drive performance guaranteed by the vendor will drop dramatically after the magic limit has been exceeded. Unless you are lucky enough to be using only SSD/NVMe-based storage (in which case you may want to stop reading, as we are going to talk about boring spinning disks), both physical limitations and fragmentation play a big role in disk performance.

Some Physics First

Sector Sizes

As you probably know, there are different sector sizes on hard drives. The old standard drive with 512-byte sectors, living in a 3.5-inch chassis, served us well for many years. When the demand for more storage space rose, the manufacturers were challenged to find more efficient ways to store data. One of their ideas was to pack more sectors into the same physical space on the platter, which shrinks the physical size of each sector.
The significantly narrower sectors allowed more data to be stored, but the downside was that disk error correction didn’t work as efficiently anymore. With those tiny magnetic particles being squeezed so closely together and the disks spinning faster and faster, more read and write errors occurred over time.

Vendors agreed on one solution in 2009-2010. The plan was to move forward with the Advanced Format, which increased the sector size to 4 KB, eight times larger than before. It makes more effective use of the bytes in each sector and of the ECC (see picture) while keeping the same physical dimensions of the hard disk. Comparing 512-byte and 4 KB sector drives also shows an increase in throughput per rotation, as the 4 KB block is more efficient.

Drives that still present 512-byte logical sectors on top of 4 KB physical sectors, for controllers and operating systems that expect the old size, are called 512e. This emulation usually has no disadvantage in read performance, while there is a small penalty for writes. A 512-byte write is not a plain write: the complete 4 KB sector containing the 512 bytes that are to be overwritten has to be read first, so that the requested 512-byte block can be modified and the new 4 KB sector written back to the platter. This is more accurately described as a read-modify-write procedure, which costs an additional rotation or two.

Speed

The second critical factor is the rotational speed, measured in rotations per minute (RPM), that the disk can provide. Slower drives at 5400 RPM can usually be found in laptops or as external drives. The standard 7200 RPM drives are considered consumer or nearline drives. In the enterprise arena, you will find drives with 10K and 15K RPM. Those drives will certainly give you higher performance per drive, and they are built to run 24/7 for years.

Read/Write

A new disk will always write data to the outer cylinders first and then work its way into the inner tracks. Logically, you can store more sectors on the outer portion of a disk.
So it makes sense to perform a speed test in that outer area, where more sectors can be accessed in one rotation. A basic performance table comparing all those drives could look like this:

SIZE    RPM     READ
2 TB    5400    80 MB/s
2 TB    7200    100 MB/s
2 TB    10K     115 MB/s
2 TB    15K     130 MB/s

These numbers are fictitious, but you get the idea.

The Price of Unrelenting Physics

For data security, many set-ups lose roughly 30% of their space to the RAID level right off the bat. On top of that, the manufacturers make the impudent demand to leave another 20% of the storage untouched and unused. So we basically pay double the price for what we can actually use. That doesn’t seem fair, right?

The inevitable reality, though, is that you start fighting unyielding physics when you reach a certain area on your hard disks. As illustrated before, the outer tracks of a disk naturally hold more sectors than the inner tracks, so more data can be written or read per rotation there. That means the drives in the fictitious performance table above no longer deliver the same performance once more than roughly 80% of the disk has been utilized. After that threshold has been breached, a more realistic table will probably look like this:

SIZE    RPM     READ
2 TB    5400    40 MB/s
2 TB    7200    55 MB/s
2 TB    10K     80 MB/s
2 TB    15K     85 MB/s

Of course, the actual numbers depend heavily on drive size, model and vendor. Here is a comparison of two 2 TB SATA 3G drives, one an HGST (HUS724040AL) and the other a WD (WD2003FYYS):

DRIVE   BEGINNING OF DISK   50% OFFSET FROM BEGINNING   90% OFFSET FROM BEGINNING
HGST    135.08 MB/sec       114.99 MB/sec               74.02 MB/sec
WD      101.62 MB/sec       78.70 MB/sec                49.02 MB/sec

What’s with the 80% Boundary?

The manufacturers (mainly in the media and entertainment, or M&E, vertical) advising you not to use more than 80% of the disk space want to make sure that you get the performance they have promised. They know what will happen when your utilization crosses this threshold: performance will decrease dramatically and will miss the promised specs. That’s it.
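
To make the outer-versus-inner-track effect more tangible, here is a minimal Python sketch of a simplified model. It is an illustration under stated assumptions, not how any vendor specifies a drive: it assumes constant areal density, constant RPM, and a disk that is filled strictly from the outer edge inward, so that sequential throughput is proportional to the track radius. The platter radii and the function name `relative_throughput` are hypothetical values chosen for the example.

```python
import math

# Assumed platter geometry in millimetres. These are illustrative values,
# not taken from any datasheet.
OUTER_RADIUS_MM = 46.0
INNER_RADIUS_MM = 20.0


def relative_throughput(fill_fraction,
                        r_outer=OUTER_RADIUS_MM,
                        r_inner=INNER_RADIUS_MM):
    """Sequential throughput at a fill level, relative to the outermost track.

    Simplified model: constant areal density, constant RPM, data laid out
    from the outer edge inward. Under these assumptions the data passing the
    head per rotation is proportional to the track radius.
    """
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    # Stored capacity is proportional to the ring area already used, so the
    # squared radius of the current track shrinks linearly with the fill level.
    r_squared = r_outer ** 2 - fill_fraction * (r_outer ** 2 - r_inner ** 2)
    return math.sqrt(r_squared) / r_outer


for pct in (0, 50, 80, 90, 100):
    ratio = relative_throughput(pct / 100)
    print(f"{pct:3d}% full -> {ratio:.2f} x outer-track speed")
```

With these assumed radii the model lands at roughly 0.59 of the outer-track speed at 80% and 0.52 at 90%. That is in the same range as the measured drives in the comparison above, which retain about 0.55 (HGST) and 0.48 (WD) of their initial speed at the 90% offset. Real drives use discrete recording zones and differ in geometry, so treat the curve as a rough guide only.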
This standard advice is not exactly reprehensible. It is the natural result of our expectation of affordable storage solutions. Plus, let’s be honest for a second: people get used to high speeds in system performance very quickly. That means that after working with a really fast system and enjoying high speed for just a few weeks, they will expect the same performance forever, no matter what.

The Cheat to Keep High Performance

With so many users ignoring advice to leave 20% of their storage untouched, some manufacturers overcome the issue by tricking you: they pretend to provide you with 100% usage of the storage/file system while in reality only giving you access to the first 80% of a disk. For example, you may purchase 100 TB and actually get 120 TB delivered, but since you only see 100 TB of usable space, you won’t face any slowdowns after you’ve reached 80% file system usage.

Feeling cheated? Let it sink in for a second, and compare it with SSDs. It is quite common that you buy a 500 GB SSD that, in reality, has 530 GB or even more under the hood. Yes, the reason for this is to remap bad and burnt-out cells, but as a result the lifespan of your SSD increases.

So, at the end of the day, this “trick” actually solves your problem, doesn’t it? You just don’t see the slow portion of the disks, the portion that you weren’t supposed to use anyway, right? So be grateful for this little trick, as it saves your butt in terms of productivity and system performance.

$ vs. TB

Many users look at their storage in a “$ per TB” fashion and yes, you will always only get what you paid for. I’m not implying that cheaper hard drives are not worth the money, but let’s face it: with more cost-efficient disks, certain limitations will be hit faster. However, even if you choose enterprise-grade drives with 10K or 15K RPM, you will face the same issues in the long run.
The only difference will be how deep you had to dig into your pockets, because you still want to obey the rule to leave 20% of the disk untouched to keep the performance up.

Bottom Line

No matter who you ask, the 80% rule applies to most of us using spinning disks. Tricks such as creating huge hidden files on a fresh file system only save you until the space is urgently needed and you have to release the blocked space. It doesn’t matter if you have a SAN or a scale-out NAS; the impact is the same, and your whopping 8 GB/s performance cluster can potentially drop to a few hundred megabytes per second, if even that.

And please don’t forget to add the fragmentation level of a generic file system to the mix. As you know, that steals additional performance. Even if you are obedient and follow the rules, you can’t avoid some bytes ending up in the inner area of the disk. If you are interested in reading more about the impact of fragmentation, please read my earlier article on that subject.
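
As a closing illustration, the capacity and cost arithmetic from “The Price of Unrelenting Physics” and “$ vs. TB” can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation: the 30% RAID overhead and the 80% fill limit are the rough figures used in this article, while the function names and the price of 100 per raw TB are hypothetical placeholders.

```python
def usable_capacity_tb(raw_tb, raid_overhead=0.30, fill_limit=0.80):
    """Capacity you can fill before hitting the slow part of the disks.

    raid_overhead: fraction of raw space lost to RAID (article's rough 30%).
    fill_limit:    fraction of the remaining space you should use (80% rule).
    """
    return raw_tb * (1.0 - raid_overhead) * fill_limit


def cost_per_usable_tb(price_per_raw_tb, raid_overhead=0.30, fill_limit=0.80):
    """Effective price per TB once RAID overhead and the fill limit apply."""
    return price_per_raw_tb / ((1.0 - raid_overhead) * fill_limit)


raw_tb = 500.0            # purchased raw capacity
price_per_raw_tb = 100.0  # hypothetical price, in any currency

print(f"80% rule only:    {usable_capacity_tb(raw_tb, raid_overhead=0.0):.0f} TB")
print(f"RAID + 80% rule:  {usable_capacity_tb(raw_tb):.0f} TB")
print(f"Price per usable TB: {cost_per_usable_tb(price_per_raw_tb):.2f}")
```

With these figures, 500 TB of raw space shrinks to 400 TB under the 80% rule alone and to 280 TB once RAID overhead is included, and the effective price per usable TB is about 1.8 times the raw price, which is the “basically double” mentioned earlier.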