Five 9s

March 6, 2011 - How Much is That Down Time?

The term "five nines" is used in information technology to refer to reliability or up time. If a system delivers five 9s, it is running 99.999% of the time; the name comes from the five nines in that percentage.

Five 9s is actually pretty good. Over the course of a year, it means that a system will be down a total of about 315 seconds, or roughly 5 minutes. That's less than 30 seconds per month. Five 9s is a de facto industry standard for reliability and is widely considered the ultimate. Everything breaks eventually, after all.

Recently I saw an advertisement for hosting that guaranteed three 9s (99.9%) up time. I was struck by that claim because if we are just talking casually among ourselves and someone says "it will happen 99% of the time," the person usually means that the opposite case is almost impossible. 99.9% is better, right? Alas, in the systems world 99% is not very good and 99.9% is not good enough.

Here's a quick table that might shed some light.

                   Time Down Per Year
Up        Down     Seconds   Minutes   Hours
99.999%   .001%        315         5     .09
99.99%    .01%       3,154        53     .88
99.9%     .1%       31,536       526     8.8
99%       1%       315,360     5,260      88

That makes two 9s (99%) look very bad at 88 hours down time per year. Each of the rows in the table differs by one order of magnitude, which means that each extra "9" lowers the down time by a factor of 10.
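
The figures in the table can be reproduced with a few lines of Python. This is only a sketch, assuming a 365-day year (31,536,000 seconds); the function name downtime_seconds_per_year is made up for illustration.

```python
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def downtime_seconds_per_year(nines: int) -> float:
    """Seconds of down time per year for an up time written with `nines` nines."""
    # Five nines (99.999% up) leaves a down fraction of 10^-5, i.e. .001%.
    down_fraction = 10 ** -nines
    return SECONDS_PER_YEAR * down_fraction


for nines in (5, 4, 3, 2):
    seconds = downtime_seconds_per_year(nines)
    print(
        f"{nines} nines: {seconds:,.0f} seconds, "
        f"{seconds / 60:,.0f} minutes, {seconds / 3600:.2f} hours"
    )
```

Because the down fraction is a power of ten, dropping one 9 multiplies every column by 10, which is the order-of-magnitude step between rows.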

So is 99.9% good? At nearly 9 hours down per year, probably not.

If you're wondering how an organization can afford to be down for even 5 minutes per year, you're asking about a different measure called availability. Near-100% availability is the result of redundancy and is common in big systems like those run by Amazon or Google. It is rare in shared hosting, where a site is typically running on a single, non-redundant server. In such cases, reliability is the same as availability: you're only as good as the box you're running on.
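
To see why redundancy helps so much, here is a rough sketch in Python. It rests on assumptions that are not from the discussion above: every server has the same up time, servers fail independently of one another, and the site stays up as long as at least one server is running. Real systems rarely meet all three, so treat the numbers as an upper bound. The function name combined_availability is hypothetical.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant servers, each up `single` of the time.

    Assumes independent failures and that any one running server is enough.
    """
    # The site is down only when every replica is down at the same moment.
    all_down = (1 - single) ** replicas
    return 1 - all_down


for replicas in (1, 2, 3):
    availability = combined_availability(0.99, replicas)
    print(f"{replicas} server(s) at 99% each: {availability:.4%} available")
```

Under these assumptions, two servers at two 9s each behave like roughly four 9s, and a third brings the total to about six 9s. The result approaches 100% without ever reaching it.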

 

Tags: Availability, Hosting, Reliability, Systems