How does redundancy improve reliability? (in context of CSE / Computer Arch.)


Could you give me a proper idea as to why this logic works and is used extensively? Any articles (small ones) would also be appreciated.


Let’s say you have a system that will fail 50% of the time you use it. You want more reliability.

Having 2 of the systems, and using the 2nd if the first one fails, changes things. Each system still fails 50% of the time, but as long as the failures are independent of each other, the odds of BOTH systems failing are 25% (0.5 × 0.5). Add a 3rd and it’s 12.5%. Each redundant system you add makes it less and less likely that EVERY system will fail. With 10 of them, the chance that all of them fail is about 0.1%, so even if you have to go through all 10 you’ll almost certainly get your answer.
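The arithmetic above can be sketched in a few lines of Python. This is only an illustration, and it assumes the failures are independent (one system failing doesn’t make another more likely to fail). The function name `all_fail_probability` is made up for this example.

```python
def all_fail_probability(p_fail: float, copies: int) -> float:
    """Chance that every one of `copies` independent systems fails.

    Assumes each system fails with the same probability `p_fail`
    and that the failures are independent of each other.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Independent events: multiply the individual failure probabilities.
    return p_fail ** copies


for n in (1, 2, 3, 10):
    print(f"{n} system(s): {all_fail_probability(0.5, n):.4%} chance that all fail")
```

If the systems tend to fail together (same power supply, same bug), the real number is worse than this simple multiplication suggests.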

Now, usually it’s a better idea to fix what causes the failure. But for some things, we know it’s not possible to have a 0% failure rate. So if we don’t like the failure rate and have no easy way to prevent failure, having a redundant system gives us more chances to roll the dice.

If you have a system with a single component, then the system has the same reliability as that component. If the component fails, the system fails as well.

If you create a system with multiple redundant components, then the failure of a single component no longer means the entire system fails. As long as there is a single component that didn’t fail, the system remains functional.
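In software, this idea often shows up as failover: try one component, and if it fails, move on to the next. Here is a minimal sketch, not a production pattern. The names `run_with_failover`, `broken_component` and `working_component` are hypothetical and exist only for this example.

```python
from typing import Callable, Sequence


def run_with_failover(components: Sequence[Callable[[], str]]) -> str:
    """Try each redundant component in order and return the first success.

    The system as a whole only fails if every component fails.
    """
    errors = []
    for component in components:
        try:
            return component()
        except Exception as exc:  # this component failed, try the next one
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} redundant components failed")


def broken_component() -> str:
    # Stand-in for a component that has failed.
    raise ConnectionError("component is down")


def working_component() -> str:
    # Stand-in for a healthy redundant component.
    return "ok"


# The first component fails, the second takes over.
print(run_with_failover([broken_component, working_component]))
```

With only `broken_component` in the list, the call raises `RuntimeError`, which is the single-component case: the component fails, so the system fails.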

As no component is 100% reliable, using redundant components improves the overall reliability.

An example could be providing electricity to you:

If there is a single cable providing electricity from the power plant to you, you will lose power if that cable is broken due to roadworks.
If there are 2 redundant cables providing electricity to you, a break in one cable means you are still getting electricity from the other cable that didn’t break.
(*Redundancy* in cabling means, in this case, a second cable that takes a different route.)