RAID Storage and how it works

I keep hearing from people in tech communities that keeping things in RAID is somehow better than just slapping it on a hard drive backup, but… why? What even is RAID storage and why is it better?

I spent last night looking up RAID servers on Amazon to get me an answer but I left with more questions: Do all the hard drives in RAID need to be the same size? My entire family switched their laptops due to the work from home orders, so I have a bunch of unused SSDs and HDDs lying around the house, can I just throw those in a RAID server and have it work?

Can I install an operating system on a RAID server? Would it run faster/slower? If I have a MacBook and can install Windows on my RAID server, would it work a bit like bootcamp? What happens when I’m not connected to it?

I HAVE SO MANY QUESTIONS!

EDIT: I think I get it now: It’s a system that’s automatically backing itself up to… itself essentially. You do this by creating redundancies within the RAID system (multiple SSDs). If that’s not it, please tell me.

So final question: can I use a RAID server NOT in RAID? As in can I put multiple hard drives in a server, then plug the server in to my Mac and have it appear as several individual drives? Almost like a HDD mounting system on steroids.

6 Answers

Anonymous 0 Comments

Best I can tell, for a simple answer: a RAID is multiple drives doing the job of a single drive.

1 TB of portable storage holds roughly 1 TB of files, and if it fails you lose that data.

4 TB of RAID storage might still hold only 1 TB of files and would appear as a 1 TB drive when you plugged it in, but it keeps a copy of that data on every drive. Should one of the drives fail, you just remove it and replace it with a new drive, and the remaining 3 drives automatically copy the lost data onto it. Similarly, anything you save to the array is stored on all of them.

The advantage of having them connected this way seems to just be ease of use. It’s more expensive, but you are less likely to have to think about backing up. The general rule is that you still should, though: for files that matter, have the machine creating the data, a backup of that data stored locally, and a further backup stored in a different physical location.

Anonymous 0 Comments

RAID stands for “Redundant Array of Independent Disks” (or sometimes “Inexpensive” instead of “Independent”).

The fact is that computer parts do fail. Hard drives are especially prone to it because they have spinning platters built to incredibly tight tolerances, so mechanical failure is absolutely a thing. RAID is designed to address this by using several disks to cover for each other, at the cost of some storage capacity. If you need a system that can survive a disk failure, RAID is essential. The alternative is to reinstall all the software and restore from backups, which can take hours or days during which the computer is effectively dead. With RAID it can soldier on in spite of a dead hard drive.

Various schemes exist, with the most common being:

* RAID-1 – two or more hard drives are kept exactly identical

* RAID-5 – made of many hard drives, one hard drive’s worth of capacity in a group is lost, but any one hard drive can die and the system keeps going. Uses math to calculate missing data from remaining disks when one dies.

* RAID-6 – similar to RAID-5, but you can tolerate two dead disks at the cost of losing two disks’ worth of capacity. The math is a lot more complicated though.
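The “math” behind RAID-5 is, at its simplest, XOR parity. Here is a minimal Python sketch of that idea, assuming three data disks and one parity disk; the helper name `xor_blocks` and the sample data are made up for illustration. Real RAID-5 spreads the parity blocks across all the disks rather than keeping them on one, and RAID-6 needs a second, more involved calculation that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block of data per (hypothetical) data disk.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks; it lives on another disk.
parity = xor_blocks(data)

# Pretend the second disk died: XOR the surviving blocks with the parity
# block and the missing block falls out.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

This works because XOR-ing a value with itself cancels out, so whatever is left after combining the survivors with the parity is exactly the block that went missing.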

Generally all disks need to be the same size, or else each disk will be treated as being only as large as the smallest disk in the set. It’s necessary because of how the schemes work.
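To put numbers on the capacity rules above, here is a rough Python sketch. The function name `usable_capacity` is hypothetical, it only covers the three schemes listed, and it assumes the simple “every disk is cut down to the smallest one” behavior; actual products may differ.

```python
def usable_capacity(level, disk_sizes_tb):
    """Approximate usable space in TB for a simple RAID-1, -5 or -6 array."""
    count = len(disk_sizes_tb)
    smallest = min(disk_sizes_tb)  # every disk is treated as this size

    if level == 1:
        if count < 2:
            raise ValueError("RAID-1 needs at least two disks")
        return smallest  # all disks hold identical data
    if level == 5:
        if count < 3:
            raise ValueError("RAID-5 needs at least three disks")
        return (count - 1) * smallest  # one disk's worth goes to parity
    if level == 6:
        if count < 4:
            raise ValueError("RAID-6 needs at least four disks")
        return (count - 2) * smallest  # two disks' worth goes to parity
    raise ValueError("only levels 1, 5 and 6 are covered in this sketch")


# Two 4 TB disks and one 2 TB disk in RAID-5: 10 TB raw, but each disk
# counts as 2 TB and one disk's worth is lost to parity.
print(usable_capacity(5, [4, 4, 2]))  # 4
```

So a pile of mismatched laptop drives would work, but much of the space on the bigger ones would go unused.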

Amazon’s “RAID servers” are just servers that have additional hardware to do the work of RAID. Otherwise they look like a normal computer. You can buy such add-on cards but they’re usually expensive: a few hundred dollars to start with and easily $1,000 for the good ones. The RAID hardware provides the illusion to the computer that there is only 1 hard drive, but in fact there are several and all the RAID protection work is done by the hardware. Software versions also exist but must be set up by the user during installation of the operating system and may have other limitations.

Whether RAID makes a system run faster or slower depends on the workload. Reading data is usually faster because RAID controllers now have flexibility in selecting which hard drive to read data from, or data is distributed across multiple disks allowing them to work together to run faster. However writing may be slower because multiple disks need to participate and there is math to be done. If a hard drive has died then RAID-5 and 6 get slower because the lost data must be recalculated and all the other disks need to participate in the recovery of the lost data.

Anonymous 0 Comments

> I keep hearing from people in tech communities that keeping things in RAID is somehow better than just slapping it on a hard drive backup, but… why? What even is RAID storage and why is it better?
>

RAID is a way of effectively stapling together hard drives to create one logical drive. In most configurations it allows for some amount of redundancy.

> I spent last night looking up RAID servers on Amazon to get me an answer but I left with more questions: Do all the hard drives in RAID need to be the same size? My entire family switched their laptops due to the work from home orders, so I have a bunch of unused SSDs and HDDs lying around the house, can I just throw those in a RAID server and have it work?
>

With most RAID technologies, either all the drives need to be identical, or all drives are only effectively as large as the smallest drive in the array. With some technologies you can use drives of different sizes, but that can be tricky.

> Can I install an operating system on a RAID server? Would it run faster/slower? If I have a MacBook and can install Windows on my RAID server, would it work a bit like bootcamp? What happens when I’m not connected to it?
>

For a dedicated RAID server you will install a specific kind of OS that is used for managing storage arrays. The two that come to mind are FreeNAS and Unraid.

Anonymous 0 Comments

For most purposes, a RAID array is functionally identical to a single drive. The advantage RAID has over a backup is that it *is* its own backup. This leaves you with less overhead earmarked for redundancy, and therefore more storage space. All drives need to have the same capacity, although it may be possible to use differing drive sizes and just use portions of each equivalent to the smallest drive.

Edit: Spelling, Clarification

Anonymous 0 Comments

>EDIT: I think I get it now: It’s a system that’s automatically backing itself up to… itself essentially. You do this by creating redundancies within the RAID system (multiple SSDs). If that’s not it, please tell me.

RAID is not a backup. RAID is a form of data redundancy. While some people use a server that has a RAID disk array to back up data from their computers (such that the data now exists on their computer and on the RAID server in their basement or the back office), the RAID array by itself is not a backup.

What it is, and this is key, is a *redundant* system. If one disk dies, you don’t lose the data on it because that disk can be “rebuilt” using the data spread across the other disks. Different forms of RAID support a higher number of disk failures (at a higher “cost” of disk space being dedicated to the redundancy part).

>So final question: can I use a RAID server NOT in RAID? As in can I put multiple hard drives in a server, then plug the server in to my Mac and have it appear as several individual drives? Almost like a HDD mounting system on steroids.

Yes, you can absolutely skip the RAID configuration on a RAID-capable server/computer and simply use it in “JBOD” (Just a Bunch Of Disks) format, where you’ve just got a bunch of disks in a computer that are network accessible.

That said, you need to start with what you want to accomplish and work backwards.

If you don’t know what the goal in setting up this hardware is, then you’re going to end up wasting time and money.

What do you want out of this project? Do you want a home server that has lots of storage to back up files from multiple computers (and possibly serve up media like family photos and your movie and TV show collection)? Do you want a computer that has incredibly high read/write speeds for some specific application you require, like database management or rendering?

Configuring a device for RAID is a specific thing for specific tasks. If you have no idea what those tasks are and have no need for them, then there’s no real reason to bother with it.

For example, if your only goal is to have a backup of important files and those files aren’t changing frequently… then just buying a $200 high-capacity external hard drive and copying files over to it once a week will meet your needs without spending a ton of money on a multi-disk home server.

Anonymous 0 Comments

RAID is just a way to use several HDDs together.

It has nothing to do with backup.

RAID is not a form of backup! (This is a vitally important concept to grasp and has ended the careers and businesses of a number of people who had not fully internalized it. RAID by itself is not a backup solution!)

What RAID is, is a way to use two or more physical drives together so that they appear as a single drive to the computer and provide some level of redundancy and increased performance (in some versions at least).

There are several schemes of RAID, which are usually labeled with numbers: RAID 0, RAID 1, RAID 5, RAID 6, RAID 10 etc.

The simplest version is RAID 1.

RAID 1 is just two drives together which both have exactly the same contents.

Every time you write something to a RAID 1 drive you write it to both drives at the same time. You can read from either drive. If you set things up right you can actually read both drives at the same time (different info from each) doubling your read speed compared to normal drives.

If either drive fails the data will still be there on the other drive. This is what we call redundancy (the R in RAID).
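As a toy illustration of that mirroring behavior, here is a small Python sketch. The `Mirror` class and its method names are invented for this example and each drive is just a dictionary; a real RAID 1 setup works on disk blocks inside the controller or operating system.

```python
class Mirror:
    """Toy RAID 1: every write goes to all working drives."""

    def __init__(self, drive_count=2):
        self.drives = [{} for _ in range(drive_count)]
        self.alive = [True] * drive_count

    def write(self, block, data):
        # The same data lands on every drive that is still working.
        for drive, ok in zip(self.drives, self.alive):
            if ok:
                drive[block] = data

    def read(self, block):
        # Any working drive can answer, since they all hold the same data.
        for drive, ok in zip(self.drives, self.alive):
            if ok:
                return drive[block]
        raise OSError("no working drives left")

    def fail(self, index):
        # Simulate a dead drive: its contents are gone.
        self.alive[index] = False
        self.drives[index].clear()


array = Mirror()
array.write(0, b"holiday video")
array.fail(0)               # first drive dies
recovered = array.read(0)   # still served by the second drive
print(recovered)
```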

The next most common scheme is RAID 5. It uses three or more drives. It distributes the data across the drives in such a way that even if one disk fails you still have all your data.

You can have half a dozen disks together in a RAID 5 array and they will appear on the computer as a single drive. The capacity of the drive will be that of all the drives together minus one.

RAID 6 is the same, but instead of having just one disk of redundancy you have two, meaning that you also lose the capacity of two disks from your total. It is worth it on some occasions.

RAID 0 is not really a RAID at all, because it doesn’t provide any redundancy. It just puts a number of different disks together and presents them to the computer as a single one with the capacity of all the drives combined.

If any single drive fails in a RAID 0 all your data is lost.

It may be useful if you don’t care about preserving your data and just need some place to temporarily store huge amounts of data cheaply and with high read and write speeds.
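RAID 0 typically combines the disks by “striping”: chopping the data into chunks and dealing them out to the disks in turn, which is where the speed comes from and also why one dead disk ruins everything. A minimal Python sketch of that idea follows; the function names, the chunk size of 4 bytes, and the sample data are all arbitrary choices for illustration.

```python
def stripe(data, disk_count, chunk_size=4):
    """Deal fixed-size chunks of data across the disks in round-robin order."""
    disks = [[] for _ in range(disk_count)]
    for n, start in enumerate(range(0, len(data), chunk_size)):
        disks[n % disk_count].append(data[start:start + chunk_size])
    return disks


def unstripe(disks):
    """Reassemble the original bytes; every disk has to be present."""
    if any(disk is None for disk in disks):
        raise OSError("a disk is missing, so the stripe set cannot be read")
    chunks = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


original = b"family-photos-2020.tar"
disks = stripe(original, disk_count=3)
print(unstripe(disks) == original)

disks[1] = None  # one disk dies
try:
    unstripe(disks)
except OSError as error:
    print(error)
```

Every file larger than one chunk ends up with pieces on several disks, so losing any one of them leaves holes in everything.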

RAID versions with double digits combine the above concepts. RAID 10 is RAID 1 + RAID 0 for example.

The important thing is that all of the above is just describing how disk drives are connected to a computer. Nothing else.

You can take your standard Windows desktop PC, put two or more drives into it, and configure them so the computer treats the RAID array as a single drive, for example.

There are ways to do that in the operating system itself or by using dedicated hardware. The hardware part means that it won’t take resources away from the OS to decide where to put which data.

You can build or buy a computer which does little else but run an array of disks in a RAID and presents them to the outside world as a fileshare or similar. This is called a NAS. The NAS doesn’t need to use RAID technology but could always present each individual disk to the outside as a different network share. This is not a good idea.

There are other devices that don’t show their array to the outside via network technology but use the same sort of storage technology that computers use to connect to drives internally. These types of SAN solutions can be quite expensive and are usually only used in professional environments.

In recent years RAID has lost some of its importance. It was originally invented because it allowed you to use a number of small, cheap drives as a single big one that was fast and unlikely to fail completely and destroy all your data.

Over the last few years SSDs have become reasonably cheap and quite big by themselves. Operating systems like Windows have built-in ways to access and divide up drives that are more flexible than traditional RAID, and in enterprise environments new technologies have taken hold.

But to repeat myself: don’t think of RAID as a form of backup!

You can put your backups on a RAID. That is a good idea. However if you are running your computer on a RAID directly you may have redundancy in case a disk breaks, but that is it. If you accidentally delete something or catch a crypto virus it will destroy all redundant copies of your data across the RAID.
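A tiny Python sketch of why that is, assuming two mirrored drives modeled as dictionaries and a separate backup copied earlier; the file name and the `delete_file` helper are made up for the example.

```python
import copy

# Two mirrored drives, modeled as dictionaries of file name -> contents.
mirror = [{"taxes.pdf": b"2019 return"}, {"taxes.pdf": b"2019 return"}]

# A backup is a separate copy taken at some earlier point in time.
backup = copy.deepcopy(mirror[0])


def delete_file(drives, name):
    """The array faithfully applies the delete to every drive it manages."""
    for drive in drives:
        drive.pop(name, None)


# An accidental delete (or ransomware overwriting the file) hits all copies.
delete_file(mirror, "taxes.pdf")

in_mirror = any("taxes.pdf" in drive for drive in mirror)
in_backup = "taxes.pdf" in backup
print(in_mirror, in_backup)
```

The redundancy protects against a drive dying, not against the array being told to do the wrong thing.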