So I was enjoying some down time the other night, taking a nice warm bath and letting my mind wander, when I suddenly recalled a time when I worked at a research station and some idiot somehow managed to delete over 3000 Excel spreadsheets' worth of recently collected data. I was charged with recovering the data and scanning through everything to make sure it was OK and nothing was missing… must have spent nearly 2 weeks scanning through endless pages… and it only just dawned on me to wonder… exactly… how the hell do data recovery tools collect “lost data”???
I get the general idea that as long as that “save location” isn’t written over with new data, then technically that data is still… there???? I… that’s as much as I understand.
Thanks much appreciated!
And for those wondering, it wasn’t me. It was my first week on the job as the only SRA for that station, and the person charged with training me for the day… I literally watched him highlight all the data, right click, and click delete, and then ask “where’d it all go?!?”
A file is 0s and 1s written on your drive. Your drive can only write over the same part of its storage a limited number of times. So instead of actually overwriting everything deleted with all 0s or all 1s, it just says there is nothing there and that the space is available to be used again.
As long as those same places aren’t used to store another file, the information is still there and can be recovered.
Here’s my ELI5:
Think of your disk space like a neighbourhood. Buildings like houses, apartments, stores, and schools are all different types of files.
To find the building you are looking for, you simply search for the “door number” assigned to that building. That is the address, and files have “addresses” just like buildings.
Now, truly erasing a file is a lot of work, so computers don’t do that. What they do is “de-address/de-register” the file. Let’s say your local McDonald’s goes out of business and the address was 1234 Hexadecimal Street. If you look up what’s at this address, you will be told there’s nothing there, and the property/space is up for sale for the next business to move in.
However, a clever person is able to guess that there used to be a McDonald’s there just by looking at the old signage, building shape, drive-thru area, overabundance of red and yellow paint, etc.
A deleted file is like that out-of-business joint: as long as no one has demolished the building… it’s still there, and clever software recognizes “deleted” files by the patterns they leave at those addresses.
For a file to be completely gone, one needs to demolish the pattern left by the file, which is like demolishing that golden arch food joint and building a veterinary clinic instead: new building, new signage, no traces of the old building, and yet its address will still be 1234 Hexadecimal Street.
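To make the "recognize the old signage" idea a bit more concrete, here is a minimal Python sketch of scanning raw bytes for a known file pattern. It assumes the file is a PNG stored in one contiguous piece; the PNG start signature and end marker are real, but `carve_png` and the fake image body are made up for illustration, and real recovery tools do far more than this.

```python
# Real PNG markers: every PNG starts with this 8-byte signature and
# ends with an IEND chunk followed by its fixed checksum bytes.
PNG_START = b"\x89PNG\r\n\x1a\n"
PNG_END = b"IEND\xaeB`\x82"


def carve_png(raw: bytes) -> list[bytes]:
    """Return every byte run that looks like a complete PNG (hypothetical helper)."""
    found = []
    pos = 0
    while True:
        start = raw.find(PNG_START, pos)
        if start == -1:
            break  # no more "signage" to spot
        end = raw.find(PNG_END, start)
        if end == -1:
            break  # start marker without an end marker: give up
        end += len(PNG_END)
        found.append(raw[start:end])
        pos = end
    return found


# A pretend slice of "empty" disk space with leftover data in the middle.
leftover = PNG_START + b"fake image body" + PNG_END
raw_disk = b"\x00" * 16 + leftover + b"\xff" * 16
carved = carve_png(raw_disk)
print(len(carved), "candidate file(s) found")
```

Nothing in the address list is consulted here. The scan only looks at what is physically sitting in the space, which is why it stops working once the old bytes have been overwritten.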
So say you have a hard drive that can store 10 things in it, for simplicity. The addresses are arranged as shown below. Note that I have combined things quite a bit: each individual file has its own address, so the Windows OS in a real computer would actually take up a few hundred addresses, not just one.
address 1: windows OS
address 2: word document “RoughDraft”
address 3: word document “stuff”
address 4: empty
address 5: league of legends game data
address 6: empty
address 7: pictures
address 8: empty
address 9: zoom
address 10: empty
Now if you read the hard drive it has all that data in that order: something like [addresses][win32][RoughDraft][stuff][garbage][league.dll][garbage][pngdata][garbage][zoom.dll][garbage]
If I then delete League of Legends, it will be removed from the address list, but that doesn’t actually change what’s ON the hard drive, so you get an address list that says this…
address 4: empty
address 5: empty
address 6: empty
but then if you go read the actual data on the hard drive for those addresses, you get this: [garbage][league.dll][garbage] – so you can still see all the files there, because the bits that encode the league of legends data never actually got flipped.
~~~~~~~~
If you actually want to delete something, then you need to flip the bits on the hard drive. The best way to do this is to put new data in the address where the old data used to be, so if you fill up address 5 with copies of the song Despacito, you will no longer be able to see the League of Legends data, because it has been overwritten by Despacito.
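The address list example above can be played out in a few lines of Python. This is only a toy model, assuming one file per slot: `ToyDisk`, its method names, and the file names are made up for illustration, and slots are numbered from 0 here rather than from 1.

```python
class ToyDisk:
    """Hypothetical 10-slot drive: a list of contents plus an address list."""

    def __init__(self, slots: int = 10):
        self.blocks = ["garbage"] * slots  # what is physically on the drive
        self.table = {}                    # address list: file name -> slot

    def write(self, name: str, data: str) -> int:
        used = set(self.table.values())
        # Pick the first slot the address list considers empty.
        slot = next(i for i in range(len(self.blocks)) if i not in used)
        self.blocks[slot] = data           # this is the only place bits change
        self.table[name] = slot
        return slot

    def delete(self, name: str) -> int:
        # Only the address list entry goes away; the contents stay put.
        return self.table.pop(name)


disk = ToyDisk()
disk.write("windows", "win32")
slot = disk.write("league", "league.dll")

disk.delete("league")
after_delete = disk.blocks[slot]           # old data is still readable

disk.write("despacito", "despacito.mp3")   # reuses the freed slot
after_overwrite = disk.blocks[slot]        # old data is now gone

print(after_delete, "->", after_overwrite)
```

The delete step never touches `blocks`, which is the whole trick: recovery is possible right up until some later write lands on the same slot.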
Let’s imagine a city planning department.
They give permission to build on a lot. This is like writing a file.
After someone builds a building on the lot, someone new may ask to tear down the building and build something new, and they’ll grant permission to do that. This is what deleting a file does: it’s permission to use the lot for some new purpose. However, the structure hasn’t actually been torn down, it’s just available to be torn down.
Only when a new file is actually assigned that space is the old building removed and a new one built.
Restoring a file is recreating the planning department paper that says there is a building here and here are the details of how to use it.
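Here is a rough Python sketch of the planning department version, assuming the paperwork is just flagged as inactive rather than thrown away. `Permit`, `delete`, `undelete`, and the file names are all hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Permit:
    """Hypothetical planning department paper for one lot."""
    name: str
    lot: int
    active: bool = True


# What is physically standing on each lot: (who built it, contents).
lots = {7: ("report.xlsx", "rows of research data")}
permit = Permit("report.xlsx", lot=7)


def delete(p: Permit) -> None:
    p.active = False  # permission to rebuild; nothing is torn down


def undelete(p: Permit, lots: dict) -> bool:
    owner, _contents = lots[p.lot]
    if owner == p.name:  # the old building is still standing
        p.active = True
    return p.active


delete(permit)
restored = undelete(permit, lots)        # works: lot 7 was never rebuilt

delete(permit)
lots[7] = ("clinic.docx", "new contents")  # a new building goes up
restored_late = undelete(permit, lots)   # fails: old building is gone

print(restored, restored_late)
```

Restoring is cheap because it only touches the paperwork, but it only makes sense while the lot still holds the original building.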
Your storage medium (drive) has parts called sectors. These act as little boxes. The sectors are a particular size based on how the drive is formatted (think of this like how you write your papers, with font and such). That’s typically going to be between 512 bytes and 4 KB. Bigger drives tend to have larger boxes (sectors).
Files have a particular size. If a file is exactly the same size as a sector, it uses a single sector and nothing more. If it’s smaller than a sector, exactly 1 sector will be used anyway. If it’s bigger, it gets split across as many sectors as it needs. There are ways to cut down on the wasted space, such as compression, which shrinks the data so it needs fewer sectors.
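The sector arithmetic is easy to check with a small Python sketch. It assumes a 4096-byte sector and a file of at least 1 byte; the function names are made up for illustration.

```python
def sectors_needed(file_size: int, sector_size: int = 4096) -> int:
    """Whole sectors a file occupies, assuming file_size is at least 1 byte."""
    # Ceiling division: any leftover bytes still claim a full sector.
    return -(-file_size // sector_size)


def wasted_bytes(file_size: int, sector_size: int = 4096) -> int:
    """Unused space in the last sector (often called slack space)."""
    return sectors_needed(file_size, sector_size) * sector_size - file_size


for size in (100, 4096, 4097):
    print(size, "bytes ->", sectors_needed(size), "sector(s),",
          wasted_bytes(size), "bytes unused")
```

A 100-byte file and a 4096-byte file both take one sector, while one extra byte pushes a file into a second sector that is almost entirely unused.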
Every sector is marked in a list telling the machine where the files are, physically. Every part is listed. When your system wants a file, it consults that list. It goes out and gets all of the parts. On SSDs this is pretty fast, but a hard drive will have to physically move to get them. This is slow. Defragging makes this faster by putting the parts together. They’ll be physically next to each other. SSDs do not need to be defragged.
When you delete a file the parts aren’t erased. That would take FOREVER. Instead, the parts on the file list are marked as “deleted”. The actual data is still there. It’s less accessible, but physically remains. When any new files or other data are written, they can be written in this full, but available, area. That overwrites the data there.
If not all of the data is overwritten, it’s still recoverable. Not all of it, mind, but enough that it might not matter. The file names are often lost, though. Those aren’t stored in the data, but the file list. Since that’s deleted, you’ll need to name the files found.
In the event the drive itself fails, pretty much all of the data is there, you just can’t get to it. It depends on how the drive fails. A crash is usually catastrophic. That’s when the read-head physically touches the disc. It destroys the disc surface, rendering most, if not all, of the data physically gone. It’s pretty much the worst-case scenario.
An SSD can also fail catastrophically in a few ways, but mostly it just can’t access the data. The chips that actually store the data may be saveable. This doesn’t always work – some come encrypted.
Some hard drives lose their controller. This is usually recoverable, but not easily. The controller is the hardware on the drive that tells the system and the drive how to talk. Just swapping this out for a working one can fix it…but not always.
Basically, data isn’t usually gone when you delete it. There are programs that will go through your drive and just write 0’s on the deleted portions. If you do enough passes this will generally render it effectively deleted. Or you can just encrypt your drive and require a password to unlock it. This comes at a performance cost, but if you’re concerned about security, it works really well. It’s nearly impossible to break it. I mean that. It would take about a million years of brute force hacking to get through. Faster computers reduce this time to a few hundred thousand years.
Guessing the password usually takes less time. Merely decades.
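For the zero-writing programs mentioned above, a toy Python version might look like the following. It assumes the drive is just a byte array and that we already know which sectors are in use; `SECTOR`, `wipe_free_sectors`, and the sample contents are hypothetical, and real wiping tools work against the actual drive rather than a copy in memory.

```python
SECTOR = 8  # tiny made-up sector size so the example stays readable


def wipe_free_sectors(disk: bytearray, in_use: set[int]) -> int:
    """Overwrite every sector not listed as in use with zeros; return how many."""
    wiped = 0
    for index in range(len(disk) // SECTOR):
        if index not in in_use:
            # Actually flip the bits, instead of just marking the space free.
            disk[index * SECTOR:(index + 1) * SECTOR] = bytes(SECTOR)
            wiped += 1
    return wiped


# Sector 1 holds leftovers from a "deleted" file; sectors 0 and 2 are live.
disk = bytearray(b"KEEPME!!" + b"secret!!" + b"ALSOKEEP")
wiped = wipe_free_sectors(disk, in_use={0, 2})
print(wiped, "sector(s) wiped:", bytes(disk))
```

After this runs, a scan of the free space finds only zeros, so there is no leftover pattern for recovery software to pick up.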
Generally when you delete a file on a computer, it’s not actually removed right away.
What ‘deleting’ a file does is it tells the computer ‘okay you can write something new over this data.’
Until something actually gets written over that data, it’s still there. It’s just not readily visible anymore.