So I was enjoying some downtime the other night, taking a nice warm bath and letting my mind wander, when I suddenly recalled a time when I worked at a research station and some idiot somehow managed to delete over 3000 Excel spreadsheets’ worth of recently collected data. I was charged with recovering the data and scanning through everything to make sure it was OK and nothing was deleted…must have spent nearly 2 weeks scanning through endless pages…and it only just dawned on me to wonder…exactly…how the hell do data recovery tools collect “lost data”???
I get the general idea that as long as that “save location” isn’t written over with new data, then technically that data is still…there???? That’s as much as I understand.
Thanks, much appreciated!
And for those wondering, it wasn’t me. It was my first week on the job as the only SRA for that station, and it was the person charged with training me for the day…I literally watched him highlight all the data, right-click, and click delete, and then ask “where’d it all go?!?”
Imagine that a computer’s hard drive (HDD/SSD) is a warehouse. To find things in the warehouse, you have an inventory list of numbers that tell you where to look for an item (for example, Aisle 5, Shelf 12, Bay 6).
When you delete a file, it is like removing the record from your warehouse’s inventory list. This indicates that the space in your warehouse can be used for other inventory, even though you haven’t removed the old “junk” inventory yet. The old junk inventory remains until the space is needed (overwriting in a computer) or it is removed as part of a clean-up operation (aka, emptying your recycle bin).
Data recovery tools are like sending someone to take inventory at your warehouse: they can find and re-record the items (files) that were deleted but haven’t actually been removed yet.
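If it helps to see the warehouse idea as code, here is a minimal sketch. Everything in it (the `ToyDisk` class, its method names, the file names) is made up for illustration, and real file systems are far more involved. It only shows that deleting drops the inventory record, that a recovery pass walks the shelves, and that reusing a shelf is what finally destroys the old contents.

```python
class ToyDisk:
    """A toy 'warehouse': numbered shelves plus an inventory list (hypothetical model)."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks   # shelves; None = never used
        self.table = {}                     # inventory list: file name -> shelf number

    def write(self, name, data):
        used = set(self.table.values())
        # Use the first shelf that the inventory list does not point at.
        for i in range(len(self.blocks)):
            if i not in used:
                self.blocks[i] = data       # overwrites any leftover "junk"
                self.table[name] = i
                return i
        raise OSError("disk full")

    def delete(self, name):
        # Only the inventory record goes away; the shelf itself is untouched.
        del self.table[name]

    def recover(self):
        # Walk every shelf and report contents the inventory no longer lists.
        used = set(self.table.values())
        return [(i, d) for i, d in enumerate(self.blocks)
                if d is not None and i not in used]


disk = ToyDisk(4)
disk.write("a.xlsx", "station data A")
disk.write("b.xlsx", "station data B")
disk.delete("a.xlsx")
print(disk.recover())               # the deleted file's contents are still on shelf 0
disk.write("c.xlsx", "new data")    # reuses shelf 0 and overwrites the old contents
print(disk.recover())               # now there is nothing left to recover
```

This is also why the usual advice after an accidental delete is to stop writing to the drive: every new write is a chance that a "free" shelf gets reused.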
Most answers seem to forget that even after a file is written over, it might still be recoverable (on an HDD at least). I’ll try to ELI5 this one:
Your HDD is like a stack of sheets of paper, and you write on them using pencils. Since you are writing and reading all the time, in the interest of time, when something has to be deleted, you first just move that sheet from the stack to a “to be recycled” stack.
When you run out of paper (or earlier, if you fancy it) you may get a sheet back from the recycle stack. Before you write, you use your eraser.
Now, at first you can probably still read (some of) the previous text on the sheet. But after recycling it a few times, it will be unreadable.
Some tools (called shredders) will write and erase over the sheet multiple times until the text you had there becomes unreadable.
Oh, and formatting usually doesn’t destroy data either.
(To add to this metaphor a bit: pages are numbered, and you try to write sequentially, but then you delete something from the middle of the stack. When you need more space, you reuse that sheet, but one sheet isn’t enough for what you are writing, so you make a note at the bottom, “next page: 10”. When reading, you then move back and forth because the text is *fragmented* across many pages. Defragging is about reorganizing the sheets so that you don’t need to jump around so much.)
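The “next page: 10” notes can be sketched in a few lines. This is only a toy under big assumptions: a single file, pages stored in a dict as `(text, next_page)` pairs, and two made-up helpers, `read_file` and `defrag`. It just counts how often the reader has to flip to a non-adjacent sheet before and after tidying up.

```python
def read_file(pages, start):
    """Follow the 'next page' notes from the first page to the end of the file."""
    text, jumps, page = [], 0, start
    while page is not None:
        content, nxt = pages[page]
        text.append(content)
        if nxt is not None and nxt != page + 1:
            jumps += 1                      # had to flip to a non-adjacent sheet
        page = nxt
    return "".join(text), jumps


def defrag(pages, start):
    """Copy the file onto consecutive pages, starting at page 0."""
    chunks, page = [], start
    while page is not None:
        content, page = pages[page]
        chunks.append(content)
    new_pages = {}
    for i, content in enumerate(chunks):
        nxt = i + 1 if i + 1 < len(chunks) else None
        new_pages[i] = (content, nxt)
    return new_pages, 0


# One file scattered over pages 2 -> 10 -> 5 -> 6.
pages = {2: ("Hel", 10), 10: ("lo ", 5), 5: ("wor", 6), 6: ("ld", None)}
print(read_file(pages, 2))          # ('Hello world', 2)

tidy, first = defrag(pages, 2)
print(read_file(tidy, first))       # ('Hello world', 0)
```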
A hard drive is like a paper note. You won’t cut a hole in it to erase content, because then you would also need to tape new paper back in.
So physically, the paper will always exist.
As for its contents, it is up to you to organize and interpret them as you want.
You may see emptiness everywhere, but to a computer that emptiness is still something (TL;DR: it sees the space characters between the words on the full page rather than nothing). You don’t [physically] create or delete data on a hard drive. If you prefer, think of it as electrical switches: they may not be wired up, but they will always exist.
You will be reading that paper aloud to someone else, so only you need to be able to read it, not the other guy.
Let’s create our own file system to manage the data on that paper note: a valid (existing) block of data on your page always starts with a capital letter and ends with a dot. If you see a dot followed by a non-capital letter, then it is free space until the next capital letter. So, basically how they teach you to write a sentence.
So you start writing sentences on your page: 5, then 10 sentences. Then you figure out you don’t need the 2nd sentence anymore. You could erase that sentence and move everything after it up to close the gap so it looks neat.
That will take some time. And on a computer, how the data is read is hidden from the user. So why bother looking nice when you just want to be fast so the user likes you?
You remember our rules from above? They were designed to be fast.
Instead, you will just lowercase the first capital letter of the sentence you want to “erase”.
So later on, when you read that paper, you will read the first sentence. After the dot you will start reading aloud only at the next capital letter. The next capital letter is on the 3rd sentence.
So for the user, the 2nd sentence doesn’t exist. Yet it is still on your page.
If you need to write something, you will find free space by looking for those “invalid” blocks of data, those invalid sentences.
If you want to recover a sentence, you recreate how our storage system works yourself, but swap the rule for “which sentences to read” with the rule for “which sentences to ignore”.
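Here is the sentence-based “file system” from above as a rough sketch. The function names (`blocks`, `read`, `delete`, `recover`) are invented for this example, and it assumes no dots or stray capitals inside a sentence. Deleting only lowercases one letter, and recovering is the reading rule flipped around.

```python
import re

# A block is a run of text ending in a dot. It is "valid" only if it
# starts with a capital letter.
BLOCK = re.compile(r"[^.]+\.")

def blocks(page):
    """Split the page into every block, valid or not."""
    return [b.strip() for b in BLOCK.findall(page)]

def read(page):
    """What the user sees: only blocks that start with a capital letter."""
    return [b for b in blocks(page) if b[0].isupper()]

def delete(page, n):
    """'Erase' the n-th valid sentence (0-based) by lowercasing its first letter."""
    seen, out = -1, []
    for b in blocks(page):
        if b[0].isupper():
            seen += 1
            if seen == n:
                b = b[0].lower() + b[1:]
        out.append(b)
    return " ".join(out)

def recover(page):
    """Swap the rule: list the sentences that read() ignores, capital restored."""
    return [b[0].upper() + b[1:] for b in blocks(page) if b[0].islower()]


page = "Cats sleep. Dogs bark. Birds sing."
page = delete(page, 1)
print(page)             # Cats sleep. dogs bark. Birds sing.
print(read(page))       # ['Cats sleep.', 'Birds sing.']
print(recover(page))    # ['Dogs bark.']
```

Note that `delete` touches a single character no matter how long the sentence is, which is the whole point of the rule being "made to be fast".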
Consider an address book that you have for all your friends. It has the address details of all your friends, meaning you know how to find them.
Now imagine you tear out a page and throw it in the trash, or maybe you lose your whole address book. That doesn’t necessarily mean your friends all cease to exist. They are still there, you just don’t know how to find them. This is how an operating system deletes files: by tearing their address information out of the address book. In this case your friends are files on the disk.
Now imagine you really need to talk to a friend called John, but you accidentally tore out and trashed John’s info from your address book a while back. You can still somehow contact John by contacting mutual friends or, worst case, by looking through each house in the city where John lives. This is how data recovery tools find deleted files: by clever mathematics or, worst case, by brute-forcing through all the data on the disk.
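The “look through each house” approach can be sketched too. One common form of it is usually called file carving: scan the raw bytes for a known start marker and end marker. The sketch below uses the PNG file signature and the PNG end chunk as the markers. The `carve_pngs` helper is made up, and it assumes each file sits in one unbroken piece, which real tools cannot rely on.

```python
PNG_START = b"\x89PNG\r\n\x1a\n"       # PNG file signature
PNG_END = b"IEND\xaeB`\x82"            # IEND chunk type followed by its CRC

def carve_pngs(raw):
    """Scan a raw byte dump and return every start..end span that looks like a PNG."""
    found, pos = [], 0
    while True:
        start = raw.find(PNG_START, pos)
        if start == -1:
            break                       # no more signatures in the dump
        end = raw.find(PNG_END, start)
        if end == -1:
            break                       # a start without an end: give up on it
        end += len(PNG_END)
        found.append(raw[start:end])
        pos = end                       # keep scanning after this file
    return found


# A fake "disk image": junk bytes with two PNG-looking files buried in it.
fake_png = PNG_START + b"fake-chunks" + PNG_END
dump = b"\x00" * 16 + fake_png + b"junk" + fake_png + b"\x00" * 8
print(len(carve_pngs(dump)))            # 2
```

No address book is consulted at any point here. The scan only knows what a PNG looks like from the outside, which is why it still works when the file table entry is long gone.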
ELI5 answer.
Computers are like libraries. Files are like books. A library has a table that lists all the books it holds and where to find each one. If we remove an entry from that table, no one can find that book unless they go through and check every single book in the library and say “hey, this book is here but it’s not in the table”.
The act of going through every single book in the library is basically what recovery software does.