Everything that’s ever been on the internet is still there, just hiding.
Thus spoke a wiser person than me. Well, as a non-CIA spy with god-level access to ALL of the internet, I want to find really old images and mp3s posted in the late 90s/early 2000s to sites that no longer exist. The Wayback Machine https://web.archive.org has allowed me to see tantalizing snapshots of old web pages, but where is all that old content now? Is there a better way to search for old images and audio? obv.
The internet is a collection of computers with data stored on hard drives somewhere. A given computer may be mounted in a rack next to others in a data center, but it functions much like a desktop machine with an attached enclosure of disk drives.
Not all old files still exist. The user accounts holding them might have been deleted, the owners may have moved on to other projects or the server might have suffered catastrophic data loss.
Data has to be intentionally backed up by sites like the Internet Archive. Software distributions may get mirrored by download sites. You might find more content on Archive.org if you select a different snapshot date. A major hub of personal websites called Yahoo Geocities got mirrored over at Oocities.org.
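If you want to try several dates without clicking through the calendar, here is a minimal Python sketch. It assumes the Wayback Machine's public availability endpoint at https://archive.org/wayback/available, which takes `url` and `timestamp` parameters and returns JSON with an `archived_snapshots.closest` entry. The function names are made up for illustration, so check the endpoint's documentation before relying on any of this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp):
    """Build a query for the snapshot closest to `timestamp` (e.g. "19990101")."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return AVAILABILITY_ENDPOINT + "?" + query


def closest_snapshot(payload):
    """Return (timestamp, snapshot_url) from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page_url, timestamp):
    """Ask the archive which snapshot is closest to the requested date."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("geocities.com", "19990101")` and then again with `"20020101"` would show whether different capture dates point to different snapshots. A result of `None` means the archive reported no capture for that address.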
It’s not correct that everything that’s ever been on the internet is still there. It is generally true that most things of note do get archived one way or another, either intentionally by services like the Wayback Machine or unintentionally by people downloading the content and re-uploading it somewhere else. But at the end of the day, everything on the internet has to exist as a file on a computer somewhere, and if those files are deleted, the content is gone. What you’re seeing on the Wayback Machine is hosted by the Wayback Machine, and that is probably the best place to find that content. There’s also probably a lot of internet content saved on computers that were once connected to the internet but no longer are. Those files are effectively “gone” as far as an internet user is concerned: they still exist somewhere, but they’re “hiding” in the sense that you couldn’t access them unless you found the physical computer they’re saved on.
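For the specific goal of finding old images and audio, one option is to ask the Wayback Machine's index which captured files under a site have an image or audio MIME type. The sketch below assumes the CDX server endpoint at https://web.archive.org/cdx/search/cdx and its `url`, `matchType`, `filter`, `from`, `to`, `output`, and `limit` parameters. The function names and the example site are hypothetical, and the parameters should be checked against the CDX server documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX index.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site_prefix, mime_regex, start_year, end_year, limit=50):
    """Build a query for captures under `site_prefix` whose MIME type matches."""
    params = [
        ("url", site_prefix),
        ("matchType", "prefix"),              # everything below this path
        ("filter", f"mimetype:{mime_regex}"),  # e.g. "audio/.*" or "image/.*"
        ("filter", "statuscode:200"),          # skip redirects and errors
        ("from", str(start_year)),
        ("to", str(end_year)),
        ("output", "json"),
        ("limit", str(limit)),
    ]
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(rows):
    """Turn JSON rows (first row is the header) into (mimetype, replay_url) pairs."""
    if not rows:
        return []
    header, *records = rows
    results = []
    for record in records:
        fields = dict(zip(header, record))
        replay = f"https://web.archive.org/web/{fields['timestamp']}/{fields['original']}"
        results.append((fields["mimetype"], replay))
    return results


def search(site_prefix, mime_regex, start_year, end_year):
    """Run the query and return replay links for matching captures."""
    url = cdx_query_url(site_prefix, mime_regex, start_year, end_year)
    with urlopen(url, timeout=60) as resp:
        body = resp.read().decode("utf-8")
    return snapshot_urls(json.loads(body) if body.strip() else [])
```

Something like `search("example.com/music/", "audio/.*", 1998, 2003)` would list archived audio files captured under that path in those years, if any were ever crawled. It only finds what the archive actually saved, so an empty result does not mean the files never existed.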
Your problem is thinking of “the internet” as one homogenous thing, rather than what it really is: a decentralised collection of independently-operated computers and networks that happen to communicate with each other using established standards and protocols.
Any individual piece of data that is or was available on the internet is ultimately physically located on some storage media owned by some person or group. That may be the person who created it, it may be someone else who explicitly downloaded it, it may be some group who incidentally has a copy of it like Google or the Internet Archive.
But there’s no guarantee that anyone still has a copy (files get deleted, computers die), and if they do, that it’s still available online (people die, computers get disconnected, companies go bankrupt), accessible by everyone (and not locked behind someone’s private Dropbox, or on a site not indexed by search engines, aka the deep web) and that it can be easily found (obviously there is no search engine to find content not indexed by any search engine).
When people talk about anything you upload to the internet being there “forever”, what they mean is that once it’s there you can no longer control who has access to it or who can make copies of it. It doesn’t mean that literally every piece of content that was ever uploaded is guaranteed to stay there forever. If that’s your goal, you need to take active measures to ensure it.