Previously, Google indexed the web by downloading a copy of *all the websites* (well, all public web pages) onto its own storage and then building the search index from there. That copy was also made available as “cached pages,” so you could see a page’s content even if the original website was down or overloaded.
(“The search index” is the data structure that lets their servers look up a search term and get a list of web pages where that term can be found. It’s kind of like the index to a book, where you can look up a topic and get the page numbers that refer to that topic.)
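To make the book-index analogy concrete, here is a toy sketch of that kind of lookup structure (an inverted index) in Python. The page URLs and text are made up, and Google’s real index is vastly more sophisticated, but the basic idea of “term → list of pages” is the same.

```python
from collections import defaultdict

# Made-up pages standing in for crawled web content.
pages = {
    "example.com/cats": "cats are great pets",
    "example.com/dogs": "dogs are loyal pets",
}

# Build the inverted index: each word maps to the set of pages containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Looking up a term returns the pages where it appears,
# much like a book index returns page numbers for a topic.
print(index["pets"])  # {'example.com/cats', 'example.com/dogs'}
print(index["cats"])  # {'example.com/cats'}
```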
If the indexing pipeline no longer needs to keep a copy of the whole public web in persistent storage (because, say, it builds the index on the fly), then they no longer need to keep those cached pages around. Also, websites don’t go down as often as they did 20 years ago, so the cached-pages feature isn’t as important to users as it once was.