By constantly downloading random information from the internet, wouldn’t you be exposing yourself to tons of malicious content? Aren’t there pages that can run malware without you even clicking on anything?
A better example than search engines might be something like the Wayback Machine, a site that actually saves the pages themselves and not just links to them.
ELI5: You can pretty easily tell if something is a book, right? Say you're looking for something to read. Pick something up. Is it a book? No? Toss it. Yes? Read it.
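Search engines do the same with everything they process. Malware generally isn't the webpage itself; it's a separate executable that the page tries to get you to download and run. (Pages can also carry scripts that attack a browser, but a crawler that just reads the page as text never runs them.) So every time the crawler fetches something, it asks “is it a webpage?” No? Toss it. Yes? Process it, find everything it links to, and repeat.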
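If it helps to see that loop written out, here is a rough sketch in Python. It is not how any real search engine is built, just the "is it a webpage? toss it or process it" idea. The `fetch` callable, the `FAKE_WEB` dictionary, and the `example.test` URLs are all made up for illustration, so the example runs without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags. Nothing on the page is executed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl.

    `fetch(url)` is a hypothetical callable returning
    (content_type, body_text), or None if the URL can't be fetched.
    """
    seen = {start_url}
    queue = deque([start_url])
    indexed = []
    while queue and len(indexed) < max_pages:
        url = queue.popleft()
        result = fetch(url)
        if result is None:
            continue
        content_type, body = result
        # "Is it a webpage?" No: toss it. The body is never run or opened.
        if not content_type.startswith("text/html"):
            continue
        # Yes: process it, then find everything it links to.
        indexed.append(url)
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return indexed


# A tiny made-up "internet" so the sketch can run offline.
FAKE_WEB = {
    "http://example.test/": (
        "text/html",
        '<a href="/about">About</a> <a href="/setup.exe">Free download</a>',
    ),
    "http://example.test/about": (
        "text/html; charset=utf-8",
        '<a href="/">Home</a>',
    ),
    "http://example.test/setup.exe": ("application/octet-stream", "MZ..."),
}


def fake_fetch(url):
    return FAKE_WEB.get(url)


pages = crawl("http://example.test/", fake_fetch)
```

With the fake data above, `pages` ends up holding the home page and the about page. The `setup.exe` link gets fetched, fails the "is it a webpage?" check, and is tossed without ever being run.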