Data scraping is basically using a piece of software to read the contents of a website and store them in a database. A scraper could load the Reddit front page, then go into each post on the front page, read all the comments, and store the contents of those comments, including the user ID of the person who wrote each one, in a database.
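To make that concrete, here is a minimal sketch in Python using only the standard library. The page markup, the `comment` class, the `data-user` attribute, and the names `CommentParser` and `scrape_to_db` are all made up for illustration; a real site's HTML looks different, and a real scraper would first download the page over the network instead of reading it from a string.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page markup (an assumption for this example only).
SAMPLE_PAGE = """
<div class="comment" data-user="alice">Scraping is just automated reading.</div>
<div class="comment" data-user="bob">Search engines do it all the time.</div>
"""


class CommentParser(HTMLParser):
    """Collects (user, text) pairs from <div class="comment"> elements."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._in_comment = False  # True while inside a comment div
        self._user = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "comment":
            self._in_comment = True
            self._user = attrs.get("data-user")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a comment div.
        if self._in_comment:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._in_comment:
            self.comments.append((self._user, "".join(self._text).strip()))
            self._in_comment = False


def scrape_to_db(html, conn):
    """Parse the comments out of `html` and store them in the database."""
    parser = CommentParser()
    parser.feed(html)
    conn.execute("CREATE TABLE IF NOT EXISTS comments (user TEXT, body TEXT)")
    conn.executemany("INSERT INTO comments VALUES (?, ?)", parser.comments)
    conn.commit()
    return len(parser.comments)


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
count = scrape_to_db(SAMPLE_PAGE, conn)
```

The "read the page" half is the parser and the "store it" half is the two SQL statements. Everything else a real scraper does (following links to each post, handling pagination, and so on) is repetition of those two steps.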
There isn’t anything inherently “bad” about data scraping as a technique. It’s at the root of how pretty much all Internet search engines operate, for example. The concern is the potential privacy implications. For example, someone could analyze writing styles to match pseudonyms on Reddit with real people, deanonymize them, and build up in-depth profiles of those people to sell to scammers, marketers, etc.