[ELI5] What is data scraping and why is it bad?

Anonymous 0 Comments

Data scraping is basically using a piece of software to read the contents of a website and store them in a database. A scraper could load the Reddit front page, then go into each post on the front page, read all the comments, and store the contents of those comments, including the user ID of the person who wrote each one, in a database.
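To make that concrete, here is a rough sketch of the "read the page, store it in a database" step, using only Python's standard library. The HTML snippet, the `comment` class, the `data-user` attribute, and the table layout are all made up for illustration; a real site's markup will look different, and a real scraper would download the page first instead of using a hard-coded string.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded page.
SAMPLE_PAGE = """
<div class="comment" data-user="user_a">First comment text.</div>
<div class="comment" data-user="user_b">Second comment text.</div>
"""


class CommentParser(HTMLParser):
    """Collects (user_id, body) pairs from <div class="comment"> elements."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._user = None    # user ID of the comment currently being read
        self._chunks = []    # text pieces of the comment currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "comment":
            self._user = attrs.get("data-user")
            self._chunks = []

    def handle_data(self, data):
        # Only keep text that appears inside a comment element.
        if self._user is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._user is not None:
            self.comments.append((self._user, "".join(self._chunks).strip()))
            self._user = None


def scrape_to_db(html, conn):
    """Parse the page and store every comment; returns how many were stored."""
    parser = CommentParser()
    parser.feed(html)
    parser.close()
    conn.execute("CREATE TABLE IF NOT EXISTS comments (user_id TEXT, body TEXT)")
    conn.executemany("INSERT INTO comments VALUES (?, ?)", parser.comments)
    conn.commit()
    return len(parser.comments)


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
count = scrape_to_db(SAMPLE_PAGE, conn)
rows = conn.execute("SELECT user_id, body FROM comments ORDER BY rowid").fetchall()
```

Run that over thousands of pages on a schedule and you end up with a searchable record of who wrote what, which is all scraping really is.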

There isn’t anything inherently “bad” about data scraping as a technique. It’s at the root of how pretty much all Internet search engines operate, for example. The concern is the potential privacy implications. For example, someone could analyze writing styles to match pseudonyms on Reddit with real people, deanonymize them, and build in-depth profiles of those people to sell to scammers, marketers, etc.
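As a toy illustration of the writing-style idea, the sketch below counts a handful of marker words in each text and compares the counts with cosine similarity. The word list, the names, and the sample sentences are invented for this example, and real stylometric analysis relies on far more text and far richer features, so treat this only as a picture of the general idea.

```python
import math
from collections import Counter

# Small hand-picked word list (an assumption made for this sketch).
MARKER_WORDS = ("the", "and", "but", "honestly", "really")


def style_vector(text):
    """Count how often each marker word appears in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in MARKER_WORDS]


def cosine(a, b):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical writing samples from accounts whose owners are known.
known = {
    "alice": "honestly the food was good but honestly the service was slow",
    "bob": "the food was really good and the service was really slow",
}
# Hypothetical post from a pseudonymous account.
pseudonymous_post = "honestly the movie was long but honestly the ending was great"

post_vector = style_vector(pseudonymous_post)
scores = {name: cosine(style_vector(text), post_vector) for name, text in known.items()}
best_match = max(scores, key=scores.get)
```

Here the pseudonymous post leans on the same filler words as one of the known samples, so that sample gets the higher score. With enough scraped comments per account, the same kind of comparison becomes much more reliable, which is where the privacy worry comes from.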
