what big data is and how it is useful



I’m reading a lot on this topic and I cannot understand it


Big data = harvesting all sorts of data on a large scale, such as spending habits, geographical locations, psychology, sleeping habits. Literally anything you can think of. Why is this useful? Well, companies can come along and purchase this collected data, reinterpret it and get useful information out of it. The idea is pretty terrifying, since it’s not just being used to push adverts on us, it’s also being used for more nefarious things.

Big data is basically what it says: loads and loads and loads of data.

In very simple terms, it is the process of collecting, storing, and analysing vast vast amounts of data for a particular purpose. For example, if you had a way of tracking every single fish in the sea, you would be able to learn huge amounts about sea currents, migratory patterns, fishing impact, etc.

It has become a buzzword recently because advances in storage mean we can store far more data than before, advances in computer science mean we can design much better algorithms for analysing data, advances in computer hardware mean we can have much more powerful computers to run those advanced algorithms, and advances in other technology mean it is much easier to collect data.

It has also gained a lot of focus in the public eye, as it is often linked with the collection of people’s personal data by big tech companies and/or governments. For example, Facebook collects huge amounts of data on the behaviours of individuals online, collected from all sorts of sources such as Facebook itself, your browsing habits, your shopping history, etc. It uses this data to learn all sorts about you (and others), and can e.g. sell your profile to advertising companies or others. Facebook and others can also look at the data in bigger chunks, to learn all sorts of insights about how larger populations behave and act.

You will also find it linked up with other buzzwords such as “AI” or “machine learning”. This is because those technologies (or linked technologies that are mis-labelled as them) often underpin the newest and most advanced algorithms used to analyse data.

One of the best examples I saw was presented on a PBS Frontline documentary on artificial intelligence. In China, there is a micro-lending platform (phone app) through which people can request small loans for whatever purpose. The applicant puts in just a few pieces of information and before they even finish, the app has determined whether they should get the loan. Because of the amount of data that it examines (over 1,000 data points), it is more reliable than an old-fashioned loan officer, who might look at 10 data points (characteristics) of the applicant.

By looking at data on a massive scale, you can see more patterns, which helps you predict outcomes better. In the case of the Chinese loan app, there is a strong correlation between the battery percentage remaining on the person’s phone at the time they applied and their likelihood of paying back the loan: the lower the battery, the higher the risk. In other words, if your phone is almost out of battery, you are more likely to default on your loan and likely to be a riskier person in general. Without big data analytics, that tidbit might never have been discovered. So big data can be useful in reducing the number of loans that end in default, which is costly for everyone involved.
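To make the "spotting a pattern" idea a bit more concrete, here is a tiny Python sketch of the battery example. Everything in it is an assumption for illustration: the ten records are made up (they are not from the app or the documentary), the 25% cut-off is arbitrary, and a real system would look at far more records and far more data points than this.

```python
from math import sqrt

# Made-up illustrative records, NOT real data:
# (battery percent when applying, repaid the loan? 1 = yes, 0 = no)
applications = [
    (5, 0), (8, 0), (12, 1), (15, 0), (22, 0),
    (35, 1), (48, 1), (60, 0), (75, 1), (90, 1),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def default_rate(records):
    """Share of records where the loan was not repaid."""
    return sum(1 for _, repaid in records if repaid == 0) / len(records)


battery = [b for b, _ in applications]
repaid = [r for _, r in applications]

# A positive value means: more battery tends to go with repaying.
r = pearson(battery, repaid)

# Arbitrary 25% cut-off, chosen only for this illustration.
low_battery = [rec for rec in applications if rec[0] < 25]
high_battery = [rec for rec in applications if rec[0] >= 25]

print(f"correlation between battery level and repayment: {r:.2f}")
print(f"default rate, low battery:  {default_rate(low_battery):.0%}")
print(f"default rate, high battery: {default_rate(high_battery):.0%}")
```

With only ten rows you could spot this by eye. The point of big data is doing the same kind of comparison across millions of applications and a thousand or more columns at once, where nobody could spot it by eye. Also keep in mind that a correlation like this only says the two things tend to go together, not that a low battery causes anyone to default.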