What is Big Data and Data Analytics?



Explain to me like I’m 5


Big data analytics examines large amounts of data to uncover hidden patterns, correlations and other insights. With today’s technology, it’s possible to analyze your data and get answers from it almost immediately.

Big Data means having a large volume of data (say, the data of all Facebook users) that covers multiple aspects or categories (such as browsing habits, likes/dislikes, ads watched, ads skipped, location data, work history, time the app/site is used, post history, etc.). A good video about it.
Data analytics is looking for patterns in this data and then doing something with those patterns. For example, if I notice that the majority of dog owners ignore Purina ads but watch and love Chewy ads, I might ask Chewy to pay a higher fee to advertise on my site. I might also notice that the majority of people who have cat pictures on their social media also listen to Mr. Brightside a lot on Spotify, so I buy ad space on Spotify’s Mr. Brightside playlists for my cat toy website. A set of videos about data analysis.
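To make the dog owner example a bit more concrete, here is a minimal sketch of "looking for a pattern": counting how often each group actually watches each advertiser's ads. The records, the field layout, and the `watch_rates` helper are all made up for illustration; real ad data would be far larger and messier.

```python
from collections import defaultdict

# Made-up ad impressions: (owner_type, advertiser, watched_the_ad)
impressions = [
    ("dog", "Purina", False),
    ("dog", "Purina", False),
    ("dog", "Purina", True),
    ("dog", "Chewy", True),
    ("dog", "Chewy", True),
    ("dog", "Chewy", False),
    ("cat", "Purina", True),
    ("cat", "Chewy", False),
]

def watch_rates(rows):
    """Return {(owner_type, advertiser): fraction of shown ads that were watched}."""
    shown = defaultdict(int)
    watched = defaultdict(int)
    for owner, advertiser, was_watched in rows:
        shown[(owner, advertiser)] += 1
        if was_watched:
            watched[(owner, advertiser)] += 1
    return {key: watched[key] / shown[key] for key in shown}

rates = watch_rates(impressions)
print(rates[("dog", "Purina")], rates[("dog", "Chewy")])
```

In this toy data, dog owners watch Chewy ads more often than Purina ads. With only a handful of rows that means nothing, which is exactly why the "big" part matters.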

So Data is simply knowing quantifiable stuff about an observed event.

Simple data might be “how tall something is” or “at what time did someone go to sleep”.

Big Data is when we start looking at how all the little pieces of data interact with each other. Let’s think specifically in terms of internet usage, for example.

So we start not only collecting information about what websites a user is visiting, but more detailed scenarios like “70% of the time after visiting amazon.com users between the ages of 18 and 34 are performing google searches for the same thing they just shopped for, and clicking on the first 3 search results”.

That’s not only capturing the small data of “user went to amazon” “user is 27 yrs old” “user searched amazon for X” “user visited Google” “user searched google for X”… etc., but also capturing the whole story that those individual data points tell us, and analyzing it over a large sample set and time span.
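As a rough sketch of how the "whole story" gets pulled out of the individual data points, the snippet below walks a tiny, invented browsing log and measures how often an Amazon search by an 18 to 34 year old is immediately followed by a Google search for the same thing. The log format and the `follow_up_rate` function are assumptions for illustration only, not how any particular company does it.

```python
# Made-up browsing log, in time order: (user_id, age, site, search_term)
events = [
    (1, 27, "amazon", "tent"),
    (1, 27, "google", "tent"),
    (2, 31, "amazon", "kettle"),
    (2, 31, "google", "weather"),
    (3, 22, "amazon", "boots"),
    (3, 22, "google", "boots"),
    (4, 52, "amazon", "lamp"),
    (4, 52, "google", "lamp"),
]

def follow_up_rate(log, min_age=18, max_age=34):
    """Share of Amazon searches by users in the age range whose next
    event is a Google search for the same term."""
    pending = {}  # user_id -> Amazon search term still waiting for a follow-up
    total = 0
    followed = 0
    for user, age, site, term in log:
        if not (min_age <= age <= max_age):
            continue  # outside the age group we are studying
        if user in pending:
            previous_term = pending.pop(user)
            if site == "google" and term == previous_term:
                followed += 1
        if site == "amazon":
            pending[user] = term
            total += 1
    return followed / total if total else 0.0

print(follow_up_rate(events))
```

Each row on its own is "small data"; the rate only appears once the rows are linked per user and counted across everyone.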

That sort of Big Data, and the subsequent analysis of said data (“Data Analytics”), allows marketing companies to determine how best to target advertisements to get the biggest result.

If I know the scenario above, I might want to embed my ads for products on amazon within sites with high Search Engine Optimization, knowing that they’re going to appear in those top 3 results, and therefore get more coverage for the same money spent on the ads. Subsequently, businesses will want their own sites to appear within those top 3 results, and work to improve their own SEO.

This is just one example of how Big Data works (it’s not just confined to internet usage; pharmaceutical data, medical data, sociological impact data, these are all fields that can benefit from this sort of large scale correlation analysis), but I think it helps illustrate the difference between simply gathering specific data, and the analysis of what’s been coined as “Big Data”.

Big Data is a buzzword for massive repositories of information that need to be searched through quickly, particularly because new data is being added quickly: things like stock trades, Amazon purchases, and YouTube views. Data Analytics really just refers to analyzing data, but when it’s combined with Big Data, you can start to see some patterns emerge, since even one-in-a-million events happen a lot at ten million transactions per second.
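The "one in a million" point is just arithmetic, and it can be worked through in a few lines. The variable names here are mine; the two numbers come from the paragraph above.

```python
transactions_per_second = 10_000_000  # ten million transactions per second
one_in = 1_000_000                    # the event happens once per million transactions

# Expected number of "rare" events each second and each day
expected_per_second = transactions_per_second / one_in
expected_per_day = expected_per_second * 60 * 60 * 24

print(expected_per_second)  # 10.0
print(expected_per_day)     # 864000.0
```

So a "rare" event shows up about ten times every second, which is plenty to start spotting patterns in it.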

The combination allows companies to look at pools of data in very broad ways and to recognize correlations that a human might not even think to look for. Say, for instance, that people who buy baby formula are also more likely to buy camping gear 2 years later. A retailer like REI might then find it useful to get the marketing list for new births and start sending catalogs/coupons about 21 months after birth.

Netflix has used it to guide decisions on what shows or movies to create, looking at combinations of actors, directors and subject material.

Big data is like regular data, just a lot more of it.

So much more that you can’t store it all on one computer. Or you could, but it would need to be huge, and they don’t really make computers that big. But if you instead take 50 or 100 computers and network them together, you can get the same amount of storage as you would in the super big computer.
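One common way to spread data over many computers is to hash each record's key and use the result to pick a machine. The sketch below fakes the machines as plain dictionaries in one process; `NUM_MACHINES`, `machine_for`, `store`, and `lookup` are names invented for this example, and real systems add replication, rebalancing, and a lot more.

```python
import zlib

NUM_MACHINES = 4  # stand-in for the 50 or 100 networked computers

def machine_for(key, num_machines=NUM_MACHINES):
    """Pick a machine by hashing the key; the same key always maps to the same machine."""
    return zlib.crc32(key.encode("utf-8")) % num_machines

def store(records, num_machines=NUM_MACHINES):
    """Distribute (key, value) pairs across the pretend machines."""
    machines = [dict() for _ in range(num_machines)]
    for key, value in records:
        machines[machine_for(key, num_machines)][key] = value
    return machines

def lookup(machines, key):
    """Go straight to the one machine that can hold this key."""
    return machines[machine_for(key, len(machines))].get(key)

machines = store((f"user{i}", i) for i in range(1000))
print([len(m) for m in machines])  # how many records each machine ended up holding
print(lookup(machines, "user42"))
```

No single machine holds everything, but any record can still be found by asking only the machine its key hashes to.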

So now you have access to new data sets that didn’t exist before because there simply wasn’t the space to store all that data in one place. Or they did exist, but it took a lot of work to actually trawl through them because they were so scattered.

So that’s where analytics comes in. Analytics applies to data big and small, but a bigger sample size usually means a more accurate analysis or prediction.
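One standard way to put a number on "bigger sample, more accurate" is the standard error of an estimated proportion, sqrt(p * (1 - p) / n). The sketch below assumes a random, independent sample, which is a big assumption: a huge but biased data set does not get more accurate just by being huge. The function name and the 70% figure are only for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1 - p) / n)

# Suppose the true rate is 70%. How uncertain is the estimate at each sample size?
for n in (100, 10_000, 1_000_000):
    print(n, round(standard_error(0.7, n), 5))
```

Going from 100 samples to 10,000 cuts the uncertainty by a factor of 10, not 100, so accuracy improves with size but with diminishing returns.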

Analytics is just the processes and tools used to answer (or start to ask) a question based on what the data actually says rather than what you think it might say.