Big Data. How often do you see articles talking about Big Data? What is so special about it, what can you do with it, and why even care about Big Data? What's the big deal?
Well, it is a big deal. The amount of data generated each year by individuals and corporations continues to grow. According to the International Data Corporation, the digital universe has grown from a ridiculously large 130 exabytes in 2005 to an insanely large 1,227 exabytes in 2010, and will grow to an astonishingly large 7,910 exabytes by 2015. For any of you who don't know the units of digital storage, 1 exabyte is equal to 1 billion gigabytes. So that's a lot of data...and that's roughly 60-fold growth in just 10 years!
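If you want to check that arithmetic yourself, here is a minimal Python sketch. It uses only the three figures quoted above; the names `digital_universe_eb`, `growth_factor` and `annual_growth_rate` are made up for this illustration, and the yearly rate assumes smooth compound growth, which is a simplification rather than anything the source figures claim.

```python
# Digital universe size in exabytes, as quoted above (2015 is a forecast).
digital_universe_eb = {2005: 130, 2010: 1227, 2015: 7910}

# 1 exabyte = 1 billion gigabytes (decimal units).
GB_PER_EB = 10**9


def growth_factor(start_year, end_year, data=digital_universe_eb):
    """How many times larger the end-year figure is than the start-year figure."""
    return data[end_year] / data[start_year]


def annual_growth_rate(start_year, end_year, data=digital_universe_eb):
    """Average yearly growth, assuming the same compound rate every year."""
    years = end_year - start_year
    return growth_factor(start_year, end_year, data) ** (1 / years) - 1


overall = growth_factor(2005, 2015)      # a little under 61x
yearly = annual_growth_rate(2005, 2015)  # roughly 50% per year
gigabytes_2015 = digital_universe_eb[2015] * GB_PER_EB

print(f"2005 to 2015: {overall:.1f}x overall, about {yearly:.0%} per year")
print(f"2015 forecast: {gigabytes_2015:,} gigabytes")
```

Seen this way, the headline number means the digital universe grows by about half again every single year.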
So what are we supposed to do with all this information? Is it even possible to make use of it all?
Enter Data Science: a combination of traditional fields of study like math, statistics, visualization, computer science and pattern recognition. Data scientists interpret subsets of the growing data in the digital universe and develop tools that help the rest of us make sense of overwhelming amounts of data.
So if we have all these exabytes of data and we have data scientists, will the scientists be able to make sense of Big Data? Unfortunately the answer is "it depends." The key to data of any size is figuring out what it means and what you might want to do with it. There is no one right way to look at data and there is no one set of conclusions that can be drawn from it. In order to use the data available, we need to step out of the Big Data discussion and figure out what we are trying to solve.
For example, there is an intersection near my house with gas stations on three corners. When I need to get gas, I have to decide which gas station to use. Do I use the one with the best price, the one that is closest to me, the one that requires the fewest left turns or the one that has the best frozen drinks? The amount of information available is small, but if I don't take the time to figure out what I am solving for, I won't be able to use the data available to make my decision.
Big Data in the Casino Industry
By contrast, a casino operator likely has a significant amount of data representing every game played on the casino floor. As the task of making sense of this Big Data becomes overwhelming, the operator has to ask some tough questions:
- Are there enough popular games on the floor?
- Are there games that are not profitable?
- What are the demographics of the most profitable customers?
Once the strategic goals are figured out, the overall data set can be cut down to just the information that will help answer these questions. We have taken Big Data and made it much smaller. As we saw with the gas station example, smaller data won't solve the overall problem by itself, but it does make the challenge less daunting.
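To make "cutting the data down" a little more concrete, here is a rough Python sketch of answering the first two questions above. Everything in it is hypothetical: the `plays` records, the field names and the numbers are invented for illustration and do not come from any real casino system.

```python
from collections import Counter, defaultdict

# Hypothetical play records. Field names and values are invented for this sketch.
plays = [
    {"game": "slots_a", "wagered": 100.0, "paid_out": 90.0},
    {"game": "slots_a", "wagered": 50.0, "paid_out": 40.0},
    {"game": "blackjack", "wagered": 200.0, "paid_out": 230.0},
    {"game": "roulette", "wagered": 80.0, "paid_out": 60.0},
    {"game": "blackjack", "wagered": 100.0, "paid_out": 90.0},
]


def plays_per_game(records):
    """Question 1: how often is each game played?"""
    return Counter(r["game"] for r in records)


def profit_per_game(records):
    """House profit per game: amount wagered minus amount paid out."""
    totals = defaultdict(float)
    for r in records:
        totals[r["game"]] += r["wagered"] - r["paid_out"]
    return dict(totals)


def unprofitable_games(records):
    """Question 2: which games lose money for the house?"""
    return sorted(g for g, p in profit_per_game(records).items() if p < 0)


popularity = plays_per_game(plays)
losers = unprofitable_games(plays)
print(popularity.most_common())
print(losers)
```

Notice that each question only needs a couple of fields from every record. That is the point: once you know the question, most of the data can be set aside.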
Oh, and remember the data scientists? Their job is to look for patterns in the data and design mathematical models that will help answer your questions. Some of this work can be done with tools like Excel, but the really complex modeling requires math that most people have either forgotten or never learned. (Check back for more posts that will dive into the far-out math and science behind data.)
So back to my original question: What's the big deal about data and what can you do with it? In my opinion, data is a big deal if you know what you want from it and have the right resources to help you get it. Data should not be the primary focus. Instead, focus on what you are trying to solve and let the data and the data scientists guide you from there.