I often get asked what industry trends excite me the most. Without doubt, it has to be the surge in amazing technologies that democratize solutions to complex data problems. Many of these are open source, or available on cloud platforms at the click of a button. This was unthinkable as recently as five years ago. I think it’s fantastic that all industries can now profit from the game-changing work done by heavyweights like Google, Facebook and Amazon.
So where’s the problem?
It’s in the paradox of choice.
Common wisdom dictates that consumers always want more – more smartphone models, more styles of shoes, more book titles. But this is not always the case. Having many complex choices induces analysis paralysis – especially when you’re outside your realm of expertise and uncertain about what criteria to consider. It’s much more tempting to take the path of least resistance – that is, to keep band-aiding the legacy solution. Uncertainty makes you kick the can down the road.
Industry trends indicate that things will only get more complex. For any data science use case, there has been an order-of-magnitude increase in potential solutions to consider. Which means you have this nagging anxiety that you’re not doing all you can to stay ahead of the game. Do you need to be looking at TensorFlow right now? How about Apache Beam? Druid or Redshift? The choices are mind-boggling.
What do you do? You need ways of filtering out what’s not relevant. Let’s look at a mental framework I’ve found useful for doing just that.
Gartner Analytics Maturity Model
Gartner postulated that any data-forward organization goes through four stages of analytical maturity – descriptive, diagnostic, predictive and prescriptive. These stages remind me of Maslow’s hierarchy of needs. Just as Maslow maps out how individuals can attain their highest purpose in life, the Gartner model shows us a path by which data finds its highest value.
So, what should you be expecting at each stage?
Descriptive
You’re able to generate basic reports and dashboards with KPI metrics and trends. Automated systems collect, clean, store and catalog data from diverse sources. You constantly monitor data quality. Robust failure recovery infrastructure minimizes the need for manual intervention. Big Data technologies like Hadoop make it very cost effective to store large volumes of data forever, but they are just one piece of the puzzle. You also need reliable data workflow engines like Airflow, Pinball or Luigi to cover all of your bases.
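To make the data quality monitoring at this stage concrete, here is a minimal sketch in plain Python. The field names, the batch shape, and the 5% null-rate threshold are all assumptions for illustration – a real pipeline would run checks like this inside a workflow engine such as Airflow:

```python
def check_data_quality(rows, required_fields, max_null_rate=0.05):
    """Flag fields in a batch of records whose null rate exceeds a threshold.

    `rows` is a list of dicts, e.g. one batch from a daily ingest job.
    Returns a list of human-readable issues for alerting.
    """
    issues = []
    total = len(rows)
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) in (None, ""))
        if total and nulls / total > max_null_rate:
            issues.append(f"{field}: {nulls}/{total} missing")
    return issues

# A tiny, made-up batch with deliberately missing values:
batch = [
    {"user_id": 1, "revenue": 0.99},
    {"user_id": 2, "revenue": None},
    {"user_id": None, "revenue": 1.99},
]
print(check_data_quality(batch, ["user_id", "revenue"]))
# → ['user_id: 1/3 missing', 'revenue: 1/3 missing']
```

The point is less the check itself than that it runs automatically on every batch, so bad data is caught before it reaches a dashboard.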
Diagnostic
You can triage data issues – find the proverbial needle in the haystack – quickly and efficiently. Why did the ARPU (average revenue per user) on your mobile game go down this week? Was it because an Android system update increased the number of app crashes? Was it limited to a specific smartphone model? Or was there a problem with a data feed from one of your ad networks? Tools like Splunk, Presto and Druid can help you with such investigations.
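The segment-by-segment breakdown such a triage relies on can be sketched in a few lines. The event fields here (`os`, `user_id`, `revenue`) are hypothetical – in practice you would run the equivalent GROUP BY in Presto or Druid:

```python
from collections import defaultdict

def arpu_by_segment(events, segment_key):
    """Break ARPU down by one dimension (e.g. OS version, device model)
    to localize a drop."""
    revenue = defaultdict(float)
    users = defaultdict(set)
    for e in events:
        seg = e[segment_key]
        revenue[seg] += e["revenue"]
        users[seg].add(e["user_id"])
    return {seg: revenue[seg] / len(users[seg]) for seg in revenue}

# Made-up events for illustration:
events = [
    {"user_id": 1, "os": "Android 14", "revenue": 2.0},
    {"user_id": 2, "os": "Android 14", "revenue": 0.0},
    {"user_id": 3, "os": "iOS 17", "revenue": 2.0},
]
print(arpu_by_segment(events, "os"))
# → {'Android 14': 1.0, 'iOS 17': 2.0}
```

A segment with markedly lower ARPU tells you where to dig next – by device model, app version, or ad network feed.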
Predictive
Now that you have dependable data, you can build predictive models. Models that make your data product more useful. You’re thinking beyond obvious applications and making subtle improvements, like how Yelp automatically categorizes uploaded photos as “food”, “drink”, “inside” or “outside”. Your models can also help streamline operations – for example, when enterprise sales teams use propensity models to prioritize prospects based on their predicted likelihood to buy. Platforms like H2O, Dato, TensorFlow or Aerosolve are great for this stage of data maturity.
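A propensity model of the kind sales teams use can be sketched as a simple logistic scorer. The feature names, weights and bias below are invented for illustration – a real model would learn them from historical win/loss data on one of the platforms mentioned above:

```python
import math

def propensity_score(features, weights, bias):
    """Score a prospect's likelihood to buy with a logistic model:
    sigmoid of a weighted sum of features."""
    z = bias + sum(weights[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned weights and one prospect's features:
weights = {"demo_requested": 1.5, "company_size_log": 0.4, "emails_opened": 0.2}
prospect = {"demo_requested": 1.0, "company_size_log": 6.0, "emails_opened": 3.0}
score = propensity_score(prospect, weights, bias=-3.0)
# score ≈ 0.82 – a prospect worth prioritizing
```

Sorting the pipeline by this score is what lets a sales team spend time where it is most likely to pay off.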
Prescriptive
Your data is truly valuable when it enables high impact decisions. But sometimes there are too many variables and scenarios to consider manually. How do you best take advantage of all available opportunities? How do you mitigate risk? These are some of the questions you want to answer confidently. Prescriptive analytics allows you to simulate different scenarios, evaluate predicted outcomes and recommend an optimal strategy. Revenue management systems like those used for airline pricing or in the hospitality industry are classic examples.
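The simulate-evaluate-recommend loop at the heart of prescriptive analytics can be sketched as a small price search. The linear demand curve here is a stand-in assumption – a real revenue management system would fit its demand model to historical bookings:

```python
def expected_revenue(price, demand_fn):
    """Revenue predicted for a price under a given demand model."""
    return price * demand_fn(price)

def best_price(candidates, demand_fn):
    """Simulate each candidate price and recommend the one that
    maximizes expected revenue."""
    return max(candidates, key=lambda p: expected_revenue(p, demand_fn))

def demand(price):
    # Assumed linear demand curve: units sold fall as price rises.
    return max(0.0, 100.0 - 0.4 * price)

prices = [80, 100, 120, 140, 160]
print(best_price(prices, demand))
# → 120
```

Even this toy version shows the shift in mindset: the system no longer just predicts an outcome, it recommends an action.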
What stage are you on?
As we’ve seen earlier, a staggering number of solutions are available to help you move up the analytics value chain. But organizations often get the timing wrong – which is why only 27% of C-level executives think that their companies are getting any value out of their data. Don’t invest in analytics that your organization isn’t ready for yet. You might be spending a lot of resources on predictive technologies, but if your data collection is not dependable, you’re wasting your time. The predictions won’t work, and nobody will ever want to use them.
What a lost opportunity.
So, take a step back, and honestly evaluate what stage you’re on. After all, each stage lays the foundation for the next. You want strong foundations so you don’t end up with a house of cards. Prioritize resources towards technologies that are relevant at your stage of data maturity. Take the time to build confidence in them every step of the way. A credibility-focused approach will make it much easier to execute on your vision of building a data-first culture.