When Big Data Means Bad Analytics

When analytics delivers disappointing results, it is often because there is not enough analytic expertise, a lack of understanding of the business's objectives for using Big Data in the first place, or both. To avoid failure, insist on high standards.

By TJ Horan, FICO. http://www.kdnuggets.com/

Now that data scientist has been named one of the sexiest jobs of the 21st century, a lot more people are claiming that they “do analytics,” and a lot more companies are claiming the same. The rise of Big Data means there’s more demand than ever for these types of businesses.

Unfortunately, few of these “analytics arriviste” companies have the experience necessary to follow best practices in terms of analytic processes.  And more to the point, many of these companies haven’t done the important-but-unglamorous work of really understanding the data that drives both the decisions and the building of analytic models.

When analytics delivers disappointing results, it is often for one or both of these reasons: 1) there is not enough analytic expertise, or not the right kind; and 2) there is a lack of understanding of the business’s objectives for using Big Data in the first place. That’s why this may be a year of bad analytics.

Exploring huge amounts of data with advanced analytic tools can be fun, but it can also be a huge waste of time and resources if the results do not translate into something that solves real-world business problems.  And while today’s analytic tools are becoming more robust, that doesn’t mean there is less of a need for human expertise.  Analytic expertise informed by deep domain knowledge is essential for building effective predictive and decisioning models.

Today’s shortage of analytic talent puts more pressure on organizations to ensure they engage with well-trained data scientists, either their own in-house experts or vendors with whom they choose to work.  A McKinsey study predicts that data science jobs in the US will exceed 490,000 by 2018, but there will be fewer than 200,000 data scientists.  The fact that there is a talent shortage only makes the issue all the more urgent to address.

But don’t be discouraged by those numbers; there is some good news. First, businesses are getting smarter about what analytics can do, how it works, and how to tell the good from the bad. Second, higher education institutions are creating programs to better train tomorrow’s data scientists. For example, San Diego State University MBA students participate in a semester-long program that enables them to bridge the gap between the theory and practice of analytics by using current, real-world decision management and modeling tools. Students work with real-world data in which values are sometimes unreliable or missing, and they sometimes have to make decisions with incomplete information, just as today’s businesses do.

Working with universities to increase enrollment and improve the data science curriculum, and expanding internships and work practicums where students gain real-world working knowledge of the analytics they are trying to master, will certainly help improve the situation in the long term.

If you plan to engage with a vendor for a Big Data project, make sure they really understand your business and the data that drives both the decisions and the building of the analytic models. With the growth of both open source and commercial data science tools, newer “data scientists” often use these tools without a true understanding of how they work, what their parameters do, and how the results affect your business decisions.

When looking to recruit a data scientist, ensure you have a highly selective hiring practice and look for talent with knowledge in statistics, mathematics, or programming.  And remember that it is improbable you will find one person with all the skills, knowledge and background needed.  So instead, consider having your data analysts work on collaborative teams, and create team-based agile analytic model development projects.

When done properly, Big Data can lead to real business results and benefits.  Bad analytics will get exposed and rejected by the market — but that will take time.  We as analytics leaders need to help the market in this difficult transition.

Bio: TJ Horan oversees fraud solutions for analytic software company FICO. He blogs on the FICO Blog and is on Twitter as @FraudBird.