In my conversations with companies looking to hire data scientists, I hear a recurring theme – “It’s so hard to find the right combination of industry background and technical chops.” Or, “We found the perfect fit. But there was no way we could match what Google / Facebook was offering, so we lost out. How do we build an effective team when the competition is so white hot?”
The problem with their approach is that they are trying to find data science unicorns. What is a data science unicorn? In my last post, I offered a useful way to think about a data scientist’s skillset. If you look closely, you’ll see that those skills don’t always overlap. For example, a software engineer won’t automatically have the statistical background to train, test and validate predictive models. And a statistician might not think through data latency issues that render the model useless. Can you find someone who is an expert in all aspects of data science? Sure, but it’s very hard. Like finding a unicorn.
Data science thought leader Monica Rogati has brought up this issue many times, and I agree with her. If you’re waiting for unicorn data scientists, you will likely be waiting a long time. There are very few of them, and they are in extremely high demand. Which means that you will have to shell out top dollar to get them on board. Also, chances are that they are working on their own ideas, and are not interested in being an employee anywhere.
Well, that’s not encouraging. How else can we approach this problem?
You Want Domain Expertise
A lot of data science discussions center around the latest tools and technologies. What gets downplayed is the role of industry background. Many industries are so complex that it takes years of active experience to gain an intuition for where the main business gaps and challenges lie. Without that intuition, you’d be shooting in the dark. You don’t want to build a solution that’s amazing on the technical front, but doesn’t address any real industry problems.
In specialized industries like banking and healthcare, it’s all the more imperative to be on top of the latest developments – not just to identify customer pain points, but also to avoid running afoul of industry regulations. For example: Does your predictive analytics engine expose sensitive patient data and violate HIPAA? Is your recommendation system PCI compliant? Without the right industry background, you wouldn’t know that you need to ask these questions. Ignorance – wilful or otherwise – is not an excuse, as Zenefits and Theranos are finding out.
Why Don’t You Train Your Current Team?
People on your current team already have the deep industry knowledge. So why don’t you train them to move up the data science value chain? There are a lot of great data science programs that could be part of their continuing education. Yes, they could do these on their own time, but here’s why I think it’s important to train on the job. The data. Many data science courses use examples with fabricated data, or data that is not applicable to your industry. You want to have your team learn using real data from your business, and they might not be able to do that unless you officially sanction their training programs.
Proactively identify who has the aptitude and the drive to step up to the role. You already do the same for management positions, don’t you? Your vote of confidence will reward you with highly committed and motivated employees, which is all the more important in high-turnover tech hubs like Silicon Valley.
What kind of training should you provide? To answer that, you first need to identify where your business currently lies on the Analytics Continuum. My former colleague Rob Koste has written some excellent thoughts on the different stages of data maturity at companies, and how to navigate from one stage to the next. Once you’ve identified where you stand and what skill gaps keep you from the next level, you’ll have a much better idea of who to train on your team and how to train them.
But I also realize that everyone on your team might be stretched really thin, and you just don’t have the spare headcount to train into a challenging new role. You have to hire new talent. We’ve already seen how difficult that is in the current environment. In my next post, I will present an alternative approach to recruiting for an effective data science team.