Data science is a hot field (duh). It was popularized by Billy Beane of Moneyball and Nate Silver of the FiveThirtyEight blog, and its proponents say data science will save us from the massive quantities of data generated by companies, governments, and even ourselves. An influential (and widely cited) study by McKinsey indicates that the U.S. alone will face a shortage of 140,000 to 190,000 people with analytical expertise. Many universities are creating new programs to train this new batch of data scientists. (A great resource is this list of schools with Masters in Data Science programs: http://www.mastersindatascience.org/schools/23-great-schools-with-masters-programs-in-data-science/) Some of the most popular MOOCs (Massive Open Online Courses) are about Data Science and Machine Learning, including ones offered by Udacity and Coursera.
But who are these new Data Scientists? A recent story from CNBC titled “Why your kids will want to be data scientists” focuses on pay and job prospects; it says nothing about passion, scientific curiosity, or value to the world. This prompted Steven Mills to tweet, “If salary is driving the next generation of data scientists I worry we will lose the passion.” The key notion is the (implied) relationship between the passion an individual has for a field and the quality of their work. It seems intuitive that new entrants who are chasing job prospects and high salaries will not have that passion for the field.
So why should we care? As Data Science managers, we want to hire the best talent for our teams. But it’s difficult to distinguish between the passionate ones and the ones just in it for the money. So what do we do? Some have proposed that we look at an individual’s activities beyond formal education: participation in Data Science contests like Kaggle, TopCoder, or InnoCentive, or in volunteer organizations like Data for Good, DataKind, or Code for America. Joining a Data Science oriented MOOC is also mentioned as a way to measure an individual’s passion for the field. Completion of a relevant, well-regarded MOOC is a good signal, but it is a rare one: completion rates average less than 10%. Still, for individuals who don’t have formal Data Science training (people looking for a career change, or holding degrees without significant computer programming and/or statistics requirements), MOOCs can be a valuable, and often inexpensive, alternative.
In my experience, none of these are foolproof signals for finding passionate Data Scientists. Many of the members of my team came from backgrounds other than Data Science, Statistics, or Computer Science. Many of them also never participated in a Data Science contest or took a MOOC.
At this point, I’m left with more questions than answers.
How do you find passionate Data Science practitioners?
Michael Cavaretta is a Data Scientist and Manager at Ford Motor Company in Dearborn, Michigan. He is a leader of the Predictive Analytics group in Research and Advanced Engineering.