Managing the Changing IT Landscape: Real-Time Predictive Analytics
There’s been huge interest in the press in the role of data science as an engine for predicting the March Madness* victors. A CIO* article reports on a Logistic Regression/Markov Chain (LRMC) model developed by four professors from Georgia Tech and Columbia that predicts Louisville as the 2014 champ, despite its #4 seed at the start of the tournament. After a tough test, Louisville is still in the game, but there are many high-ranking teams that have gone down in Cinderella upsets. In fact, there were 13 upsets in the 48 games that were played by the end of this past weekend.
Personally, I’m excited because my alma mater, the UConn Huskies (a #7 seed), beat Villanova (a #2 seed), though I will admit that rooting for your alma mater is not a good way to predict the outcome of the tournament.
Predicting the outcome of the “Big Dance”
I was curious about how these upsets affected the standings for the predictive models developed for the Intel-sponsored March Machine Learning Mania competition. For a little background on the contest, I recommend Robert Roble’s (@THESportsTechie) Intel iQ blog titled “Secret to Predicting March Madness Winners Hidden in Data.”
This contest is an excellent way for data scientists to hone their skills and get experience. That’s just what’s happening. Armed with historical data from the last five tournaments, teams developed their predictive models in a kind of practice round. The second phase of the March Machine Learning Mania kicked off with this year’s tournament. Predictions were set before tip-off as teams forecast not only the 2014 NCAA* champion, but all the games in between.
The leaderboards have been quite dynamic this past week, now that the results of the first 48 games are in the books. Why so much movement in the rankings? Unlike in your home bracket, success does not depend on how many of the teams you picked are still playing for the championship. The leading teams in the Kaggle* competition are succeeding because of how well their models forecast every game played so far. The models that most closely predicted those 13 upsets, as well as the other 35 results, are now the ones best positioned to predict the remaining 15 games.
These are the top 10 leaderboard rankings from before tip-off on March 18 (first round), from March 21 (second round), and from March 25 (after the “Sweet 16” were decided).
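To make the scoring idea concrete, here is a minimal sketch of how probabilistic game predictions can be graded. It assumes a log-loss style metric, which this post does not spell out, and every probability and outcome below is made up for illustration. The point is that a model is rewarded for well-calibrated confidence across all games, not for the number of winners it happened to pick.

```python
import math


def log_loss(predictions, outcomes, eps=1e-15):
    """Average log loss over a set of games (lower is better).

    predictions: predicted probability that the favorite wins each game
    outcomes:    1 if the favorite actually won, 0 if there was an upset
    """
    total = 0.0
    for p, y in zip(predictions, outcomes):
        # Clip so that a prediction of exactly 0 or 1 never hits log(0).
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(outcomes)


# Hypothetical results for three games: two favorites win, one upset.
outcomes = [1, 1, 0]

# Hypothetical model A is very sure of the favorite in every game.
confident = [0.95, 0.95, 0.95]
# Hypothetical model B is less sure, especially about the third game.
cautious = [0.80, 0.80, 0.60]

confident_score = log_loss(confident, outcomes)
cautious_score = log_loss(cautious, outcomes)

print(f"confident model: {confident_score:.3f}")
print(f"cautious model:  {cautious_score:.3f}")
```

Both hypothetical models favor the same three teams, so a simple bracket count would rate them equally. Under this metric the cautious model scores better, because the confident model pays a heavy penalty for being badly wrong about the single upset.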
Predictive analytics for business
The lesson for business is that an effective analytics model must continuously refine its predictions as new information arrives. Here’s why.
Predictive analytics extracts information from historical data sets to predict future outcomes and trends. This approach used by itself is flawed, since the world is constantly changing. One of the Holy Grails of big data analytics is the ability to practice predictive analytics and draw new insights with real-time data streams, new data sets, and competitive information that others may not yet be able to access. With machine learning and other analytic techniques, the incoming data can be used to continuously refine and improve the original analytics model. Business users and automated systems can have immediate access to up-to-date operational data, driving efficiency and competitive advantage.
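The idea of refining a model as data streams in can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a small logistic regression updated by stochastic gradient descent, one observation at a time, and the class name, learning rate, and data stream are all hypothetical rather than taken from any product or contest entry.

```python
import math


class OnlineLogisticModel:
    """Hypothetical logistic regression that learns one observation at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Return the predicted probability of the positive outcome."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        # Numerically stable sigmoid for large positive or negative z.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        ez = math.exp(z)
        return ez / (1.0 + ez)

    def update(self, x, y):
        """Take one gradient step on a single new (features, outcome) pair."""
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


# Made-up stream: the same signal keeps preceding a positive outcome.
model = OnlineLogisticModel(n_features=1)
before = model.predict_proba([1.0])  # untrained model: no opinion yet

for _ in range(50):
    model.update([1.0], 1)  # fold in each new observation as it arrives

after = model.predict_proba([1.0])
print(f"before: {before:.2f}, after: {after:.2f}")
```

The model is never retrained from scratch here. Each new observation nudges the existing weights, which is the behavior a real-time pipeline needs when data arrives faster than a full batch job can run.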
Predictions become progressively more accurate, leading to increasingly useful insights and smarter decisions. In applications where businesses can benefit from closely monitoring and predicting trends or making rapid decisions (retail, marketing, manufacturing, and energy, for example), real-time predictive analytics delivers significant value. One of the challenges with real-time analytics has been building cost-effective tools to analyze new sets of data more quickly. Achieving real-time analytics often means deploying the right IT solutions that can reduce big data processing time dramatically.
The March Machine Learning Mania competition has given us a number of interesting avenues to explore related to predictive analytics. If you’re interested in learning the basics, read the short Predictive Analytics 101.
Is your organization using or thinking about predictive analytics applications? I’m interested in your comments.