Big Data’s Biggest Challenge: Getting Started

Amazon recently rocked the business and consumer world with their announcement that they would be exploring “a method and system for anticipatory package shipping.” This new data-driven plan recognizes shipping time as a barrier for online purchases and seeks to minimize the time between order and delivery with predictive logistics.

As someone who spent half a decade working in supply chain, I find this intersection of Big Data and logistics fascinating. The solutions to many logistical conundrums are not visible to the naked eye. They are buried within data, and are unlocked only with the right algorithm and a clear understanding of what you want to achieve.

For many businesses, getting started in advanced analytics can feel overwhelming – many have compared the challenge of Big Data solutions to being tasked with “boiling the ocean.” So where should your organization start?

I recently came across an Intel IT case study that captures an entire project from beginning to end, as Intel’s sales organization partnered with IT to solve a supply chain issue using advanced analytics.

To properly set up the analytics challenge, Intel IT used the following 3-step approach:

1. Understand the Business Domain

Before Intel could start mining data to fuel a predictive analytics engine to advance sales and reduce costs, they needed to better understand the relationship between Intel, manufacturers, distributors, resellers and end-users.

As they began to understand the business domain, they realized a need to find hidden patterns and deciding factors in data that were not available in the sales database. The solution also had to be automated to help the sales organization most efficiently use the program database each quarter.

2. Devise a Solution

To build a solution, a three-phase approach was necessary:

  1. Determine what data sets could be used for mining.
  2. Build the predictive analytics engine that modeled and learned from the mined data.
  3. Optimize the engine to include feedback.

3. Identify Available Data Sets

Once Intel IT understood the business problem they needed to solve, they started looking for data sources to mine. As they identified those sources, they populated the enterprise data warehouse (EDW) with demographic, training, and sales data, which enabled them to create customer profiles. Once the profiles were outlined, they added information that would help the sales organization determine each customer’s revenue potential.
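To make that step a little more concrete, here is a minimal sketch in Python of how demographic, training, and sales records might be combined into one profile per customer. This is not Intel IT's implementation. The field names (customer_id, region, segment, courses_completed, revenue), the sample records, and the build_profiles function are all hypothetical, chosen only for illustration.

```python
# Hypothetical sample records; field names and values are illustrative assumptions.
demographics = [
    {"customer_id": "C1", "region": "APAC", "segment": "reseller"},
    {"customer_id": "C2", "region": "APAC", "segment": "distributor"},
]
training = [
    {"customer_id": "C1", "courses_completed": 4},
]
sales = [
    {"customer_id": "C1", "quarter": "Q1", "revenue": 1200.0},
    {"customer_id": "C1", "quarter": "Q2", "revenue": 1800.0},
    {"customer_id": "C2", "quarter": "Q1", "revenue": 500.0},
]


def build_profiles(demographics, training, sales):
    """Merge the three data sets into one profile per customer."""
    profiles = {}

    # Start each profile from the demographic record.
    for row in demographics:
        profiles[row["customer_id"]] = {
            "region": row["region"],
            "segment": row["segment"],
            "courses_completed": 0,
            "total_revenue": 0.0,
            "quarters_active": 0,
        }

    # Attach training data where a matching customer exists.
    for row in training:
        profile = profiles.get(row["customer_id"])
        if profile is not None:
            profile["courses_completed"] = row["courses_completed"]

    # Aggregate sales history per customer.
    for row in sales:
        profile = profiles.get(row["customer_id"])
        if profile is not None:
            profile["total_revenue"] += row["revenue"]
            profile["quarters_active"] += 1

    return profiles


profiles = build_profiles(demographics, training, sales)
for customer_id, profile in profiles.items():
    print(customer_id, profile)
```

A real project would pull these records from the data warehouse rather than from in-memory lists, and the profile fields would depend on what the business domain analysis turned up.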

Once the business challenge, domain, solution, and data sets had been properly defined, how did Intel IT develop their predictive analytics engine? Discover all the details here.

Finishing Strong

While it can feel overwhelming to start Big Data projects, the Intel sales and IT organizations are pleased with the ultimate results. An estimated $3M in incremental revenue has already been realized in the Asia Pacific region, and the initiative is expected to deliver up to $20M when deployed globally.

The next step is to use open source software, including the Intel Distribution for Apache Hadoop software, to mine unstructured data for additional insights and predictive capability.

Learn more about the Intel Distribution, the open source software that includes Apache Hadoop and other software components along with enhancements and fixes from Intel.

In the comments section, or on Twitter, tell me: what is holding you back from getting started on a Big Data project?


Chris Peters is a business strategist with more than 21 years of experience spanning information technology, manufacturing, supply chain, nuclear power, and consumer products.

Find him on LinkedIn.

Follow him on Twitter (@Chris_P_Intel)

Check out his previous posts and discussions