Deep Learning and Artificial Intelligence Help World Bank Team Create Image-Recognition Models for Crowdsourced Photos

AI has made big strides in the past several years. E-commerce sites recommend future purchases based on past purchases, Alexa* and other digital assistants respond to our inquiries, and social media platforms help us organize and tag our photos for easier search. More and more organizations are bringing the power of AI to bear on their processes, initiatives, and operations.

The International Comparison Program (ICP) team did exactly that. The ICP team in the World Bank Development Data Group used Intel’s BigDL framework (a distributed deep-learning library for Apache Spark*) and a Databricks* platform on AWS running on Intel® Xeon® processors to help classify more than 1 million crowdsourced photos before sharing the dataset with the public.

The ICP Pilot Study

The photos were collected as part of the pilot data collection study the ICP commissioned from December 2015 to August 2016. For the project, paid contributors used smartphones to gather photos and price-related data for a variety of household goods and services (in 162 categories from food to footwear) in 15 countries: Argentina, Bangladesh, Brazil, Cambodia, Colombia, Ghana, Indonesia, Kenya, Malawi, Nigeria, Peru, Philippines, South Africa, Venezuela, and Vietnam.

To efficiently compare all those photos within and across countries, the ICP team turned to AI and deep-learning models that could help review, search, and sort the images into 162 categories.

In short, they needed to automate the process of confirming that the crowdsourced photos matched the goods and services for which the observations were submitted—and to remove personally identifiable information (PII) from the photos along the way.

What Was the Point?

Why go to all this trouble? Here’s a bit of context. The ICP has been around for 50 years, and this particular global data initiative is led by the World Bank under the auspices of the United Nations Statistical Commission. The ICP “measures world economies,” providing the kind of data that enables its parent organization, the World Bank Group, to pursue its larger mission: to reduce poverty, increase shared prosperity, and promote sustainable development by partnering with governments and the private sector around the world.

With an ultimate mission like that, the quality, integrity, and confidentiality of the data matter. The innovative crowdsourcing approach to collecting the data, coupled with the intensive use of AI in the cloud, helps the World Bank team reduce the labor-intensive work of manually reviewing, searching, and sorting the images. The completed dataset is then made public and is used to train various deep-learning models.

The Two-Phased Process

To arrive at a usable dataset from photos that varied in quality and were recorded in different languages with a mix of typed and handwritten text, the team focused on cleaning the images and understanding their reliability. This was achieved by classifying the images using models that run on Intel’s BigDL framework. Each image was identified as tagged correctly, tagged incorrectly, or invalid.

Next, the photos that were identified as tagged correctly were used to train a model that identified the types of goods or services presented in the photo.

Phase 1:

  • Define image quality and eliminate poor quality images.
  • Classify images to validate existing labels.

Phase 2:

  • Identify images with text in the existing dataset; circle text.
  • Recognize the words in that text.
  • Determine whether the text contains PII.
  • Blur areas with PII text (a minimal blurring sketch follows this list).
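
To illustrate only the last step of Phase 2, here is a minimal blurring sketch using OpenCV; it is not the team’s actual code, and the file names and bounding box are hypothetical stand-ins for a region that the upstream text-detection and PII-recognition steps have flagged.

    import cv2

    image = cv2.imread("receipt_photo.jpg")           # hypothetical input photo
    x, y, w, h = 120, 340, 200, 40                     # hypothetical PII text region from OCR
    region = image[y:y + h, x:x + w]
    image[y:y + h, x:x + w] = cv2.GaussianBlur(region, (51, 51), 0)  # heavy, irreversible blur
    cv2.imwrite("receipt_photo_redacted.jpg", image)   # PII region is now unreadable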

The Solution Architecture

The team built a solution architecture using the AWS Cloud running on Intel® Xeon® processors, Databricks Spark, and the Intel BigDL deep-learning framework. With BigDL, users can write their deep-learning applications as standard Spark programs, which run directly on top of existing Spark or Hadoop clusters.

This unified platform lets customers eliminate many unnecessary dataset transfers between separate systems, avoid maintaining separate hardware clusters (for example, distinct CPU and GPU clusters), and consolidate onto a single CPU cluster, reducing system complexity and end-to-end latency.
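
To make “standard Spark program” concrete, here is a minimal sketch (not the ICP team’s actual code) of how a BigDL application bootstraps on an existing Spark cluster; the application name and data location are illustrative placeholders.

    # A BigDL application is just a Spark program: the same SparkContext that
    # reads the crowdsourced images also drives distributed training on CPUs.
    from pyspark import SparkContext
    from bigdl.util.common import create_spark_conf, init_engine

    conf = create_spark_conf()   # adds the Spark settings BigDL needs
    sc = SparkContext(appName="icp-image-classification", conf=conf)
    init_engine()                # initializes BigDL on the executors

    # From here on, image data can be loaded as a normal Spark RDD
    # and fed into BigDL models (the path below is a placeholder).
    image_paths = sc.textFile("s3://example-bucket/icp-images/paths.txt")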

Here’s the solution architecture for the ICP pilot study:

Model Development and Results

The World Bank team used the Inception* v1 model for transfer learning and fine-tuning on a partial dataset. The team loaded a pretrained Caffe* Inception v1 model into BigDL and added a fully connected layer with a customized softmax classifier.
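
A minimal sketch of that setup, assuming the BigDL Python API; the file names and number of categories are illustrative, and in this simplified version the pretrained network’s 1,000-dimensional output is used directly as the feature input to the new layer.

    from bigdl.nn.layer import Model, Sequential, Linear, LogSoftMax

    # Load the pretrained Caffe Inception v1 (GoogLeNet) model into BigDL.
    pretrained = Model.load_caffe_model("deploy_inception_v1.prototxt",
                                        "inception_v1.caffemodel")

    n_categories = 69   # number of goods/services categories in this phase

    # Stack a new fully connected layer and a softmax-style classifier on top
    # of the pretrained network's 1000-dimensional output.
    classifier = Sequential()
    classifier.add(pretrained)
    classifier.add(Linear(1000, n_categories))
    classifier.add(LogSoftMax())   # pairs with ClassNLLCriterion during training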

By starting from pretrained weights, the team reduced training time and improved model accuracy compared to training an Inception model from scratch. The team was also able to scale the model effectively on multinode clusters in AWS Databricks.
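
Training itself runs through BigDL’s distributed Optimizer. The sketch below is illustrative rather than the team’s exact configuration: train_rdd is assumed to be an RDD of preprocessed BigDL Sample objects (image tensor plus category label), and the hyperparameters echo the values reported in the results below.

    from bigdl.nn.criterion import ClassNLLCriterion
    from bigdl.optim.optimizer import Optimizer, SGD, MaxEpoch

    # Distributed training: each mini-batch is split across all Spark executors,
    # so adding nodes increases throughput without changing the code.
    optimizer = Optimizer(
        model=classifier,                   # transfer-learning model from the sketch above
        training_rdd=train_rdd,             # assumed RDD of Sample(image tensor, label)
        criterion=ClassNLLCriterion(),      # matches the LogSoftMax output layer
        optim_method=SGD(learningrate=0.001),
        end_trigger=MaxEpoch(12),
        batch_size=1200)

    trained_model = optimizer.optimize()    # runs synchronous distributed training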

For Phase 1, the team first ran a test on a partial dataset (1,927 images, nine categories) to compare training from scratch vs. transfer learning vs. fine-tuning:

Since fine-tuning with Inception v1 showed the best results, it was used to complete model training on the whole dataset (994,325 images, 69 categories). This training was performed on a 20-node multinode cluster of AWS R4.8xlarge instances with Intel® Xeon® processors and produced the following results:

Nodes   Cores   Batch Size   Epochs   Training Time (sec)   Throughput (images/sec)   Accuracy (%)
20      30      1200         12       61125                 170                       81.7

Then, the team conducted scalability tests on that partial dataset with Inception v1 running on eight nodes vs. 16 nodes. The tests showed almost linear scaling with BigDL, with throughput increasing from 56.7 images/sec to 99.6 images/sec. Leveraging a native Spark deep-learning library like BigDL allowed the model to take full advantage of efficient distributed training.

Nodes   Batch Size   Epochs   Throughput (images/sec)   Training Time (sec)
8       256          20       56.7                      745.6
16      256          20       99.6                      424.7

As a result of the partial-dataset training and its scalability tests, the World Bank team was able to create an application that automates the validation process, confirming that photos gathered through the crowdsourced data collection pilot match the goods for which observations were submitted. That produced a clean, validated image dataset and helped the team understand the reliability of the collection.
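
As a rough sketch of what that validation step could look like with the trained model (the RDD names are illustrative, and the predicted class indices are assumed to be aligned with the submitted category labels):

    # Predict a category for every photo and keep only the observations whose
    # prediction matches the category under which they were submitted.
    predicted = trained_model.predict_class(photo_rdd)        # RDD of predicted class indices
    validated = predicted.zip(submitted_label_rdd) \
                         .filter(lambda pair: pair[0] == pair[1])
    print("Validated observations:", validated.count())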

You can try image classification using BigDL with this World Bank code at https://github.com/intel-analytics/WorldBankPoC.

This is just one of many examples of Intel’s BigDL platform enabling the application of AI and deep learning to solve real-world challenges. Try applying BigDL to your business and data challenges. Join in and contribute to the project: https://github.com/intel-analytics/BigDL

Thanks to Yulia Tell (Intel), Jennie Wang (Intel), and Maurice Nsabimana (World Bank) for providing relevant data points and information for this blog.

Jeff Wittich

About Jeff Wittich

Jeff is the Director of Intel's Cloud Service Provider (CSP) Business Strategy and Product Enabling team in the Data Center Group and is responsible for setting global strategic initiatives to accelerate cloud growth and deliver innovative platforms to CSPs. Over the last 14 years, Jeff has held a wide range of roles at Intel across engineering, management, and leadership, including product development for five generations of Intel® Xeon® processors. He holds a Bachelor of Science degree in Electrical Engineering from the University of Notre Dame and a Master of Science degree in Electrical and Computer Engineering from the University of California, Santa Barbara. This background gives him a wealth of knowledge that he now leverages to drive Intel's platform and business strategy for the fast-growing CSP market.