Co-authored by Andres Rodriguez, PhD, Deep Learning Solutions Architect at Intel
Deep learning is rapidly advancing artificial intelligence (AI) and revolutionizing various market segments. But what does a company need to succeed in deep learning? Here are 10 strategies to win in this space. These strategies have been successfully applied across various market segments such as energy, health, agriculture, finance, and transportation, from large organizations to small startups.
1. Start with an existing model
There is a fear that to use deep learning, companies need to gather large datasets of labeled data. However, the field of deep learning is unusually open and there are many publicly available trained models. Trained models exist for a variety of tasks such as video activity recognition, image localization, speech recognition, natural language understanding, and more. Companies can get started using an existing trained model.
2. Fine-tune training with your data
The performance of a model typically improves with more training data. Companies can take a model that was pre-trained on a very large dataset and then train it further, i.e., fine-tune it, on a dataset that is more representative of the company’s workloads. The more similar the company’s dataset is to the dataset used for pre-training, the less data and training time are needed to fine-tune the model. Conversely, the more the datasets differ, the more data and training time are required. If the datasets are too different, a pre-trained model may not help, and training a network from scratch may be required.
For example, a company can use a model trained on the 1000-class ImageNet dataset with 1.28M images and fine-tune for a specific task. If the task is a simple dog vs cat classifier, then the model can be trained in a few minutes with only thousands of images. If the task is nuclei classification for biomedical applications, then the model would require hours to train and tens of thousands of images.
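The idea of reusing pre-trained layers and training only a small task-specific part can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not a real workflow: `frozen_features` is a hypothetical stand-in for the frozen layers of a pre-trained network, the new "head" is a simple logistic regression, and the data is synthetic. A real project would use a deep learning framework and an actual pre-trained model.

```python
import math
import random


def frozen_features(x):
    """Hypothetical stand-in for the frozen layers of a pre-trained network."""
    # Fixed transformation; nothing in here is updated during fine-tuning.
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(samples, labels, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = frozen_features(x)
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    feats = frozen_features(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias) > 0.5


random.seed(0)
# Synthetic "company dataset": the label is 1 when x0 + x1 > 0.
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]
labels = [1 if s[0] + s[1] > 0 else 0 for s in samples]

weights, bias = fine_tune_head(samples, labels)
accuracy = sum(
    predict(s, weights, bias) == bool(y) for s, y in zip(samples, labels)
) / len(samples)
print(f"training accuracy of the fine-tuned head: {accuracy:.2f}")
```

Because the frozen features already expose the quantity the task depends on, a small dataset is enough to train the head. That mirrors the point above: the closer the new task is to what the pre-trained model already represents, the less data and training time are needed.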
3. Use professional services
Trained deep learning scientists are often necessary to properly train or fine-tune a deep network. However, talent is scarce: the top talent in deep learning is being absorbed by the big companies in this space. So what chance does that leave other companies to make a dent in deep learning? Fortunately, companies can draw on the expertise of deep learning teams by using professional services from other companies and from teams at Intel. For more information you can contact Fiaz Mohamed at Intel.
4. Train your staff
In parallel to using professional services, companies can train their workforce and can acquire talent from adjacent fields such as computational neuroscience, machine learning, and applied mathematics. There are online courses from Coursera and Udacity and data science competitions hosted by Kaggle to get your employees started. You can start with foundational courses at the Intel® Nervana™ AI Academy.
The Intel Nervana AI solution team is working with Prof. Kurt Keutzer from UC Berkeley to design a deep learning online course that focuses on deep learning usages across various markets. This course will be available in Q2 of 2017. Our goal is to democratize deep learning and train industries using examples that can be encountered in the real world.
5. Evaluate technology options
There are a number of choices including hardware, software libraries, and deep learning frameworks. Important questions to ask include:
- Can you continue to use your existing infrastructure?
- Do you prefer to use open-source software libraries?
- Do you prefer to use deep learning frameworks that are based on programming languages like Python?
- Can you use a cloud service provider?
There are different options available, depending on your answers.
6. If your organization requires it, have an on-premise plan
Many organizations require an on-premise appliance, even with advancements in securing data in the public cloud. This requires either buying an entire appliance with the hardware and software ready for deep learning workloads or building one themselves. The Intel Nervana AI platform offers a full-stack approach to AI solutions in the cloud or on-premise.
7. Use a scalable cloud service
Once you have trained a model on a small scale and are happy with the results, you will want to train it on a much larger dataset to get to a production-level model. Overall production time can be significantly reduced if the data science team can scale from the initial ‘researchy’ explorations to large-scale training with minimal incremental effort. The Intel Nervana Cloud is a hosted service that you can use today to build AI solutions.
8. Embrace ludicrous speed
Deep learning workloads can take a long time to run – days or even months in some cases. Embracing the fastest technology can give you a big competitive advantage.
Speed is perhaps what the Intel Nervana AI solution is best known for, i.e., achieving high speeds on hardware, including our upcoming Intel Nervana technology (code-named “Lake Crest”). Lake Crest is a processor specifically designed for deep learning workloads; Intel will test the first silicon in the first half of 2017 and will make it available to key customers later in the year. In addition, Intel announced a new product (code-named “Knights Crest”) on the roadmap that tightly integrates best-in-class Intel® Xeon® processors with the Intel Nervana technology.
9. Plan a data strategy
Even with the fastest deep learning processor, data needs to be fed continuously for the processor to operate at full utilization. The data loading pipeline needs to be multithreaded so that the upstream stages (data transfer from object store to local disk, local disk to host memory, on-the-fly augmentations, host memory to device memory, and finally device memory to device registers) all happen in parallel with minimal latency.
In addition, it is important to load, decode, and transform different file formats depending on the problem at hand.
The aeon module (and Intel Nervana Data Service for Intel Nervana Cloud) from Intel has been designed to provide a smooth user experience for these features.
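To make the idea of overlapping pipeline stages concrete, here is a minimal sketch using only the Python standard library. It is not how aeon is implemented; the stage functions `load`, `decode`, and `augment` are hypothetical placeholders, and a real loader would move actual bytes between storage, host memory, and the device. Each stage runs in its own thread and hands results to the next stage through a bounded queue, so one item can be loading while another is being decoded.

```python
import queue
import threading


def stage(worker, inbox, outbox):
    """Apply `worker` to every item from `inbox` and pass the result on."""
    while True:
        item = inbox.get()
        if item is None:  # sentinel: forward it downstream and stop
            outbox.put(None)
            break
        outbox.put(worker(item))


def run_pipeline(items, workers, buffer_size=4):
    """Run each worker in its own thread, connected by bounded queues."""
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(workers) + 1)]
    for i, worker in enumerate(workers):
        threading.Thread(
            target=stage, args=(worker, queues[i], queues[i + 1]), daemon=True
        ).start()

    def feed():
        for item in items:  # items themselves must not be None
            queues[0].put(item)
        queues[0].put(None)

    threading.Thread(target=feed, daemon=True).start()

    results = []
    while True:
        out = queues[-1].get()
        if out is None:
            break
        results.append(out)
    return results


# Hypothetical stages standing in for real I/O, decoding, and augmentation.
def load(path):
    return f"bytes-of-{path}"


def decode(raw):
    return raw.upper()


def augment(sample):
    return sample + "-flipped"


batches = run_pipeline(["a.jpg", "b.jpg", "c.jpg"], [load, decode, augment])
print(batches)
```

The bounded queues keep a fast stage from running far ahead of a slow one, and a single thread per stage preserves item order. In standard Python, threads mainly help when stages wait on I/O; CPU-heavy decoding or augmentation would typically need processes or native code to run truly in parallel.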
10. Have a deployment strategy
The path to deploying a trained model for use in a cloud app or embedded device needs to be seamless. There are power, memory, accuracy, and connectivity tradeoffs to consider in deciding on the solution. For example, an embedded system with smaller memory capacity may require a deep learning scientist to reduce the precision of the weights in the model and/or add sparsity to the model. If done incorrectly, this can negatively affect the model’s performance. Instead, the trained model needs further fine-tuning with the deployment constraints applied in order to retain high accuracy. The Intel Nervana solution provides a seamless path from training to deployment for edge platforms.
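The two compression steps mentioned above, reducing weight precision and adding sparsity, can be illustrated with a small standard-library sketch. The scheme shown (symmetric linear quantization to 8-bit integers and magnitude-based pruning) is one common choice picked here for illustration, not necessarily what any particular product uses, and the weight values are made up. The sketch covers only the transformations themselves; the fine-tuning needed afterwards to recover accuracy is not shown.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float weights."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, sparsity):
    """Set the smallest-magnitude fraction of the weights to zero."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Made-up weights for illustration only.
weights = [0.5, -1.27, 0.02, 0.9, -0.03, 0.31]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

pruned = prune_by_magnitude(weights, sparsity=0.5)

print("quantized:", quantized)
print("largest rounding error:", max_error)
print("pruned:", pruned)
```

Both steps change the weights the network was trained with: quantization introduces a rounding error of up to half the scale per weight, and pruning removes small weights entirely. This is why further fine-tuning under the deployment constraints is needed to retain accuracy.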
The Intel Nervana solution helps companies with any and all of these strategies. We offer a full solution stack. We have a team of deep learning scientists that can help you build a solution for your particular workload. We have neon and Intel Nervana Graph, a fully featured software framework that is interoperable with many other frameworks. We offer a cloud service for training and deploying deep learning models. We have an upcoming processor (Lake Crest) and appliance.
The Intel Nervana platform has been applied to real-world problems such as detecting tumors in healthcare, counting plants in agricultural robotics, finding oil rich regions in seismic data, building better speech interfaces in cars, building a time-series search engine for finance, and engineering better organisms through amino acid sequence analysis.
Our mission is no less than to transform all industries with the power of AI and deep learning to augment the world with safer, healthier, and more enriching experiences.
About the authors:
Andres Rodriguez is a Senior Technical Lead at Intel, where he designs deep learning solutions for Intel’s customers and provides technical leadership across Intel for deep learning products. He has 13 years of experience working in artificial intelligence. Andres received his PhD from Carnegie Mellon University for his research in machine learning. Prior to joining Intel, he was a Principal Scientist for deep learning with the Air Force Research Laboratory and an Adjunct Professor at Wright State University.
Arjun Bansal, Senior Director of Deep Learning Algorithms at Intel, leads the teams working on neon, Intel Nervana Graph, framework optimizations for Intel Architecture, and deep learning customer engagements, and helps define features for Intel Nervana products such as Lake Crest and Intel Nervana Cloud. Previously he was a co-founder of Nervana Systems, a provider of a high-performance, scalable cloud platform for deep learning that leverages distributed processor architectures. He received his PhD from Brown University and a BS in Computer Science from Caltech.