As Bioscience Heats Up, We Add Fuel to the Fire

A revolution is underway in the life sciences. It now takes only a few hours and about $1,000 to sequence a whole human genome [1], and $100 genome sequencing is coming soon [2]. Meanwhile, technologies such as Cryo-Electron Microscopy (Cryo-EM) and Molecular Dynamics (MD) are shedding new light on the fundamental processes of life. Scientists are using these tools to illuminate biological pathways at the molecular level and design drugs that precisely target specific disease mechanisms.

These innovations have all been enabled, in part, by advances in computing performance. However, computing has ironically become a primary bottleneck for many researchers today. The large amounts of data generated by today’s high-volume instruments and complex applications can take many hours, and in some cases far longer, to process.

Faster Software

Intel is working with leaders in the field to unleash a new wave of discovery by overcoming these challenges. It’s a two-part effort. The first part is to modernize some of the most important and computationally intensive scientific algorithms. Code optimization alone can often provide an order-of-magnitude or greater performance gain. For example:

  • Professor Knut Reinert at the Free University of Berlin is collaborating with Intel to accelerate genome analysis. His optimized algorithms can help speed performance by as much as 900X [3]. The algorithms are available in the open source SeqAn* library and can be used to quickly build complex, high-performing genomic pipelines.
  • Professor Simon Warfield of Harvard Medical School used Intel® software tools to reduce image processing times for his Diffusion Compartment Imaging (DCI) technique by up to 161X [4] through optimizations that improve single-node performance and enable multi-node scaling. DCI can now furnish high-resolution images of neural pathways and other soft brain tissues quickly enough to support potentially life-saving clinical diagnoses for brain injuries and disorders. The optimized algorithms are available in the Insight Toolkit* (ITK) library, which is widely used in applications that process medical imaging data.
  • Professor Erik Lindahl of Stockholm University and Professor Sjors Scheres of the UK MRC Laboratory of Molecular Biology have worked with Intel to reduce Cryo-EM image reconstruction times from as long as 47 hours to as little as 5.6 hours through redesigned algorithms and optimized code [5]. Cryo-EM allows scientists to explore cellular processes at near-atomic scale without the daunting complexities of X-ray crystallography. By reducing and simplifying computing requirements, the optimized code will help speed the adoption and use of this powerful imaging technique.
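
The speedup figures quoted above are ratios of a baseline measurement to an optimized one. As a rough illustration, the Python sketch below computes such a ratio for elapsed times, using the 47-hour and 5.6-hour Cryo-EM figures from the last bullet. The helper name speedup_from_times is hypothetical and is not part of any Intel, RELION, or SeqAn tool; it assumes lower-is-better timings for the same workload.

```python
def speedup_from_times(baseline_hours: float, optimized_hours: float) -> float:
    """Return how many times faster the optimized run is than the baseline.

    Hypothetical helper for illustration only. Assumes both arguments are
    elapsed times (lower is better) measured on the same workload.
    """
    if baseline_hours <= 0 or optimized_hours <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_hours / optimized_hours


if __name__ == "__main__":
    # Cryo-EM reconstruction times quoted in the text: 47 hours vs. 5.6 hours.
    ratio = speedup_from_times(47.0, 5.6)
    print(f"Cryo-EM reconstruction speedup: {ratio:.1f}X")
```

For the Cryo-EM numbers this works out to roughly 8.4X. The configuration footnote for that result lists different processors and precision settings for the two runs, so the ratio reflects the combined effect of software and hardware changes rather than code optimization alone.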

Faster, Simpler Hardware

These and other modernized codes save time, potentially lots of it. But optimizing code isn’t enough on its own. The second part of the effort is to optimize the hardware. To that end, Intel developed the Intel® Scalable System Framework (Intel® SSF). By combining high-performance compute, memory, storage, fabric, and software components into a balanced system framework, Intel SSF brings new simplicity to high-performance computing (HPC), so researchers can focus on biological science rather than computer science.

Intel SSF is designed to provide superior performance across the full range of bioscience and HPC workloads, including molecular dynamics, genomics, molecular imaging, machine learning and AI, data visualization, and more. It is also designed to provide greater density and power efficiency and to scale over multiple generations, which is a must for a field that is seeing such daunting levels of growth in data volumes and algorithmic complexity.

Intel SSF scales from small workgroup clusters to supercomputers and can be extended and tuned over time by adding compute nodes based on current and future Intel processors and accelerators. This flexibility and scalability facilitate faster and more accurate science. They also open the door to new experimental possibilities, so scientists and clinicians can imagine and explore what is possible instead of framing their research in terms of constraints.

You can learn more about Intel SSF.

Intel Life Sciences at SC17 and Intel® HPC Developer Conference

Would you like to learn more about Intel SSF and how it can help scientists and clinicians unleash the future? Come visit us at the Intel HPC Developer Conference or at SC17 this month in Denver. Below are some of the activities we’re most excited about. If you’re not able to make it to Denver, follow along on Twitter with @IntelHPC and #IntelHPC.

 

Schedule (event, time, topic, speakers):

Saturday 11/11, Intel HPC Developer Conference, Sheraton Denver Downtown
  • 1:30-2:00: Accelerating Cryo-EM Reconstruction (with RELION). Speakers: Erik Lindahl, Stockholm University; Charles Congdon, Intel
  • 3:15-3:45: Accelerated Characterization of Neural Circuits of the Brain. Speaker: Simon Warfield, Harvard Medical School

Tuesday 11/14, Intel Nerve Center, Booth 1301, Denver Convention Center
  • 12:00-1:00: Accelerated Characterization of Neural Circuits of the Brain. Speakers: Simon Warfield, Harvard Medical School; Kristina Kermanshahche, Intel
  • 1:00-2:00: Accelerating Cryo-EM Reconstruction (with RELION). Speakers: Erik Lindahl, Stockholm University; Kristina Kermanshahche, Intel

Wednesday 11/15, Intel Nerve Center, Booth 1301, Denver Convention Center
  • 2:00-3:00: Maximizing Unlabeled Data for Drug Discovery with Multilayer CNNs and Autoencoders. Speakers: Kyle Ambert, Intel; Kushal Datta, Intel; Kristina Kermanshahche, Intel

Thursday 11/16, Intel Nerve Center, Booth 1301, Denver Convention Center
  • 1:00-2:00: Generic Parallelization of SeqAn's Alignment Module. Speakers: Knut Reinert, Free University of Berlin; Rene Rahn, Free University of Berlin; Kristina Kermanshahche, Intel

 


Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

Normalized performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Copyright © 2017 Intel Corporation. All rights reserved. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation and its subsidiaries in the U.S. and other countries.

*Other names may be trademarks of their respective owners.

 

[1] Source: National Institutes of Health (NIH) National Human Genome Research Institute, The Cost of Sequencing a Human Genome, last updated July 6, 2016. https://www.genome.gov/27565109/the-cost-of-sequencing-a-human-genome/

[2] Source: Illumina introduces the NovaSeq Series—a New Architecture Designed to Usher in the $100 genome, Illumina press release, January 9, 2017. https://www.illumina.com/company/news-center/press-releases/press-release-details.html?newsid=2236383

[3] Baseline: 2-socket Intel® Xeon® processor E5-2650 v3 (2.3 GHz, 10 cores), Intel® Turbo Boost Technology off, Intel® Hyper-Threading Technology on, BIOS 2.0.1 (04/11/2016), 12 x 16 GB DDR4 RDIMM (2133 MHz), OS: Debian 3.16.43-2+deb8u3. System Under Test: Intel® Xeon Phi™ processor 7250 (1.4 GHz, 68 cores), Intel® Turbo Boost Technology 2.0 on, Instruction Set: Intel® AVX-512, OS provisioning Quadrant + Cache mode, BIOS 5.12 (04/14/2017), 6 x 16 GB DDR4 (Hynix HMA42GR7AFR4N-UH (PC4-2400)) and 2 x 8 GB MCDRAM (7200 MHz), OS: Linux nid00054 3.12.60-52.49.1_2.2-cray_ari_c.

[4] Baseline: Unoptimized DCI image processing workload running on Intel® Xeon® E5-2697 v2 @ 2.70 GHz (24 cores). The unoptimized software could only utilize one core of the processor. Systems Under Test: Optimized DCI image processing workload running on Intel® Xeon® E5-2697 v2 @ 2.70 GHz (24 cores) and Intel® Xeon Phi™ 7210 @ 1.30 GHz (72 cores). The optimized software was able to utilize all available cores on each processor.

[5] Tests conducted by Intel as of June 2017. Workload: Plasmodium Ribosome (EMD_2660). Baseline Configuration: 2-socket server with 2 X Intel® Xeon® processor E5-2697 V4 (18 cores, 2.3 GHz), 128 GB memory (8 x 16 GB @ 2400 MT/s DDR4 RDIMM) 1 x 1 TB SATA, Red Hat Enterprise Linux* 7.2, RELION v2.0.3 double precision. Test Configuration: 2-socket server with 2 X Intel® Xeon® Gold 6148 processor (20 cores, 2.4 GHz), 192 GB memory (12 x 16 GB @ 2666 MT/s DDR4 RDIMM) 1 x 800 GB Intel® SSD SC2BA80, Red Hat Enterprise Linux* 7.2, RELION 2.0.3 mixed precision.