HPC-Accelerated Genomic Insights with BIGstack

Advancements in the field of genomics are revolutionizing our understanding of human biology, rapidly accelerating the discovery and treatment of genetic diseases, and dramatically improving human health. Genomics, the study of genetic material, is allowing medical professionals to enhance clinical care based on each patient’s specific genetic makeup. Thanks to ongoing IT innovations, genomic research is moving the industry toward a promising future of personalized healthcare and precision medicine.

To facilitate this transition, Intel and the Broad Institute of MIT and Harvard announced the Center for Genomic Data Engineering. The Center is collaborating with Hewlett Packard Enterprise (HPE) to ramp up genomic analytics workflows at unprecedented scale and with easier deployment. Sharing a common vision of finding better methods of acquiring, processing, storing, and analyzing genomic data, these organizations offer a series of powerful and flexible data center solutions that combine best-of-breed hardware with optimized software to enhance researchers’ ability to analyze genomic datasets collected from diverse sources.

The Genome Analysis Toolkit (GATK) from the Broad Institute is the industry’s foremost software tool in the field of genomic analytics. The fourth version of this software package (designated GATK4) is available under an open source software license to help researchers eliminate infrastructure-related complications and derive insights from large genomic datasets more easily, quickly, and efficiently.

Intel and the Broad Institute have developed a breakthrough reference architecture called the Broad-Intel Genomics Stack (BIGstack), which delivers a 5X improvement to the Broad’s genomics analytics pipeline. Through this partnership, BIGstack will drive speed, scale, ease of deployment, and global alignment in the rapidly growing genomics community.

As an original equipment manufacturer (OEM) partner, HPE is excited to bring to market solutions based on this breakthrough architecture. HPE has developed a purpose-built, unified compute and storage solution designed for next-generation sequencing (NGS) workflows, delivering the throughput and efficiency needed to keep pace with the growing demand for genome analysis. Using high-performance computing (HPC) solutions such as HPE Apollo systems and HPE ProLiant servers, researchers gain the speed and performance necessary to power advanced data analysis techniques such as artificial intelligence (AI) and deep learning.

With Intel® architecture, users can harness breakthrough levels of performance to eliminate bottlenecks, support data-heavy workloads, and scale seamlessly to accommodate tomorrow’s challenges. These technologies enable researchers to rapidly train deep neural networks on robust, scalable infrastructure to accelerate AI and genomic insights. Additionally, the Intel® Scalable System Framework is a holistic solution that optimizes compute- and data-intensive processes as well as deep learning techniques to reduce time to insight.

The field of genomics is experiencing a major paradigm shift, and developments over the next decade will undoubtedly accelerate the use of technologies like HPC and AI across the industry. As leaders in their respective fields, Intel and the Broad Institute are working together to develop solutions that will help researchers overcome the challenges of diverse genomic datasets and speed time to results. HPE is proud to contribute to this milestone in advancing genomics research.

The HPE and Intel HPC Alliance is driving major improvements to a variety of enterprise operations, and genomic analytics is just the beginning. To learn more about how these cutting-edge capabilities are revolutionizing the life sciences industry, I invite you to follow me on Twitter at @pango. You can also check out @HPE_HPC and @IntelHPC for up-to-the-minute news and updates on the latest innovations in HPC, AI/deep learning, supercomputing, and more.