Intel Donates HPC Infrastructure to Pan-Cancer Analysis of Whole Genomes Project

We’re experiencing ever-increasing volumes of data within health and life sciences. If we were to sequence the ~14M new cancer patients worldwide[1] just once each, as tumour/normal (T/N) pairs, it would require more than 5.6 exabytes of storage. In reality, we need to be able to sequence them multiple times during the course of treatment, using a variety of omics and analytics approaches. The technical challenges of big data are many, from managing and storing such large volumes of data to analysing hugely complex datasets. However, we must meet these challenges head-on, as the rewards are very real.
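To make the scale concrete, here is a rough back-of-envelope sketch of that estimate. The per-patient figure is not stated above; it is simply what the two quoted numbers imply, assuming decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes). The `total_exabytes` helper and the number of sequencing passes are hypothetical, for illustration only.

```python
# Back-of-envelope check of the storage estimate quoted above.
# Assumption: decimal (SI) units, 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
GB = 10**9
EB = 10**18

new_patients = 14_000_000            # ~14M new cancer patients (from the text)
total_storage_bytes = 56 * EB // 10  # 5.6 EB (from the text)

# Implied storage per patient for a single tumour/normal (T/N) sequencing pass.
per_patient_gb = (total_storage_bytes // new_patients) / GB


def total_exabytes(patients: int, gb_per_patient: float, passes: int = 1) -> float:
    """Total storage in exabytes if every patient is sequenced `passes` times."""
    return patients * gb_per_patient * GB * passes / EB


print(f"Implied storage per patient: {per_patient_gb:.0f} GB")
print(f"One pass:     {total_exabytes(new_patients, per_patient_gb):.1f} EB")
# Hypothetical scenario: three sequencing passes during the course of treatment.
print(f"Three passes: {total_exabytes(new_patients, per_patient_gb, passes=3):.1f} EB")
```

On these assumptions, each T/N pair works out to roughly 400 GB, and the total grows linearly with every additional sequencing pass.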

I’m pleased to tell you about a significant project that Intel is supporting to help overcome these types of challenges, one that will assist in the drive to comprehensively analyse cancer genomes. Our HPC solutions are already helping organisations around the world to deliver better healthcare and helping individuals to overcome diseases such as cancer. And our relationship with the Pan-Cancer Analysis of Whole Genomes (PCAWG) project is helping scientists to access and share analysis of more than 2,600 whole cancer genomes, each matched with a normal sample (about 5,200 genomes in tumour/normal pairs).

Scientific discovery can no longer operate in isolation – there is an imperative to collaborate internationally, working across petabytes of data and statistically significant patient cohorts. The PCAWG project is turning to the cloud to enhance access for all, which will bring significant advances in healthcare through collaborative research.

By working directly with industry experts to accelerate cancer research and treatment, Intel is at the forefront of the emerging field of precision medicine. The hallmarks of precision medicine are advanced biomarkers, predictive analytics and patient stratification, and therapeutic treatments tailored to an individual’s molecular profile, and all of these are undergoing rapid translation from research into clinical practice. Intel HPC Big Data/Analytics technologies support high-throughput genomics research while delivering low-latency clinical results. Clinicians, together with patients, formulate individualised treatment plans informed by the latest scientific understanding.

For example, Intel HPC technology will accelerate the work of bioinformaticians and biologists at the German Cancer Research Centre (DKFZ) and the European Molecular Biology Laboratory (EMBL), allowing these organisations to share complex datasets more efficiently. Intel, Fujitsu, and SAP are helping to build the infrastructure and provide the expertise needed to meet this complex challenge.

The PCAWG project is in its second phase, which began with the uploading of genomic data to seven academic computer centres, creating what is in essence a super-cloud of genomic information. Currently, this ‘academic community cloud’ is analysing data to identify genetic variants, including cancer-specific mutations. And I’m really excited to see where the next phase takes us, as our technology will help over 700 ICGC scientists worldwide to remotely access this huge dataset and perform secondary analysis to gain insight into their own specific cancer research projects.

This is truly ground-breaking work, made possible by great scientists using the latest high-performance big data technologies to deliver life-changing results. At Intel, it gives us great satisfaction to know that we are playing a part in furthering knowledge, both in the wider genomics field and specifically in better understanding cancer, which will lead to more effective treatments for everyone.