Bio-IT World: Scalable Whole Genome Analysis

by Charlotte Rasmussen

As discussed in an earlier blog post, QIAGEN has been working with Intel to combine infrastructure with genome analysis tools, enabling massively scalable whole genome analysis at lower cost. Now, there’s a new white paper detailing the reference architecture and other technical information for our joint solution.

Designed to help NGS scientists keep their sequencing pipelines running smoothly even at capacity, all while saving money and producing better results, our solution provides whole genome analysis for as little as $22 per genome. It meets the computational and analysis demands of Illumina’s HiSeq X Ten, and Intel’s 32-node offering can save researchers up to $1.3 million in total cost of ownership compared to the 85-node cluster recommended by the vendor for a BWA+GATK variant calling pipeline.

Here’s a quick look at what makes our solution different:

  • Built-in analysis tools: The system uses the Biomedical Genomics Server solution.
  • Scalability: Designed to scale on-demand for computing, networking, and storage, the cluster allows labs to manage capacity easily and cost-effectively.
  • Proven accuracy: While efficiency and cost-effectiveness are important factors in NGS data analysis, the solution’s accuracy in both variant calling and interpretation is proven to be among the best.
  • User friendly: The solution masks the complexity of cluster computing with the easy-to-use Biomedical Genomics Workbench.
  • Fast connection to data: We used a high-speed interconnect system based on Intel True Scale Fabric to link the compute nodes and centralized storage, providing up to 40 Gbps of bandwidth per port.
  • Parallel storage: The solution incorporates Intel Enterprise Edition for Lustre, the world’s leading parallel storage system, to keep all the nodes, cores, and threads operating at high efficiency.

For more details, check out the full white paper.

Our tests showed that the 32-node system could process and analyze 48 genomes in 24 hours, on average — enough capacity to handle all the data produced by a HiSeq X Ten. We also tested the system with exome data and successfully analyzed approximately 1,440 human exomes every 24 hours.
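To put those throughput figures in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the helper names are ours for illustration, and the node-hours estimate assumes all 32 nodes are fully occupied for the whole 24 hours, which is a simplification on our part rather than a figure from the white paper.

```python
# Figures quoted in this post.
NODES = 32
HOURS_PER_DAY = 24
GENOMES_PER_DAY = 48
EXOMES_PER_DAY = 1440


def per_hour(samples_per_day):
    """Average cluster-wide throughput in samples per hour."""
    return samples_per_day / HOURS_PER_DAY


def node_hours_per_sample(samples_per_day, nodes=NODES):
    """Node-hours spent per sample, ASSUMING every node is busy all day."""
    return nodes * HOURS_PER_DAY / samples_per_day


genomes_per_hour = per_hour(GENOMES_PER_DAY)
exomes_per_hour = per_hour(EXOMES_PER_DAY)
genome_node_hours = node_hours_per_sample(GENOMES_PER_DAY)
exome_node_hours = node_hours_per_sample(EXOMES_PER_DAY)

print(f"Genomes per hour: {genomes_per_hour:.1f}")
print(f"Exomes per hour: {exomes_per_hour:.1f}")
print(f"Node-hours per genome: {genome_node_hours:.1f}")
print(f"Node-hours per exome: {exome_node_hours:.2f}")
```

In other words, the cluster finishes a whole genome every 30 minutes on average, and roughly one exome per minute.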

Together with Intel, we’ll present this joint solution at the upcoming Bio-IT World conference, in a talk addressing the growing demand for population-scale genomics.

Bio-IT World 2016

On April 5-7 we’ll be in Boston at the Annual Bio-IT World Conference & Expo, where, together with Intel, we’ll present this joint solution in a talk on the growing demand for population-scale genomics. We’ll demonstrate how our companies have partnered to design a reference architecture that addresses these challenges cost-effectively.

You can visit us at booth #229, and you’re of course very welcome to join our presentation on Wednesday, April 6:

Title: A Reference Architecture for High-Volume Whole Genome Data Analysis

Date and time: April 6 at 3:30 PM

Location: Vendor Theater

Speakers: Mikael Flensborg, Director of High Volume Sequencing Solutions, QIAGEN and Michael McManus, Senior Health & Life Sciences Solution Architect, Intel

If you’d like to learn more but are not able to attend the conference, please feel free to email us.

We're looking forward to seeing you in Boston!

More information about Bio-IT World

Charlotte Rasmussen is a scientific correspondent at QIAGEN, where she summarizes and communicates scientific information and customer stories.