Today, I had the pleasure of participating in a fireside chat with Martin Giles from the MIT Tech Review as part of the Newsweek Structure Conference focused on cloud computing and infrastructure. It was an opportunity for me to share how the data center is evolving and transforming.
By 2025, an estimated 70 to 80 percent1 of data center systems will be deployed in hyperscale data centers. Major new application areas such as artificial intelligence, autonomous driving, medical imaging, advanced genomics, network function virtualization, and fintech are advancing quickly and radically changing the way we generate, gather, process, and interpret data.
This massive wave of data growth requires new thinking on data center architecture and technology. First, the building block of the data center needs to change – from discrete systems to rack scale architecture. Second, the complexity of solutions requires higher degrees of optimization and integration.
As an example, R&D for autonomous vehicles today is a massive data management challenge. Some R&D efforts are spending upwards of $1M in data center infrastructure for every autonomous car on the road – clearly not scalable. Intel is developing a data center platform based on a rack-scale design, which includes accelerators for AI, next-generation memory, high-speed interconnect, and intelligent tiered storage.
The role of the data center for autonomous vehicles will evolve from an R&D hub to a gateway for delivering new data services like HD maps, smart city services, traffic management, contextual services, and connected insurance, all of which we will help bring to market in the near future. Visit the Intel Newsroom to read more about Intel's autonomous driving efforts.
As another example of solution complexity, health care creates a tremendous amount of data. Worldwide healthcare data is projected to reach over two zettabytes2 by 2020, with up to 40 exabytes3 for genomics. Through a partnership with The Broad Institute, we developed the Broad Intel Genomics reference architecture (a.k.a. “BIGstack”), which integrates GATK4 -- the world’s most popular genome analysis software from Broad -- with optimized hardware and purpose-built software, including a job scheduler, a workflow execution engine, a genomics kernel library, a genomics database, and other purpose-built components.
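To illustrate what a workflow execution engine does at its core, here is a minimal sketch in Python that runs pipeline stages in dependency order. The stage names and the API are hypothetical examples, not BIGstack's actual components or interfaces:

```python
from collections import deque

def run_workflow(stages, deps):
    """Run each stage exactly once, respecting dependencies.

    stages: {name: callable} -- the work each stage performs
    deps:   {name: [prerequisite names]}
    Returns the order in which stages were executed.
    """
    indegree = {s: len(deps.get(s, [])) for s in stages}
    dependents = {s: [] for s in stages}
    for s, reqs in deps.items():
        for r in reqs:
            dependents[r].append(s)

    # Start with stages that have no unmet prerequisites.
    ready = deque(s for s, d in indegree.items() if d == 0)
    order = []
    while ready:
        s = ready.popleft()
        stages[s]()  # execute the stage
        order.append(s)
        for t in dependents[s]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    return order

# Hypothetical genome-analysis stages chained in order.
log = []
order = run_workflow(
    stages={
        "align": lambda: log.append("align reads"),
        "call_variants": lambda: log.append("call variants"),
        "annotate": lambda: log.append("annotate"),
    },
    deps={"call_variants": ["align"], "annotate": ["call_variants"]},
)
print(order)  # ['align', 'call_variants', 'annotate']
```

In a production engine, each stage would be submitted to the cluster's job scheduler rather than called inline, but the dependency-ordering logic is the same.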
A few days ago, at Supercomputing 2017, we announced that this platform now runs optimally on our latest Intel® Xeon® Scalable processors with Intel® Arria® FPGAs, resulting in a 3.34x speed-up in whole-genome analysis and a 2.2x increase in daily throughput4,5. This combination of hardware and software brings a performance-optimized, end-to-end solution to researchers, clinicians, and healthcare IT departments. Complex genomics analysis is reduced from weeks to hours.
It is exciting to see how analytics is transforming the evolution of the data center. It’s even more exciting when the role of the data center is harnessed for good, leading to the betterment of the entire industry, community, and environment.
5 Assuming a 3-year life cycle of the appliance, $522K initial cost for a 24-node configuration, 5 whole genomes per day per node analyzed, and 70% utilization of the system.
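For readers curious how the assumptions in footnote 5 translate into a per-genome cost, here is a back-of-the-envelope calculation in Python. The resulting dollar figure is my own derivation from those stated numbers, not a published Intel metric:

```python
# Assumptions taken directly from footnote 5.
NODES = 24
GENOMES_PER_NODE_PER_DAY = 5
UTILIZATION = 0.70
INITIAL_COST_USD = 522_000
LIFE_CYCLE_DAYS = 3 * 365  # 3-year life cycle

# Effective throughput after utilization: 24 * 5 * 0.7 = 84 genomes/day.
daily_throughput = NODES * GENOMES_PER_NODE_PER_DAY * UTILIZATION

# Total genomes analyzed over the appliance's life, and cost per genome.
lifetime_genomes = daily_throughput * LIFE_CYCLE_DAYS
cost_per_genome = INITIAL_COST_USD / lifetime_genomes

print(f"Daily throughput: {daily_throughput:.0f} genomes")
print(f"Cost per genome over 3 years: ${cost_per_genome:.2f}")
```

Under these assumptions the appliance works through roughly 84 genomes a day, which amortizes the initial hardware cost down to a few dollars per genome.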