Intel IT Celebrates a Decade of High-Performance Computing Success

[Image: By the Numbers]

Microprocessor design requires huge amounts of computing capacity. Therefore, it probably comes as no surprise that more than 120,000 servers in use by Intel are dedicated to silicon design. Each new generation of process technology—such as the transition from 65nm to 45nm in the past or 14nm and 10nm process technology-based design work now—brings a substantial increase in design complexity, requiring a major increase in design compute performance. Though increased performance is needed across the entire design process, the requirement is particularly acute at the highly compute-intensive tapeout stage. For those not steeped in silicon manufacturing lingo, “tapeout” is a process where Intel® chip design meets manufacturing; it’s the last major step in the chain of processes leading to manufacturing the masks used to make microprocessors.

Intel IT Meets the Business Need

A decade ago, Intel IT recognized the need for high-performance computing (HPC) to support tapeout and, in response, formed the HPC Center of Excellence (see our recent white paper, “Hyperscale High-Performance Computing For Silicon Design,” for details about this strategic program and its results). Since then, we’ve created five generations of HPC and delivered business value through a phenomenal ability to scale. Over those ten years, the HPC team has met a 9,100% increase in compute demand while achieving a 6,400% increase in reliability, even as the chip size continued to shrink and designs became exponentially more complex.

HPC by Design

When we designed the first generation of HPC at Intel, we focused on key challenge areas such as compute, network, storage, job submission scale and automation, stability, and cost. We synched each generation of HPC with process technology generations, but did not overhaul every aspect of the HPC environment with every generation. We only made changes when they made sense and would result in significant benefit. Using this “right-sized” approach to all aspects of technology, solutions, and processes associated with HPC enables us to gain substantial performance and throughput scale without “breaking the bank.”

[Image: Tapeout Challenges]

Technology Is Not Enough

HPC would not be possible without technology: servers, networks, storage, and so on. But it is equally true that even high-performance technology accomplishes nothing without people. Intel IT considers investing in our people, in their skills and knowledge, to be critical to our ability to transform Intel’s silicon design environment. We’ve shown that HPC doesn’t have to be cost-prohibitive, but it does require a team that is focused and committed to delivering significant business value.

Conclusion

As Intel IT’s chief technology officer and a senior principal engineer, I hope that other IT professionals will take note of Intel IT’s celebration of ten years of highly successful HPC: meeting the business need with unprecedented scale and throughput while building sustained and growing technical leadership. The white paper mentioned earlier provides more detail about each HPC generation at Intel. I’d also be interested in hearing HPC stories from other industries. What has worked well? What have you learned in your own HPC journey? Leave a comment below, and join the conversation!

Shesha Krishnapura

About Shesha Krishnapura

Shesha Krishnapura is an Intel Fellow and chief technology officer in the Information Technology organization at Intel Corporation. He is responsible for advancing Intel data centers for energy and rack space efficiency, high-performance computing (HPC) for electronic design automation (EDA), and optimized platforms for enterprise computing. He is also responsible for fostering unified technical governance across IT, leading consolidated IT strategic research and pathfinding efforts, and advancing the talent pool within the IT technical community to help shape the future of Intel. Shesha has led the introduction and optimization of Intel® architecture compute platforms in the EDA industry since 2001. He and his team have delivered five generations of HPC clusters and four supercomputers for Intel silicon design and device physics computation. Earlier in his Intel career, as director of software in the Intel Communications Group, he delivered the driver and protocol software stack for Intel’s Ethernet switch products. As an engineering manager in the Intel® Itanium® processor validation group, he led the development of commercial validation content that produced standardized workload and e-commerce scenarios for successful product launches. He joined Intel in 1991 and spent the early years of his Intel career with the Design Technology group. A three-time recipient of the Intel Achievement Award, Shesha was appointed an Intel Fellow in 2016. His external honors include an InformationWeek Elite 100 award, an InfoWorld Green 15 award and recognition by the U.S. Department of Energy for industry leadership in energy efficiency. He has been granted several patents and has published more than 75 technical articles. Shesha holds a bachelor’s degree in electronics and communications engineering from University Visvesvaraya College of Engineering in Bangalore, India, and a master’s degree in computer science from Oregon State University. 
He is the founding chair of the EDA computing board of advisers that influences computer platform standards among EDA application vendors. He has also represented Intel as a voting member of the Open Compute Project incubation committee since its inception.