Microprocessor design requires huge amounts of computing capacity, so it probably comes as no surprise that more than 120,000 of Intel’s servers are dedicated to silicon design. Each new generation of process technology (the transition from 65nm to 45nm in the past, or today’s design work on 14nm and 10nm) brings a substantial increase in design complexity, which in turn requires a major increase in design compute performance. Though increased performance is needed across the entire design process, the requirement is particularly acute at the highly compute-intensive tape-out stage. For those not steeped in silicon manufacturing lingo, “tape-out” is the point where Intel® chip design meets manufacturing; it’s the last major step in the chain of processes leading to the manufacture of the masks used to make microprocessors.
Intel IT Meets the Business Need
A decade ago, Intel IT recognized the need for high-performance computing (HPC) to support tape-out, and in response formed the HPC Center of Excellence (see our recent white paper, “Hyperscale High-Performance Computing For Silicon Design,” for details about this strategic program and its results). Since then, we’ve created five generations of HPC, delivering business value through a phenomenal ability to scale. Over those ten years, the HPC team has met a 9,100% increase in compute demand while achieving a 6,400% increase in reliability, even as chip size continued to shrink and designs became exponentially more complex.
HPC by Design
When we designed the first generation of HPC at Intel, we focused on key challenge areas: compute, network, storage, job submission scale and automation, stability, and cost. We synchronized each generation of HPC with a process technology generation, but did not overhaul every aspect of the HPC environment each time. We made changes only when they made sense and would deliver significant benefit. Applying this “right-sized” approach to all of the technology, solutions, and processes associated with HPC lets us gain substantial performance and throughput scale without “breaking the bank.”
Technology Is Not Enough
HPC would not be possible without technology: servers, networks, storage, and so on. But it is equally true that even high-performance technology is inanimate without people. Intel IT considers investing in our people, in their skills and knowledge, to be critical to our ability to transform Intel’s silicon design environment. We’ve shown that HPC doesn’t have to be cost-prohibitive, but it does require a team that is focused on and committed to delivering significant business value.
Conclusion
As Intel’s Chief Technology Officer and senior principal engineer, I hope that other IT professionals will take note of Intel IT’s celebration of ten years of highly successful HPC: ten years of meeting the business need with unprecedented scale and throughput while building sustained and growing technical leadership. The white paper mentioned earlier provides more detail about each HPC generation at Intel. I’d also be interested in hearing HPC stories from other industries. What has worked well? What have you learned in your own HPC journey? Leave a comment below, and join the conversation!