On Parallel Paths: HPC and Many-Core Architectures

Over the past decade or so, a quiet but dramatic change has come to the world of computing. From desktop systems to supercomputers, systems based on single-core processors have given way to systems based on multi-core and many-core processors.

Ten years ago Intel launched its first dual-core CPU for data center applications. Today a common two-socket server has as many as 44 processor cores, and we even have amazingly parallel and energy-efficient many-core processors that are available to scientists and researchers, engineers, and developers to help them gain insights faster than ever before.

A single many-core Intel® Xeon Phi™ 7250 processor can compute more than 6 trillion single-precision floating-point operations per second while consuming only as much power as a couple of incandescent light bulbs. We’re talking about a change that is nothing less than revolutionary. This is technology that will allow us to build computers that can eventually solve problems 500 to 1,000 times faster than today’s most powerful computers.

A blade module with the Xeon Phi processor prominently shown

A Transformation in Parallel-Processing Architecture

So, you might ask, does this mean that all of my workloads will accelerate immediately, now that we have amazingly parallel processors? Maybe, but only if the application was designed to take advantage of the features in a modern processor. Many applications have a heritage of 10 years or more, predating even the dual-core era, and their developers used techniques suited to the platforms of that day.

In 2014, the US Department of Energy Task Force on Next Generation High Performance Computing concluded that next-generation large-scale systems would require developers to step back and consider how their algorithms and applications could be designed to extract the full benefit of modern parallel-processing architecture. The task force noted that “it is safe to assume that entirely new and innovative algorithms will need to be invented to make cost-effective use of the more powerful system.”*

At Intel, this need for code modernization is top of mind. We are working actively with the developer community and the ecosystem to help people optimize their applications and, where necessary, take advantage of Intel multi-core and many-core processing architectures that will power the future of computing.

HPC Developers Gather to Share Knowledge and Learn New Techniques

The Intel focus on software will be front and center this week at the Intel HPC Developer Conference in Salt Lake City. This conference offers many opportunities to explore industry-specific approaches and techniques for tackling real-life challenges in parallel programming, HPC, and artificial intelligence.

One of these challenges is the need to master modern parallel programming techniques, including effective use of shared memory and SIMD (single instruction, multiple data) programming in the latest Intel processors. The goal is to write efficient applications that take full advantage of the features of today’s and tomorrow’s many-core processors.
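As a rough illustration of the shared-memory side of this, the sketch below splits one loop across several workers that all read the same arrays and then combines their partial results. The function names (partial_dot, parallel_dot) and the choice of a dot product are assumptions made for this example, not code from Intel's materials. It does not show SIMD, which pure Python cannot express, and in standard CPython the global interpreter lock keeps these threads from running the arithmetic concurrently, so read it as a picture of the decomposition rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_dot(x, y, start, stop):
    """Dot product of x[start:stop] and y[start:stop].

    Each worker only reads the shared lists and returns its own
    partial sum, so no locking is needed.
    """
    total = 0.0
    for i in range(start, stop):
        total += x[i] * y[i]
    return total


def parallel_dot(x, y, workers=4):
    """Split the index range into contiguous chunks and sum the partial results."""
    n = len(x)
    chunk = max(1, (n + workers - 1) // workers)  # ceiling division, at least 1
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(partial_dot, x, y, s, e) for s, e in bounds]
        return sum(f.result() for f in futures)


# Hypothetical input data for the example.
x = [float(i) for i in range(1000)]
y = [2.0] * 1000
result = parallel_dot(x, y)
```

In a compiled language the same pattern is usually written as a single loop that the compiler or a threading runtime divides among cores, with the innermost arithmetic mapped onto SIMD instructions.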

No one company—not even Intel—can solve this problem by itself. But Intel can play a major role. In the last two and a half years, we have collaborated with application developers and communities; trained more than 10,000 programmers; and invested in developing case studies, online tutorials, and books with public domain source examples, along with cutting-edge software compilers, analysis tools, and performance libraries.

One of these tools is the Intel® Math Kernel Library (Intel® MKL), which provides a standard library for common computational functions like linear algebra or fast Fourier transforms. I recently got a close-up look at the benefits of Intel MKL and other Intel tools at the Parallel Application Challenge during HPC China, where more than 2,500 students and teachers competed to demonstrate proficiency in parallel programming. The winning participants used Intel MKL to increase the performance of a parallel application by 60x, a truly astounding accomplishment. The team’s professor, who also had a team win second place, mentioned that a secret of his program’s success was integrating into it the tutorials and materials Intel made available online.

Another challenge for developers revolves around the use of HPC techniques to enable applications that incorporate artificial intelligence technologies such as machine learning. The techniques that speed both traditional and AI workloads are fundamentally parallel, and many of today’s most popular deep learning techniques feature dense linear algebra at scale. In fact, the winning application from the HPC contest in China was a deep learning application.
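To make the link between deep learning and dense linear algebra concrete, here is a minimal sketch of one fully connected neural network layer written as a matrix product followed by a bias and a ReLU. The names matmul and dense_layer and the tiny input values are assumptions chosen for illustration. Every output element can be computed independently of the others, which is why this kind of work parallelizes so well. A real application would hand the multiplication to an optimized library rather than use plain Python loops.

```python
def matmul(a, b):
    """Multiply an m x k matrix by a k x n matrix, both stored as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def dense_layer(inputs, weights, bias):
    """Fully connected layer with ReLU: max(0, inputs * weights + bias) per element."""
    z = matmul(inputs, weights)
    return [[max(0.0, v + b) for v, b in zip(row, bias)] for row in z]


# Hypothetical batch of two samples with two features each,
# mapped to three output units.
batch = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, -1.0, 0.5], [0.5, 1.0, -1.0]]
bias = [0.0, 1.0, 0.0]
out = dense_layer(batch, weights, bias)
```

At realistic sizes the batch and weight matrices have thousands of rows and columns, and the three nested loops in matmul dominate the running time.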

The exciting future of HPC combines traditional HPC algorithms with machine learning techniques, replacing human intervention to speed time to results. We see a symbiotic relationship between simulation and machine learning developing for the next wave of large-scale scientific and business applications.

While there are many other challenges we could talk about, in the interest of brevity I will highlight just one more: the commonly faced issue of high-productivity languages. While they offer some great advantages to the developer and are the backbone of how students are trained to program today, abstracted languages like Java and Python can be a death sentence for the performance of parallel applications. But today, new parallel runtimes are available that allow HPC developers to use these newer languages with the goal of delivering performance comparable to that of compiled languages like C, C++, and Fortran.

Looking ahead, optimized productivity languages can play a key role in expanding the use of data science, machine learning and AI, visualization, and high-performance computing on a scalable hardware infrastructure. Intel’s vision of this infrastructure is embodied in the Intel® Scalable System Framework.

Join Us at the HPC Developer Conference

If you’re in Salt Lake City this week, you can get a closer look at the tools and technologies for the many-core era at the Intel HPC Developer Conference. The conference will address all of these themes: parallel programming, artificial intelligence, productivity languages, visualization, and systems design (including Intel® Scalable System Framework), with hands-on labs for developers.

The conference and its technical presentations and hands-on labs are completely free and open to all interested parties. Better still, the conference is essentially a convocation of researchers, developers, and system experts working in the HPC ecosystem, so it offers great opportunities to meet with your peers, share insights, and pick up some tips and tricks for parallel programming.

* Secretary of Energy Advisory Board, Report of the Task Force on Next Generation High Performance Computing, August 18, 2014, page 16. http://energy.gov/sites/prod/files/2014/10/f18/SEAB%20HPC%20Task%20Force%20Final%20Report.pdf

Joe Curley

About Joe Curley

Joseph (Joe) Curley serves Intel Corporation as Senior Director, Code Modernization Organization, Enterprise, and Government Group within the Data Center Group (DCG). His primary responsibilities include supporting global ecosystem partners as they develop their own powerful and energy-efficient HPC solutions using Intel hardware and software products. Mr. Curley joined Intel Corporation in 2007 and has served in multiple other planning and business leadership roles. Prior to joining Intel, Joe worked at Dell, Inc., where he led the global workstation product line and consumer and small business desktops and held a series of engineering roles. He began his career at computer graphics pioneer Tseng Labs.