While the cloud has been viable for some time for a number of HPC workloads and use cases, it hasn’t been a practical option for most users. However, that is rapidly changing.
Cloud Service Providers and their partners are now offering HPC as a Service (HPCaaS) via fully-orchestrated services that provision a familiar, compatible, and fully elastic HPC cluster in the cloud. Amazon’s ParallelCluster eases cluster creation and management, while AWS Batch is a fully-managed and MPI-capable HPC service. Microsoft’s CycleCloud provides automated configuration, along with additional services to link on-premise systems to the Azure cloud. In addition to a base HPCaaS platform offering, Rescale offers popular HPC applications on demand. These are just a few examples; several other cloud providers have launched HPCaaS offerings, including familiar names like Oracle and IBM.
We see three main reasons people are looking to the cloud for HPC. First, existing HPC users have become constrained by the fixed capacity of on-premise systems. The cloud allows these users to augment those valuable systems in new ways. Second, people who have not traditionally had access to HPC now have a natural “on ramp” via cloud, with pay-as-you-go services that address the barrier of up-front capital investments. Finally, HPC users are finding that cloud provides a practical way to access the latest technologies as needed, without committing to long-term ownership.
Unlocking Productivity and Innovation
On-premise HPC systems have long been an essential resource for commercial, academic, and government institutions. They are also a great value, delivering an estimated $463 revenue return on each dollar invested.1 However valuable, they have one major drawback: a given HPC system’s capacity is fixed, while an organization’s demand is highly variable and generally exceeds system capacity. Hyperion found that average demand exceeds capacity by 24% or more in 45% of surveyed HPC centers.2 With such a compelling ROI, unmet demand represents a significant opportunity cost, through reduced productivity, less optimized product designs, and delayed discoveries or time to market. Existing HPC users are beginning to use the cloud to supplement on-premise work and scale on demand. By accessing HPC in the cloud, organizations can offer their users faster turnaround instead of waiting in a queue, unlocking user productivity. Further, the results achieved through cloud can measurably demonstrate the value of increased access and inform future HPC capacity planning.
The cloud’s ability to offer vast resources on demand is also enabling innovative new approaches to HPC. In one recent example, Western Digital used Amazon Web Services to dramatically accelerate the exploration of a complex design space, speeding product design decisions and time to market. For global organizations, HPC in the cloud eases collaboration for teams working on shared data sets and projects. It also provides ways to develop and test new methods and ideas without impacting production usage.
Expanding HPC Access
Existing HPC users aren’t the only ones benefiting from cloud. Entirely new businesses that depend on HPC can launch quickly without having to design their own data center and hire a full-time administrator, helping them get to market faster. Boom Supersonic is one such example. The company is working to bring back commercial supersonic travel following the demise of the Concorde and requires HPC simulation and modeling to design its future 55-passenger airplane.
Finally, there is a large population of technical workstation users who are becoming constrained by locally-available compute power. Increasingly detailed designs and product requirements are driving substantial increases in simulation time. Many organizations using Microsoft Windows-based workstations lack the capability, desire, or capital required to install a typical Linux HPC cluster. Solutions like Altair PBS Works provide seamless access from workstation to cloud, improving workstation user productivity by shortening simulation turnaround time.
The cloud often offers HPC users the earliest access to new technologies. For example, Google Cloud Platform was first to offer 2nd Generation Intel® Xeon Scalable processors and Intel® Optane™ DC persistent memory. Amazon Web Services offers HPC-oriented C5 and C5n instances based on Intel Xeon Scalable processors and recently introduced Elastic Fabric Adapter for improved MPI communications. Microsoft Azure recently launched HC instances targeted at HPC and connected with InfiniBand, also for improved MPI performance. The cloud provides a great option for accessing technologies not available on premise.
If you’ll be attending ISC 2019 in Frankfurt, I encourage you to attend some of our offered tutorials from the leading cloud service providers to see how their HPC services might benefit your organization. I also invite you to stop by the Intel booth Monday and Tuesday to hear panel discussions I will be hosting with major cloud providers and Intel partners who are helping organizations tap into the benefits of HPC in the Cloud. All our talks, panels, and tutorials can be found in this mobile agenda.
1 Hyperion Research 2018 HPC ROI Research Update: Economic Models for Financial ROI And Innovation From HPC Investments. https://www.hpcuserforum.com/ROI/downloads/HyperionResearchPowerPoint.zip
2 Hyperion Research 2018 HPC Multi-Client Study: The Use of Public/External Clouds for HPC Workloads, Trends, and Drivers