Getting Started with Database Workloads on Azure

More and more companies are moving to the cloud, a trend that has accelerated significantly during the pandemic. According to Gartner, 85% of all companies will have a “cloud-first principle” by 2025 [1]. This means most companies are facing, or soon will face, the prospect of moving workloads to the cloud, which can be a daunting undertaking. Many of these workloads involve transactional and analytical database applications. If your company is transitioning these applications to the cloud, you’re likely wondering which cloud provider to choose, which virtual machine (VM) type and size will best fit your needs, and what to keep in mind as you consider your options. I’d like to address these questions and offer some guidance on running database workloads on Microsoft Azure VMs with Intel processors.

What the cloud can do for you

Hosting applications in the cloud offers many benefits. With a myriad of options available at your fingertips, you can avoid the hassle and expense of procuring onsite solutions. You can test a variety of configurations, improve QA on products that customers will run in various environments, and tier your workloads according to actual performance needs rather than the highest common denominator. Cloud providers are continually investing in new technologies, so you can take advantage of them easily. For example, Azure has already announced VMs backed by new Intel 3rd Gen Scalable processors [2]. You can even access technology such as Intel® Optane™ Persistent Memory in some VMs, which offers unique advantages for some databases, such as the Hybrid Buffer Pool feature in Microsoft SQL Server on Linux [3][4].

Additionally, because Azure and other cloud providers let you easily scale your workload up and down in response to demand, you can avoid paying for resources you aren’t using and add resources whenever you need them.

Finally, when you use Azure VMs, Microsoft maintains the hardware and software for you and mitigates any hardware vulnerability issues that arise. This not only saves you time and money, but also ensures that the latest security measures are always in place.

Selecting your provider and VM

To help companies decide which cloud service provider is right for them, Intel has tested the performance of real-world workloads across several of the most popular providers. In this blog, I focus on our experience testing database workloads on Azure VMs. Among the many good reasons to choose Azure is the Microsoft connection: if your company uses SQL Server databases and Windows Server, you can save money with Azure by using your existing licenses [5].

In our testing on Azure, one of the ways we achieved strong database performance was by sizing the VMs correctly. I’ll walk you through the process we used so you can optimize your performance, too. Azure offers several different VM series, but for this discussion, I’ll focus on two: the memory-optimized E-series and the general-purpose D-series. Both series are available in several sizes, determined by the number of vCPUs on each VM. However, memory capacity could be a more important factor as you determine which would be the best fit for a given database: as the amount of data you can fit in memory increases, database performance improves. In our testing, we sized the VM so that the database fit entirely in memory. With memory capacities up to 504 GB on the E_v4 series, the memory-optimized series is the first place to look for large databases. Because every database is different, though, we tested both the general-purpose and memory-optimized series to help customers understand the potential performance of both.
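The sizing rule described above (pick a VM whose memory holds the whole database) can be sketched in a few lines of Python. This is only an illustration, not the procedure we used in testing: the size labels are shorthand rather than exact Azure SKU names, the memory figures other than the 504 GB maximum are my assumption and should be checked against the current Azure documentation, and the 20% headroom for the OS and database engine is an arbitrary placeholder.

```python
# Illustrative memory capacities (GiB) for E-series v4 sizes.
# Only the 504 figure comes from the text above; verify the rest
# against the Azure VM size documentation before relying on them.
E_V4_MEMORY_GIB = {
    "E2_v4": 16, "E4_v4": 32, "E8_v4": 64, "E16_v4": 128,
    "E20_v4": 160, "E32_v4": 256, "E48_v4": 384, "E64_v4": 504,
}


def smallest_fitting_size(db_size_gib, headroom=0.2, sizes=E_V4_MEMORY_GIB):
    """Return the smallest size whose memory holds the database plus
    headroom, or None if no size in the table is large enough."""
    needed = db_size_gib * (1 + headroom)
    # Walk the sizes from least to most memory and stop at the first fit.
    for name, memory in sorted(sizes.items(), key=lambda item: item[1]):
        if memory >= needed:
            return name
    return None


print(smallest_fitting_size(100))   # E16_v4 (needs 120 GiB, 128 available)
print(smallest_fitting_size(1024))  # None: too large to hold in memory
```

A result of None is the signal that the database will not fit in memory on this series, which is the case where storage performance becomes the deciding factor.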

Within each series, Azure offers VM versions with the latest hardware and other features, as well as older versions with older hardware. The latest versions deliver the best performance and, in many cases, the best value. Currently, the D_v4 and E_v4 are the most recent versions available in the series we’re looking at, though a new v5 offering featuring the latest Intel 3rd Gen Scalable processors will be available soon.

The performance gains you can enjoy with the newest versions can be considerable. In Intel testing with transactional databases, the 2nd Gen Scalable processor-backed D16d_v4 VM achieved 1.53 times as many MySQL new orders per minute (NOPM) as an older D16d_v3 VM. For analytical database performance, the E16s_v4 VM completed data warehouse queries up to 1.54 times as fast as the same workload on an E16s_v3 VM [6].

The VM size you choose will depend almost entirely on the size and performance needs of your application. Compute is an important aspect of performance, so you’ll need enough vCPUs to handle the workload. With offerings up to 64 vCPUs on the D-series and 80 vCPUs on the E-series, Azure should have a size that suits your needs. To show the range of performance you can expect on various sizes, we tested multiple VM sizes from each series in our analytical database studies [7].

Adding block storage to your database VMs

Now, let’s look at storage. If your database fits in RAM, storage is a less critical, but still important, part of the cloud equation. It can also be one of the more challenging aspects of cloud offerings due to the many options and performance caps in play. In this post, I’ll focus on Azure offerings for managed disk—or block storage—because this is the type of storage that database workloads need.

There are several considerations when choosing Azure-managed disks for your database workload. To help you understand them, let’s use the hypothetical scenario of a 1 TB transactional database that I want to host on an Azure E64ds_v4 VM. The database must fit on the disks, and because I won’t be able to cache the entire database in memory, storage performance will be critical. Regardless of the number of disks and their rated performance, the VM supports a maximum of 80,000 “uncached” IOPS and 1,500 uncached MBps [8]. Azure does allow higher limits for disks that use caching, but most disks I attach to a VM will be subject to the uncached limits. Azure also allows me to attach a maximum of 32 managed disks to a single VM.

Now that I know the VM’s limits, I can look at the managed disks Azure offers. There are four tiers based on overall performance: Standard HDD, Standard SSD, Premium SSD, and Ultra disk. For my transactional database, I’m interested in the Premium SSD and Ultra disk options, which the Azure site states are appropriate for “production and performance-sensitive workloads” and “IO-intensive workloads” respectively [9]. The maximum disk size is 32,767 GiB for Premium and 65,536 GiB for Ultra, which means a single disk from either tier could hold my 1 TB database. However, the highest-performing disk in the Premium SSD tier, the P80 model, supports a maximum of 20,000 IOPS and 900 MBps. Fully utilizing the 80,000-IOPS limit of my VM would require at least four Premium P80 disks. According to the Azure calculator, one of these is $3,604 per month, so four would cost me $14,416 per month [10]. Note that each P80 disk holds 32,767 GiB, so I’d be wasting a lot of capacity if only my 1 TB database sat on these four disks. For this database, a better alternative would be a single 2 TB Ultra disk set to 80,000 IOPS and 1,500 MBps, which would cost only $4,741 per month.
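The arithmetic in this comparison is easy to script if you want to rerun it with your own numbers. The sketch below uses only the limits and prices quoted in this post; prices change over time, so treat them as placeholders. The function names are mine and purely illustrative.

```python
import math

# Figures quoted in this post; prices and limits change, so treat
# them as placeholders for your own numbers.
VM_MAX_IOPS = 80_000       # uncached IOPS cap of the E64ds_v4 VM
VM_MAX_MBPS = 1_500        # uncached throughput cap of the VM
P80_IOPS = 20_000          # per-disk IOPS of a Premium SSD P80
P80_MBPS = 900             # per-disk throughput of a P80
P80_MONTHLY_USD = 3_604    # monthly price of one P80
ULTRA_MONTHLY_USD = 4_741  # monthly price of the 2 TB Ultra disk option


def disks_to_reach_cap(vm_cap, per_disk):
    """Number of identical disks needed to reach the VM-level cap."""
    return math.ceil(vm_cap / per_disk)


def effective_rate(disk_count, per_disk, vm_cap):
    """Aggregate disk performance, clipped at the VM-level cap."""
    return min(disk_count * per_disk, vm_cap)


p80_count = disks_to_reach_cap(VM_MAX_IOPS, P80_IOPS)
p80_cost = p80_count * P80_MONTHLY_USD

print(p80_count)                                          # 4
print(p80_cost)                                           # 14416
print(effective_rate(p80_count, P80_MBPS, VM_MAX_MBPS))   # 1500
print(p80_cost - ULTRA_MONTHLY_USD)                       # 9675
```

Note that four P80 disks could deliver 3,600 MBps in aggregate, but the VM clips that to 1,500 MBps, so the extra throughput is paid for and never used.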

Many factors will influence the type and number of disks you decide to use: your performance and SLA requirements for the workload, snapshot usage, budget, and more. However, this quick exercise shows some of the considerations that will help you determine the best disk for a workload. Check both the VM you’ve chosen and the disk tier options to determine the combination that best fits your budget and performance needs.

Experimenting to determine which VMs are right for you

As I mentioned earlier, one great advantage of the cloud is being able to add resources to your workloads quickly and easily. This is especially beneficial for testing patches and upgrades and for other development tasks. You can also experiment with new VM versions released with the latest generation of hardware to see how they boost your database workload performance, and you can quickly adjust the resources for a given database.

Azure, like most cloud service providers, offers various features, tools, and ways to script your environment to help you maintain image conformity across VMs for scaling, development, and other needs. In our testing here at Intel, we utilized Azure disk snapshots and image creation. This gave us a single image—containing the software versions, tuning, configurations, and other details—that we could apply to every VM size we tested to ensure that our results came from identical software environments regardless of the underlying VM hardware. This allowed us to ensure that performance benefits were not related to unintentional versioning or other software-level differences. Below, we provide a high-level overview of our approach to help guide you through a similar process.

First, create a new VM for your installation with the OS of your choice and minimal resources. You’re not using this VM for testing, so it can be small and cheap. Install your database software, applying best practices. For example, for a Microsoft SQL Server environment, we would lock pages in memory and set the MAXDOP at which we want to run our tests.

Once your baseline VM is configured, create a snapshot of its OS disk. You can then create a new Azure Image Gallery or use an existing one and create a new image from your OS disk snapshot. The Azure GUI will walk you through the necessary steps. Finally, use the image stored in the Image Gallery to create new VMs of any series, version, and size, confident that each one starts from an identical image.
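If you prefer scripting to the GUI, the same steps can be expressed with the Azure CLI. The sketch below is not the tooling we used; it only builds and prints the commands so you can review them before running anything. Every resource name in it is hypothetical, the publisher, offer, and SKU values are placeholders, and it assumes a Linux image captured without generalizing (a "Specialized" image). Please confirm the flags against the current Azure CLI reference before use.

```python
import shlex


def image_workflow_commands(rg, os_disk, gallery, image_def, version,
                            vm_name, vm_size, image_version_id):
    """Build (but do not run) az commands for snapshot -> gallery image -> VM.

    All arguments are caller-supplied placeholders; image_version_id is the
    resource ID of the image version once it has been created.
    """
    snapshot = f"{os_disk}-snap"
    return [
        # 1. Snapshot the OS disk of the baseline VM.
        ["az", "snapshot", "create", "--resource-group", rg,
         "--name", snapshot, "--source", os_disk],
        # 2. Create a gallery (skip this step if you already have one).
        ["az", "sig", "create", "--resource-group", rg,
         "--gallery-name", gallery],
        # 3. Define the image; publisher/offer/sku are arbitrary labels.
        ["az", "sig", "image-definition", "create", "--resource-group", rg,
         "--gallery-name", gallery, "--gallery-image-definition", image_def,
         "--publisher", "example", "--offer", "dbbaseline", "--sku", "v1",
         "--os-type", "Linux", "--os-state", "Specialized"],
        # 4. Create an image version from the OS disk snapshot.
        ["az", "sig", "image-version", "create", "--resource-group", rg,
         "--gallery-name", gallery, "--gallery-image-definition", image_def,
         "--gallery-image-version", version, "--os-snapshot", snapshot],
        # 5. Create a test VM of any size from that image version.
        ["az", "vm", "create", "--resource-group", rg, "--name", vm_name,
         "--image", image_version_id, "--size", vm_size, "--specialized"],
    ]


commands = image_workflow_commands(
    rg="db-test-rg", os_disk="baseline-osdisk", gallery="dbgallery",
    image_def="sql-baseline", version="1.0.0", vm_name="test-e16",
    vm_size="Standard_E16s_v4",
    image_version_id="<image-version-resource-id>",
)
for command in commands:
    print(shlex.join(command))
```

To test another VM size, only step 5 needs to be repeated with a different name and size, which is what keeps the software environment identical across runs.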

In conclusion

I hope I’ve provided some helpful background that you can draw on as you shift your database workloads to the cloud. Remember that whether your applications run in the cloud or on-premises, performance depends on the underlying hardware. Choosing smaller or older VMs can seem to be an economical choice, but our tests show VMs backed by newer hardware such as Intel 2nd Gen Scalable processors can consistently deliver more performance for your dollar. (For example, in one test of data warehouse performance on Microsoft SQL Server databases, Dds_v4 VMs finished the query streams up to 1.49 times as fast as Ds_v3 VMs while only costing 1.17 times more [11].)
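To turn that example into a single performance-per-dollar figure, divide the speedup by the cost ratio. A minimal sketch using only the two ratios quoted above (the function name is mine):

```python
def relative_perf_per_dollar(speedup, cost_ratio):
    """Performance per dollar of the newer VM relative to the older one."""
    return speedup / cost_ratio


# Dds_v4 vs. Ds_v3: up to 1.49x as fast at 1.17x the cost.
value = relative_perf_per_dollar(1.49, 1.17)
print(round(value, 2))  # 1.27
```

In other words, in that test each dollar spent on the newer VM bought up to roughly 27% more data warehouse performance.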

Invest in Azure and Intel for your database workloads and save time, money, and upkeep effort while maintaining performance and SLA requirements.

[1] https://www.businessinsider.com/cloud-technology-trend-software-enterprise-2021-2
[2] https://azure.microsoft.com/en-us/blog/upgrade-your-infrastructure-with-the-latest-dv5ev5-azure-vms-in-preview/
[3] https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/hybrid-buffer-pool?view=sql-server-ver15
[4] https://www.intel.com/content/www/us/en/now/microsoft-azure-optane-innovation-editorial.html
[5] https://azure.microsoft.com/en-us/overview/azure-vs-aws/
[6] https://www.principledtechnologies.com/Intel/Xeon-Platinum-8272CL-Microsoft-Azure-SQL-Server-0920-v2.pdf
[7] https://www.principledtechnologies.com/Intel/Cascade-Lake-SQL-Server-Azure-1220.pdf
[8] https://docs.microsoft.com/en-us/azure/virtual-machines/edv4-edsv4-series
[9] https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types
[10] https://azure.microsoft.com/en-us/pricing/calculator/?service=storage
[11] https://www.principledtechnologies.com/Intel/Cascade-Lake-SQL-Server-Azure-1220.pdf

Notices & Disclaimers
Performance varies by use, configuration and other factors. Learn more at https://intel.com/benchmarks.
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
All product plans and roadmaps are subject to change without notice.
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.