Getting Started with Databases on Google Cloud

As cloud computing becomes more popular and cloud offerings continue to expand, moving your workloads to the cloud can seem like a daunting task. If your company is in this position, I’d like to offer some help by walking you through a few considerations, with a focus on database workloads. In this post, I’ll cover some of Google Cloud’s particular strengths, the VM sizing and disk configurations to consider for your database workloads, and some pointers for using Google Cloud. I hope that this introduction to databases on Google Cloud will help ease your path to the cloud.

What Google Cloud can do for you

When you’re deciding which cloud service provider (CSP) to choose, several factors come into play: geographic location, available services, cost, and performance. The factors you prioritize will determine which CSP is the best fit for your workloads. While it’s relatively easy to figure out what each CSP offers in some of these areas, it’s harder to pin down offerings in other areas. I’d like to summarize some of the particular strengths of Google Cloud and highlight some of the things we at Intel have learned. I’d also like to fill you in on the database performance tests we’ve run to help you see how your database workload could fit in Google Cloud.

Google Cloud offers some distinct strengths that may suit your needs. For example, Google’s private fiber-optic network lets it offer high-speed connections between its data centers.[1] Google also has a strong connection with Kubernetes, which gives it a robust container presence in the cloud.[2] Additionally, Google Cloud is especially friendly to cloud-native application development.[3] Finally, Google claims to already be carbon neutral, with a goal of being carbon-free by 2030, so if supporting an environmentally conscious CSP is important to you, Google Cloud is a good choice.[4]

To help customers learn about performance, Google Cloud offers several products that benchmark and compare cloud performance.[5] Here at Intel, we conducted several tests across transactional and analytical databases to highlight the performance gains customers could realize by choosing newer offerings with 2nd Gen Intel Xeon Scalable processors over older offerings. Read on to see what we learned in our testing and which VM instances may be best for your workloads.

How to choose a VM instance

Every database is unique in terms of size, performance requirements, and more, so I can’t tell you which Google Cloud offerings will fit your needs exactly. However, keeping general database best practices in mind, I can discuss the options and walk you through some of the decisions you’ll need to make.

First, you must decide which VM instance to use for your database. Google Cloud offers four main categories of predefined VM instances (General-Purpose, Memory-Optimized, Compute-Optimized, and Accelerator-Optimized), each with one to four VM instance series. It also offers a custom machine type option, which lets you fully customize your VM instance to fit your needs, with up to 96 vCPUs and up to 8 GB of RAM per vCPU.[6] The Google Cloud Bare Metal Solution offers options for Google Cloud VMware Engine, Oracle Bare Metal, and Anthos Bare Metal. Google Cloud also offers the Google Cloud VMware Engine (GCVE) with 72 vCPUs and ample memory and SSD storage to enable lower TCO per VM.
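As a rough illustration of the custom machine type limits quoted above, the sketch below estimates the smallest vCPU count whose memory ceiling would hold a given in-memory dataset. The function name is hypothetical, the two constants are simply the figures from the text, and real custom machine types may carry further rules (for example, on allowed vCPU counts) that this sketch ignores.

```python
import math

# Limits quoted in the text for custom machine types (assumed to apply as-is).
MAX_VCPUS = 96
MAX_GB_PER_VCPU = 8


def min_vcpus_for_memory(memory_gb):
    """Return the smallest vCPU count whose memory ceiling covers memory_gb.

    Hypothetical helper: it only applies the "up to 8 GB per vCPU" and
    "up to 96 vCPUs" limits from the text, nothing else.
    """
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    vcpus = math.ceil(memory_gb / MAX_GB_PER_VCPU)
    if vcpus > MAX_VCPUS:
        raise ValueError(
            f"{memory_gb} GB exceeds the {MAX_VCPUS * MAX_GB_PER_VCPU} GB "
            "ceiling of a custom machine type"
        )
    return vcpus


# Example: a 256 GB working set needs at least 32 vCPUs under these limits.
print(min_vcpus_for_memory(256))
```

If the result is far more vCPUs than your workload can use, that is a hint to look at the predefined high-memory or Memory-Optimized series instead.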

For a database workload, you’ll have to balance compute, memory, and storage performance to achieve the best performance while avoiding a bottleneck. If you can fit your database into memory, as we did in many of our tests, you might be able to rely less on storage performance, but you’ll need to make sure your processor can handle the workload. In our tests on both high-memory M-series and general-purpose N-series VM instances, we saw better database performance with the VM instances backed by 2nd Gen Intel Xeon Scalable processors than with those backed by older CPUs.[7]

While Google Cloud does offer its M-series VM instances as Memory-Optimized options, they come in only very large sizes. For example, the smallest M2 VM instance has 208 vCPUs and 5,888 GB of memory. For smaller workloads, Google Cloud also offers high-memory versions of its general-purpose N2 series. Here at Intel, we tested a MySQL transactional database workload on these high-memory VMs to show that choosing newer CPU generations is a great way to improve performance for database workloads that consume a lot of memory. In one test, our N2 High-memory VM instance backed by 2nd Gen Intel Xeon Scalable processors supported 1.21x the customer transactions of the same workload on an N1 High-memory VM instance.[8]

Unless you have a very large database, the general-purpose N2 VM instances will probably meet your needs, whether you choose the standard VMs or the high-memory versions. Google even offers a high-CPU option with as many as 80 vCPUs. I’d suggest checking out the N2 options as a starting point when looking for your VM instance.[9]

Database workload block storage considerations

Block storage performance is especially important for database workloads and a common bottleneck culprit. Figuring out which cloud disk type and size will meet the performance requirements for your database workload can be difficult because limits and caps are not always obvious.

With Google Cloud, you have to check both the disk limits of your VM instance and the limits of the disk you’re attaching to it. One way that Google Cloud differs from some other popular CSPs is that it ties disk performance to the number of vCPUs on your VM instance. If you know you will have high disk performance needs, keep that in mind when choosing your instance. While this coupling can seem restrictive, workloads that need high disk performance typically need strong compute as well. This means that you will likely be able to get the storage performance you need without having to oversize your vCPUs. Note that Google Cloud will throttle bursty I/O to help spread the load over time, so make sure your performance limits are sufficient to prevent excessive throttling.[10]

For most database workloads, SSD persistent disks (pd-ssd) are probably the best place to start looking. You can choose between Zonal disks, which have higher performance limits, and Regional disks, which offer automatic replication across two zones. With Zonal disks, you can get up to 100,000 IOPS and 1,200 MB/s. Google Cloud persistent disk performance is tied to capacity, however, so be sure to size your disk large enough to meet your performance needs. For example, a 64 GB pd-ssd has an IOPS limit of 1,920, while a 500 GB pd-ssd offers 15,000 IOPS (both figures work out to 30 IOPS per GB).
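The capacity-to-IOPS relationship above can be sketched as a small calculation. The 30 IOPS per GB rate is inferred from the two examples in the text (1,920 / 64 and 15,000 / 500), the 100,000 ceiling is the Zonal figure quoted above, and the function names and the optional vm_iops_cap parameter are hypothetical. Treat this as a back-of-the-envelope aid and check the current documentation for your machine type before sizing a production disk.

```python
import math

# Rate inferred from the text's examples; ceiling is the quoted Zonal limit.
PD_SSD_IOPS_PER_GB = 30
PD_SSD_MAX_IOPS = 100_000


def pd_ssd_iops_limit(size_gb, vm_iops_cap=None):
    """Estimate the IOPS limit of a zonal pd-ssd of the given size.

    vm_iops_cap is an optional, caller-supplied limit for the VM instance,
    since the lower of the disk limit and the VM limit applies.
    """
    limit = min(size_gb * PD_SSD_IOPS_PER_GB, PD_SSD_MAX_IOPS)
    if vm_iops_cap is not None:
        limit = min(limit, vm_iops_cap)
    return limit


def min_size_gb_for_iops(target_iops):
    """Return the smallest pd-ssd size (in GB) that reaches target_iops."""
    if target_iops > PD_SSD_MAX_IOPS:
        raise ValueError("target exceeds the zonal pd-ssd ceiling")
    return math.ceil(target_iops / PD_SSD_IOPS_PER_GB)


print(pd_ssd_iops_limit(64))        # matches the 64 GB example in the text
print(pd_ssd_iops_limit(500))       # matches the 500 GB example in the text
print(min_size_gb_for_iops(15000))  # disk size needed for 15,000 IOPS
```

Working backward from a target IOPS figure, as min_size_gb_for_iops does, is often the more useful direction: you may end up provisioning more capacity than the data needs purely to reach the performance limit.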

For those who need more performance, Google offers Extreme persistent disks, as well as some limited local SSD and NVMe SSD options. These options could give you up to 2.4 million read IOPS, though it’s unlikely your database workload will need that level of performance.[11]

Tips and tricks with Google Cloud

Every CSP has its quirks that users learn about with experience. I’d like to share a few such things we learned about Google Cloud in the hopes of making your life easier.

First, note that while the Google Cloud site states which generation(s) of CPU are available on each VM type, the VM instance reveals only the CPU speed. Check the site’s listing for your VM type to make sure you’re getting the CPU generation you need.

Second, Google Cloud’s VM instances define and set metadata values by default, but you can change them to suit your needs. You can set the metadata on a VM-by-VM basis, or you can configure project-wide metadata that propagates to any VM instances within your project. Some of the things you can set via metadata include access permissions, SSH keys, the hostname, network interfaces, and much more. One trick we learned relating to this metadata was using the SSH value to maintain consistent public SSH keys on our Linux VM instances. Before we assigned static keys to the metadata value to include them in the authorized_keys file, the SSH keys changed or disappeared upon a VM instance reload.
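To make the SSH key trick concrete, here is a minimal sketch that assembles the text for the ssh-keys metadata entry, which takes one USERNAME:PUBLIC_KEY pair per line. The function name, usernames, and key strings are placeholders invented for illustration; this is one possible way to prepare the value, not necessarily how we did it in our tests.

```python
def build_ssh_keys_value(entries):
    """Build the text for the ssh-keys metadata entry.

    entries is an iterable of (username, public_key) pairs. Each pair
    becomes one "username:public_key" line in the result.
    """
    lines = []
    for username, public_key in entries:
        key = public_key.strip()
        # A colon in the username or a newline in the key would corrupt
        # the one-pair-per-line layout, so reject them up front.
        if not username or ":" in username:
            raise ValueError(f"invalid username: {username!r}")
        if not key or "\n" in key:
            raise ValueError(f"invalid public key for {username!r}")
        lines.append(f"{username}:{key}")
    return "\n".join(lines)


# Placeholder users and keys, for illustration only.
value = build_ssh_keys_value([
    ("alice", "ssh-ed25519 AAAAEXAMPLEKEY1 alice@example"),
    ("bob", "ssh-ed25519 AAAAEXAMPLEKEY2 bob@example"),
])
print(value)
```

You could save the resulting text to a file and apply it project-wide with something like `gcloud compute project-info add-metadata --metadata-from-file=ssh-keys=FILE`, or per VM with `gcloud compute instances add-metadata`. Check the current gcloud reference before relying on either command.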

Once you have created and configured your VM instance, you can use Google Cloud’s persistent disk snapshots to create a consistent image of your database workload. You can use the snapshot to spin up your workload on any VM instance size or type while keeping the installation settings and configurations identical. First, create a snapshot of the boot disk of your initial workload VM instance. Next, go to the Images page and create a new image, using the snapshot you created as the Source. Finally, go to the Disks page and create a new disk, choosing the image you created in the previous step as the Source image. You now have a custom boot disk that you can choose when creating a new VM instance, saving you from having to reinstall and reconfigure your environment.
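The steps above use the Cloud Console. If you would rather script them, the sketch below builds the three equivalent gcloud commands (snapshot, image, disk) as argument lists and prints them without running anything. The command-line route is an alternative I am swapping in, not the procedure described above. The function name and all resource names are hypothetical, so confirm the flags against the current gcloud reference before use.

```python
import shlex


def boot_disk_clone_commands(source_disk, zone, snapshot, image, new_disk):
    """Return gcloud argument lists for snapshot -> image -> disk.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        # 1. Snapshot the boot disk of the configured VM instance.
        ["gcloud", "compute", "disks", "snapshot", source_disk,
         f"--snapshot-names={snapshot}", f"--zone={zone}"],
        # 2. Create an image that uses the snapshot as its source.
        ["gcloud", "compute", "images", "create", image,
         f"--source-snapshot={snapshot}"],
        # 3. Create a new disk from that image.
        ["gcloud", "compute", "disks", "create", new_disk,
         f"--image={image}", f"--zone={zone}"],
    ]


# Hypothetical resource names, for illustration only.
commands = boot_disk_clone_commands(
    source_disk="db-vm-boot",
    zone="us-central1-a",
    snapshot="db-vm-boot-snap",
    image="db-workload-image",
    new_disk="db-vm2-boot",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the commands as lists makes them easy to review first and then pass to subprocess.run later, if you decide to automate the whole flow.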

Conclusion

In your journey to the cloud, you face many decisions. I’ve explained some of the unique selling points of Google Cloud, given you some strategies for selecting the appropriate VM instance and storage for your database requirements, and offered some tips that might help you navigate the world of Google Cloud more easily. I hope you will find this helpful as you shift your database workloads to the cloud.

[1] https://cloud.google.com/blog/products/networking/google-cloud-networking-in-depth-cloud-cdn
[2] https://cloud.netapp.com/blog/azure-vs-google-cloud-how-they-compare
[3] https://cloud.google.com/solutions/cloud-native-app-development
[4] https://cloud.google.com/sustainability
[5] https://cloud.google.com/free/docs/aws-azure-gcp-service-comparison
[6] https://cloud.google.com/custom-machine-types
[7] https://www.intel.com/content/www/us/en/partner/workload/google/cloud-n2-instances-analyze-data-faster-benchmark.html
[8] https://www.intel.com/content/www/us/en/partner/workload/google/postgresql-dbs-perform-google-cloud-benchmark.html
[9] https://cloud.google.com/compute/docs/machine-types
[10] https://cloud.google.com/compute/docs/disks/performance#machine-type-disk-limits
[11] https://cloud.google.com/compute/docs/disks/performance#type_comparison

Notices & Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
All product plans and roadmaps are subject to change without notice.
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.