Database virtualization – latency in the cloud

Nowadays, cloud computing is the hottest topic in IT. There is a massive amount of discussion about how to build it, how to consume it, and how it will change our lives. The potential is very real. However, there are distinct challenges and physical limitations to overcome before it becomes a mature, widely adopted technology.

Economies of scale are the basis of the cloud computing business model. How can one use computational capacity efficiently enough to dramatically reduce cost? Accommodating tenants is only one aspect of a multi-tenant environment, where a multi-purpose platform has to absorb the peaks and valleys of different workloads. Eliminating latency and bottlenecks is probably the hardest part of building cloud infrastructure. A well-balanced environment not only pays off in customer experience, but also increases profitability by making better use of the available computational resources.

A big concern in a highly transactional environment is the database. Its I/O requirements cover not only throughput (IOPS) but also latency. Even in a non-virtualized environment, there are tough decisions to be made. A single storage array without high availability, or synchronous replication between two arrays, either in the same site or geographically dispersed? Fibre Channel or 10GbE to connect to the storage? Then there are page size, link aggregation, and so on. Adding virtualization to this equation makes these decisions even harder.

The fact is that a virtual environment cannot provide the same level of performance as bare metal. But how large is the impact on a heavily loaded OLTP system that requires high IOPS?

In several tests conducted with multiple VMMs available on the market, some were penalized by virtual CPU count, whereas others were limited by their virtual disk technology. Yet all share the same constraint: high waits on read latches and on I/O throughput to the redo log. The following tables present these differences, comparing the virtualized and native databases on the same hardware and configuration:


In this case, the average wait for log file sync jumped from 1 ms on the native platform to 6 ms with virtualization!

Longer transactions mean that you may see more lock contention in the database, and also that your application server and web server will hold threads for longer periods. In this scenario, if you use the default values of most application servers and web servers available on the market, you will probably end up showing your users the “Server is too busy” message (a.k.a. error 500).
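To make the effect on thread holding concrete, here is a rough sketch based on Little's Law (average threads in use = arrival rate x time each request holds a thread). Only the 1 ms and 6 ms log file sync waits come from the measurement above. The request rate, the number of commits per request, and the time spent on other work are hypothetical values chosen for illustration, and the function names are mine.

```python
def request_hold_time_ms(other_work_ms, commits, log_sync_ms):
    """Time one worker thread is held by a single request, in milliseconds."""
    return other_work_ms + commits * log_sync_ms


def busy_threads(arrival_rate_per_s, hold_time_ms):
    """Little's Law: average threads in use = arrival rate x hold time."""
    return arrival_rate_per_s * hold_time_ms / 1000.0


# Hypothetical workload (assumptions, not measurements):
# 500 requests/s, 20 ms of non-commit work, 4 commits per request.
RATE, OTHER_MS, COMMITS = 500, 20.0, 4

# 1 ms and 6 ms are the average log file sync waits quoted in the text.
native = busy_threads(RATE, request_hold_time_ms(OTHER_MS, COMMITS, 1.0))
virtual = busy_threads(RATE, request_hold_time_ms(OTHER_MS, COMMITS, 6.0))

print(f"native:      {native:.1f} threads busy on average")
print(f"virtualized: {virtual:.1f} threads busy on average")
```

Under these assumptions the same traffic keeps almost twice as many threads busy once the commit wait grows, which is how a thread pool sized for the native case can run out of free threads.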


For instance, your web server will show lower CPU consumption. The default configuration in most web servers allocates 25 threads per processor (you can change this value). When a user requests a transaction that requires database access, the worker thread handling that request sits in a wait state for the whole round trip to the database, and it cannot take another user’s request until the current transaction completes. In the end, with a large number of users and low CPU consumption, users are denied service even though the server looks mostly idle.
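The combination of low CPU usage and refused requests can be sketched with simple arithmetic. A saturated pool completes at most threads / hold time requests per second, and CPU usage is that throughput multiplied by the CPU time each request really needs. The 25 threads per processor is the default mentioned above. The hold times (24 ms and 44 ms) and the 0.8 ms of CPU per request are hypothetical figures, and the model ignores queueing and scheduling overhead.

```python
def max_throughput_per_s(threads, hold_time_ms):
    """Requests/s a fully busy pool can complete.

    Each thread finishes 1000 / hold_time_ms requests per second.
    """
    return threads * 1000.0 / hold_time_ms


def cpu_utilization(throughput_per_s, cpu_ms_per_request, processors=1):
    """Fraction of CPU time consumed at the given throughput.

    Capped at 1.0, because beyond that point the CPU is the bottleneck
    and this simple model no longer applies.
    """
    used = throughput_per_s * cpu_ms_per_request / 1000.0 / processors
    return min(1.0, used)


THREADS = 25   # per-processor default mentioned in the text
CPU_MS = 0.8   # hypothetical CPU time per request; the rest is waiting

# Hypothetical per-request hold times, native vs. virtualized database.
for label, hold_ms in (("native", 24.0), ("virtualized", 44.0)):
    tput = max_throughput_per_s(THREADS, hold_ms)
    util = cpu_utilization(tput, CPU_MS)
    print(f"{label}: up to {tput:.0f} req/s, CPU at most {util:.0%}")
```

With these numbers the pool saturates at roughly half the throughput in the virtualized case while the processor is less than half used. Raising the thread limit is one of the tuning options, at the cost of more memory and context switching.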

Usually, for a heavily loaded OLTP environment, database virtualization is not a good idea from a computational resource utilization standpoint. However, if you decide to use it for any reason, spend some time tuning the application end to end in order to minimize these “latencies”.

Best Regards!