Performance optimization of VMware VMs

Performance optimization under VMware is a complex subject. I’ve been involved in several instances where a VM is not performing as expected, and the following is what I have learnt.

Sources

Architecting Microsoft SQL Server on VMware vSphere
https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/sql-server-on-vmware-best-practices-guide.pdf

VMware Hybrid Cloud Best Practices Guide for Oracle Workloads
https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/vmware-oracle-databases-on-vmware-best-practices-guide.pdf

Oracle Monster Virtual Machine Performance (VMware 6.5)
https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/vsphere65-oracle-monster-vm-perf.pdf

CPUs and Hyperthreading

In some cases, you may be running a very powerful VM with a large number of vCPUs. In most of these cases, it is not recommended to allocate more vCPUs to the VM than the number of physical cores in the ESXi host.
For example, the host may have 2x 10-core CPUs. In total, that would be 20x physical CPU cores and 40x logical (with Hyperthreading).
In this case, the suggested maximum limit of vCPUs for a big VM would be 20x.

The reason is that the hypervisor is aware of Hyperthreading on the CPUs, and it keeps the VM’s vCPUs on different physical cores when possible. If you were to then increase the number of vCPUs on the VM to 24x, some of these vCPUs must now share a physical core through Hyperthreading. Hyperthreading is not passed through to the guest OS, so the guest does not know which of its vCPUs share a physical core, and this can result in increased contention.
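The arithmetic in the example above can be sketched as a pair of small shell helpers. The function names are made up for illustration, and the factor of two assumes Hyperthreading gives two logical CPUs per core.

```shell
# Hypothetical helper: suggested vCPU ceiling for a single large VM,
# taken as the number of physical cores (sockets x cores per socket).
max_vcpus() {
    sockets=$1
    cores_per_socket=$2
    echo $((sockets * cores_per_socket))
}

# Hypothetical helper: logical CPUs on the host with Hyperthreading
# enabled, assuming two threads per physical core.
logical_cpus() {
    echo $(( $(max_vcpus "$1" "$2") * 2 ))
}

max_vcpus 2 10      # 2x 10-core CPUs: physical cores
logical_cpus 2 10   # same host: logical CPUs with Hyperthreading
```

For the 2x 10-core host this prints 20 and then 40, matching the figures quoted above.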

NUMA

Be aware of NUMA, and of when a big VM could be crossing NUMA nodes. This is well documented in the best practice guides referenced above, but the general rule is that when a VM crosses NUMA nodes, the ‘CPU Hot add’ function should be disabled in the VM’s settings. While CPU Hot add is enabled, vNUMA is turned off for the VM; disabling it allows the vNUMA architecture to be exposed to the guest OS.
In this case, match the virtual hardware with the physical host – if the number of CPUs on the guest is split across two physical sockets on the host, give it 2x sockets in the VM hardware.
If Hot add is currently enabled, the VM must be completely shut down before it can be disabled.
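As a rough sketch of the "match the virtual hardware to the physical host" rule, the helper below suggests a socket layout for a given vCPU count. The function name is hypothetical. It assumes one NUMA node per physical socket, and it only accepts vCPU counts that divide evenly across the sockets. Check the result against the best practice guides before applying it.

```shell
# Hypothetical helper: suggest virtual sockets and cores per socket.
# $1 = vCPUs wanted, $2 = physical cores per host socket.
# Assumes one NUMA node per physical socket.
vm_topology() {
    vcpus=$1
    host_cores_per_socket=$2
    if [ "$vcpus" -le "$host_cores_per_socket" ]; then
        # The VM fits inside one socket, so one virtual socket is enough.
        sockets=1
    else
        # Ceiling division: how many physical sockets the VM spans.
        sockets=$(( (vcpus + host_cores_per_socket - 1) / host_cores_per_socket ))
    fi
    if [ $((vcpus % sockets)) -ne 0 ]; then
        echo "$vcpus vCPUs do not divide evenly across $sockets sockets" >&2
        return 1
    fi
    echo "sockets=$sockets cores_per_socket=$((vcpus / sockets))"
}

vm_topology 16 10   # 16 vCPUs on a host with 10-core sockets
```

On a host with 10-core sockets, 16 vCPUs span two physical sockets, so the sketch suggests 2 virtual sockets with 8 cores each.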

SCSI Controllers

For VMs with lots of disks, it can help to add multiple SCSI controllers, and split the disks between them. Each controller gets its own I/O queue in the VM, so by working out which are the busiest disks and splitting them across different controllers, latency is reduced at the VM level.

Datastores

If the VM has a lot of busy disks on the same datastore, this can also increase the latency that is experienced. Splitting these out to separate datastores can reduce the latency at the HBA.

With regard to the points above on SCSI controllers and datastores, I can give an example. A VM we were troubleshooting was configured as follows:
Approximately 24 hard disks. The C: drive was on SCSI Controller 0, and the remaining disks split between two other Controllers.
All disks were on two datastores.
The VM was regularly showing signs of storage latency.

After investigations had ruled out the storage itself, we reconfigured the VM as follows:
Added a fourth SCSI controller.
Moved the low activity disks to SCSI Controller 0.
Balanced the higher activity disks between the remaining three SCSI controllers (five or six disks per controller).
Created eight smaller datastores, and split the VM’s disks amongst these.

This solved the latency issues experienced.
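The balancing step in this example can be approximated with a short script. This is a minimal sketch, not the method we used at the time: it sorts the disks by activity and places each one on the controller with the lowest running total. The function name and the disk names and activity figures in the sample are invented for illustration. Controllers are numbered from 1, which leaves controller 0 for the low activity disks.

```shell
# Hypothetical helper: read "disk_name activity" lines on stdin and
# spread the disks over $1 controllers (numbered 1..N).
balance_disks() {
    controllers=$1
    # Busiest disks first, then greedily pick the least loaded controller.
    sort -k2,2nr | awk -v n="$controllers" '
    {
        best = 1
        for (c = 2; c <= n; c++)
            if (load[c] < load[best]) best = c
        load[best] += $2
        printf "%s -> SCSI controller %d\n", $1, best
    }'
}

# Made-up sample data: disk label and an activity figure (for example, average IOPS).
printf '%s\n' "disk1 900" "disk2 800" "disk3 700" "disk4 100" | balance_disks 3
```

The activity figures would come from whatever monitoring you already have. The sketch only shows the placement logic.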

Path Selection Policy

By default, when a host is built and a shared datastore is created, the Path Selection Policy is normally set to Most Recently Used. This restricts the data path to just one being active at any time. Performance improvements can be seen by switching the Path Selection Policy to Round Robin, and then by reducing the IOPS limit to 1.
In Round Robin, the default is that a new path is selected after 1000 I/O operations; the change mentioned reduces this so that a new path is selected after every I/O.
It is recommended this is done with input from your storage vendor to confirm they support this.
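To illustrate the effect of the IOPS limit, here is a deliberately simplified model of path switching. It is an assumption made for illustration, not a description of how ESXi selects paths internally: I/Os are counted from 0 and the paths are used strictly in turn.

```shell
# Simplified model (an assumption, not ESXi's actual scheduler).
# $1 = I/O number (counting from 0)
# $2 = IOPS limit (I/Os sent down a path before switching)
# $3 = number of paths
path_for_io() {
    echo $(( ($1 / $2) % $3 ))
}

path_for_io 999 1000 2    # default limit: I/O 999 is still on path 0
path_for_io 1000 1000 2   # default limit: I/O 1000 moves to path 1
path_for_io 3 1 2         # limit of 1: I/Os alternate, so I/O 3 is on path 1
```

With the default limit, a short burst of I/O stays on a single path. With a limit of 1, consecutive I/Os are spread across all paths.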

Changing the above settings for multiple datastores can be done through SSH, and does not impact any running VMs:

In the Storage Devices view on a host, confirm if your datastore device IDs use naa addresses, or whether they start with something else.
SSH to the host.
Execute the following commands (alter “naa” in the two commands below to whatever is appropriate for your datastores):

for i in `esxcfg-scsidevs -c | awk '{print $1}' | grep naa`; do esxcli storage nmp device set --device $i --psp VMW_PSP_RR; done
for i in `esxcfg-scsidevs -c | awk '{print $1}' | grep naa`; do esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$i; done

Remember to do this on all connected hosts.
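If you would rather review the commands before running them, a dry-run variant along the following lines may help. It is a sketch: the function name and the device IDs in the sample are made up. It prints the esxcli commands for each matching device instead of executing them.

```shell
# Hypothetical dry-run helper: read device IDs on stdin, keep those whose
# ID starts with the given prefix (e.g. naa), and print the commands that
# would be run. Nothing is changed on the host.
gen_psp_commands() {
    prefix=$1
    grep "^${prefix}\." | while read -r dev; do
        echo "esxcli storage nmp device set --device $dev --psp VMW_PSP_RR"
        echo "esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$dev"
    done
}

# Made-up device IDs for illustration. On a host, feed in the real list:
#   esxcfg-scsidevs -c | awk '{print $1}' | gen_psp_commands naa
printf '%s\n' "naa.600a0b80001234" "mpx.vmhba0:C0:T0:L0" | gen_psp_commands naa
```

Once the output looks right, it can be piped to sh on the host. Afterwards, esxcli storage nmp psp roundrobin deviceconfig get --device=<device ID> should show the new limit for a given device.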