Performance Monitoring on VMware ESX

I met with a Systems Engineer from VMware this afternoon. Some of my students are working on a performance study of VMware, so I took the opportunity to pick his brain on how to get performance data from the server. There are two levels at which you need to gather data: the virtual machine and the host machine. Here's what I found out:

perfmon gives good data for everything but the CPU on the virtual machines. Because the host machine is running ESX (a modified Linux kernel) rather than Windows, you can't run perfmon on it directly. For the host machine itself, there are several options:

  • VMware VirtualCenter gives usage data, but the default polling interval is five minutes, which isn't fast enough for our study. The polling interval can be reduced, although I still have questions about whether or not we can create
  • esxtop is a special version of top that can run on the host machine.
  • vmkusage is an HTTP-accessible program that gives host and virtual machine usage data.
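
To make the polling-interval concern concrete, here is a minimal Python sketch of what happens when fine-grained samples are averaged into one five-minute reading. The numbers are made up for illustration (a 10% CPU baseline with a 30-second burst at 100%), and the function name coarse_averages is my own, not part of any VMware tool.

```python
from statistics import mean


def coarse_averages(samples, fine_interval_s, coarse_interval_s):
    """Average consecutive fine-grained samples into coarse polling windows."""
    per_window = coarse_interval_s // fine_interval_s
    return [
        mean(samples[i:i + per_window])
        for i in range(0, len(samples), per_window)
    ]


# Hypothetical CPU utilization (%) sampled every 5 seconds for 5 minutes:
# a 10% baseline with a 30-second burst at 100%.
fine = [10.0] * 60
fine[20:26] = [100.0] * 6

coarse = coarse_averages(fine, fine_interval_s=5, coarse_interval_s=300)

print(max(fine))  # peak visible at 5-second resolution
print(coarse)     # the single value a 5-minute average would report
```

With these assumed numbers the burst that pegs the CPU for 30 seconds shows up as a single reading of 19%, which is why a five-minute interval can hide exactly the behavior a performance study is looking for.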

Another question I've had is about resource constraints. We bought boxes that were maxed out in CPUs and memory. We're concerned that we'll run into network bottlenecks. I knew that we could buy a quad NIC and assign ports, but I didn't know that ESX will gang the quad NICs together and let you do resource allocation to the virtual machines.

We also talked about using VMware in a disaster recovery situation. Because the virtual machines look like files, they can be backed up. You can then restore the backups daily to an off-site VMware host and, in the event of a disaster, be ready for a warm start on the backed-up servers. You're a day behind, but could be rolling in a matter of minutes.
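
The daily restore step could be scripted along these lines. This is only a sketch under assumptions of my own: backups land in one directory per day named YYYY-MM-DD, and the standby host reads its virtual machine files from a single directory. The function name restore_latest and the layout are hypothetical, and a real setup would also have to make sure the virtual machine files are in a consistent state when they are copied.

```python
import shutil
from pathlib import Path


def restore_latest(backup_root, standby_dir):
    """Copy the files from the newest dated backup folder into standby_dir.

    Assumes (hypothetically) one sub-directory per day named YYYY-MM-DD,
    so sorting by name also sorts by date. Returns the name of the folder
    that was restored.
    """
    backup_root = Path(backup_root)
    dated = sorted(
        (p for p in backup_root.iterdir() if p.is_dir()),
        key=lambda p: p.name,
    )
    if not dated:
        raise FileNotFoundError(f"no backups under {backup_root}")

    latest = dated[-1]
    standby = Path(standby_dir)
    standby.mkdir(parents=True, exist_ok=True)

    # Overwrite yesterday's copies with the newest ones.
    for item in latest.iterdir():
        if item.is_file():
            shutil.copy2(item, standby / item.name)
    return latest.name
```

Run once a day against the off-site copy of the backups, something like restore_latest("/backups/vms", "/standby/vms") (both paths are placeholders) would keep the standby host no more than a day behind.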