I have been working on a large End User Computing programme for a while and have not found the time to blog, so now it is time to catch up with a few snippets.
This one is about Virtual Desktop Infrastructure (VDI) and the BIOS settings of the physical servers. Here's the summary: VDI depends on high performance hosts, but by default hosts are typically configured for a balance of performance and energy efficiency. Check your BIOS. It may not be what you think.
I first came across this a while ago when working on a new VDI deployment of Windows 7 on VMware View, running on Dell blade servers to an EqualLogic SAN. We noticed that the desktops and applications were not launching as quickly as expected, even with low loads on the servers, networks and storage. We did a lot of work to analyse what was happening. It's not easy with non-persistent VDI, because you don't know what machine the user will log on to. The end result was a surprising one.
The problem statement was: "Opening Outlook and Word seems to be sluggish, even though the host resources are not being fully used. Performance is not slow. It is just not as good as we were expecting".
My company, Airdesk, is usually called in after the IT team has been unable to resolve a problem for a while. If the problem were obvious it would have been solved already, so we expect to be looking for a more obscure cause. In this case it was not a simple shortage of CPU, memory or disk resources, because these are easily monitored in the vSphere console. Here's a good article on Troubleshooting ESX/ESXi virtual machine performance issues. Let's assume the IT team has done all that and still not found the problem.
My approach to troubleshooting is hypothesis-based. We identify all the symptoms. We identify the things that could cause those symptoms. We devise tests to rule them out. It's not as easy as that, because you can't always restructure the production environment for testing. You need tools to tell you what is going on.
In this case the tools we used were:
- vSphere console to monitor the hosts and the virtual machines from outside
- Performance Monitor to monitor processor and disk activity from inside the virtual machine
- Process Monitor (Sysinternals) to see what was actually happening during the launch
- WinSAT to provide a consistent benchmark of performance inside the virtual machine
- Exmon for monitoring the Outlook-Exchange communication
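The idea behind a consistent benchmark can be sketched in a few lines. This is not WinSAT, and it is not what we ran; it is a minimal Python stand-in, assuming you can run a script inside the virtual machine. The function names and the workload are hypothetical. It times the same fixed, CPU-only task several times, so that results from different machines, or from before and after a configuration change, can be compared like for like.

```python
import statistics
import time


def cpu_bound_work(iterations: int = 200_000) -> int:
    """A fixed, deterministic CPU-only task (no disk or network access)."""
    total = 0
    for i in range(iterations):
        total += (i * i) % 7
    return total


def benchmark(runs: int = 5, iterations: int = 200_000) -> dict:
    """Time the same workload several times and summarise the timings."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        cpu_bound_work(iterations)
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Run on an otherwise idle desktop and keep the numbers for comparison.
    print(benchmark())
```

Because the workload never changes, a large difference in the median between two runs on lightly loaded hosts points at the platform rather than the application.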
The tools told us that the application launch was CPU-bound, but there was no significant CPU load on the hosts. CPU Ready time (the measure of delay in scheduling CPU resources on a host) was normal. We could see spikes of disk latency, but these did not explain the time delay in opening applications.
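For reference, the vSphere performance charts report CPU Ready as a summation in milliseconds per sample interval, which is easier to judge as a percentage. The sketch below uses the commonly documented conversion and assumes the real-time chart interval of 20 seconds. The function name is hypothetical, and dividing by the number of vCPUs is an assumption that applies only when the value is the aggregate for a multi-vCPU machine.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation (ms) into a percentage of the sample interval.

    interval_s defaults to 20 s, the real-time chart interval (an assumption;
    use the interval of the chart the value was read from).
    vcpus normalises an aggregate value per vCPU (also an assumption).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 1000 ms of ready time in a 20 s sample on a single vCPU
    print(cpu_ready_percent(1000))
```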
Our conclusion was that the virtual machines were not getting access to the CPU that the vSphere console said was available to them. What could cause that? Something perhaps that throttled the performance of the CPU? Intel SpeedStep maybe? The vSphere console showed that it was configured for High Performance. But we decided to check the BIOS on the hosts and, sure enough, they were configured for Active Power Controller (hardware-based power management for energy efficiency).
We changed the BIOS settings, and the result was immediate. Performance of the virtual desktop was electric. Click on an application and, bang, it was open. We potentially saved tens of thousands of pounds by finding the cause rather than throwing resources at it.
You have two types of setting in Dell BIOS:
- Processor settings, which can be optimized for different workloads
- Power Management settings, which give a choice between energy efficiency and performance
In our case we wanted to configure the processors for a general-purpose workload but we also wanted to provide immediate access to full resources, without stepping the processor down to save power based on average utilisation. So the Maximum Performance power setting was the one we needed. You could also set the BIOS Power Management to be OS-Controlled, and allow the vSphere setting to take effect. The point of this post is that the vSphere setting said the hosts were in High Performance mode, while the troubleshooting showed they were not.
That was a little while ago. I was reminded of it recently while designing the infrastructure for a global VDI environment based on XenServer (the hypervisor) and XenDesktop (the VDI) running on HP blade servers to a 3PAR SAN. In the Low Level Design we said "Max performance MUST be set in the host server BIOS".
Sure enough, in the HP BIOS the default power setting is for Balanced Power and Performance, and this needs to be changed. In a XenServer VDI environment it needs to be set to maximum performance. See this technote from Citrix on How to Configure a XenServer Host's BIOS Power Regulator for Maximum Performance.
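A design rule like this is easy to state and easy to miss on one host out of many, so it is worth checking mechanically. Here is a minimal sketch, assuming you have already collected each host's BIOS power profile into a simple mapping. How you collect it depends on the vendor's tooling and is not shown. The host names, the function name and the inventory format are all hypothetical.

```python
REQUIRED_PROFILE = "Maximum Performance"


def non_compliant_hosts(inventory: dict, required: str = REQUIRED_PROFILE) -> list:
    """Return the hosts whose recorded BIOS power profile is not the required one.

    inventory maps a host name to the power profile string recorded for it.
    The comparison ignores case and surrounding whitespace.
    """
    wanted = required.strip().lower()
    return sorted(
        host
        for host, profile in inventory.items()
        if profile.strip().lower() != wanted
    )


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "blade-01": "Maximum Performance",
        "blade-02": "Balanced Power and Performance",
        "blade-03": "maximum performance ",
    }
    for host in non_compliant_hosts(inventory):
        print(f"{host}: BIOS power profile is {inventory[host]!r}")
```

If your design allows more than one acceptable setting, such as OS-Controlled on hosts where the hypervisor policy is trusted, the single required value would become a set of accepted values.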
If you are not managing the BIOS power management settings on your virtualisation hosts, you are not getting the results you should.