In a recent article, I discussed using GPUs in virtual desktops and how to monitor their usage using the ControlUp Real-Time Console.
With the recent outbreak of Coronavirus, more and more power users are working remotely and are using virtual desktop technology (VDI) to complete their day-to-day tasks. These power users need to use multiple virtual desktops or virtual desktops with GPUs. Virtual desktops, especially those that use scarce resources, such as graphics cards, need to be used as judiciously as possible.
With this in mind, I’d like to show you some examples of monitoring users that are running multiple virtual desktops, those that use virtual desktops with GPUs and, more importantly, how to pinpoint issues users may experience, so you can troubleshoot them efficiently and effectively.
Using one of the more powerful thin clients on the market, a Wyse 5070, I will demonstrate the GPU and thin client resource usage when using multiple virtual desktops and then when using a GPU-based virtual desktop.
The Wyse 5070 is a highly configurable VDI client platform that can be configured as a basic VDI client or to support the most demanding of workloads. I paired it with an Intel Pentium Silver J5005 CPU, 8 GB DDR4 RAM, a 64GB SSD drive, Intel UHD Graphics 605, and (most importantly) an AMD Radeon 9173 PCIe graphics card. Windows 10 Enterprise 2016 LTSB 64-bit was pre-installed on the device, which was fortuitous, as it allowed me to monitor the thin client’s resource usage using ControlUp. The 5070 can have up to 6 monitors attached to it.
I verified the configuration of the device by running Speccy, a free utility that shows you information about hardware and software on your computer.
Power users most commonly use VDI clients with multiple monitors in one of two ways: as a device connected to a single virtual desktop with a GPU, or as a device that displays many different virtual desktops.
I set the 5070 to display several virtual desktops. I set up three instant-clone virtual desktops, a manual virtual desktop, and a fifth desktop—a physical PC delivered through VMware Horizon.
I connected the device to five different monitors: three 4K monitors and two 1280 x 1080 monitors. I plugged the three 4K monitors into the GPU card and the two 1280 x 1080 monitors into the main device.
Initially, I used a single virtual desktop and displayed it on one of the 4K monitors. Using ControlUp, I was able to monitor not only the virtual desktop, but also the Wyse 5070. To stress the virtual desktop, I played a 4K video on it. The virtual desktop showed CPU usage of 92% and 5.27 Mbps of network traffic, while the Wyse 5070 had CPU usage of 11% and 5.24 Mbps of network traffic. This indicated that any performance-related issue would come from the virtual desktop, not the 5070.
I used the ControlUp Console to monitor the processes on the virtual desktop and the Wyse 5070. I saw that the VLC application (which I used to play the video) used 66% of the CPU, while VMBlast used only 22% of the CPU; the process on the 5070 that was using the most CPU resources was services.exe.
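The comparison above, reading the same two metrics on both machines to decide which side is the likely bottleneck, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `EndpointSample` type, the `likely_bottleneck` function, and the 80% threshold are hypothetical and are not part of ControlUp. The numbers are the ones reported above.

```python
from dataclasses import dataclass


@dataclass
class EndpointSample:
    """One reading for a machine (hypothetical structure, not a ControlUp type)."""
    name: str
    cpu_percent: float
    network_mbps: float


def likely_bottleneck(desktop, client, cpu_threshold=80.0):
    """Return the name of the machine whose CPU looks saturated, or None.

    cpu_threshold is an assumed cut-off for "heavily used"; tune it for
    your own environment.
    """
    busy = [m for m in (desktop, client) if m.cpu_percent >= cpu_threshold]
    if not busy:
        return None
    # If both machines are over the threshold, report the busier one.
    return max(busy, key=lambda m: m.cpu_percent).name


# Figures from the single-desktop 4K video test described above.
desktop = EndpointSample("virtual desktop", cpu_percent=92.0, network_mbps=5.27)
client = EndpointSample("Wyse 5070", cpu_percent=11.0, network_mbps=5.24)
print(likely_bottleneck(desktop, client))
```

Because both machines report nearly the same network traffic, the CPU figure is what separates them here, which is why the sketch compares only CPU.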
Using VMware Horizon, I powered on the other two 4K monitors, which were attached to the Wyse 5070 through mini DisplayPorts on the extended chassis. I saw that all of the virtual desktops were using nearly all their CPU’s capacity, while the Wyse 5070 was using only 13%. If this were a real-life work situation, end users would have complained about videos not playing smoothly and applications not being as responsive as usual. I was able to ascertain that this was not the fault of the 5070 itself, but rather due to the CPU of the virtual desktops being so heavily used (as indicated by the processor queue length).
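To make the processor queue length reasoning concrete, here is a minimal Python sketch. It assumes a commonly cited rule of thumb, that a sustained queue of more than about two waiting threads per core suggests CPU contention. That limit, the function name `cpu_is_contended`, and the sample values are my own illustrative assumptions, not measurements from this test or thresholds defined by ControlUp.

```python
def cpu_is_contended(queue_lengths, cores, per_core_limit=2.0):
    """Return True if the average queue length per core exceeds the limit.

    queue_lengths: processor queue length samples taken over time.
    cores: number of (v)CPUs on the machine being checked.
    per_core_limit: assumed rule-of-thumb limit, not an official threshold.
    """
    if not queue_lengths or cores < 1:
        raise ValueError("need at least one sample and at least one core")
    average = sum(queue_lengths) / len(queue_lengths)
    return average / cores > per_core_limit


# Hypothetical samples, for illustration only.
print(cpu_is_contended([9, 11, 10, 12], cores=2))  # busy desktop
print(cpu_is_contended([0, 1, 0, 1], cores=4))     # lightly loaded client
```

Averaging several samples matters because a single high reading can be a momentary spike rather than sustained pressure.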
To test the 5070 with five monitors, I displayed yet another virtual desktop on one of the 1280 x 1080 monitors and the physical machine (NUC10i7FNH) with the Horizon agent on the other monitor. I streamed a video on the virtual desktop and ran PCMark 10 on the physical machine to generate a load. The Wyse system showed CPU usage of 27% even though it was processing 22.68 Mbps of data; the three virtual machines were heavily stressed, but the physical system was not. Due to a technical problem on our side, we were not able to display the fifth virtual desktop in the screen capture below.
To test the difference in protocols, I connected two virtual desktops to different 4K monitors, using two different protocols: Blast and PCoIP. Each of the videos looked exactly the same to me, but the desktop that used PCoIP consumed far more bandwidth. It needs to be noted, however, that higher bandwidth consumption by PCoIP is not an issue when the network is not under contention, as both Blast and PCoIP have mechanisms in place to lower bandwidth consumption when they detect network contention. ControlUp makes it easy to monitor resource usage for different protocols or when applying tuning tweaks.
The next series of tests involved using a single GPU-enabled virtual desktop to drive four different monitors.
I set up a GPU virtual desktop that was hosted on a Dell PowerEdge R740xd with an Intel Xeon Gold 6248 CPU at 2.50GHz, 256GB of DDR4 2667MHz RAM, and an NVIDIA Turing T4 GPU installed. The server had VMware ESXi 6.7.0 installed and was being managed by a 6.7 vCenter server. For more information on this setup, see my first article on GPUs with ControlUp.
I didn’t want the virtual desktop to be a limiting factor, so I created a Windows 10 VM with 16GB of RAM, 4 vCPU, and a 200GB disk that was hosted on the local SSD datastore and associated the GPU with it.
For multi-monitor usage, I connected the Wyse 5070 to a Dell 43 Ultra HD 4K multi-client monitor (P4317Q). This monitor can simultaneously display content from up to four different inputs in FHD (1920×1080), or from a single input at a resolution of 4K (3840×2160). The monitor has two HDMI/MHL inputs, a Mini DisplayPort input, a full-size DisplayPort input, a VGA input, and a pair of 8-watt speakers.
I attached the Dell P4317Q monitor to the main HDMI port and used the Horizon client to attach to the virtual desktop with the GPU and configure the monitor to act as a single 4K monitor. I played a 1080p video and worked with LibreOffice documents.
During my testing, I didn’t notice any degradation of performance. The Wyse 5070 was processing 8.26 Mbps of data from the virtual desktop, but the CPU was only at 9% and the queue length was less than 1%. The virtual desktop showed CPU usage of 8% and GPU usage of 19%.
Next, I configured the Dell P4317Q monitor’s picture-by-picture (PBP) feature to display two FHD (1920×1080) displays and set up another video connection between the monitor and the 5070; I displayed the GPU virtual desktop to both. I played a 1080p video to stress the Wyse 5070 and worked with LibreOffice documents. I didn’t notice any degradation of performance on either of the monitors.
The ControlUp Console showed that even though the Wyse 5070 was driving two monitors and processing ~10 Mbps of data, there was only 19% GPU usage on the virtual desktop, CPU usage of 9%, and a queue length of less than 1%.
I used the P4317Q PBP feature to display four FHD (1920×1080) virtual monitors and connected the monitor to the device’s expansion card. I then used the GPU-enabled desktop with all four of the virtual monitors; I played a 1080p video on all the monitors and worked with LibreOffice documents.
The response was still good on the virtual desktop. The virtual desktop showed 9% CPU usage and 25% GPU usage. Even though the Wyse 5070 was driving four monitors and processing 12.31 Mbps of data, its CPU was at 9%, its GPU was at 16%, and the queue length was less than 1%.
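To put the three GPU desktop runs side by side, the short Python sketch below divides the bandwidth the Wyse 5070 received by the number of displays it was driving. The figures are the ones reported above (with "~10 Mbps" taken as 10.0); the helper name `mbps_per_monitor` is my own. Keep in mind that the single-display run was at 4K while the other two used FHD virtual monitors, so this is a rough comparison rather than a like-for-like benchmark.

```python
def mbps_per_monitor(total_mbps, monitors):
    """Average bandwidth per display for one test run."""
    if monitors < 1:
        raise ValueError("monitors must be at least 1")
    return total_mbps / monitors


# (layout, number of displays, Mbps received by the Wyse 5070)
runs = [
    ("single 4K", 1, 8.26),
    ("two FHD", 2, 10.0),   # reported as "~10 Mbps"
    ("four FHD", 4, 12.31),
]

for layout, monitors, mbps in runs:
    print(f"{layout}: {mbps_per_monitor(mbps, monitors):.2f} Mbps per display")

# Total bandwidth growth from the one-display run to the four-display run.
growth = runs[-1][2] / runs[0][2]
print(f"one to four displays: {growth:.2f}x total bandwidth")
```

Going from one display to four raised the total bandwidth by only about 1.5 times, which is consistent with the low CPU and GPU load seen on the thin client.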
Next, I wanted to see what the resource usage was using PCoIP with a GPU-enabled virtual desktop. I reconfigured the P4317Q monitor to be a single 4K monitor and again ran my streaming video and office application test on it, using PCoIP rather than Blast. The Wyse 5070 was able to handle the load, but while the GPU usage on the virtual desktop resembled what I saw when using Blast, the bandwidth consumption was 34 Mbps with PCoIP, much higher than with Blast. Further, the CPU usage on the 5070 went up to 90% when using PCoIP, as opposed to 14% when using Blast. Once again, it is important to note that both Blast and PCoIP have mechanisms in place to lower bandwidth consumption when they detect network contention, so the fact that PCoIP used more bandwidth is not an issue when the network is not under contention. Again, the purpose of this test was to show that both PCoIP and Blast can be used with a GPU-enabled virtual desktop on this system.
The purpose of this article was to illustrate how to use ControlUp to monitor the VDI environment of power users who need more than one monitor or GPU-enabled virtual desktops. I showed you examples of monitoring everything from a VDI client with a single 4K monitor attached to a GPU virtual desktop, to a VDI client with five monitors attached to it and multiple virtual desktops. I showed you how you could look at the processes running on both the VDI client and the virtual desktop. I even showed you how you could compare the resource utilization of different remote display protocols.
With more and more power users working from home, we need to have VDI clients that can drive multiple monitors and virtual desktops that use GPUs. But we also need to ensure that they are using these resources as efficiently as possible. If they have issues, we need to pinpoint them as quickly as possible. ControlUp makes it possible to do just that, ensuring business continuity and keeping end-users productive.
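A protocol comparison like this one comes down to a couple of ratios, which the Python sketch below computes. The `ProtocolRun` type and the `compare` function are hypothetical names of my own, not ControlUp or Horizon APIs. The Blast bandwidth for this particular run is not stated above, so the sketch leaves it unset rather than guessing, and only the client CPU ratio is produced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtocolRun:
    """Metrics for one test run (hypothetical structure for illustration)."""
    protocol: str
    client_cpu_percent: float
    bandwidth_mbps: Optional[float] = None  # None when the figure was not reported


def compare(a, b):
    """Return how many times larger run a's figures are than run b's."""
    ratios = {"client_cpu": a.client_cpu_percent / b.client_cpu_percent}
    # Only compare bandwidth when both runs actually reported it.
    if a.bandwidth_mbps is not None and b.bandwidth_mbps is not None:
        ratios["bandwidth"] = a.bandwidth_mbps / b.bandwidth_mbps
    return ratios


pcoip = ProtocolRun("PCoIP", client_cpu_percent=90.0, bandwidth_mbps=34.0)
blast = ProtocolRun("Blast", client_cpu_percent=14.0)  # bandwidth not stated for this run

ratios = compare(pcoip, blast)
print({name: round(value, 1) for name, value in ratios.items()})
```

On these figures, the thin client used a little over six times as much CPU with PCoIP as with Blast.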