I’m happy to announce today that you can now deploy vGPU Mixed Size Virtual GPU types with VMware vSphere 8U3, also known as “Heterogeneous Time-Slice Sizes” or “Heterogeneous vGPU types”.
VMware vSphere 8U3 was released yesterday (June 26th, 2024), and brought with it numerous new features and functionality. However, mixed vGPU types deserve their own blog post, as they're a major game-changer for those who use NVIDIA vGPU for AI and VDI workloads, including Omnissa Horizon.
NVIDIA vGPU (Virtual GPU) Types
When deploying NVIDIA vGPU, you configure Virtual GPU types that provide Workstation RTX (vWS Q-Series), Virtual PC (vPC B-Series), or Virtual Apps (vApps A-Series) class capabilities to virtual machines.
On top of the classifications above, you also need to configure the framebuffer memory size (VRAM/Video RAM) allotted to the vGPU.
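For example, you can list the vGPU types a host's GPUs can offer by SSHing to the host and running the command below (this assumes the NVIDIA vGPU Manager host driver is installed; the PCI address and profile names shown are illustrative for a hypothetical A40, and naming varies by GPU and driver version):

[root@ESXi-HOST:~] nvidia-smi vgpu -s
GPU 00000000:3B:00.0
    NVIDIA A40-1Q
    NVIDIA A40-2Q
    NVIDIA A40-4Q
    ...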
Historically, when you powered on the first VM, the physical GPU providing vGPU would then only be able to serve that Virtual GPU type (class and framebuffer size) to other VMs, locking all the VMs running on that GPU to the same vGPU type. If you had multiple GPUs in a server, you could run different vGPU types on the different physical GPUs; however, each GPU would be locked to the vGPU type of the first VM started on it.
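You could actually watch this locking happen from the host: nvidia-smi vgpu -c lists the vGPU types that are currently creatable, and in the default (equal-size) mode that list collapses to a single entry once the first vGPU-enabled VM powers on. A hypothetical example, again assuming an A40 where the first VM used a 4Q profile:

[root@ESXi-HOST:~] nvidia-smi vgpu -c
GPU 00000000:3B:00.0
    NVIDIA A40-4Q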
NVIDIA Mixed Size Virtual GPU Type functionality
Earlier this year, NVIDIA added the ability to deploy heterogeneous (mixed) vGPU types through the vGPU drivers, first with the ability to run different classifications (you could mix vWS and vPC), and later adding support for mixed-size framebuffers (for example, mixing a 4Q and an 8Q profile on the same GPU).
While the NVIDIA vGPU solution supported this, VMware vSphere did not, so you couldn't take advantage of it until the release of VMware vSphere 8U3, VMware vCenter 8U3, and VMware ESXi 8U3.
Mixing different classifications (vWS with vPC) requires no configuration other than using a host driver and guest driver that support it; however, to use different-sized framebuffers, the feature needs to be enabled on the host.
To Enable vGPU Mixed Size Virtual GPU types:
- Log on to VMware vCenter
- Confirm all vGPU-enabled Virtual Machines are powered off
- Select the host in your inventory
- Select the “Configure” tab on the selected host
- Navigate to “Graphics” under “Hardware”
- Select the GPU from the list, click “Edit”, and change the “vGPU Mode” to “Mixed Size”
Once you configure this, you can now deploy mixed-size vGPU profiles.
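For reference, the vGPU profile assigned to a VM is recorded in that VM's VMX file, so with Mixed Size enabled you should be able to power on VMs like the two below on the same physical GPU. Treat this as an illustrative sketch only: the profile strings (shown here for a hypothetical A40) and the exact values vary by GPU, vSphere, and vGPU version, so check your own VMX files rather than copying these.

# VM1 (example): a workstation-class vGPU with an 8GB framebuffer
pciPassthru0.present = "TRUE"
pciPassthru0.vgpu = "grid_a40-8q"

# VM2 (example): a 4GB profile on the same physical GPU, which only
# works once the GPU is in Mixed Size mode
pciPassthru0.present = "TRUE"
pciPassthru0.vgpu = "grid_a40-4q"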
When you SSH into your host, you can run a query to confirm it's configured:
[root@ESXi-HOST:~] nvidia-smi -q
    vGPU Device Capability
        Fractional Multi-vGPU             : Supported
        Heterogeneous Time-Slice Profiles : Supported
        Heterogeneous Time-Slice Sizes    : Supported
    vGPU Heterogeneous Mode               : Enabled
It’s supported, and enabled!
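Once a few VMs with different profile sizes are powered on, you can also confirm they're sharing the same physical GPU: nvidia-smi vgpu lists the active vGPU instances per GPU. The output below is abbreviated and reformatted for readability, and the VM names and profiles are just examples; the point is simply that two different profile sizes now appear under a single GPU:

[root@ESXi-HOST:~] nvidia-smi vgpu
GPU 00000000:3B:00.0
    NVIDIA A40-4Q    ...    VDI-WIN11-01
    NVIDIA A40-8Q    ...    VDI-WIN11-02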
Additional Notes
Please note the following:
- When restarting your hosts, resetting the GPU, and/or restarting the vGPU Manager daemon, the ESXi host will change back to its default “Same Size” mode. You will need to manually change it back to “Mixed Size”.
- When enabling mixed-size vGPU types, the maximum number of some vGPU profiles may be reduced versus running the GPU in equal-size mode (to make room for other profile sizes); see the worked example after this list. For the exact counts, see the information on Mixed-Size vGPU types inside the “Virtual GPU Types for Supported GPUs” link in the additional links.
- Only the “Best Effort” and “Equal Share” schedulers are supported with mixed-size vGPU. Fixed Share scheduling is not supported.
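To make the capacity impact concrete, here's a hypothetical example (these numbers are illustrative, not taken from NVIDIA's support matrix, so always verify against the documentation for your GPU): on a 48GB GPU in equal-size mode, 4Q profiles could yield twelve VMs (12 x 4GB = 48GB). In mixed-size mode, if you power on one 24Q VM and two 8Q VMs (24GB + 8GB + 8GB = 40GB), 8GB of framebuffer remains, but depending on how the placement regions line up, some or all of that remaining space may be unusable rather than available for two more 4Q VMs.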
Greetings. Nice to see a familiar face. Anyway, do you know if vMotion now supports non-full GPU modes? We had started using mixed sizes in a cluster but gave up since DRS didn't support moving them around automatically.
Hi David,
As of vSphere 8U3, DRS is now vGPU aware. I'd recommend checking the documentation, but I think you'll be pretty happy. Please note that as of today (July 13th, 2024), Horizon and vGPU do not support vSphere 8U3, but I'm expecting that'll happen soon.
Cheers,
Stephen
Regarding your note:
“When restarting your hosts, resetting the GPU, and/or restarting the vGPU Manager daemon, the ESXi host will change back to its default “Same Size” mode. You will need to manually change it back to “Mixed Size”.”
This only seems to be partly true. Although they are reset to “Same Size” after a host reboot, either vCenter or the vGPU Manager sets it back to “Mixed Size” again a few seconds after reconnecting with vCenter.
I've tried lots of combinations of host reboot/shutdown, and it ends up being set correctly to “Mixed Size” again every time. Which is great, or else this whole awesome feature would be useless for our servers.
Tested on ESXi 8.0 U3b, vCenter 8.0.3.00200, & NVIDIA 17.3 manager/driver
Thanks for the write up!
We just upgraded to 8U3 and missed this part. I will say it looks like you have to power down the VMs on the host to change this setting.
Espen, thanks for that info as well. That would be bad if it switched back after every reboot.
Have you had any issues vMotioning a vGPU VM in mixed mode? Since moving to mixed mode I'm unable to vMotion a machine from one host to another. I get the compatibility warning: “Insufficient resources. One or more devices (pciPassthru0) required by VM are not available on host.”
Hi Dave, I haven't had any issues; however, I only have a few deployments where I have this enabled. Do you have it enabled on both the source and destination hosts? Also, when using mixed sizes, depending on the framebuffer sizes/profiles you're using, there may be unusable space. Have you checked against the documentation that, with the profiles you're using, you can consume the remaining space?
Hello, I've just upgraded to 8.0.3, but when I try to set the vGPU mode to mixed size, it seems to apply the change, but nothing happens; the value stays as it was before. Do you know the command to set the parameter through the CLI?
Hi Juan,
Did you set it before powering up any VMs? All VMs have to be powered off before you enable the setting.
Also, are you running a version of vGPU that supports mixed-size profiles? Did you update your driver to the version required for vSphere 8?
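I'm not aware of a documented CLI command to set the mode itself, but you can at least confirm from the CLI whether the change took effect by filtering the same query from the post:

[root@ESXi-HOST:~] nvidia-smi -q | grep -i heterogeneous
        Heterogeneous Time-Slice Profiles : Supported
        Heterogeneous Time-Slice Sizes    : Supported
    vGPU Heterogeneous Mode               : Enabled

If “vGPU Heterogeneous Mode” still shows Disabled after applying the change in vCenter, the setting didn't take.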
Cheers,
Stephen
Hi Stephen,
When I try to change the value with the VMs running, I get an error. I've also tried changing the value without any VMs registered to the host, with the same result.
On the driver side, I’ve installed the latest driver that Nvidia has released for vSphere 8 at this time (550.127.06-1OEM.800.1.0.20613240 released on 10/22/2024).
As I found out, I think the problem is with my GPUs. I have a Tesla T4, which is based on the Turing architecture. If I'm not mistaken, you can only use this new feature with Ampere or newer GPUs.
As a curious detail, I can run Q and C profiles on the same GPU as long as I don't mix the VRAM sizes. If I'm not mistaken, this is because this GPU has two processors that can handle the two profiles separately.
Hi Juan,
That's correct: since the T4 uses the Turing architecture, it does not support mixed-size vGPU profiles.
Hi Stephen,
We're testing this new option since we have to mix different vGPU framebuffer sizes. It seems that even though the Shared Passthrough GPU assignment policy is set to “Spread VMs across GPUs (best performance)”, I'm seeing the GPU consolidation behavior instead. I've confirmed by running nvidia-smi that VM placement is no longer being spread across GPUs. Did you see similar behavior in your setup?
Hello,
I personally haven't seen this behavior. I could see it happening with the first couple of VMs powered on; however, do you still see it as more and more VMs continue to power on?
Cheers,
Stephen