vMotion for vGPUs

The introduction of vMotion for vGPUs was one of the exciting vSphere features announced at VMworld US this year. The announcements included a new vSphere edition (vSphere Platinum) and version (vSphere 6.7 Update 1). Security is the key feature of vSphere Platinum. It provides security at the hypervisor level with encryption in flight and at rest and with TPM 2.0 support (including virtual TPM 2.0). Access security has been improved, and the VMs themselves are now protected with AppDefense. You can read more about the new vSphere Platinum features in this blog post.

vMotion for vGPUs was one of the most exciting new features in the vSphere 6.7 Update 1 announcement (read them all here). You probably know that GPUs (graphics processing units) can be used to make the virtual desktop experience more seamless for end users. But you may not realize that GPUs can also be used for HPC and Deep Learning compute clusters. GPUs are excellent for these workloads because they are optimized for running many tasks in parallel, which is a perfect match for High Performance Computing workloads.

Virtualizing GPUs – architectural considerations

GPUs can be brought into a vSphere environment by two methods: direct pass-thru and vGPU. Direct pass-thru uses VMware DirectPath I/O, and the GPU is presented to a VM as a PCI device. The guest OS bypasses the virtualization layer to access the GPU, so it’s simple to set up, but you can’t share a GPU with other VMs or perform more complex operations such as vMotion.

GPU Direct Path Architecture

vGPU is a virtual NVIDIA GPU. vGPUs are a type of mediated pass-thru co-engineered by VMware and NVIDIA. NVIDIA’s virtualization software is installed on the ESXi host, and a driver is required in each guest. There are three editions of vGPU, each enabled by applying a license:

Quadro vDWS (Quadro Virtual Data Center Workstation). Don’t let the name fool you: this is not only for use with workstations. You must use this edition if you want to virtualize a powerful GPU like the Tesla V100 to drive compute for Deep Learning.
GRID vPC: for VDI, such as what’s delivered with Horizon (which relies on vSphere).
GRID vApps: for VDI, specifically apps delivered by Horizon (which relies on vSphere).

Once this is configured, VMs are able to share access to a GPU, and use that processing power to support DL and AI architectures.

vGPU support has been available in vSphere for some time to support VDI use cases, and support for using vGPUs for compute for ML/DL/AI workloads has been evolving over the last few versions. When vSphere 6.7 U1 is released, you’ll be able to vMotion vGPUs. There are some prerequisites: both servers must use the same GRID profile, and the GPUs must be the same on both servers.
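The two prerequisites can be expressed as a simple pre-flight check. The sketch below is illustrative only: `HostGpuConfig`, `vmotion_blockers`, and the host and profile names are hypothetical and are not part of any VMware or NVIDIA API. It assumes you have already collected each host's GPU model and configured GRID profiles by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostGpuConfig:
    """Hypothetical summary of one host's GPU setup (not a VMware or NVIDIA API type)."""
    name: str
    gpu_model: str            # e.g. "Tesla V100"
    grid_profiles: frozenset  # GRID profile names configured on the host


def vmotion_blockers(vm_profile: str, source: HostGpuConfig, target: HostGpuConfig) -> list:
    """Return the unmet prerequisites; an empty list means both checks pass."""
    blockers = []
    # Prerequisite 1: the GPUs must be the same on both servers.
    if source.gpu_model != target.gpu_model:
        blockers.append(f"GPU mismatch: {source.gpu_model} vs {target.gpu_model}")
    # Prerequisite 2: the VM's GRID profile must be present on both servers.
    for host in (source, target):
        if vm_profile not in host.grid_profiles:
            blockers.append(f"GRID profile {vm_profile} missing on {host.name}")
    return blockers


# Placeholder host and profile names, chosen only for this example.
esx01 = HostGpuConfig("esx-01", "Tesla V100", frozenset({"profile-a"}))
esx02 = HostGpuConfig("esx-02", "Tesla V100", frozenset({"profile-a"}))
esx03 = HostGpuConfig("esx-03", "Tesla P40", frozenset({"profile-b"}))

print(vmotion_blockers("profile-a", esx01, esx02))  # matching hosts
print(vmotion_blockers("profile-a", esx01, esx03))  # different GPU and profile
```

Returning a list of reasons rather than a single boolean makes it easier to see which prerequisite a given pair of hosts fails.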

vMotion for vGPUs benefits both VDI and HPC compute scenarios: you can provision part of a GPU or an entire GPU to a VM and still get all of the benefits that vMotion brings to a virtual machine.
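To make "partial or entire GPU" concrete, here is a rough sizing sketch. It rests on an assumption not stated in this post: that each vGPU profile reserves a fixed amount of GPU memory, so the number of VMs that can share one physical GPU is the GPU's memory divided by the profile's memory. The function name `vms_per_gpu` and the memory sizes are illustrative.

```python
def vms_per_gpu(gpu_memory_gb: int, profile_memory_gb: int) -> int:
    """How many VMs can share one GPU, assuming a fixed memory slice per VM.

    Assumption (not from the post): every VM on the GPU uses the same
    profile, and each profile reserves profile_memory_gb of GPU memory.
    """
    if profile_memory_gb <= 0:
        raise ValueError("profile memory must be positive")
    if profile_memory_gb > gpu_memory_gb:
        raise ValueError("profile does not fit on this GPU")
    return gpu_memory_gb // profile_memory_gb


# Illustrative sizes: a 16 GB GPU shared in 4 GB slices, then given whole to one VM.
print(vms_per_gpu(16, 4))   # 4 VMs, each with a partial GPU
print(vms_per_gpu(16, 16))  # 1 VM with the entire GPU
```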

vGPU for Compute

In addition to enabling a seamless VDI experience, vGPU can be used to power the compute for virtual HPC and virtual Deep Learning architectures. With the advancements in ESXi, this can be done with virtually no impact on performance.

Here’s an example of what the virtual architecture could look like. Check out this performance paper for more details.

Learn more

Using vSphere to provide compute for HPC, Machine Learning, Deep Learning, and Big Data Analytics environments is an exciting new technical space that will empower all sorts of organizations to harness the power of their data. Learn more about this new space by checking out the reference architecture guide for High Performance Computing and by following the vSphere Machine Learning and AI blog as well as the HPC section of the vSphere Apps blog.