GPU-accelerated VMs on Proxmox, XCP-ng? Here's what you need to know

Hands on Broadcom's acquisition of VMware has sent many scrambling for alternatives.

Two of the biggest beneficiaries of Broadcom's price hikes, at least on the free and open source side of things, have been the Proxmox VE and XCP-ng hypervisors.

At the same time, interest in enterprise AI has taken off in earnest. With so many making the switch to these FOSS-friendly virtualization platforms, we figured at least some of you might be interested in passing a GPU or two through to your VMs to experiment with local AI workloads.

In this tutorial, we'll be looking at what it takes to pass a GPU through to VMs running on either platform, and go over some of the more common pitfalls you may run into.

Enabling PCIe passthrough on XCP-ng

To kick things off, we'll start with XCP-ng - a descendant of the Citrix XenServer project - as it's the easier of the two hypervisors on which to pass PCIe devices through, at least in this vulture's experience.

By default, graphics cards get assigned to Dom0 (the management VM) and are used for display output. However, with a couple of quick config changes, we can tell Dom0 to ignore the card so that we can use the hardware for acceleration in another VM - you may want to set up a display via another GPU, via the CPU, or the motherboard's integrated graphics.

Before you get started, make sure that an IOMMU is enabled in BIOS. Short for I/O memory management unit, and usually labeled Intel VT-d or AMD-Vi in firmware menus, this is used by the hypervisor to strictly control which hardware resources each guest VM can directly access, ultimately allowing a given virtual machine to communicate directly with the GPU.

On server and workstation hardware, an IOMMU is usually enabled by default. But if you're using consumer hardware or running into issues, you may want to check your BIOS to ensure it's turned on.

Next connect to your XCP-ng host via KVM or SSH, as shown above, and drop to a command shell. From here we'll use lspci to locate our GPU:
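As a rough sketch, assuming plain lspci output piped through grep, the search might look like the following. The find_gpu helper is a name we've made up for illustration; it simply filters whatever text it is handed.

```shell
# Hypothetical helper: filter `lspci` output for a keyword (default "VGA").
find_gpu() {
    grep -i -- "${1:-VGA}"
}

# On the XCP-ng host:
#   lspci | find_gpu          # look for VGA controllers
#   lspci | find_gpu NVIDIA   # or match on the vendor name instead
```

Matching is case-insensitive, so nvidia and NVIDIA behave the same.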

If searching for VGA turns up nothing, try 3D, Nvidia, or AMD instead.

You should be presented with something like this:

Next, note down the PCI address assigned to the GPU's graphics and audio functions. In this case it's 03:00.0. We'll use this to tell XCP-ng to hide it from Dom0 on subsequent boots. As you can see in the command below, we've plugged in our GPU's address after the 0000: to hide that specific device from the management VM:
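Here's a minimal sketch of that step, assuming the xen-cmdline tool that ships with XCP-ng and the xen-pciback.hide parameter described in its passthrough documentation. The pciback_hide_arg helper is hypothetical and only assembles the value; the command that actually changes the boot configuration is left as a comment so you can check the address first.

```shell
# Hypothetical helper: build the xen-pciback.hide value for one or more
# PCI addresses given as bus:device.function (for example 03:00.0).
pciback_hide_arg() {
    local value="xen-pciback.hide=" addr
    for addr in "$@"; do
        value="${value}(0000:${addr})"
    done
    printf '%s\n' "$value"
}

# On the XCP-ng host, as root, with the address from this walkthrough:
#   /opt/xensource/libexec/xen-cmdline --set-dom0 "$(pciback_hide_arg 03:00.0)"
#   reboot
```

Passing several addresses, say 03:00.0 and 03:00.1, chains them into a single value so the graphics and audio functions can be hidden together.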

With that out of the way we just need to reboot the machine and our GPU will be ready to be passed through to our VM.

Passing a GPU to a VM in XCP-ng

With Dom0 no longer in control of the GPU, you can move on to attaching it to another VM. Begin by spinning up a new VM in Xen Orchestra as you normally would. For this tutorial we'll be using the latest release of Ubuntu Server 24.04.

Once your OS is installed in the new virtual machine, shut down the VM and head over to its "Advanced" tab in the Xen Orchestra web interface. Scroll down to GPUs and click the + button to select your card, as pictured above. It will appear as passthrough once added.

With that out of the way, you can go ahead and start up your VM. To test whether we passed through our GPU successfully, we can run lspci this time from inside the Linux guest VM.

If your GPU appears in the list, you're ready to install your drivers. Depending on your OS and hardware, this may require downloading driver packages from the manufacturer's website. If you happen to be running an Ubuntu 24.04 VM with an Nvidia card, you can simply run:

And if you want the CUDA toolkit, you'd also run:
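As a sketch of both installs on Ubuntu 24.04, assuming the packages in Ubuntu's own archive: the 550 driver branch below is our assumption, and running ubuntu-drivers list --gpgpu should show what's actually on offer for your card. The run wrapper is a hypothetical convenience that only prints the commands unless you set APPLY=1.

```shell
# Prints each command by default; run with APPLY=1 to actually execute them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

# Driver package: the 550 "server" branch is an assumption, adjust to taste.
run sudo apt install -y nvidia-driver-550-server

# Optional: the CUDA toolkit as packaged by Ubuntu.
run sudo apt install -y nvidia-cuda-toolkit
```

The dry-run default is there so you can confirm the package names before anything is installed.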

If you're running a different distro or operating system, you will want to check out the GPU vendor's website for drivers and instructions.

Now that you've got an accelerated VM up and running, we recommend checking out some of our hands-on guides linked at the bottom of this story.

If things haven't gone smoothly, check out XCP-ng's documentation on device passthrough here.

Enabling PCIe passthrough on Proxmox VE

Enabling PCIe passthrough on Proxmox VE is a little more involved.

As with XCP-ng, we need to tell Proxmox not to initialize the graphics card we'd like to pass through to our VM. Unfortunately, it's a bit of an all-or-nothing situation with Proxmox, as the way we do this is by blacklisting the driver module for our specific brand of GPU.

To get started, install your GPU in your server and boot into the Proxmox management console. But, before we go any further, make sure that Proxmox sees our GPU. For this we'll be using the lspci utility to list our installed peripherals.

From the Proxmox management console, select your node from the sidebar, open the shell, as pictured above, and then type in:
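One way to sketch the search, assuming lspci piped through grep: the hypothetical first_gpu_match helper below walks through the usual keywords in turn and prints the matches for the first one that hits.

```shell
# Hypothetical helper: read `lspci` output on stdin and print the lines
# matching the first keyword that produces a hit.
first_gpu_match() {
    local listing kw
    listing=$(cat)
    for kw in VGA 3D NVIDIA AMD; do
        if printf '%s\n' "$listing" | grep -i -- "$kw"; then
            return 0
        fi
    done
    return 1
}

# In the Proxmox shell:
#   lspci | first_gpu_match
```

Bear in mind that on an AMD platform the AMD keyword will also match chipset devices, so it makes sense as the last resort.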

If nothing comes up, try searching for 3D, Nvidia, or AMD instead.

You should see a printout similar to this one showing your graphics card:

Now that we've established that Proxmox can actually see the card, we can move on to enabling the IOMMU and blacklisting the drivers. We'll be demonstrating this on an AMD system with an Nvidia GPU, but we'll also share steps for AMD cards.

Enabling IOMMU

Before we can pass through our PCIe device, we need to enable the IOMMU both in the BIOS and in the Proxmox bootloader. As we mentioned in the earlier section on XCP-ng, IOMMU is the mechanism which the hypervisor uses to make the GPU available to the VM guests running on the system. In our experience the IOMMU should be enabled by default on most server and workstation equipment, but is usually disabled on consumer boards.

Once you've got an IOMMU activated in BIOS, we need to tell Proxmox to use it. From the Proxmox management console, open the shell.

The next bit depends on how you configured your boot disk. Usually Proxmox will default to Grub for single disk installations and Systemd-boot for installations on mirrored disks.

For Grub:

Start by opening your Grub config file, /etc/default/grub. Feel free to use your preferred editor if nano isn't your thing.

Then modify the GRUB_CMDLINE_LINUX_DEFAULT= line. For Intel CPUs, add intel_iommu=on to it.

Meanwhile, for those with AMD CPUs, add amd_iommu=on instead.

The Proxmox team also recommends adding iommu=pt to boost performance on hardware that supports it, though it's not strictly required.
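For illustration, the edited line in /etc/default/grub might end up looking like this. We're assuming quiet is the only option already present, so keep whatever else your existing line carries.

```shell
# /etc/default/grub (fragment). "quiet" is assumed to be the existing default;
# keep any other options your line already has.

# Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# AMD CPUs would use this instead:
# GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
```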

Save and exit, then apply the change by updating the bootloader with update-grub.

For Systemd-boot:

The process looks a little different for Systemd-boot but is pretty much the same idea. Rather than the Grub config, we'll be editing the kernel cmdline file, /etc/kernel/cmdline.

Then add intel_iommu=on if you've got an Intel CPU, or amd_iommu=on if you've got an AMD one, to the first line. You can also add iommu=pt if your hardware supports it. If you've got an Intel-based system, the file should look something like this:
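If you'd rather script the edit than do it by hand, here's a small sketch. The add_cmdline_params helper is hypothetical: it appends parameters to the first line of a file and skips any that are already there. The root= portion shown in the comment is an assumption about a typical ZFS install, so yours may differ.

```shell
# Hypothetical helper: append kernel parameters to a one-line cmdline file,
# skipping any that are already present.
add_cmdline_params() {
    local file="$1" line param
    shift
    line=$(head -n 1 "$file")
    for param in "$@"; do
        case " $line " in
            *" $param "*) ;;                # already present, leave as-is
            *) line="$line $param" ;;
        esac
    done
    printf '%s\n' "$line" > "$file"
}

# On the Proxmox host:
#   add_cmdline_params /etc/kernel/cmdline intel_iommu=on iommu=pt
# A ZFS-based install might then read (root= portion assumed):
#   root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt
```

Because it checks for existing entries, running it twice leaves the file unchanged.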

Save and exit, then apply the change by executing proxmox-boot-tool refresh.

Adding the VFIO kernel modules

Next we need to enable a few VFIO modules. Open the module config file, /etc/modules, using your editor of choice.

Then add the vfio, vfio_iommu_type1, and vfio_pci modules, one per line.
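For those who prefer to script it, here's a minimal sketch. The add_modules helper is hypothetical and appends each module name to a file only if that exact line isn't there already.

```shell
# Hypothetical helper: append module names to a modules file, one per line,
# skipping any that are already listed.
add_modules() {
    local file="$1" module
    shift
    touch "$file"
    for module in "$@"; do
        grep -qxF -- "$module" "$file" || printf '%s\n' "$module" >> "$file"
    done
}

# On the Proxmox host:
#   add_modules /etc/modules vfio vfio_iommu_type1 vfio_pci
```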

If you're trying this on an earlier version of Proxmox - older than version 8.0 - you'll also need to add vfio_virqfd.

Once you've updated the modules file, rebuild the initramfs with update-initramfs -u -k all and then reboot your system.

After you reboot your system, you can check that the modules have loaded successfully by running lsmod | grep vfio.

If everything worked properly, you should see the three VFIO modules we enabled earlier appear on screen. Check out our troubleshooting section if you run into any problems.

Blacklist the graphics drivers

Now that we've got an IOMMU successfully configured, we need to tell Proxmox that we don't want it to load our GPU drivers.

This can be done by creating a blacklist config file under /etc/modprobe.d/.

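For Nvidia GPUs, that means blacklisting the nouveau, nvidia, nvidiafb, and nvidia_drm modules.

For AMD GPUs, blacklist amdgpu and radeon.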
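Here's one way to sketch it, assuming the module names listed in the Proxmox passthrough documentation. The gpu_blacklist_lines helper is hypothetical: it just prints the blacklist entries for a given vendor so you can review them before appending them to a file. The blacklist.conf filename is the one the Proxmox docs use, though any .conf file under /etc/modprobe.d/ should do.

```shell
# Hypothetical helper: print modprobe blacklist entries for a GPU vendor.
gpu_blacklist_lines() {
    case "$1" in
        nvidia) printf 'blacklist %s\n' nouveau nvidia nvidiafb nvidia_drm ;;
        amd)    printf 'blacklist %s\n' amdgpu radeon ;;
        *)      echo "usage: gpu_blacklist_lines nvidia|amd" >&2; return 1 ;;
    esac
}

# On the Proxmox host:
#   gpu_blacklist_lines nvidia >> /etc/modprobe.d/blacklist.conf
```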

Finally, rebuild the initramfs with update-initramfs -u -k all and reboot.

Passing through your graphics card to a VM

Okay, we've officially arrived at the fun part. At this point, we should be able to add our GPU to our VM and everything should just work. If it doesn't, head down to our troubleshooting section for a few tweaks to try as well as resources to check out.

To do that, start by creating a new VM. The process should be fairly straightforward but there are a couple of changes we need to make under the System and CPU sections of the Proxmox web-based VM creation wizard, as pictured below.

Under the System section, set Machine to q35 and BIOS to OVMF (UEFI), and pick a storage location for the EFI disk.

Next, under the CPU section, shown above, ensure that Type is set to Host to avoid compatibility issues that can crop up with some GPU drivers and runtimes.

Once your VM has been created, go ahead and start it and install your operating system of choice. In this case, we're using Ubuntu 24.04 Server edition.

After you have your OS installed, shut down the VM and head over to its Hardware config page. Click Add and select PCI Device, as shown above.

Next select Raw Device and select the GPU you'd like to pass through to the VM from the drop-down. Then, tick the Advanced checkbox and ensure that both ROM-Bar and PCI-Express are checked. If the latter is grayed out, you probably didn't set the machine type to q35 in the earlier step. Optionally, you can repeat these steps to pass through the GPU's audio device, if it has one.

With our PCIe device added, we can go ahead and start our VM up and use lspci to make sure the GPU has been passed through successfully:

Again, if nothing comes up, try searching for 3D, Nvidia, or AMD instead.

From here, you can install your GPU drivers as you normally would on a bare metal system.

About that secure boot error when installing Nvidia drivers

During installation, you may run into an error, shown below, because UEFI Secure Boot was enabled for OVMF VMs in Proxmox.

You can either reboot and disable Secure Boot in the VM BIOS (press the Esc key during the initial boot splash) or you can set a password and enroll a signing key in your EFI firmware.

Troubleshooting Proxmox GPU passthrough

If for some reason you're still having trouble passing your GPU through to a VM, you may need to make a few additional tweaks.

Enabling unsafe interrupts

If for some reason you're having trouble getting the VFIO modules set up, you may have to enable unsafe interrupts by creating a new config file under /etc/modprobe.d/. According to the Proxmox docs, this can lead to instability, so we only recommend applying this if you run into trouble:
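A minimal sketch of that file, using the option and filename given in the Proxmox documentation. The enable_unsafe_interrupts helper is hypothetical and takes the target path as an argument, so you can write to a scratch file first and inspect the result.

```shell
# Hypothetical helper: write the allow_unsafe_interrupts option to a file.
enable_unsafe_interrupts() {
    echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > "$1"
}

# On the Proxmox host:
#   enable_unsafe_interrupts /etc/modprobe.d/iommu_unsafe_interrupts.conf
```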

Configuring VFIO passthrough for troublesome cards

If you're still having trouble, you may need to more explicitly tell Proxmox to let the VM take control over the GPU.

Start by grabbing the vendor and device IDs for your GPU. They'll look a bit like this: 10de:26b1 and 10de:22ba. Note that if you're using a server card, you may only have one set of IDs. To identify the IDs for our GPU we can use lspci:
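As a sketch, assuming lspci's -nn flag, which prints numeric IDs in square brackets alongside the device names: the hypothetical pci_ids helper below pulls the vendor:device pair out of each line it is given.

```shell
# Hypothetical helper: extract the [vendor:device] ID from each line of
# `lspci -nn` output read on stdin.
pci_ids() {
    sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p'
}

# On the Proxmox host:
#   lspci -nn | grep -i vga | pci_ids
```

The four-digit class code that also appears in brackets has no colon, so it is ignored.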

Again, if VGA doesn't work try 3D, Nvidia, or AMD.

You should get something like this back:

Now, we can add those IDs - in this case, 10de:26b1 and 10de:22ba - to a new kernel module config file using echo. Remember to change the ID numbers to your card's actual IDs.
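Here's a minimal sketch of that step, using printf rather than echo. The vfio_pci_options helper is hypothetical and joins whatever IDs you give it with commas. The vfio.conf filename follows the Proxmox documentation.

```shell
# Hypothetical helper: build the vfio-pci options line from one or more
# vendor:device IDs.
vfio_pci_options() {
    local ids
    ids=$(IFS=,; printf '%s' "$*")
    printf 'options vfio-pci ids=%s\n' "$ids"
}

# On the Proxmox host, with the example IDs from this walkthrough:
#   vfio_pci_options 10de:26b1 10de:22ba > /etc/modprobe.d/vfio.conf
```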

We can now rebuild the initramfs with update-initramfs -u -k all and reboot.

To check that the vfio-pci driver has been loaded, we can run lspci -nnk and scroll up until we see our card.

If everything worked correctly, you should see something like this (note that vfio-pci is listed as the kernel driver) and you can head back up to the previous section to configure your VM:
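To save some scrolling, here's a sketch that checks a single device, assuming lspci's -k flag (show kernel drivers) and -s flag (select a device by address). The bound_to_vfio helper is hypothetical, and 03:00.0 is a placeholder for your card's address.

```shell
# Hypothetical helper: succeed if the `lspci -nnk` text on stdin reports
# vfio-pci as the kernel driver in use.
bound_to_vfio() {
    grep -q 'Kernel driver in use: vfio-pci'
}

# On the Proxmox host (replace 03:00.0 with your card's address):
#   lspci -nnk -s 03:00.0 | bound_to_vfio && echo "vfio-pci is bound"
```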

Putting your GPU-accelerated VM to work

Now that you've got a GPU-accelerated VM, how about checking out one of our other AI-themed tutorials for ways you can put it to work...

We're already hard at work on more AI and large language model-related coverage, so be sure to sound off in the comments with any ideas or questions you might have. ®

Editor's Note: Nvidia provided The Register with an RTX 6000 Ada Generation graphics card to support this story and others like it. Nvidia had no input into the content of this article.
