Single Root I/O Virtualization (SR-IOV)


It’s a technology in enterprise-grade hardware that allows multiple virtual machines to use the same peripheral, like a graphics card. But how does it work exactly? How do you get multiple independent VMs to use the same component (I’m assuming each VM can fully utilize it if the others aren’t doing anything) without clashing with each other?


It’s a feature the device itself has to be designed to support. For graphics cards you’ll generally only see it on professional cards like Quadros, the ones that cost a few thousand dollars. GeForce cards can’t do it.

Normally you have to pass a PCI device, like a GeForce graphics card, to the VM in its entirety or forfeit some of the hardware. Most graphics cards with HDMI appear as two devices in your device listings: one for the graphics part and one as a sort of “sound card” which outputs HDMI audio. You can’t pass these two devices to different VMs, because they can’t be suitably isolated from each other. When a passed-through GPU tries to access system memory, the CPU’s IOMMU first maps that access into the VM’s memory space. Two devices on the same card can’t provide that separation, or guarantee they won’t talk to each other directly. But really that’s the graphics card’s fault.

(By “forfeit some of the hardware”, I mean you do have the option to NOT pass the audio device to the VM, but then it’s unavailable to everyone. Not even the host itself is allowed to use it, and the device must still be assigned to the VM pass-through driver to make sure that’s the case.)
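On Linux you can see which devices are forced to travel together by looking at the IOMMU groups the kernel exposes under /sys/kernel/iommu_groups. Here is a minimal Python sketch that lists them, assuming a Linux host with the IOMMU enabled. The function name and the `root` parameter are my own, added so the code can be pointed at a different directory; on a machine without an IOMMU it just returns nothing.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted PCI addresses} read from the sysfs layout."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # no IOMMU, or not a Linux host
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        print(f"group {number}: {', '.join(devices)}")
```

A GPU and its HDMI audio function will normally show up in the same group (something like 0000:01:00.0 and 0000:01:00.1), which is why they have to go to the same VM or be forfeited.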

SR-IOV devices, however, can provide that isolation, because they’re designed to. They create virtual PCI-E devices (called virtual functions, or VFs) that are meant to be passed to VMs. It’s like the graphics card example with the GPU and the audio device, except it’s more like 5 different GPUs, and they behave themselves. The main device itself has a management interface to create and delete the virtual devices, assign resources and limits, etc. Each virtual device acts like a separate device and respects the separation from the others.
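On Linux, the create/delete part of that management interface shows up as two sysfs files on the main (physical) device: sriov_totalvfs says how many virtual devices the card supports, and writing a number to sriov_numvfs creates that many. A rough Python sketch follows, assuming a Linux host, root privileges, and a driver that supports SR-IOV. The function name and the `sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path


def enable_vfs(pf_address, count, sysfs_root="/sys/bus/pci/devices"):
    """Create `count` virtual functions on the physical function at pf_address."""
    device = Path(sysfs_root) / pf_address
    total = int((device / "sriov_totalvfs").read_text())
    if not 0 <= count <= total:
        raise ValueError(f"device supports 0..{total} VFs, asked for {count}")
    numvfs = device / "sriov_numvfs"
    # The kernel refuses to change one non-zero VF count to another,
    # so drop back to zero before asking for the new number.
    numvfs.write_text("0")
    numvfs.write_text(str(count))
    return total
```

Something like `enable_vfs("0000:03:00.0", 4)` would then make four new PCI devices appear on the host, each of which can be handed to a VM. The address here is only an example.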

I’ve done this with network cards supporting SR-IOV. You can give VMs access to a virtual NIC, and on the host you can specify speed limits on the virtual port, set the MAC address and block data that tries to use a different address. It’s kinda nice for the VM because it means the host can’t view your data, since it goes right through the network card. Network cards like this usually support huge numbers of send/receive (TX/RX) queues when run alone, but as you make SR-IOV devices for virtual machines the number comes down, because the queues are being assigned to the virtual NICs. You can have 128 queues for the host only, or 63 virtual NICs (plus the real physical device) with 2 queues each. Each VM can have 100% of the network capacity if it’s available and no limits are set, but the 2-queue limit is not something you can change if you created the full set of virtual devices. In the case of networking that can be a problem for CPU usage, because with only 2 queues the VM’s packet processing can’t be spread over more than a couple of CPU cores.
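To make the numbers concrete, here is a small Python sketch of the queue arithmetic plus the host-side commands for one virtual NIC. The even split is a simplification matching the 128-queue example; real drivers decide the split themselves. The `ip link set ... vf` options (mac, spoofchk, max_tx_rate) are standard iproute2, but the interface name, MAC address and rate below are made-up examples, and the code only builds the commands without running them.

```python
def queues_per_function(total_queues, virtual_functions):
    """Split the card's queues evenly across the physical device and its VFs."""
    functions = virtual_functions + 1  # +1 for the real physical device
    return total_queues // functions


def vf_setup_commands(pf, vf, mac, max_tx_rate_mbps=None):
    """Build (but do not run) the host-side `ip link` commands for one VF."""
    base = ["ip", "link", "set", "dev", pf, "vf", str(vf)]
    commands = [
        base + ["mac", mac],        # pin the VF's MAC address
        base + ["spoofchk", "on"],  # block frames sent from any other MAC
    ]
    if max_tx_rate_mbps is not None:
        # Optional speed limit on the virtual port, in Mbit/s
        commands.append(base + ["max_tx_rate", str(max_tx_rate_mbps)])
    return commands


if __name__ == "__main__":
    print(queues_per_function(128, 63))  # prints 2
    for command in vf_setup_commands("eth0", 0, "52:54:00:12:34:56", 1000):
        print(" ".join(command))
```

With no virtual NICs the host keeps all 128 queues; with the full set of 63, every function, the host's included, is down to 2.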

The main point is that each device passed to a virtual machine must be isolated from other devices. PCI devices can communicate directly with each other without the CPU being involved, so when virtual machines have direct access to hardware, that path must be blocked. All the chips working as PCI communication hubs must support this isolation, as well as the CPU itself (many do now). If you have one that does not, then everything downstream of that chip must be passed to the same VM or else be forfeited as described previously. So SR-IOV requires that the device in question not be behind such a chip.