An In-depth Look at SR-IOV NIC Passthrough

SR-IOV or “Single Root I/O Virtualization” is a very interesting feature that can provide virtual machines shared access to physical network cards installed in the hypervisor. This may sound a lot like what a virtual NIC and a vSwitch does, but the feature works very similarly to PCI passthrough, granting a VM direct access to the NIC hardware. In order to understand SR-IOV, it helps to understand how PCI passthrough works. Here is a quote from a post I did a few years ago:

“PCI Passthrough – or VMDirectPath I/O as VMware calls it – is not at all a new feature. It was originally introduced back in vSphere 4.0 after Intel and AMD introduced the necessary IOMMU processor extensions to make this possible. For passthrough to work, you’ll need an Intel processor supporting VT-d or an AMD processor supporting AMD-Vi as well as a motherboard that can support this feature.

In a nutshell, PCI passthrough allows you to give a virtual machine direct access to a PCI device on the host. And when I say direct, I mean direct – the guest OS communicates with the PCI device via IOMMU and the hypervisor completely ignores the card.”

SR-IOV takes PCI passthrough to the next level. Rather than granting exclusive use of the device to a single virtual machine, the device is shared or ‘partitioned’. It can be shared between multiple virtual machines, or even shared between virtual machines and the hypervisor itself. For example, a single 10Gbps NIC could be ‘passed through’ to a couple of virtual machines for direct access, and at the same time it could be attached to a vSwitch being used by other VMs with virtual NICs and vmkernel ports too. Think shared PCI passthrough.

Continue reading “An In-depth Look at SR-IOV NIC Passthrough”

Updating NIC Drivers with VMware Update Manager

Using VUM and DRS to make quick work of driver updates in larger environments.

In my last video, I showed you how to update ESXi NIC drivers from the command line. This method is great for one-off updates, or for small environments, but it really isn’t scalable. Thankfully, VMware Update Manager can make quick work out of driver updates. By taking advantage of fully-automated DRS, VUM can make the entire process seamless and orchestrate everything from host evacuation, driver installation and even the host reboots.

In today’s video, I walk you through how to upload a custom patch into VUM and create a baseline that can be used to update a driver.

Remember, some server vendors require specific or minimum firmware levels to go along with their drivers. The firmware version listed in the compatibility guide is only the version used to test/qualify the driver. It’s not necessarily the best or only choice. VMware always recommends reaching out to your hardware vendor for the final word on driver/firmware interoperability.

I hope you found this video helpful. For more instructional videos, please head over to my YouTube channel. Please feel free to leave any comments below, or on YouTube.

Updating NIC Drivers in ESXi from the CLI

A video walk-through on updating your NIC drivers from the command line for maximum control.

There are a number of reasons you may want to update your NIC drivers and firmware. Maybe it’s just a best practice recommendation from the vendor, or perhaps you’ve run into a bug or performance problem that warrants this. Whatever the reason, keeping your NIC drivers up to date is always a good idea.

There are several ways to go about updating your drivers, but the tried and tested ‘esxcli’ method works well for small environments. It’s also a good choice to ensure you have maximum control over the process. The below video will walk you through the update process:

Remember that finding the correct NIC on the VMware Compatibility Guide is one of the most important steps in the driver update process. For help on narrowing down your exact NIC make/model based on PCI identifiers, be sure to check out this video.

Another important point to remember is that some server vendors require specific or minimum firmware levels to go along with their drivers. The firmware version listed in the compatibility guide is only the version used to test/qualify the driver. It’s not necessarily the best or only choice. VMware always recommends reaching out to your hardware vendor for the final word on driver/firmware interoperability.

Stay tuned for another video on using VMware Update Manager to create a baseline for automating the driver update process!

I hope you found this video helpful. For more instructional videos, please head over to my YouTube channel. Please feel free to leave any comments below, or on YouTube.

Identifying NICs based on PCI VID and DID

A better way to find your exact NIC model on the VMware Compatibility Guide.

If you’ve ever tried to search for a NIC in the VMware Compatibility Guide, you may have come up a much longer list of results than you expected. Many cards with similar names have subtle differences. Some have multiple hardware revisions, varying numbers or types of ports and may also be released by different OEMs. In some situations, the name of the card in the vSphere UI may not match what it truly is, adding to the confusion.

Thankfully, there is a much better way to identify your card. You can use the PCI VID, DID, SVID and SSID identifiers. The below video will walk through how to find these identifiers, as well as how to use them to find your specific card on the HCG.

Please feel free to leave any comments or questions below or on YouTube.

VMware Tools 10.3.2 Now Available

New bundled VMXNET3 driver corrects PSOD crash issue.

As mentioned in a recent post, a problem in the tools 10.3.0 bundled VMXNET3 driver could cause host PSODs and connectivity issues. As of September 12th, VMware Tools 10.3.2 is now available, which corrects this issue.

The problematic driver was version 1.8.3.0 in tools 10.3.0. According to the release notes, it has been replaced with version 1.8.3.1. In addition to this fix, there are four resolved issues listed as well.

VMware mentions the following in the 10.3.2 release notes:

Note: VMware Tools 10.3.0 is deprecated due to a VMXNET3 driver related issue. For more information, see KB 57796. Install VMware Tools 10.3.2, or VMware Tools 10.2.5 or an earlier version of VMware Tools.”

Kudos to the VMware engineering teams for getting 10.3.2 released so quickly after the discovery of the problem!

Relevant links:

 

Configuring a Proxy in Photon OS

I’ve been playing around recently with VMware’s new Photon OS platform. Thanks to it’s incredibly small footprint and virtualization-specific tuning, it looks like an excellent building block for a custom appliance I’m hoping to build. To keep the appliance as small as possible, I used the minimal deployment and then planned to install packages as required.

After deploying the appliance, I hit a roadblock as the package management tool called tdnf couldn’t reach any of the repositories. This was expected as my home lab is isolated and I have to go through a squid proxy server to get to the outside world.

root@photon-machine [ ~ ]# tdnf repolist
curl#7: Couldn't connect to server
Error: Failed to synchronize cache for repo 'VMware Photon Linux 2.0(x86_64) Updates' from 'https://dl.bintray.com/vmware/photon_updates_2.0_x86_64'
Disabling Repo: 'VMware Photon Linux 2.0(x86_64) Updates'
curl#7: Couldn't connect to server
Error: Failed to synchronize cache for repo 'VMware Photon Linux 2.0(x86_64)' from 'https://dl.bintray.com/vmware/photon_release_2.0_x86_64'
Disabling Repo: 'VMware Photon Linux 2.0(x86_64)'
curl#7: Couldn't connect to server
Error: Failed to synchronize cache for repo 'VMware Photon Extras 2.0(x86_64)' from 'https://dl.bintray.com/vmware/photon_extras_2.0_x86_64'
Disabling Repo: 'VMware Photon Extras 2.0(x86_64)'

When trying to build the package cache, you can see that the the synchronization fails to specific HTTPS locations over port 443.

After having a quick look through the Photon administration guide, I was surprised to see that there wasn’t anything regarding proxy configuration listed – at least not at the time of writing. Doing some digging online turned up several possibilities. There seems to be numerous places in which a proxy can be defined – including in the kubernetes configuration, or specifically for the tdnf package manager.

The simplest way to get your proxy configured for tdnf, as well as other tools like WGET and Curl is to define a system-wide proxy. You’ll find the relevant configuration in the /etc/sysconfig/proxy file:

Continue reading “Configuring a Proxy in Photon OS”

Debunking the VM Link Speed Myth!

10Gbps from a 10Mbps NIC? Why not? Debunking the VM link speed myth once and for all!

** Edit on 11/6/2017: I hadn’t noticed before I wrote this post, but Raphael Schitz (@hypervisor_fr) beat me to the debunking! Please check out his great post on the subject as well here. **

I have been working with vSphere and VI for a long time now, and have spent the last six and a half years at VMware in the support organization. As you can imagine, I’ve encountered a great number of misconceptions from our customers but one that continually comes up is around VM virtual NIC link speed.

Every so often, I’ll hear statements like “I need 10Gbps networking from this VM, so I have no choice but to use the VMXNET3 adapter”, “I reduced the NIC link speed to throttle network traffic” and even “No wonder my VM is acting up, it’s got a 10Mbps vNIC!”

I think that VMware did a pretty good job documenting the role varying vNIC types and link speed had back in the VI 3.x and vSphere 4.0 era – back when virtualization was still a new concept to many. Today, I don’t think it’s discussed very much. People generally use the VMXNET3 adapter, see that it connects at 10Gbps and never look back. Not that the simplicity is a bad thing, but I think it’s valuable to understand how virtual networking functions in the background.

Today, I hope to debunk the VM link speed myth once and for all. Not with quoted statements from documentation, but through actual performance testing.

Continue reading “Debunking the VM Link Speed Myth!”

VM Network Performance and CPU Scheduling

Over the years, I’ve been on quite a few network performance cases and have seen many reasons for performance trouble. One that is often overlooked is the impact of CPU contention and a VM’s inability to schedule CPU time effectively.

Today, I’ll be taking a quick look at the actual impact CPU scheduling can have on network throughput.

Testing Setup

To demonstrate, I’ll be using my dual-socket management host. As I did in my recent VMXNET3 ring buffer exhaustion post, I’ll be testing with VMs on the same host and port group to eliminate bottlenecks created by physical networking components. The VMs should be able to communicate as quickly as their compute resources will allow them.

Physical Host:

  • 2x Intel Xeon E5 2670 Processors (16 cores at 2.6GHz, 3.3GHz Turbo)
  • 96GB PC3-12800R Memory
  • ESXi 6.0 U3 Build 5224934

VM Configuration:

  • 1x vCPU
  • 1024MB RAM
  • VMXNET3 Adapter (1.1.29 driver with default ring sizes)
  • Debian Linux 7.4 x86 PAE
  • iperf 2.0.5

The VMs I used for this test are quite small with only a single vCPU and 1GB of RAM. This was done intentionally so that CPU contention could be more easily simulated. Much higher throughput would be possible with multiple vCPUs and additional RX queues.

The CPUs in my physical host are Xeon E5 2670 processors clocked at 2.6GHz per core. Because this processor supports Intel Turbo Boost, the maximum frequency of each core will vary depending on several factors and can be as high as 3.3GHz at times. To take this into consideration, I will test with a CPU limit of 2600MHz, as well as with no limit at all to show the benefit this provides.

To measure throughput, I’ll be using a pair of Debian Linux VMs running iperf 2.0.5. One will be the sending side and the other the receiving side. I’ll be running four simultaneous threads to maximize throughput and load.

I should note that my testing is far from precise and is not being done with the usual controls and safeguards to ensure accurate results. This said, my aim isn’t to be accurate, but rather to illustrate some higher-level patterns and trends.

Continue reading “VM Network Performance and CPU Scheduling”

VMXNET3 RX Ring Buffer Exhaustion and Packet Loss

ESXi is generally very efficient when it comes to basic network I/O processing. Guests are able to make good use of the physical networking resources of the hypervisor and it isn’t unreasonable to expect close to 10Gbps of throughput from a VM on modern hardware. Dealing with very network heavy guests, however, does sometimes require some tweaking.

I’ll quite often get questions from customers who observe TCP re-transmissions and other signs of packet loss when doing VM packet captures. The loss may not be significant enough to cause a real application problem, but may have some performance impact during peak times and during heavy load.

After doing some searching online, customers will quite often land on VMware KB 2039495 and KB 1010071 but there isn’t a lot of context and background to go with these directions. Today I hope to take an in-depth look at VMXNET3 RX buffer exhaustion and not only show how to increase buffers, but to also to determine if it’s even necessary.

Rx Buffering

Not unlike physical network cards and switches, virtual NICs must have buffers to temporarily store incoming network frames for processing. During periods of very heavy load, the guest may not have the cycles to handle all the incoming frames and the buffer is used to temporarily queue up these frames. If that buffer fills more quickly than it is emptied, the vNIC driver has no choice but to drop additional incoming frames. This is what is known as buffer or ring exhaustion.

Continue reading “VMXNET3 RX Ring Buffer Exhaustion and Packet Loss”