VXLAN overlay technology is part of what makes software defined networking possible. By encapsulating full frames into UDP datagrams, L2 networks can be stretched across all manners of routed topologies. This breaks down the barriers of physical networking and builds the foundation for the software defined datacenter.
VXLAN or Virtual Extensible LAN is an IETF standard documented in RFC 7348. L2 over routed topologies is made possible by encapsulating entire L2 frames into UDP datagrams. About 50 bytes of outer header data is added to every L2 frame because of this, meaning that for every frame sent on a VXLAN network, both an encapsulation and de-encapsulation task must be performed. This is usually performed by ESXi hosts in software but can sometimes be offloaded to physical network adapters as well.
In a perfect world, this would be done without any performance impact whatsoever. The reality, however, is that software defined wizardry often does have a small performance penalty associated with it. This is unavoidable, but that doesn’t mean there isn’t anything that can be done to help to minimize this cost.
If you’ve been doing some performance testing, you’ve probably noticed that VMware doesn’t post statements like “You can expect X number of Gbps on a VXLAN network”. This is because there are simply too many variables to consider. Everything from NIC type, switches, drivers, firmware, offloading features, CPU count and frequency can play a role here. All these factors must be considered. From my personal experience, I can say that there is a range – albeit a somewhat wide one – of what I’d consider normal. On a modern 10Gbps system, you can generally expect more than 4Gbps but less than 7Gbps with a 1500 MTU. If your NIC supports VXLAN offloading, this can sometimes be higher than 8Gbps. I don’t think I’ve ever seen a system achieve line-rate throughput on a VXLAN backed network with a 1500 MTU regardless of the offloading features employed.
What if we can reduce the amount of encapsulation and de-encapsulation that is so taxing on our hypervisors? Today I’m going to take an in-depth look at just this – using an 8900 MTU to reduce packet rates and increase throughput. The results may surprise you!