NSX Troubleshooting Scenario 4

Time for another NSX troubleshooting scenario! Welcome to the fourth installment of my new NSX troubleshooting series. What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

As always, we’ll start with a customer problem statement:

“We recently deployed Cross-vCenter NSX for a remote datacenter location. When we try to add VMs to the universal logical switches, there are no VM vNICs in the list to add. This works fine at the primary datacenter.”

This customer’s environment will be the same as what we outlined in scenario 3. Keep in mind that this should be treated separately. Forget everything from the previous scenario.

tshoot3a-9

The main location is depicted on the left. A three-host cluster called compute-a exists there. All of the VLAN-backed networks route through a router called vyos. The Universal Control Cluster exists at this location, as does the primary NSX manager.

The ‘remote datacenter’ is to the right of the dashed line. The single ‘compute-r’ cluster there is associated with the secondary NSX manager at that location. According to the customer, this was only recently added.

The Problem

The customer has two virtual machines, linux-r1 and linux-r2, that are experiencing this issue.

tshoot3a-3

From the NSX logical switches view, we can see several universal logical switches that are visible to both the primary and secondary NSX managers. The one the customer is trying to add the VMs to is ‘Universal Web’:

tshoot4a-3

The fact that we can see these logical switches in the ‘Secondary’ NSX manager view seems to support that they are indeed synchronized universal objects that can be used by either environment. In the ‘Add Virtual Machines’ dialog, we see both VMs available for addition:

tshoot4a-4

But on the next step, where the VM vNICs need to be selected, nothing is displayed at all. We get no error messages, just a blank page, and we’re forced to hit ‘Cancel’.

tshoot4a-5

If we try the more traditional way to change the VM’s vNIC backing using the ‘Edit Settings’ dialog, we don’t see the universal logical switch – or any universal logical switch – available for selection:

tshoot4a-6

Interestingly, we can see the non-universal logical switches associated with the secondary NSX manager, but none of the universal switches.

tshoot4a-7

From the networking view, I can’t see any of the dvPortgroups associated with the universal logical switches. They only seem to exist on the other distributed switch that’s used by the compute-a cluster at the primary datacenter location.
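If you'd rather confirm that observation from an exported list of portgroup names than by eyeballing the networking view, here is a minimal Python sketch. It assumes the usual NSX-v naming pattern for logical switch backing portgroups (`vxw-dvs-<dvs id>-virtualwire-…` for local switches and `vxw-dvs-<dvs id>-universalwire-…` for universal ones). The function name and the sample inventory are hypothetical and are not taken from the customer's environment.

```python
import re
from collections import defaultdict

# Assumed NSX-v naming pattern for logical switch backing portgroups:
#   vxw-dvs-<dvs id>-<virtualwire|universalwire>-<n>-sid-<segment id>-<switch name>
PORTGROUP_RE = re.compile(
    r"^vxw-dvs-(?P<dvs>\d+)-(?P<kind>virtualwire|universalwire)-\d+"
    r"-sid-(?P<sid>\d+)-(?P<name>.+)$"
)


def summarize_portgroups(portgroup_names):
    """Group logical switch names by (dvs id, kind); skip non-NSX portgroups."""
    summary = defaultdict(list)
    for pg in portgroup_names:
        match = PORTGROUP_RE.match(pg)
        if match:
            summary[(match["dvs"], match["kind"])].append(match["name"])
    return dict(summary)


# Hypothetical inventory, for illustration only.
sample = [
    "vxw-dvs-26-universalwire-1-sid-900000-Universal Web",
    "vxw-dvs-26-virtualwire-3-sid-5001-Local App",
    "vxw-dvs-45-virtualwire-7-sid-6001-Remote Local",
    "dvpg-management",
]

summary = summarize_portgroups(sample)
for (dvs, kind), names in sorted(summary.items()):
    print(f"dvs-{dvs} {kind}: {names}")
```

A distributed switch that shows `virtualwire` entries but no `universalwire` entries matches the symptom described above: local logical switches are present, but none of the universal ones are.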

What’s Next?

If you are interested, have a look through the information provided above and let me know what you would check or what you think the problem may be! I want to hear your suggestions!

What other information would you need to see? What tests would you run? What do you know is NOT the problem based on the information and observations here?

I will update this post with a link to the solution as soon as it’s completed. Please feel free to leave a comment below or via Twitter (@vswitchzero).

2 thoughts on “NSX Troubleshooting Scenario 4”

  1. The cluster in the remote datacenter has not been added to the universal transport zone; that’s why the new logical switches are not pushed to the ESXi hosts. I hope I am wrong, because I like your resolution method 😀
