NSX-T Troubleshooting Scenario 2 – Solution

Welcome to the second installment of a new series of NSX-T troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of the scenario. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

As we saw in the first half, our fictional customer was having northbound communication problems because the physical core router was not receiving any of the NSX-advertised routes:

vyos@router-core:~$ sh ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
B - BGP, > - selected route, * - FIB route

S>* [1/0] via, eth0.1
C>* is directly connected, eth0.2005
C>* is directly connected, lo
C>* is directly connected, eth0.1
C>* is directly connected, eth0.11
C>* is directly connected, eth0.76
C>* is directly connected, eth0.98
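A quick way to confirm that no BGP routes are present in output like this is to count the route lines carrying the B code. The following is a minimal sketch, not part of the original troubleshooting: the function name count_bgp_routes is hypothetical, and it assumes that BGP route lines start with B and contain "via", which also keeps the "B - BGP" legend line from being counted.

```shell
#!/bin/sh
# Count BGP-learned routes in "show ip route" output read from stdin.
# Route lines start with the protocol code (B for BGP). The legend line
# "B - BGP, ..." also starts with B, so "via" is required as well.
count_bgp_routes() {
    awk '/^B/ && /via/ { n++ } END { print n + 0 }'
}

# Sample copied from the routing table above (addresses are elided there).
count_bgp_routes <<'EOF'
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
B - BGP, > - selected route, * - FIB route

S>* [1/0] via, eth0.1
C>* is directly connected, eth0.2005
C>* is directly connected, lo
EOF
```

For the table above this prints 0, which matches the symptom: only static and connected routes are present on the core router.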

Based on what we observed in the first half, we can make a few assertions:

  1. The T1 routers are advertising their routes just fine to the T0 (a total of 8 routes).
  2. The T0 router is peering with the core router successfully because we received BGP routes from the core router.
  3. The T0 router is configured for route redistribution of NSX connected and Static routes.

Let’s just run through a couple of quick tests to confirm point one above and make sure that the T0 can communicate with the core router. From VRF 2 (the T0 SR), we’ll check the interface IP first:


NSX-T Troubleshooting Scenario 2

Welcome to the second NSX-T troubleshooting scenario! What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

The Scenario

As always, we’ll start with a fictional customer problem statement:

“I’ve just deployed a new NSX-T 2.3.1 environment with two tenants. The T1 routers (one per tenant) appear to be working fine. I have VM to VM connectivity on logical switches, but I can’t get to any northbound networks. The non-NSX core router isn’t getting any of the NSX routes!”

Taking a quick look at the environment, we can see that each tenant T1 router has several logical switches attached. Each is advertising four subnets as can be seen below:


You can also see that the ‘Advertise All NSX Connected Routes’ option is enabled, which should cause these routes to be advertised to the T0.


On the T0, we can see that there are ‘Linked Ports’ to both T1 routers, as well as a VLAN-backed logical switch for northbound communication via edge-e1. Let’s start by ensuring that these routes are actually making it to the T0 SR.

From the edge CLI, I start by listing all logical router instances to determine the VRF for the T0 SR:


NSX-T Troubleshooting Scenario 1 – Solution

Welcome to the first installment of a new series of NSX-T troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of the scenario. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

As we saw in the first half, the installation of the NSX-T VIBs was failing with the following error:


At first glance, it looked as if the NSX-T VIBs, or an older version of them, were already installed. Taking a closer look at the actual VIB names, however, was very telling. The ‘esx-nsxv’ in the name denotes that these belong to NSX for vSphere.

Logging in to host esx-a3 via SSH and checking for installed VIBs with ‘nsx’ in the name came back with the following:

[root@esx-a3:~] esxcli software vib list |grep nsx
esx-nsxv                       6.5.0-0.0.8590012                     VMware      VMwareCertified   2018-08-31

Indeed, the NSX-V VIBs are still installed. Having a look at the environment, we saw that all other traces of NSX-V were gone: the manager, controllers, vmkernel ports, portgroups and Web Client plugin were missing. For some reason, only these lingering VIBs had not been removed from the three hosts. It’s important to properly remove NSX to prevent issues like this from occurring.
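When several hosts need checking, the same test can be wrapped in a small filter. This is a minimal sketch under stated assumptions: the function name find_nsxv_vibs is hypothetical, and it relies only on the observation above that NSX for vSphere VIB names begin with ‘esx-nsxv’.

```shell
#!/bin/sh
# Print the name and version of any NSX-V (NSX for vSphere) VIBs found in
# "esxcli software vib list" output supplied on stdin.
find_nsxv_vibs() {
    awk '$1 ~ /^esx-nsxv/ { print $1, $2 }'
}

# Sample line copied from the host output above.
find_nsxv_vibs <<'EOF'
esx-nsxv                       6.5.0-0.0.8590012                     VMware      VMwareCertified   2018-08-31
EOF
```

On a host, the input would come from the live command instead, for example: esxcli software vib list | find_nsxv_vibs. Empty output means no NSX-V VIBs were found.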

Removing the NSX-V VIBs

The first order of business was to put the hosts in maintenance mode. No VMs were running on them yet, so I just went ahead and put all three in maintenance mode:


Once that was done, I could remove the VIBs using the following esxcli software vib command:
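The sequence of maintenance mode followed by VIB removal can be sketched as below. This is a hedged illustration rather than the exact command used in this scenario: it assumes the VIB name esx-nsxv from the listing above, and the run helper and DRY_RUN variable are hypothetical names introduced here. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# DRY_RUN defaults to 1, so the commands are only printed.
# Set DRY_RUN=0 on a real ESXi host to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_nsxv_vib() {
    # Enter maintenance mode first, then remove the VIB by name.
    # The VIB name is an assumption taken from the listing above.
    run esxcli system maintenanceMode set --enable true &&
    run esxcli software vib remove --vibname=esx-nsxv
}

remove_nsxv_vib
```

Printing the commands first makes it easy to review them before touching a host.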


Manual Installation of NSX-T Kernel Modules in ESXi

Last week, I discussed the manual deployment of NSX-T controller nodes. Today, I’ll take a look at adding standalone ESXi hosts.

Although people usually associate manual deployment with KVM hypervisors, there is no reason you can’t do the same with ESXi hosts. Obviously, automating this process with vCenter Server as a compute manager has its advantages, but one of the empowering features of NSX-T is that it has no dependency on vCenter Server whatsoever.

Obtaining the ESXi VIBs

First, we’ll need to download the ESXi host VIBs. In my case, the hosts are running ESXi 6.5 U2, so I downloaded the correct 6.5 VIBs from the NSX-T download site.


Once I had obtained the ZIP file, I used WinSCP to copy it to the /tmp location on my ESXi host. The file is only a few megabytes in size so it can go just about anywhere. If you’ve got a lot of hosts to do, putting it in a shared datastore makes sense.

Installing the ESXi VIBs

Because the NSX-T kernel module consists of a number of VIBs, we need to install it as an ‘offline depot’ as opposed to individual VIB files. That said, there is no need to extract the ZIP file. To install it, I used the esxcli software vib install command as shown below:

[root@esx-a3:/tmp] esxcli software vib install --depot=/tmp/nsx-lcp-
Installation Result
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed: VMware_bootbank_epsec-mux_6.5.0esx65-9272189, VMware_bootbank_nsx-aggservice_2., VMware_bootbank_nsx-cli-libs_2., VMware_bootbank_nsx-common-libs_2., VMware_bootbank_nsx-da_2., VMware_bootbank_nsx-esx-datapath_2., VMware_bootbank_nsx-exporter_2., VMware_bootbank_nsx-host_2., VMware_bootbank_nsx-metrics-libs_2., VMware_bootbank_nsx-mpa_2., VMware_bootbank_nsx-nestdb-libs_2., VMware_bootbank_nsx-nestdb_2., VMware_bootbank_nsx-netcpa_2., VMware_bootbank_nsx-opsagent_2., VMware_bootbank_nsx-platform-client_2., VMware_bootbank_nsx-profiling-libs_2., VMware_bootbank_nsx-proxy_2., VMware_bootbank_nsx-python-gevent_1.1.0-9273114, VMware_bootbank_nsx-python-greenlet_0.4.9-9272996, VMware_bootbank_nsx-python-logging_2., VMware_bootbank_nsx-python-protobuf_2.6.1-9273048, VMware_bootbank_nsx-rpc-libs_2., VMware_bootbank_nsx-sfhc_2., VMware_bootbank_nsx-shared-libs_2., VMware_bootbank_nsxcli_2.
   VIBs Removed:
   VIBs Skipped:

Remember, your host will need to be in maintenance mode for the installation to succeed. Once finished, a total of 24 new NSX VIBs were installed as shown (the epsec-mux VIB from the install output does not match the grep filter):

[root@esx-a3:/tmp] esxcli software vib list |grep -i nsx
nsx-aggservice                       VMware      VMwareCertified   2019-02-15
nsx-cli-libs                         VMware      VMwareCertified   2019-02-15
nsx-common-libs                      VMware      VMwareCertified   2019-02-15
nsx-da                               VMware      VMwareCertified   2019-02-15
nsx-esx-datapath                     VMware      VMwareCertified   2019-02-15
nsx-exporter                         VMware      VMwareCertified   2019-02-15
nsx-host                             VMware      VMwareCertified   2019-02-15
nsx-metrics-libs                     VMware      VMwareCertified   2019-02-15
nsx-mpa                              VMware      VMwareCertified   2019-02-15
nsx-nestdb-libs                      VMware      VMwareCertified   2019-02-15
nsx-nestdb                           VMware      VMwareCertified   2019-02-15
nsx-netcpa                           VMware      VMwareCertified   2019-02-15
nsx-opsagent                         VMware      VMwareCertified   2019-02-15
nsx-platform-client                  VMware      VMwareCertified   2019-02-15
nsx-profiling-libs                   VMware      VMwareCertified   2019-02-15
nsx-proxy                            VMware      VMwareCertified   2019-02-15
nsx-python-gevent              1.1.0-9273114                         VMware      VMwareCertified   2019-02-15
nsx-python-greenlet            0.4.9-9272996                         VMware      VMwareCertified   2019-02-15
nsx-python-logging                   VMware      VMwareCertified   2019-02-15
nsx-python-protobuf            2.6.1-9273048                         VMware      VMwareCertified   2019-02-15
nsx-rpc-libs                         VMware      VMwareCertified   2019-02-15
nsx-sfhc                             VMware      VMwareCertified   2019-02-15
nsx-shared-libs                      VMware      VMwareCertified   2019-02-15
nsxcli                               VMware      VMwareCertified   2019-02-15
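The install output above lists 25 VIBs, while the grep shows 24, because epsec-mux has no ‘nsx’ in its name. The counting can be sketched as follows. This is a minimal illustration, assuming the ‘VIBs Installed:’ line is a comma-separated list as shown above. Both function names are hypothetical, and the sample line is shortened to three entries taken from that output.

```shell
#!/bin/sh
# Count all entries on the "VIBs Installed:" line of esxcli install output.
count_installed_vibs() {
    awk -F': ' '/VIBs Installed:/ { print split($2, vibs, ", ") }'
}

# Count only entries whose name contains "nsx", mirroring "grep -i nsx".
count_installed_nsx_vibs() {
    awk -F': ' '/VIBs Installed:/ {
        n = split($2, vibs, ", ")
        c = 0
        for (i = 1; i <= n; i++) if (tolower(vibs[i]) ~ /nsx/) c++
        print c
    }'
}

# Shortened sample: three of the entries from the install output above.
sample='   VIBs Installed: VMware_bootbank_epsec-mux_6.5.0esx65-9272189, VMware_bootbank_nsx-python-gevent_1.1.0-9273114, VMware_bootbank_nsx-python-greenlet_0.4.9-9272996'
printf '%s\n' "$sample" | count_installed_vibs
printf '%s\n' "$sample" | count_installed_nsx_vibs
```

For the shortened sample this prints 3 and then 2, since epsec-mux is the one entry excluded by the name filter.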

You can find information on the purpose of some of these VIBs in the NSX-T documentation.

Connecting the ESXi Host to the Management Plane

Now that we have the required software installed, we need to connect the ESXi host to NSX Manager. To begin, we’ll need to get the certificate thumbprint from the NSX Manager:

nsxmanager> get certificate api thumbprint

Next, we need to drop into the nsxcli shell from the ESXi CLI prompt, and then run the join management-plane command as shown below:

[root@esx-a3] # nsxcli
esx-a3> join management-plane username admin thumbprint ccdbda93573cd1dbec386b620db52d5275c4a76a5120087a174d00d4508c1493
Password for API user: ********
Node successfully registered as Fabric Node: 0b08c694-3155-11e9-8a6c-0f1235732823
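A mistyped or truncated thumbprint is an easy way to make the join fail, so it can help to sanity-check the string before pasting it. The sketch below assumes the thumbprint is a 64-character hexadecimal string, as in the example above (which is consistent with a SHA-256 digest). The function name is hypothetical.

```shell
#!/bin/sh
# Return success if the argument is exactly 64 hexadecimal characters.
is_wellformed_thumbprint() {
    printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{64}$'
}

# Thumbprint copied from the join command above.
tp=ccdbda93573cd1dbec386b620db52d5275c4a76a5120087a174d00d4508c1493
if is_wellformed_thumbprint "$tp"; then
    echo "thumbprint looks well-formed"
else
    echo "thumbprint is malformed"
fi
```

This only checks the format. It does not verify that the thumbprint actually belongs to your NSX Manager.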

If all went well, we should now see our NSX Manager listed as connected:

esx-a3> get managers
-      Connected

From the root prompt of the ESXi host, we can see that there are now established TCP connections to the NSX Manager appliance on the RabbitMQ port 5671.

[root@esx-a3:/tmp] esxcli network ip connection list |grep 5671
tcp         0       0    ESTABLISHED     84232  newreno  mpa
tcp         0       0    ESTABLISHED     84232  newreno  mpa
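To script this check, the established connections to port 5671 can be counted. This is a minimal sketch: the function name count_mpa_connections is hypothetical, and because the addresses are not shown in the output above, the sample lines use made-up addresses from the documentation range 192.0.2.0/24.

```shell
#!/bin/sh
# Count ESTABLISHED connections involving TCP port 5671 in
# "esxcli network ip connection list" output read from stdin.
count_mpa_connections() {
    awk '/:5671 / && /ESTABLISHED/ { n++ } END { print n + 0 }'
}

# Hypothetical sample lines; the addresses and local ports are invented.
count_mpa_connections <<'EOF'
tcp         0       0  192.0.2.10:45510  192.0.2.20:5671  ESTABLISHED     84232  newreno  mpa
tcp         0       0  192.0.2.10:45512  192.0.2.20:5671  ESTABLISHED     84232  newreno  mpa
EOF
```

On a host, the input would be piped from esxcli network ip connection list. A result of 0 suggests the host has no established session to the manager on that port.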

From the NSX UI, we can now see the host appear as connected under ‘Standalone Hosts’:


As a next step, you’ll want to add this new host as a transport node and you should be good to go.

It’s great to have the flexibility to do this completely without the assistance of vCenter Server. Anyone who has had to deal with the quirks of VC integration and ESX Agent Manager (EAM) in NSX-V will certainly appreciate this.