NSX Troubleshooting Scenario 14 – Solution

Welcome to the fourteenth installment of a new series of NSX troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of the scenario. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

In the first half, our fictional customer was trying to prevent a specific summary route from being advertised to a DLR appliance using a BGP filter. Every time they added the filter, all connectivity to VMs downstream from that DLR was lost.

tshoot14a-4

The filter appears correct. The summary route is a /21 network that comprises all eight /24s that were assigned to logical switches. You can also see that GE and LE (greater than/less than) values were not specified, so the specific summary route should be matched exactly.

tshoot14a-5

After publishing the changes, we saw that all BGP routes were removed from the DLR. It’s almost as if the filter stopped ALL route prefixes from making it to the DLR rather than just the one specified. Wait, did it?

Let’s refer to the NSX documentation on BGP filters. Under the Configure BGP section, the relevant steps are the following:

Continue reading “NSX Troubleshooting Scenario 14 – Solution”

NSX Troubleshooting Scenario 14

Welcome to the fourteenth installment of my NSX troubleshooting series. What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

The Scenario

As always, we’ll start with a brief problem statement:

“I’m trying to prevent some specific BGP routes from being advertised to my DLR, but the route filters aren’t working properly. Every time I try to do this, I get an outage to everything behind the DLR!”

Let’s have a quick look at what this fictional customer is trying to do with BGP.

tshoot14a-0

The design is simple – a single ESG peered with a single DLR appliance. The /21 address space assigned to this environment has been split out into eight /24 networks.

tshoot14a-3

The mercury-esg1 appliance has two neighbors configured – the physical router (172.18.0.1) and the southbound DLR protocol address (172.18.8.4). Both the ESG and DLR are in the same AS (iBGP).

tshoot14a-1

As you can see, on mercury-esg1 a summary static route has been created with the DLR forwarding address as the next hop. This /21 summarizes all eight /24 subnets that will be assigned to the logical switches in this environment. Because the customer wants more specific /24 BGP routes to be advertised by the DLR, this is what is referred to as a floating static route. Because it’s less specific, it’ll only take effect as a backup should BGP peering go down. This is a common design consideration and provides a bit of extra insurance should the DLR appliance go down unexpectedly.

Continue reading “NSX Troubleshooting Scenario 14”