Category Archives: NSX

Using the Upgrade Coordinator in NSX 6.4

If you’ve ever gone through an NSX upgrade, you know how many components there are to upgrade. You’ve got your NSX manager appliances, controller cluster, ESXi host VIBs, edges, DLRs and even guest introspection appliances. In the past, every one of these needed to be upgraded independently and in the correct order.

VMware hopes to make this process a lot more straightforward with the release of the new ‘Upgrade Coordinator’ feature. This is now included as of 6.4.0 in the HTML5 client.
The aim of the upgrade coordinator is to create an upgrade plan or checklist and then to execute it in the correct order. Many aspects of the upgrade plan can be customized, but for those looking for maximum automation, a single-click upgrade option exists as well.
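The idea of a plan that is built once and then executed in a fixed order can be sketched in a few lines of Python. This is only an illustration and not how the upgrade coordinator is implemented: the component names come from the list earlier in this post, the order is an assumption based on the usual NSX upgrade sequence, and `UpgradePlan` and its methods are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed order for illustration. NSX Manager is not listed because it is
# upgraded beforehand through its own UI, as described later in this post.
DEFAULT_ORDER = [
    "Controller cluster",
    "ESXi host clusters (VIBs)",
    "Edges and DLRs",
    "Guest introspection",
]


@dataclass
class UpgradePlan:
    """Hypothetical checklist: pending steps in order, plus completed steps."""
    steps: list = field(default_factory=lambda: list(DEFAULT_ORDER))
    done: list = field(default_factory=list)

    def skip(self, name):
        """Customize the plan by removing a step that has not run yet."""
        self.steps.remove(name)

    def next_step(self):
        """Return the next pending step, or None when the plan is finished."""
        return self.steps[0] if self.steps else None

    def complete_next(self):
        """Mark the next pending step as done, preserving the order."""
        step = self.steps.pop(0)
        self.done.append(step)
        return step

    def run_all(self):
        """The 'single click' path: run every remaining step in order."""
        while self.steps:
            self.complete_next()
        return self.done
```

Calling `skip()` before `run_all()` mirrors a customized plan, while calling `run_all()` on an untouched plan mirrors the one-click option.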

It is important to note that although the upgrade coordinator helps to take some of the guesswork out of upgrading, there are still tasks and planning you’ll want to do ahead of time. If you haven’t already, please read my Ten Tips for a Successful NSX Upgrade post.

Today I’ll be using the upgrade coordinator to go from 6.3.3 to 6.4.0 and walk you through the process.

Upgrading NSX Manager

Although the upgrade coordinator plan covers numerous NSX components, NSX manager is not one of them. You’ll still need to use the good old manager UI upgrade process as described on page 36 of the NSX 6.4 upgrade guide. Thankfully, this is the easiest part of the upgrade.

You’ll also notice that I can use the upgrade coordinator for my lab upgrade even though I’m at a 6.3.x release currently. This is because the NSX manager is upgraded first, adding this management plane functionality to be used for the rest of the upgrade.

Note: If you are using a Cross-vCenter deployment of NSX, be sure to upgrade your primary, followed by all secondary managers before proceeding with the rest of the upgrade.

upgco-1

Upgrading NSX Manager to 6.4.x should look very familiar as the process really hasn’t changed. Be sure to heed the warning banner about taking a backup before proceeding. For more info on this, please see my Ten Tips for a Successful NSX Upgrade post.

Continue reading

Missing Labels in the HTML5 Plugin with NSX 6.4.

If you recently upgraded to NSX 6.4, you are probably anxious to check out the new HTML5 plugin. VMware added some limited functionality in HTML5, including the new dashboard, upgrade coordinator as well as packet capture and support bundle collection tools. After upgrading NSX manager, you may notice that the plugin does not look the way it should. Many labels are missing. Rather than seeing tab titles like ‘Overview’ and ‘System Scale’ you see ‘dashboard.button.label.overview’ and ‘dashboard.button.label.systemScale’:

html5labels-1

Obviously, things aren’t displaying as they should be, and some views – like the upgrade coordinator – are practically unusable:

html5labels-2

Continue reading

NSX Troubleshooting Scenario 10 – Solution

Welcome to the tenth installment of a new series of NSX troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of the scenario. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

As we saw in the first half, our fictional administrator was attempting to configure an ESG load balancer for both TCP and UDP port 514 traffic. Below is the high-level topology:

tshoot10a-1

One of the first things to keep in mind when troubleshooting the NSX load balancer is the mode in which it’s operating. In this case, we know the customer is using a one-armed load balancer. The tell-tale sign is that the ESG sits in the same VLAN as the pool members with a single interface. Also, the pool members do not have the ESG configured as their default gateway.

We also know based on the screenshots in the first half that the load balancer is not operating in ‘Transparent’ mode – so traffic to the pool members should appear as though it’s coming from the load balancer virtual IP, not from the actual syslog clients. The packet capture the customer did proves that this is actually not the case.

That said, how exactly does an NSX one-armed load balancer work?

As traffic comes in on one of the interfaces and ports configured as a ‘virtual server’, the load balancer will simply forward the traffic to one of the pool members based on the load balancing algorithm configured. In our case, it’s a simple ‘round robin’ rotation of the pool members per session/socket. But forwarding alone would imply that the syslog servers would see traffic coming from the originating source IP of the syslog client. This would cause a fundamental problem with asymmetry when the pool member needs to reply. When it does, the traffic would bypass the ESG and be sent directly back to the client. This would be fine with UDP, which is connectionless, but what about TCP?
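To make the forwarding behaviour above concrete, here is a small Python model of round-robin member selection with and without transparent mode. It is a simplified sketch and not the ESG's actual implementation: `Packet`, `OneArmedLB` and `reply_returns_via_lb` are hypothetical names, every IP address except the VLAN 15 subnet is made up for illustration, and the rewritten source address is modelled as the virtual IP, following the description in this post.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str  # "tcp" or "udp"


class OneArmedLB:
    """Toy model of a one-armed load balancer with a single virtual server."""

    def __init__(self, vip, pool, transparent=False):
        self.vip = vip
        self.transparent = transparent
        self._members = cycle(pool)  # simple round-robin rotation

    def forward(self, pkt):
        """Send a client packet addressed to the VIP on to a pool member."""
        if pkt.dst != self.vip:
            raise ValueError("packet is not addressed to the virtual server")
        member = next(self._members)
        # Non-transparent mode hides the client behind the load balancer;
        # transparent mode leaves the client's source address untouched.
        src = pkt.src if self.transparent else self.vip
        return Packet(src=src, dst=member, proto=pkt.proto)


def reply_returns_via_lb(forwarded, vip):
    """A pool member replies to whatever source address it saw."""
    return forwarded.src == vip


# Hypothetical addresses: a VIP and two pool members in VLAN 15, one client.
lb = OneArmedLB(vip="172.16.15.10", pool=["172.16.15.21", "172.16.15.22"])
client_pkt = Packet(src="172.16.10.50", dst="172.16.15.10", proto="tcp")
first = lb.forward(client_pkt)
second = lb.forward(client_pkt)
```

In this model, `first` and `second` land on different pool members and both carry the VIP as their source, so replies go back through the load balancer. Constructing the object with `transparent=True` keeps the client address as the source, and `reply_returns_via_lb` then returns `False`, which is the asymmetric return path described above.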

Continue reading

NSX Troubleshooting Scenario 10

Welcome to the tenth installment of my NSX troubleshooting series – a milestone number for the one-year anniversary of vswitchzero.com. I wasn’t sure how many of these I’d write, but I’ve gotten lots of positive feedback so if I can keep thinking of scenarios, I’ll keep going!

What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

I’ll try to include some questions as well for educational purposes in each post.

The Scenario

As always, we’ll start with a brief problem statement:

“I’m using an ESG load balancer to send syslog traffic to a pool of two Linux servers. I can only seem to get UDP syslog traffic to arrive at the pool members. TCP based syslog traffic doesn’t work. I’m using a one-armed load balancer. If I do a packet capture, all I see is the UDP traffic but it’s not coming from the load balancer”

Using the NSX load balancer services for syslog purposes is not at all uncommon. We see this frequently with products like Splunk as well as others. Since syslog traffic can be very heavy, this is a good use case.

When it comes to troubleshooting NSX load balancer issues, triple checking the configuration is key. In speaking with the customer, this is his desired outcome:

  • One-armed load balancer in VLAN 15.
  • No routing done by the edge. Default gateway configuration only and a single interface for simplicity.
  • Transparency is not required – the source IP can be the load balancer as the required source information is in the syslog data transmitted.
  • A mix of both TCP and UDP port 514 traffic is to be load balanced.

Here is a basic, high-level topology provided by the customer:

tshoot10a-1

The one-armed load balancer called esg-lb1 is sitting in VLAN 15. Its default gateway is the SVI interface of the physical switch (172.16.15.1). There is only one hop between the ESXi hosts – the syslog clients – and the ESG in VLAN 15. Because this is a one-armed topology, the syslog-a1 and syslog-a2 servers are using the same switch SVI as their default gateway.

Continue reading

Blank Error While Adding NSX DLR or ESG Interfaces

I recently deployed NSX 6.3.2 in my home lab to do some testing. After deploying a DLR, I went back in to add some additional interfaces and was greeted by a ‘blank’ or null error message. Having run into this problem before, I thought it may be a good idea to give some additional context to VMware KB 2151309.

dlrblankerror-1

As you can see above, there is no text associated with the error. There are no problems with the IP or mask I used, and it doesn’t seem clear why this would be failing.

You would expect to find more detail in the NSX Manager vsm.log file, but interestingly there is nothing there at all for this exception. That’s because this isn’t an NSX fault, but rather something in the vSphere Web Client.

Continue reading

NSX 6.4.1 Now Available!

On May 24th, VMware released NSX 6.4.1 – the first version of NSX to support vSphere 6.7. This is undoubtedly exciting news for those who have been waiting to upgrade their vSphere deployment. Although 6.4.1 sounds like a minor release, there are a slew of UI and usability enhancements as well as context-aware firewall improvements. There has also been some additional functionality introduced into the HTML5 client, which is very welcome news.

You’ll also notice in 6.4.1 that the service composer canvas view has been removed. This was a bit of an iconic overview page for service composer in the UI, but was not terribly useful and didn’t scale at all to large deployments with many security policies. I honestly don’t think anyone will be missing it.

On top of these enhancements, VMware engineering has been busy with bug fixes. NSX 6.4.1 includes 23 documented fixes across all areas of the product. A couple of notable ones include:

  • Fixed Issue 2035026: Network outage of around 40-50 seconds seen on Edge Upgrade
  • Fixed Issue 1971683: NSX Manager logs false duplicate IP message
  • Fixed Issue 2092730: NSX Edge stops responding with /var/log partition at 100% disk usage
  • Fixed Issue 1809387: Support for Weak Secure transport protocol – TLS v1.0 removed

You can find the complete list in the resolved issues section of the NSX 6.4.1 release notes.

Planning to upgrade? Remember to check the NSX upgrade matrix. Those running 6.2.0, 6.2.1 or 6.2.2 will need to refer to KB 51624 before upgrading. Have a look at my ‘Ten Tips for a Successful NSX Upgrade’ post for ways to ensure your upgrade is successful.
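If you track versions in a script or inventory, a quick check like the following can flag the releases called out above. This is a minimal sketch: `parse_version` and `needs_kb_51624` are hypothetical helper names, and the check only encodes the three versions named in this post. It does not replace the upgrade matrix.

```python
def parse_version(text):
    """Turn '6.2.1' into (6, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


# Releases this post singles out as needing KB 51624 before upgrading.
NEEDS_KB_51624 = {parse_version(v) for v in ("6.2.0", "6.2.1", "6.2.2")}


def needs_kb_51624(current_version):
    """Return True if the running version is one of the flagged releases."""
    return parse_version(current_version) in NEEDS_KB_51624
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes once a component reaches two digits.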

Relevant Links for NSX 6.4.1 (Build 8599035):

NSX Troubleshooting Scenario 9 – Solution

Welcome to the ninth installment of a new series of NSX troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of scenario nine. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

As we saw in the first half, our fictional administrator was unable to install the NSX VIBs on the cluster called compute-a:

tshoot9a-1

We also saw that there were two different NSX licenses added to vCenter. One called ‘Endpoint’ and the other ‘Enterprise’.

tshoot9b-1

You can see that the ‘Usage’ for both licenses is currently “0 CPUs”, but that’s because NSX hasn’t been installed on any ESXi hosts yet to consume any. What’s most telling, however, is the small grey exclamation mark on the license icon. If I hover over this, I get a message stating:

“The license is not assigned. To comply with the EULA, assign the license to at least one asset.”

Continue reading