NSX 6.3.6 Now Available!

As of March 29th, the long-anticipated NSX 6.3.6 release is available to download from VMware. NSX 6.3.6 (build number 8085122) is a maintenance release that includes a total of 20 documented bug fixes. You can find details on these in the Resolved Issues section of the NSX 6.3.6 release notes.

Aside from bug fixes, there are a couple of interesting changes to note. From the release notes:

“If you have more than one vSphere Distributed Switch, and if VXLAN is configured on one of them, you must connect any Distributed Logical Router interfaces to port groups on that vSphere Distributed Switch. Starting in NSX 6.3.6, this configuration is enforced in the UI and API. In earlier releases, you were not prevented from creating an invalid configuration.”

Since confusion with multiple DVS switches is something I’ve run into with customers in the past, I’m happy to see that this is now being enforced.
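To make the rule in the release notes concrete, here is a minimal sketch of the kind of check that 6.3.6 now enforces. It is not NSX code: the function name, the inventory dictionaries and the switch and port group names are all hypothetical, chosen only to illustrate the idea that every DLR interface should land on a port group belonging to the VXLAN-configured Distributed Switch.

```python
from typing import Dict, List


def misplaced_dlr_interfaces(
    dlr_interfaces: Dict[str, str],
    portgroup_to_vds: Dict[str, str],
    vxlan_vds: str,
) -> List[str]:
    """Return DLR interface names whose port group is NOT on the VXLAN VDS.

    dlr_interfaces:   interface name -> port group name (hypothetical inventory)
    portgroup_to_vds: port group name -> Distributed Switch name
    vxlan_vds:        the switch that VXLAN is configured on
    """
    return sorted(
        name
        for name, portgroup in dlr_interfaces.items()
        if portgroup_to_vds.get(portgroup) != vxlan_vds
    )


# Hypothetical environment with two Distributed Switches.
portgroup_to_vds = {"pg-transit": "dvs-compute", "pg-uplink": "dvs-edge"}
dlr_interfaces = {"transit": "pg-transit", "uplink": "pg-uplink"}

# VXLAN is configured on dvs-compute, so the "uplink" interface is flagged.
print(misplaced_dlr_interfaces(dlr_interfaces, portgroup_to_vds, "dvs-compute"))
```

An empty list means every interface sits on the VXLAN-configured switch, which is the only layout the UI and API accept from 6.3.6 onward.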

Another great addition is an automatic backup function included in 6.3.6. From the public documentation:

“When you upgrade NSX Manager to NSX 6.3.6, a backup is taken and saved locally as part of the upgrade process. You must contact VMware customer support to restore this backup. This automatic backup is intended as a failsafe in case the regular backup fails.”

As part of the upgrade process, a backup file is saved to the local filesystem of the NSX Manager as an extra bit of insurance. It’s important to note, however, that this does not remove the need to take your own backup prior to upgrading. Consider this the backup of last resort in case something goes horribly wrong.

Another point to note is that NSX 6.3.6 still does not support direct upgrades from 6.2.2, 6.2.1 or 6.2.0. See VMware KB 51624 for more information, but don’t try it – it won’t work and you’ll be forced to restore from backup. Upgrading to 6.2.9 before going to 6.3.6 is the correct workaround. I covered this issue in more detail in a recent post.

There are a number of great bug fixes included in 6.3.6 – far too many for me to cover here, but a couple that I’m really happy to see include:

“Fixed Issue 2035026: Network outage of ~40-50 seconds seen on Edge Upgrade. During Edge upgrade, there is an outage of approximately 40-50 seconds. Fixed in 6.3.6”

This one is self-explanatory – not the expected amount of downtime to experience during an edge upgrade, so glad to see it’s been resolved.

“Fixed Issue 2058636: After upgrading to 6.3.5, the routing loop between DLR and ESG’s causes connectivity issues in certain BGP configurations. A routing loop is causing a connectivity issue. Fixed in 6.3.6”

I hope to write a separate post on this one, but in short, some loop prevention code was removed in 6.3.5, and because the AS PATH is stripped with private BGP autonomous systems, this can lead to loops. If you are running iBGP between your DLR and ESGs, this isn’t a problem, but if your AS numbers differ between DLR and ESG, you could run into this. In 6.4.0 a toggle switch was included to avoid stripping the AS PATH, so this is more of an issue in 6.3.5.
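The mechanism is easier to see in a toy model. The sketch below is not how NSX implements BGP; it is a simplified illustration, with AS numbers I made up, of why standard eBGP loop prevention (reject any route whose AS path already contains your own AS) stops working once private AS numbers are stripped from the path. The private ranges used are the ones reserved in RFC 6996.

```python
from typing import List

# Private-use AS number ranges reserved by RFC 6996 (16-bit and 32-bit).
PRIVATE_AS_RANGES = ((64512, 65534), (4200000000, 4294967294))


def is_private_as(asn: int) -> bool:
    """True if the AS number falls in a private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_AS_RANGES)


def advertise(as_path: List[int], sender_as: int, strip_private: bool = False) -> List[int]:
    """AS path as seen by an eBGP neighbor of `sender_as`.

    The sender optionally strips private AS numbers from the received path,
    then prepends its own AS.
    """
    kept = [asn for asn in as_path if not (strip_private and is_private_as(asn))]
    return [sender_as] + kept


def accepts(as_path: List[int], local_as: int) -> bool:
    """Classic eBGP loop prevention: drop routes that already carry our AS."""
    return local_as not in as_path


DLR_AS, ESG_AS = 65001, 65002        # hypothetical, differing private AS numbers

originated = advertise([], DLR_AS)   # the DLR originates a prefix: [65001]

# The route comes back toward the DLR through an ESG.
intact = advertise(originated, ESG_AS)                        # [65002, 65001]
stripped = advertise(originated, ESG_AS, strip_private=True)  # [65002]

print(accepts(intact, DLR_AS))    # DLR sees its own AS and rejects the route
print(accepts(stripped, DLR_AS))  # DLR's AS was stripped, so the route is accepted
```

With the path intact the DLR recognizes its own route and discards it. With the private AS stripped there is nothing left to recognize, so the DLR has to rely on some other safeguard, which is the role the removed loop prevention code played.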

As always, if you are planning to upgrade, be sure to thoroughly go through the release notes. I’d also recommend taking a look through my recent post ‘Ten Tips for a Successful NSX Upgrade’.

NSX 6.4.0 Upgrade Compatibility

Thinking about upgrading to NSX 6.4.0? As I discussed in my recent Ten Tips for a Successful NSX Upgrade post, it’s always a good idea to do your research before upgrading. Along with reading the release notes, checking the VMware Compatibility Matrix is essential.

VMware just updated some of the compatibility matrices to include information about 6.4.0.

From an NSX upgrade path perspective, you’ll be happy to learn that any current build of NSX 6.2.x or 6.3.x should be fine. At the time of writing, this would be 6.2.9 and earlier as well as 6.3.5 and earlier.

NSX upgrade compatibility – screenshot from 1/17/2018.

On a positive note, VMware required a workaround for some older 6.2.x builds to go to 6.3.5, but this is no longer required for 6.4.0. The underlying issue that required this has been resolved.

From a vCenter and ESXi 6.0 and 6.5 perspective, the requirements for NSX 6.4.0 remain largely unchanged from late 6.3.x releases. What you’ll immediately notice is that NSX 6.4.0 is not supported with vSphere 5.5. If you are running vSphere 5.5, you’ll need to get to at least 6.0 U2 before considering NSX 6.4.0.

From the NSX 6.4.0 release notes:

For vSphere 6.0:
Supported: 6.0 Update 2, 6.0 Update 3
Recommended: 6.0 Update 3. vSphere 6.0 Update 3 resolves the issue of duplicate VTEPs in ESXi hosts after rebooting vCenter server. See VMware Knowledge Base article 2144605 for more information.

For vSphere 6.5:
Supported: 6.5a, 6.5 Update 1
Recommended: 6.5 Update 1. vSphere 6.5 Update 1 resolves the issue of EAM failing with OutOfMemory. See VMware Knowledge Base Article 2135378 for more information.

Note: vSphere 5.5 is not supported with NSX 6.4.0.
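If you want to sanity-check an inventory against the list above, a small lookup is enough. This is just a sketch built from the release note excerpt quoted here; the function name and the release strings are my own convention, and the VMware interoperability matrix remains the authoritative source.

```python
# vSphere releases quoted from the NSX 6.4.0 release notes excerpt above.
SUPPORTED = {"6.0 Update 2", "6.0 Update 3", "6.5a", "6.5 Update 1"}
RECOMMENDED = {"6.0 Update 3", "6.5 Update 1"}


def nsx_640_status(vsphere_release: str) -> str:
    """Classify a vSphere release string against the NSX 6.4.0 excerpt."""
    if vsphere_release in RECOMMENDED:
        return "recommended"
    if vsphere_release in SUPPORTED:
        return "supported"
    if vsphere_release.startswith("5.5"):
        return "not supported"
    # Anything else is simply not covered by the excerpt; check the matrix.
    return "not listed"


for release in ("5.5 Update 3", "6.0 Update 2", "6.5 Update 1"):
    print(release, "->", nsx_640_status(release))
```

Anything that comes back as "not listed" is not necessarily unsupported. It only means the excerpt above does not mention it.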

It doesn’t appear that the matrix has been updated yet for other VMware products that interact with NSX, such as vCloud Director.

Before rushing out to upgrade to NSX 6.4.0, be sure to check for compatibility – especially if you are using any third party products. It may be some time before other vendors certify their products for 6.4.0.

Stay tuned for a closer look at some of the new NSX 6.4.0 features!

NSX 6.4 Now Available!

The long anticipated NSX 6.4.0 (build number 7564187) has finally been released. There is a long list of new features that I’ll hopefully cover in more depth in a future post. For now, here are a few highlights:

  • Layer-7 Distributed Firewall – The DFW now supports some layer-7 application context.
  • IDFW – The IDFW now supports user sessions via RDP.
  • HTML5 – Some of the new functionality is done in HTML5. You can see the supported functions here.
  • New Upgrade Coordinator – A single portal that simplifies the upgrade process! More on this later.
  • New HTML5 dashboard – The new NSX default home page.
  • System Scale – A new dashboard that helps you to monitor maximums and supported scale parameters.
  • Single Click Support Bundles – An awesome feature that GSS will appreciate!
  • API Improvements – Now supporting JSON and XML for formatting.
  • NSX Edge Enhancements – Many improvements to BGP, NAT and faster HA failovers.

I think most people will agree that the new L7 firewall is the most exciting new feature. This really takes micro segmentation to the next level and provides features that were only possible with 3rd party add-on products in the past.

I think it’s also important to note that NSX 6.4.0 is not only about new features; it also fixes many bugs found in earlier releases. The release notes list over 33 resolved issues for NSX 6.4.0 alone!

Stay tuned for more info! Martijn Smit at vmguru.com (@smitmartijn) did a great post for those looking for more info on some of the new features!

Check NSX 6.2.x Compatibility Before Upgrading to 6.3.5!

Unlike previous 6.3.x releases, 6.3.5 has some new upgrade minimum version compatibility requirements. This is not only true from a vSphere perspective, but also for the version of NSX 6.2.x you are running. If you are running an older 6.2.0, 6.2.1 or 6.2.2 release of NSX, you’ll need to upgrade to at least 6.2.4 before taking the big step up to 6.3.5. VMware has just updated the NSX Upgrade Matrix to reflect this requirement:

Screenshot taken from the VMware Interoperability Matrix site.

I expect that VMware will update the 6.3.5 release notes and release a new KB article very shortly. I’ll provide some more detail when that is out. In the meantime, please be sure to heed the version requirements or you will most likely run into problems.

Thankfully there aren’t too many customers still running these old releases of 6.2.x, but if you have already attempted the upgrade and hit problems, you’ll need to roll back. If you took a cold-snapshot of the manager or a clone, you can roll back that way. Otherwise, you’ll need to deploy the original 6.2.x OVA again and restore your FTP backup.

** Edit 11/29/2017: VMware has just updated the NSX 6.3.5 release notes to include mention of the minimum version requirements. The following statement was added:

Important: If you are upgrading NSX 6.2.0, 6.2.1, or 6.2.2 to NSX 6.3.5, you must complete a workaround before starting the upgrade. See VMware Knowledge Base article 000051624 for details.

VMware calls it a “workaround” but it’s basically just upgrading to an interim version before going to 6.3.5. In KB 000051624, VMware recommends going to 6.2.9 as that workflow has been tested, i.e. upgrading from 6.2.0 to 6.2.9, and then to 6.3.5. On a positive note, you only need to upgrade your NSX Manager to 6.2.9; no other components need to be upgraded before proceeding on to 6.3.5.
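The interim-hop rule can be written down in a few lines. The sketch below only encodes what the release notes and KB 000051624 say (6.2.0, 6.2.1 and 6.2.2 go through 6.2.9 first); the function and constant names are mine, and it makes no claim about versions the KB doesn’t mention.

```python
from typing import List

# Releases that the 6.3.5 release notes say need the workaround first.
NEEDS_INTERIM_HOP = {"6.2.0", "6.2.1", "6.2.2"}

# Interim NSX Manager version that the KB describes as the tested workflow.
INTERIM_VERSION = "6.2.9"


def manager_upgrade_path(current: str, target: str = "6.3.5") -> List[str]:
    """Return the NSX Manager upgrade sequence from `current` to `target`."""
    if current in NEEDS_INTERIM_HOP:
        return [current, INTERIM_VERSION, target]
    return [current, target]


print(" -> ".join(manager_upgrade_path("6.2.2")))
print(" -> ".join(manager_upgrade_path("6.2.4")))
```

Only the NSX Manager needs to make the stop at 6.2.9, so the extra hop is far less work than a full two-stage upgrade of every component.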

If you attempt an upgrade from 6.2.2 or older releases, my understanding is that the upgrade will appear to be completed successfully, but your configuration will be missing. VMware calls out the remediation steps of rolling back to the previous version should you run into this issue.

New NSX Controller Issue Identified in 6.3.3 and 6.3.4.

Having difficulty deploying NSX controllers in 6.3.3? You are not alone. VMware has just made public a newly discovered bug impacting NSX controllers based on the Photon OS platform. This affects NSX 6.3.3 and 6.3.4. VMware KB 000051144 provides a detailed summary of the symptoms, but essentially:

  • New NSX 6.3.3 Controllers will fail to deploy after November 2nd, 2017.
  • New NSX 6.3.4 Controllers will fail to deploy after January 1st, 2018.
  • Controllers deployed before these dates will prompt for a new password on login.

That said, if you attempted a fresh deployment of NSX 6.3.3 today, you would not be able to deploy a control cluster.

The issue appears to stem from root and admin account credentials expiring 90 days after the creation of the NSX build. This is not 90 days after it’s deployed, but rather 90 days after the build was created by VMware. This is why NSX 6.3.3 will begin having issues after November 2nd and 6.3.4 will be fine until January 1st 2018.
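The date arithmetic is worth spelling out, because the clock starts at build creation and not at deployment. The sketch below assumes the cutoff dates in the KB are exactly 90 days after each build was created; the build creation dates it prints are therefore inferred from that assumption, not dates published by VMware.

```python
from datetime import date, timedelta

# 90-day credential lifetime described above, counted from build creation.
CREDENTIAL_LIFETIME = timedelta(days=90)


def expiry_date(build_created: date) -> date:
    """Date the controller credentials expire. Deployment date plays no part."""
    return build_created + CREDENTIAL_LIFETIME


def implied_build_date(cutoff: date) -> date:
    """Work backwards from a published cutoff to the implied build creation date."""
    return cutoff - CREDENTIAL_LIFETIME


# Cutoff dates from KB 000051144 as summarized above.
cutoffs = {"6.3.3": date(2017, 11, 2), "6.3.4": date(2018, 1, 1)}

for version, cutoff in cutoffs.items():
    print(version, "implied build creation:", implied_build_date(cutoff))
```

Under that assumption the 6.3.3 build would date from early August 2017 and the 6.3.4 build from early October 2017, which lines up with 6.3.4 arriving roughly two months after 6.3.3.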

Some important points:

  • If you have already deployed NSX 6.3.3 or 6.3.4, don’t worry – your controllers will continue to function just fine. Having expired admin/root passwords will not break communication between NSX components.
  • This issue does not pose any kind of datapath impact. It will only pose issues if you attempt a fresh deployment, attempt to upgrade or delete and re-deploy controllers.
  • Until you’ve had a chance to implement the workaround in KB 000051144, you should obviously avoid any of the mentioned workflows.

It appears that VMware will be re-releasing new builds of the existing 6.3.3 and 6.3.4 downloads with the fix in place, along with a fix in 6.3.5 and future releases. They have already added the following text to the 6.3.3 and 6.3.4 release notes:

Important information about NSX 6.3.3: NSX for vSphere 6.3.3 has been repackaged to address the problems mentioned in VMware Knowledge Base articles 2151719 and 000051144. The originally released build 6276725 is replaced with build 7087283. Please refer to the Knowledge Base articles for more detail. See Upgrade Notes for upgrade information.

Old 6.3.3 Build Number: 6276725
New 6.3.3 Build Number: 7087283

Old 6.3.4 Build Number: 6845891
New 6.3.4 Build Number: 7087695
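If you keep several downloaded bundles around, a quick lookup against the build numbers above tells you which ones predate the repackaging. This is only a convenience sketch; the function name and return labels are my own.

```python
# Build numbers listed above for the original and repackaged downloads.
BUILDS = {
    "6.3.3": {"original": 6276725, "repackaged": 7087283},
    "6.3.4": {"original": 6845891, "repackaged": 7087695},
}


def classify_build(version: str, build: int) -> str:
    """Return 'original', 'repackaged', or an 'unknown ...' label for a build."""
    known = BUILDS.get(version)
    if known is None:
        return "unknown version"
    for label, number in known.items():
        if build == number:
            return label
    return "unknown build"


print(classify_build("6.3.3", 6276725))  # original build, affected by the issue
print(classify_build("6.3.4", 7087695))  # repackaged build with the fix
```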

As an added bonus, VMware took advantage of this situation to include the fix for the NSX controller disconnect issue in 6.3.3 as well. This other issue is described in VMware KB 2151719. Despite what it says in the 6.3.4 release notes, only 6.3.3 was susceptible to the issue outlined in KB 2151719.

If you’ve already found yourself in this predicament, VMware has provided an API call that can be used as a workaround. The API call appears to correct the issue by setting the appropriate accounts to never expire. If the password has already expired, it’ll reset it. It’s then up to you to change the password. Detailed steps can be found in KB 000051144.

It’s unfortunate that another controller issue has surfaced after the controller disconnect issue discovered in 6.3.3. Whenever there is a major change like the introduction of a new underlying OS platform, these things can clearly be missed. Thankfully the impact to existing deployments is more of an inconvenience than a serious problem. Kudos to the VMware engineering team for working so quickly to get these fixes and workarounds released!

NSX 6.2.9 Now Available for Download!

Although NSX 6.3.x is getting more time in the spotlight, VMware continues to patch and maintain the 6.2.x release branch. On October 26th, VMware made NSX for vSphere 6.2.9 (Build Number 6926419) available for download.

This is a full patch release, not a minor maintenance release like 6.2.6 and 6.3.4 were. VMware documents a total of 26 fixed issues in the release notes. Some of these are pretty significant, covering everything from DFW to EAM, and there are even some host PSOD fixes. Definitely have a look through the resolved issues section of the release notes for more detail.

On a personal note, I’m really happy to see NSX continue to mature and become more and more stable over time. Working in the support organization, I can confidently say that many of the problems we used to see often are just not around any more – especially with host preparation and the control plane. The pace at which patch releases for NSX come out is pretty quick, and some may argue that it is difficult to keep up with. I think this is just something that must be expected when you are working with state-of-the-art technology like NSX. That said, kudos to VMware Engineering for the quick turnaround on many of these identified issues!

NSX 6.3.4 Now Available!

On Friday October 13th, VMware released NSX for vSphere 6.3.4. You may be surprised to see another 6.3.x version only two months after the release of 6.3.3. Unlike the usual build updates, 6.3.4 is a maintenance release containing only a small number of fixes for problems identified in 6.3.3. This is very similar to the 6.2.6 maintenance release that came out shortly after 6.2.5.

As always, the relevant detail can be found in the 6.3.4 Release Notes. You can also find the 6.3.4 upgrade bundle at the VMware NSX Download Page.

In the Resolved Issues section of the release notes, VMware outlines only three separate fixes that 6.3.4 addresses.

Resolved Issues

I’ll provide a bit of additional commentary around each of the resolved issues in 6.3.4:

Fixed Issue 1970527: ARP fails to resolve for VMs when Logical Distributed Router ARP table crosses 5K limit

This first problem was actually a regression in 6.3.3. In a previous release, the ARP table limit was increased to 20K, but in 6.3.3 it regressed back to the previous limit of 5K. To be honest, not many customers have deployments at the scale where this would be a problem. A small number of very large deployments may see issues in 6.3.3.

Fixed Issue 1961105: Hardware VTEP connection goes down upon controller reboot. A BufferOverFlow exception is seen when certain hardware VTEP configurations are pushed from the NSX Manager to the NSX Controller. This overflow issue prevents the NSX Controller from getting a complete hardware gateway configuration. Fixed in 6.3.4.

This buffer overflow issue could potentially cause datapath issues. Thankfully, not very many NSX designs include the use of Hardware VTEPs, but if yours does and you are running 6.3.3, it would be a good idea to consider upgrading to 6.3.4.

The final issue, and the one most likely to impact customers, is listed third in the release notes:

Fixed Issue 1955855: Controller API could fail due to cleanup of API server reference files. Upon cleanup of required files, workflows such as traceflow and central CLI will fail. If external events disrupt the persistent TCP connections between NSX Manager and controller, NSX Manager will lose the ability to make API connections to controllers, and the UI will display the controllers as disconnected. There is no datapath impact. Fixed in 6.3.4.

I discussed this issue in more detail in a recent blog post. You can also find more information on this issue in VMware KB 2151719. In a nutshell, the communication channel between NSX Manager and the NSX Control cluster can become disrupted due to files being periodically purged by a cleanup maintenance script. Usually, you wouldn’t notice until the connection needed to be re-established after a network outage or an NSX manager reboot. Thankfully, as VMware mentions, there is no datapath impact and a simple workaround exists. Despite being more of an annoyance than a serious problem, the vast majority of NSX users running 6.3.3 are likely to hit this at one time or another.

My Opinion and Upgrade Recommendations

The third issue in the release notes described in VMware KB 2151719 is likely the most disruptive to the majority of NSX users. That said, I really don’t think it’s critical enough to have to drop everything and upgrade immediately. The workaround of restarting the controller API service is relatively simple and there should be no resulting datapath impact.

The other two issues described are not likely to be encountered in the vast majority of NSX deployments, but are potentially more serious. Unless you are really pushing the scale limits or are using Hardware VTEPs, there is likely little reason to be concerned.

I certainly think that VMware did the right thing to patch these identified problems as quickly as possible. For new greenfield deployments, I think there is no question that 6.3.4 is the way to go. For those already running 6.3.3, it’s certainly not a bad idea to upgrade, but you may want to consider holding out for 6.3.5, which should include a much larger number of fixes.

On a positive note, if you do decide to upgrade, there are likely some components that will not need to be upgraded. Because there are only a small number of fixes relating to the control plane and logical switching, ESGs, DLRs and Guest Introspection will likely not have any code changes. You’ll also benefit from not having to reboot ESXi hosts for VIB patches thanks to changes in the 6.3.x upgrade process. Once I have a chance to go through the upgrade in my lab, I’ll report back on this.

Running 6.3.3 today? Let me know what your plans are!