Having difficulty deploying NSX controllers in 6.3.3? You are not alone. VMware has just made public a newly discovered bug impacting NSX controllers based on the Photon OS platform. This includes NSX 6.3.3 and 6.3.4. VMware KB 000051144 provides a detailed summary of the symptoms, but essentially:
- New NSX 6.3.3 Controllers will fail to deploy after November 2nd, 2017.
- New NSX 6.3.4 Controllers will fail to deploy after January 1st, 2018.
- Controllers deployed before this date will be prompting for a new password on login attempt.
That said, if you attempted a fresh deployment of NSX 6.3.3 today, you would not be able to deploy a control cluster.
The issue appears to stem from root and admin account credentials expiring 90 days after the creation of the NSX build. This is not 90 days after it’s deployed, but rather 90 days after the build was created by VMware. This is why NSX 6.3.3 will begin having issues after November 2nd and 6.3.4 will be fine until January 1st 2018.
Some important points:
- If you have already deployed NSX 6.3.3 or 6.3.4, don’t worry – your controllers will continue to function just fine. Having expired admin/root passwords will not break communication between NSX components.
- This issue does not pose any kind of datapath impact. It will only pose issues if you attempt a fresh deployment, attempt to upgrade or delete and re-deploy controllers.
- Until you’ve had a chance to implement the workaround in KB 000051144, you should obviously avoid any of the mentioned workflows.
It appears that VMware will be re-releasing new builds of the existing 6.3.3 and 6.3.4 downloads with the fix in place, along with a fix in 6.3.5 and future releases. They have already added the following text to the 6.3.3 and 6.3.4 release notes:
Important information about NSX 6.3.3: NSX for vSphere 6.3.3 has been repackaged to address the problems mentioned in VMware Knowledge Base articles 2151719 and 000051144. The originally released build 6276725 is replaced with build 7087283. Please refer to the Knowledge Base articles for more detail. See Upgrade Notes for upgrade information.
Old 6.3.3 Build Number: 6276725
New 6.3.3 Build Number: 7087283
Old 6.3.4 Build Number: 6845891
New 6.3.4 Build Number: 7087695
As an added bonus, VMware took advantage of this situation to include the fix for the NSX controller disconnect issue in 6.3.3 as well. This other issue is described in VMware KB 2151719. Despite what it says in the 6.3.4 release notes, only 6.3.3 was susceptable to the issue outlined in KB 2151719.
If you’ve already found yourself in this predicament, VMware has provided an API call that can be used as a workaround. The API call appears to correct the issue by setting the appropriate accounts to never expire. If the password has already expired, it’ll reset it. It’s then up to you to change the password. Detailed steps can be found in KB 000051144.
It’s unfortunate that another controller issue has surfaced after the controller disconnect issue discovered in 6.3.3. Whenever there is a major change like the introduction of a new underlying OS platform, these things can clearly be missed. Thankfully the impact to existing deployments is more of an inconvenience than a serious problem. Kudos to the VMware engineering team for working so quickly to get these fixes and workarounds released!