Author Archives: Mike

Certificate Error During Datastore Upload

I have recently rebuilt my home lab – an all too common occurrence due to the number of times I intentionally try to break things. In the process of rebuilding, I had some ISO files I wanted to copy over to a datastore. The process failed and the Web Client greeted me with an uncharacteristically long error message.

dsupload-1

The exact text reads:

“The operation failed for an undetermined reason. Typically, this problem occurs due to certificates that the browser does not trust. If you are using self-signed or custom certificates, open the URL below in a new browser tab and accept the certificate, then retry the operation.”

In my case, the URL that it listed was to one of my ESXi hosts in the compute-a cluster called esx-a2. The error then goes on to reference VMware KB 2147256.

It may seem odd that the vSphere Client would be telling you to visit a random ESXi host’s UI address when you are trying to upload a file via vCenter. But if you stop to think about it for a second, vCenter has no access whatsoever to your datastores. Whether you are trying to create a new VMFS datastore, upload a file or even just browse, vCenter must rely on an ESXi host with the necessary access to do the actual legwork. That ESXi host then relays the information back through the Web Client.
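Since the browser ends up talking to the ESXi host directly, one quick way to sanity-check the diagnosis is to see whether that host's certificate validates against a local trust store. Below is a minimal sketch using only the Python standard library. The function name and the hostname in the comment are placeholders of my own, and Python's trust store is not necessarily the one your browser uses, so treat the result as a hint rather than proof.

```python
import socket
import ssl


def check_host_cert(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return 'trusted', 'untrusted', 'tls-error' or 'unreachable'."""
    # The default context verifies both the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "trusted"
    except ssl.SSLCertVerificationError:
        # Self-signed or otherwise unverifiable certificate.
        return "untrusted"
    except ssl.SSLError:
        # Some other TLS failure (protocol mismatch, handshake error, ...).
        return "tls-error"
    except OSError:
        # DNS failure, refused connection, timeout and similar.
        return "unreachable"


# Example with a placeholder hostname:
# print(check_host_cert("esx-a2.lab.local"))
```

A result of 'untrusted' for a host with a self-signed certificate would be consistent with the error above.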

Continue reading

NSX Troubleshooting Scenario 9 – Solution

Welcome to the ninth installment of a new series of NSX troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of scenario nine. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

As we saw in the first half, our fictional administrator was unable to install the NSX VIBs on the cluster called compute-a:

tshoot9a-1

We also saw that there were two different NSX licenses added to vCenter: one called ‘Endpoint’ and the other ‘Enterprise’.

tshoot9b-1

You can see that the ‘Usage’ for both licenses is currently “0 CPUs”, but that’s because NSX hasn’t been installed on any ESXi hosts yet, so no licenses are being consumed. What’s most telling, however, is the small grey exclamation mark on the license icon. If I hover over it, I get a message stating:

“The license is not assigned. To comply with the EULA, assign the license to at least one asset.”

Continue reading

NSX Troubleshooting Scenario 9

Welcome to the ninth installment of my NSX troubleshooting series. What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

The Scenario

As always, we’ll start with a brief problem statement:

“We’re in the process of deploying NSX. We were able to deploy the NSX Manager and Control Cluster, but every time we try to install the VIBs on the host, it fails with a licensing error. We have already added the license for NSX Enterprise in vCenter!”

Every time the customer tries to prepare cluster compute-a, they get the following error:

tshoot9a-1

The exact error is:

“Operation is not allowed by the applied NSX license.”

Looking in the most obvious spot, we can see that the customer had indeed added a license for ‘NSX for vSphere – Enterprise’. Not only that, but there is also an ‘NSX for vShield Endpoint’ license.

Continue reading

5.25″ Floppy Drive Alignment

About a year ago, I bought a dusty old Panasonic WU-475 1.2MB 5.25” floppy drive from someone on Kijiji. It was being sold as-is, but for the price I decided to give it a go. To my surprise, it seemed to work initially, but within a few minutes it began to emit a horrid clanging and grinding noise. After opening the drive up, it was clear that the stepper motor had completely seized up.

After applying some lubricant to the rail and cleaning the drive out, the motor was again functional. Thinking it would be good to go, I installed it and tested it out again. Excitement quickly turned to disappointment, however, when I discovered that the drive could no longer read any of my 5.25” floppies. After troubleshooting for a while, I discovered that if I formatted a disk using the drive, it could be read/written just fine. It was only diskettes from other sources that wouldn’t work. This behavior seemed to indicate that the drive somehow went out of alignment during my disassembly and cleaning.

I didn’t know much about floppy alignment aside from the fact that some specialized equipment that I didn’t have would be needed to correct the problem. Generally an oscilloscope is used to take readings during sector reads and then fine adjustments are made until the waveform looks correct. This was the suggested method I discovered in the Panasonic service guide for the WU-475.

Discouraged, I shelved the drive and let it sit for the better part of a year. Fast forward to May 5th – the 26th anniversary of the classic PC game Wolfenstein 3D. It was time to do something retro. I really wanted to get this drive working again, so I did some more research on the subject. That’s when I came across an old thread at the Vintage Computer Forum. A commenter named Rick discussed a great piece of software called ImageDisk by Dave Dunfield. Because I had some brand new 1.2MB IBM formatted diskettes that had never been used or formatted by another drive, I could use these as a reference point and make the necessary adjustments. At any rate, it was certainly worth a try!

Every drive is different, but the WU-475 has a pair of screws that hold the stepper motor in position. The screw openings are slotted rather than perfectly round, which allows the mechanism to be slid back and forth a millimeter or so in each direction.

Firing up ImageDisk and running the alignment test, I was initially greeted by lots of question marks scrolling down the screen, indicating that each sector could not be read. As I loosened the screws and slid the mechanism forward slowly, the PC speaker sprang to life and began to beep, indicating successful reads. Once I had it in the position that seemed to yield the best results, I stepped through all 80 tracks to ensure they could all be read. I then tightened the screws well and lo and behold, the drive works wonderfully again! I’m sure my alignment isn’t perfect, but for all intents and purposes, the drive works.

It’s always a great feeling when you can restore something old and forgotten. As always, do this at your own risk. Making adjustments like this on a live system is inherently risky, so be careful!

Memory Usage Alarm with PCI Passthrough VMs

In the recent revamp of my lab environment, I decided to use VT-d passthrough for a pfSense VM. It has been working well with the integrated Intel igb-based NICs on my management host, but I noticed that I started getting memory alarms on the VM.

vtd-mem-0

At first, I thought I may have sized the VM a bit too small with only 512MB of RAM, but when checking in the guest itself, I saw only a small amount was actually being used:

vtd-mem-2

At only 19% utilized, I’m nowhere near the 95% required to trigger this alarm. As you can see in the performance charts, all of the memory is being used by the guest from the perspective of ESXi:

vtd-mem-1

But after thinking about this for a moment, it makes sense – one of the requirements for PCI passthrough is to reserve all guest memory. For passthrough to function, the hypervisor must provide 100% consistent and reliable memory to the guest. What better way to ensure that than to reserve and pin all memory to the VM?
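To make the mismatch concrete, here is a small back-of-the-envelope sketch in Python. The 512MB size, the 19% guest figure and the 95% trigger come from above. The function names are mine, and treating the alarm as a simple "consumed divided by configured" comparison is an assumption for illustration, not a statement of how vCenter computes it internally.

```python
ALARM_THRESHOLD_PCT = 95.0  # trigger level mentioned above


def usage_pct(used_mb: float, configured_mb: float) -> float:
    """Memory usage as a percentage of the VM's configured memory."""
    return 100.0 * used_mb / configured_mb


def alarm_fires(pct: float, threshold: float = ALARM_THRESHOLD_PCT) -> bool:
    """Assumed comparison: the alarm fires at or above the threshold."""
    return pct >= threshold


configured_mb = 512.0
guest_used_mb = 0.19 * configured_mb  # what the guest OS reports (about 19%)
host_consumed_mb = configured_mb      # full reservation: everything is pinned

guest_view = usage_pct(guest_used_mb, configured_mb)
host_view = usage_pct(host_consumed_mb, configured_mb)

print(f"guest view: {guest_view:.0f}% -> alarm: {alarm_fires(guest_view)}")
print(f"host view:  {host_view:.0f}% -> alarm: {alarm_fires(host_view)}")
```

The two views describe the same VM. The alarm only sees the second one, which is why it fires even though the guest is nearly idle.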

Although I understand why all memory is active and consumed, it’s unfortunate that vCenter doesn’t take into consideration the reason for this. In my search for an answer, I came across VMware KB 2149787. It appears that this can impact not only VMs with passthrough, but also fault tolerant VMs and VMs with latency sensitivity set to ‘high’. Unfortunately, the resolution suggested is to disable the virtual machine memory alarm at the vCenter object level. This effectively disables the alarm for everything in the inventory. I hope that at some point, vSphere will allow disabling specific alarms on a per-VM basis because few people would want to take this approach.

For now, I think the best course of action is to simply click ‘Reset to Green’, which should clear the alarm until the VM is powered off/on again. Just keep in mind that this is normal for this type of VM and that the alarm can be disregarded.

NSX Troubleshooting Scenario 8 – Solution

Welcome to the eighth installment of a new series of NSX troubleshooting scenarios. Thanks to everyone who took the time to comment on the first half of scenario eight. Today I’ll be performing some troubleshooting and will show how I came to the solution.

Please see the first half for more detail on the problem symptoms and some scoping.

Getting Started

In the first half of scenario 8, we saw that our fictional administrator was getting an error message while trying to deploy the first of three controller nodes.

The exact error was:

“Waiting for NSX controller ready controller-1 failed in deployment – Timeout on waiting for controller ready.”

Unfortunately, this doesn’t tell us a whole lot aside from the fact that the manager was waiting and eventually gave up.

tshoot8a-7

Now, before we begin troubleshooting, we should first think about the normal process for controller deployment. What exactly happens behind the scenes?

  1. The necessary inputs are provided via the vSphere Client or REST API (i.e. deployment information like datastore, IP pool, etc.).
  2. NSX Manager then deploys a controller OVF template that is stored on its local filesystem. It does this using vSphere API calls via its inventory tie-in with vCenter Server.
  3. Once the OVF template is deployed, it will be powered on.
  4. During initial power on, the machine will receive an IP address, either via DHCP or via the pool assignment.
  5. Once the controller node has booted, NSX Manager will begin to push the necessary configuration information to it via REST API calls.
  6. Once the controller node is up and is able to serve requests and communicate with NSX Manager, the deployment is considered successful and the status in the UI changes from ‘Deploying’ to ‘Connected’.
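The "Timeout on waiting for controller ready" message fits the last step: something keeps checking whether the node is ready and gives up after a deadline. The general pattern can be sketched in Python as below. To be clear, this is not NSX Manager's actual code; every name, the polling interval and the readiness probe are hypothetical and only illustrate the wait-with-timeout idea.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed.

    Returns True if the probe succeeded, False if the deadline passed first.
    clock and sleep are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The useful takeaway for troubleshooting is that a timeout like this only says the readiness check never passed. Any of the earlier steps (deployment, power on, IP assignment, configuration push) could be the one that actually failed.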

Let’s have a look at the NSX Manager logging to see if we can get more information:

Continue reading

NSX Troubleshooting Scenario 8

Welcome to the eighth installment of my NSX troubleshooting series. What I hope to do in these posts is share some of the common issues I run across from day to day. Each scenario will be a two-part post. The first will be an outline of the symptoms and problem statement along with bits of information from the environment. The second will be the solution, including the troubleshooting and investigation I did to get there.

The Scenario

As always, we’ll start with a brief problem statement:

“I’m doing a new greenfield deployment of NSX and my control cluster is failing to deploy. It seems stuck at ‘Deploying’ and then after a long period of time, it gives me a failure and the appliance gets deleted.”

Let’s have a look and see what this fictional administrator is seeing:

tshoot8a-2

We can see that they’ve successfully deployed NSX Manager at version 6.3.2 and have no controllers successfully deployed yet.

tshoot8a-3

A valid-looking IP pool has been created for the controllers with all the pertinent IP settings populated. The controller deployment is being done with the following settings:

Continue reading