Preventing MikroTik RouterOS Bridge MAC Changes

I noticed that my Windows 10 VM kept becoming inaccessible after shutting down and powering up my lab. As part of the power down process, I was cutting power to the MikroTik CRS309 switch, which is the default gateway for the VMs in VLAN 1. After opening a console to the Windows VM I discovered that the network discovery feature was detecting a new network and prompting whether it was trusted. This discovery is based upon the MAC address of the default gateway – sure enough it seemed to be changing after each power up of the CRS309.

After doing some research it seems that this is expected behaviour. The bridge MAC is auto-selected based on one of the bridge ports at boot-up. Because of this, there is a good chance the MAC will change after each boot. If I left my switch up 24/7, this wouldn’t be a problem, but since I don’t, I need keep things consistent.

The MikroTik wiki mentions two options – admin-mac and auto-mac. These two can be used to force the bridge to use a static MAC address. I just selected the current auto-generated MAC to use for this purpose.

You can get the current bridge configuration using the following command:

[admin@MikroTik] > /interface bridge print
Flags: X - disabled, R - running
0 R name="bridge1" mtu=auto actual-mtu=9000 l2mtu=10218 arp=enabled arp-timeout=auto mac-address=4E:A4:50:F4:E7:1C protocol-mode=rstp fast-forward=yes igmp-snooping=no auto-mac=yes ageing-time=5m priority=0x8000 max-message-age=20s
forward-delay=15s transmit-hold-count=6 vlan-filtering=yes ether-type=0x8100 pvid=1 frame-types=admit-all ingress-filtering=no dhcp-snooping=no

I then forced a specific MAC address using the following:

[admin@MikroTik] > /interface bridge set bridge1 admin-mac=4e:a4:50:f4:e7:1C auto-mac=no

And you can validate that the change stuck using the print command again:

[admin@MikroTik] > /interface bridge print
Flags: X - disabled, R - running
0 R name="bridge1" mtu=auto actual-mtu=9000 l2mtu=10218 arp=enabled arp-timeout=auto mac-address=4E:A4:50:F4:E7:1C protocol-mode=rstp fast-forward=yes igmp-snooping=no auto-mac=no admin-mac=4E:A4:50:F4:E7:1C ageing-time=5m
priority=0x8000 max-message-age=20s forward-delay=15s transmit-hold-count=6 vlan-filtering=yes ether-type=0x8100 pvid=1 frame-types=admit-all ingress-filtering=no dhcp-snooping=no
[admin@MikroTik] >

This seemed to have the desired effect. The MAC address is now consistent at every boot up!

Overheating NVMe Flash Drives

I recently deployed an all-NVMe based vSAN configuration in my home lab. I’ll be posting more information on my setup soon, but I decided to use OEM Samsung based SSDs. I’ve got 256GB SM961 MLC based drives for my cache tier, and larger 1TB enterprise-grade PM953s for capacity. These drives are plenty quick for vSAN and can be had for great prices on eBay if you know where to look.

nvmeheat-1
The Samsung Polaris based SM961 is similar to the 960 Pro and well suited for vSAN caching.

Being OEM drives, they don’t have any heatsinks and are pretty bare. As I started running some performance tests using synthetic tools like Crystal Disk Mark and ATTO, I began to see instability. My guest running the test would completely hang after a few minutes of testing and I’d be forced to reboot the ESXi host to recover.

Looking through the logs, it became clear what had happened:

2019-08-16T15:43:26.083Z cpu0:2341677)nvme:AsyncEventReportComplete:3050:Smart health event: Temperature above threshold
2019-08-16T15:43:26.087Z cpu9:2097671)nvme:NvmeExc_ExceptionHandlerTask:317:Critical warnings detected in smart log [2], failing controller
2019-08-16T15:43:26.087Z cpu9:2097671)nvme:NvmeExc_RegisterForEvents:370:Async event registration requested while controller is in Health Degraded state.

One of my nvme drives had overheated! The second time I tried the test, I watched more closely.

Sure enough, it wasn’t the older PM953s overheating, but the newer Polaris based SM961 cache drives. As soon as the heavy writes started, the drive’s temperature steadily increased until it approached 70’C. The moment it hit 70, the guest hung. Looking more closely in ESXi, I could see that the drive completely disappeared. I.e. it was no longer listed as a NVMe device or HBA in the system. It appears that this is safety measure to stop the controller from cooking itself to the point of permanent damage. Since I had no idea it was running so hot, I’d say I’m thankful for this feature – but none the less, I’d have to figure out some way to keep these drives cooler.

ESXi has a limited implementation of SMART monitoring and can pull a few specific metrics. Thankfully, drive temperature is one of them. First, I needed to get the t10 identifier for my nvme drives:

[root@esx-e1:~] esxcli storage core device list |grep SAMSUNG
t10.NVMe____SAMSUNG_MZVPW256HEGL2D000H1______________6628B171C9382499
Display Name: Local NVMe Disk (t10.NVMe____SAMSUNG_MZVPW256HEGL2D000H1______________6628B171C9382499)
Devfs Path: /vmfs/devices/disks/t10.NVMe____SAMSUNG_MZVPW256HEGL2D000H1______________6628B171C9382499
Model: SAMSUNG MZVPW256
t10.NVMe____SAMSUNG_MZ1LV960HCJH2D000MU______________1505216B24382888
Display Name: Local NVMe Disk (t10.NVMe____SAMSUNG_MZ1LV960HCJH2D000MU______________1505216B24382888)
Devfs Path: /vmfs/devices/disks/t10.NVMe____SAMSUNG_MZ1LV960HCJH2D000MU______________1505216B24382888
Model: SAMSUNG MZ1LV960

Running a four second refresh interval using ‘watch’ is a useful way to monitor the drive under stress.

[root@esx-e1:~] watch -n 4 "esxcli storage core device smart get -d t10.NVMe____SAMSUNG_MZVPW256HEGL2D000H1______________6628B171C9382499"
Parameter Value Threshold Worst
---------------------------- ----- --------- -----
Health Status OK N/A N/A
Media Wearout Indicator N/A N/A N/A
Write Error Count N/A N/A N/A
Read Error Count N/A N/A N/A
Power-on Hours 974 N/A N/A
Power Cycle Count 62 N/A N/A
Reallocated Sector Count 0 95 N/A
Raw Read Error Rate N/A N/A N/A
Drive Temperature 35 70 N/A
Driver Rated Max Temperature N/A N/A N/A
Write Sectors TOT Count N/A N/A N/A
Read Sectors TOT Count N/A N/A N/A
Initial Bad Block Count N/A N/A N/A

As you can see, the maximum temperature is listed as 70’C. This isn’t a suggestion as I’ve come to learn the hard way.

To get things cooler I decided to move my fans around in my Antec VSK4000 cases. My lab is geared toward silence more than cooling so the airflow near the PCIe slots is pretty poor. I’ve now got a 120mm fan on the side-panel cooling the slots directly. This benefits my Solarflare 10Gbps NICs as well, which can get quite toasty. This helped significantly, but if I leave a synthetic test running long enough, it will eventually get to 70’C again. Clearly, I’ll need to add passive heatsinks to the SM961s if I want to keep them cool in these systems.

Realistically, it’s only synthetic and very heavy write tests that seem to get the temperature climbing to those levels. It’s unlikely that day-to-day use would cause a problem. None the less, I’m going to look into heatsinks for the drives. They can be had for $5-10 on Amazon, so it seems like a small investment for some extra peace of mind.

The morale of the story – keep an eye on your NVMe controller temps!

MikroTik CRS309 Configuration – Part 2

Configuring inter-VLAN routing on the MikroTik CRS309-1G-8S+IN and checking the routing table.

Welcome to part two of a series on configuring the MikroTik CRS309-1G-8S+IN for use in a VMware home lab. In part one, I discussed RouterOS and how to get VLAN trunking configured for all ports from the CLI. Today I’ll be configuring inter-VLAN routing. Keep in mind that I’m using the CRS309, but this should be applicable to the popular CRS317 and other CRS3xx switches running RouterOS as well.

As I mentioned in my unboxing and overview post, the CRS309 is a fully featured L3 switch. It uses the Marvell Prestera DX switching chip with a forwarding ASIC for line-rate L2 performance at very low power consumption levels. This is great for a home lab, because your non-routed vSAN, iSCSI and vMotion kernel traffic can take advantage of full 10Gbps line rate throughput. To use the L3 features, however, the CRS3xx switches rely on the integrated 32-bit ARM CPU cores in the Prestera DX. This means that things like inter-VLAN routing, firewalling and other L3/L4 features are done in software. MikroTik rates the switch for around 2.5Gbps for anything that will need to be processed by the CPU cores. None the less, this is still plenty of routing throughput for a home lab and I’m really happy that they include a huge number of these features even if they are somewhat performance limited.

VLAN Interface Creation

In the part one I relied on the CLI to get started. Now that we have a management IP configured, we can continue in the UI. To get inter-VLAN routing working on the CRS3xx series switches, there are two steps – first, we’ll need to create VLAN interfaces with the correct VLAN ID specified. Second, we need to assign an IP address to these interfaces.
If you recall in part one, I created a VLAN interface in VLAN 1 to get management connectivity to the switch again. That interface is visible in the UI from the Interfaces/VLAN tab:

crs309-intervlan-1

As a next step, I’ll create VLAN interfaces for the other five VLANs (11,76,98,2005 and 3156). Click the ‘Add New’ button to create a new one. I’ll start with VLAN 11 as an example.

crs309-intervlan-2

In my case there were only three options I needed to change – name, VLAN ID and Interface. I like to give my VLAN interfaces the names INT_VLANX for consistency. Be sure to assign the VLAN interface to the bridge that you created previously so that it is accessible from all ports on the switch. I repeated this for all of my VLANs.

Continue reading “MikroTik CRS309 Configuration – Part 2”

MikroTik CRS309 Configuration – Part 1

The first in a multi-part series on configuring the CRS309 for a VMware home lab. Part 1 covers 802.1q VLAN trunking.

This will be the first of several posts on configuring the new MikroTik CRS309-1G-8S+IN for use in a VMware home lab. Today’s first post will introduce MikroTik’s RouterOS and will focus on VLAN trunking. Please note that although this was written for the CRS309, this should be applicable to all CRS3xx series switches, including the popular CRS317.

RouterOS vs SwOS

MikroTik’s RouterOS firmware that comes pre-loaded on the switch is very feature rich with loads of L3 functionality. It has everything from BGP/OSPF, to firewalling, NAT, DHCP services – you name it. It’s a multi-purpose firmware that was originally created for MikroTik’s routers. Today, it works with many of their other products, including switches, and wireless devices. Because of this, it feels more like router firmware adapted for use with switches as opposed to something designed especially for them. Configuring RouterOS on the CRS309/CRS317 can be challenging and there is a learning curve to overcome – even for those with a lot of networking experience. Things get easier when you learn the terminology and understand what does and does not apply to a switch within RouterOS.

The CRS309 also includes ‘SwOS’ – a straight-forward L2 firmware that is designed especially for their switches. I found SwOS’s UI to be very intuitive and a lot more like what I’d expect to see on a switch. If you are not looking to use any of the advanced L3 features that RouterOS offers, I’d highly recommend using SwOS instead. There is no need to flash back and forth from RouterOS to SwOS and vice-versa. Both firmware packages coexist in the flash memory and you can easily switch back and forth as required.

Although SwOS could suffice for my needs in the home lab, I’m not one to shy away from a challenge and really want to be able to take advantage of some of the L3 features. If I could get them to work, I could potentially retire one or two virtual machines that I currently use for routing and firewalling.

802.1q VLAN Trunking

One of the first things I tried to do with the CRS309 was get a bunch of VLANs created and all the interfaces configured as 802.1q VLAN trunk ports. This is probably the most common VMware home lab requirement with 10Gbps networking. Each host will have a limited number of interfaces, so VLAN trunks are critical to separate out the various services.

When you get into the RouterOS WebFig interface, you’ll find several places where VLANs can be configured. My first attempt was under ‘Switch’ and ‘VLANs’ – sounds logical enough:

crs309-vlans1

But clearly this doesn’t work. There is also a section for VLANs under ‘Interfaces’, but this is for VLAN interface creation, which I’ll get into later. One of the first things to keep in mind with RouterOS is that a lot of the configuration relating to the switch – including VLAN configuration – is done in the ‘Bridge’ section – not the Switch section.

From the CRS3xx series manual:

“Since RouterOS v6.41 bridges provide VLAN aware Layer2 forwarding and VLAN tag modifications within the bridge. This set of features makes bridge operation more like a traditional Ethernet switch and allows to overcome Spanning Tree compatibility issues compared to configuration when tunnel-like VLAN interfaces are bridged..”

Another important thing to remember is that any VLAN related configuration you do in the bridge does not come into effect until an option called vlan-filtering is set to ‘yes’ for the bridge itself.

Continue reading “MikroTik CRS309 Configuration – Part 1”

MikroTik CRS309-1G-8S+IN

Initial impressions on a very energy efficient 10Gbps SFP+ switch and an excellent choice for a VMware home lab!

A little over a year ago I decided to get my lab 10Gbps capable. At the time, the Quanta LB6M seemed like the obvious choice with 24x SFP+ interfaces, several 1Gbps copper interfaces and a price tag of only $300 USD. It worked well in my lab, but as you can imagine, older 10Gbps technology is far from energy efficient. The switch idled at close to 130W, which wasn’t far off from all three of my compute hosts put together. It was also very loud with seven high-RPM 40mm fans. Even when at their lowest setting, these chassis and PSU fans were painfully loud. There is no doubt that this switch is more at home in a datacenter than a home lab. It wasn’t long before I started keeping an eye out for newer offerings based on more efficient and cost-effective technologies. That’s when I came across MikroTik’s latest 3xx series “Cloud Router Switches”.

The MikroTik CRS309-1G-8S+IN

MikroTik is a technology firm based out of Latvia and is well known for their unique networking products. They sell a range of switches, routers, wireless products and even their caseless RouterBOARD systems for those interested in doing custom routers.

I was originally looking to purchase the CRS309’s bigger brother – the CRS317. Feature-wise they are very similar, but the CRS317 doubles up on SFP+ and 1Gbps copper interfaces. With two SFP+ ports on each of my four ESXi hosts, the CRS309’s eight SFP+ ports seemed to be sufficient. The feature that sold me on the CRS309 as opposed to the CRS317, however, was its completely passive/fanless design. The CRS317’s fans only spin up if the unit gets hot, but I liked the simplicity of a completely fanless solution.

Feature-wise, the CRS309 is a pretty impressive switch. Its state of the art Marvell Prestera 98DX8208 switching chip (98DX8216 in the CRS317) is what makes this such an efficient unit. The 98DX8208 is highly integrated and is good for line rate forwarding on all the SFP+ ports. It also includes an integrated dual core 32-bit ARM based processor running 800MHz. You can find more information on the Marvell Prestera here. The flash storage is a pretty spartan 16MB, but this seems to be plenty for the RouterOS and SwOS firmware to coexist on the switch simultaneously. I’ll get more into the CRS309’s software in a future post.

From a L2 perspective, it’s a perfectly capable unit. It can do line-rate L1 and L2 forwarding at about 81,000Mbps aggregate, or 162,000 Mbps full-duplex. This is because the 98DX8208 has an ASIC switching component and all L2 forwarding is done in hardware. The L3 features of the switch, however, are all done in software and must be processed by the integrated ARM CPU cores. Based on MikroTik’s test results, about 2.5Gbps can be expected when traffic has to go through the CPU. Your mileage may vary though depending on the features you use. This may not sound great, but in a home lab environment, you don’t really need high throughput for inter-VLAN routing or other L3 features. I’m most interested in having 10Gbps line rate throughput for host-to-host VSAN traffic, iSCSI, vMotion etc. If you really do need faster routing, you could always spin up a VyOS VM and use your hypervisor’s horsepower for that purpose. Given the $269 USD price of the unit, I think it’s awesome that you get a full suite of L3 features even if throughput is somewhat limited.

You can find more information and the specifications of the CRS309 at Mikrotik’s site.

Continue reading “MikroTik CRS309-1G-8S+IN”

Noctua NH-U9DX i4 and NH-D9DX i4 3U Heatsink Review

There are many different approaches you can take when building a VMware home lab, but I always prefer to do custom-build tower systems. This allows me to select the components I want – the fans, heatsinks, PSU – to keep the noise to a minimum. Even though I keep the lab in the basement, I really don’t like to hear it running. I don’t need the density that a rack provides, and I’m quite happy to have my nearly silent towers humming along on a wire shelf instead. Not to mention that lower noise usually equates to lower power consumption and of course, it keeps my wife happy as well – the most critical metric of all.

I finally got around to upgrading my three ancient compute nodes recently. I have been using the included Dynatron R13 1U copper heatsinks that came with the motherboards I bought on eBay. They work, and keep the CPUs relatively cool, but as you can imagine, noise was not a key consideration in their design. They are as compact and as efficient as possible at only an inch tall. At idle, the Supermicro X9SRL-F keeps them at a low RPM, but put any load on the systems and you’re quickly at 7500RPM and the noise is pretty unbearable. No problem for a datacenter, but not for a home lab.

Today I’ll be taking a bit of a departure from my regular posts and will be doing an in-depth review of two high-end Noctua heatsinks. Noctua was kind enough to send me a review sample of not one, but two of their Xeon heatsinks – the NH-U9DX i4 and the NH-D9DX i4 3U.

Noctua

noctua_200x200pxNoctua is an Austrian company well known for their low noise fans and high-end heatsinks. I’ve been using Noctua heatsinks for ages. In fact, I reviewed some of their original heatsinks and fans many years ago when I used to write hardware reviews. This included their original NH-U12P, the NH-C12P and the smaller NH-U9B. Back then, I praised them for their high-quality construction, near silent operation, excellent mounting hardware and most importantly – excellent cooling performance. That was over ten years ago, and it seems that Noctua is still very well respected for all the same reasons today.

Their gear has always been pricey compared to the competition, but when it comes to Noctua, you get what you pay for.

Square vs Narrow ILM LGA 2011

One of the challenges I had with my new/used Supermicro X9SRL-F boards is that they don’t use the common ‘square’ LGA2011 mounting pattern. Instead, they use what’s referred to as ‘Narrow ILM’ that allows the socket to be more rectangular in shape. This allows more real estate for memory slots and other components on the board. Because of this, I was severely limited in my heatsink choices. Most of what’s available out there for Narrow ILM are rack mount heatsinks similar to what I’m hoping to remove. The majority of the consumer-grade stuff for LGA2011 simply won’t fit.

noctuai4-24

Here you can see the Dynatron R13 narrow ILM heatsink mounted:

Continue reading “Noctua NH-U9DX i4 and NH-D9DX i4 3U Heatsink Review”

Low-Voltage DDR3 – Performance, Heat and Power

When I built my new compute nodes, I chose PC3L-12800R DIMMs that could run in both standard 1.5V and 1.35V low-voltage modes. What I didn’t realize, however, is that Intel’s specification for 1.35V registered memory on Socket R based systems limits low-voltage operation to 1333MHz. The Supermicro X9SRL-F boards I’m using enforce this, despite the modules being capable of the full 1600MHz at 1.35V.

This meant I could run the modules at 1333MHz and enjoy the power/heat reduction that goes with 1.35V operation, or I could force the modules to run at 1600MHz at the ‘standard’ 1.5 volts. As I considered different configuration options, a lot of questions came to mind. Today I’ll be taking an in-depth look at DDR3 memory bandwidth, latency, power consumption and heat. I hope to answer the following questions in this post:

  1. What is the bandwidth difference between 1333MHz with tighter 9-9-9 timings versus 1600MHz at looser 11-11-11 timings?
  2. What is the latency difference between 1333MHz with tighter 9-9-9 timings versus 1600MHz at looser 11-11-11 timings?
  3. Just how much power savings can be realized at 1.35V versus 1.5V?
  4. How much cooler will my DIMMs run at 1.35V versus 1.5V?
  5. What about overclocking my DIMMs to 1867MHz at 12-12-12 timings?
  6. How does memory bandwidth and latency fair with fewer than 4 DIMMs populated on the SandyBridge-EP platform?

A Note On Overclocking

Although there are IvyBridge-EP CPUs that support higher 1866MHz DIMM operation, my SandyBridge-EP based E5-2670s and PC3L-12800 DIMMs do not officially support this. The Supermicro X9SRL-F allows the forcing of 1866MHz at 12-12-12 timings even though the DIMMs don’t have this speed in their SPD tables. Despite being server-grade hardware, this is an ‘overclock’ plain and simple.

I would never consider doing this in a production environment, but in a lab for educational purposes – it was worth a shot. Thankfully, to my surprise, the system appears completely stable at 1866MHz and 1.5V. Because the DIMMs can technically run at 1600MHz with a lower 1.35V, it would seem logical that at 1.5V they’d have additional headroom available. This certainly appears to be true in my case – even with a mixture of three different brands of PC3L-12800R.

With my overclocking success, I decided to include results for 1866MHz operation at 1.5V to see if it’s worth pushing the DIMMs beyond their rated specification.

Bandwidth and Latency

Back in my days as an avid overclocker, increasing memory frequency would generally provide better overall performance than tightening the CAS, RAS and other timings of the modules. Obviously doing both was ideal but if you had to choose between higher frequency and tighter timings, you’d usually come out ahead by sacrificing tighter timings for higher frequencies.

I fully expected the 1600MHz memory running at 11-11-11 timings to be quicker overall than 1333MHz at 9-9-9, but the question was just how much quicker. To test, I booted one of my X9SRL-F systems from a USB stick running Lubuntu 18.04 and used Intel’s handy ‘Memory Latency Checker’ (MLC) tool. MLC allows you to get several performance metrics including peak bandwidth, latency as well as various cross-socket NUMA measurements that can be handy for multi-socket processor systems. I’ll only be looking at a few metrics:

  1. Peak Read Bandwidth
  2. 1:1 Reads/Writes Bandwidth
  3. Idle Latency

memspeed-1

As I expected, despite the looser timings, memory bandwidth increases substantially as the frequency increases. About a 12% boost is obtained going to 1600MHz at 11-11-11 timings. An additional 10% was obtained by overclocking the modules to 1866MHz. Mixed reads and writes get a pretty proportional boost in all three tests.

Continue reading “Low-Voltage DDR3 – Performance, Heat and Power”

Moving a FreeNAS ZFS Pool from a Physical Server to a VM using RDMs

I’ve been using FreeNAS for several years now for both block and NFS storage in my home lab with great success. For more information on my most recent FreeNAS build, you can check out the series here.

Although I’ve been quite pleased with this setup, I had to repurpose the SSDs in the box and had yet another USB boot device failure. This meant I had to reinstall FreeNAS and left me with just a single ZFS pool with a pair of 2TB mechanical drives. It just didn’t feel right to have a full system up and running for just a pair of 2TB drives when I could run them just fine in my management ESXi host. Not to mention the fact that I’ve got 224GB of RAM available there to provide for a much larger L1 ARC cache.

In part 2 of my FreeNAS build series, I looked at using VT-d to passthrough a proper LSI SAS HBA to a VM. This is really the best possible virtual FreeNAS configuration as it bypasses all of the hypervisor’s storage stack and grants direct access to the HBA and drives. I considered using this setup, but I didn’t think it was worth the extra power consumption and cooling needed for the toasty PERC H200 card I’ve been using. Since I wanted to preserve all data on the drives, RDMs seemed to be the next logical solution. This isn’t as ‘pure’ as the VT-d solution, but it still gives the VM full block access to the drives in the system. At any rate, it was worth a try!

Disclaimer: If you are using ZFS and FreeNAS for production purposes or for any critical data that you care about, using a proper physical setup is important. I wouldn’t recommend virtualizing FreeNAS or any other ZFS based storage system for anything but testing or lab purposes.

What I hoped to do was the following:

  1. Take the 2x2TB Western Digital hard drives out of the Dell T110.
  2. Re-install the 2x2TB drives in my Intel S2600 management host on the integrated SATA controller.
  3. Create a new FreeNAS virtual machine.
  4. Add the two drives to the VM as virtual mode RDMs.
  5. Import the existing ZFS volume that is striped across these two drives in FreeNAS.
  6. Re-create the iSCSI target and NFS shares and have access to all existing data in the pool! (assuming all goes well).

Creating a new FreeNAS VM

Once I got the two drives installed in my Intel S2600 management host, I created a new VM and got the FreeNAS OS installed. Below is the virtual hardware configuration I used:

Guest OS type: Other, FreeBSD 64-bit
CPUs: 2x vCPUs
Memory: 16GB (a minimum of 8GB is required)
Hard Disk: 16GB (for the FreeNAS OS boot device, a minimum of 8GB is required)
New SCSI controller: LSI Logic SAS
Network adapter type: VMXNET3
CD/DVD Drive: Mount the FreeNAS 11.2 ISO from a datastore

You’ll notice that some of the options I selected are not defaults for FreeBSD based VMs. This includes the LSI SAS adapter, and the VMware VMXNET3 NIC. LSI Parallel is the default for FreeBSD VMs, but the SAS adapter works well with all recent BSD builds. The same holds true for the VMXNET3 adapter, which has many benefits over the emulated E1000 adapter type.

Continue reading “Moving a FreeNAS ZFS Pool from a Physical Server to a VM using RDMs”

Home Lab Power Automation – Part 3

In part 2, I shared the PowerCLI scripting I used to power on my entire lab environment in the correct order. In this final installment, I’ll take you through the scripting used to power everything down. Although you may think the process is just the reverse of what I covered in part 2, you’ll see there were some other things to consider and different approaches required.

Step 1 – Shutting Down Compute Cluster VMs

To begin the process, I’d need to shut down all VMs in the compute-a cluster. None of the VMs there are essential for running the lab, so they can be safely stopped at any time. I was able to do this by connecting to vCenter with PowerCLI and then using a ‘foreach’ loop to gracefully shut down any VMs in the ‘Powered On’ state.

"Connecting to vCenter Server ..." |timestamp
Connect-VIServer -Server 172.16.1.15 -User administrator@vsphere.local -Password "VMware9("

"Shutting down all VMs in compute-a ..." |timestamp
$vmlista = Get-VM -Location compute-a | where{$_.PowerState -eq 'PoweredOn'}
foreach ($vm in $vmlista)
    {
    Shutdown-VMGuest -VM $vm -Confirm:$false | Format-List -Property VM, State
    }

The above scripting ensures the VMs start shutting down, but it doesn’t tell me that they completed the process. After this is run, it’s likely that one or more VMs may still be online. Before I can proceed, I need to check that they’re all are in a ‘Powered Off’ state.

"Waiting for all VMs in compute-a to shut down ..." |timestamp
do
{
    "The following VM(s) are still powered on:"|timestamp
    $pendingvmsa = (Get-VM -Location compute-a | where{$_.PowerState -eq 'PoweredOn'})
    $pendingvmsa | Format-List -Property Name, PowerState
    sleep 1
} until($pendingvmsa -eq $null)
"All VMs in compute-a are powered off ..."|timestamp

A ‘do until’ loop does the trick here. I simply populate the list of all powered on VMs into the $pendingvmsa variable and print that list. After a one second delay, the loop continues until the $pendingvmsa variable is null. When it’s null, I know all of the VMs are powered off and I can safely continue.

Continue reading “Home Lab Power Automation – Part 3”

Home Lab Power Automation – Part 2

In part 1, I shared some of the tools I’d use to execute the power on and shutdown tasks in my lab. Today, let’s have a look at my startup PowerCLI script.

A Test-Connection Cmdlet Replacement

As I started working on the scripts, I needed a way to determine if hosts and devices were accessible on the network. Unfortunately, the Test-Connection cmdlet was not available in the Linux PowerShell core release. It uses the Windows network stack to do its thing, so it may be a while before an equivalent gets ported to Linux. As an alternative, I created a simple python script that achieves the same overall result called pinghost.py. You can find more detail on how it works in a post I did a few months back here.

The script is very straightforward. You specify up to three space separated IP addresses or host names as command line arguments, and the script will send one ICMP echo request to each of the hosts. Depending on the response, it will output either ‘is responding’ or ‘is not responding’. Below is an example:

pi@raspberrypi:~/scripts $ python pinghost.py vc.lab.local 172.16.10.67 172.16.10.20
vc.lab.local is not responding
172.16.10.67 is responding
172.16.10.20 is not responding

Then using this script, I could create sleep loops in PowerShell to wait for one or more devices to become responsive before proceeding.

Adding Timestamps to Script Output

As I created the scripts, I wanted to record the date/time of each event and output displayed. In a sense, I wanted it to look like a log that could be written to a file and referred to later if needed. To do this, I found a simple PowerShell filter that could be piped to each command I ran:

#PowerShell filter to add date/timestamps
filter timestamp {"$(Get-Date -Format G): $_"}

Step 1 – Power On the Switch

Powering up the switch requires the use of the tplink_smartplug.py python script that I discussed in part 1. The general idea here is to instruct the smart plug to set its relay to a state of ‘1’. This brings the switch to life. I then get into a ‘do sleep’ loop in PowerCLI until the Raspberry Pi is able to ping the management interface of the switch. More specifically, it will wait until the pinghost.py script returns a string of “is responding”. If that string isn’t received, it’ll wait two seconds, and then try again.

"Powering up the 10G switch ..." |timestamp
/home/pi/scripts/tplink-smartplug-master/tplink_smartplug.py -t 192.168.1.199 -c on |timestamp

"Waiting for 10G switch to boot ..." |timestamp
do
{
$pingresult = python ~/scripts/pinghost.py 172.16.1.1 |timestamp
$pingresult
sleep 2
} until($pingresult -like '*is responding*')

When run, the output looks similar to the following:

Continue reading “Home Lab Power Automation – Part 2”