Category: Uncategorised

Migrating ESXi Management VMkernel

I have been doing a fair amount of work with NSX recently, and before starting we had some environment changes to go through. One of those changes was to the network that contains the VMkernel for host management traffic. The overall aim was to migrate the interfaces to a new management VLAN (new subnet, gateway, etc.).

Here is how I managed to do it without disruption to any existing management or services running.

1) The first step was to create a portgroup on my vDS for the new Management VLAN that had been trunked to the hosts.

I would advise to configure the port group further for your environment based on VMware Network Best Practices for things like Traffic Shaping, Teaming/Failover, etc.
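If you prefer to script this step, the port group can also be created with PowerCLI. A minimal sketch, where the vDS name, port group name and VLAN ID are placeholders for your environment:

```powershell
# Connect to vCenter first, e.g.: Connect-VIServer -Server vcsa.example.local
# Create a port group for the new management VLAN on the distributed switch
$vds = Get-VDSwitch -Name "DSwitch01"                 # your vDS name
New-VDPortgroup -VDSwitch $vds -Name "Mgmt-New" -VlanId 200
```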

2) Now that the port group exists, add a new VMkernel for management traffic on each of your hosts. For me, I ended up with 3 vmks: old management (vmk0), vMotion (vmk1) and new management (vmk2).
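This step can also be done in bulk with PowerCLI. A sketch, assuming placeholder cluster, switch and addressing (the IP would of course differ per host):

```powershell
# Add a new management VMkernel adapter (vmk2) on each host in the cluster,
# backed by the new management port group. IPs below are placeholders.
foreach ($vmhost in Get-Cluster "Cluster01" | Get-VMHost) {
    New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "DSwitch01" `
        -PortGroup "Mgmt-New" -IP 192.168.200.11 -SubnetMask 255.255.255.0 `
        -ManagementTrafficEnabled:$true
}
```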

3) From here, I put hosts into maintenance mode that I was going to reconfigure, just to be on the safe side.

4) At this point, it isn’t possible to remove the existing vmk0 because it is in use. The reason is that the host’s TCP/IP stack configuration still has the old VMkernel gateway configured. This should be changed to the new management network’s gateway address on each host.
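The gateway can be changed in the vSphere client under the host's TCP/IP configuration, or over SSH with esxcli. A sketch, with placeholder old/new gateway addresses:

```shell
# On each host (SSH): replace the default gateway on the default TCP/IP stack.
# 10.0.10.1 = old management gateway, 192.168.200.1 = new one (placeholders).
esxcli network ip route ipv4 remove --gateway 10.0.10.1 --network default
esxcli network ip route ipv4 add --gateway 192.168.200.1 --network default

# Verify the routing table now points at the new gateway
esxcli network ip route ipv4 list
```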

5) From here, I disconnected the hosts from vCenter.

6) I then changed the DNS records of my ESXi servers to the new management IP addresses and allowed some time for propagation (in fact, I checked from the vCSA appliance that it had picked up the newest record from my DNS servers).

7) Reconnect the host(s) back into vCenter.

8) It is now possible to remove the old management VMkernel adapter (vmk0 in my case).
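If you would rather do the removal from the command line than the vSphere client, it is a one-liner over SSH:

```shell
# Remove the old management VMkernel interface from the host
esxcli network ip interface remove --interface-name=vmk0
```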

9) I did reboot my hosts before exiting maintenance mode, but I don’t think it actually matters too much.

There we go! A fairly straightforward process, and one that I can’t imagine many people doing. I did have a look to see if anyone else had performed a similar process, but none had moved subnet and gateway at the same time. Hopefully this might help someone out there who wants to do this!

Rubrik – PowerShell/API SLA Backup & Restores

Having been lucky enough to procure a Rubrik Cloud Data Management appliance at work recently, we have had the pleasure of experiencing a fantastic technical solution which has assisted us in improving our backup/recovery and business continuity planning. The solution, for us, is still in its infancy, but we hope to scale and grow as the business realises the full potential of the service. Until then, we have had fun preparing it for our own production use, as it is such a joy to work with!

One thing we questioned was how we get a list of our SLA Domains (as we’ve made a fair few) and their contents. This could be useful in the scenario of someone accidentally deleting policies or machines out of policies. Another potential use case could be if we needed to ‘rebuild’ our Brik SLA configuration in the event of a major failure – highly unlikely but better to be prepared and have committed some brain cycles to it, right?

With that in mind, my esteemed colleague @LordMiddlewick has written some PowerShell scripts with the help of @joshuastenhouse’s previous blog posts about using the Rubrik REST APIs.

Backup Script

This script can be scheduled to run at your own convenience. Ensure that you fill in the variables in the top section for your own environment. It is possible to encrypt the password within the file itself; this can be achieved using a methodology described here. For simplicity, in the case below we have only encrypted it for transmission to the Rubrik service.
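The script itself isn’t reproduced in this archive, but the core of it looks roughly like the following sketch. The endpoints (/api/v1/sla_domain and /api/v1/vmware/vm) are the standard Rubrik v1 REST API; the hostname, credentials and paths are placeholders, and you may need to allow the cluster's self-signed certificate first:

```powershell
# --- Variables: fill in for your environment (placeholders) ---
$RubrikIP = "rubrik.example.local"
$User     = "svc_rubrik"
$Password = "ChangeMe!"        # consider encrypting this, as noted above

$OutPath  = "C:\RubrikBackup"

# Build a Basic-auth header (encrypted in transit over HTTPS)
$pair   = "{0}:{1}" -f $User, $Password
$auth   = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($pair))
$header = @{ Authorization = "Basic $auth" }

# Export each SLA Domain to its own JSON-formatted .txt file
$slas = (Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/sla_domain" -Headers $header).data
foreach ($sla in $slas) {
    $sla | ConvertTo-Json -Depth 10 | Out-File "$OutPath\$($sla.name).txt"
}

# Export a VM -> SLA mapping to VM-SLA.csv
$vms = (Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/vmware/vm" -Headers $header).data
$vms | Select-Object name, @{Name='sla'; Expression={$_.effectiveSlaDomainName}} |
    Export-Csv "$OutPath\VM-SLA.csv" -NoTypeInformation
```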

The key takeaways from this script, in whatever fashion you run it, are:

* You receive a set of .txt files, one per SLA domain you have defined, in JSON format. These are useful for restoring SLAs. Here is an example:
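The original example file isn’t included in this archive; an illustrative (and heavily trimmed) SLA export, assuming the v1 SLA domain schema, might look like:

```json
{
  "name": "Gold",
  "frequencies": [
    { "timeUnit": "Daily",   "frequency": 1, "retention": 30 },
    { "timeUnit": "Monthly", "frequency": 1, "retention": 12 }
  ]
}
```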

* The other takeaway is the file “VM-SLA.csv”, which contains a list of all your backed-up VMs and the policy each belongs to. This is really useful for restoring VMs into SLAs or bulk-importing VMs into SLAs.

Restore SLA Domain Policies

To reverse the backup process and restore one or all of your SLAs onto the Rubrik, use the following script:

This script will take any .txt (SLA backup) files in the designated $path and try to recreate each one on your Rubrik.
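The restore script isn’t reproduced here either; its core is roughly the sketch below, POSTing each saved JSON definition back to the v1 SLA endpoint. Hostname, credentials and path are placeholders, and in practice you may need to strip read-only fields (such as the old id) from the export before the POST is accepted:

```powershell
# --- Placeholders: fill in for your environment ---
$RubrikIP = "rubrik.example.local"
$auth     = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("svc_rubrik:ChangeMe!"))
$header   = @{ Authorization = "Basic $auth" }
$path     = "C:\RubrikBackup"

# Recreate each SLA Domain from the backed-up JSON .txt files
foreach ($file in Get-ChildItem "$path\*.txt") {
    $body = Get-Content $file.FullName -Raw
    Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/sla_domain" `
        -Headers $header -Method Post -Body $body -ContentType "application/json"
}
```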

Restore/Import VMs to SLAs

The final part of this exercise is to be able to restore the list of VMs that was exported, assigning each to its SLA domain policy. The following script does this by using the “VM-SLA.csv” file above to import a list of objects and assign them as per the CSV.
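Again, the script itself isn’t reproduced here; a sketch of the approach, assuming the v1 endpoints and that a VM is assigned to a policy by patching its configuredSlaDomainId (hostname, credentials and CSV path are placeholders):

```powershell
# --- Placeholders: fill in for your environment ---
$RubrikIP = "rubrik.example.local"
$auth     = [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("svc_rubrik:ChangeMe!"))
$header   = @{ Authorization = "Basic $auth" }

$csv  = Import-Csv "C:\RubrikBackup\VM-SLA.csv"
$slas = (Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/sla_domain" -Headers $header).data
$vms  = (Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/vmware/vm" -Headers $header).data

foreach ($row in $csv) {
    # Match each CSV row to a live VM and SLA domain by name
    $vm  = $vms  | Where-Object { $_.name -eq $row.name } | Select-Object -First 1
    $sla = $slas | Where-Object { $_.name -eq $row.sla }  | Select-Object -First 1
    if ($vm -and $sla) {
        $body = @{ configuredSlaDomainId = $sla.id } | ConvertTo-Json
        Invoke-RestMethod -Uri "https://$RubrikIP/api/v1/vmware/vm/$($vm.id)" `
            -Headers $header -Method Patch -Body $body -ContentType "application/json"
    }
}
```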

The format for the VM-SLA.csv file is as follows:
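The original sample isn’t shown in this archive; an assumed layout of a VM name column and an SLA domain name column would be:

```
name,sla
DC01,Gold
SQL01,Gold
WEB01,Bronze
```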

In theory, if you have lots of machines you want to bulk assign to any given policies, you can create your own CSV and run it to import your VM estate to your predefined policies using this script. We used this several times when assigning 100+ objects to a given policy and it worked a treat!

Disclaimer: Please read and fully understand the above scripts before implementing them. Test them fully in a development environment before using them in any production sense. I/we do not take any responsibility for rogue administrators’ stupidity.

I’m sure Rubrik will continue to steam ahead with excellent releases; in fact they might even build some of this functionality in, making these scripts redundant. In the meantime, hopefully someone finds them useful; I know we have. Once again, a big shout out to @LordMiddlewick for writing them and giving me permission to post them, and also to @joshuastenhouse for his blog.

My Rubrik Experience

For those who don’t know, Rubrik is an up-and-coming Cloud Data Management platform which essentially provides a converged, scale-out, clustered backup appliance for all of your infrastructure backup needs. If you have been living under a rock for the last 3 years then please take a look at:

Some other good reading on the product can be seen in the following blog post articles which explain in detail a lot more than I do in this post:


Penguin Punk

Recently I had the pleasure of this little beauty for a month for some testing:


R348 (Brik) – 1 Appliance
Nodes – 4
Disks – 4 SSD + 12 HDD (1 SSD / 3 HDD per node)
Memory – 256 GB (64 GB per node)
CPU – 4 * Intel 8-Core
Network – 4 * Dual-Port 10GbE, 4 * Dual-Port 1GbE, 1 * 1GbE IPMI

Total Usable Capacity – 59.6 TB


The reason I had my hands on this device was to test the functionality of Rubrik, pure and simple. I hooked it up to a 6-node vSphere 6.5 cluster running 10 TB of FC-attached storage, covering around 100 virtual machines ranging across Windows 10, Server 2008 R2–2016 and Linux (RHEL 6/7, Ubuntu, CentOS). I had around a month of “playtime” with a fairly solid test plan to get through.

Simplicity: We had the appliance delivered ahead of time and the onsite engineer came a few days later after a simple rack and stack. Within 2 hours we had the cluster up and running (it would have been quicker if it wasn’t for our network blocking mDNS!). Beautiful, simple deployment.

Configuration: See simplicity! I’d already created a Rubrik Service account in my domain with the correct vCenter permissions. Adding my test cluster was a breeze and the VM discovery happened within minutes. I could have added all my machines to the built-in SLA Domain Protection policies and that would have me good to go, but I wanted to play in depth!

Usability: The navigation on the system is a beautiful HTML5 interface that is really intuitive. If you haven’t seen it I suggest you take a look. Whilst we had an engineer present, everything was so simple to drive it felt natural and elegantly put together. One of the things I was really keen on was replicating some archive VM data out to a cloud provider. It is fair to say that within about 10-15 minutes, before the engineer had time to get me a guide, I had configured an archive target to a fresh Azure Blob store I had created. So easy.


Features: Coming from a legacy backup platform that isn’t very well geared towards a modern data center, I was blown away. We went from only having traditional agent-based backups for Linux/Windows to having some awesome benefits such as:

– Snapshots
– Replication
– Archival (Local and Cloud)
– Live Mounts
– Google-like file system search
– SQL DB Point in Time Recovery
– Physical OS agent recovery
– Well documented API to consume

It was quite a big turnaround for such a swift implementation.


I understand the buzz around Rubrik and how they are a game changer in the Data Management, Backup/Recovery world. For modern data centers that are largely virtualised, this is a product that really must be considered. Given the new 3.2 release, where they provide the ability to back up your cloud workloads using a Rubrik appliance, it really is starting to become a well-rounded and unique solution.

I would highly recommend any person looking into the backup space for their own benefit to make sure this vendor is reviewed!

I won the EVO:Rail Challenge – VMworld here I come!!

A while back I entered the EVO:Rail Challenge put on by VMware.


I did this for two reasons:

– I was interested to see their hyper-converged infrastructure appliance product in action.
– Motivation of winning a VMworld Full Conference pass!


I’ve always been impressed by the Hands-On-Labs, as they give great exposure to a range of products with very little effort. My only bugbear is that I don’t have enough time to fit them all in, as much as I’d love to!

The challenge takes pieces from the HOL-SDC-1428: Introduction to EVO:Rail lab and modifies it to make you fix the purposefully broken aspects of an EVO:Rail deployment. There are a few extra bits in there you have to complete before grabbing a screenshot of your time and ending the lab.

I was contacted in early June stating that I had been shortlisted, thanks to a good completion time, and asked for my screenshot, which I quickly submitted. The following week I was emailed by the team to inform me that I’d won. The official announcement was made last night! To say I am excited is an understatement!! It will be the second time I have had the pleasure of attending VMworld and I can’t wait.

I’d recommend to anyone interested in EVO:Rail or winning a ticket to VMworld to give the EVO:Rail Challenge a go, you never know! You might be quick enough to win a ticket!

I’d like to close this post by thanking all the teams and individuals behind the scenes at VMware for making my week! It is an honor to have won and I look forward to seeing people at the event!

How to enable VM Multicast Traffic across multiple hosts on Cisco UCS

Recently I was asked to investigate an issue occurring on some VMs within the environment I look after. The sysadmin informed me that he was unable to receive multicast traffic on VMs in the same VLAN when they were separated across different hosts in the cluster. However, when the VMs were on the same host, traffic flowed fine!

After doing some reading I found that the issue is with the UCS Fabric Interconnects. Here is what was needed to fix the problem.

Configure an IGMP querier on Nexus 5K

1. SSH into the first Nexus 5k switch.

2. Back up the configuration.

3. Check the IGMP configuration on the VLAN that requires Multicast traffic.


4. As no querier is present, create one.


5. Check the configuration is now correct.


6. Repeat the above steps 1-5 on the second 5K switch.
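The commands for steps 2–5 aren’t shown above; on NX-OS they look roughly like the following, where VLAN 100 and the querier address are placeholders for your environment (the querier IP should be an unused address in the VLAN’s subnet):

```
! 2. Back up the running configuration
copy running-config bootflash:pre-igmp-backup.cfg

! 3. Check the IGMP snooping configuration for the VLAN
show ip igmp snooping vlan 100

! 4. As no querier is present, create one for the VLAN
configure terminal
 vlan configuration 100
  ip igmp snooping querier 192.168.100.2
end
copy running-config startup-config

! 5. Verify the querier is now present
show ip igmp snooping querier vlan 100
```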

Disable IGMP snooping on Nexus 1000v

1. SSH into your Nexus 1000v.

2. Back up the configuration as above.

3. Check the current IGMP snooping status, which is “Enabled” by default.


4. Disable IGMP snooping on the 1000v.


5. Verify that IGMP snooping has been disabled.
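The 1000v commands aren’t shown above either; a sketch of steps 2–5, again on NX-OS syntax:

```
! 2. Back up the running configuration
copy running-config bootflash:pre-igmp-backup.cfg

! 3. Check the current IGMP snooping status (enabled by default)
show ip igmp snooping

! 4. Disable IGMP snooping globally on the 1000v
configure terminal
 no ip igmp snooping
end
copy running-config startup-config

! 5. Verify that IGMP snooping has been disabled
show ip igmp snooping
```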



By default, VMs in the same VLAN will only pass multicast traffic to each other if they reside on the same host. This is due to the way the UCS Fabric Interconnects handle multicast traffic and the fact that they only store MAC address information from the blade servers connected to them directly.

The workaround is to enable an IGMP querier on the upstream switches, which will be used to facilitate multicast traffic for the specific VLAN in question.

I’m not entirely sure that disabling IGMP snooping on the 1000v as a whole is a good idea; it might be best to do it per VLAN, where it is required.