VMware & Azure Site Recovery – Part 4: Failover and thoughts

In my final post in this series, I’m taking a look at performing a test fail-over of some VMs I have on premises to an Azure Site Recovery instance.
This is a really easy process and doesn’t need much documenting. Afterwards I’m going to share a few things I’ve learned along the way and my thoughts on ASR.

VM Fail-over

1) Navigate to the Recovery Services vault from within your Azure Dashboard.

2) In the navigation pane, under Protected Items, select “Replicated Items”.

3) Here you should see all the machines you have protected, along with their status.

4) Select a VM that you want to test and click Test Fail-over.

5) Choose the recovery point and the network that you want to fail-over to and select OK.

6) Some checks run and the fail-over environment is prepared. After around 15 minutes I had my VM running and ready for my own checks.

As you can see, the fail-over process is easy for a single VM. I’m still in a testing phase at the moment so haven’t performed a mass fail-over, but it doesn’t seem to take more than 10 minutes to move a VM.

Testing does present one issue: there is no “KVM” (console) option for the VM, unlike with other hosting providers. The only way to check that the VM had been placed in the network and had come up correctly was to build a Windows Server 2012 R2 management server in my ASR recovery network, with a public IP and RDP enabled.
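Once you are logged on to a management server like this, a quick way to see whether the failed-over VMs are answering is to probe their RDP (3389) or SSH (22) ports. Below is a minimal Python sketch of that idea. It assumes Python is available on the management server, and the function names and the example addresses are made up for illustration. It only tells you that a port accepts TCP connections, not that the services behind it are healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_failover_vms(targets, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): port_is_open(host, port, timeout)
        for host, port in targets
    }


# Example usage (hypothetical private IPs in the test fail-over network):
#   results = check_failover_vms([("10.0.1.4", 3389), ("10.0.1.5", 22)])
#   for (host, port), ok in results.items():
#       print(host, port, "reachable" if ok else "NOT reachable")
```

A plain socket check keeps this dependency-free, which is handy on a freshly built management box.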

Lessons Learned:

1) There are some limitations on what is possible with your test fail-over VMs. For example, there is no KVM at present (only a screenshot view of the system).

2) Every VM that is failed over receives a 20GB “free” disk for temporary working. A nice feature, but for us it caused Linux VMs to have their disk devices renamed, e.g. our ‘sdb’ disk became ‘sdc’ because the temporary disk took its name. Not ideal, and I’m not sure what can be done to disable this at present. Microsoft recommend mounting disks by their UUID to work around this, although this is contrary to Red Hat advice and to the usual way of mounting logical volumes.

3) If you have logical volumes which contain thin pool logical volumes, they are not detected by the Microsoft agent on the Linux VM. This isn’t great for our environment, as we use them for DB snapshots, for example.

4) Having a management server built and running to enable communications into your test fail-over network is a good idea!
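To illustrate the UUID workaround from lesson 2, here is a rough Python sketch that rewrites plain `/dev/sdX` entries in fstab text to `UUID=` form. It is only an illustration under assumptions: the function name is mine, the device-to-UUID mapping is something you would collect yourself (for example from `blkid` on the source VM), and the UUID in the usage note is invented. Entries that are not in the mapping, such as `/dev/mapper/...` logical volume paths, are left untouched.

```python
def fstab_to_uuid(fstab_text, uuid_by_device):
    """Rewrite known device paths in fstab text to UUID=<uuid> form.

    uuid_by_device maps device paths (e.g. "/dev/sdb1") to filesystem
    UUIDs. Comments, blank lines and devices without a known UUID are
    passed through unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)
            continue
        fields = stripped.split()
        uuid = uuid_by_device.get(fields[0])
        if uuid is None:
            # Unknown device (or an LVM path): keep the line as it is.
            out.append(line)
        else:
            fields[0] = "UUID=" + uuid
            out.append("\t".join(fields))
    return "\n".join(out)
```

For example, with a hypothetical mapping `{"/dev/sdb1": "1111-2222"}`, the line `/dev/sdb1 /data ext4 defaults 0 2` comes back with `UUID=1111-2222` as its first field, so the mount no longer depends on whether the disk shows up as ‘sdb’ or ‘sdc’. Review the result by hand before replacing a real `/etc/fstab`.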

Final thoughts:

I think that this service is very good and, for Windows Servers, very easy. Microsoft seem to be pushing their Azure offering hard, and new features had been implemented in the portal by the time I managed to finish this blog series. There are limitations with the Linux offering and it is slightly more clunky, but that’s expected. The important thing to note is that it does work!

I believe that for a greenfield site, where you can take a lot of the DR/fail-over caveats and issues into account, it is a great service. If you aren’t greenfield, you may come across limitations that are hard to overcome without re-architecting a lot of your services (this is true in a lot of environments where DR is being built in as an afterthought). This is the issue I face at my workplace with this service.

For a small/medium size business, being able to replicate your infrastructure out into the cloud and pay a “minimal” fee to have DR capability is almost a no-brainer. The simplicity of setup and the safety in knowing you can spin up in the cloud to maintain service is really excellent. It is almost certainly much cheaper than having to buy another DC/Room/Rack to put in a load of other kit and replicate to, especially for those with tighter purse strings.

Really understanding the service and what it will cost you is key. Whilst replication costs and cloud storage are “minimal”, in the event of a fail-over you might find the bill coming in from Microsoft to be much higher than normal. I guess that is the roll of the dice you take with having an opex “buy now pay later, should I need it” approach.
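To put rough numbers on that trade-off, a back-of-the-envelope calculation is enough. The Python sketch below splits the bill into the steady-state cost (a per-VM protection fee plus replica storage) and the extra compute you pay for while VMs are actually running in Azure. The function names and every rate in the usage note are placeholders I made up, not Azure prices, and the model ignores things like bandwidth and licensing, so plug in figures from your own quote.

```python
def monthly_dr_cost(vm_count, protection_fee_per_vm,
                    storage_gb, storage_price_per_gb):
    """Steady-state monthly cost: protection fees plus replica storage."""
    return vm_count * protection_fee_per_vm + storage_gb * storage_price_per_gb


def failover_compute_cost(vm_count, compute_price_per_vm_hour, hours):
    """Extra compute cost while failed-over VMs are running in the cloud."""
    return vm_count * compute_price_per_vm_hour * hours


# Example usage with made-up figures (NOT real Azure pricing):
#   steady = monthly_dr_cost(20, 25.0, 2000, 0.05)
#   week_failed_over = failover_compute_cost(20, 0.20, 24 * 7)
#   print(steady, week_failed_over)
```

With placeholder rates like those, a single week of running failed over costs about as much as a whole month of replication, which is the kind of jump worth knowing about before you need it.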

I would recommend ASR to people looking at a cloud DR service, with the caution that you should do your homework before taking the plunge.
