In this final post of my series on SRM with the VMAX 20K, I’ll walk through a very simple creation of a Protection Group and a Recovery Plan. I’ll protect a test VM that I created on my primary site, which lives on a replicated volume, and perform a “Recovery” in both directions (Planned Migration). This process can get much more involved the deeper you go with your own SRM configuration. For now I’m just testing that the infrastructure works before starting my own full DR configuration.
Create a Protection Group
1) Log in to your Primary SRM server and open up vCenter. Navigate to the SRM Plugin.
2) Open “Protection Groups” and select “All Protection Groups” and then “Create Protection Group”.
3) At the wizard, select the Protected site that hosts the VMs you want to protect. Select the array pair that you have configured and select Next.
4) On the following screen, select the Datastore group you wish to protect. This can contain a single datastore or several. Once selected, the VMs residing on it will be added to protection. Click Next.
5) When prompted, enter a meaningful protection group name and give it a description. There is nothing worse than having to investigate someone else’s configuration when they couldn’t take 10 seconds to fill in some detail!
6) At the summary screen, select Finish. The group will then automatically configure itself and show a status of “OK”. (If the VM(s) are not Powered On, it might not show this).
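To picture what step 4 does, here is a minimal Python sketch of how selecting a datastore group pulls its VMs into protection. It is only an illustrative model and does not call SRM: the function `vms_for_protection`, the inventory mapping, and the VM and datastore names are all hypothetical.

```python
def vms_for_protection(vm_to_datastore, selected_datastores):
    """Return the VMs that live on any of the selected datastores.

    vm_to_datastore is a hypothetical inventory: VM name -> datastore name.
    """
    selected = set(selected_datastores)
    return sorted(vm for vm, ds in vm_to_datastore.items() if ds in selected)


# Hypothetical inventory: one VM on a replicated datastore, one on local storage.
inventory = {
    "test-vm01": "DS_REPLICATED_01",
    "infra-vm02": "DS_LOCAL_01",
}

protected = vms_for_protection(inventory, ["DS_REPLICATED_01"])
print(protected)
```

Only the VM on the replicated datastore ends up in the group, which is why the test VM has to live on a replicated volume in the first place.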
Create a Recovery Plan
1) Select “Recovery Plans” from the SRM navigation screen. Right click “All Recovery Plans” and select “Create Recovery Plan”.
2) On the wizard, ensure that the Recovery site is the opposite of the site configured in the Protection Group. Select Next.
3) Select the protection group(s) that you want to add to the recovery plan. In this case it is just my test group, but there can be many.
4) View the Test Networks tab. Leave the settings as AUTO as the resource mappings created earlier will take care of the links. Click next.
5) Give the plan a good name and description. Click next and finish.
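The rule in step 2 (the recovery site must be the opposite of the protected site) can be written down as a quick sanity check. This is a sketch under my own assumptions rather than anything SRM exposes: `ProtectionGroup`, `check_recovery_plan`, and the site names `SiteA` and `SiteB` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionGroup:
    name: str
    protected_site: str  # site where the group's VMs currently run


def check_recovery_plan(recovery_site, groups):
    """Return the names of groups whose protected site equals the recovery site."""
    return [g.name for g in groups if g.protected_site == recovery_site]


groups = [ProtectionGroup("PG-Test-VM", protected_site="SiteA")]

print(check_recovery_plan("SiteB", groups))  # empty list: the plan is consistent
print(check_recovery_plan("SiteA", groups))  # lists PG-Test-VM as a conflict
```

An empty result means every group in the plan fails over to the other site, which is what the wizard expects.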
Perform a Recovery
In my environment, we do not have the EMC TimeFinder license, which means we are unable to use the “Test” feature. A test effectively takes a snapshot of the R2 device and mounts that rather than the actual R2 volume, so replication from R1 to R2 isn’t broken whilst testing takes place. Once the “Test” completes, the R2 snapshot is automatically discarded and everything is cleaned up.
Instead, I have to perform a full recovery, which involves moving all the VMs over to the Recovery Site and splitting replication. Once they are back up, I have to re-protect them, which effectively reverses the array replication and makes the Recovery site the Protected site and vice versa.
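The difference between a “Test” and a full recovery can be sketched as a toy model. This is a simplification for illustration only and does not drive the array or SRM: `ReplicatedPair`, `run_test`, `run_recovery`, and the snapshot label are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedPair:
    link_active: bool = True              # R1 -> R2 replication is running
    r2_mounted_at_recovery: bool = False  # real R2 volume presented to recovery hosts
    snapshots: list = field(default_factory=list)


def run_test(pair):
    """Test: work from a snapshot of R2 so replication is never interrupted."""
    pair.snapshots.append("r2-snap")  # snapshot is mounted instead of R2 itself
    # ... test VMs would be powered on from the snapshot here ...
    pair.snapshots.remove("r2-snap")  # cleanup discards the snapshot
    return pair


def run_recovery(pair):
    """Full recovery: split replication and mount the real R2 volume."""
    pair.link_active = False
    pair.r2_mounted_at_recovery = True
    return pair


tested = run_test(ReplicatedPair())
recovered = run_recovery(ReplicatedPair())
print(tested.link_active, recovered.link_active)
```

After a test the pair looks exactly as it did before, whereas a recovery leaves the link split until a reprotect is run.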
1) Open the Recovery Plan from above, select the “Recovery Steps” tab.
2) Perform a recovery by clicking the red “Recovery” button.
3) As this is a planned test, accept the warning about altering the VM state and select Next.
4) Accept the Summary screen of the recovery and click Start.
5) SRM will then perform the recovery steps, which can be observed from the “Recovery Steps” tab as it progresses.
6) At the end of the process, the VMs will come up on the Recovery Site as planned.
Perform a Reprotect
As a Recovery has taken place, the link that replicates Primary Site (RDF1) to Replicated Site (RDF2) has been split. The Primary devices have become Write Disabled (WD) and the Secondary is now Read/Write (RW). As seen here:
1) The action to reprotect the failed over group can be performed by simply clicking “Reprotect” at the top of the Recovery Plan area.
This process swaps the device roles so that R2 becomes R1 and vice versa. It also re-enables the replication link in the opposite direction and automatically reverses the Protection Group to match.
Now the VM(s) live on the original Recovery Site (which is now the Protected Site) and everything has been reversed. To get them back to where they were originally, perform another Recovery and Reprotect.
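The full round trip (Recovery, Reprotect, Recovery, Reprotect) can be followed with a small state model. It is a sketch of the behaviour described above and not of the SRDF commands SRM issues: `RdfPair`, `recover`, and `reprotect` are hypothetical names, and I am assuming the R2 side is Write Disabled while replication is running.

```python
from dataclasses import dataclass


@dataclass
class RdfPair:
    r1_site: str
    r2_site: str
    r1_access: str = "RW"   # source device is Read/Write
    r2_access: str = "WD"   # assumption: target is Write Disabled while replicating
    link: str = "replicating"


def recover(pair):
    """Planned migration: split the link, R1 becomes WD and R2 becomes RW."""
    pair.link = "split"
    pair.r1_access, pair.r2_access = "WD", "RW"


def reprotect(pair):
    """Swap the R1/R2 roles and resume replication in the opposite direction."""
    if pair.link != "split":
        raise RuntimeError("reprotect only makes sense after a recovery")
    pair.r1_site, pair.r2_site = pair.r2_site, pair.r1_site
    pair.r1_access, pair.r2_access = "RW", "WD"
    pair.link = "replicating"


pair = RdfPair(r1_site="Primary", r2_site="Recovery")

recover(pair)
reprotect(pair)
print(pair.r1_site)  # the old Recovery site now holds R1

recover(pair)
reprotect(pair)
print(pair.r1_site)  # back to the original layout
```

Two passes through Recovery and Reprotect return the pair to its starting layout, which matches what I saw when failing the test VM over and back.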
This completes the SRM configuration on the VMAX 20K, up to the point of configuring and testing the infrastructure. Much more detailed plans followed, failing over different volumes and groups of VMs and implementing multiple Protection Groups and Recovery Plans.
Overall there was a lot of reading involved, and I’d like to thank @CodyHorstman again for his awesome documentation, which I mentioned in Part I of this series. The versions I’ve used are a little out of date now, and I’d encourage anyone implementing this for the first time to use the latest version(s) where possible. I had restrictions in my environment that were impassable at the time (including interoperability between versions of vCenter and later versions of SRM), so my hands were tied to what I have documented in this series.
I hope that someone finds this useful one day. Please leave any comments or questions and I’ll do my best to answer them!