Tagged: VMware

SRM and VMAX 20K Part XI: Create Protection Groups & Recovery Plans

In this final post of my SRM on VMAX 20K series, I'll be going through a very simple creation of Protection Groups and Recovery Plans. I'll protect a test VM that I have created on my primary site, which lives on a replicated volume, and perform a "Recovery" in both directions (Planned Migration). This process can get much more involved the deeper you go with your own SRM configuration. For these purposes I'm just testing that the infrastructure works before starting my own full DR recovery configuration.

Create a Protection Group

1) Log in to your Primary SRM server and open up vCenter. Navigate to the SRM Plugin.

2) Open “Protection Groups” and select “All Protection Groups” and then “Create Protection Group”.

3) At the wizard, select the Protected Site where you want to protect the VMs. Select the array pair that you have configured and click Next.

Protect1

4) On the following screen, select the Datastore Group you wish to protect. This can be a single datastore or multiple datastores. Once selected, the VMs residing on it will be added to the protection group. Click Next.

Protect2

5) When prompted, enter a meaningful protection group name and give it a description. There is nothing worse than having to investigate someone else's configuration when they didn't take 10 seconds to fill in some detail!

Protect3

6) At the summary screen, select Finish. The group will then automatically configure itself and show a status of "OK". (If the VM(s) are not powered on, it might not show this.)

Protect4

Protect7

Create a Recovery Plan

1) Select “Recovery Plans” from the SRM navigation screen. Right click “All Recovery Plans” and select “Create Recovery Plan”.

2) On the wizard, ensure that the Recovery site is the opposite of the configured Protection Group site. Click Next.

Protect8

3) Select the protection group(s) that you want to add to the recovery plan. In this case it is just my test group, but it can be many.

Protect9

4) View the Test Networks tab. Leave the settings as Auto, as the resource mappings created earlier will take care of the links. Click Next.

Protect10

5) Give the plan a good name and description. Click next and finish.

Protect11

Perform a Recovery

In my environment, we do not have the EMC TimeFinder license. That means that we are unable to use the "Test" feature. This effectively takes a snapshot of the R2 device and mounts that rather than the actual R2 volume, which means the replication from R1 to R2 isn't broken whilst testing is taking place. Once the "Test" completes, the R2 snapshot is automatically scrapped and everything is cleaned up.

Instead, I have to perform a full recovery, which involves moving all the VMs over to the Recovery Site and splitting replication. Once they are back up, I have to re-protect them, which effectively reverses the array replication and makes the Recovery site the protected site and vice-versa.

1) Open the Recovery Plan from above, select the “Recovery Steps” tab.

2) Perform a recovery by clicking the red “Recovery” button.

Protect12

3) As this is a planned migration, accept the warning about altering the VM state and select Next.

Protect13

4) Accept the Summary screen of the recovery and click Start.

Protect14

5) SRM will then perform the recovery steps, which can be observed from the "Recovery Steps" tab as it progresses.

Protect15

6) At the end of the process, the VMs will come up on the Recovery Site as planned.

Perform a Reprotect

As a Recovery has taken place, the link that replicates the Primary Site (RDF1) to the Replicated Site (RDF2) has been split. The Primary devices have become Write Disabled (WD) and the Secondary devices are now Read/Write (RW). As seen here:

RDFREcover1

RDFRecover2

RDFRecover3

1) The action to reprotect the failed over group can be performed by simply clicking “Reprotect” at the top of the Recovery Plan area.

Protect16

2) At the wizard, accept the confirmation that protection is going to be reconfigured and select Next.
Protect17

3) Accept the confirmation and click start.
RDFRecover5

This process will now reverse the R2 to become R1 and vice-versa. It will also re-enable the replication link in the opposite direction and modify the Protection Group to automatically reverse too.
RDFRecover4

Now the VM(s) live on the original Recovery Site (which is now Protected Site) and everything has been reversed. To get them back to where they were originally, perform another Recovery and Reprotect again.

Conclusion

This completes the SRM configuration on VMAX 20K that I went through, to the point of configuring and testing the infrastructure. Much more detailed plans obviously followed, failing over different volumes/groups of VMs and implementing multiple Protection Groups and Recovery Plans.

Overall there was a lot of reading involved and I'd like to thank @CodyHorstman again for his awesome documentation, which I mentioned in Part I of this series. The versions I've used are a little out of date now and I'd encourage anyone proceeding with the implementation for the first time to try and use the latest version(s) where possible. I had restrictions in my environment which were impassable at the time (including interoperability between our vCenter version and later versions of SRM), so my hands were tied to what I have documented in this series.

I hope that someone one day finds this useful, please leave any comments or questions and I’ll do my best to answer them!

SRM and VMAX 20K Part X: Swap Volume and Resource Mappings

In this post I am going to quickly cover the configuration of hosts for SRM. There isn’t a huge amount to cover and essentially it comes down to:

– Swap volume configuration
– Resource Mappings & Placeholder

The reason these bits need some thought is simple: if you are failing over a VM with 32GB of memory and no reservation, the swap file will by default live with the VM and take up 32GB of space on the datastore. If you are putting a VM on replicated storage, you don't really want the cost/overhead of replicating a swap file that can easily be redirected and created on local storage.

Resource mappings simply define the relationships between vCenter objects in order for SRM failover to work correctly. The placeholder Datastore is a fairly small local VMFS volume that is used to store configuration files of VMs that are part of an SRM Protection Group. I created a small local volume on each site's VMAX in my configuration for this purpose.

Configure Cluster/ESXi Swap Volumes

For each cluster of VMware hosts that is going to be using SRM functionality, the cluster must be set to use a specific swap volume. This volume will be a generic non-replicated Datastore on the local site VMAX that must be provisioned and configured before commencing. Once the cluster/hosts are configured in this way, all virtual machines residing in the cluster will by default use this swap volume unless set otherwise. It is important to provision the swap datastore to the necessary size for the intended cluster. E.g. if you have 50 VMs in a cluster with 8GB of memory configured on each and no reservations, you need 400GB of space for the swap files in the worst-case scenario.

1) Navigate to the hosts and clusters menu on vCenter

2) Right click the cluster that you want to configure

3) Navigate down to "Swapfile Location" and select the option to store the swapfile in the datastore specified by the host. Click OK.
SWAP1

4) Once complete, select the first host in the cluster and the configuration tab. Then select “Virtual Machine Swapfile Location” under software.
SWAP2

5) Select “Edit” at the top right hand of the screen. Highlight the volume that is destined to be the Swap volume for the cluster. Click Ok.
SWAP3

6) Repeat the above 5 steps for each host in the cluster that is being configured.

From this point onwards, all VMs in the cluster will use this Datastore for their swap files (once power cycled). For any VMs that are not going to be protected by SRM and are staying on local disk, it might be worth ensuring that the following is set:
SWAP4

Failing that, if it is to be protected, ensure that the option of “Store in the host’s swapfile datastore” is set.

Configure Resource Mappings

Within SRM we have to configure mappings of the primary and secondary vCenter objects so that SRM knows which resources to use during failover. A good example of this is mapping a cluster on the Primary site to the recovery cluster on the Secondary site that the VMs will fail over onto. This can be a lot of work, depending on how many resources you want to map. I'll cover the basics below.

1) From within the SRM management screen, select “Sites” and the primary site from the list.

2) Select the “Resource Mappings” tab at the top of the screen. Select a resource from the primary site and then select “Configure Mapping”.
MAP1

3) On the mapping screen, select the counter-part resource that you want to map to in the secondary site. Click ok.
MAP2

4) Next, select the “Folder Mappings” tab. Select the folder in the primary site and then “Configure Mapping”. Click ok in the Mapping box when you’ve selected the counter-part object.
MAP3

5) The last mapping is under "Network Mappings". Select the port group of the primary site and then "Configure Mapping". Click ok in the mapping box when the counter-part is selected.
MAP4

6) A placeholder Datastore also has to be configured on each site. Select the “Placeholder Datastore” tab and then “Configure Placeholder Datastore”.
MAP5

7) Select the Secondary Site from the navigation and repeat the above 6 steps, mapping resources to the opposite site.

Conclusion

Configuring VMware hosts to be ready for SRM is quite easy. My main focus was the local swap volume size for each cluster. During the original configuration, I had to make changes to this on the ESXi hosts. This wasn't as straightforward; I had to move VMs around and power cycle them to get them to use the alternative volume once I had specified it. I also couldn't unmount the original swap datastore even though nothing was using it any more. The only way I got around it was to put the hosts in the cluster into maintenance mode and power cycle them! What I'm trying to say is: make sure you've thought about it properly first, then provision and configure. This was a learning experience for me and not the end of the world!

In the next, and last post of the series, I’ll be walking through creating some basic protection and recovery groups in SRM and performing a recovery.

SRM with VMAX 20K Part IX: Provisioning SRDF/A replicated volumes

In this post I am going to run through setting up replicated volumes on the VMAX 20K. There is a little assumed knowledge with this; I only know what I've learnt through reading up and dabbling with these monster arrays. As discussed in Part I of this series, there is already a VMAX configured with SRDF connectivity enabled. I'm not going to go into much detail about this, purely because it would require a lot more than a single blog post. I'm mostly concerned with getting replicated volumes into a state ready for SRM.

To give a very brief description of some of the terminology being used:

R1 Device / RDF1 Device Group = Source (Read/Write) volume or group of volumes on Primary Site Array.

R2 Device / RDF2 Device Group = Target (Write Disabled) volume or group of volumes on Secondary Site Array.

SRDF Group = Used to define a relationship between two Symmetrix Storage Arrays that are connected by SRDF. It defines SRDF Director ports assigned on local and remote arrays. Each SRDF volume must be assigned to an SRDF Group as an R1 or R2 device.

Device Group = User-defined object which groups together Symmetrix devices. It enables you to address multiple R1/R2 volumes by giving them a group membership. Device Groups propagate using GNS (Global Name Service), which copies group information to remote SYMAPI configuration databases to enable consistent naming schemes. (In my head I think of AD groups and replication between DCs as an analogy.)

There is plenty of information available out there from EMC and affiliated groups of people, as I mentioned in Part I of my series, which have enabled me to learn and get this working. There is a lot of documentation and some of it is heavy going if you aren’t an EMC SAN guru, but it is well worth reading.

Right, on with the show!

Configure Volumes

The process of configuring a volume on the VMAX is quite straight forward. I use the Unisphere Web Console because I find it easy and quite idiot proof.

1) Log in to each site's VMAX array and configure a new volume as shown below. Ensure, under Advanced Settings, that the device has a dynamic capability of "RDF1_OR_RDF2_Capable".
SRDF_1
In this example, I needed 4 volumes on each site.

2) The volumes should create as normal and just be normal TDEV (Thin Device) volumes at this stage.

Primary Site:
SRDF_2

Secondary Site:
SRDF_3

3) At this time I recorded all volume numbers, WWNs, pool memberships, etc. for quick reference and good documentation practice.

Create SRDF Group

1) On the Primary Site VMAX, navigate to Data Protection, Replication Groups and Pools, SRDF Groups:
SRDF_4

2) Select "Create" and then fill out the information for the group, including a Group Label (name), communication protocol, unique group number, and the director ports that you have assigned for SRDF on the VMAX:
SRDF_5

3) Close the success message. A new group label should appear in the Web UI.
SRDF_6

4) It is also possible to check the group via the SYMCLI. This can be done on the Array Manager server as configured in Part V.

SRDF_7

SRDF_8
Where XX2 is the last 3 digits of your Symmetrix array ID as seen in step 4. For me, XX2 is Local (R1) and XX4 is Remote (R2).
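I haven't reproduced the exact commands from my screenshots, but the checks were along these lines (XX2 is illustrative; substitute your own local array ID):

symcfg list
symcfg list -sid XX2 -rdfg all

The first command lists the locally attached and remote arrays (which is where the XX2/XX4 IDs come from), and the second lists the SRDF groups defined on the local array, which should now include the group created in Unisphere above.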

5) Create a text file in a directory on the Array Manager host, listing the volumes created earlier from the Local and Remote arrays, as below:
SRDF_9
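For anyone recreating this, the pairs file is just plain text with one local/remote device pair per line. The device IDs below are made up purely for illustration:

01A1 02B1
01A2 02B2
01A3 02B3
01A4 02B4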

6) Run CMD as Administrator and enter the following command, when prompted hit “Y” to confirm:

SRDF_10

SRDF_11
This command pairs the devices listed in the text file, making them RDF1 (source) devices on the local array in SRDF Group 4 (created above), in Adaptive Copy mode.
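The exact command is in the screenshot, but from memory it was roughly of this form (the file name and flags here are illustrative, so check the Solutions Enabler SRDF CLI guide for your version):

symrdf createpair -sid XX2 -rdfg 4 -file devicepairs.txt -type RDF1 -invalidate R2

The -type RDF1 flag makes the local devices the R1 (source) side. The replication mode can either be specified at createpair time or changed afterwards with symrdf set mode, which is what I do further down.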

7) Check the status of the SRDF Group and the paired volumes within, using the device pairs file:

SRDF_12
Notice that the pair state is suspended. The astute reader might also see that I have 5 volume pairs in this group rather than 4 as per my earlier creation. This was because I needed another later on so I added it in before recording my steps.

8) Now the devices have been paired, start the non-consistent SYNC of the devices. Run the command below and accept the prompt with “Y”:

SRDF_13
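Again, roughly what the screenshot shows, assuming the same illustrative file name and group number as above:

symrdf -sid XX2 -rdfg 4 -file devicepairs.txt establish

This starts copying the R1 devices to their R2 partners until they are fully synchronised.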

9) Check the status of the SYNC. In this case, both volumes are “empty” and thin provisioned. Run the command:

SRDF_14

The state "SyncInProg" can be seen. This command refreshes every 5 seconds, showing the progress of the volume synchronisation.
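For reference, the query in my screenshot was something like the following, with -i 5 giving the 5-second refresh interval (file name and group number as per my illustrative examples above):

symrdf -sid XX2 -rdfg 4 -file devicepairs.txt query -i 5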

10) Once complete, run a final check on the status of the SRDF Group:

SRDF_15

11) As the SRDF SRA requires Synchronous or Asynchronous replication, set the replication mode of the SRDF Group to Asynchronous (SRDF/A), confirming the prompt:

SRDF_16

SRDF_17
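The mode change was along these lines, using the same illustrative file name and group number, and accepting the prompt with "Y":

symrdf -sid XX2 -rdfg 4 -file devicepairs.txt set mode async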

Configure Device Group

This will enable the volume pairs within an SRDF Group to be addressed without having to use a text file to specify the IDs. The group will replicate across both arrays using GNS. I created and configured the device group on the Primary array.

1) Create the device group and give it a name:

SRDF_18
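Roughly speaking, the creation command looks like this; SRM_DG is just an example name I'm using for these snippets, not necessarily what is in my screenshots:

symdg create SRM_DG -type RDF1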

2) Add each of the devices to the group and check it has worked by querying it:

SRDF_19

Repeat for all devices that are R1. The group will automatically be created and populated on the Remote Array.
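Adding a device and then repeating for the rest looks something like this (the group name and device IDs are the illustrative ones from my earlier examples):

symld -g SRM_DG add dev 01A1
symld -g SRM_DG add dev 01A2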

3) Run a check on the new Device Group by showing the full details:

SRDF_20

SRDF_21
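The "show" itself is simply the following, using my illustrative group name:

symdg show SRM_DG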

If you check the status in the output above, you will see that the device group is not consistent. Consistency is required for SRM.
SRDF_22

4) To enable device consistency on the group, run the following command and accept the prompt:

SRDF_23

After this, re-run the same show command on the group and check the consistency state:
SRDF_24
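For anyone without the screenshots, enabling consistency on a device group is along these lines (illustrative group name again); run it, accept the prompt, then re-run the show command:

symrdf -g SRM_DG enable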

5) It is now possible to query the Device Group and its contents without having to use the manual text file.

SRDF_25
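Once the device group exists, the query no longer needs the pairs file; something like this is enough (illustrative group name):

symrdf -g SRM_DG query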

6) The volumes can now be presented to the VMware ESXi storage groups that are configured for your hosts on each VMAX. Present the R1 devices to your Primary Array Storage Group and the R2 devices to the Secondary Array Storage Group:
SRDF_26

Conclusion

This finishes off quite a large post containing a bit of a crash course in configuring replicated volumes on a VMAX 20K using SRDF/A.

I have experienced quite a few storage arrays in my time but the VMAX is at the high end in terms of capability, performance and availability. The above, for me, was a challenge in understanding some concepts and getting it to work. I had a few difficulties along the way, which EMC Support were very good at assisting me with. I'm always happy to bash support when they are useless, but in this case I got through to a couple of guys in Ireland who were excellent.

As I mentioned at the start, there is a wealth of information that could have gone into this. I've covered the main bits in terms of implementation, but there is a fair bit more work around configuration and general administration of the VMAX that might not be here. I'm happy to answer questions or deflect to the right person if anyone reading this has any further questions.

In the next post, I’m going back into VMware and configuring the environment ready for SRM.

SRM with VMAX 20K Part VIII: Configuring EMC Virtual Storage Integrator

This is a very short and sweet post about implementing the EMC Virtual Storage Integrator (VSI). This is a plug-in for the vSphere client which enables further management of your specific array. There are advanced options made available for SRDF, Storage Views and Management. It is quite useful to have, although I haven't really used it in anger.

The process is really quite simple.

Install VSI

1) Download the install media for the latest supported version of the VSI. For me that was VSI 5.7.0.

2) Log in to the Primary (protected) SRM Server via RDP.

3) Run the installer. At the prompt, read and accept the EULA. Click Install, accept the UAC prompt if you get one:
VSI_1

4) At the summary screen, select Finish.
VSI_2

5) Run the vSphere client as Administrator (this works around a bug with UAC and VSI in the version I was using) and log in.

6) On the home screen in vSphere, there should now be an icon under “Solutions and Applications” for the VSI:
VSI_3

7) Select the VSI icon. Enter in the information for the local Array Manager server configured in Part VI.
VSI_4
The admin user is the default admin credential of the Solutions Enabler SMI-S service.

8) Take note of some EMC SRDF SRA Global Options presented.
VSI_5

9) Once complete, log in to the Primary SRM vCenter through the client. There is a top level tab for "EMC SRDF SRA Utilities". Select the "Login" link in the top right-hand corner:
VSI_6

10) Enter in the details for the Recovery Site vCenter and Array Manager.
VSI_7

11) Make sure the connection is successful and click Finish.
VSI_8

12) Repeat steps 1-11 on the Recovery Site SRM Server.

Conclusion

Nothing extreme here in the slightest, a very simple bit of software. It is worth noting that once you are logged in you might want to change the default admin password that the VSI uses to connect to the array via the vSphere client.

I believe this tool can be quite powerful so please use at your own risk and be very careful using it!

In the next blog post, I will get around to configuring replicated volumes on the VMAX Arrays for use with SRM.

SRM with VMAX 20K Part VII: Installation of Storage Replication Adapters

In this post, I'm going to cover the installation of the Storage Replication Adapter (SRA) for Site Recovery Manager. When using array-based replication, this is a must-have piece of software that the vendor provides, as it enables SRM to integrate and work with the specific array. The SRA needs to be installed on each SRM host at both the recovery and protected sites.

The main things to consider with the SRA are:

– Site Recovery Manager needs to be installed first, before proceeding (obviously!).

– Check SRA availability for your storage array type by checking the VMware Compatibility guide for SRM

– Make sure the same version is installed at each site.

Installing the SRA

1) Make sure you have downloaded the necessary SRA for configuration. For me, this was the EMC SRDF SRA v5.6.0.0

2) Log in to the primary site SRM server via RDP.

3) Run the installer and at the welcome prompt, hit next.

SRA1

4) At the EULA, read fully and carefully. Print out a copy, take out an entire rainforest, and make sure your legal team fully investigate any ramifications, then click Next.
SRA2

5) The wizard now auto detects the installation path for Solutions Enabler, which I thought was cool. Click Install.
SRA3

6) When all is done, click Finish.
SRA4

7) At this stage, restart the Site Recovery Manager service.

8) Repeat steps 1-7 on the Recovery Site SRM Host.

9) Open up vCenter on the primary SRM Server, navigate to the plugin and enter in your credentials.

10) On the Array Managers tab, select “Add Array Manager”.
SRA5

11) Enter in the information for the Array Manager servers that have been previously configured.
SRA6

12) On the next configuration screen, enter in the details of the Primary Protected Site Array Manager server. Fill out the details of the Recovery site Array Manager in the second section.
SRA7

13) Click Next and ensure there is a success screen.
SRA8

14) Now log in to the Recovery Site SRM Server, repeat steps 9-13 and configure the Array Manager in the opposite direction:
SRA9

SRA10

15) Once this is complete. Login to the first site and select the Array Managers tab. There should be a discovered Array Pair available, click “Enable”.
SRA11

16) If you select the "Devices" tab and refresh, you should see all the volumes on the VMAX that have been created as SRDF pairs, are contained in a device group, and are presented to VMware.
SRA12

This will only be present if you have configured SRDF and RDF1/2 volumes, which I will cover in more detail in my next post.


Conclusion

This section of the guide is relatively straightforward. For documentation purposes I have put the software installation side of the SRA before configuring the volumes to be presented. In reality, this can be done first, and then you can create your replicated volumes and discover them once you have presented the storage at a later date.

As I stated in my previous post, to link up the SRA to the remote array manager, port 2707 must be open between the SRM server and the Array Manager servers.

Nothing too difficult here. I'll be going into more detail about the EMC Virtual Storage Integrator in my next post!