
Myth busted: vCenter Site Recovery Manager using vSphere Replication for Datacenter migration causes data loss

There seems to be a lot of confusion out there about how vCenter Site Recovery Manager works with vSphere Replication when used for a data center migration, thanks in large part to FUD from competing products. Many admins still believe that a data center migration using vSphere Replication with vCenter Site Recovery Manager will lose up to 15 minutes of data. This confusion stems from the following two limitations of vSphere Replication:

  • The lowest RPO possible using vSphere Replication is 15 minutes.
  • You cannot replicate powered-off virtual machines. Replication begins when the virtual machine is powered on. You cannot use vSphere Replication to replicate virtual machine templates. <== This statement is taken straight from the vSphere documentation.

Here is how the confusion came to life. If you have used or read about vCenter Site Recovery Manager with storage replication and looked at the sequence of events during a data center migration, you will have noticed that it does a final sync of the data between the two sites right before it cuts the replication between them. If you compare that with vSphere Replication while keeping the above two limitations in mind, you might reason as follows: when the data center migration is initiated in SRM, it shuts down the VM; at that moment the replica may be lagging by up to 15 minutes, depending on the configured RPO; and since vSphere Replication cannot replicate a powered-off VM, the VM will lose up to 15 minutes of data when it comes up at the other site. That is not true, as this reasoning misses a very minor but important detail that many people seem to overlook.…

Qtree SnapMirror warnings and limitations with SRM

While setting up SRM with a NetApp 6290 at a customer site, my customer was using qtrees and Qtree SnapMirror, which caused us a few issues. If you are setting up SRM with NetApp Qtree SnapMirror, there are quite a few warnings, limitations, and best practices that you need to be aware of. I have listed the most common ones below; for a more complete list, check the following document: http://www.netapp.com/us/media/tr-4064.pdf

– Avoid using hidden qtrees, as they seem to cause problems with several versions of the NetApp SRA for SRM. One of the most common errors caused by such a configuration is:

Error: Failed to sync data on replica device ‘/vol/volume_name/lun#’. Device found is neither of SAN type nor of the NAS type. Ensure that the device exists on the storage array and is of type NAS or SAN.

SRM NetApp Hidden Qtree error

– If you have configured qtrees as NFS datastores, you must create an NFS export for each qtree so that SRM can discover the NFS datastores. If you export only the volume that contains the qtrees, so that the /etc/exports file has a single export line for the volume, SRM will not be able to discover the qtree NFS datastores.
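
The per-qtree export requirement can be checked with a small script before running SRM discovery. The following is a minimal sketch, not part of the NetApp or SRM tooling: the function name qtrees_missing_export, the host names, and the sample paths are all hypothetical, and it assumes an /etc/exports file in which each export line begins with the exported path.

```python
def qtrees_missing_export(exports_text, qtree_paths):
    """Return the qtree paths that have no export line of their own.

    Assumption: every non-comment line of /etc/exports starts with the
    exported path, followed by whitespace and the export options.
    """
    exported = set()
    for line in exports_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        exported.add(line.split()[0].rstrip("/"))
    return [q for q in qtree_paths if q.rstrip("/") not in exported]


# Hypothetical sample: the volume and only one of its two qtrees are exported.
sample_exports = """\
# /etc/exports (sample)
/vol/vol_nfs      -sec=sys,rw=esx01:esx02,root=esx01:esx02
/vol/vol_nfs/qt1  -sec=sys,rw=esx01:esx02,root=esx01:esx02
"""

missing = qtrees_missing_export(
    sample_exports, ["/vol/vol_nfs/qt1", "/vol/vol_nfs/qt2"]
)
print(missing)  # qtrees that still need their own export line
```

Any path the script reports is a qtree that is only reachable through the volume-level export, which is the case the warning above describes.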

SRM Error: XmlValidateException Element SourceDevice is not valid for content model: Source Device.

While setting up vCenter Site Recovery Manager 5.0.3 with a NetApp 6290 filer, in particular at the stage where I tried to enable the array pair, and after I had fixed the timeout error documented in the post "SRM Time Out (300 seconds) while waiting for SRA to complete discoverDevice Command", I ran into this error:

Internal error: std::exception ‘class Dr::Xml::XmlValidateException’ “Element ‘SourceDevices’ is not valid for content model: ‘(SourceDevice,)’”.

A screen capture of the error is below:

 

SRM NetApp 6290 element SourceDevice is not valid for content model

After a bit of digging I found out that this is a well-known NetApp SRA 2.0.1 bug, which is documented here, and that there are a few possible workarounds. I document these workarounds below so that you can pick the most feasible option for your environment.

1- Revert to NetApp SRA 2.0.0, which does not suffer from this problem; the bug only seems to affect NetApp SRA 2.0.1. This is the solution I went with, as it required the least amount of work and maintenance going forward.

2- Download and install NetApp SRA patch 2.0.1P2. This patch does not seem to be generally available, so you will have to request it from NetApp support or dig for it in the NetApp support portal.…

SRM Time Out (300 seconds) while waiting for SRA to complete discoverDevice Command

While setting up vCenter Site Recovery Manager 5.0.3 with NetApp storage, in particular at the stage where I tried to enable the array pair, I was greeted with the following error:

SRM Time Out (300 seconds) while waiting for SRA to complete ‘discoverDevice’ Command.

 SRM Time out while waiting for SRA to complete discoverdevice command

 

This SRA error is not limited to NetApp storage; a similar error can be seen with EMC and other vendors' storage. If your storage is very busy or has too many devices on it, the operation simply cannot complete within the 300-second (5-minute) timeout set by default in SRM. The solution is simple: increase storage.commandTimeout. In my case, I increased it to 1200 seconds (20 minutes), which should be more than sufficient for most environments. Note that this timeout is set per site and must be changed on both sites involved in your SRM setup. The exact steps are below:

1. Click Sites in the left pane, right-click your primary site, and click Advanced Settings.

2. In the navigation pane of the Advanced Settings window, click Storage.…
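
Because storage.commandTimeout has to be raised on each site separately, it is easy to change one site and forget the other. As a rough sketch only, the snippet below compares values you have noted down by hand from each site's Advanced Settings window against the 1200-second target used in this post. It does not query SRM; the function name sites_below_target and the site labels are hypothetical.

```python
DEFAULT_TIMEOUT_S = 300    # SRM default mentioned in this post (5 minutes)
TARGET_TIMEOUT_S = 1200    # value used in this post (20 minutes)


def sites_below_target(site_timeouts, target=TARGET_TIMEOUT_S):
    """Return the names of sites whose recorded timeout is below the target."""
    return sorted(name for name, secs in site_timeouts.items() if secs < target)


# Hypothetical values recorded manually: the recovery site was left at default.
recorded = {"primary": TARGET_TIMEOUT_S, "recovery": DEFAULT_TIMEOUT_S}

print(sites_below_target(recorded))  # sites that still need the change
```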

VMware vCenter Site Recovery Manager Service failed to start

While installing VMware SRM 5.0.3 at a customer site, the installation on both nodes gave me the following error right before completing:

Failed to start service.

Details:
VMware vCenter Site Recovery Manager Service failed to start.
Check that all required Windows services are running. View the server log for more information.

Press Retry to try again or press Cancel to exit installation.

For those of you who want to see the actual error as it shows on screen, below is a screenshot:

SRM Service failed to start error

It turned out the solution is really simple: in Windows Service Manager, change the VMware Site Recovery Manager service to log on with the domain service account you used during the SRM installation rather than the local account, then try to start the service again. The service should start, and you can then hit the Retry button in the installation wizard, after which the installation of Site Recovery Manager completes successfully.

I am not sure why, during SRM setup, the VMware Site Recovery Manager service got installed to run as Local System rather than as the service account entered during the installation. I thought I would share this quick fix with anyone stumbling on it.…
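
If you prefer to script the logon-account change instead of clicking through Windows Service Manager, the same fix can be sketched with the built-in sc.exe tool. This is an illustration under assumptions, not the procedure used above: the service name "vmware-dr" is assumed and should be confirmed in Service Manager first, and the account and password shown are placeholders. Unlike the Services console, sc.exe does not grant the "Log on as a service" right, so the account must already have it.

```python
import subprocess


def build_sc_config_command(service_name, account, password):
    """Build the sc.exe command that changes a service's logon account.

    sc.exe expects "obj=" and "password=" as separate tokens, each followed
    by its value.
    """
    return ["sc.exe", "config", service_name,
            "obj=", account, "password=", password]


def build_sc_start_command(service_name):
    """Build the sc.exe command that starts the service."""
    return ["sc.exe", "start", service_name]


def apply_service_account(service_name, account, password, dry_run=True):
    """Return the commands; run them only when dry_run is False (Windows only)."""
    commands = [
        build_sc_config_command(service_name, account, password),
        build_sc_start_command(service_name),
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


# Dry run with placeholder values; "vmware-dr" is an assumed service name.
planned = apply_service_account("vmware-dr", r"DOMAIN\SrmService", "<password>")
for cmd in planned:
    print(" ".join(cmd))
```

Run it with dry_run=False from an elevated prompt on the SRM server once the values are confirmed, then go back to the installation wizard and press Retry.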

VMware SRM NetAPP SRA required user permissions

While recently setting up VMware SRM for a customer with a NetApp filer, I faced a challenge: the customer wanted to use the minimum required permissions for the user that the Storage Replication Adapter (SRA) uses to connect to the NetApp filer. At a minimum, the customer wanted an account other than root, to be able to audit which user carried out the changes. After some research, I found the three ways below to create a user for the NetApp SRA.

1- Use the NetApp root user. If this does not violate your security policy, it is the easiest route, as you do not have to change anything on the NetApp filer. For secure environments, I would recommend one of the two methods below.

2- Add a new user to the NetApp Administrators group. This seems to be the most commonly used method, as it is easy to do and still lets you audit filer activity: you will be able to distinguish actions invoked by SRM from ones invoked by the root account. Use the commands below at your NetApp console to create a new NetApp administrator for your SRM SRA:

useradmin user add SrmUser -g Administrators
 

Or, if you want to use a domain user (assuming your NetApp filer is configured with domain authentication):

useradmin domainuser add DOMAIN\SrmUser -g Administrators
 

3- Create a specific role with restricted permissions and add your SrmUser to it. While method 2 above gives you a distinct admin account for the SRM SRA, it grants that user unrestricted access to the NetApp filer. If you want to restrict the SRA to only the minimum required permissions, follow the steps below:

a.…

VMware Site Recovery Manager Licensing FAQs

Lately many questions about VMware Site Recovery Manager licensing have come up, especially ones related to VMware vSphere Essentials, vSphere Essentials Plus, and other acceleration kits. Answers to these were always a hassle to find, until I found this magical document with answers to many of them. You can find the document at: Site Recovery Manager Pricing Licensing FAQ – Q409.pdf

I have decided to post the questions and answers from this document on my blog due to their importance and the huge demand for them, as the document is not well indexed online and is quite hard to find. You can download the document above and use it offline, or look below for the answer you need (I have left out points from the document that seemed well known to me):

Q: Is Site Recovery Manager included with VMware vSphere?

A: Site Recovery Manager is not included as part of any VMware vSphere editions. Note that Site Recovery Manager does require a supported version of VMware vSphere or VMware Infrastructure.

Q: Will customers receive Site Recovery Manager as part of their Support & Subscription for VMware vSphere or VMware Infrastructure?

A: No, Site Recovery Manager requires an additional purchase.…

VMware Site Recovery Manager (SRM) 4 has just been released

I have great news today from VMware. They have released the new version of VMware Site Recovery Manager (SRM), which supports VMware vSphere. This release of SRM has been long awaited by many of us who have upgraded, or are planning to upgrade, to vSphere. I thought I would share the good news with everyone.

Support for vSphere is not everything: another great feature of the new SRM is the ability to support multiple sources replicating to a single target storage, which was not supported by earlier releases of SRM.

Another long-awaited feature for SRM is support for NFS datastores. Yes, NFS is finally supported by VMware Site Recovery Manager.

Although vCenter 4 Linked Mode is not a feature of VMware Site Recovery Manager 4, it is a great plus for the new SRM. vCenter 4 Linked Mode allows VMware administrators to manage both SRM sites from a single console.

The last thing worth mentioning is that VMware Site Recovery Manager 4 is compatible with vNetwork Distributed Switches.

You can download VMware Site Recovery Manager 4 here.

Stay tuned, and I will update you as I find out more about the new SRM release.…

VMware Site Recovery Manager, Automated Virtualized DR

By June, Site Recovery Manager will be available. This VMware product is intended to simplify the planning and testing of disaster recovery projects.

According to VMware,

Traditional disaster recovery planning exposes companies to a significant risk of prolonged outages, because it is too cumbersome to implement and too difficult to maintain and test.

Site Recovery Manager, which is of course compatible with VI3 and VirtualCenter, lets us automate DR based on virtual servers created for DR purposes.

VMware SRM is a small revolution because it not only automates failover of servers and storage between the main site and the DR site, but also lets you run tests in real time without impact on production. It is a way to verify that the DR part of the infrastructure is fully operational and ready when needed …

VMware Site Recovery Manager is compatible with replication software from many vendors, such as 3PAR, Dell, EMC, FalconStor, Hitachi Data Systems, HP, IBM, LeftHand Networks and NetApp. SRM is compatible with most storage replication software, which makes it largely independent of the equipment used in your DR.…