Disaster recovery is often an area that doesn’t get as much planning and attention as it should. Most IT departments and smaller outsourcing firms are so busy putting out fires that actually testing whether the data can be recovered almost never happens. Another problem I often see is that the business owner usually has a completely different set of RPOs (recovery point objectives) and RTOs (recovery time objectives) in mind than IT has. For most companies, a single backup at the end of the business day just doesn’t cut it anymore. Staff expect their data to always be accessible, and when something happens to it, they expect to get a recent copy back fast. So what does all this mean, and what should you do as a business owner to make sure your company’s most valuable asset (your data) is protected and recoverable when you need it most?
- What are your recovery point objectives?
Ask yourself this question: if I am working on a budget plan for the year and I accidentally delete the file, what is an acceptable point in time to have that file recovered to? Am I OK with last night’s data? Do I need a copy from 8 hours ago? Or do I need the file from 1 hour ago because I was actively working on it all day? This is defined as your RPO.
- What are your recovery time objectives?
Another question to ask yourself is, “If you had a complete system failure, how long can your systems be down before you start losing money?” If there was a natural disaster that caused power to be lost for an extended period of time, do you need to be up and running in a remote location, and if so, how fast? This is defined as your RTO.
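To make the two definitions concrete, here is a minimal Python sketch of how each objective could be checked. The function names, dates, and thresholds are illustrative assumptions, not part of any backup product: the RPO check asks how old the newest recovery point is, and the RTO check asks how long an outage lasted.

```python
from datetime import datetime, timedelta


def meets_rpo(last_recovery_point: datetime, now: datetime, rpo: timedelta) -> bool:
    """Return True if the newest recovery point is no older than the RPO."""
    return now - last_recovery_point <= rpo


def meets_rto(outage_start: datetime, service_restored: datetime, rto: timedelta) -> bool:
    """Return True if service came back within the RTO."""
    return service_restored - outage_start <= rto


# Hypothetical values for illustration only.
now = datetime(2024, 1, 15, 14, 0)

# A snapshot taken 10 minutes ago satisfies a 15-minute RPO.
print(meets_rpo(datetime(2024, 1, 15, 13, 50), now, timedelta(minutes=15)))  # True

# Last night's backup does not satisfy a 1-hour RPO.
print(meets_rpo(datetime(2024, 1, 14, 23, 0), now, timedelta(hours=1)))  # False

# A 4-minute outage satisfies a 5-minute RTO.
print(meets_rto(datetime(2024, 1, 15, 9, 0), datetime(2024, 1, 15, 9, 4), timedelta(minutes=5)))  # True
```

The point of the sketch is that both objectives are simple time budgets: one measures how much work you can afford to lose, the other how long you can afford to be down.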
These are two very important questions that your IT partner should be asking you. The answers you provide will help IT design the appropriate DR plan to meet your RPOs and RTOs. Now that you have these two objectives defined, what can you do to make sure they are actually being met and, most importantly, tested and documented? Below are some great technologies you could implement to help you achieve both short RTOs and frequent recovery points.
- Demand for RPOs as short as 15 minutes:
- — Volume Shadow Copy:
This is a free, built-in feature of Windows Server that can be enabled on each volume, on a schedule that meets your needs.
- — Dell AppAssure Replay:
Replay can take snapshots of your mission-critical applications as often as every 15 minutes, and those snapshots can then be replicated to an off-site server in a remote location. For applications like SQL and Exchange, Replay has a unique feature: it tests whether the databases can be mounted and notifies you if there are any issues.
- Demand for RTOs of 5 minutes or less:
- — Hyper-V Replicas:
With the release of Windows Server 2012, Hyper-V now has virtual machine replication (Hyper-V Replica) built in! VMs can be replicated to a warm standby server on-site as often as every 5 minutes, or sent off-site to the cloud. This is a great low-cost option for getting back up and running quickly in the event of a failure.
- — NeverFail:
If you need true high availability with near-zero downtime, HA software like NeverFail will be your best option. NeverFail makes sure your data is always available, and it can fail over automatically to a remote server.
Great! I have my RPOs and RTOs defined, and I have selected the software vendor I am going to use. How often should I actually test this, and how do I know it’s being done?
Each business should have a documented disaster recovery plan that outlines all of the backup solutions in place, how often they run, and a procedure to recover the data for each solution. In my managed services practice, we receive daily notifications that let us know whether the backups have run successfully. This is a great starting point; however, we want to make sure the data is actually recoverable. Each week we run through a manual check where we validate the ability to recover a file. We also like to do a complete DR test at least quarterly, where we bring the DR site online and have a few employees verify they can access their data. Each check is documented in a managed services ticket and sent to the client contact with the results of the test for their records. If the recovery process is not up to date in the DR plan, the engineer is responsible for updating the plan with the most recent recovery steps.
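As one way to make the weekly file-recovery check repeatable, here is a minimal Python sketch that compares a restored file against a reference copy by checksum. It is an illustration under stated assumptions, not the procedure used in my practice: the function names are hypothetical, and it assumes you keep a reference file that is known not to have changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(reference: Path, restored: Path) -> bool:
    """Return True if the restored file is byte-for-byte identical to the reference.

    Assumption: the reference file has not been modified since the backup ran.
    """
    return sha256_of(reference) == sha256_of(restored)
```

After restoring a test file to a scratch location, calling `restore_matches(reference_path, restored_path)` gives a pass/fail result that can be recorded in the ticket alongside the date and the recovery point used.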
I hope this helps you think more about your disaster recovery requirements and gives you some ideas for how to meet just about any RPO or RTO. If you need assistance creating a DR plan, or would like help validating an existing one, please contact me to schedule a consultation!