The Information Technology Infrastructure Library (ITIL) is a set of best practices for improving IT services and aligning them with business needs. ITIL processes provide various key performance indicators (KPIs) that let you measure IT performance and cost. ITIL also provides benchmarks for measuring IT service delivery, improving quality and productivity, and ensuring better asset utilization.
Mapping your disaster recovery (DR) plan to ITIL processes can help create a more business-driven DR plan and improve your DR plan performance. ITIL processes can also assist in maintaining and testing your DR plan.
Mapping ITIL processes to DR
Ensuring smooth ITSCM
Consistent IT service continuity management (ITSCM) is essential to any DR plan. ITSCM is often implemented using a top-down approach. The best way to ensure an enterprise-wide structure for ITSCM is to draft a policy and present it to the key stakeholders for approval. Once the policy is signed, define the scope of the project and allocate resources to the project plan. ITSCM can be addressed using the service design (SD) 4.5 and continual service improvement (CSI) 5.6.3 processes of ITIL.
Defining individual continuity plans
Plan your DR at both the organizational level and the level of individual business units. Individual DR planning is discussed in SD 188.8.131.52 of the ITIL processes, which recommend that each individual unit identify key recovery scenarios based on risk assessment and business impact analysis (BIA).
Analyzing IT risks
Risk analysis determines the probability and impact of disaster threats or vulnerabilities, and can be mapped to SD 184.108.40.206. The ITIL processes recommend the Management of Risk (M_o_R) standards for assessing IT risks. This method involves creating a risk profile based on severity of impact and the likelihood of occurrence. Risk acceptance criteria are also defined in this process. After doing a risk analysis, you can plan risk reduction measures.
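To illustrate how a risk profile can combine severity of impact with likelihood of occurrence, here is a minimal Python sketch. The 1-to-5 scales, the multiplication used for scoring, the threshold values and the sample risks are all assumptions made for illustration; they are not taken from ITIL or M_o_R, and your own risk acceptance criteria should replace them.

```python
from dataclasses import dataclass

# Hypothetical risk acceptance criteria (assumed values, not from ITIL or M_o_R).
ACCEPT_BELOW = 6     # scores under this are accepted as-is
REDUCE_FROM = 15     # scores at or above this need risk reduction measures


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple likelihood x impact scoring, one common convention.
        return self.likelihood * self.impact


def classify(risk: Risk) -> str:
    """Apply the acceptance criteria to a single risk."""
    if risk.score >= REDUCE_FROM:
        return "reduce"
    if risk.score < ACCEPT_BELOW:
        return "accept"
    return "monitor"


def risk_profile(risks):
    """Return (name, score, treatment) tuples, highest score first."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    return [(r.name, r.score, classify(r)) for r in ranked]


# Made-up example risks.
risks = [
    Risk("Data center power failure", likelihood=3, impact=5),
    Risk("Single disk failure", likelihood=4, impact=1),
    Risk("Regional flood", likelihood=2, impact=5),
]
profile = risk_profile(risks)
```

Sorting by score puts the threats that most need risk reduction measures at the top of the profile, which is where planning would normally start.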
Doing a business impact analysis (BIA)
SD 220.127.116.11 of the ITIL standards framework provides guidelines for doing a business impact analysis. Once you identify the prime disaster scenarios, it is crucial to quantify their impact on key business processes as well.
A BIA will reveal both the financial and non-financial impact of a disaster. For example, the BIA might indicate consequences such as loss of revenue, reputation, or breach of law.
When doing a BIA, also take into account the change in business priorities over time. Seek the opinion of both senior management and those who are likely to be directly impacted by the disaster.
Once a BIA is done, ITIL processes stipulate that the recovery time objective (RTO) and recovery point objective (RPO) for business processes be calculated. RTO is the maximum acceptable time within which services must be restored after a disaster, and RPO is the maximum acceptable data loss, usually expressed as a period of time.
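As a rough illustration of how RTO and RPO targets can be checked against what the current setup actually delivers, consider the Python sketch below. The process name, the target values, the backup interval and the measured recovery time are invented for the example, and the worst-case assumption (data loss equals one full backup interval) is a simplification.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    process: str
    rto: timedelta  # time allowed to restore the service
    rpo: timedelta  # window of data the business can afford to lose


def rpo_met(objective: RecoveryObjective, backup_interval: timedelta) -> bool:
    # Worst case: the disaster strikes just before the next backup,
    # so roughly one full backup interval of data is lost.
    return backup_interval <= objective.rpo


def rto_met(objective: RecoveryObjective, measured_recovery: timedelta) -> bool:
    # Compare the recovery time observed in a DR test with the target.
    return measured_recovery <= objective.rto


# Hypothetical targets for a hypothetical billing process.
billing = RecoveryObjective("Billing", rto=timedelta(hours=4), rpo=timedelta(minutes=15))

rpo_ok = rpo_met(billing, backup_interval=timedelta(hours=1))     # hourly backups
rto_ok = rto_met(billing, measured_recovery=timedelta(hours=3))   # last DR test
```

In this made-up case the recovery time target is met, but hourly backups cannot satisfy a 15-minute RPO, which would point to more frequent backups or replication.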
Authorize a crisis management team to invoke the DR plan if certain predefined criteria identified in the BIA are met. This DR activity can be mapped to change management and configuration management processes discussed in SD 18.104.22.168 and 22.214.171.124 of the ITIL processes.
Focusing on critical infrastructure
It’s essential to consider availability needs while planning critical infrastructure requirements, so that resilience is built in. Resilience is the ability of a set of configuration items (CIs) to continue to provide a required function when some CIs have failed. Resilience and prioritization are dealt with in the availability management processes of SD 126.96.36.199 of the ITIL processes framework.
Changing controls to suit shifting business requirements
Update the various IT service continuity procedures and plans via change management to ensure that they are accurate and up to date. Changes to DR plans are addressed in the ITIL change management process, service transition and SD 188.8.131.52 of the ITIL processes.
Record the CIs for the critical services to be recovered during a disaster in the configuration management database (CMDB). This will help you track the CIs, establish their ownership, maintain version control, and ensure that the plan stays current.
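The kind of record this implies can be sketched in Python as follows. This is not the schema of any particular CMDB product; the field names, the sample entries and the one-year review window are assumptions chosen only to show how ownership, version and currency might be tracked for each CI.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ConfigurationItem:
    ci_id: str           # hypothetical identifier format
    service: str         # critical service the CI supports
    owner: str           # team accountable for the CI
    version: str         # version recorded for recovery purposes
    last_reviewed: date  # when the DR record was last confirmed


def stale_items(items, today, max_age_days=365):
    """Return IDs of CIs whose record was not reviewed within max_age_days."""
    return [
        ci.ci_id
        for ci in items
        if (today - ci.last_reviewed).days > max_age_days
    ]


# Made-up entries for illustration.
cmdb = [
    ConfigurationItem("CI-001", "Billing", "db-team", "12.4", date(2011, 9, 1)),
    ConfigurationItem("CI-002", "Billing", "network-team", "3.1", date(2010, 3, 15)),
]
overdue = stale_items(cmdb, today=date(2011, 10, 1))
```

A periodic check like this flags the entries that need review before the DR plan drifts away from the real infrastructure.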
Understanding off-site storage requirements
Service Design 184.108.40.206 and Service Operations 5.2.3 of the ITIL processes explain how to work with business process owners on off-site storage. You need to understand how the current data classification policy is implemented, how critical the data is, and who its users are. Answers to these broad-level questions will help determine the off-site storage for critical media, documentation, and resources.
Conducting regular tests and training
Service Design 220.127.116.11 of the ITIL processes can help define a regular test schedule and program for DR. SD 18.104.22.168, 22.214.171.124, and CSI processes elaborate on the importance of staff training for key aspects of DR.
Establish a regular schedule of training and drills so that employees are prepared for a disaster scenario. ITIL processes can help define KPIs to measure the regularity with which the test program is being adhered to, and the effectiveness of the training sessions.
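Two such KPIs could be computed along the lines of the Python sketch below. The definitions are assumptions made for illustration (ITIL does not prescribe these formulas): adherence is taken as the share of planned DR tests run on schedule, and training effectiveness as the share of participants whose drill assessment score reaches a hypothetical pass mark of 70.

```python
def schedule_adherence(planned: int, completed_on_time: int) -> float:
    """Percentage of planned DR tests that were run on schedule."""
    if planned <= 0:
        raise ValueError("planned must be positive")
    return 100.0 * completed_on_time / planned


def training_effectiveness(scores, pass_mark=70) -> float:
    """Percentage of participants whose assessment score met the pass mark."""
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for s in scores if s >= pass_mark)
    return 100.0 * passed / len(scores)


# Made-up figures: four tests planned for the year, three run on time,
# and four staff assessed after a drill.
adherence = schedule_adherence(planned=4, completed_on_time=3)
effectiveness = training_effectiveness([85, 60, 90, 72])
```

Tracking both figures over successive periods shows whether the test program is slipping and whether the training is actually preparing staff.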
Implementing an action plan
Implementation of the DR action plan is addressed in SD 126.96.36.199 and 188.8.131.52 of the ITIL processes. As stated by the ITIL process, a DR plan should be change- and version-controlled, with a specific distribution list.
Maintain standard procedures and a defined list of people to be contacted during a disaster, and keep a copy of the plan at an off-site location. Equip the service desk with these procedures and call lists, and appoint it as the single point of contact (SPOC) during a disaster. The service desk will be responsible for coordinating with the designated personnel to execute the DR plan.
Planning for the IT recovery phase
Once recovery is complete, operations should run from the DR site as per the signed service level agreements (SLAs). When the primary site has been restored, vacate the DR site and resume operations at the primary site, planning the switch so that business impact due to downtime is minimized. Service Design 184.108.40.206 and 220.127.116.11 of the ITIL processes address the recovery phase.
Understanding business processes for the future
ITIL’s service strategy defines a portfolio of current and future services offered by an organization. Following the inventory, the business impact and return on investment (ROI) of these services can be understood. It is important that regular management assessments and reviews are carried out for the DR plans to stay current.
For instance, the assessment levels defined can be mapped to internationally recognized standards such as the Capability Maturity Model (CMM). Conduct these assessments annually, or after major infrastructure changes occur. SD 18.104.22.168 and SD 22.214.171.124 of the ITIL processes can be mapped to this DR activity.
About the author: Anand Choksi, a senior consultant at Aujas Networks, is primarily responsible for leading engagements in ISO 27001, business continuity, disaster recovery, data protection and information risk management. He has an ITIL Foundation certification. Anand holds a bachelor's degree in business administration (BBA) and a master's degree in computer applications (MCA).
(As told to Mitchelle R Jansen)
This was first published in October 2011