Business Continuity Management (BCM) comes into play only when a disaster occurs. Unlike the usual meaning of “disaster” in English, in the context of BCM it simply means “any event that causes an essential service to be interrupted”.
Business Continuity Management is process based and takes a holistic view, rather than just a technological view. It includes personnel, alternative sites, manual workarounds, and anything else that can be used to ensure that essential services continue to be provided at agreed minimum levels within specified time periods, and are then restored to full operation in a swift and cost-effective manner, as defined in the SLA (Service Level Agreement).
Why you need a plan
Organizations increasingly depend upon information processing and telecommunications support, so the importance of maintaining a high level of reliability through continuity planning will continue to increase.
Today an organization often faces considerable financial risk and damage to its name and reputation should normal business operations be unavailable in critical areas for even relatively short periods of time. Likewise it faces loss should it be seen to be without a suitable plan and framework in place to deal with possible outages.
Frequent news items about incidents such as power outages and prolonged loss of network, Internet and email services clearly demonstrate that relatively simple (and possibly predictable) risk events can take a significant period to recover from.
What have the organisations learned from these incidents?
Where is it documented?
And most importantly – if these events were to reoccur today, would they recover more quickly?
To address these and other issues, the organization needs to develop a business-focused Business Continuity Plan (BCP) to enable business operations to continue during emergencies when an outage would normally result.
What is Business Continuity Management?
In the years leading up to the turn of the millennium, the focus of most business managers turned to the looming “Y2K bug”. An unfortunate side effect was that business managers came to view ICT as the enemy, a black hole that threatened to swallow up extensive resources correcting a seemingly self-created problem. The less negative aspect of Y2K, and of disasters such as the “Twin Towers” catastrophe, was that business managers began to think seriously about how they could continue their core business activities during a significant crisis. Even short periods of outage can have dramatic effects on reputation and profitability, so a means of continuing “no matter what” must be devised.
Business Continuity Management (BCM) is not just Risk Management or Disaster Recovery.
Risk Management attempts to predict, quantify and qualify the various risks that an organisation might experience, as well as attempting to prevent or at least reduce the likelihood (or severity) of a risk event occurring, while suggesting methods for dealing with the risk events.
Disaster Recovery is usually very limited in scope, being chiefly concerned with fixing ICT equipment, infrastructure or software.
BCM, on the other hand, is not concerned with the causes of problems or the likelihood of them occurring, because BCM comes into play only when a risk event has occurred. BCM is also process based and takes a holistic view, rather than just a technology view, and includes personnel, alternative sites, manual workarounds, and anything else that can be used to ensure business continuity.
The Business Continuity Planning process encompasses many disciplines, such as Risk Management, Disaster Recovery, Facilities Management, Supply Chain Management, Quality Management, Health & Safety, Knowledge Management, Emergency Management, Security, and Crisis Management.
The planning process employs tools such as Business Impact Analysis, Business Impact Resource Recovery Analysis and Gap Analysis, and must be cognisant of both current budgetary constraints and recognised best practices. The plan must be reinforced with positive staff communication (advertising).
Development and Maintenance
The “Business Continuity Team” within the organisation, with assistance from key support areas, is responsible for developing and maintaining the Business Continuity Plan.
Maintenance
It is crucial to ensure that the plan accurately reflects the environment (buildings and resources) and infrastructure within the organisation. To ensure this, the Plan must be reviewed, updated and tested regularly, and personnel retrained accordingly.
The Plan will also be changed through interaction with the Change Advisory Board (or equivalent). This task is the responsibility of the organization in conjunction with the Data Security Manager, the Operations Manager and the line managers.
Keep it Simple, Complexity comes at a cost
It is important to resist the temptation to build more and more technological redundancy into the plan in an effort to reduce overall risk. Many problems have simple answers; the goal of the Business Continuity Team must be to find simple and cheap answers wherever possible, whilst staying within the organization’s risk-appetite profile.
Any changes that may affect the Plan should be registered with the Change Advisory Board; the Board will then notify the Business Continuity Team of the need for changes.
The Business Continuity Team and the Change Advisory Board will meet to formally review the Plan in light of all changes registered within the previous quarter. This may also result in retraining for staff and testing of plan modifications.
The Business Continuity Team and the Change Advisory Board initiate a complete review of the Plan, which may result in major revisions to this document. These revisions will be distributed to all personnel on the distribution list, in exchange for the superseded Plan. At that time the Business Continuity Team will table an annual status report on continuity planning to the CEO, or to another nominated office or person in the absence of a CEO.
The areas that need to be considered are:
Stakeholders and clients
Stores and spares
ICT equipment (including mobile equipment), infrastructure and software.
Telecommunications, including mobile phones
Data stores in any form, including offsite backup, and offsite mobile computers and personal
The plan must be subjected to routine testing to prove that it works and to have managers “Business Continuity Plan”-ready. Alternative sites (such as a disaster recovery site) and facilities must be proven in service. A Business Continuity Plan is NOT an insurance policy – and so should not lead to complacency – it is an urgent response to significant disruptions (outages) to critical business processes with the potential for serious damage to the organization and its reputation.
Testing the Business Continuity Plan is an essential element of preparedness. Partial tests of individual components and recovery plans will be carried out on a regular basis. A comprehensive test of the plan and our recovery sites will be performed on an annual basis. Recovery testing of Category I (Critical) systems will be done annually. Simulation exercises that will include the organization’s business partners will be carried out annually under the direction of the Business Continuity Team.
Recovery Procedures
In a crisis no-one has the time to read huge manuals (assuming that the manuals can actually be found and are up to date), and the more complexity that has been designed into a recovery system, the more likely it is to fail. So the plan must be concise and it must be managed, and the staff need to be trained to respond appropriately.
The temptation to include too much detail or to be too prescriptive can have an adverse effect, so the level of detail has to leave room for manoeuvring.
It can help to bear in mind that not every detail can be foreseen and that many operations will be obvious at the time. Likewise, applying planning and resources to areas that are non-critical, or whose failure would have little cost to the organization and its reputation, is a waste of time and should not be considered for this process.
To produce a Business Continuity Plan (BCP) that will ensure that the organization will, under a wide range of adverse conditions:
1 minimise loss/degradation of agreed Mission Critical Activities (MCA’s) to the organization’s clients;
2 minimise the impact from loss/degradation of MCA’s to the organization’s clients;
3 expedite the structured and timely recovery to normal operation; and
4 consider personnel, equipment, software, third-party services, work areas, and data.
The project will include recommendations for the ongoing management of the plan.
Establish guidelines, policies, procedures and documents to enable the continuity of critical services to be assured under adverse circumstances that would ordinarily lead to outages. This plan is intended to cover the organization only, and not the Agencies, which must have their own Business Continuity Plans.
The owner of this strategy should be the CEO, CIO or another senior member.
Business Continuity Management, Typical
All users should comply with the following principles:
- In the event of a disruption to service, reference must first be made to the Business Continuity Plan for the area, before action is taken.
- A record of the nature of the problem, the steps taken to resume the service, and lessons learned, is to be made and returned to the officer in charge of the Business Continuity Plan.
- Business Continuity Planning is subject to continuous improvement and so the officer in charge of the Business Continuity Plan will review the record of each incident (interviewing personnel where necessary), with a view to improving the plan.
- If you are involved in the purchase of new infrastructure or operating systems, or in the development of or modification of applications, locations or infrastructure, check whether these items necessitate modifications to the Business Continuity Plan.
- Report any areas of non-compliance to the Manager, Business Continuity Team.
From the outset the author intended to follow the recommendations and structure of the Business Continuity Institute (BCI), and on the whole this has been done. However, in areas where the BCI is at variance with the recommendations of both the Australian National Audit Office (ANAO) and Gartner Research, yet there appears to be a level of agreement between the latter two (and with common sense), the author has modified the BCI framework accordingly.
In particular, the BCI stipulates combining the Mission Critical Activities (MCA’s) identification phase with the Business Impact Analysis (BIA) and Risk Analysis (RA), but ANAO and Gartner Research recommend identifying the MCA’s in a prior phase. This appears to make more sense, as the Business Continuity Plan (BCP) should be concerned only with MCA’s. Also, if risk is the driving consideration for the plan (rather than criticality), then critical processes that are considered to have a low risk of failure might be ignored.
The resultant planning process then can be a combination of the three main sources listed above.
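The concern about risk-driven selection can be illustrated with a small sketch. The process names, the likelihood figures and the 0.25 threshold are all hypothetical; the point is only that a filter on likelihood can drop a process that a filter on criticality would keep.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    mission_critical: bool
    failure_likelihood: float  # hypothetical estimate between 0 and 1


def select_by_risk(processes, threshold):
    """Risk-driven scope: keep only processes judged likely to fail."""
    return [p.name for p in processes if p.failure_likelihood >= threshold]


def select_by_criticality(processes):
    """Criticality-driven scope: keep every Mission Critical Activity."""
    return [p.name for p in processes if p.mission_critical]


# Illustrative data only, not taken from any real assessment.
processes = [
    Process("Benefit payments", mission_critical=True, failure_likelihood=0.02),
    Process("Public website", mission_critical=True, failure_likelihood=0.30),
    Process("Staff newsletter", mission_critical=False, failure_likelihood=0.40),
]

risk_driven = select_by_risk(processes, threshold=0.25)
criticality_driven = select_by_criticality(processes)

# Critical processes that a purely risk-driven plan would leave out.
overlooked = [name for name in criticality_driven if name not in risk_driven]
print(overlooked)
```

Here the risk-driven scope omits "Benefit payments" even though it is mission critical, which is the gap that identifying the MCA’s in a prior phase is meant to close.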
Organisational design and structure for the Business Continuity Plan
The Business Continuity Plan is activated by an interruption of at least one critical process and combines a number of continuity, disaster recovery, crisis and other plans under a unified management structure.
This plan will minimise the impact of the outage, enable the service to continue and ultimately restore the service to normal operation. The operation of the plan will terminate upon the full resumption of the service and the documentation of the lessons learned (which may lead to future enhancement of the Business Continuity Plan).
There are four main areas to consider when developing the plan:
1. Interruption to computing
2. Interruption to communications
3. Loss of key personnel
4. Loss of facilities, e.g. buildings, offices
Perform a Business Impact Analysis (BIA)
It is essential to form a plan that will serve the business’s Mission Critical Activities (MCA’s), and so a business perspective is vital to the success of any plan.
To achieve this, workshops and interviews should be conducted with key business personnel to determine their concerns for their departments, along with their view of the recovery priorities (ranking) for their area and their concerns about stability and recovery.
During the interview, consider the impact (financial and otherwise) of a complete failure in any service and how long the area can survive without it. This will help determine the Maximum Acceptable Outage (MAO) for each process, as well as the Recovery Time Objectives (RTO’s) and Recovery Point Objectives (RPO’s).
Also to be identified is the “single point of failure” for each process, i.e. any failure affecting an MCA for which there is no viable alternative.
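The figures gathered in these interviews can be held in a simple worksheet. The sketch below is one possible layout, not a prescribed format: the field names, the sample activities and all of the durations are assumptions made for illustration. It ranks activities by MAO, lists single points of failure, and applies the commonly used sanity check that a Recovery Time Objective should not exceed the Maximum Acceptable Outage.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BiaEntry:
    """One row of a hypothetical BIA worksheet for a Mission Critical Activity."""
    activity: str
    mao: timedelta          # Maximum Acceptable Outage
    rto: timedelta          # Recovery Time Objective
    rpo: timedelta          # Recovery Point Objective (tolerable data loss)
    has_alternative: bool   # False marks a single point of failure


def rto_within_mao(entry: BiaEntry) -> bool:
    """A recovery target is only useful if it does not exceed the MAO."""
    return entry.rto <= entry.mao


def recovery_order(entries: list[BiaEntry]) -> list[str]:
    """Rank activities so that those with the shortest MAO come first."""
    return [e.activity for e in sorted(entries, key=lambda e: e.mao)]


def single_points_of_failure(entries: list[BiaEntry]) -> list[str]:
    """List activities that have no viable alternative."""
    return [e.activity for e in entries if not e.has_alternative]


# Illustrative figures only; real values come from the workshops and interviews.
worksheet = [
    BiaEntry("Payroll", timedelta(days=5), timedelta(days=3), timedelta(hours=24), True),
    BiaEntry("Client call centre", timedelta(hours=4), timedelta(hours=2), timedelta(hours=1), False),
    BiaEntry("Email", timedelta(hours=24), timedelta(hours=36), timedelta(hours=4), True),
]

print(recovery_order(worksheet))
print(single_points_of_failure(worksheet))
print([e.activity for e in worksheet if not rto_within_mao(e)])
```

In this sample the "Email" entry is flagged because its assumed RTO of 36 hours is longer than its MAO of 24 hours, so either the target or the recovery arrangements would need to be revisited.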
Evaluate the impact of a total outage of a critical process under the following headings (ANAO):
Loss of revenue / increased expense.
Service delivery standards
Public or political embarrassment
Loss of client confidence
Loss of management control
Regulatory, statutory or contractual liability
Specific/unique vulnerabilities, and
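One way to record this evaluation is a scoring sheet keyed by the headings listed above. The sketch below covers only the headings that appear in the list; the 0 to 5 scale, the function name `assess` and the sample scores are assumptions, not part of the ANAO guidance.

```python
# Headings taken from the list above; the 0-5 scale is an assumption.
IMPACT_HEADINGS = [
    "Loss of revenue / increased expense",
    "Service delivery standards",
    "Public or political embarrassment",
    "Loss of client confidence",
    "Loss of management control",
    "Regulatory, statutory or contractual liability",
    "Specific/unique vulnerabilities",
]


def assess(scores: dict[str, int]) -> dict[str, object]:
    """Summarise the impact of a total outage of one critical process."""
    missing = [h for h in IMPACT_HEADINGS if h not in scores]
    if missing:
        raise ValueError(f"No score recorded for: {missing}")
    for heading in IMPACT_HEADINGS:
        if not 0 <= scores[heading] <= 5:
            raise ValueError(f"Score out of range for: {heading}")
    # max() returns the first heading with the highest score.
    worst = max(IMPACT_HEADINGS, key=lambda h: scores[h])
    return {
        "total": sum(scores[h] for h in IMPACT_HEADINGS),
        "worst_heading": worst,
        "worst_score": scores[worst],
    }


# Hypothetical scores for a single process, for illustration only.
payroll_scores = {
    "Loss of revenue / increased expense": 2,
    "Service delivery standards": 3,
    "Public or political embarrassment": 4,
    "Loss of client confidence": 3,
    "Loss of management control": 1,
    "Regulatory, statutory or contractual liability": 5,
    "Specific/unique vulnerabilities": 0,
}

result = assess(payroll_scores)
print(result)
```

Reporting the worst heading alongside the total keeps a single severe impact from being hidden by low scores elsewhere.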
Perform Business Impact Resource Recovery Analysis (BIRRA)
The BIRRA is intended to identify the minimum resources required to achieve the Recovery Time Objectives (RTO’s) and the Recovery Point Objectives (RPO’s).
Perform Risk Analysis and Design Continuity Treatments
“The point to risk management is not to operate your business in a risk free environment…It’s to tip the scale to your advantage. So it becomes strategic, rather than defensive.” – Peter G. M. Cox, CFO, United Grain Growers Limited.
Typical risk classification framework
In general, risks may be viewed as being negative (threats) or positive (opportunities), but for the purposes of service continuity, only the negative aspects should be considered. The reason for this is that the Business Continuity Plan becomes active only when a risk event is manifested.