Recent Rackspace Exchange Outage is Another Reminder to Firms of the Importance of Business Continuity Planning
The ongoing Microsoft Exchange outage at Rackspace Technology, one of the largest cloud and email hosting providers in the U.S., is just the latest reminder of the importance of having a business continuity plan (BCP) with contingency strategies to respond to and recover from business disruptions, such as unanticipated systemwide outages of critical third-party vendors.
Investing in third-party cloud providers such as Rackspace allows organizations to lower operational and staffing costs as well as centralize data while providing added flexibility to organizations’ digital transformation processes. However, it also exposes organizations to additional vulnerabilities in the case of technological failures or system outages. Rackspace’s ongoing Exchange outage highlights a key drawback of this dependency. The outage is reported to be the result of a ransomware attack, and Rackspace customers are currently without access to Hosted Exchange services while forensic investigators determine whether any sensitive data was affected.
While business disruptions are inevitable, there are preventative measures your organization can take to successfully respond to a disruption and minimize the aftershock. In this piece, we will summarize:
- Practical steps to follow if you experience a vendor outage before a BCP is in place
- Recommended exercises to complete before creating or revising your BCP
- Key elements to include in your BCP
- How to test your BCP for deployment
Practical Steps to Respond to a Vendor Outage Before a BCP is in Place
Though this scenario is not ideal, an organization can initiate incident response for a disruption in the following ways if a formal BCP has yet to be ratified:
- Identify and notify the appropriate team that will need to respond to the incident at hand (e.g., your IT team). Establish how this communication will occur, bearing in mind this can depend on the outage. Ask yourself what modes of communication are working, and which are the fastest and most reliable.
- Funnel official communication through one channel. Having one source of truth diminishes uncertainty and allows the team to focus on the task at hand: restoring business. When instructions come from one source, it does not leave room for doubt or for multiple parties to give conflicting marching orders.
- Assess the root cause of the outage. Ask yourself if the outage is isolated or widespread. Evaluate the scope: is it due to an internal error, or is it outside of your control (e.g., at a vendor)?
- Communicate with internal stakeholders (employees and management) and external stakeholders (customers, investors, key suppliers, etc.). Be transparent; let users know your team is aware of the issue and is working to resolve it in a timely manner. However, to avoid spreading panic, do not overshare until a clear plan is identified.
- Take notes on what works and what does not work during the response process. This information can be integrated into the BCP when it is built.
Before Creating or Revising Your BCP
Whether you are creating your plan for the first time or refreshing an existing one, here are a few key steps to keep in mind:
- Perform a business impact analysis. This exercise evaluates the potential effects of a disruption and reveals gaps in your current policies (including but not limited to critical processing times, system failure workarounds, vendor dependencies, and resource requirements to stay operational). Identifying potential repercussions and deficiencies in your current plans (formal or informal) highlights specific areas of improvement on which to focus.
- Determine whether your key vendors depend on Rackspace or other critical service providers. Even if you do not rely on Rackspace, you may have critical vendors who do. If Rackspace goes down, so do your critical vendors and so do you.
- Risk-rank your key vendors. Risk-ranking vendors by sensitivity of the data they retain as well as their criticality to your day-to-day business operations provides visibility into some of your greatest risks. Once identified, you can tailor content in your BCP to address outages at these vendors specifically. In addition, get a clear understanding of the business continuity measures these key vendors have in place.
- Back up your data regularly. When vendors such as Rackspace experience outages, your organization may lose access to vital data. Backing up your data at regular intervals helps ensure minimal data loss. Business needs vary, so business impact analyses can help define acceptable restoration points and how much data can be lost before your business operations suffer.
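The indirect-dependency check described above (a vendor you rely on may itself rely on Rackspace) can be sketched as a walk over a dependency map. This is a minimal illustration, not a prescribed method: the `DEPENDS_ON` map, every vendor name other than Rackspace, and the `exposure_paths` helper are all hypothetical and would be replaced by whatever your vendor inventory actually records.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the providers listed.
# All names except "Rackspace" are invented for illustration.
DEPENDS_ON = {
    "our-firm": ["crm-vendor", "payroll-vendor"],
    "crm-vendor": ["email-relay"],
    "email-relay": ["Rackspace"],
    "payroll-vendor": [],
}


def exposure_paths(graph, start, provider):
    """Return every dependency chain from `start` that ends at `provider`."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for dep in graph.get(path[-1], []):
            if dep in path:
                # Skip circular dependencies so the walk always terminates.
                continue
            new_path = path + [dep]
            if dep == provider:
                paths.append(new_path)
            else:
                queue.append(new_path)
    return paths


if __name__ == "__main__":
    for chain in exposure_paths(DEPENDS_ON, "our-firm", "Rackspace"):
        print(" -> ".join(chain))
```

In this made-up inventory the firm never contracts with Rackspace directly, yet the walk still surfaces one chain that reaches it through two intermediaries, which is the kind of hidden exposure worth recording in the BCP.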
You can consolidate the information collected from these exercises to tailor much of your BCP to your exact business needs and formalize the plan in writing.
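As one way to make the vendor risk-ranking exercise concrete, the sketch below scores each vendor on the two factors named above, data sensitivity and operational criticality, and sorts by the result. The 1 to 5 scales, the multiplication used to combine them, and the example vendors are assumptions chosen for illustration; your organization may weight these factors differently.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    data_sensitivity: int  # assumed scale: 1 (public data) to 5 (highly sensitive)
    criticality: int       # assumed scale: 1 (minor) to 5 (operations stop without it)


def risk_score(vendor: Vendor) -> int:
    """Illustrative score: product of the two ratings, so it ranges from 1 to 25."""
    return vendor.data_sensitivity * vendor.criticality


def rank_vendors(vendors):
    """Return vendors ordered from highest to lowest score; ties keep input order."""
    return sorted(vendors, key=risk_score, reverse=True)


if __name__ == "__main__":
    # Hypothetical vendors and ratings.
    inventory = [
        Vendor("office-catering", data_sensitivity=1, criticality=1),
        Vendor("payroll-processor", data_sensitivity=5, criticality=3),
        Vendor("email-host", data_sensitivity=4, criticality=5),
    ]
    for v in rank_vendors(inventory):
        print(f"{v.name}: {risk_score(v)}")
```

The vendors at the top of the resulting list are the ones whose outages merit dedicated content in the BCP and a closer look at their own continuity measures.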
10 Essential Elements to Include in Your BCP
When drafting the latest version of your BCP, be sure to document:
- Your business continuity team. This team is responsible for enterprise-level plan activation and recovery during an incident. It should be composed of employees from multiple business units, not just Information Technology.
- Your recovery management team. This team is responsible for overall maintenance and activation of specific departmental plans.
- The communication chain. Dictate clear expectations on when, how, and to whom communications should escalate during a disruption.
- Any points of contact outside your organization. This can include law enforcement, legal counsel, or advisors/points of contact at relevant third-party vendors.
- Any plans your organization created for specific high-impact disruptions. An example of such a disruption would be a system outage at your cloud provider, such as AWS.
- How to properly notify relevant parties. Consider who to notify in these situations and note it may not be the same set of individuals in each scenario. Notifications could include (but are not limited to) all employees, just management, and/or stakeholders.
- How to assess the situation. Your team should feel empowered and prepared to provide updates during an incident.
- How to activate each BCP. Know the appropriate guidelines set forth by your organization.
- How to resume business. Detail when and how it is best to efficiently and safely resume business.
- How to report post-incident. Summarize the event, successes and challenges alike. Distribute the report to key stakeholders and/or employees. Use feedback to this report to improve/adjust your BCP moving forward.
Testing Your BCP for Deployment
After drafting, it is critical to test your BCP to ensure it is actionable and that your team knows how to execute it successfully. An effective testing method is conducting a tabletop exercise, which simulates real-world scenarios in a safe environment. Tabletop exercises are a collaborative experience across your team designed to encourage communication and situational awareness, in addition to highlighting potential improvements to the BCP in anticipation of a disruption rather than because of one.
During a tabletop exercise, the exercise coordinator should:
- Assemble a group composed of employees who would participate in an actual incident response.
- Assign roles to members of this team.
- Develop a realistic scenario against which to test your plan.
- Facilitate the exercise and any debrief after its conclusion.
- Document any gaps and assist with remediation strategy.
Regulatory bodies are doubling down on their BCP requirements due to evolving cyber threats and other operational risks, such as climate change. The UK Financial Conduct Authority (FCA) has been clear that regulated firms must take all reasonable steps to have a business continuity plan in place. Likewise, business continuity and disaster recovery plans are a key focus of the U.S. Securities and Exchange Commission’s (SEC) 2022 Exam Priorities.
In addition to the steps outlined above, ACA Aponix recommends getting ahead of outages and disasters by:
- Testing your cyber and privacy risk controls, network, and web applications.
- Monitoring threat intelligence for the latest news and trends.
- Supplementing your BCP with staff cybersecurity training, as your staff is your first line of defense against cyber incidents.
- Maintaining written policies, procedures, and governance.
For more information on this guidance, download our cybersecurity checklist.