Operational Resilience in Focus, Part One: What is Operational Resilience?
Regulatory bodies across the globe are concentrating efforts on increasing firms’ operational resilience in response to evolving risks, including cyber incidents, to protect market participants.
But what is operational resilience, and why does it matter? Operational resilience is an essential (and for many, now mandatory) undertaking.
It may feel daunting to understand, let alone comply with, new operational resilience requirements. But firms do not have to go it alone—ACA is here to help with a four-part series tackling the following topics:
- Part one, the remainder of this article, will unravel the concept of operational resilience, explain why it is important, and identify the attributes of an operationally resilient company.
- Part two will focus on the concept of cyber and information security as a component of operational resilience and what an organization needs to achieve cyber and information security resilience.
- Part three will break down the European Union’s proposed Digital Operational Resilience Act (DORA) and its required cyber framework.
- Part four will explain the cyber requirements in the SEC’s proposed cybersecurity rule 206(4)-9.
What is Operational Resilience?
Operational resilience is a company’s capability to adapt to evolving risk landscapes, respond to shocks that evolve across functions of the broader enterprise, and provide services under duress. Regulators expect firms to operate uninterrupted in the face of these stressors to help the financial system absorb and adapt to them (a comprehensive list of affected firms will be included in parts three and four, or can be found on pages 28-29 in DORA and the introduction of the SEC’s rule).
ACA defines operational resilience as having five critical domains with five foundational components.
Operational resilience is not “one-and-done” and should not be considered a point-in-time exercise. Operational resilience is a team sport that actively discourages silos. This means all teams should know how to activate recovery plans together, knowing not only how to communicate to their own teams but also how their teams are expected to work with others when restoring operations.
An operationally resilient company can:
- Keep critical business processes up and running (simply put, “keep the lights on”) during disruptions.
- Respond to disruptions effectively and without delay.
- Return to business-as-normal after the conclusion of a disruption.
- Learn from disruptions to better prepare for future incidents.
- Execute remediation strategies if gaps in their programs are identified after an incident.
- Be accountable for the disruptions and maintain records of their events and mitigation.
- Evaluate and address the inherent risk posed to them by third-party vendors.
What Are the Domains of Operational Resilience?
The domains to focus on in working towards operational resilience are:
- Cyber and Information Security Resilience—Detecting and protecting against threats and vulnerabilities, as well as restoring operations during cyber disruptions affecting an organization’s networks, information systems, and data.
- Technological Resilience—Protecting and ensuring the durability of critical technological systems and tools needed to deliver services and/or goods.
- Third-Party and Supply Chain Resilience—Performing diligence on all critical third-party relationships to determine if they are adequately resilient in the face of disruptions. Note: third-party resilience does not solely refer to third-party risk management (TPRM) and cyber resilience, but rather how these vendors perform broadly across all functions and services (for example, their ability to continue vital deliveries during supply chain and personnel shortages).
- Business Operation Resilience—Establishing protocols and policies to ensure that employees (especially those with roles critical to day-to-day functions) can continue to work during a disruption.
- Relational Resilience—Assuring clear and prompt communication with external stakeholders (such as clients, investors, and consumers) during a disruption, as well as managing the safety and workplace demands of personnel (such as prioritizing their safety during threats).
What Are the Foundational Components of Operational Resilience?
For organizations to be operationally resilient, they must, at minimum, account for the following foundational components:
- Program Governance—Establishing strong oversight programs to manage an organization’s overall operational risk. It requires buy-in across the enterprise (e.g., IT, finance, compliance, procurement, human resources), the ability to measure success, and processes to report key risks and program results to senior management and the board of directors. Owners of all risk management processes should be assigned with clear roles and duties.
- Risk Management—Overseeing an operational risk program built on existing industry best practices as well as new requirements to fill in gaps. Risk management measures are designed to identify risks, minimize their impact, and keep up with their evolution. They should scale appropriately to an organization’s size, complexity, and business activities. Risk assessments and business continuity planning (BCP) are key elements. Process owners established in Program Governance oversee risk management measures.
- Planning and Testing—Building, planning, testing, and improving an organization’s resilience in the face of business disruptions. BCP, incident response, and disaster recovery (DR) are required and should account for likely scenarios (such as ransomware or natural disasters). Formal policies should clarify communication structures, systems and communication mapping, threat detection and escalation, remediation, and testing. The capability to execute policies and plans should always be present, and key third parties needed for these plans should be identified (e.g., attorneys and public).
- TPRM—Managing vendor relationships before signing, during the contract duration, and after contract expiration. Organizations must understand how resilient their third parties are, how dependable they are in their supply chain, and what key technologies are needed to support their business functions. Vendors should be intensely scrutinized through due diligence and held accountable for ensuring risk findings are addressed.
- Reporting—Adhering to new requirements set by regulatory entities that require organizations to notify internal and external parties of incidents within newly established timetables. In addition, organizations must keep records of incidents. Parties to notify may include regulatory authorities, internal staff, external stakeholders, the media, and criminal justice investigators.
What Are the Benefits of Operational Resilience?
Operational resilience offers sector-wide regulatory benefits, as well as internal and external benefits to an organization.
Financial Sector Regulatory Benefits
An operationally resilient organization meets emerging regulatory requirements. These requirements:
- Ensure financial stability and economic integrity.
- Standardize program expectations, including cyber, and requirements across the financial sector.
- Harmonize audits and enforcement.
- Provide transparency around significant incidents that impact others (e.g., clients, consumers, advisers and their funds, or private fund clients).
Internal Enterprise Benefits
An operationally resilient organization is prepared, current, and forward-looking because:
- Operational resilience capabilities cover an organization before, during, and after an incident, allowing for consistent and timely resolution.
- Regular testing and improvement of policies keep them from stagnating and continually identify and address an organization’s pain points.
- Resilient systems, platforms, and policies afford employees time to focus on operational improvements rather than continual fixes to old problems.
- An innovative organization that delivers under all circumstances appears stable and is more attractive to new and current employees, bolstering both hiring and retention.
External Enterprise Benefits
An operationally resilient organization has competitive advantage because:
- Organizations can put investors and customers first; they are equipped to continue to serve their internal and external stakeholders during disruptions and are less likely to lose capital and reputation.
- Customers and investors will be more at ease knowing their funds are safe and their day-to-day responsibilities will be minimally impacted in the event of a disruption.
- The windfall and fortification offered by operational resilience policies mean an organization has fewer operational and regulatory risks, which can boost the public image of the brand and management team.
How Is Operational Resilience Different from Business Continuity Planning and Disaster Recovery?
BCP, DR, and operational resilience are similar concepts in that they all comprise living documents that focus on disruption response and recovery. They also require annual testing and organization-wide buy-in. However, BCP and DR are components of operational resilience; in other words, they are not synonymous. Having a BCP and a DR plan does not mean an organization is operationally resilient, but an operationally resilient organization will have both a BCP and a DR plan.
BCP has a narrower focus on practices and procedures during a disruption. DR focuses on restoring activities and systems when a BCP is activated. Operational resilience more broadly encompasses an organization’s ability to function, adapt, and absorb any blow without significant damage before, during, and after a disruption. Operational resilience is made up of numerous interconnected mechanisms across all parts of the enterprise aimed at enhancing governance, risk management, transparency, expertise, formal written policy (BCP, DR, and beyond), and efficacy (internally and sector-wide).
Example – Fictional Company “Jungle Delivery Services” (“Jungle”)
Jungle is a retailer and one of their largest revenue streams comes from the sale and delivery of goods. Sales and delivery are critical business functions.
Jungle has a fleet of delivery trucks in City A that regularly deliver packages to City B. A truck is en route to City B to deliver packages when the engine starts billowing smoke (the “disruption”).
Under DR, the driver enacts a strategy to get the truck moving again: calling roadside assistance to fix the truck.
The key concern of the disruption is that the packages in the broken-down truck will not be delivered to City B on time. Because Jungle has a BCP, the driver knows how to respond to this situation. The plan lays out clear instructions, including:
- The names of the incident response team and contacts to whom the driver should escalate the issue.
- The roles of each member of the response team and how they are expected to respond.
- Instructions for the team (in this case, to send a spare truck and driver to the breakdown location and have the drivers load packages from the broken truck into the spare).
- Directives to write an incident report detailing the disruption and its resolution.
The spare truck and driver will then continue their trek to City B while the original driver stays with the truck until it is picked up and repaired.
Operational resilience broadly addresses operational risks before, during, and after disruptions. Operational resilience identifies the scenarios that threaten Jungle’s critical business functions and systems. It asks, “what if a truck breaks down en route?”, but also “what if there are no spare trucks?”, “what if there is a personnel shortage and all drivers are busy during a breakdown?”, and even more broadly, “what else would negatively impact our day-to-day operations? How can we assure the trucks will make it from City A to City B in all scenarios?” To become operationally resilient, Jungle:
- Conducts risk assessments and then creates robust and clear policies across their enterprise to address their risks (including BCP and DR).
- Tests the policies’ abilities to prevent disruptions.
- Makes changes to their policies to fill any gaps identified in their tests.
- Continues to test and enhance policies year-over-year.
- Establishes a governance structure with clear leadership and resources dedicated to risk management.
- Explores and thoroughly vets relationships with third parties that can help them ensure deliveries are on time (e.g., partnerships with other delivery companies).
- Keeps a written record of their disruptions and mitigation efforts.
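The "what if" questions above can be sketched as a simple coverage check. This is only an illustrative sketch, not a method prescribed by ACA: the `Scenario` type, the `uncovered` helper, and the mitigation labels (such as "partner carrier" and "on-call driver") are hypothetical names chosen for the Jungle example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A disruption scenario and the mitigations that would resolve it."""
    name: str
    mitigations: tuple[str, ...]  # hypothetical labels, for illustration only


def uncovered(scenarios: list[Scenario], available: set[str]) -> list[str]:
    """Return the names of scenarios for which no listed mitigation is available."""
    return [
        s.name
        for s in scenarios
        if not any(m in available for m in s.mitigations)
    ]


# Assumed scenarios drawn from the questions Jungle asks itself.
SCENARIOS = [
    Scenario("truck breaks down en route", ("spare truck", "partner carrier")),
    Scenario("no spare trucks", ("partner carrier",)),
    Scenario("all drivers busy", ("on-call driver", "partner carrier")),
]

# With only a spare truck in place, two scenarios remain unaddressed.
print(uncovered(SCENARIOS, {"spare truck"}))
```

In this sketch, adding a vetted partner carrier would cover all three scenarios, which mirrors why the third-party relationships above matter to Jungle's resilience.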
|Operational Resilience|Business Continuity Planning|Disaster Recovery|
|---|---|---|
|Incorporates but is not limited to BCP and DR|Part of operational resilience|Part of BCP and operational resilience|
|Always "on"|Activated as needed|Activated as needed|
|Holistic focus (addressing before, during, and after disruptions)|Narrow focus (during disruptions)|Narrow focus (during disruptions)|
|Broad governance, risk management, and crisis prevention/detection/response|Specific to crisis management and response|Specific to restoring operations|
Steps to Get Started
To get started on building out a cyber operational resilience framework to protect against threats, manage business disruptions, secure connections with third parties, and preserve relationships, ACA advises:
- Defining the organization’s risk threshold, or the point at which a disruption causes intolerable harm to external stakeholders.
- Performing maturity and gap analyses to identify areas of strength, areas of weakness, and areas where policies and procedures are nonexistent.
- Appealing to upper management and/or the board of directors to drive home the importance of a robust operational resilience framework.
- Developing a roadmap for creating new programs or building out existing ones.
- Performing a business impact analysis to identify situations likely to affect a firm, which will assist in building out specific policies such as the BCP.
- Pinpointing vendors with critical software needed to maintain business operations and preparing to perform due diligence.
- Ascertaining internal staff’s current capabilities to design and implement elements of an operational resilience framework and exploring options to close any gaps (be it through training current staff, hiring subject matter experts, or outsourcing to a trusted third party).
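As a rough illustration of the maturity and gap analysis step, the sketch below scores each of the five foundational components and lists the largest gaps first. The 0 to 3 scoring scale, the `gap_analysis` function, and the sample scores are assumptions made for this example; an actual assessment would use whatever maturity model the organization adopts.

```python
# The five foundational components named earlier in this article.
COMPONENTS = [
    "Program Governance",
    "Risk Management",
    "Planning and Testing",
    "TPRM",
    "Reporting",
]


def gap_analysis(current: dict[str, int], target: int = 3) -> list[tuple[str, int]]:
    """Return (component, gap) pairs for components below target, largest gap first.

    A component missing from `current` is treated as nonexistent (score 0).
    The 0-3 scale is an assumed convention for this sketch.
    """
    gaps = {c: target - current.get(c, 0) for c in COMPONENTS}
    below_target = [(c, g) for c, g in gaps.items() if g > 0]
    # Sort by descending gap, then alphabetically for a stable order.
    return sorted(below_target, key=lambda item: (-item[1], item[0]))


# Hypothetical self-assessment; TPRM is absent, so it counts as 0.
scores = {
    "Program Governance": 2,
    "Risk Management": 3,
    "Planning and Testing": 1,
    "Reporting": 2,
}

for component, gap in gap_analysis(scores):
    print(f"{component}: {gap} level(s) below target")
```

Ordering by gap size is one simple way to feed the roadmap step: the components furthest from target are candidates for the earliest attention.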
How We Can Help
ACA can help organizations establish operational resilience or help them along in their journeys. A good place to start is with a business impact analysis to identify gaps in current programs, predict and prepare for the most detrimental disruptions, and begin protection and mitigation planning.
In the next part of the series, “Operational Resilience in Focus, Part Two: Operational Resilience and Cyber,” we dive further into the cyber resilience component of operational resilience to learn what it means and how to achieve it.