Critical Functions

Operational Resilience: Definition, Example, and FAQs

Operational resilience is the ability of a financial institution or other organization to deliver its critical functions and services through disruptions, whether from cyberattacks, natural disasters, or other operational failures. As a crucial aspect of risk management, it goes beyond simply preventing outages; it emphasizes the capacity to adapt, withstand, recover, and learn from adverse events to minimize their impact on customers, the market, and overall financial stability. Operational resilience integrates various elements such as people, processes, technology, and third-party dependencies to ensure continuous service delivery, even under severe stress.

History and Origin

The concept of operational resilience has evolved significantly, particularly in the wake of major financial crises and the increasing sophistication of cyber threats. Historically, financial institutions relied primarily on business continuity planning (BCP) and disaster recovery, aiming to restore systems and operations after an incident. However, these approaches often concentrated on internal IT systems and processes rather than on the end-to-end delivery of essential services to customers and the wider financial system.20, 21

The 2008 global financial crisis highlighted the interconnectedness of the financial system and the potential for a single firm's operational failure to cascade across the market. This led regulators worldwide to broaden their focus from simply preventing financial collapse due to credit or market risks to ensuring the stability of critical financial services. The rise of cybersecurity threats, technology failures, and other non-financial risks further accelerated this shift.18, 19

In response, international bodies like the Basel Committee on Banking Supervision (BCBS) began developing comprehensive frameworks for operational resilience. The BCBS, for instance, published "Principles for Operational Resilience" in March 2021, which aimed to strengthen banks' ability to withstand operational risk-related events such as pandemics, cyber incidents, technology failures, or natural disasters. This framework built upon existing guidance on corporate governance, outsourcing, and business continuity, seeking to create a more coherent and outcomes-focused approach to managing operational disruptions.16, 17

Key Takeaways

  • Operational resilience ensures that an organization can deliver its important business services and critical functions even when faced with severe disruptions.
  • It shifts the focus from preventing all failures to minimizing the impact of unavoidable disruptions.
  • Operational resilience involves identifying key services, setting clear impact tolerance levels, and continuously testing the ability to operate within those tolerances.
  • It requires an integrated approach across people, processes, technology, and third-party relationships.
  • Regulatory bodies globally increasingly mandate operational resilience for financial institutions to safeguard financial stability.

Interpreting Operational Resilience

Interpreting operational resilience involves understanding an organization's capacity to maintain the continuity of its essential services despite disruptions. It moves beyond the traditional measure of uptime (how long a system is available) to assess the "impact tolerance" for specific critical services.15 Impact tolerance defines the maximum acceptable level of disruption to an important business service, typically measured in terms of time, data loss, or volume of transactions affected.14
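One way to make impact tolerance concrete is to record it as a set of thresholds and compare an observed disruption against them. The Python sketch below is illustrative only: the class names, field names, and threshold values are assumptions chosen for the example, not figures taken from any regulator or firm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Maximum acceptable disruption for one important business service (hypothetical fields)."""
    max_downtime_minutes: float
    max_data_loss_minutes: float
    max_transactions_affected: int


@dataclass(frozen=True)
class Disruption:
    """Observed impact of a single incident on that service."""
    downtime_minutes: float
    data_loss_minutes: float
    transactions_affected: int


def breached_metrics(tolerance: ImpactTolerance, observed: Disruption) -> list[str]:
    """Return the names of the metrics on which the disruption exceeded tolerance."""
    breaches = []
    if observed.downtime_minutes > tolerance.max_downtime_minutes:
        breaches.append("downtime")
    if observed.data_loss_minutes > tolerance.max_data_loss_minutes:
        breaches.append("data_loss")
    if observed.transactions_affected > tolerance.max_transactions_affected:
        breaches.append("transactions_affected")
    return breaches


# Illustrative values only: 2 hours of downtime, 5 minutes of data loss,
# and 10,000 affected transactions are assumed limits.
payments_tolerance = ImpactTolerance(
    max_downtime_minutes=120,
    max_data_loss_minutes=5,
    max_transactions_affected=10_000,
)
incident = Disruption(downtime_minutes=45, data_loss_minutes=8, transactions_affected=2_500)

print(breached_metrics(payments_tolerance, incident))  # ['data_loss']
```

In this sketch the service came back well inside its downtime limit but still breached tolerance on data loss, which is why a single uptime figure can understate the impact of an incident.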

Organizations evaluate operational resilience by mapping the end-to-end processes that support critical services, including all underlying technology, people, and third-party dependencies. This mapping helps identify vulnerabilities and interdependencies that could cause disruptions. The effectiveness of an operational resilience framework is judged by how well an organization can prepare for, respond to, and recover from severe but plausible scenarios, ensuring that critical services remain within their defined impact tolerances.12, 13 Robust governance and regular reviews are vital for this ongoing assessment.11
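The mapping exercise can be pictured as a simple lookup from each service to the resources it relies on. The following Python sketch assumes a hypothetical set of services and dependencies (all names are invented for illustration) and shows how such a map might surface shared dependencies, which are candidates for concentration risk.

```python
from collections import defaultdict

# Hypothetical mapping of important business services to the resources they rely on.
service_dependencies = {
    "payment_processing": {"primary_data_center", "cloud_provider_a", "payments_ops_team"},
    "equity_trade_execution": {"primary_data_center", "market_data_vendor", "trading_desk"},
    "client_reporting": {"cloud_provider_a", "reporting_team"},
}


def shared_dependencies(mapping: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return each dependency used by more than one service, with the services that use it."""
    users = defaultdict(list)
    for service, dependencies in mapping.items():
        for dependency in dependencies:
            users[dependency].append(service)
    return {dep: sorted(svcs) for dep, svcs in users.items() if len(svcs) > 1}


def services_affected_by(mapping: dict[str, set[str]], failed_dependency: str) -> list[str]:
    """Return the services that would be disrupted if one dependency failed."""
    return sorted(s for s, deps in mapping.items() if failed_dependency in deps)


for dependency, services in sorted(shared_dependencies(service_dependencies).items()):
    print(f"{dependency} supports: {', '.join(services)}")

print(services_affected_by(service_dependencies, "primary_data_center"))
```

A real mapping would be far larger and would usually capture indirect dependencies as well (for example, a vendor's own subcontractors); this flat structure is only meant to show the idea.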

Hypothetical Example

Consider a large investment bank, "Global Capital Markets Inc.," that provides various critical services, including real-time trade execution and settlement for institutional clients. A key critical function for them is the processing of equity trades.

One morning, a regional power grid failure, compounded by a localized flood, renders Global Capital Markets Inc.'s primary data center inoperable. This represents a severe but plausible scenario.

  1. Identification of Critical Function and Impact Tolerance: Global Capital Markets Inc. had previously identified "real-time equity trade execution" as a critical function with an impact tolerance of 2 hours, meaning any disruption beyond this period would cause severe harm to market integrity and client trust.
  2. Mapping and Dependencies: Through prior operational resilience efforts, the bank had mapped out all dependencies for this service. This included the primary data center, a redundant backup data center located in a different geographical region, dedicated staff, network connectivity providers, and third-party market data feeds.
  3. Response and Recovery:
    • Automated Failover: Within 15 minutes of the primary data center outage, automated systems initiated a failover to the secondary data center.
    • Contingency Activation: The operational resilience team immediately activated its contingency planning protocols. Key personnel, including traders and IT support, swiftly transitioned to pre-designated alternative work sites with secure network access to the secondary data center.
    • Communication: Clients were informed of the disruption and the activation of resilience plans within 30 minutes.
    • Monitoring: Continuous monitoring confirmed that trade execution volumes remained within acceptable parameters, demonstrating that the critical service was being delivered within its impact tolerance.

Despite the severe disruption to its primary infrastructure, Global Capital Markets Inc. continued to process equity trades within its 2-hour impact tolerance, demonstrating effective operational resilience.
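The timing in this example can be checked with simple arithmetic. The Python sketch below uses the 2-hour impact tolerance and the 15- and 30-minute marks from the scenario; the calendar date is hypothetical, and the sketch assumes that trade execution resumed as soon as the failover was initiated, which the scenario does not state explicitly.

```python
from datetime import datetime, timedelta

# Impact tolerance for real-time equity trade execution, from the example.
IMPACT_TOLERANCE = timedelta(hours=2)

# Hypothetical date and time for the primary data center outage.
outage_start = datetime(2024, 1, 15, 8, 0)

# Assumption: service resumes when the failover is initiated (15 minutes in).
service_restored = outage_start + timedelta(minutes=15)
# Clients are informed within 30 minutes of the outage.
clients_notified = outage_start + timedelta(minutes=30)

disruption = service_restored - outage_start
headroom = IMPACT_TOLERANCE - disruption
within_tolerance = disruption <= IMPACT_TOLERANCE

print(f"Disruption lasted {disruption}")
print(f"Within impact tolerance: {within_tolerance}")
print(f"Headroom remaining: {headroom}")
print(f"Clients notified at {clients_notified:%H:%M}")
```

Under these assumptions the disruption uses 15 minutes of the 120-minute tolerance, leaving 105 minutes of headroom. A slower failover would shrink that margin, which is why firms test recovery times rather than assume them.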

Practical Applications

Operational resilience has become a fundamental requirement across the financial services sector and beyond. Its practical applications are wide-ranging:

  • Banking and Financial Markets: Banks, investment firms, clearinghouses, and payment systems implement operational resilience frameworks to protect essential services like payment processing, trading, and settlement from disruptions. Regulators like the U.S. Federal Reserve have issued guidance on sound practices to strengthen operational resilience for financial institutions, emphasizing the ability to deliver critical operations through any hazard.10
  • Regulatory Compliance: Many jurisdictions, including the UK with its Financial Conduct Authority (FCA) and Prudential Regulation Authority (PRA), and the European Union with its Digital Operational Resilience Act (DORA), have mandated specific operational resilience requirements. These regulations often require firms to identify important business services, set impact tolerances, and conduct rigorous stress testing and mapping.8, 9
  • Cybersecurity Defense: Operational resilience is intertwined with cybersecurity strategies. It helps organizations prepare for and recover from cyberattacks, ensuring that critical services can continue even if certain systems are compromised. The Federal Reserve Bank of New York, for instance, has conducted research modeling how a cyberattack on a major bank could amplify through the U.S. financial system, underscoring the importance of resilience.6, 7
  • Third-Party Risk Management: With increasing reliance on external vendors for critical services (e.g., cloud computing, IT support), operational resilience frameworks extend to managing third-party risk. This involves assessing the resilience capabilities of suppliers and ensuring their disruptions do not jeopardize the firm's own critical functions.5

Limitations and Criticisms

While operational resilience is crucial for financial stability, its implementation and effectiveness can face several limitations and criticisms:

  • Complexity and Cost: Establishing a comprehensive operational resilience framework is complex and expensive. It requires significant investment in technology, people, and processes, particularly for large, interconnected financial institutions. The cost of downtime in financial services can be substantial, with some estimates suggesting annual costs in the millions or even hundreds of millions of dollars for large organizations.3, 4
  • Defining "Critical Functions": Identifying and agreeing upon what constitutes a "critical function" or "important business service" can be subjective and challenging, potentially leading to inconsistencies in application.
  • Over-reliance on Technology: While technology is central to resilience, an over-reliance on automated solutions without robust human oversight and adaptable processes can create new vulnerabilities.
  • Third-Party Dependencies: Managing operational resilience across complex supply chains and numerous third-party providers presents a significant challenge. A failure at a single critical vendor can disrupt multiple firms simultaneously, even if those firms have strong internal resilience. For example, the TSB Bank IT meltdown in the UK in 2018, which led to significant customer disruption and regulatory fines, highlighted failures in managing operational risks arising from IT outsourcing arrangements with a critical third-party supplier.1, 2
  • Scenario Plausibility: Stress testing relies on "severe but plausible" scenarios, but predicting all potential disruptions, especially novel or unprecedented events, remains difficult. Organizations may optimize for known risks, leaving them vulnerable to unforeseen incidents.
  • Regulatory Burden: Overlapping and sometimes conflicting regulatory requirements across jurisdictions can create a heavy compliance burden for internationally active firms.

Despite these challenges, operational resilience frameworks continue to evolve to address these limitations and to support a more robust and adaptable financial system.

Operational Resilience vs. Business Continuity Planning

While closely related, operational resilience and business continuity planning (BCP) are distinct concepts in risk management. BCP traditionally focuses on the recovery of an organization's internal processes and systems after a disruption. It is primarily concerned with restoring an organization's operations to a pre-defined state following an incident, often with a focus on specific IT systems or departmental functions.

Operational resilience, on the other hand, takes a broader, outcomes-focused approach. Instead of merely restoring internal systems, it prioritizes the continued delivery of critical services to customers and the wider market, even amidst disruption. Operational resilience is about understanding the "impact tolerance" of these services – how much disruption can be tolerated before severe harm occurs – and building the capability to stay within those limits. It encompasses the entire ecosystem that supports a service, including external dependencies, and emphasizes the ability to adapt and learn from incidents. Therefore, while BCP is a component of operational resilience, operational resilience represents a more holistic and proactive strategy to ensure continuous service availability and minimize harm from any type of operational disruption.

FAQs

Why is operational resilience important for financial institutions?

Operational resilience is vital for financial institutions because it safeguards the continuous delivery of critical services, such as payment processing and trading, which are essential for maintaining financial stability and public trust. Disruptions can lead to significant financial losses, reputational damage, and wider systemic impacts.

What are "critical functions" in the context of operational resilience?

Critical functions (often referred to as important business services) are the activities or services an organization provides whose disruption would cause severe harm to consumers, market integrity, the firm's viability, or financial stability. Identifying these functions is the first step in building operational resilience.

How is operational resilience measured or assessed?

Operational resilience is typically assessed by identifying important business services, setting clear impact tolerance levels for each service (e.g., maximum downtime, data loss), and then conducting scenario testing and mapping exercises to determine if the organization can maintain service delivery within these tolerances during severe but plausible disruptions. Strong governance and continuous review are also key.

What are common causes of operational disruptions?

Common causes of operational disruptions include cyberattacks, technology failures (e.g., hardware malfunctions, software bugs), natural disasters, human error, and failures by third-party service providers.

What is the role of regulation in operational resilience?

Regulatory bodies worldwide, such as the Federal Reserve and the Financial Conduct Authority, issue guidelines and mandates for operational resilience to ensure the stability of the financial system. These regulations often require firms to implement specific frameworks, conduct regular stress testing, and report on their resilience capabilities.
