What is the importance of operational resilience?

Operational resilience can refer both to a system’s ability to resist losses and outages and to its ability to recover from them if they occur. Increasingly, the focus is on managing third-party risk and coping with failures of a given cloud infrastructure provider.

In the case of banking, insurance, and other financial services companies, operational resilience can also specifically refer to an organization’s ability to provide critical services when faced with a larger disruption.

Why does digital operational resilience matter now?

Even before the COVID-19 pandemic, the rapid expansion of digital services pushed financial institutions to move more applications to third-party technologies—particularly cloud infrastructure providers—and the pandemic accelerated this transition.

Though financial services organizations have always used third-party technologies, the accelerated move to these digital services led regulatory authorities in Europe, the Middle East, and Africa (EMEA) to introduce requirements for additional governance and controls, which set off a global trend toward digital operational resilience.

One key piece of legislation, the Digital Operational Resilience Act (DORA), focuses specifically on financial institutions’ reliance on large cloud infrastructure providers. Regulators view this heavy dependency as a potential systemic risk: if one of the major cloud providers suffered a serious failure, the effects on the stability of, and trust in, financial markets could be felt worldwide. DORA aims to reduce this risk by regulating the operational resilience of financial services providers.

How does operational resilience apply to cloud platforms?

Operational resilience and the digital operational resilience trend are important for financial services organizations for multiple reasons that factor into the core of their business models. 

  1. Lost revenue from critical services: When a disruption brings down critical services and undermines business continuity, financial services organizations lose revenue at an estimated rate of $5,000,000 USD per hour. Additionally, customers, citizens, and partners who rely on those services can suffer severe revenue losses of their own if the services are not available at key moments.
     
  2. Cost of sanctions and fines: Financial services companies are subject to strict regulatory requirements. Disruptions in critical services could lead to non-compliance and result in fines, penalties, or sanctions, negatively impacting revenue.

    Stakeholders, including customers, regulators, and partners, expect uninterrupted access to critical services even during technical issues. This expectation has become increasingly important as companies rely more on third-party solutions such as cloud service providers (CSPs). Consequently, regulatory language has evolved to emphasize that the company bears ultimate accountability for ensuring the availability and continuity of these services.

    As legislation that is part of the global digital operational resilience trend, such as DORA, takes effect as early as 2025, these costs could balloon even further. DORA is not the only legislation that will codify operational resilience requirements: other regulatory authorities have proposed similar requirements, namely the United Kingdom’s Prudential Regulation Authority (PRA) and Financial Conduct Authority (FCA).

    So while an institution in North America, Latin America (LATAM), or the Asia-Pacific region (APAC) may assume that DORA does not apply to it, it could still be affected by a DORA-like regulation, such as Canada's new Financial Consumer Protection Framework, which went into effect in June 2022.
     
  3. Reputational costs: In the era of digital trust, service outages caused by a lack of operational resilience play a crucial role in shaping customers' perceptions. As more and more financial services organizations rely on digital platforms to deliver their offerings, consumers have become accustomed to uninterrupted experiences. Any disruption in service can erode the trust customers place in the company, potentially leading to a loss of clientele and revenue.

    With increasing competition in the market, consumers have a multitude of choices when selecting service providers. A strong foundation of trust can be a decisive factor in retaining customers and attracting new ones. By ensuring the reliability and availability of services, a company can nurture and reinforce trust among its client base.
     
  4. Market disruptions: Today's companies are highly interconnected. Failures in just one firm now have the potential to cascade and disrupt national economies.

    This phenomenon is not limited to the financial services sector but extends to other industries as well, such as communications and energy. The intricate web of dependencies among businesses means operational disruptions can have far-reaching consequences on a broader scale.

    The challenges posed by this interconnectedness are extensive. Ensuring business continuity and mitigating risks become increasingly important to prevent systemic disruptions. Companies must invest in robust risk management strategies, including strong cybersecurity measures, disaster recovery plans, and operational resilience. Additionally, fostering cross-industry collaboration and communication can help identify vulnerabilities and develop best practices to safeguard against potential cascading effects. By addressing these challenges head-on, companies can contribute to a more stable and resilient global economy.
     
  5. Growing critical service level requirements: The increasing digitalization of services has transformed how financial services organizations operate. With the expectation of 24/7, always-on availability, virtually every underlying system and service has now become "mission-critical" for these institutions. Operational resilience is important in maintaining these growing suites of services for customers while minimizing the risk of costly disruptions.
     
  6. Lack of interoperability between third-party tools: As financial services organizations adopt a growing number of third-party tools, business continuity is put at risk if those tools don’t interoperate well with each other. Operational resilience is therefore supported by tools that can create a single, unified platform across the different third-party tools.

    Additionally, the growing number of interconnected, complex platforms and environments (on-premises, off-premises, cloud services, and the edge) needs a unified platform so that they can work together to deliver an end-user service.
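
To put the first item in the list above into perspective, here is a rough sketch of the arithmetic behind the hourly estimate. It assumes the quoted $5,000,000 per hour figure scales linearly with outage duration, which is a simplification; the function name and the sample durations are illustrative only.

```python
# Illustrative only: the hourly rate is the estimate quoted in item 1 above.
# The linear model and all names here are assumptions made for this sketch.
ESTIMATED_LOSS_PER_HOUR_USD = 5_000_000


def estimated_lost_revenue(outage_minutes, loss_per_hour_usd=ESTIMATED_LOSS_PER_HOUR_USD):
    """Return a rough revenue-loss estimate for an outage of the given length."""
    if outage_minutes < 0:
        raise ValueError("outage_minutes must be non-negative")
    return loss_per_hour_usd * outage_minutes / 60


# Sample durations chosen arbitrarily to show how quickly the estimate grows.
for minutes in (15, 60, 240):
    print(f"{minutes:>4} min outage -> ${estimated_lost_revenue(minutes):,.0f}")
```

Under these assumptions, even a 15-minute outage of a critical service corresponds to an estimated loss of more than a million dollars, before any fines or reputational costs are counted.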

As institutions look for ways to mitigate operational risks, a modern cloud platform may offer a way to address these risks with lower cost and effort. While a single infrastructure provider can institute a resilience strategy and, in the event of a disaster, could offer portability to other zones or regions, this does not address potential systemic third-party issues that require an exit from the provider.
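
The difference between regional portability and a provider exit can be sketched in a few lines. This is a minimal illustration, not a model of any real provider: the `Placement` type, the provider and region names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One running copy of a critical service (all names are hypothetical)."""
    provider: str
    region: str


def survives_region_outage(placements, provider, region):
    """True if at least one copy runs outside the failed provider region."""
    return any((p.provider, p.region) != (provider, region) for p in placements)


def survives_provider_outage(placements, provider):
    """True if at least one copy runs on a different provider."""
    return any(p.provider != provider for p in placements)


# Two regions on one provider versus one region on each of two providers.
single_provider = [Placement("provider-a", "region-1"), Placement("provider-a", "region-2")]
multi_provider = [Placement("provider-a", "region-1"), Placement("provider-b", "region-1")]

for name, placements in (("single provider", single_provider), ("multi provider", multi_provider)):
    print(
        name,
        "| region outage survived:", survives_region_outage(placements, "provider-a", "region-1"),
        "| provider outage survived:", survives_provider_outage(placements, "provider-a"),
    )
```

In this sketch the single-provider layout survives the loss of one region but not the loss of the provider itself, which is the gap described in the paragraph above.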

Red Hat provides cloud services that enhance Kubernetes, so that you can operate across multiple clouds as a unified environment. These services offer the following benefits for those looking to manage risk and improve resilience:

  • Manage Kubernetes clusters: Run your operations from anywhere and manage any Kubernetes cluster in your fleet. 
  • Accelerate development to production: Speed up application development pipelines with self-service provisioning.
  • Increase application availability: Deploy legacy and cloud-native applications quickly across distributed clusters.
  • Automate central management: Free up IT departments with self-service cluster deployment that automatically delivers applications.
  • Ease compliance: Streamline security compliance with centralized policy enforcement across clusters.
  • Reduce operational costs: Lower operational costs with a unified management interface.

Red Hat® OpenShift® is a unified cloud platform to build, modernize, and deploy applications at scale. In addition, Red Hat Advanced Cluster Management for Kubernetes controls clusters and applications from a single console, with built-in security policies. It extends the value of Red Hat OpenShift by letting you manage multiple clusters and enforce policies across them at scale, which helps you ensure compliance, monitor usage, and maintain consistency.

Learn more about Red Hat Advanced Cluster Management

Red Hat Consulting and Red Hat Training help customers develop cloud-native applications that are both portable and highly available, meaning that critical data and workloads can be recovered quickly after a disaster.

Red Hat’s extensive partner ecosystem also offers further capabilities for business continuity management in both cloud and non-cloud systems and environments.

Explore Red Hat Advanced Cluster Security for Kubernetes 
