
SAMA CSF Compliance and Actual Resilience Are Different Programs
BCDR

Mahesh Chandran
CEO, Dataring
In the SAMA-regulated organizations I've worked with, there's a recurring pattern: the bank or insurer has a SAMA CSF compliance program that satisfies the auditor every year, and it has a separate resilience capability that may or may not satisfy reality. The two run in parallel, owned by different teams, evaluated against different criteria, and rarely talking to each other. After the March 2026 Gulf cloud incident, the gap between them is no longer theoretical.
This post is about that gap and how to close it. For the broader BCDR architecture context, see our cloud DR in the GCC pillar. For the GCC-wide regulatory landscape, see our GCC requirements comparison.
The thesis
SAMA CSF was designed to drive resilience. The standard mode of compliance has drifted toward documentation rather than capability. A bank can pass a SAMA audit while running an architecture that would not survive a real regional outage, because the audit checks for the existence of artifacts (documented BCP, named owners, evidence of testing) and the artifacts can exist without the underlying capability. The gap is widest where boards have signed off on documents they don't understand operationally.
What SAMA CSF actually requires
SAMA CSF distributes business continuity requirements across several domains: governance, operations, technology, and third-party risk. The core obligations come down to four:
Annual BCM testing. Not optional, not a recommendation. Required.
Board-level governance. Resilience is a board obligation, not an IT function. The board must demonstrably understand exposure and approve the mitigation strategy.
Documented incident response procedures. Detection through containment through recovery, with defined timeframes.
Third-party risk extension. Cloud and SaaS dependencies are part of the resilience perimeter, not outside it.
These are real requirements. The issue isn't that they're inadequate; it's that they admit forms of compliance that don't produce resilience.
The Resilience Maturity Ladder
Most SAMA-regulated organizations occupy one of four positions on what I'd call a resilience maturity ladder. Each level produces a different real capability under stress, even when each level can satisfy the same audit checklist.
Level 1: Document-compliant
The organization has a written BCP, a named CISO, an incident response plan, and an annual tabletop summary. The plan has not been tested against a real failover. The tabletop is discussion-only. The CISO has not personally seen the recovery path validated end-to-end. This satisfies most audit standards. It does not produce capability.
The tell: ask anyone in the organization what their actual recovery time was during the last real incident, and the answer is unclear or unknown.
Level 2: Component-tested
The organization regularly tests individual components: restore tests for backups, failover drills for specific databases, and DR runbooks rehearsed by the platform team. Components work in isolation. The end-to-end path from incident detection through full recovery has not been exercised as one sequence.
The tell: each component team can demonstrate their piece works, but no single person can describe the full recovery sequence without consulting multiple others.
Level 3: Region-tested
The organization has actually run a full regional failover under controlled conditions, with real traffic moved to the DR environment for a defined window, and validated that the user-facing path works end-to-end. This is the bar I'd consider the minimum standard for Tier 0 systems after March 2026, and it's the bar regulators are increasingly expected to require for systemic-risk institutions.
The tell: a specific date and duration can be cited when the bank ran live traffic from the DR region.
Level 4: Coincident-tested
The organization has run a full regional failover under conditions that include simulated cyber pressure, degraded monitoring, partial information, and time constraint. The exercise tests the human decision layer alongside the technical recovery and surfaces failure modes that calm-conditions tests miss.
The tell: the chief risk officer was personally in the room during a Level 4 exercise and remembers operational details that are not in any document.
Across my engagements in the GCC, most banks I've assessed sit at Level 1 or Level 2. A small number have reached Level 3 for specific Tier 0 systems. Almost none are systematically at Level 4. This matters because real incidents are coincident: cyber and physical pressures arrive together, which is exactly what Level 4 simulates and the lower levels do not.
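For the internal classification conversation, the ladder can be written down as a small lookup. The Python sketch below is illustrative only. The `Evidence` field names, the `UNCLASSIFIED` level, and the rule that each level presupposes every level beneath it are assumptions made for this example, not SAMA criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class LadderLevel(IntEnum):
    UNCLASSIFIED = 0        # assumption: not even the documents exist
    DOCUMENT_COMPLIANT = 1
    COMPONENT_TESTED = 2
    REGION_TESTED = 3
    COINCIDENT_TESTED = 4


@dataclass
class Evidence:
    """Hypothetical yes/no answers for ONE system, based on what was actually done."""
    has_written_bcp_and_ir_plan: bool
    components_tested_in_isolation: bool
    live_traffic_regional_failover: bool
    failover_under_simulated_pressure: bool


def classify(e: Evidence) -> LadderLevel:
    """Return the highest level reached without skipping a lower one.

    Assumption: levels are cumulative, so a pressure exercise that never
    moved live traffic does not count as Level 4.
    """
    checks = [
        (LadderLevel.DOCUMENT_COMPLIANT, e.has_written_bcp_and_ir_plan),
        (LadderLevel.COMPONENT_TESTED, e.components_tested_in_isolation),
        (LadderLevel.REGION_TESTED, e.live_traffic_regional_failover),
        (LadderLevel.COINCIDENT_TESTED, e.failover_under_simulated_pressure),
    ]
    level = LadderLevel.UNCLASSIFIED
    for candidate, satisfied in checks:
        if not satisfied:
            break
        level = candidate
    return level


# Example: components are tested, but no live-traffic failover has been run.
print(classify(Evidence(True, True, False, False)).name)
```

Classifying per system rather than per organization keeps the answer honest: a bank can be at Level 3 for one Tier 0 system and Level 1 for the rest.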
What changes after March 2026
The March 2026 Gulf cloud incident did not violate any existing SAMA CSF requirement, because the framework was written before kinetic pressure on regional cloud infrastructure was a planning scenario. Several updates are reasonable to expect, and prudent organizations are positioning for them now rather than waiting.
Multi-region architecture as floor, not ceiling. Current SAMA CSF requires "adequate" recovery capability, which most organizations have read as multi-AZ within a single region. After a regional event of the kind seen in March, the floor should be cross-region for at least Tier 0 systems. The architecture choice that follows is covered in our pattern decision guide.
Coincident scenario testing. Traditional compliance testing happens in calm conditions. The realistic threat is coincident. Tests that don't simulate it are increasingly inadequate as evidence.
Pre-arranged data residency exceptions. Cross-border failover under regulator approval is a conversation to start before the next incident, not during it. See our residency guide.
Building the bridge from compliance to capability
The work of moving from Level 1–2 compliance to Level 3–4 capability is concrete and can be sequenced.
Step 1: Honest internal classification. Where do you actually sit on the ladder? Not where the documentation says you sit, but where you are in practice. This is a 60-minute conversation among the CISO, the head of infrastructure, and the chief risk officer. The honest answer is sometimes uncomfortable. It's the starting point for everything else.
Step 2: Pick one Tier 0 system and run a Level 3 test. Don't try to upgrade the whole portfolio. Pick the most critical system, schedule a real regional failover for a known window, and run it. The gaps surfaced by the test become the work plan.
Step 3: Document the test for the regulator. SAMA CSF requires evidence of testing. A Level 3 test produces stronger evidence than any tabletop. The same artifact strengthens both compliance posture and operational confidence.
Step 4: Run the test in front of board members. The CRO and at least one board member with risk responsibility should observe a Level 3 test annually. The board's ability to govern resilience requires personal knowledge of how it works, not just briefings about it.
Step 5: Repeat for the next Tier 0 system, then Tier 1. Resilience scales by reference implementation, not by program. One Tier 0 system at Level 3 gives you a template. The second is faster. By the third, the team is operational.
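The evidence produced in Steps 2 through 4 can be captured in a simple structured record. This is a minimal Python sketch: every field name is hypothetical, the example values are invented, and nothing here reflects a SAMA-prescribed evidence format. It only shows the facts a Level 3 test should be able to state: the date, the live-traffic window, measured recovery against target, who observed, and what broke.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class FailoverTestRecord:
    system: str
    started_at: datetime          # live traffic moved to the DR region
    ended_at: datetime            # traffic returned to the primary region
    target_rto: timedelta         # recovery time objective stated in the BCP
    measured_recovery: timedelta  # declared incident -> user-facing path restored
    observers: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)  # findings that become the work plan

    @property
    def live_window(self) -> timedelta:
        return self.ended_at - self.started_at

    @property
    def met_rto(self) -> bool:
        return self.measured_recovery <= self.target_rto

    def summary(self) -> str:
        outcome = "met" if self.met_rto else "missed"
        return (
            f"{self.system}: live traffic in DR for {self.live_window} "
            f"on {self.started_at:%Y-%m-%d}; recovery {self.measured_recovery} "
            f"vs target {self.target_rto} ({outcome}); {len(self.gaps)} gap(s) logged"
        )


# Invented example values, for illustration only.
record = FailoverTestRecord(
    system="payments-core",
    started_at=datetime(2030, 1, 15, 1, 0),
    ended_at=datetime(2030, 1, 15, 3, 0),
    target_rto=timedelta(minutes=60),
    measured_recovery=timedelta(minutes=95),
    observers=["CRO", "board risk committee member"],
    gaps=["DNS TTL longer than the failover window assumed"],
)
print(record.summary())
```

A missed target is still useful evidence. The record makes the miss, and the gaps behind it, explicit rather than leaving them in people's memories.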
Common compliance traps
A few patterns I see consistently in engagements:
The compliance department owns BCDR. The compliance team is good at producing documentation that satisfies audit. They are not the right team to design or operate recovery capability. When BCDR sits inside compliance, the natural drift is toward Level 1.
The CTO owns BCDR but doesn't report to the board on it. Resilience reporting that reaches only one of the risk and technology committees produces governance theater. Both audiences need to see the same picture.
Annual is the wrong cadence for Tier 0. SAMA CSF requires annual testing as a floor. For Tier 0 systems, an annual cadence matches the rate at which capability decays, not the rate at which a team learns. Quarterly is closer to honest.
Vendor SLA is treated as resilience. "AWS guarantees 99.99%" is not a resilience claim. It's a contract clause. Real resilience is what your team can do when the SLA is missed. See our vendor contracts post.
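Two of the traps above reduce to simple arithmetic. The sketch below (Python, with hypothetical function names) converts an availability percentage into the downtime a contract still permits, and a testing cadence into the maximum age of the most recent test evidence. It assumes a 365-day year and evenly spaced tests.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_pct: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime an availability figure still permits over the period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (1 - availability_pct / 100)


def max_evidence_age_days(tests_per_year: int) -> float:
    """Oldest the latest test result can be, assuming evenly spaced tests."""
    if tests_per_year <= 0:
        raise ValueError("tests_per_year must be positive")
    return 365 / tests_per_year


# 99.99% over a year still permits roughly 52.6 minutes of downtime.
print(f"{allowed_downtime_minutes(99.99):.1f} minutes per year")

# Annual vs quarterly cadence: how stale the evidence can get.
print(max_evidence_age_days(1), max_evidence_age_days(4))
```

Neither number says anything about what happens when the SLA is missed, or what has changed in the architecture since the last test. Only the team's own recovery capability answers that.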
NCA ECC-2 and the broader Saudi context
SAMA CSF applies to financial institutions. NCA ECC-2 applies to critical infrastructure operators and government entities. Organizations under both must satisfy the more stringent requirement in each domain. The maturity ladder above maps cleanly to ECC-2 with one addition: ECC-2's supply chain and operational technology requirements add a fifth dimension that is genuinely separate from the BCM ladder. For organizations under both, integrate the two programs rather than running them in parallel.
Tradeoffs and honest limitations
The maturity ladder is a heuristic, not an audit framework. SAMA's actual evaluation criteria are domain-specific and don't map cleanly to four levels. The ladder is for internal navigation, not for filing.
Level 4 is expensive. Coincident-scenario testing requires senior leadership time, scenario design effort, and willingness to surface uncomfortable findings. Most organizations can sustain it once or twice a year, not more.
Compliance and resilience are not opposed. The argument here is not that compliance work is wasted; it's that compliance without operational capability is incomplete. The two should reinforce each other, not substitute.
A practical takeaway
If your organization has a SAMA CSF compliance program but cannot answer "when did we last run a Level 3 test on our most critical system, and what did we learn?" then the compliance program is not yet producing resilience. The first concrete step is the 60-minute internal classification conversation. The second is selecting one Tier 0 system for a real Level 3 test in the next quarter.
For DR pattern selection that follows from this work, see our pattern decision guide. For the testing layer in detail, see our DR testing post. For the broader regional context, see our cloud DR in the GCC pillar. If you'd like outside support running a Level 3 test or building the bridge from compliance to capability, Dataring's resilience practice runs these engagements regularly. Get in touch.




