Model validation is the most scrutinised pillar of model risk management (MRM) in bank examinations. Examiners consistently cite inadequate independent validation as the primary model risk deficiency. This guide covers the complete model validation process: what SR 11-7 and SR 26-2 require, how to structure validation, and how AI models change the validation calculus.
What Is Model Validation Under SR 11-7 / SR 26-2?
Model validation is the set of processes intended to verify that models perform as expected, that they are appropriate for their intended use, and that their limitations are understood and managed. Validation is an ongoing function, not a one-time event, and must be independent of model development and business ownership.
The Three Components of SR 11-7 Model Validation
1. Conceptual Soundness
Conceptual soundness assessment evaluates whether the model's design and theory are appropriate for the intended application. Validators review the model's underlying assumptions, the theoretical basis for the methodology, and whether inputs, processing logic, and outputs are logically consistent. For ML models, this includes reviewing architecture choices, feature selection rationale, and training data quality. A conceptual soundness finding does not necessarily mean the model is wrong — it may mean the documentation is insufficient to assess soundness.
2. Ongoing Monitoring
Ongoing monitoring is the continuous oversight of model performance after deployment. It includes tracking outputs against benchmarks and actual outcomes, monitoring input data stability, detecting performance degradation, and triggering revalidation when thresholds are breached. Under SR 26-2, monitoring frequency should be proportionate to materiality and the rate of environmental change rather than driven by a fixed calendar. Practical monitoring programmes include population stability indices (PSI), comparisons of predicted against actual outcomes, and data quality dashboards.
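As a minimal sketch of the PSI check mentioned above, the following assumes the commonly used formulation: the sum over bins of (actual share - expected share) * ln(actual share / expected share), with bin edges taken from the baseline sample. The function name, the ten equal-width bins, and the small floor applied to empty bins are illustrative assumptions, not anything prescribed by SR 11-7 or SR 26-2.

```python
import math
from bisect import bisect_right


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a baseline (expected) and a current (actual) sample.

    Bins are equal-width over the baseline range (an illustrative choice;
    quantile bins are also common). Empty bins are floored at eps so the
    logarithm is defined.
    """
    lo, hi = min(expected), max(expected)
    if lo == hi:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / n_bins
    # Interior bin edges; values outside the baseline range fall into the end bins.
    edges = [lo + width * i for i in range(1, n_bins)]

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A widely quoted rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift. These cut-offs are industry convention rather than regulatory values, so a monitoring programme would normally justify its own thresholds.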
3. Outcomes Analysis
Outcomes analysis directly compares model predictions to realised outcomes. For a credit scoring model: are predicted default rates tracking actual defaults? For fraud models: what is the false positive rate in production versus validation assumptions? Outcomes analysis requires production data, clean outcome labels, and sufficient time horizons. Where any of these is missing and full outcomes analysis is not yet possible, the resulting limitations must be documented.
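For the credit scoring case, one simple way to make the comparison concrete is to group accounts into bands by predicted probability of default (PD) and set the mean prediction against the observed default rate in each band. The sketch below assumes two parallel lists, predicted PDs and 0/1 default flags. The function name, field names, and number of bands are hypothetical.

```python
def calibration_by_band(predicted_pds, defaulted, n_bands=5):
    """Compare mean predicted PD with the observed default rate per band.

    Accounts are sorted by predicted PD and split into n_bands groups of
    near-equal size. Returns one dict per non-empty band.
    """
    if len(predicted_pds) != len(defaulted):
        raise ValueError("inputs must have the same length")
    pairs = sorted(zip(predicted_pds, defaulted))
    n = len(pairs)
    rows = []
    for b in range(n_bands):
        band = pairs[b * n // n_bands:(b + 1) * n // n_bands]
        if not band:
            continue
        mean_pd = sum(p for p, _ in band) / len(band)
        observed = sum(d for _, d in band) / len(band)
        rows.append({
            "band": b + 1,
            "accounts": len(band),
            "mean_predicted_pd": mean_pd,
            "observed_default_rate": observed,
            "gap": observed - mean_pd,  # positive gap: model under-predicts defaults
        })
    return rows
```

A persistent positive gap in the higher-risk bands would suggest the model under-predicts defaults where it matters most. Whether a given gap is material depends on sample size and the observation window, which this sketch does not test.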
Independent Review Requirements
Both SR 11-7 and SR 26-2 require that validation be conducted by parties independent of model development and business use. Independence has two dimensions: organisational (the validator does not report to the model owner) and intellectual (genuine challenge, not rubber-stamping). Common independence failures: validators who are former developers of the model under review; validation teams funded by the model owner's budget; external vendors whose validation methodology was provided by the firm that built the model.
Model Validation Documentation Checklist
- Validation scope and objectives — what was tested, what was out of scope, and why
- Model overview and development documentation — theory, assumptions, data sources, training methodology
- Conceptual soundness assessment with explicit findings
- Data quality assessment — sources, completeness, representativeness, bias risks
- Quantitative testing results — backtests, sensitivity analysis, benchmarking, stress tests
- Outcomes analysis — predicted vs actual performance over available horizon
- Limitations inventory — known weaknesses, compensating controls, monitoring triggers
- Validation findings and ratings — classified by severity with remediation actions
- Validator independence attestation
- Management response and action plan
- Approval or conditional approval record
How AI Models Require Enhanced Validation Under SR 26-2
AI and ML models introduce validation challenges that SR 11-7's framework was not designed to address. The three components still apply, but each requires augmentation.
Conceptual soundness for AI: ML models often lack a clean theoretical basis. Validators must assess whether training data is representative and unbiased, whether the model generalises beyond the training distribution, and whether complexity is justified by performance gains over simpler alternatives.
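One way a validator might evidence the complexity question is a challenger comparison: score the same holdout sample with the complex model and with a simpler benchmark, then check whether the performance uplift clears a pre-agreed minimum. The sketch below uses AUC as the metric. The function names and the 0.02 minimum uplift are hypothetical placeholders, not regulatory values.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    O(n_pos * n_neg); adequate for a sketch, not for large holdout sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def complexity_justified(complex_scores, simple_scores, labels, min_uplift=0.02):
    """Return (justified, uplift) for a complex model against a simple benchmark."""
    uplift = auc(complex_scores, labels) - auc(simple_scores, labels)
    return uplift >= min_uplift, uplift
```

A single-metric comparison like this is only a starting point. A fuller assessment would also weigh explainability, stability over time, and the cost of maintaining the more complex model.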
Monitoring for AI: ML performance can degrade non-linearly as input distributions drift. Monitoring must include feature-level PSI, not just output-level tracking. Banks should define explicit revalidation triggers tied to monitored degradation thresholds.
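An explicit trigger rule of this kind can be sketched as follows, assuming PSI values have already been computed per feature and for the model output. The feature names, the `__output__` key, and the 0.10 and 0.25 thresholds are illustrative assumptions. Each bank would set and justify its own.

```python
def revalidation_triggers(feature_psi, output_psi, warn=0.10, breach=0.25):
    """Classify each monitored PSI value and decide whether to trigger revalidation.

    feature_psi: dict mapping feature name -> PSI versus the training baseline.
    output_psi: PSI of the model score distribution.
    Thresholds are illustrative defaults, not regulatory values.
    """
    def status(value):
        if value >= breach:
            return "breach"
        if value >= warn:
            return "warn"
        return "stable"

    report = {name: status(v) for name, v in feature_psi.items()}
    report["__output__"] = status(output_psi)
    breached = sorted(k for k, s in report.items() if s == "breach")
    return {"statuses": report, "breached": breached, "revalidate": bool(breached)}
```

For example, `revalidation_triggers({"income": 0.31, "utilisation": 0.04}, output_psi=0.06)` flags revalidation because of the `income` feature even though the output distribution looks stable, which is the situation output-only tracking would miss.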
Outcomes analysis for AI: Many AI models are used in domains where outcomes are delayed, rare, or difficult to isolate. Validators must document these limitations and define alternative performance metrics where outcomes analysis is structurally constrained.
SR 26-2 also introduces aggregate model risk — the view that multiple individually acceptable models can create unacceptable systemic risk when they interact. Validation programmes for AI-heavy portfolios should include scenario analysis of model interaction effects. See our SR 26-2 guide and MRM framework compliance guide for the full governance context.