SR 11-7 and its 2026 successor SR 26-2: the model risk management framework
The Federal Reserve's Supervisory Guidance on Model Risk Management (SR 11-7), issued in 2011, established the foundational framework for model risk management in US financial institutions. Despite its age, SR 11-7 was the dominant reference standard for model governance globally for 15 years, and was adopted by prudential regulators in Australia, the UK, Canada, and Europe as the basis for their own model risk expectations. On April 17, 2026, the US agencies issued SR 26-2 to supersede SR 11-7, moving to a risk-based, materiality-tiered approach. The core principles (conceptual soundness, independent validation, and governance) remain intact.
SR 11-7 defines a model as a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories to process input data into quantitative estimates. Modern machine learning systems clearly fall within this definition, so the guidance's core requirements (conceptual soundness, rigorous development, validation by independent parties, and sound governance) apply to ML models. What SR 11-7 did not anticipate is how difficult these requirements become when applied to models whose internal logic is not fully interpretable.
Where traditional model risk management breaks down with ML
The explainability gap
SR 11-7 requires that models be conceptually sound: their logic must be capable of being articulated, understood, and assessed for appropriateness. Traditional statistical models satisfy this requirement because the relationship between inputs and outputs can be expressed mathematically, examined for economic intuition, and evaluated by subject matter experts.
Deep learning models and complex ensemble methods do not satisfy this requirement in the same way. Their internal representations are not human-interpretable. A gradient-boosted model with thousands of trees, or a neural network with millions of parameters, produces outputs through processes that cannot be fully articulated even by the model's developers. Conceptual soundness assessment for these models requires different techniques: feature importance analysis, partial dependence plots, SHAP values, and adversarial testing, none of which provide the same level of assurance as being able to read a model's logic directly.
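One of the techniques listed above, feature importance analysis, can be illustrated with permutation importance: shuffle one input column at a time and measure how much the model's error increases. The following is a minimal, standard-library-only sketch rather than a validation-grade tool. The toy_model function, the three-feature layout, and the synthetic data are hypothetical stand-ins for a real model and validation sample.

```python
import random

def mean_abs_error(model, rows, targets):
    """Average absolute error of the model's predictions over the sample."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in error when each feature column is shuffled.

    A larger value means the model leans more heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_abs_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model: strong dependence on feature 0, weak on feature 1,
# none on feature 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]

data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
for index, value in enumerate(importances):
    print(f"feature {index}: error increase {value:.3f}")
```

A ranking like this tells a validator which inputs drive the output, but not why, which is the sense in which such techniques give weaker assurance than reading a model's logic directly.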
Distributional shift
Traditional model validation tests model performance on a hold-out sample drawn from the same distribution as the training data. This approach assumes that the distribution of inputs the model will encounter in production is similar to the distribution on which it was trained. For statistical models used in stable environments, this assumption is often reasonable.
For ML models deployed in dynamic environments, it frequently is not. Data drift (changes in the statistical properties of input data over time) and concept drift (changes in the relationship between inputs and the target variable) can degrade ML model performance rapidly and without obvious warning signs. A credit model trained on pre-2020 data encountered severe distributional shift during the pandemic. A fraud detection model trained on pre-2024 data may perform poorly against fraud patterns that emerged subsequently.
Model risk management for ML requires active monitoring for distributional shift, not just periodic performance review, and clear protocols for model refresh or replacement when drift is detected.
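As an illustration of what monitoring for data drift can look like, the sketch below computes the Population Stability Index (PSI) for a single feature, comparing a production sample against the training baseline. PSI is one common drift statistic among several, and SR 11-7 does not prescribe it. The bin count, the synthetic Gaussian data, and the 0.1 and 0.25 cut-offs (widely quoted rules of thumb) are assumptions made for the example.

```python
import math
import random

def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index of one feature: current sample vs. baseline."""
    ordered = sorted(baseline)
    # Bin edges come from baseline quantiles, so each bin holds roughly
    # equal baseline mass.
    edges = [ordered[int(len(ordered) * k / n_bins)] for k in range(1, n_bins)]

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            index = sum(1 for e in edges if v >= e)  # edges at or below v
            counts[index] += 1
        # eps keeps the logarithm finite when a bin is empty.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_shares(baseline)
    actual = bin_shares(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))

# Synthetic data (assumption): the shifted sample has its mean moved by
# one standard deviation.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
stable = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_stable = psi(training, stable)
psi_shifted = psi(training, shifted)

# Rule-of-thumb reading (assumption): below 0.1 stable, above 0.25 material shift.
print(f"stable sample PSI:  {psi_stable:.4f}")
print(f"shifted sample PSI: {psi_shifted:.4f}")
```

PSI on inputs detects data drift only. Concept drift has to be caught through performance metrics once outcomes are observed, since the input distribution can stay stable while the input-target relationship changes.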
Validation infrastructure
SR 11-7 requires independent model validation, meaning assessment by a team separate from model development. For traditional statistical models, this requires statistical expertise and access to model documentation and data. For complex ML models, independent validation requires ML expertise, access to training data and code, interpretability tooling, and the ability to conduct adversarial testing. Many financial institutions' validation functions do not yet have this capability, creating a gap between the requirement and the practice.
Adapting the SR 11-7 framework for ML
Tiered model inventory
Not all ML models require the same governance intensity. A tiered approach, classifying models by the criticality of the decisions they inform and the potential for adverse outcomes, allows governance resources to be concentrated where risk is highest. High-tier models (credit decisions, market risk, capital calculation) require full SR 11-7 treatment with ML-specific enhancements. Low-tier models (internal efficiency tools, non-consequential analytics) can operate under lighter governance.
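A tiering rule of this kind can be recorded directly in the model inventory. The sketch below is one hypothetical way to encode it: the use-case labels follow the examples in the paragraph above, while the ModelRecord fields, the customer_facing flag, and the "medium" default for unclassified models are assumptions added for illustration.

```python
from dataclasses import dataclass

# Use-case labels taken from the examples in the text.
HIGH_TIER_USES = {"credit_decision", "market_risk", "capital_calculation"}
LOW_TIER_USES = {"internal_efficiency", "non_consequential_analytics"}

@dataclass(frozen=True)
class ModelRecord:
    name: str
    use_case: str
    customer_facing: bool  # hypothetical extra criterion

def assign_tier(model: ModelRecord) -> str:
    """Map a model to a governance tier based on the decisions it informs."""
    if model.use_case in HIGH_TIER_USES:
        return "high"
    if model.use_case in LOW_TIER_USES and not model.customer_facing:
        return "low"
    # Assumption: anything not clearly low-risk defaults to a middle tier
    # until it has been reviewed.
    return "medium"

# Hypothetical inventory entries.
inventory = [
    ModelRecord("retail_pd_gbm", "credit_decision", customer_facing=True),
    ModelRecord("ticket_router", "internal_efficiency", customer_facing=False),
    ModelRecord("chat_intent_classifier", "customer_service", customer_facing=True),
]

tiers = {m.name: assign_tier(m) for m in inventory}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

Defaulting unknown use cases upward rather than to the low tier is a deliberate choice in this sketch, so that a model cannot receive light governance simply because nobody classified it.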
Pre-deployment validation standards for ML
Validation of ML models before production deployment should include: conceptual soundness assessment using interpretability techniques; performance testing across demographic subgroups to identify fairness issues; stress testing under distributional shift scenarios; adversarial testing for robustness; and documentation of model limitations and appropriate use constraints. These requirements should be codified in a model validation policy that explicitly addresses ML characteristics.
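The subgroup performance test in the list above can be sketched as a comparison of per-group metrics and the largest gap between groups. The example below is a minimal illustration on hand-made data. The group labels, the choice of accuracy and approval rate as metrics, and the records themselves are assumptions; a real validation policy would define its own metrics and tolerances.

```python
from collections import defaultdict

def subgroup_rates(records):
    """Per-group accuracy and approval rate from (group, y_true, y_pred) tuples."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "approved": 0})
    for group, y_true, y_pred in records:
        s = stats[group]
        s["n"] += 1
        s["correct"] += int(y_true == y_pred)
        s["approved"] += int(y_pred == 1)
    return {
        group: {
            "accuracy": s["correct"] / s["n"],
            "approval_rate": s["approved"] / s["n"],
        }
        for group, s in stats.items()
    }

def max_gap(rates, metric):
    """Largest difference in a metric between any two subgroups."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)

# Hypothetical validation records: (subgroup, actual outcome, model decision).
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]

rates = subgroup_rates(records)
accuracy_gap = max_gap(rates, "accuracy")
approval_gap = max_gap(rates, "approval_rate")
print(rates)
print(f"accuracy gap: {accuracy_gap:.2f}, approval-rate gap: {approval_gap:.2f}")
```

In this toy data both groups have the same accuracy but very different approval rates, which shows why a single aggregate performance figure is not enough to surface a fairness issue.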
Ongoing monitoring infrastructure
ML model monitoring requires automated infrastructure, not manual periodic review. Production monitoring systems should track input data distributions against training baselines, output distributions against expected ranges, performance metrics against defined thresholds, and fairness metrics across relevant subgroups. Monitoring triggers (defined thresholds that prompt escalation, investigation, or model suspension) must be documented and acted upon systematically.
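Documented monitoring triggers lend themselves to a declarative form that can be both reviewed by validators and executed by the monitoring system. The sketch below shows one possible shape. The Trigger fields, the metric names, and every threshold value are hypothetical and would be set by the institution's own policy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trigger:
    metric: str
    investigate: float  # first threshold: open an investigation
    escalate: float     # second threshold: escalate to model risk management
    suspend: float      # third threshold: suspend the model
    higher_is_worse: bool = True

def evaluate(trigger: Trigger, value: float) -> str:
    """Return the most severe action whose threshold the value has breached."""
    def breached(limit: float) -> bool:
        return value >= limit if trigger.higher_is_worse else value <= limit

    for action in ("suspend", "escalate", "investigate"):
        if breached(getattr(trigger, action)):
            return action
    return "ok"

# Hypothetical trigger definitions, one per monitored dimension.
triggers = [
    Trigger("feature_psi", investigate=0.10, escalate=0.25, suspend=0.50),
    Trigger("auc", investigate=0.72, escalate=0.70, suspend=0.65,
            higher_is_worse=False),
    Trigger("approval_rate_gap", investigate=0.05, escalate=0.10, suspend=0.20),
]

# Hypothetical values observed in the latest monitoring run.
observed = {"feature_psi": 0.31, "auc": 0.74, "approval_rate_gap": 0.06}

actions = {t.metric: evaluate(t, observed[t.metric]) for t in triggers}
for metric, action in actions.items():
    print(f"{metric}: {observed[metric]} -> {action}")
```

Keeping the thresholds in data rather than scattered through code makes them easy to version, audit, and change through the same governance process as the model itself.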
The EU AI Act alignment
For financial services firms subject to both SR 11-7 and the EU AI Act, there is significant structural alignment between the two frameworks. The EU AI Act's requirements for high-risk AI (risk management system, data governance, technical documentation, human oversight, accuracy and robustness) map directly to SR 11-7's model risk management components. Firms with mature model risk management functions can build EU AI Act compliance on that foundation, rather than treating it as a parallel obligation. The incremental requirements are real but manageable: the Act adds transparency obligations, conformity assessment, and EU AI database registration that SR 11-7 does not require.