06_Improve
Phillip Bailey edited this page Jun 24, 2025 · 5 revisions
The Improve function focuses on recovering from AI-related incidents, addressing root causes, and continuously enhancing AI system resilience, governance, and trustworthiness.
This section aligns with:
- NIST CSF 2.0: RECOVER
- NIST AI RMF 1.0: MANAGE
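- EU AI Act (Regulation (EU) 2024/1689): Chapter IX (Post-Market Monitoring, Information Sharing and Market Surveillance), together with the corrective-action duties in Article 20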

Objectives:
- Restore AI system operations safely after a failure, misuse, or degradation
- Feed lessons learned into updated controls, policies, and models
- Track improvement metrics over time (e.g., bias reduction, incident frequency)
- Decommission or re-engineer systems that consistently fail trust standards
- Build organizational resilience through continuous feedback and adaptive governance
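
The objective of decommissioning systems that "consistently fail trust standards" needs an operational definition of "consistently". The sketch below is one possible reading, not a prescribed method: a model is flagged for retirement or re-engineering once it has failed a set number of consecutive evaluation windows. The `EvaluationWindow` schema, the metric names, and all threshold values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationWindow:
    """One periodic evaluation of a deployed model (hypothetical schema)."""
    label: str           # e.g. "2025-Q1"
    fairness_gap: float  # e.g. a demographic parity difference; lower is better
    error_rate: float    # fraction of incorrect outputs; lower is better


# Illustrative thresholds only; real values come from your own trust standards.
MAX_FAIRNESS_GAP = 0.10
MAX_ERROR_RATE = 0.05
CONSECUTIVE_FAILURES_TO_RETIRE = 3


def window_fails(window):
    """A window fails if either metric exceeds its threshold."""
    return (window.fairness_gap > MAX_FAIRNESS_GAP
            or window.error_rate > MAX_ERROR_RATE)


def trailing_failures(history):
    """Count how many of the most recent windows failed in a row."""
    count = 0
    for window in reversed(history):
        if not window_fails(window):
            break
        count += 1
    return count


def recommend_action(history):
    """Map the current failure streak to a coarse recommendation."""
    streak = trailing_failures(history)
    if streak >= CONSECUTIVE_FAILURES_TO_RETIRE:
        return "retire_or_reengineer"
    if streak > 0:
        return "corrective_action"
    return "keep"
```

Counting only the trailing streak means a model that recovers after corrective action resets its count, which keeps a single bad quarter from triggering retirement. The recommendation is an input to a human governance decision, not a replacement for it.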

Expected outcomes:
- AI systems are safely restored or gracefully retired post-incident
- Improvements are traceable across governance, monitoring, and safeguards
- Recurring failures are reduced and policy updates reflect real-world learning
- Trust in AI systems is reinforced through transparent and accountable iteration
| Element | Description |
|---|---|
| Recovery Procedures | Steps to safely restore model performance or shut down malfunctioning AI |
| Root Cause Analysis (RCA) | Formal review of why an incident occurred and how to prevent recurrence |
| Corrective Action Plans | Documented changes to controls, data, or governance processes |
| Model Versioning | Maintain traceable versions and rollback plans for major changes |
| Postmortems | Structured reviews capturing human, technical, and process factors |
| Retirement Strategy | Criteria and workflows to phase out harmful or underperforming models |
| Continuous Learning Loops | Use audit logs, incident reviews, and monitoring signals to drive improvement |
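
To make the Model Versioning row concrete, here is a minimal in-memory sketch of a version registry that records every rollback. It is an illustration under stated assumptions rather than a reference implementation: `ModelRegistry`, `ModelVersion`, `RollbackLogEntry`, and the `approved` flag are hypothetical names, and a real deployment would typically back this with a model registry service and persistent storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ModelVersion:
    version: str
    artifact_uri: str       # where the trained artifact lives (hypothetical)
    approved: bool = False  # passed governance review and may serve traffic


@dataclass
class RollbackLogEntry:
    timestamp: datetime
    from_version: str
    to_version: str
    reason: str


class ModelRegistry:
    """Tracks versions in registration order and logs every rollback."""

    def __init__(self):
        self._versions = []
        self.active = None
        self.rollback_log = []

    def register(self, version, artifact_uri, approved=False):
        if any(v.version == version for v in self._versions):
            raise ValueError(f"version {version!r} already registered")
        self._versions.append(ModelVersion(version, artifact_uri, approved))

    def _index_of(self, version):
        for i, v in enumerate(self._versions):
            if v.version == version:
                return i
        raise KeyError(version)

    def promote(self, version):
        """Make an approved version the active one."""
        candidate = self._versions[self._index_of(version)]
        if not candidate.approved:
            raise ValueError(f"version {version!r} is not approved")
        self.active = version

    def rollback(self, reason):
        """Switch to the latest approved version registered before the active one."""
        if self.active is None:
            raise LookupError("no active version to roll back from")
        idx = self._index_of(self.active)
        for candidate in reversed(self._versions[:idx]):
            if candidate.approved:
                entry = RollbackLogEntry(
                    timestamp=datetime.now(timezone.utc),
                    from_version=self.active,
                    to_version=candidate.version,
                    reason=reason,
                )
                self.rollback_log.append(entry)
                self.active = candidate.version
                return entry
        raise LookupError("no earlier approved version available")
```

Requiring a `reason` on every rollback is the design choice that matters here: it turns the log into evidence that can be cited in a later root cause analysis or postmortem.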

Recommended practices:
- Conduct post-incident reviews within 48 hours of any AI failure
- Update governance policies based on RCA findings
- Establish model rollback and retraining pipelines in CI/CD
- Measure and publish improvement KPIs (e.g., mean time to detect (MTTD) and mean time to recover (MTTR) for incidents)
- Retire models that consistently underperform on fairness or reliability metrics
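
The MTTD/MTTR KPIs and the 48-hour review target can be computed from a simple incident record. The sketch below assumes a hypothetical `Incident` schema, measures MTTR from detection to recovery (some teams measure from occurrence instead), and starts the 48-hour clock at detection. All of these are assumptions to adjust to your own definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REVIEW_DEADLINE = timedelta(hours=48)  # from the practice list above


@dataclass(frozen=True)
class Incident:
    """Timestamps for one AI incident (hypothetical schema)."""
    occurred_at: datetime
    detected_at: datetime
    recovered_at: datetime
    review_held_at: Optional[datetime] = None


def _mean_hours(deltas):
    """Average a sequence of timedeltas, expressed in hours."""
    deltas = list(deltas)
    if not deltas:
        return 0.0
    total_seconds = sum(d.total_seconds() for d in deltas)
    return total_seconds / len(deltas) / 3600


def mttd_hours(incidents):
    """Mean time to detect: occurrence to detection."""
    return _mean_hours(i.detected_at - i.occurred_at for i in incidents)


def mttr_hours(incidents):
    """Mean time to recover: detection to recovery (assumed definition)."""
    return _mean_hours(i.recovered_at - i.detected_at for i in incidents)


def review_on_time(incident):
    """True if a post-incident review was held within 48 hours of detection."""
    if incident.review_held_at is None:
        return False
    return incident.review_held_at - incident.detected_at <= REVIEW_DEADLINE
```

Publishing these values per reporting period, rather than as a single all-time average, makes it possible to see whether corrective actions are actually shortening detection and recovery times.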
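
Relevant NIST CSF 2.0 categories:
- RC.RP – Incident Recovery Plan Execution
- RC.CO – Incident Recovery Communication
- ID.IM – Improvement (CSF 1.1's RC.IM was folded into this IDENTIFY category in CSF 2.0)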

NIST AI RMF 1.0 MANAGE:
- Improve based on documented incidents and feedback
- Update risk posture, policies, and controls continuously
- Learn from new context, threats, and stakeholder expectations

Supporting templates and tools:
- AI Post-Incident Review Template
- Root Cause Analysis (RCA) Worksheet
- Model Versioning & Rollback Log
- AI System Decommissioning Guide
- Trustworthiness Improvement KPI Dashboard