06_Improve

Phillip Bailey edited this page Jun 24, 2025 · 5 revisions

Improve

The Improve function focuses on recovering from AI-related incidents, addressing root causes, and continuously enhancing AI system resilience, governance, and trustworthiness.

This section aligns with:

  • NIST CSF 2.0: RECOVER
  • NIST AI RMF 1.0: MANAGE
  • EU AI Act (Regulation (EU) 2024/1689): Chapter IX (Post-Market Monitoring, Information Sharing and Market Surveillance), with corrective actions under Article 20

Objectives

  • Restore AI system operations safely after a failure, misuse, or degradation
  • Feed lessons learned into updated controls, policies, and models
  • Track improvement metrics over time (e.g., bias reduction, incident frequency)
  • Decommission or re-engineer systems that consistently fail trust standards
  • Build organizational resilience through continuous feedback and adaptive governance
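One way to make "consistently fail trust standards" testable is to require a metric to miss its threshold for several evaluation windows in a row before a system is flagged. The sketch below is illustrative only: the function name, the three-window default, and the 0.80 threshold are assumptions, not values defined by this framework.

```python
from typing import Sequence


def consistently_failing(scores: Sequence[float], threshold: float, windows: int = 3) -> bool:
    """Return True when the most recent `windows` scores all fall below `threshold`.

    `scores` holds one value per evaluation window, oldest first
    (for example a fairness or reliability score per month).
    """
    if windows < 1:
        raise ValueError("windows must be at least 1")
    if len(scores) < windows:
        return False  # not enough history to call the failure consistent
    return all(score < threshold for score in scores[-windows:])


# Hypothetical monthly fairness scores for two systems; the threshold is an assumption.
THRESHOLD = 0.80
recovering_system = [0.78, 0.76, 0.83, 0.79]  # one recent pass breaks the failing streak
degrading_system = [0.85, 0.79, 0.77, 0.74]   # last three windows all below the threshold

flag_recovering = consistently_failing(recovering_system, THRESHOLD)
flag_degrading = consistently_failing(degrading_system, THRESHOLD)
```

A flag from a check like this would normally open a decommission-or-re-engineer review rather than retire the system automatically.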

Outcomes

  • AI systems are safely restored or gracefully retired post-incident
  • Improvements are traceable across governance, monitoring, and safeguards
  • Recurring failures are reduced and policy updates reflect real-world learning
  • Trust in AI systems is reinforced through transparent and accountable iteration

Key Improvement Elements

Element | Description
Recovery Procedures | Steps to safely restore model performance or shut down malfunctioning AI
Root Cause Analysis (RCA) | Formal review of why an incident occurred and how to prevent recurrence
Corrective Action Plans | Documented changes to controls, data, or governance processes
Model Versioning | Maintain traceable versions and rollback plans for major changes
Postmortems | Structured reviews capturing human, technical, and process factors
Retirement Strategy | Criteria and workflows to phase out harmful or underperforming models
Continuous Learning Loops | Use audit logs, incident reviews, and monitoring signals to drive improvement
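The Model Versioning element can be sketched as a small in-memory registry that keeps every version, tracks which one is active, and logs each deployment and rollback with a reason. This is a minimal illustration, not a prescribed implementation: the class names, status values, and artifact paths are hypothetical, and a real deployment would more likely use an existing model registry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    version: str
    artifact_uri: str          # hypothetical storage location of the trained model
    status: str = "candidate"  # candidate -> active -> superseded | rolled_back


@dataclass
class ModelRegistry:
    """Keeps all versions in deployment order plus a traceable change log."""
    versions: List[ModelVersion] = field(default_factory=list)
    log: List[Dict[str, str]] = field(default_factory=list)

    def active(self) -> Optional[ModelVersion]:
        return next((v for v in self.versions if v.status == "active"), None)

    def _record(self, action: str, from_version: str, to_version: str, reason: str) -> None:
        self.log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "from": from_version,
            "to": to_version,
            "reason": reason,
        })

    def deploy(self, new: ModelVersion, reason: str) -> None:
        current = self.active()
        if current is not None:
            current.status = "superseded"
        new.status = "active"
        self.versions.append(new)
        self._record("deploy", current.version if current else "", new.version, reason)

    def rollback(self, reason: str) -> ModelVersion:
        """Restore the most recent superseded version and mark the current one rolled back."""
        current = self.active()
        target = next((v for v in reversed(self.versions) if v.status == "superseded"), None)
        if current is None or target is None:
            raise RuntimeError("no earlier version available for rollback")
        current.status = "rolled_back"
        target.status = "active"
        self._record("rollback", current.version, target.version, reason)
        return target


registry = ModelRegistry()
registry.deploy(ModelVersion("1.0.0", "models/example-1.0.0"), "initial release")
registry.deploy(ModelVersion("1.1.0", "models/example-1.1.0"), "retrained on new data")
restored = registry.rollback("fairness regression found in post-incident review")
print(restored.version)  # 1.0.0
```

Versions marked rolled_back are skipped by later rollbacks, so a failed release is never restored by accident, and the log entries give the RCA and postmortem a record of what changed and why.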

Example Practices

  • Conduct post-incident reviews within 48 hours of any AI failure
  • Update governance policies based on RCA findings
  • Establish model rollback and retraining pipelines in CI/CD
  • Measure and publish improvement KPIs (e.g., incident mean time to detect (MTTD) and mean time to recover (MTTR))
  • Retire models that consistently underperform on fairness or reliability metrics
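The KPI and review-timing practices above can be computed from basic incident timestamps. The sketch below makes several assumptions: MTTD is measured from incident start to detection, MTTR from detection to resolution (definitions of MTTR vary between teams), and the 48-hour review window is counted from detection. The `Incident` record and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Optional


@dataclass
class Incident:
    started: datetime                   # when the failure began
    detected: datetime                  # when monitoring or a person noticed it
    resolved: datetime                  # when safe operation was restored
    reviewed: Optional[datetime] = None # when the post-incident review was held


def mttd_hours(incidents: List[Incident]) -> float:
    """Mean time to detect, in hours (start -> detection)."""
    return mean((i.detected - i.started).total_seconds() for i in incidents) / 3600


def mttr_hours(incidents: List[Incident]) -> float:
    """Mean time to recover, in hours (detection -> resolution)."""
    return mean((i.resolved - i.detected).total_seconds() for i in incidents) / 3600


def review_on_time(incident: Incident, limit: timedelta = timedelta(hours=48)) -> bool:
    """True if a review was held within `limit` of detection (assumed reference point)."""
    return incident.reviewed is not None and incident.reviewed - incident.detected <= limit


# Illustrative incidents, not real data.
incidents = [
    Incident(datetime(2025, 1, 1, 0), datetime(2025, 1, 1, 2),
             datetime(2025, 1, 1, 8), reviewed=datetime(2025, 1, 2, 10)),
    Incident(datetime(2025, 2, 1, 0), datetime(2025, 2, 1, 4),
             datetime(2025, 2, 1, 14), reviewed=datetime(2025, 2, 4, 4)),
]

print(mttd_hours(incidents))                    # 3.0
print(mttr_hours(incidents))                    # 8.0
print([review_on_time(i) for i in incidents])   # [True, False]
```

Publishing these figures per quarter, alongside the share of incidents reviewed on time, gives a simple trend line for the improvement KPIs.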

Related NIST Categories

CSF 2.0 RECOVER Categories

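  • RC.RP – Incident Recovery Plan Execution
  • RC.CO – Incident Recovery Communication
  • ID.IM – Improvement (CSF 2.0 moved the former RC.IM category into the IDENTIFY function)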

AI RMF MANAGE Focus Areas

  • Improve based on documented incidents and feedback
  • Update risk posture, policies, and controls continuously
  • Learn from new context, threats, and stakeholder expectations

Artefacts

  • AI Post-Incident Review Template
  • Root Cause Analysis (RCA) Worksheet
  • Model Versioning & Rollback Log
  • AI System Decommissioning Guide
  • Trustworthiness Improvement KPI Dashboard
