ThePagePage/rag-governance-framework

RAG Governance Framework

A comprehensive governance framework for Retrieval-Augmented Generation (RAG) systems in regulated industries. Provides citation tracking, audit trails, human-in-the-loop validation, and compliance patterns for building trustworthy AI applications.

License: MIT

Why This Framework?

RAG systems in regulated industries (financial services, legal, healthcare, etc.) face unique challenges:

  • Auditability: Regulators require clear evidence of how AI-generated responses were produced
  • Citation Accuracy: Claims must be traceable to source documents
  • Human Oversight: High-risk outputs require human review before delivery
  • Compliance: Responses must adhere to regulatory and policy constraints
  • Version Control: Source documents change; responses must reflect the correct version

This framework provides reusable patterns that address each of these challenges.

Features

🔗 Citation Tracking

  • Automatic extraction and validation of source citations
  • Confidence scoring for citation relevance
  • Citation chain verification (nested references)
  • Missing citation detection

📋 Audit Trail

  • Immutable logging of all RAG operations
  • Full request/response capture with metadata
  • Tamper-evident audit records
  • Export to compliance systems (JSON, CSV)

👤 Human-in-the-Loop Validation

  • Configurable validation workflows
  • Risk-based routing (auto-approve low risk, escalate high risk)
  • Reviewer assignment and tracking
  • Approval/rejection with comments
  • SLA monitoring for review queues

✅ Compliance Patterns

  • Content policy enforcement
  • PII detection and redaction hooks
  • Prohibited topic filtering
  • Response length and format constraints
  • Regulatory disclosure injection

📊 Governance Dashboard Hooks

  • Metrics collection for monitoring
  • Quality scoring over time
  • Reviewer performance tracking
  • Citation accuracy trends

Installation

npm install rag-governance-framework

Quick Start

const { 
  RAGGovernor,
  CitationTracker,
  AuditLogger,
  ValidationWorkflow 
} = require('rag-governance-framework');

// Initialize the governance layer
const governor = new RAGGovernor({
  auditLogger: new AuditLogger({ storage: 'file', path: './audit-logs' }),
  citationTracker: new CitationTracker({ requireCitations: true }),
  validationWorkflow: new ValidationWorkflow({
    riskThreshold: 0.7,  // Auto-approve if risk < 0.7
    reviewers: ['legal-team@example.com']
  })
});

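// Wrap your RAG pipeline. Run this inside an async function:
// CommonJS modules (which use require) do not support top-level await.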
const response = await governor.process({
  query: "What are the early retirement provisions?",
  context: retrievedDocuments,  // Your RAG retrieval results
  generateFn: async (query, context) => {
    // Your LLM call here
    return await openai.chat.completions.create({ ... });
  }
});

// Response includes governance metadata
console.log(response.content);           // The generated response
console.log(response.citations);         // Extracted citations with validation
console.log(response.auditId);           // Unique audit trail ID
console.log(response.validationStatus);  // 'approved', 'pending', 'rejected'

Core Concepts

RAGGovernor

The main orchestrator that wraps your RAG pipeline with governance controls.

const governor = new RAGGovernor({
  // Required: Audit logging configuration
  auditLogger: new AuditLogger(options),
  
  // Optional: Citation tracking
  citationTracker: new CitationTracker(options),
  
  // Optional: Human validation workflow
  validationWorkflow: new ValidationWorkflow(options),
  
  // Optional: Compliance policies
  policies: [
    new PIIPolicy({ action: 'redact' }),
    new ContentPolicy({ prohibitedTopics: ['investment-advice'] }),
    new DisclosurePolicy({ footer: 'This is AI-generated content.' })
  ]
});

Citation Tracking

Ensures AI responses are grounded in source documents.

const tracker = new CitationTracker({
  requireCitations: true,           // Reject responses without citations
  minCitations: 1,                  // Minimum citations required
  validateSourceExists: true,       // Check cited sources exist in context
  confidenceThreshold: 0.5,         // Minimum relevance score
  citationFormat: 'inline'          // 'inline', 'footnote', or 'endnote'
});

// Extract and validate citations from a response
const result = tracker.validate(response, sourceDocuments);
console.log(result.citations);      // Array of extracted citations
console.log(result.isValid);        // Whether citation requirements met
console.log(result.missingRefs);    // Citations that couldn't be verified
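
To make the validation step more concrete, here is a minimal sketch of how inline citations might be extracted and checked against the retrieved context. It assumes citations appear as bracketed source IDs such as `[doc-1]` and that each source document carries an `id` field. The helper name `validateCitations` and both conventions are illustrative assumptions, not the framework's actual implementation.

```javascript
// Hypothetical helper: not part of rag-governance-framework.
// Assumes inline citations look like [doc-1] and sources carry an `id` field.
function validateCitations(responseText, sourceDocuments, { minCitations = 1 } = {}) {
  const knownIds = new Set(sourceDocuments.map((doc) => doc.id));

  // Collect unique bracketed IDs in order of first appearance.
  const cited = [];
  for (const match of responseText.matchAll(/\[([A-Za-z0-9_-]+)\]/g)) {
    if (!cited.includes(match[1])) cited.push(match[1]);
  }

  // Split cited IDs into those found in the context and those that are not.
  const citations = cited.filter((id) => knownIds.has(id));
  const missingRefs = cited.filter((id) => !knownIds.has(id));

  return {
    citations,
    missingRefs,
    isValid: citations.length >= minCitations && missingRefs.length === 0,
  };
}

const checked = validateCitations(
  'Early retirement is available from age 55 [doc-1], subject to approval [doc-9].',
  [{ id: 'doc-1' }, { id: 'doc-2' }]
);
console.log(checked);
```

In this example `doc-9` is cited but absent from the context, so it lands in `missingRefs` and the result is marked invalid.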

Audit Logger

Creates immutable, tamper-evident records of all RAG operations.

const logger = new AuditLogger({
  storage: 'file',                  // 'file', 'memory', or custom adapter
  path: './audit-logs',
  retention: '7-years',             // Retention policy
  includeContext: true,             // Log retrieved documents
  includePrompts: true,             // Log system/user prompts
  hashAlgorithm: 'sha256',          // For tamper detection
  sensitiveFields: ['ssn', 'dob']   // Fields to redact in logs
});

// Manually log an event
await logger.log({
  eventType: 'rag-query',
  query: userQuery,
  response: generatedResponse,
  citations: extractedCitations,
  metadata: { userId: '123', sessionId: 'abc' }
});

// Export for compliance reporting
await logger.export({ 
  format: 'json', 
  dateRange: { start: '2024-01-01', end: '2024-12-31' }
});
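
One common way to make a log tamper-evident is a hash chain, where each record stores the hash of the record before it. The sketch below illustrates that general technique with Node's built-in `crypto` module. The function names and the `'GENESIS'` seed value are assumptions made for illustration, and this README does not confirm that the framework works this way internally.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical sketch of a hash-chained log; not the framework's implementation.
function hashRecord(prevHash, payload) {
  return createHash('sha256')
    .update(prevHash)
    .update(JSON.stringify(payload))
    .digest('hex');
}

// Append a record whose hash covers both its payload and the previous hash.
function appendRecord(chain, payload) {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : 'GENESIS';
  const record = { payload, prevHash, hash: hashRecord(prevHash, payload) };
  chain.push(record);
  return record;
}

// Recompute every hash; an edited payload or a broken link fails verification.
function verifyChain(chain) {
  let prevHash = 'GENESIS';
  for (const record of chain) {
    if (record.prevHash !== prevHash) return false;
    if (record.hash !== hashRecord(prevHash, record.payload)) return false;
    prevHash = record.hash;
  }
  return true;
}

const chain = [];
appendRecord(chain, { eventType: 'rag-query', query: 'What are the early retirement provisions?' });
appendRecord(chain, { eventType: 'validation', status: 'approved' });
console.log(verifyChain(chain)); // true

chain[0].payload.query = 'tampered';
console.log(verifyChain(chain)); // false
```

Note that `JSON.stringify` is sensitive to key order, so a production version would need a canonical serialization before hashing.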

Validation Workflow

Routes high-risk outputs to human reviewers.

const workflow = new ValidationWorkflow({
  // Risk assessment
  riskAssessor: (response, context) => {
    // Custom risk scoring logic
    if (response.content.includes('financial advice')) return 0.9;
    if (response.citations.length === 0) return 0.8;
    return 0.3;
  },
  
  // Thresholds
  autoApproveThreshold: 0.3,    // Auto-approve if risk <= this
  autoRejectThreshold: 0.95,    // Auto-reject if risk >= this
  
  // Review queue configuration
  reviewers: ['team@example.com'],
  slaHours: 24,                 // Review SLA
  escalationPath: ['manager@example.com'],
  
  // Callbacks
  onApproval: async (response, reviewer) => { /* notify user */ },
  onRejection: async (response, reviewer, reason) => { /* handle rejection */ },
  onTimeout: async (response) => { /* SLA breach handling */ }
});

// Submit for validation
const result = await workflow.submit(response, {
  priority: 'high',
  requester: 'user@example.com'
});

// Check status
const status = await workflow.getStatus(result.validationId);
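
The threshold comments above imply a three-way routing decision. The sketch below shows one way that decision and a simple SLA check could be expressed. `routeByRisk` and `isSlaBreached` are hypothetical helper names, and the boundary handling (inclusive on both thresholds) follows the comments in the configuration rather than any confirmed library behavior.

```javascript
// Hypothetical routing helper mirroring the threshold comments above.
function routeByRisk(score, { autoApproveThreshold = 0.3, autoRejectThreshold = 0.95 } = {}) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError('risk score must be a number between 0 and 1');
  }
  if (score >= autoRejectThreshold) return 'rejected';
  if (score <= autoApproveThreshold) return 'approved';
  return 'pending'; // Everything in between goes to a human reviewer.
}

// Assumed SLA rule: a review is overdue once more than slaHours have elapsed.
function isSlaBreached(submittedAt, slaHours, now = new Date()) {
  const elapsedMs = now.getTime() - submittedAt.getTime();
  return elapsedMs > slaHours * 60 * 60 * 1000;
}

console.log(routeByRisk(0.3));  // approved
console.log(routeByRisk(0.8));  // pending
console.log(routeByRisk(0.95)); // rejected
```

Checking the reject threshold first means a misconfiguration where the two thresholds overlap fails closed rather than auto-approving.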

Compliance Policies

Enforce content and regulatory requirements.

const { 
  PIIPolicy, 
  ContentPolicy, 
  DisclosurePolicy,
  LengthPolicy 
} = require('rag-governance-framework/policies');

// PII Detection and Redaction
const piiPolicy = new PIIPolicy({
  action: 'redact',              // 'redact', 'reject', or 'flag'
  patterns: ['email', 'phone', 'ssn', 'credit-card'],
  customPatterns: [/MEMBER-\d{6}/g]  // Custom PII patterns
});

// Content Restrictions
const contentPolicy = new ContentPolicy({
  prohibitedTopics: ['investment-advice', 'medical-diagnosis'],
  requiredDisclaimers: ['ai-generated'],
  maxConfidenceLevel: 0.9       // Flag overconfident responses
});

// Regulatory Disclosures
const disclosurePolicy = new DisclosurePolicy({
  header: null,
  footer: 'This response was generated by AI and should be verified.',
  conditions: {
    'financial': 'This is not financial advice.',
    'legal': 'This is not legal advice. Consult a qualified professional.'
  }
});
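
As a rough illustration of what a `'redact'` action might do, the sketch below applies a few regular expressions and replaces each match with a labeled placeholder. The patterns, the placeholder format, and the `redactPII` name are all assumptions made for this example. Real PII detection needs considerably more robust rules than these simplified regexes.

```javascript
// Hypothetical patterns for illustration only; they are deliberately simple.
const PII_PATTERNS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  memberId: /MEMBER-\d{6}/g,
};

// Replace every match with a labeled placeholder and record what was found.
function redactPII(text, patterns = PII_PATTERNS) {
  const found = [];
  let redacted = text;
  for (const [label, pattern] of Object.entries(patterns)) {
    redacted = redacted.replace(pattern, () => {
      found.push(label);
      return `[REDACTED:${label.toUpperCase()}]`;
    });
  }
  return { redacted, found };
}

const { redacted, found } = redactPII(
  'Contact jane@example.com about MEMBER-123456 (SSN 123-45-6789).'
);
console.log(redacted);
// Contact [REDACTED:EMAIL] about [REDACTED:MEMBERID] (SSN [REDACTED:SSN]).
console.log(found); // [ 'email', 'ssn', 'memberId' ]
```

Returning the list of detected labels alongside the redacted text lets the same function back a `'flag'` or `'reject'` action as well.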

Advanced Usage

Custom Storage Adapters

const { AuditLogger, StorageAdapter } = require('rag-governance-framework');

class PostgresAdapter extends StorageAdapter {
  async write(record) {
    await db.query('INSERT INTO audit_logs ...', record);
  }
  
  async read(filter) {
    return await db.query('SELECT * FROM audit_logs WHERE ...', filter);
  }
}

const logger = new AuditLogger({
  storage: new PostgresAdapter({ connectionString: '...' })
});
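
For tests or local development, an adapter that needs no database can be handy. The sketch below shows an in-memory adapter with the same `write`/`read` shape as the example above. It defines a stand-in `StorageAdapter` base class so that it runs on its own, and the `{ eventType }` filter shape is an assumption rather than a documented contract.

```javascript
// Stand-in base class so the sketch runs on its own; in real use you would
// extend the StorageAdapter exported by the framework instead.
class StorageAdapter {
  async write(_record) { throw new Error('write() not implemented'); }
  async read(_filter) { throw new Error('read() not implemented'); }
}

class InMemoryAdapter extends StorageAdapter {
  constructor() {
    super();
    this.records = [];
  }

  async write(record) {
    // Shallow-freeze a copy so top-level fields cannot be changed after storage.
    this.records.push(Object.freeze({ ...record }));
  }

  // Assumed filter shape: { eventType?: string }.
  async read(filter = {}) {
    return this.records.filter(
      (record) => !filter.eventType || record.eventType === filter.eventType
    );
  }
}

async function demo() {
  const adapter = new InMemoryAdapter();
  await adapter.write({ eventType: 'rag-query', query: 'example' });
  await adapter.write({ eventType: 'validation', status: 'approved' });
  return adapter.read({ eventType: 'rag-query' });
}

demo().then((rows) => console.log(rows.length)); // 1
```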

Risk Assessment Models

const { ValidationWorkflow, RiskAssessor } = require('rag-governance-framework');

class MLRiskAssessor extends RiskAssessor {
  async assess(response, context) {
    // Call your ML model
    const score = await mlModel.predict({
      response: response.content,
      citationCount: response.citations.length,
      topicClassification: await classify(response.content)
    });
    
    return {
      score,
      factors: ['low-citation-count', 'sensitive-topic'],
      explanation: 'Response discusses financial topics with limited citations'
    };
  }
}
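
If no ML model is available, a rule-based assessor can return the same `{ score, factors, explanation }` shape. The sketch below is one hypothetical way to do that. The weights, the sensitive-term list, and the `assessRisk` name are illustrative assumptions, and it expects a response object with `content` and `citations` fields as in the example above.

```javascript
// Hypothetical rule-based assessor; weights and terms are illustrative only.
const SENSITIVE_TERMS = ['investment', 'diagnosis', 'lawsuit'];

function assessRisk(response) {
  const factors = [];
  let score = 0.1; // Baseline risk for any generated answer.

  if (response.citations.length === 0) {
    factors.push('no-citations');
    score += 0.5;
  }

  const text = response.content.toLowerCase();
  if (SENSITIVE_TERMS.some((term) => text.includes(term))) {
    factors.push('sensitive-topic');
    score += 0.3;
  }

  return {
    // Round to two decimals to hide floating-point noise, then cap at 1.
    score: Math.min(1, Number(score.toFixed(2))),
    factors,
    explanation:
      factors.length > 0 ? `Risk factors: ${factors.join(', ')}` : 'No risk factors detected',
  };
}

console.log(assessRisk({ content: 'Consider this investment.', citations: [] }));
```

Keeping the triggered factors in the result makes each score explainable to a reviewer or auditor.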

Integration with Azure AI / OpenAI

const { RAGGovernor } = require('rag-governance-framework');
const { AzureOpenAI } = require('openai');

const client = new AzureOpenAI({ ... });

const governor = new RAGGovernor({ ... });

// Wrap your Azure OpenAI calls
const governedResponse = await governor.process({
  query: userQuery,
  context: await vectorStore.search(userQuery),
  generateFn: async (query, context) => {
    const completion = await client.chat.completions.create({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: systemPrompt },
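        // Serialize the retrieved context explicitly; interpolating an array of
        // objects directly would render as "[object Object]".
        { role: 'user', content: `Context: ${JSON.stringify(context)}\n\nQuestion: ${query}` }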
      ]
    });
    return completion.choices[0].message.content;
  }
});
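
How retrieved documents are rendered into the prompt affects whether citations can be traced back to sources. The sketch below shows one possible formatting helper that labels each document with its ID. It assumes each retrieved document has `id` and `text` fields, and `buildUserMessage` is a hypothetical name. The actual shape of your vector store results may differ.

```javascript
// Hypothetical helper; assumes each retrieved document has `id` and `text`.
function buildUserMessage(query, documents) {
  // Prefix each document with its ID so the model can cite it inline.
  const contextBlock = documents
    .map((doc) => `[${doc.id}] ${doc.text}`)
    .join('\n');
  return `Context:\n${contextBlock}\n\nQuestion: ${query}`;
}

console.log(
  buildUserMessage('What are the early retirement provisions?', [
    { id: 'doc-1', text: 'Members may retire early from age 55.' },
    { id: 'doc-2', text: 'Early retirement reduces the annual benefit.' },
  ])
);
```

A system prompt can then instruct the model to cite sources using the same bracketed IDs.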

Configuration Reference

Environment Variables

RAG_GOVERNANCE_LOG_LEVEL=info
RAG_GOVERNANCE_AUDIT_PATH=./audit-logs
RAG_GOVERNANCE_RETENTION_DAYS=2555
RAG_GOVERNANCE_HASH_ALGORITHM=sha256
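
The sketch below shows one way an application could read these variables into a plain config object. The variable names come from the list above. The `loadGovernanceConfig` helper, and the use of the listed example values as defaults, are assumptions for illustration. Note that 2555 days corresponds to 7 years of 365 days.

```javascript
// Hypothetical loader; treating the example values as defaults is an assumption.
function loadGovernanceConfig(env = process.env) {
  const retentionDays = Number.parseInt(env.RAG_GOVERNANCE_RETENTION_DAYS ?? '2555', 10);
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new Error('RAG_GOVERNANCE_RETENTION_DAYS must be a positive integer');
  }
  return {
    logLevel: env.RAG_GOVERNANCE_LOG_LEVEL ?? 'info',
    auditPath: env.RAG_GOVERNANCE_AUDIT_PATH ?? './audit-logs',
    retentionDays,
    hashAlgorithm: env.RAG_GOVERNANCE_HASH_ALGORITHM ?? 'sha256',
  };
}

console.log(loadGovernanceConfig({ RAG_GOVERNANCE_RETENTION_DAYS: '365' }));
```

Validating the retention value at startup surfaces a misconfigured retention period before any audit records are written.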

Full Configuration Example

const governor = new RAGGovernor({
  auditLogger: new AuditLogger({
    storage: 'file',
    path: process.env.RAG_GOVERNANCE_AUDIT_PATH || './audit-logs',
    retention: '7-years',
    includeContext: true,
    includePrompts: true,
    hashAlgorithm: 'sha256',
    sensitiveFields: ['ssn', 'dob', 'account_number'],
    rotationPolicy: 'daily'
  }),
  
  citationTracker: new CitationTracker({
    requireCitations: true,
    minCitations: 1,
    maxCitations: 20,
    validateSourceExists: true,
    confidenceThreshold: 0.5,
    citationFormat: 'inline',
    allowedSourceTypes: ['document', 'policy', 'regulation']
  }),
  
  validationWorkflow: new ValidationWorkflow({
    enabled: true,
    autoApproveThreshold: 0.3,
    autoRejectThreshold: 0.95,
    reviewers: ['reviewer@example.com'],
    slaHours: 24,
    escalationPath: ['manager@example.com'],
    queuePersistence: 'redis'
  }),
  
  policies: [
    new PIIPolicy({ action: 'redact' }),
    new ContentPolicy({ prohibitedTopics: ['investment-advice'] }),
    new DisclosurePolicy({ footer: 'AI-generated content.' })
  ],
  
  metrics: {
    enabled: true,
    collector: 'prometheus',
    prefix: 'rag_governance_'
  }
});

Best Practices for Regulated Industries

Financial Services

  1. Citation Requirements: Always require citations for factual claims
  2. Disclosure: Add clear AI-generated disclaimers
  3. Audit Retention: 7+ years for regulatory compliance
  4. Human Review: Route investment-related queries to compliance

Legal

  1. Jurisdiction Awareness: Track which jurisdiction documents apply to
  2. Version Control: Ensure citations reference correct document versions
  3. Privilege: Flag potentially privileged content
  4. Disclaimer: Clear "not legal advice" disclosures

Healthcare

  1. PHI Protection: Strict PII/PHI redaction policies
  2. Clinical Review: Route medical content to clinical reviewers
  3. FDA Compliance: Flag promotional content for review
  4. Audit Trail: HIPAA-compliant logging

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

License

MIT License - see LICENSE for details.

Acknowledgments

Built with insights from practitioners in financial services, legal tech, and healthcare AI. Special thanks to the AI governance community for sharing patterns and anti-patterns.
