RAG Governance Framework

A governance framework for Retrieval-Augmented Generation (RAG) systems in regulated industries. It provides citation tracking, audit trails, human-in-the-loop validation, and compliance patterns for building trustworthy AI applications.

Why This Framework?

RAG systems in regulated industries (financial services, legal, healthcare, etc.) face unique challenges:

Auditability: Regulators require clear evidence of how AI-generated responses were produced
Citation Accuracy: Claims must be traceable to source documents
Human Oversight: High-risk outputs require human review before delivery
Compliance: Responses must adhere to regulatory and policy constraints
Version Control: Source documents change; responses must reflect the correct version

This framework provides reusable patterns that address each of these challenges.

Features

🔗 Citation Tracking

Automatic extraction and validation of source citations
Confidence scoring for citation relevance
Citation chain verification (nested references)
Missing citation detection

📋 Audit Trail

Immutable logging of all RAG operations
Full request/response capture with metadata
Tamper-evident audit records
Export to compliance systems (JSON, CSV)

👤 Human-in-the-Loop Validation

Configurable validation workflows
Risk-based routing (auto-approve low risk, escalate high risk)
Reviewer assignment and tracking
Approval/rejection with comments
SLA monitoring for review queues

✅ Compliance Patterns

Content policy enforcement
PII detection and redaction hooks
Prohibited topic filtering
Response length and format constraints
Regulatory disclosure injection

📊 Governance Dashboard Hooks

Metrics collection for monitoring
Quality scoring over time
Reviewer performance tracking
Citation accuracy trends

Installation

npm install rag-governance-framework

Quick Start

const { 
  RAGGovernor,
  CitationTracker,
  AuditLogger,
  ValidationWorkflow 
} = require('rag-governance-framework');

// Initialize the governance layer
const governor = new RAGGovernor({
  auditLogger: new AuditLogger({ storage: 'file', path: './audit-logs' }),
  citationTracker: new CitationTracker({ requireCitations: true }),
  validationWorkflow: new ValidationWorkflow({
    riskThreshold: 0.7,  // Auto-approve if risk < 0.7
    reviewers: ['legal-team@example.com']
  })
});

// Wrap your RAG pipeline. In CommonJS, await must run inside an async function.
async function main() {
  const response = await governor.process({
    query: "What are the early retirement provisions?",
    context: retrievedDocuments,  // Your RAG retrieval results
    generateFn: async (query, context) => {
      // Your LLM call here
      return await openai.chat.completions.create({ ... });
    }
  });

  // Response includes governance metadata
  console.log(response.content);           // The generated response
  console.log(response.citations);         // Extracted citations with validation
  console.log(response.auditId);           // Unique audit trail ID
  console.log(response.validationStatus);  // 'approved', 'pending', 'rejected'
}

main().catch(console.error);

Core Concepts

RAGGovernor

The main orchestrator that wraps your RAG pipeline with governance controls.

const governor = new RAGGovernor({
  // Required: Audit logging configuration
  auditLogger: new AuditLogger(options),
  
  // Optional: Citation tracking
  citationTracker: new CitationTracker(options),
  
  // Optional: Human validation workflow
  validationWorkflow: new ValidationWorkflow(options),
  
  // Optional: Compliance policies
  policies: [
    new PIIPolicy({ action: 'redact' }),
    new ContentPolicy({ prohibitedTopics: ['investment-advice'] }),
    new DisclosurePolicy({ footer: 'This is AI-generated content.' })
  ]
});

Citation Tracking

Ensures AI responses are grounded in source documents.

const tracker = new CitationTracker({
  requireCitations: true,           // Reject responses without citations
  minCitations: 1,                  // Minimum citations required
  validateSourceExists: true,       // Check cited sources exist in context
  confidenceThreshold: 0.5,         // Minimum relevance score
  citationFormat: 'inline'          // 'inline', 'footnote', or 'endnote'
});

// Extract and validate citations from a response
const result = tracker.validate(response, sourceDocuments);
console.log(result.citations);      // Array of extracted citations
console.log(result.isValid);        // Whether citation requirements met
console.log(result.missingRefs);    // Citations that couldn't be verified
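
To make the idea of grounding concrete, here is a minimal, self-contained sketch of inline citation checking. It does not use the framework's API: the `[doc-id]` marker format, the `extractCitations` and `validateCitations` helpers, and the `{ id, text }` document shape are all assumptions made for illustration.

```javascript
// Hypothetical sketch: extract inline citation markers such as [plan-2024]
// and check that each one refers to a document present in the context.
// The marker format and helper names are assumptions, not the library's API.

function extractCitations(text) {
  // Match bracketed IDs made of letters, digits, dots, underscores, hyphens.
  const matches = text.matchAll(/\[([A-Za-z0-9._-]+)\]/g);
  // De-duplicate while preserving first-seen order.
  return [...new Set([...matches].map((m) => m[1]))];
}

function validateCitations(text, sourceDocuments, { minCitations = 1 } = {}) {
  const knownIds = new Set(sourceDocuments.map((doc) => doc.id));
  const citations = extractCitations(text);
  // A citation is "missing" when no document in the context carries that ID.
  const missingRefs = citations.filter((id) => !knownIds.has(id));
  return {
    citations,
    missingRefs,
    isValid: citations.length >= minCitations && missingRefs.length === 0,
  };
}

const docs = [
  { id: 'plan-2024', text: 'Early retirement is available from age 55.' },
  { id: 'policy-7', text: 'Benefits are reduced for each year before 65.' },
];
const answer =
  'Early retirement starts at 55 [plan-2024], with reduced benefits [policy-9].';
const citationCheck = validateCitations(answer, docs);
```

Here `[policy-9]` is not among the supplied documents, so it ends up in `missingRefs` and the response fails validation. Checking that the cited passage actually supports the claim (the relevance score mentioned above) would need a separate similarity or entailment step, which this sketch leaves out.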

Audit Logger

Creates immutable, tamper-evident records of all RAG operations.

const logger = new AuditLogger({
  storage: 'file',                  // 'file', 'memory', or custom adapter
  path: './audit-logs',
  retention: '7-years',             // Retention policy
  includeContext: true,             // Log retrieved documents
  includePrompts: true,             // Log system/user prompts
  hashAlgorithm: 'sha256',          // For tamper detection
  sensitiveFields: ['ssn', 'dob']   // Fields to redact in logs
});

// Manually log an event
await logger.log({
  eventType: 'rag-query',
  query: userQuery,
  response: generatedResponse,
  citations: extractedCitations,
  metadata: { userId: '123', sessionId: 'abc' }
});

// Export for compliance reporting
await logger.export({ 
  format: 'json', 
  dateRange: { start: '2024-01-01', end: '2024-12-31' }
});
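
One common way to get tamper-evident records is a hash chain, where each record includes the hash of the one before it. The sketch below illustrates that technique with Node's built-in `crypto` module. How `AuditLogger` uses `hashAlgorithm` internally is not documented here, so the record shape and the `appendRecord` and `verifyChain` helpers are assumptions rather than the framework's implementation.

```javascript
const crypto = require('node:crypto');

// Hypothetical sketch of a tamper-evident log: every record stores the hash
// of the previous record, so editing any earlier entry breaks the chain.
// Record shape and function names are assumptions, not the library's API.

const GENESIS_HASH = '0'.repeat(64);

function hashRecord(prevHash, timestamp, event) {
  return crypto
    .createHash('sha256')
    .update(JSON.stringify({ prevHash, timestamp, event }))
    .digest('hex');
}

function appendRecord(chain, event, timestamp = new Date().toISOString()) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS_HASH;
  const record = {
    prevHash,
    timestamp,
    event,
    hash: hashRecord(prevHash, timestamp, event),
  };
  chain.push(record);
  return record;
}

function verifyChain(chain) {
  let prevHash = GENESIS_HASH;
  for (const record of chain) {
    // Recompute the hash from the stored fields and compare.
    const expected = hashRecord(prevHash, record.timestamp, record.event);
    if (record.prevHash !== prevHash || record.hash !== expected) return false;
    prevHash = record.hash;
  }
  return true;
}

const chain = [];
appendRecord(chain, { eventType: 'rag-query', query: 'What is the notice period?' });
appendRecord(chain, { eventType: 'validation', status: 'approved' });
const intactBefore = verifyChain(chain);

// Simulate tampering with the first record's payload.
chain[0].event.query = 'edited after the fact';
const intactAfter = verifyChain(chain);
```

A chain like this detects edits but does not prevent them. An attacker who can rewrite the whole file can also recompute every hash, so in practice the latest hash is usually anchored somewhere the writer cannot modify.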

Validation Workflow

Routes high-risk outputs to human reviewers.

const workflow = new ValidationWorkflow({
  // Risk assessment
  riskAssessor: (response, context) => {
    // Custom risk scoring logic
    if (response.content.includes('financial advice')) return 0.9;
    if (response.citations.length === 0) return 0.8;
    return 0.3;
  },
  
  // Thresholds
  autoApproveThreshold: 0.3,    // Auto-approve if risk <= this
  autoRejectThreshold: 0.95,    // Auto-reject if risk >= this
  
  // Review queue configuration
  reviewers: ['team@example.com'],
  slaHours: 24,                 // Review SLA
  escalationPath: ['manager@example.com'],
  
  // Callbacks
  onApproval: async (response, reviewer) => { /* notify user */ },
  onRejection: async (response, reviewer, reason) => { /* handle rejection */ },
  onTimeout: async (response) => { /* SLA breach handling */ }
});

// Submit for validation
const result = await workflow.submit(response, {
  priority: 'high',
  requester: 'user@example.com'
});

// Check status
const status = await workflow.getStatus(result.validationId);
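
The two thresholds split the risk range into three bands. The following standalone sketch shows that routing decision and an SLA deadline calculation. The `routeByRisk` and `slaDeadline` functions are hypothetical helpers written for illustration; the threshold values and status strings are taken from the examples in this README.

```javascript
// Hypothetical sketch of threshold-based routing. Function names are
// assumptions; thresholds and status strings mirror the README examples.

function routeByRisk(
  score,
  { autoApproveThreshold = 0.3, autoRejectThreshold = 0.95 } = {}
) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError('risk score must be a number between 0 and 1');
  }
  if (score >= autoRejectThreshold) return 'rejected';
  if (score <= autoApproveThreshold) return 'approved';
  return 'pending'; // between the thresholds: needs a human reviewer
}

// Compute when a queued review breaches its SLA.
function slaDeadline(submittedAt, slaHours = 24) {
  return new Date(submittedAt.getTime() + slaHours * 60 * 60 * 1000);
}

const routed = [0.3, 0.8, 0.95].map((score) => routeByRisk(score));
const deadline = slaDeadline(new Date('2024-01-01T09:00:00Z'), 24);
```

With the default thresholds, a score of 0.8 (for example, a response with no citations in the assessor above) lands in the review queue rather than being approved or rejected automatically.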

Compliance Policies

Enforce content and regulatory requirements.

const { 
  PIIPolicy, 
  ContentPolicy, 
  DisclosurePolicy,
  LengthPolicy 
} = require('rag-governance-framework/policies');

// PII Detection and Redaction
const piiPolicy = new PIIPolicy({
  action: 'redact',              // 'redact', 'reject', or 'flag'
  patterns: ['email', 'phone', 'ssn', 'credit-card'],
  customPatterns: [/MEMBER-\d{6}/g]  // Custom PII patterns
});

// Content Restrictions
const contentPolicy = new ContentPolicy({
  prohibitedTopics: ['investment-advice', 'medical-diagnosis'],
  requiredDisclaimers: ['ai-generated'],
  maxConfidenceLevel: 0.9       // Flag overconfident responses
});

// Regulatory Disclosures
const disclosurePolicy = new DisclosurePolicy({
  header: null,
  footer: 'This response was generated by AI and should be verified.',
  conditions: {
    'financial': 'This is not financial advice.',
    'legal': 'This is not legal advice. Consult a qualified professional.'
  }
});
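
As a rough illustration of what a `redact` action followed by a footer disclosure might do, here is a self-contained sketch built on regular expressions. It is not the framework's implementation: the `PII_PATTERNS` table, the `[REDACTED:...]` placeholder, and the `redactPII` and `addDisclosure` helpers are assumptions, and the patterns are deliberately simple.

```javascript
// Hypothetical sketch of pattern-based redaction followed by disclosure
// injection. The patterns will miss many real-world formats; names and
// placeholder text are assumptions, not the library's API.

const PII_PATTERNS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  memberId: /\bMEMBER-\d{6}\b/g,
};

function redactPII(text, patterns = PII_PATTERNS) {
  const found = [];
  let redacted = text;
  for (const [label, pattern] of Object.entries(patterns)) {
    // Replace every match and record which pattern fired.
    redacted = redacted.replace(pattern, () => {
      found.push(label);
      return `[REDACTED:${label}]`;
    });
  }
  return { redacted, found };
}

function addDisclosure(text, footer) {
  // Append the footer as its own paragraph when one is configured.
  return footer ? `${text}\n\n${footer}` : text;
}

const raw = 'Contact jane.doe@example.com about MEMBER-123456 (SSN 123-45-6789).';
const { redacted, found } = redactPII(raw);
const finalText = addDisclosure(
  redacted,
  'This response was generated by AI and should be verified.'
);
```

Redaction runs before the disclosure is appended so that the footer text is never scanned or altered by the patterns.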

Advanced Usage

Custom Storage Adapters

const { AuditLogger, StorageAdapter } = require('rag-governance-framework');

class PostgresAdapter extends StorageAdapter {
  async write(record) {
    await db.query('INSERT INTO audit_logs ...', record);
  }
  
  async read(filter) {
    return await db.query('SELECT * FROM audit_logs WHERE ...', filter);
  }
}

const logger = new AuditLogger({
  storage: new PostgresAdapter({ connectionString: '...' })
});

Risk Assessment Models

const { ValidationWorkflow, RiskAssessor } = require('rag-governance-framework');

class MLRiskAssessor extends RiskAssessor {
  async assess(response, context) {
    // Call your ML model
    const score = await mlModel.predict({
      response: response.content,
      citationCount: response.citations.length,
      topicClassification: await classify(response.content)
    });
    
    return {
      score,
      factors: ['low-citation-count', 'sensitive-topic'],
      explanation: 'Response discusses financial topics with limited citations'
    };
  }
}
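
Where no ML model is available, a weighted rule table can produce a result of the same `{ score, factors, explanation }` shape. The sketch below is a standalone illustration and does not extend the framework's `RiskAssessor` class; the rule names, weights, and keyword list are invented for the example.

```javascript
// Hypothetical rule-based assessor returning { score, factors, explanation }.
// Rules, weights, and keywords are illustrative assumptions.

const RISK_RULES = [
  { factor: 'no-citations', weight: 0.5, test: (r) => r.citations.length === 0 },
  {
    factor: 'sensitive-topic',
    weight: 0.4,
    test: (r) => /\b(invest|diagnos|lawsuit)/i.test(r.content),
  },
  { factor: 'long-response', weight: 0.1, test: (r) => r.content.length > 2000 },
];

function assessRisk(response, rules = RISK_RULES) {
  const triggered = rules.filter((rule) => rule.test(response));
  const total = triggered.reduce((sum, rule) => sum + rule.weight, 0);
  const factors = triggered.map((rule) => rule.factor);
  return {
    score: Math.min(1, total), // clamp so the score stays within [0, 1]
    factors,
    explanation: factors.length
      ? `Triggered risk factors: ${factors.join(', ')}`
      : 'No risk factors triggered',
  };
}

const assessment = assessRisk({
  content: 'You should invest your pension in a single stock.',
  citations: [],
});
```

Returning the triggered factors alongside the score gives reviewers and auditors a reason for each routing decision, which a bare number does not.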

Integration with Azure AI / OpenAI

const { RAGGovernor } = require('rag-governance-framework');
const { AzureOpenAI } = require('openai');

const client = new AzureOpenAI({ ... });

const governor = new RAGGovernor({ ... });

// Wrap your Azure OpenAI calls. In CommonJS, await must run inside an async function.
async function answer(userQuery) {
  return governor.process({
    query: userQuery,
    context: await vectorStore.search(userQuery),
    generateFn: async (query, context) => {
      const completion = await client.chat.completions.create({
        model: 'gpt-4',
        messages: [
          { role: 'system', content: systemPrompt },
          { role: 'user', content: `Context: ${context}\n\nQuestion: ${query}` }
        ]
      });
      return completion.choices[0].message.content;
    }
  });
}

Configuration Reference

Environment Variables

RAG_GOVERNANCE_LOG_LEVEL=info
RAG_GOVERNANCE_AUDIT_PATH=./audit-logs
RAG_GOVERNANCE_RETENTION_DAYS=2555
RAG_GOVERNANCE_HASH_ALGORITHM=sha256
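
If you want to read these variables in your own code, a small loader with validation might look like the sketch below. Whether and how the framework itself consumes them is not described here, so `loadGovernanceConfig` is a hypothetical helper and the fallback values are simply the ones listed above.

```javascript
// Hypothetical sketch: read the variables above with fallbacks and validate
// the numeric one. The function name and returned shape are assumptions.

function loadGovernanceConfig(env = process.env) {
  const retentionDays = Number.parseInt(
    env.RAG_GOVERNANCE_RETENTION_DAYS ?? '2555',
    10
  );
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new Error('RAG_GOVERNANCE_RETENTION_DAYS must be a positive integer');
  }
  return {
    logLevel: env.RAG_GOVERNANCE_LOG_LEVEL ?? 'info',
    auditPath: env.RAG_GOVERNANCE_AUDIT_PATH ?? './audit-logs',
    retentionDays,
    hashAlgorithm: env.RAG_GOVERNANCE_HASH_ALGORITHM ?? 'sha256',
  };
}

const config = loadGovernanceConfig({ RAG_GOVERNANCE_RETENTION_DAYS: '2555' });
const retentionYears = config.retentionDays / 365;
```

The value 2555 corresponds to seven 365-day years, which matches the `'7-years'` retention setting used in the other examples.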

Full Configuration Example

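const {
  RAGGovernor,
  CitationTracker,
  AuditLogger,
  ValidationWorkflow
} = require('rag-governance-framework');
const {
  PIIPolicy,
  ContentPolicy,
  DisclosurePolicy
} = require('rag-governance-framework/policies');

const governor = new RAGGovernor({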
  auditLogger: new AuditLogger({
    storage: 'file',
    path: process.env.RAG_GOVERNANCE_AUDIT_PATH || './audit-logs',
    retention: '7-years',
    includeContext: true,
    includePrompts: true,
    hashAlgorithm: 'sha256',
    sensitiveFields: ['ssn', 'dob', 'account_number'],
    rotationPolicy: 'daily'
  }),
  
  citationTracker: new CitationTracker({
    requireCitations: true,
    minCitations: 1,
    maxCitations: 20,
    validateSourceExists: true,
    confidenceThreshold: 0.5,
    citationFormat: 'inline',
    allowedSourceTypes: ['document', 'policy', 'regulation']
  }),
  
  validationWorkflow: new ValidationWorkflow({
    enabled: true,
    autoApproveThreshold: 0.3,
    autoRejectThreshold: 0.95,
    reviewers: ['reviewer@example.com'],
    slaHours: 24,
    escalationPath: ['manager@example.com'],
    queuePersistence: 'redis'
  }),
  
  policies: [
    new PIIPolicy({ action: 'redact' }),
    new ContentPolicy({ prohibitedTopics: ['investment-advice'] }),
    new DisclosurePolicy({ footer: 'AI-generated content.' })
  ],
  
  metrics: {
    enabled: true,
    collector: 'prometheus',
    prefix: 'rag_governance_'
  }
});

Best Practices for Regulated Industries

Financial Services

Citation Requirements: Always require citations for factual claims
Disclosure: Add clear AI-generated disclaimers
Audit Retention: Keep audit records for the full regulatory retention period (the examples in this README use 7 years)
Human Review: Route investment-related queries to compliance

Legal

Jurisdiction Awareness: Track which jurisdiction documents apply to
Version Control: Ensure citations reference correct document versions
Privilege: Flag potentially privileged content
Disclaimer: Clear "not legal advice" disclosures

Healthcare

PHI Protection: Strict PII/PHI redaction policies
Clinical Review: Route medical content to clinical reviewers
FDA Compliance: Flag promotional content for review
Audit Trail: HIPAA-compliant logging

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

License

MIT License - see LICENSE for details.

Acknowledgments

Built with insights from practitioners in financial services, legal tech, and healthcare AI. Special thanks to the AI governance community for sharing patterns and anti-patterns.
