ivproduced/nist-rag-agent

NIST RAG Agent 🤖

Conversational AI assistant for NIST cybersecurity standards and OSCAL compliance, powered by Retrieval-Augmented Generation (RAG)

Python 3.10+ | LangChain | License: MIT

🎯 What is This?

A production-ready AI agent that answers questions about NIST cybersecurity frameworks (SP 800-53, 800-37, 800-171, etc.) using:

  • RAG (Retrieval-Augmented Generation) - Grounds answers in text retrieved from actual NIST documents rather than the model's memory alone
  • LangChain - Multi-tool agent with chat history
  • FAISS - Vector similarity search over 10+ NIST publications
  • OpenAI/Azure OpenAI - GPT-4 for intelligent responses

Perfect for security assessors, compliance professionals, and anyone working with NIST standards.
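
To make the retrieval step concrete, here is a minimal sketch of top-k similarity search over pre-computed chunk embeddings. It is not the project's implementation: it uses brute-force cosine similarity in plain Python as a stand-in for the FAISS index, and the `top_k_chunks` helper, the toy 3-dimensional vectors, and the chunk texts are all invented for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query.

    Hypothetical helper: a real deployment would query a FAISS index instead
    of scanning every vector.
    """
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


# Toy data: three chunks with invented 3-dimensional embeddings.
chunks = ["AC-1 policy and procedures", "IR-4 incident handling", "AC-2 account management"]
chunk_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.1, 0.0]  # pretend embedding of an access-control question

results = top_k_chunks(query_vec, chunk_vecs, chunks, k=2)
```

With these toy vectors the two access-control chunks rank ahead of the incident-handling chunk, which is the behavior a vector index is meant to reproduce at scale.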

✨ Features

  • 📚 Pre-indexed NIST Documents: 10+ publications ready to query
    • NIST SP 800-53 Rev 5 (Security Controls)
    • NIST SP 800-37 Rev 2 (Risk Management Framework)
    • NIST SP 800-171 Rev 3 (CUI Protection)
    • NIST SP 800-60, 800-63, 800-30, and more
  • 🔍 Intelligent Tool Selection: RAG → Control lookup → Web search fallback
  • 💬 Session-based Chat History: Contextual conversations per user
  • 🎯 Citation: Always includes Control ID, Title, URL, Section
  • 🚀 FastAPI Service: REST API ready for integration
  • 🐳 Docker Ready: Containerized deployment
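
The tool order in the feature list (RAG, then control lookup, then web search) can be pictured as a fallback chain. The sketch below is an assumption about the control flow, not the agent's actual code: in a LangChain agent the model usually chooses tools itself, so the fixed ordering here is a simplification. The three tool functions are hypothetical stand-ins.

```python
from typing import Callable, Optional

Tool = Callable[[str], Optional[str]]


def answer_with_fallback(question: str, tools: list[tuple[str, Tool]]) -> tuple[str, str]:
    """Try each tool in order and return (tool_name, answer) from the first hit."""
    for name, tool in tools:
        result = tool(question)
        if result:  # None or an empty string means "no useful answer"
            return name, result
    return "none", "No answer found."


# Hypothetical stand-ins for the three tools named in the feature list.
def rag_search(q: str) -> Optional[str]:
    return "Access control policy guidance..." if "access control" in q.lower() else None


def control_lookup(q: str) -> Optional[str]:
    return "IR-4: Incident Handling" if "IR-4" in q else None


def web_search(q: str) -> Optional[str]:
    return f"Web results for: {q}"


TOOLS = [("rag", rag_search), ("control_lookup", control_lookup), ("web_search", web_search)]

used_tool, answer = answer_with_fallback("What is IR-4?", TOOLS)
```

Here the RAG stand-in returns nothing for the IR-4 question, so the control lookup answers; a question neither recognizes falls through to web search.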

🚀 Quick Start

Prerequisites

python >= 3.10
openai >= 1.0
langchain >= 0.1

Installation

# Clone the repository
git clone https://github.com/ivproduced/nist-rag-agent.git
cd nist-rag-agent

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your OpenAI API key

Basic Usage

from agent import NistRagAgent

# Initialize the agent
agent = NistRagAgent()

# Ask a question
response = agent.query(
    question="What does NIST say about access control?",
    session_id="user123"
)

print(response["answer"])
# Includes citations: Control ID, Title, URL
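
The feature list says citations carry a Control ID, Title, URL, and Section. As a rough illustration of how such a citation might be rendered, consider the sketch below. The `Citation` dataclass and `format_citation` function are hypothetical: the README only documents `response["answer"]`, so the real response structure may differ. The URL is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record with the four fields the README lists."""
    control_id: str
    title: str
    url: str
    section: str


def format_citation(c: Citation) -> str:
    """Render one citation as a single line suitable for appending to an answer."""
    return f"[{c.control_id}] {c.title}, {c.section} <{c.url}>"


example = Citation(
    control_id="AC-1",
    title="Policy and Procedures",
    url="https://example.com/sp800-53r5",  # placeholder URL, not a real source
    section="Section 3.1",
)
line = format_citation(example)
```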

Run as API Service

# Start the FastAPI server
python api_service.py

# Query via REST
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "Explain AC-1", "session_id": "user123"}'
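
The same request can be issued from Python with only the standard library. This sketch mirrors the curl call above (same `/query` path, JSON body, and header); the function names are illustrative, and it assumes the service replies with a JSON object, which the README does not spell out.

```python
import json
import urllib.request


def build_query_request(question: str, session_id: str,
                        base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = json.dumps({"question": question, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(req: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body; needs the API service running."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_query_request("Explain AC-1", "user123")
# response = send_query(req)  # uncomment once the server is up
```

Splitting request construction from sending keeps the payload easy to inspect or unit-test without a live server.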

๐Ÿ“ Project Structure

nist-rag-agent/
โ”œโ”€โ”€ agent.py                 # Core RAG agent implementation
โ”œโ”€โ”€ api_service.py          # FastAPI REST service
โ”œโ”€โ”€ embeddings/             # Pre-built NIST document embeddings
โ”‚   โ”œโ”€โ”€ NIST.SP.800-53r5.chunks.json
โ”‚   โ”œโ”€โ”€ NIST.SP.800-53r5.chunks.npy
โ”‚   โ””โ”€โ”€ ... (10+ documents)
โ”œโ”€โ”€ tools/                  # Custom LangChain tools
โ”‚   โ”œโ”€โ”€ nist_lookup.py
โ”‚   โ”œโ”€โ”€ control_detail.py
โ”‚   โ””โ”€โ”€ web_search.py
โ”œโ”€โ”€ examples/              # Usage examples
โ”‚   โ”œโ”€โ”€ basic_query.py
โ”‚   โ”œโ”€โ”€ batch_analysis.py
โ”‚   โ””โ”€โ”€ session_demo.py
โ”œโ”€โ”€ tests/                 # Test suite
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ .env.example
โ”œโ”€โ”€ Dockerfile
โ””โ”€โ”€ README.md
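
Each document in `embeddings/` appears as a `.chunks.json` / `.chunks.npy` pair. A quick way to check which documents have both files is sketched below. The pairing convention is inferred from the two filenames shown in the tree, and `complete_documents` and the `NIST.SP.800-37r2` filename are hypothetical.

```python
import tempfile
from pathlib import Path


def complete_documents(embeddings_dir: Path) -> list[str]:
    """Return document names that have both a .chunks.json and a .chunks.npy file."""
    json_docs = {p.name[: -len(".chunks.json")] for p in embeddings_dir.glob("*.chunks.json")}
    npy_docs = {p.name[: -len(".chunks.npy")] for p in embeddings_dir.glob("*.chunks.npy")}
    return sorted(json_docs & npy_docs)


# Demo against a throwaway directory; the second document lacks its .npy file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in (
        "NIST.SP.800-53r5.chunks.json",
        "NIST.SP.800-53r5.chunks.npy",
        "NIST.SP.800-37r2.chunks.json",
    ):
        (root / name).touch()
    ready = complete_documents(root)
```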

🛠️ Advanced Usage

Custom Embeddings

Add your own NIST documents:

from tools.embedding_builder import build_embeddings

# Build embeddings from PDF
build_embeddings(
    pdf_path="NIST.SP.800-XX.pdf",
    output_dir="embeddings/"
)
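
The README does not show what `build_embeddings` does internally. A typical first step in such a pipeline is splitting the extracted text into chunks, which might look like the sketch below. This is an assumption, not the project's code: it uses fixed-size character windows, the default size mirrors the `CHUNK_SIZE=1000` setting from the Configuration section, and the overlap value is invented.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that text cut at a
    boundary also appears at the start of the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "x" * 2500
pieces = chunk_text(sample)
```

Production pipelines often split on sentence or section boundaries instead, which tends to keep control statements intact.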

Multi-Session Chat

# User Alice asks about access control
agent.query("What is AC-1?", session_id="alice")

# Later, Alice asks a follow-up
agent.query("What are the requirements?", session_id="alice")
# Agent remembers we're talking about AC-1

# User Bob has a separate conversation
agent.query("What is IR-4?", session_id="bob")
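
Per-session memory like this is commonly backed by a store keyed on `session_id`. The class below is a hypothetical in-memory sketch of that idea, not the agent's real history mechanism; the message cap and the `(role, text)` tuple shape are assumptions.

```python
from collections import defaultdict


class SessionHistory:
    """Hypothetical in-memory store keeping one message list per session ID."""

    def __init__(self, max_messages: int = 20):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self._max = max_messages
        self._store: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add(self, session_id: str, role: str, text: str) -> None:
        """Append a message and drop the oldest ones beyond the cap."""
        messages = self._store[session_id]
        messages.append((role, text))
        del messages[:-self._max]

    def get(self, session_id: str) -> list[tuple[str, str]]:
        """Return a copy of the session's messages, oldest first."""
        return list(self._store.get(session_id, []))


history = SessionHistory()
history.add("alice", "user", "What is AC-1?")
history.add("alice", "user", "What are the requirements?")
history.add("bob", "user", "What is IR-4?")
```

Because each session ID maps to its own list, Alice's follow-up can be resolved against her earlier question while Bob's conversation stays separate.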

Tool Customization

# Add your own tools
from langchain_core.tools import tool

from agent import NistRagAgent

@tool("custom_tool")
def my_custom_tool(query: str) -> str:
    """Your custom NIST-related functionality"""
    return "Custom response"

# Register the custom tool alongside the built-in ones
agent = NistRagAgent(extra_tools=[my_custom_tool])
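
As an example of logic a custom tool might wrap, the sketch below pulls SP 800-53 style control identifiers out of free text. It is plain Python and could be decorated with `@tool` as shown above. The pattern is an assumption covering the common forms (two-letter family, control number, optional enhancement such as `AC-2(1)`), not an official grammar.

```python
import re

# Two-letter family, hyphen, control number, optional enhancement such as "(1)".
CONTROL_ID = re.compile(r"\b([A-Z]{2})-(\d{1,2})(?!\d)(?:\((\d{1,2})\))?")


def extract_control_ids(text: str) -> list[str]:
    """Return the distinct control IDs found in text, in order of first appearance."""
    seen: dict[str, None] = {}  # dict preserves insertion order and drops repeats
    for match in CONTROL_ID.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


ids = extract_control_ids("Compare AC-2(1) with AC-2 and IR-4; AC-2 is the base control.")
```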

๐Ÿณ Docker Deployment

# Build the image
docker build -t nist-rag-agent .

# Run the container
docker run -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  nist-rag-agent

# Or use docker-compose
docker-compose up -d

📊 Supported NIST Publications

| Document | Description | Chunks |
|----------|-------------|--------|
| SP 800-53 Rev 5 | Security and Privacy Controls | ~2,500 |
| SP 800-37 Rev 2 | Risk Management Framework | ~800 |
| SP 800-171 Rev 3 | CUI Protection | ~600 |
| SP 800-60 Vol 2 Rev 1 | Information Types | ~1,200 |
| SP 800-63-3 | Digital Identity | ~900 |
| SP 800-30 Rev 1 | Risk Assessment | ~700 |
| SP 800-137 | Continuous Monitoring | ~400 |
| SP 800-18 Rev 1 | Security Plans | ~300 |
| CSWP 29 | AI Risk Management | ~500 |

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Test specific functionality
pytest tests/test_agent.py::test_access_control_query

🔧 Configuration

Edit .env to customize:

# OpenAI Configuration
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o  # or gpt-4, gpt-3.5-turbo

# Azure OpenAI (alternative)
AZURE_OPENAI_ENDPOINT=https://...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_DEPLOYMENT=...

# LangChain (optional)
LANGCHAIN_API_KEY=...
LANGCHAIN_TRACING_V2=true

# RAG Configuration
TOP_K_RESULTS=3
CHUNK_SIZE=1000
EMBEDDING_MODEL=text-embedding-ada-002
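
For reference, the RAG settings could be read in Python roughly as follows. This is a sketch, not the project's loader: `RagSettings` and `load_rag_settings` are hypothetical names, and the defaults simply repeat the example values above. Note that `os.environ` only sees variables already exported to the process; whether the project loads `.env` itself (for example with python-dotenv's `load_dotenv()`) is not stated in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical container for the RAG-related settings shown in .env."""
    top_k: int
    chunk_size: int
    embedding_model: str


def load_rag_settings(env: Mapping[str, str] = os.environ) -> RagSettings:
    """Read RAG settings from env, falling back to the example .env values."""
    return RagSettings(
        top_k=int(env.get("TOP_K_RESULTS", "3")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-ada-002"),
    )


# Passing a plain dict makes the loader easy to exercise without touching the process env.
settings = load_rag_settings({"TOP_K_RESULTS": "5"})
```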

๐Ÿค Contributing

Contributions welcome! Areas of interest:

  • Additional NIST publications (800-137A, 800-161, etc.)
  • Enhanced citation formatting
  • OSCAL integration (SSP generation, profile validation)
  • Performance optimizations
  • UI/UX (Streamlit, Gradio)

๐Ÿ“ License

MIT License - see LICENSE for details.

๐Ÿ™ Acknowledgments

  • NIST for publishing open cybersecurity standards
  • LangChain for the agent framework
  • OpenAI for GPT models and embeddings

📮 Contact

Built by a federal cybersecurity professional working with AI-assisted development.


Note: This tool provides information retrieval only. Always verify compliance requirements with official NIST publications and your organization's policies.
