Conversational AI assistant for NIST cybersecurity standards and OSCAL compliance, powered by Retrieval-Augmented Generation (RAG)
A production-ready AI agent that answers questions about NIST cybersecurity frameworks (SP 800-53, 800-37, 800-171, etc.) using:
- RAG (Retrieval-Augmented Generation) - Answers are grounded in text retrieved from the NIST documents themselves, which reduces hallucination
- LangChain - Multi-tool agent with chat history
- FAISS - Vector similarity search over 10+ NIST publications
- OpenAI/Azure OpenAI - GPT-4 for intelligent responses
Perfect for security assessors, compliance professionals, and anyone working with NIST standards.
- Pre-indexed NIST Documents: 10+ publications ready to query
  - NIST SP 800-53 Rev 5 (Security Controls)
  - NIST SP 800-37 Rev 2 (Risk Management Framework)
  - NIST SP 800-171 Rev 3 (CUI Protection)
  - NIST SP 800-60, 800-63, 800-30, and more
- Intelligent Tool Selection: RAG → Control lookup → Web search fallback
- Session-based Chat History: Contextual conversations per user
- Citation: Always includes Control ID, Title, URL, Section
- FastAPI Service: REST API ready for integration
- Docker Ready: Containerized deployment
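The tool ordering listed above (RAG, then control lookup, then web search) can be pictured as a simple fallback chain. The sketch below is a deterministic simplification, not the project's implementation: in a LangChain agent the model itself usually decides which tool to call. The names `answer_with_fallback`, `rag_search`, `control_lookup`, and `web_search` are hypothetical stand-ins.

```python
from typing import Callable, Optional

# Hypothetical tool signature: return an answer, or None when the tool has nothing useful.
Tool = Callable[[str], Optional[str]]


def answer_with_fallback(question: str, tools: list[tuple[str, Tool]]) -> tuple[str, str]:
    """Try each tool in order; return (tool_name, answer) from the first one that responds."""
    for name, tool in tools:
        result = tool(question)
        if result:  # None or an empty string means "fall through to the next tool"
            return name, result
    return "none", "No answer found."


# Stand-in tools for illustration only (assumptions, not the real tools in tools/).
def rag_search(question: str) -> Optional[str]:
    # A real version would query the vector index; here we match one phrase.
    if "access control" in question.lower():
        return "Retrieved passage about access control."
    return None


def control_lookup(question: str) -> Optional[str]:
    # A real version would look the control ID up in a catalog.
    return "AC-1: Policy and Procedures" if "AC-1" in question else None


def web_search(question: str) -> Optional[str]:
    # Last resort: always returns something.
    return f"Web results for: {question}"


TOOLS = [("rag", rag_search), ("control_lookup", control_lookup), ("web_search", web_search)]

if __name__ == "__main__":
    print(answer_with_fallback("Explain AC-1", TOOLS))
```

Keeping the order in a plain list makes the priority explicit and easy to change, at the cost of the flexibility an LLM-driven tool choice would give.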
python >= 3.10
openai >= 1.0
langchain >= 0.1

# Clone the repository
git clone https://github.com/yourusername/nist-rag-agent.git
cd nist-rag-agent
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your OpenAI API key

from agent import NistRagAgent
# Initialize the agent
agent = NistRagAgent()
# Ask a question
response = agent.query(
    question="What does NIST say about access control?",
    session_id="user123"
)
print(response["answer"])
# Includes citations: Control ID, Title, URL

# Start the FastAPI server
python api_service.py
# Query via REST
curl -X POST http://localhost:8000/query \
    -H "Content-Type: application/json" \
    -d '{"question": "Explain AC-1", "session_id": "user123"}'

nist-rag-agent/
├── agent.py              # Core RAG agent implementation
├── api_service.py        # FastAPI REST service
├── embeddings/           # Pre-built NIST document embeddings
│   ├── NIST.SP.800-53r5.chunks.json
│   ├── NIST.SP.800-53r5.chunks.npy
│   └── ... (10+ documents)
├── tools/                # Custom LangChain tools
│   ├── nist_lookup.py
│   ├── control_detail.py
│   └── web_search.py
├── examples/             # Usage examples
│   ├── basic_query.py
│   ├── batch_analysis.py
│   └── session_demo.py
├── tests/                # Test suite
├── requirements.txt
├── .env.example
├── Dockerfile
└── README.md
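Each publication in `embeddings/` is stored as a chunk file paired with a vector file. As a rough illustration of how retrieval over such a pair could work, the sketch below ranks chunks by cosine similarity to a query vector. It is a pure-Python stand-in for what FAISS does at scale, not the project's code; the chunk fields (`control_id`, `title`) and the function names are assumptions, and the three-dimensional vectors are toy data.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], chunks: list[dict],
                 vectors: list[list[float]], k: int = 3) -> list[tuple[float, dict]]:
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk)
              for chunk, vec in zip(chunks, vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: hypothetical chunk records and made-up 3-dimensional embeddings.
CHUNKS = [
    {"control_id": "AC-1", "title": "Policy and Procedures"},
    {"control_id": "IR-4", "title": "Incident Handling"},
    {"control_id": "AC-2", "title": "Account Management"},
]
VECTORS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]

if __name__ == "__main__":
    for score, chunk in top_k_chunks([1.0, 0.0, 0.0], CHUNKS, VECTORS, k=2):
        print(f"{score:.3f}  {chunk['control_id']}  {chunk['title']}")
```

A brute-force scan like this is fine for a few thousand chunks; an index such as FAISS becomes worthwhile when the collection grows or latency matters.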
Add your own NIST documents:
from tools.embedding_builder import build_embeddings
# Build embeddings from PDF
build_embeddings(
    pdf_path="NIST.SP.800-XX.pdf",
    output_dir="embeddings/"
)

# User Alice asks about access control
agent.query("What is AC-1?", session_id="alice")
# Later, Alice asks a follow-up
agent.query("What are the requirements?", session_id="alice")
# Agent remembers we're talking about AC-1
# User Bob has a separate conversation
agent.query("What is IR-4?", session_id="bob")

# Add your own tools
from langchain_core.tools import tool

@tool("custom_tool")
def my_custom_tool(query: str) -> str:
    """Your custom NIST-related functionality"""
    return "Custom response"

agent = NistRagAgent(extra_tools=[my_custom_tool])

# Build the image
docker build -t nist-rag-agent .
# Run the container
docker run -p 8000:8000 \
    -e OPENAI_API_KEY=your_key \
    nist-rag-agent
# Or use docker-compose
docker-compose up -d

| Document | Description | Chunks |
|---|---|---|
| SP 800-53 Rev 5 | Security and Privacy Controls | ~2,500 |
| SP 800-37 Rev 2 | Risk Management Framework | ~800 |
| SP 800-171 Rev 3 | CUI Protection | ~600 |
| SP 800-60 Vol 2 Rev 1 | Information Types | ~1,200 |
| SP 800-63-3 | Digital Identity | ~900 |
| SP 800-30 Rev 1 | Risk Assessment | ~700 |
| SP 800-137 | Continuous Monitoring | ~400 |
| SP 800-18 Rev 1 | Security Plans | ~300 |
| CSWP 29 | AI Risk Management | ~500 |
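The chunk counts in the table depend on how each publication is split before embedding. The sketch below shows one common approach, fixed-size chunks with a small overlap so that sentences cut at a boundary still appear whole in one chunk. It is an assumption about the general technique, not a description of how this project builds its embeddings: it treats the `CHUNK_SIZE` setting as a character count, and the `overlap` parameter and `chunk_text` name are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters. The final chunk may be shorter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, so no chunk is a pure repeat of the tail.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    sample = "x" * 2500
    print([len(c) for c in chunk_text(sample)])
```

Smaller chunks give more precise citations but more vectors to store and search; larger chunks carry more context per hit.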
# Run all tests
pytest
# Run with coverage
pytest --cov=. --cov-report=html
# Test specific functionality
pytest tests/test_agent.py::test_access_control_query

Edit .env to customize:
# OpenAI Configuration
OPENAI_API_KEY=sk-...
# Model name; alternatives include gpt-4 and gpt-3.5-turbo
OPENAI_MODEL=gpt-4o
# Azure OpenAI (alternative)
AZURE_OPENAI_ENDPOINT=https://...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_DEPLOYMENT=...
# LangChain (optional)
LANGCHAIN_API_KEY=...
LANGCHAIN_TRACING_V2=true
# RAG Configuration
TOP_K_RESULTS=3
CHUNK_SIZE=1000
EMBEDDING_MODEL=text-embedding-ada-002

Contributions welcome! Areas of interest:
- Additional NIST publications (800-137A, 800-161, etc.)
- Enhanced citation formatting
- OSCAL integration (SSP generation, profile validation)
- Performance optimizations
- UI/UX (Streamlit, Gradio)
MIT License - see LICENSE for details.
- NIST for publishing open cybersecurity standards
- LangChain for the agent framework
- OpenAI for GPT models and embeddings
Built by a federal cybersecurity professional working with AI-assisted development.
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions or share ideas
Note: This tool provides information retrieval only. Always verify compliance requirements with official NIST publications and your organization's policies.