🧬 Helix: Temporal GraphRAG

LightRAG + Graphiti = Temporal Knowledge Graphs for RAG


🎯 What is Helix?

Helix combines LightRAG's dual-level retrieval with Graphiti's bi-temporal knowledge graph in a single RAG system that provides:

| Feature | Capability |
|---|---|
| Temporal Awareness | Point-in-time queries, automatic edge invalidation |
| Multi-Hop Reasoning | BFS-based path exploration with scoring |
| Hallucination Detection | Composite Fidelity Index (CFI) verification |
| Incremental Updates | No full graph rebuild required |
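
To make "point-in-time queries" and "automatic edge invalidation" concrete, here is a minimal, self-contained sketch. It does not use Helix or Graphiti. The `Edge` class and the `invalidate_and_add` and `edges_valid_at` helpers are hypothetical names chosen for illustration, and the sketch assumes facts arrive in chronological order. A full bi-temporal store would also record when each fact was ingested, which is omitted here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Edge:
    """A fact with a validity interval (hypothetical illustration type)."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still valid"


def invalidate_and_add(edges: List[Edge], new: Edge) -> None:
    """Close any open edge with the same subject and relation, then append.

    Only the superseded edge is touched, so nothing else is rebuilt.
    Assumes `new` starts later than the edges already stored.
    """
    for edge in edges:
        same_slot = edge.subject == new.subject and edge.relation == new.relation
        if same_slot and edge.invalid_at is None:
            edge.invalid_at = new.valid_at
    edges.append(new)


def edges_valid_at(edges: List[Edge], when: datetime) -> List[Edge]:
    """Return the edges whose validity interval contains `when`."""
    return [
        e for e in edges
        if e.valid_at <= when and (e.invalid_at is None or when < e.invalid_at)
    ]


# Example facts; the dates are for illustration only.
edges: List[Edge] = []
invalidate_and_add(edges, Edge("Apple", "has_ceo", "Steve Jobs", datetime(1997, 9, 16)))
invalidate_and_add(edges, Edge("Apple", "has_ceo", "Tim Cook", datetime(2011, 8, 24)))

print([e.obj for e in edges_valid_at(edges, datetime(2015, 1, 1))])
```

Adding the second fact closes the first edge instead of deleting it, which is what lets a query for an earlier date still return the older answer.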

📊 Benchmark Targets

| Category | Datasets | Metrics | Target | Baseline |
|---|---|---|---|---|
| Temporal | Time-LongQA, ECT-QA, MultiTQ | Hit@1, Hit@5, Acc | 70-75% | 45-55% |
| Hallucination | Legal QA, Medical QA, FEVER | AUC, CFI | >0.95 | 0.84-0.94 |
| Multi-Hop | MuSiQue, 2WikiMHQA, HotpotQA | F1, EM | 70-75 | 54-59 |
| Scalability | UltraDomain (all) | Tokens, Latency | <600K | 14M |
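
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute Hit@k, exact match (EM), and token-level F1 for a single question. It is an illustration rather than the evaluation code Helix ships. The function names are hypothetical, and the normalization (lowercasing, dropping punctuation and English articles) is an assumption in the spirit of SQuAD-style scoring. Individual benchmarks may normalize differently.

```python
import re
from collections import Counter
from typing import List, Sequence


def normalize(text: str) -> List[str]:
    """Lowercase, replace punctuation with spaces, drop articles, tokenize."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in {"a", "an", "the"}]


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized prediction equals the normalized gold answer."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall against the gold answer."""
    pred, ref = normalize(prediction), normalize(gold)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def hit_at_k(ranked: Sequence[str], gold: str, k: int) -> float:
    """1.0 if any of the top-k ranked candidates matches the gold answer."""
    gold_norm = normalize(gold)
    return float(any(normalize(c) == gold_norm for c in ranked[:k]))


print(exact_match("The 23 June 1912.", "23 june 1912"))
print(round(token_f1("Alan Turing was born in 1912", "1912"), 3))
print(hit_at_k(["1911", "1912"], "1912", k=1), hit_at_k(["1911", "1912"], "1912", k=5))
```

Dataset-level scores are the mean of these per-question values.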

📦 Installation

From PyPI

pip install helix-rag

From Source (Development)

git clone https://github.com/YashNuhash/Helix.git
cd Helix

# Install with Helix dependencies
pip install -e ".[helix]"

Dependencies

Helix requires:

  • Neo4j (for Graphiti Knowledge Graph)
  • Supabase (optional, for vector storage)
  • LLM API (any provider - configured via environment)

βš™οΈ Configuration

Copy .env.example to .env and configure:

cp .env.example .env

Required Environment Variables

# Neo4j Configuration (for Graphiti)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password

# LLM Configuration (model-agnostic)
LLM_MODEL_NAME=your_model_name
LLM_API_KEY=your_api_key

# Supabase (optional)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_key
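
Before starting a long ingestion run, it can help to fail fast when a variable is missing. The sketch below is one way to do that with the standard library only. `load_settings` is a hypothetical helper, not part of Helix, and it assumes the Neo4j and LLM variables above are mandatory while the Supabase pair is optional.

```python
import os
from typing import Dict, Mapping

REQUIRED = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "LLM_MODEL_NAME",
    "LLM_API_KEY",
)
OPTIONAL = ("SUPABASE_URL", "SUPABASE_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect known settings, raising if a required one is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED}
    # Optional values are included only when they are actually set.
    settings.update({name: env[name] for name in OPTIONAL if env.get(name)})
    return settings


# Demonstration with an explicit mapping instead of the real environment.
example_env = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "your_password",
    "LLM_MODEL_NAME": "your_model_name",
    "LLM_API_KEY": "your_api_key",
}
print(sorted(load_settings(example_env)))
```

Calling `load_settings()` with no argument reads the process environment, so it works after the `.env` file has been loaded by whatever mechanism you use.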

Supabase Setup (Optional)

Run scripts/supabase_schema.sql in your Supabase SQL Editor to create the vector storage table.


🚀 Quick Start

Basic Usage

import asyncio
from helix import Helix

async def main():
    # Initialize Helix
    async with Helix() as helix:
        # Insert document with temporal tracking
        result = await helix.insert(
            "Alan Turing was born on June 23, 1912. "
            "He is considered the father of computer science.",
            source_description="Wikipedia"
        )
        print(f"Extracted {result['entities_extracted']} entities")
        
        # Query with temporal awareness
        answer = await helix.query(
            "When was Alan Turing born?",
            mode="hybrid"
        )
        print(answer["answer"])

asyncio.run(main())

Temporal Queries

import asyncio
from datetime import datetime
from helix import Helix
from helix.utils import is_temporal_query, extract_temporal_params

async def temporal_example():
    async with Helix() as helix:
        # Detect temporal intent
        query = "Who was the CEO of Apple in 2015?"
        
        if is_temporal_query(query):
            params = extract_temporal_params(query)
            print(f"Temporal query detected: {params.temporal_keywords}")
        
        # Query with point-in-time context
        result = await helix.query(
            query,
            valid_at=datetime(2015, 1, 1),
            include_temporal_context=True
        )
        print(result)

asyncio.run(temporal_example())
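
This README does not describe how `is_temporal_query` and `extract_temporal_params` work internally. As a rough idea of the kind of heuristic such helpers could use, the standalone sketch below flags a query as temporal when it contains a four-digit year or a time-related keyword, and turns the first year into a `datetime` suitable for a `valid_at` argument. The names `find_years`, `looks_temporal`, and `point_in_time`, and the keyword list, are assumptions made for illustration.

```python
import re
from datetime import datetime
from typing import List, Optional

# Four-digit years from 1000 to 2099.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
# Illustrative keyword list; a real system would need a richer one.
KEYWORDS = ("when", "before", "after", "during", "since", "until", "as of")


def find_years(query: str) -> List[int]:
    """Return every four-digit year mentioned in the query, in order."""
    return [int(match) for match in YEAR_RE.findall(query)]


def looks_temporal(query: str) -> bool:
    """True if the query mentions a year or a time-related keyword."""
    lowered = query.lower()
    has_keyword = any(
        re.search(rf"\b{re.escape(word)}\b", lowered) for word in KEYWORDS
    )
    return has_keyword or bool(find_years(query))


def point_in_time(query: str) -> Optional[datetime]:
    """Map the first year in the query to January 1 of that year."""
    years = find_years(query)
    return datetime(years[0], 1, 1) if years else None


question = "Who was the CEO of Apple in 2015?"
print(looks_temporal(question), point_in_time(question))
```

A keyword heuristic like this misses relative expressions such as "last year", so it is best read as a starting point.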

Hallucination Detection

import asyncio
from helix import Helix
from helix.hallucination import HallucinationDetector

async def verify_response():
    async with Helix() as helix:
        detector = HallucinationDetector(graphiti=helix.graphiti)
        
        # Get response
        result = await helix.query("Tell me about Alan Turing")
        
        # Verify against knowledge graph
        verification = await detector.verify_response(
            response=result["answer"],
            query="Tell me about Alan Turing",
            context=result.get("temporal_context")
        )
        
        print(f"Grounded: {verification.is_grounded}")
        print(f"CFI Score: {verification.confidence_score:.2f}")
        print(f"Entity Coverage: {verification.entity_coverage:.2%}")

asyncio.run(verify_response())
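
The `entity_coverage` field printed above suggests a ratio of response entities that the knowledge graph can account for. The sketch below shows one plausible reading of such a ratio as plain set overlap. It is an assumption, not Helix's implementation: this README does not define how the CFI or entity coverage is computed, entity extraction is taken as already done, and the 0.8 threshold in `is_grounded` is arbitrary.

```python
from typing import Iterable, Set


def entity_coverage(
    response_entities: Iterable[str], graph_entities: Iterable[str]
) -> float:
    """Fraction of entities mentioned in the response that exist in the graph."""
    mentioned: Set[str] = {
        e.strip().lower() for e in response_entities if e.strip()
    }
    if not mentioned:
        return 1.0  # nothing was claimed, so nothing is unsupported
    known = {e.strip().lower() for e in graph_entities}
    return len(mentioned & known) / len(mentioned)


def is_grounded(coverage: float, threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: accept when coverage meets the threshold."""
    return coverage >= threshold


coverage = entity_coverage(
    ["Alan Turing", "Enigma", "Bletchley Park"],  # entities in the answer
    ["alan turing", "Enigma"],                    # entities in the graph
)
print(f"{coverage:.2%}", is_grounded(coverage))
```

Matching on lowercased strings is deliberately naive. Aliases such as "Turing" versus "Alan Turing" would need entity resolution before comparison.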

Multi-Hop Reasoning

import asyncio
from helix import Helix
from helix.multihop import MultiHopRetriever

async def multihop_example():
    async with Helix() as helix:
        retriever = MultiHopRetriever(graphiti=helix.graphiti)
        
        # Find reasoning paths
        paths = await retriever.find_paths(
            query="How is Alan Turing connected to modern AI?",
            max_hops=3
        )
        
        # Format as context
        context = retriever.format_paths_as_context(paths)
        print(context)

asyncio.run(multihop_example())
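
To illustrate what BFS-based path exploration with a hop limit and scoring can look like, here is a standalone sketch over an in-memory adjacency list. It is not the `MultiHopRetriever` implementation. The graph contents, the edge weights, and the choice to score a path by the product of its edge weights are all assumptions made for the example.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Adjacency list: node -> list of (relation, neighbor, edge weight in (0, 1]).
Graph = Dict[str, List[Tuple[str, str, float]]]
Path = List[Tuple[str, str, str]]  # (source, relation, target) triples


def find_paths(
    graph: Graph, start: str, goal: str, max_hops: int = 3
) -> List[Tuple[float, Path]]:
    """Breadth-first search for cycle-free paths of at most `max_hops` edges.

    A path's score is the product of its edge weights, so with weights at
    most 1 longer paths never score higher than their own prefixes.
    """
    results: List[Tuple[float, Path]] = []
    initial_seen: Set[str] = {start}
    queue = deque([(start, [], 1.0, initial_seen)])
    while queue:
        node, path, score, seen = queue.popleft()
        if node == goal and path:
            results.append((score, path))
            continue
        if len(path) == max_hops:
            continue  # hop budget exhausted
        for relation, neighbor, weight in graph.get(node, []):
            if neighbor not in seen:  # avoid revisiting nodes on this path
                step = (node, relation, neighbor)
                queue.append(
                    (neighbor, path + [step], score * weight, seen | {neighbor})
                )
    return sorted(results, key=lambda item: item[0], reverse=True)


# Toy graph with made-up weights.
graph: Graph = {
    "Alan Turing": [
        ("proposed", "Turing Test", 0.9),
        ("designed", "Turing Machine", 0.9),
    ],
    "Turing Test": [("influenced", "Modern AI", 0.8)],
    "Turing Machine": [("underlies", "Computability Theory", 0.9)],
    "Computability Theory": [("informs", "Modern AI", 0.5)],
}

for score, path in find_paths(graph, "Alan Turing", "Modern AI", max_hops=3):
    hops = " -> ".join(f"{s} [{r}] {t}" for s, r, t in path)
    print(f"{score:.3f}  {hops}")
```

Lowering `max_hops` to 2 drops the three-edge path through "Computability Theory", which shows how the hop limit bounds the search.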

📈 Evaluation

Running Benchmarks

Helix includes evaluation scripts for academic benchmarks. Use these in Google Colab or Kaggle:

# Install Helix
!pip install helix-rag

# Run temporal benchmark
from helix.eval import TemporalBenchmark

benchmark = TemporalBenchmark(dataset="time-longqa")
# Top-level await works in notebook cells; in a plain script, wrap the call in asyncio.run()
results = await benchmark.run()
print(f"Hit@1: {results['hit_at_1']:.2%}")

Supported Benchmarks

| Benchmark | Dataset | Command |
|---|---|---|
| Temporal | Time-LongQA | helix eval --dataset time-longqa |
| Temporal | ECT-QA | helix eval --dataset ect-qa |
| Multi-Hop | MuSiQue | helix eval --dataset musique |
| Multi-Hop | HotpotQA | helix eval --dataset hotpotqa |
| Hallucination | FEVER | helix eval --dataset fever |
| Scalability | UltraDomain | helix eval --dataset ultradomain |

Colab/Kaggle Notebook

# Quick evaluation notebook
import os
os.environ["LLM_API_KEY"] = "your_key"
os.environ["LLM_MODEL_NAME"] = "your_model"
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_PASSWORD"] = "password"

from helix import Helix
from helix.eval import run_all_benchmarks

# Run all benchmarks
results = await run_all_benchmarks()
print(results.to_dataframe())

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         Helix                                β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚   LightRAG  β”‚  β”‚   Graphiti   β”‚  β”‚  Helix Modules    β”‚   β”‚
β”‚  β”‚  (Retrieval)β”‚  β”‚ (Temporal KG)β”‚  β”‚                   β”‚   β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€   β”‚
β”‚  β”‚ - Chunking  β”‚  β”‚ - Episodes   β”‚  β”‚ - TemporalHandler β”‚   β”‚
β”‚  β”‚ - Embedding β”‚  β”‚ - Bi-temporalβ”‚  β”‚ - Hallucination   β”‚   β”‚
β”‚  β”‚ - Vector DB β”‚  β”‚ - Resolution β”‚  β”‚ - MultiHop        β”‚   β”‚
β”‚  β”‚ - Dual-levelβ”‚  β”‚ - Invalidate β”‚  β”‚ - CFI Scoring     β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚         β”‚                β”‚                    β”‚              β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚                          β–Ό                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  β”‚                    Storage Layer                     β”‚    β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€    β”‚
β”‚  β”‚  Neo4j (Graph)  β”‚  Supabase (Vector)  β”‚  Local KV   β”‚    β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“ Project Structure

helix/
β”œβ”€β”€ __init__.py           # Package entry (v0.1.1)
β”œβ”€β”€ core/
β”‚   └── helix.py          # Main Helix class
β”œβ”€β”€ storage/
β”‚   β”œβ”€β”€ graphiti_impl.py  # GraphitiGraphStorage
β”‚   └── supabase_impl.py  # SupabaseVectorStorage
β”œβ”€β”€ temporal/
β”‚   └── query_handler.py  # TemporalQueryHandler
β”œβ”€β”€ hallucination/
β”‚   └── detector.py       # HallucinationDetector (CFI)
β”œβ”€β”€ multihop/
β”‚   └── retriever.py      # MultiHopRetriever (BFS)
└── utils/
    └── temporal_utils.py # Temporal parsing

🔬 Research Goals

Helix aims for state-of-the-art results in four areas. The figures below are targets, not measured results:

  1. Temporal GraphRAG: 70-75% accuracy on temporal QA benchmarks
  2. Hallucination Detection: AUC >0.95 using graph-aligned verification
  3. Multi-Hop Reasoning: F1 70-75 on complex reasoning benchmarks
  4. Scalability: <600K tokens for indexing (vs 14M baseline)

See PLAN.md for detailed research methodology.


📚 Citation

If you use Helix in your research, please cite:

@software{helix2024,
  title = {Helix: Temporal GraphRAG with LightRAG and Graphiti},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/YashNuhash/Helix}
}

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


📄 License

MIT License - see LICENSE for details.


Built with 🧬 Helix

LightRAG + Graphiti = Temporal GraphRAG
