Transform unstructured stories into structured, queryable Knowledge Graphs using cutting-edge NLP.
Story2KG is an end-to-end Natural Language Processing (NLP) pipeline that transforms raw narrative text into structured Knowledge Graphs (KGs).
Unlike prior work that targets individual subtasks (NER, SRL, summarization), Story2KG provides a unified framework for:
- 🧹 Preprocessing & Coreference Resolution
- 📚 Scene Segmentation
- 🔎 Entity, Attribute, and Event Extraction
- 😃 Emotion Classification
- ✍️ Abstractive Summarization (BART/T5)
- 🌐 Knowledge Graph Construction in Neo4j
The result is an interactive, queryable graph that captures characters, events, emotions, and relationships from stories.
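One way to picture that graph is as a set of character-to-event edges written to Neo4j with `MERGE`. The sketch below is illustrative only: the `Triple` class and `to_cypher` helper are hypothetical names, not part of Story2KG, and the `Character`/`Event` labels with `name`/`description` properties are assumed from the example query in this README.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Triple:
    """One (character)-[relation]->(event) edge. Hypothetical schema."""
    character: str
    relation: str
    event: str


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized MERGE statement for a single triple.

    Cypher does not accept relationship types as parameters, so the
    relation name is validated and interpolated into the query text;
    the node property values are passed as parameters.
    """
    rel = triple.relation.strip().upper().replace(" ", "_")
    if not rel.replace("_", "").isalnum() or not rel[0].isalpha():
        raise ValueError(f"unsafe relationship type: {triple.relation!r}")
    query = (
        "MERGE (c:Character {name: $name}) "
        "MERGE (e:Event {description: $description}) "
        f"MERGE (c)-[:{rel}]->(e)"
    )
    return query, {"name": triple.character, "description": triple.event}


query, params = to_cypher(Triple("Hare", "laughed at", "mocks the tortoise"))
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` keeps repeated runs from duplicating nodes when the same character or event appears in several scenes.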
✅ End-to-End Pipeline – From raw text to KG.
✅ Narrative-Aware Design – Handles stories, characters, events, and temporal flow.
✅ Hybrid NLP Models – spaCy, Hugging Face Transformers, AllenNLP.
✅ Graph Visualization – Neo4j integration for querying and visualization.
✅ Modular Architecture – Swap models (e.g., BART → T5).
✅ Applications – Education, Digital Humanities, Explainable AI, Story Analytics.
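The model swapping mentioned under Modular Architecture could be organized as a small registry keyed by model name. This is a minimal sketch, not Story2KG's actual mechanism: `register`, `summarize`, and both toy backends are hypothetical, and the backends are plain-Python stand-ins where a real setup would wrap BART or T5.

```python
from typing import Callable, Dict

Summarizer = Callable[[str], str]
_REGISTRY: Dict[str, Summarizer] = {}


def register(name: str) -> Callable[[Summarizer], Summarizer]:
    """Decorator that files a summarizer under a model name."""
    def decorator(fn: Summarizer) -> Summarizer:
        _REGISTRY[name] = fn
        return fn
    return decorator


@register("first-sentence")  # stand-in for one model backend
def first_sentence(text: str) -> str:
    return text.split(". ")[0].rstrip(".") + "."


@register("lead-words")  # stand-in for a second, interchangeable backend
def lead_words(text: str, limit: int = 5) -> str:
    return " ".join(text.split()[:limit])


def summarize(text: str, model: str = "first-sentence") -> str:
    """Dispatch to whichever backend is registered under `model`."""
    try:
        backend = _REGISTRY[model]
    except KeyError:
        raise ValueError(
            f"unknown summarizer {model!r}; choose from {sorted(_REGISTRY)}"
        ) from None
    return backend(text)


story = "The hare slept. The tortoise won."
print(summarize(story))
print(summarize(story, model="lead-words"))
```

Callers only depend on the `str -> str` signature, so replacing one backend with another is a one-argument change.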

    graph TD
        A[Raw Story Text] --> B[Preprocessing & Coreference Resolution]
        B --> C[Scene Segmentation]
        C --> D[Deep Context Extraction: Entities, Attributes, Events, Emotions]
        D --> E[Abstractive Summarization]
        E --> F[Neo4j Knowledge Graph Construction]

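The flow in the diagram can be read as a chain of stages that each take the running state and return an extended copy. The sketch below uses toy stand-ins for every stage (no coreference resolution, a fixed `CAST` set instead of NER, first-sentence "summaries"); all function names and the `State` layout are assumptions made for illustration, not the project's API.

```python
import re
from functools import reduce
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]

CAST = {"hare", "tortoise"}  # placeholder for a real entity recognizer


def preprocess(state: State) -> State:
    # Trim whitespace per line; real preprocessing would also resolve coreference.
    lines = [line.strip() for line in state["text"].strip().splitlines()]
    return {**state, "text": "\n".join(lines)}


def segment_scenes(state: State) -> State:
    # Treat blank lines as scene boundaries.
    scenes = [s for s in re.split(r"\n{2,}", state["text"]) if s]
    return {**state, "scenes": scenes}


def extract_context(state: State) -> State:
    # Gazetteer lookup: keep words that match a known cast member.
    entities = [
        sorted(set(re.findall(r"[a-z]+", scene.lower())) & CAST)
        for scene in state["scenes"]
    ]
    return {**state, "entities": entities}


def summarize(state: State) -> State:
    # First sentence of each scene stands in for an abstractive summary.
    return {**state, "summaries": [s.split(". ")[0] for s in state["scenes"]]}


def build_graph(state: State) -> State:
    # One APPEARS_IN edge per entity per scene.
    edges = [
        (name, "APPEARS_IN", f"scene-{i}")
        for i, names in enumerate(state["entities"])
        for name in names
    ]
    return {**state, "edges": edges}


STAGES: List[Stage] = [preprocess, segment_scenes, extract_context, summarize, build_graph]


def run(text: str) -> State:
    """Thread the state through every stage in order."""
    return reduce(lambda state, stage: stage(state), STAGES, {"text": text})


result = run("Once, a hare laughed at a tortoise.\n\nThe tortoise won the race.")
print(result["edges"])
```

Because each stage only reads keys added by earlier stages, a stage can be replaced or reordered without touching the others.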

    Story2KG/
    │── notebooks          # Jupyter notebooks for experimentation
    │── Architectures      # Adaptive + Hierarchical
    │── requirements.txt   # Dependencies
    │── License            # MIT License
    │── README.md          # Project Documentation


Clone the repository, create a virtual environment, and install the dependencies:

    git clone https://github.com/yourusername/Story2KG.git
    cd Story2KG

    python -m venv venv
    source venv/bin/activate    # Linux/Mac
    venv\Scripts\activate       # Windows

    pip install -r requirements.txt

- Install Neo4j Desktop or run a Docker container (Neo4j 5 and later require a password of at least 8 characters):

      docker run -d --name neo4j -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=neo4j/testpassword neo4j:latest

Run the pipeline on a story and export the result:

    from story2kg import pipeline

    story_text = """Once, a hare laughed at a tortoise for being slow..."""
    kg = pipeline.run(story_text)

    # Export to Neo4j
    kg.export_to_neo4j(uri="bolt://localhost:7687", user="neo4j", password="testpassword")

Query the graph in Cypher:

    MATCH (c:Character)-[r]->(e:Event)
    RETURN c.name, type(r), e.description;

| Component | Precision | Recall | F1-Score |
|---|---|---|---|
| Entity Recognition | 89.0% | 91.2% | 90.1% |
| Attribute Detection | 85.4% | 83.7% | 84.5% |
| Emotion Classification | 80.2% | 78.9% | 79.5% |
| Knowledge Graph Completeness | – | – | 92.5% |
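As a quick consistency check, F1 is the harmonic mean of precision and recall, so the first three rows can be recomputed from their own columns. The snippet below does that with the values from the table; the Knowledge Graph Completeness row is left out because it reports no precision or recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the table above.
reported = {
    "Entity Recognition": (89.0, 91.2, 90.1),
    "Attribute Detection": (85.4, 83.7, 84.5),
    "Emotion Classification": (80.2, 78.9, 79.5),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed {f1_score(p, r):.1f}, reported {f1}")
```

All three recomputed scores agree with the reported F1 values to one decimal place.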
- Expansion to larger story corpora (novels, folklore).
- Integration with LLMs (GPT, LLaMA) for improved reasoning.
- Support for Temporal & Causal KGs.
- Advanced narrative coherence evaluation metrics.
📘 Education – Transform textbooks/fables into interactive graphs.
📖 Digital Humanities – Analyze cultural narratives & folklore.
🤖 Explainable AI – Human-readable narrative reasoning.
📝 Story Analytics – Character profiling & plot analysis.
- Dr. Mohan Allam – Advisor
- Dasara Rajiv Kumar – Co-Author
- Rohit Mukkala – Co-Author
- Sujal Ghonmode – Co-Author
This project is licensed under the MIT License. See LICENSE for details.