Deepu8405/whatsapp-chat-analyzer

💬 WhatsApp Analyzer Pro

A data analytics and AI application that turns raw WhatsApp chat exports into activity statistics, sentiment trends, and an LLM-powered chat agent. It is built with Python, Streamlit, VADER sentiment analysis, and Google Gemini.



🎯 Project Objective

WhatsApp Analyzer Pro is a portfolio-grade data science project built to demonstrate advanced analytical and AI skills through real-world WhatsApp chat data.

Key Goals:

  • Parse and process unstructured WhatsApp text data into structured insights
  • Apply NLP techniques for sentiment scoring and mood tracking
  • Build an LLM-powered agent that learns from real chat patterns
  • Present findings through an interactive, visually appealing dashboard
  • Deploy end-to-end using Docker, Kubernetes, and a Jenkins CI/CD pipeline

✨ Features

📊 Analytics Dashboard

  • Total messages, words, media and links overview
  • Monthly and daily message timelines
  • Activity heatmap (hour vs day of week)
  • Most active users with contribution percentages
  • Word cloud and top 20 most common words
  • Emoji frequency and distribution analysis
  • Response time analysis per user
  • Conversation starter identification

🧠 Sentiment Analysis

  • Per-message VADER sentiment scoring
  • Overall mood score out of 10
  • Monthly sentiment timeline (positive/negative trends)
  • Sentiment heatmap by hour and day
  • Per-user sentiment comparison
  • Toxic and aggressive message detection
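
The README does not say how the mood score out of 10 is derived. One plausible approach, shown here as an assumption only, is to rescale the mean VADER compound score (which ranges from -1 to 1) linearly onto 0 to 10. The function names `mood_score` and `label` are hypothetical. The 0.05 cutoff is the conventional VADER threshold for positive and negative labels.

```python
def mood_score(compound_scores):
    """Map the mean compound score (-1..1) linearly onto a 0..10 scale.

    Assumption: this linear rescaling is illustrative, not the project's formula.
    """
    if not compound_scores:
        return 5.0  # treat an empty chat as neutral
    mean = sum(compound_scores) / len(compound_scores)
    return round((mean + 1) * 5, 1)


def label(compound, threshold=0.05):
    """Classify one compound score using the conventional VADER cutoffs."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical per-message compound scores.
scores = [1.0, 0.0]
print(mood_score(scores))          # mean 0.5 rescaled to 7.5
print([label(s) for s in scores])  # ['positive', 'neutral']
```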

🤖 AI Chat Agent

  • Dynamic personality extraction from real chat data
  • Two-role conversation setup (you vs AI clone)
  • Hinglish-aware LLM responses via Google Gemini
  • Chat memory for multi-turn conversations
  • RPM and RPD usage counter with warnings
  • Model selector with rate limit display

๐Ÿ› ๏ธ Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Streamlit, Custom CSS, Plotly |
| Backend | Python 3.14, Pandas, NLTK |
| NLP | VADER Sentiment, WordCloud |
| AI/LLM | Google Gemini API (google-genai) |
| Containerization | Docker |
| Orchestration | Kubernetes (K8s) |
| CI/CD | Jenkins |
| Registry | Docker Hub |

๐Ÿ—๏ธ Project Architecture

whatsapp-analyzer/
โ”‚
โ”œโ”€โ”€ app.py                          # Main Streamlit entry point
โ”‚
โ”œโ”€โ”€ config/
โ”‚   โ””โ”€โ”€ settings.py                 # App-wide constants & config
โ”‚
โ”œโ”€โ”€ core/                           # OOP Business Logic
โ”‚   โ”œโ”€โ”€ base.py                     # Master Base/Super class
โ”‚   โ”œโ”€โ”€ preprocessor.py             # WhatsApp chat parser
โ”‚   โ”œโ”€โ”€ analyzer.py                 # Analytics computations
โ”‚   โ”œโ”€โ”€ sentiment.py                # Sentiment analysis
โ”‚   โ””โ”€โ”€ agent.py                    # AI Chat Agent
โ”‚
โ”œโ”€โ”€ ui/                             # UI Components
โ”‚   โ”œโ”€โ”€ sidebar.py                  # Sidebar component
โ”‚   โ”œโ”€โ”€ dashboard.py                # Analytics dashboard
โ”‚   โ”œโ”€โ”€ sentiment_ui.py             # Sentiment UI
โ”‚   โ””โ”€โ”€ agent_ui.py                 # Agent chat UI
โ”‚
โ”œโ”€โ”€ pages/                          # Streamlit multipage
โ”‚   โ”œโ”€โ”€ 1_How_To_Use.py             # How To Use guide
โ”‚   โ””โ”€โ”€ 2_Contact_Me.py             # Contact/Portfolio page
โ”‚
โ”œโ”€โ”€ assets/
โ”‚   โ””โ”€โ”€ style.css                   # Custom CSS styling
โ”‚
โ”œโ”€โ”€ data/
โ”‚   โ””โ”€โ”€ stop_hinglish.txt           # Hinglish stopwords
โ”‚
โ”œโ”€โ”€ tests/                          # Unit tests
โ”‚
โ”œโ”€โ”€ devops/                         # DevOps configuration
โ”‚   โ”œโ”€โ”€ Dockerfile                  # Docker build file
โ”‚   โ”œโ”€โ”€ docker-compose.yml          # Local Docker Compose
โ”‚   โ”œโ”€โ”€ Jenkinsfile                 # CI/CD Pipeline
โ”‚   โ””โ”€โ”€ k8s/
โ”‚       โ”œโ”€โ”€ deployment.yaml         # Kubernetes Deployment
โ”‚       โ””โ”€โ”€ service.yaml            # Kubernetes Service
โ”‚
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ .gitignore
โ””โ”€โ”€ README.md

🚀 Getting Started

Prerequisites

  • Python 3.14+
  • pip
  • Git
  • Docker (optional, for containerized deployment)

Local Development

1. Clone the repository:

git clone https://github.com/deepu84059/whatsapp-analyzer-pro.git
cd whatsapp-analyzer-pro

2. Create and activate virtual environment:

# Windows
python -m venv venv
venv\Scripts\Activate.ps1

# Linux/Mac
python3 -m venv venv
source venv/bin/activate

3. Install dependencies:

pip install -r requirements.txt

4. Run the app:

streamlit run app.py

5. Open in browser:

http://localhost:8501

📱 How To Export WhatsApp Chat

Android:

  1. Open WhatsApp chat
  2. Tap the three dots (⋮) → More → Export Chat
  3. Select Without Media
  4. Save the .txt file

iPhone:

  1. Open WhatsApp chat
  2. Tap the contact or group name → Export Chat
  3. Select Without Media
  4. Save the .txt file
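
The exported .txt file holds one message per line, with long messages continuing onto following lines. The exact timestamp layout depends on the phone's platform and locale. The sketch below assumes one common Android layout, `DD/MM/YYYY, HH:MM - Sender: message`, and is not the project's parser. The names `LINE_RE` and `parse_chat` are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: "31/12/2023, 21:15 - Alice: Hello". Other locales differ.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4}), (\d{1,2}:\d{2}) - (.*)$")


def parse_chat(text):
    """Split an exported chat into dicts with time, sender, and text."""
    messages = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            date_part, time_part, rest = match.groups()
            when = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M")
            sender, sep, body = rest.partition(": ")
            if not sep:
                # No "Sender: " prefix: treat as a system notice.
                # A notice that itself contains ": " would be misread as a message.
                sender, body = None, rest
            messages.append({"time": when, "sender": sender, "text": body})
        elif messages:
            # A line without a timestamp continues the previous message.
            messages[-1]["text"] += "\n" + line
    return messages


sample = (
    "31/12/2023, 21:15 - Alice: Happy new year!\n"
    "see you tomorrow\n"
    "31/12/2023, 21:16 - Bob: <Media omitted>\n"
    "31/12/2023, 21:17 - Alice added Carol\n"
)
for message in parse_chat(sample):
    print(message["time"], message["sender"], repr(message["text"]))
```

The resulting list of dicts can be loaded into a DataFrame for the analytics steps.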

🔑 Gemini API Key Setup

The AI Chat Agent requires a free Google Gemini API key:

  1. Go to aistudio.google.com
  2. Sign in with your Google account
  3. Click Get API Key → Create API Key
  4. Copy the key and paste it in the app sidebar

Free tier limits:

| Model | RPM | RPD |
| --- | --- | --- |
| gemini-2.5-flash-lite | 10 | 250 |
| gemini-2.5-flash | 5 | 20 |

๐Ÿณ Docker Deployment

Build the image:

docker build -f devops/Dockerfile -t whatsapp-analyzer:latest .

Run the container:

docker run -p 8501:8501 whatsapp-analyzer:latest

Using Docker Compose:

cd devops
docker-compose up

📊 OOP Architecture

The core logic follows an object-oriented design:

Base (core/base.py)
├── Preprocessor    - Parses raw WhatsApp .txt into DataFrame
├── Analyzer        - Computes all analytics from DataFrame
├── Sentiment       - Scores sentiment and detects toxicity
└── Agent           - Extracts patterns and powers LLM chat

All shared utility methods (text cleaning, stopwords, emoji extraction, DataFrame filtering) live in the Base superclass and are inherited by all core classes.
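
The inheritance pattern described above can be sketched as follows. This is a simplified, hypothetical stand-in and not the contents of core/base.py. The method names `clean_text`, `tokens`, and `top_words` are assumptions, and only two of the shared utilities (text cleaning and stopword filtering) are shown.

```python
import re
from collections import Counter


class Base:
    """Hypothetical shared superclass holding utilities used by every core class."""

    URL_RE = re.compile(r"https?://\S+")

    def __init__(self, stopwords=()):
        self.stopwords = {word.lower() for word in stopwords}

    def clean_text(self, text):
        """Strip URLs, lowercase, and trim surrounding whitespace."""
        return self.URL_RE.sub("", text).lower().strip()

    def tokens(self, text):
        """Split cleaned text into words and drop stopwords."""
        words = re.findall(r"[a-z']+", self.clean_text(text))
        return [word for word in words if word not in self.stopwords]


class Analyzer(Base):
    """Inherits the cleaning helpers and adds one analytics method."""

    def top_words(self, messages, n=20):
        counts = Counter()
        for message in messages:
            counts.update(self.tokens(message))
        return counts.most_common(n)


# Hypothetical Hinglish sample with two stopwords.
analyzer = Analyzer(stopwords=["hai", "kya"])
chat = ["Kya haal hai", "haal theek hai https://example.com", "theek theek"]
print(analyzer.top_words(chat, n=2))  # [('theek', 3), ('haal', 2)]
```

Keeping the helpers in one superclass means a change to stopword handling or text cleaning applies to every core class at once.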


🧪 Running Tests

pytest tests/ -v

๐ŸŒ Live Demo

Coming soon on Streamlit Cloud


๐Ÿ‘จโ€๐Ÿ’ป Author

Deepu Kumar Rajak Data Scientist & AI Enthusiast ยท IIT Kharagpur

I'm passionate about leveraging AI to solve real-world problems. My work spans machine learning, deep learning, NLP, and generative AI applications.

๐ŸŒ Portfolio


📄 License

This project is licensed under the MIT License.


๐Ÿ™ Acknowledgements

About

A Streamlit dashboard for analyzing WhatsApp chat exports. Features include user activity mapping and interactive word clouds.
