An advanced data analytics and AI application that transforms raw WhatsApp chat exports into deep, meaningful insights, powered by Python, Streamlit, Sentiment Analysis and Google Gemini AI.
WhatsApp Analyzer Pro is a portfolio-grade data science project built to demonstrate advanced analytical and AI skills through real-world WhatsApp chat data.
Key Goals:
- Parse and process unstructured WhatsApp text data into structured insights
- Apply NLP techniques for sentiment scoring and mood tracking
- Build an LLM-powered agent that learns from real chat patterns
- Present findings through an interactive, visually appealing dashboard
- Deploy end-to-end using Docker, Kubernetes and Jenkins CI/CD pipeline
- Total messages, words, media and links overview
- Monthly and daily message timelines
- Activity heatmap (hour vs day of week)
- Most active users with contribution percentages
- Word cloud and top 20 most common words
- Emoji frequency and distribution analysis
- Response time analysis per user
- Conversation starter identification
- Per-message VADER sentiment scoring
- Overall mood score out of 10
- Monthly sentiment timeline (positive/negative trends)
- Sentiment heatmap by hour and day
- Per-user sentiment comparison
- Toxic and aggressive message detection
- Dynamic personality extraction from real chat data
- Two-role conversation setup (you vs AI clone)
- Hinglish-aware LLM responses via Google Gemini
- Chat memory for multi-turn conversations
- RPM and RPD usage counter with warnings
- Model selector with rate limit display
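The sentiment features listed above (per-message VADER scoring and an overall mood score out of 10) can be illustrated with a minimal sketch. It assumes each message already has a VADER compound score in the range -1 to 1, for example from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`. The linear mapping to a 0-10 scale and the function names `label` and `mood_score` are assumptions made for illustration, not necessarily how the project computes its score.

```python
from statistics import mean


def label(compound):
    """Classify one VADER compound score using the conventional 0.05 cut-offs."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def mood_score(compounds):
    """Map the mean compound score from [-1, 1] onto a 0-10 scale (assumed formula)."""
    if not compounds:
        return 5.0  # no messages: report the neutral midpoint
    return round((mean(compounds) + 1) * 5, 1)


scores = [0.6, -0.2, 0.0, 0.8]  # hypothetical per-message compound scores
print([label(s) for s in scores], mood_score(scores))
```

Grouping the same compound scores by month, hour, or user before averaging would give the timeline, heatmap, and per-user comparisons.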
| Layer | Technology |
|---|---|
| Frontend | Streamlit, Custom CSS, Plotly |
| Backend | Python 3.14, Pandas, NLTK |
| NLP | VADER Sentiment, WordCloud |
| AI/LLM | Google Gemini API (google-genai) |
| Containerization | Docker |
| Orchestration | Kubernetes (K8s) |
| CI/CD | Jenkins |
| Registry | Docker Hub |
whatsapp-analyzer/
│
├── app.py                      # Main Streamlit entry point
│
├── config/
│   └── settings.py             # App-wide constants & config
│
├── core/                       # OOP Business Logic
│   ├── base.py                 # Master Base/Super class
│   ├── preprocessor.py         # WhatsApp chat parser
│   ├── analyzer.py             # Analytics computations
│   ├── sentiment.py            # Sentiment analysis
│   └── agent.py                # AI Chat Agent
│
├── ui/                         # UI Components
│   ├── sidebar.py              # Sidebar component
│   ├── dashboard.py            # Analytics dashboard
│   ├── sentiment_ui.py         # Sentiment UI
│   └── agent_ui.py             # Agent chat UI
│
├── pages/                      # Streamlit multipage
│   ├── 1_How_To_Use.py         # How To Use guide
│   └── 2_Contact_Me.py         # Contact/Portfolio page
│
├── assets/
│   └── style.css               # Custom CSS styling
│
├── data/
│   └── stop_hinglish.txt       # Hinglish stopwords
│
├── tests/                      # Unit tests
│
├── devops/                     # DevOps configuration
│   ├── Dockerfile              # Docker build file
│   ├── docker-compose.yml      # Local Docker Compose
│   ├── Jenkinsfile             # CI/CD Pipeline
│   └── k8s/
│       ├── deployment.yaml     # Kubernetes Deployment
│       └── service.yaml        # Kubernetes Service
│
├── requirements.txt
├── .gitignore
└── README.md
- Python 3.14+
- pip
- Git
- Docker (optional, for containerized deployment)
1. Clone the repository:
git clone https://github.com/deepu84059/whatsapp-analyzer-pro.git
cd whatsapp-analyzer-pro
2. Create and activate virtual environment:
# Windows
python -m venv venv
venv\Scripts\Activate.ps1
# Linux/Mac
python3 -m venv venv
source venv/bin/activate
3. Install dependencies:
pip install -r requirements.txt
4. Run the app:
streamlit run app.py
5. Open in browser:
http://localhost:8501
Android:
- Open WhatsApp chat
- Tap three dots (⋮) → More → Export Chat
- Select Without Media
- Save the .txt file
iPhone:
- Open WhatsApp chat
- Tap contact/group name → Export Chat
- Select Without Media
- Save the .txt file
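Once exported, the .txt file has to be turned into structured rows before any analysis can run. The sketch below shows one way to do that with the standard library. It assumes an Android-style line such as `12/31/23, 9:15 PM - Alice: Happy new year!`; real exports differ by platform and locale, so the pattern, the `parse_chat` name, and the `"system"` placeholder user are illustrative assumptions rather than the project's actual parser.

```python
import re
from datetime import datetime

# Assumed line shape: "<m/d/yy>, <h:mm AM|PM> - <rest of line>".
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/(?:\d{4}|\d{2})), (\d{1,2}:\d{2}\s[AP]M) - (.*)$"
)


def parse_chat(text):
    """Turn raw export text into a list of {timestamp, user, message} dicts."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp prefix: treat the line as a continuation
            # of the previous (multi-line) message.
            if rows and line.strip():
                rows[-1]["message"] += "\n" + line
            continue
        date_part, time_part, body = match.groups()
        year_fmt = "%y" if len(date_part.rsplit("/", 1)[1]) == 2 else "%Y"
        # Some exports use a non-breaking space before AM/PM; normalise it.
        time_part = re.sub(r"\s", " ", time_part)
        timestamp = datetime.strptime(
            f"{date_part} {time_part}", f"%m/%d/{year_fmt} %I:%M %p"
        )
        user, sep, message = body.partition(": ")
        if not sep:
            # Lines without "Name: " are notices such as group events.
            user, message = "system", body
        rows.append({"timestamp": timestamp, "user": user, "message": message})
    return rows


sample = (
    "12/31/23, 9:15 PM - Alice: Happy new year!\n"
    "See you soon\n"
    "1/1/24, 12:05 AM - Bob: Same to you"
)
for row in parse_chat(sample):
    print(row)
```

The resulting list of dicts can be passed directly to `pandas.DataFrame(rows)` to get the tabular form the analytics work on.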
The AI Chat Agent requires a free Google Gemini API key:
- Go to aistudio.google.com
- Sign in with your Google account
- Click Get API Key → Create API Key
- Copy the key and paste it in the app sidebar
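For readers who want to see what the key is used for, here is a rough sketch of a single Gemini request through the `google-genai` package. The `build_prompt` helper, its prompt wording, and the role names are hypothetical and only illustrate the idea of a style-aware, multi-turn prompt; the app's own agent may build its prompts differently.

```python
def build_prompt(style_notes, history, user_message):
    """Assemble a plain-text prompt (hypothetical format) from style notes and chat memory."""
    lines = [
        "Reply in the style described below.",
        f"Style: {style_notes}",
        "Conversation so far:",
    ]
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"user: {user_message}")
    lines.append("clone:")
    return "\n".join(lines)


def ask_gemini(api_key, prompt, model="gemini-2.5-flash-lite"):
    """Send one prompt to Gemini and return the reply text."""
    # Imported lazily so the helper above works without the package installed.
    from google import genai  # pip install google-genai

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text


prompt = build_prompt(
    "short replies, Hinglish, lots of emojis",
    [("user", "hi"), ("clone", "haan bolo")],
    "kal milte hain?",
)
print(prompt)
# reply = ask_gemini("YOUR_API_KEY", prompt)  # needs a real key and network access
```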
Free tier limits:
| Model | RPM | RPD |
|---|---|---|
| gemini-2.5-flash-lite | 10 | 250 |
| gemini-2.5-flash | 5 | 20 |
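A usage counter that respects the limits in the table above could look roughly like this. The `UsageCounter` class is a hypothetical sketch: it tracks request timestamps in memory and treats the daily limit as a rolling 24-hour window, which is a simplifying assumption and may not match how Google resets the quota.

```python
import time
from collections import deque

# (RPM, RPD) pairs copied from the free-tier table above.
FREE_TIER_LIMITS = {
    "gemini-2.5-flash-lite": (10, 250),
    "gemini-2.5-flash": (5, 20),
}


class UsageCounter:
    """Count requests in the last minute and the last 24 hours."""

    def __init__(self, rpm, rpd, clock=time.time):
        self.rpm = rpm
        self.rpd = rpd
        self._clock = clock      # injectable so the counter can be tested
        self._stamps = deque()   # timestamps of accepted requests, oldest first

    def usage(self):
        """Return (requests in last 60 s, requests in last 24 h)."""
        now = self._clock()
        while self._stamps and self._stamps[0] <= now - 86400:
            self._stamps.popleft()
        last_minute = sum(1 for t in self._stamps if t > now - 60)
        return last_minute, len(self._stamps)

    def allow(self):
        """Record a request and return True, or return False if a limit is hit."""
        minute, day = self.usage()
        if minute >= self.rpm or day >= self.rpd:
            return False
        self._stamps.append(self._clock())
        return True


counter = UsageCounter(*FREE_TIER_LIMITS["gemini-2.5-flash"])
if counter.allow():
    print("OK to call the API; usage so far:", counter.usage())
else:
    print("Rate limit reached, wait before retrying")
```

In a Streamlit app the counter would typically be kept in `st.session_state` so it survives reruns.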
Build the image:
docker build -f devops/Dockerfile -t whatsapp-analyzer:latest .
Run the container:
docker run -p 8501:8501 whatsapp-analyzer:latest
Using Docker Compose:
cd devops
docker-compose up
The project follows a clean OOP design pattern:
Base (core/base.py)
├── Preprocessor → Parses raw WhatsApp .txt into DataFrame
├── Analyzer → Computes all analytics from DataFrame
├── Sentiment → Scores sentiment and detects toxicity
└── Agent → Extracts patterns and powers LLM chat
All shared utility methods (text cleaning, stopwords, emoji extraction, DataFrame filtering) live in the Base superclass and are inherited by all core classes.
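The inheritance pattern described above can be sketched in a few lines. This is an illustration only: the method names, the tiny stopword set, and the use of plain dicts instead of a DataFrame are assumptions made to keep the example self-contained, not the contents of `core/base.py`.

```python
import re
from collections import Counter


class Base:
    """Shared helpers that every core class inherits (hypothetical names)."""

    STOPWORDS = {"the", "a", "is", "hai", "ka"}  # stand-in for stop_hinglish.txt

    def clean_text(self, text):
        """Collapse whitespace and lowercase the text."""
        return re.sub(r"\s+", " ", text).strip().lower()

    def remove_stopwords(self, text):
        """Return the cleaned words that are not stopwords."""
        return [w for w in self.clean_text(text).split() if w not in self.STOPWORDS]

    def filter_by_user(self, rows, user):
        """Keep only one user's messages, or everything for 'Overall'."""
        if user == "Overall":
            return list(rows)
        return [row for row in rows if row["user"] == user]


class Analyzer(Base):
    """Reuses the inherited helpers instead of reimplementing them."""

    def top_words(self, rows, user="Overall", n=3):
        counts = Counter()
        for row in self.filter_by_user(rows, user):
            counts.update(self.remove_stopwords(row["message"]))
        return counts.most_common(n)


rows = [
    {"user": "A", "message": "Chai  is ready"},
    {"user": "B", "message": "chai chai later"},
]
print(Analyzer().top_words(rows))
```

Keeping the helpers in one superclass means a fix to text cleaning or filtering applies to the parser, analytics, sentiment, and agent classes at once.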
Run the unit tests:
pytest tests/ -v
Live demo: coming soon on Streamlit Cloud.
Deepu Kumar Rajak
Data Scientist & AI Enthusiast · IIT Kharagpur
I'm passionate about leveraging AI to solve real-world problems. My work spans machine learning, deep learning, NLP, and generative AI applications.
Portfolio
This project is licensed under the MIT License.
- Streamlit: for the amazing web framework
- Google Gemini: for the free LLM API
- VADER Sentiment: for sentiment analysis
- Plotly: for interactive charts