B.Tech AI · Amrita Vishwa Vidyapeetham | MS Computer Science · Pace University, New York
I don't just build models — I ship AI systems that run in production on AWS.
From RAG pipelines and LLM agents to computer vision and MLOps — full stack, end to end.
| | Project | Stack |
|---|---|---|
| 🟣 | AI Financial Research Agent | GPT-4o · FastAPI · AWS EC2 · Docker |
| 🔵 | Enterprise RAG Chatbot | LangChain · ChromaDB · AWS EC2 · Docker |
I'm an AI/ML Engineer who ships production AI — not just prototypes. My work spans the full stack: LLM agents, RAG pipelines, computer vision, and cloud-native MLOps on AWS.
- 🤖 LLM Engineering — RAG pipelines, LangGraph agents, tool-calling, prompt engineering, OpenAI API
- 🔭 End-to-end ML — data ingestion → feature engineering → training → real-time inference → monitoring
- ☁️ Cloud-native systems — AWS EC2, S3, Lambda, Kinesis, SageMaker, RDS, EventBridge
- 👁️ Computer Vision — YOLOv8 object detection, ResNet-50 classification, 50ms inference APIs
- 📊 MLOps — data drift detection, automated retraining, MLflow, CloudWatch, GitHub Actions CI/CD
- 🎓 AWS Certified Cloud Practitioner · MS CS, Pace University, New York
"A model that doesn't reach production is a prototype, not a solution."
| Metric | Result | Project |
|---|---|---|
| ⚡ Report Generation | < 30 seconds end-to-end | AI Financial Research Agent |
| 🔍 RAG Retrieval Accuracy | 87% on benchmark queries | Enterprise RAG Chatbot |
| 🎯 Fraud Detection Accuracy | 96.7% on 50k+ claims | Insurance Fraud Detection |
| 🧠 Inference Latency | 50ms real-time | Insurance Fraud Detection |
| 📈 Churn Prediction | 91% accuracy / 89% F1 | Customer Churn + MLOps |
| 📉 Sales Forecast RMSE | 4.2% — 18% better than baseline | Retail Sales Forecasting |
🟣 AI Financial Research Agent — Live Demo ↗ · GitHub ↗ · Portfolio ↗
Autonomous GPT-4o agent that generates structured BUY/HOLD/SELL investment reports in under 30 seconds
Most AI financial tools let the LLM decide whether to buy or sell. This system does not. The investment decision is made entirely by a deterministic scoring engine — GPT-4o only explains the decision in plain English. This eliminates hallucination from the most critical part of the pipeline.
User Request → FastAPI → LangGraph Agent → Market Data (Stooq) + Fundamentals (Alpha Vantage) + News (Tavily)
→ Deterministic Scoring Engine (Technical + Fundamental + Sentiment)
→ Conflict Detection → Confidence Model → BUY / HOLD / SELL
→ GPT-4o Explanation → Redis Cache → JSON Response → Dark UI
Key innovations: Normalised scoring (missing data not penalised) · Conflict detection override · 5-factor confidence model · Portfolio ranking mode · Time-horizon weight profiles · Redis caching (10× API cost reduction)
Python FastAPI LangGraph GPT-4o Alpha Vantage Tavily Redis Docker AWS EC2
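The deterministic scoring idea described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `score_ticker`, the weights, the BUY/SELL thresholds, and the conflict rule are all assumptions made for the example. Sub-scores are assumed to lie in [-1, 1]. A missing sub-score is dropped and the remaining weights are renormalised, which is one way to avoid penalising missing data.

```python
from typing import Optional

# Hypothetical weight profile; the real time-horizon profiles are not shown in this README.
WEIGHTS = {"technical": 0.4, "fundamental": 0.4, "sentiment": 0.2}


def score_ticker(scores: dict[str, Optional[float]],
                 weights: dict[str, float] = WEIGHTS,
                 buy_at: float = 0.25,
                 sell_at: float = -0.25,
                 conflict_gap: float = 1.2) -> dict:
    """Combine sub-scores in [-1, 1] into a BUY/HOLD/SELL decision."""
    # Keep only the factors that actually have data.
    available = {k: v for k, v in scores.items() if v is not None and k in weights}
    if not available:
        return {"decision": "HOLD", "score": 0.0, "conflict": False}

    # Renormalise weights over the available factors so missing data is not penalised.
    total_weight = sum(weights[k] for k in available)
    combined = sum(weights[k] * v for k, v in available.items()) / total_weight

    # Assumed conflict rule: strongly opposing factors override the score and force HOLD.
    values = list(available.values())
    conflict = (max(values) - min(values)) >= conflict_gap

    if conflict:
        decision = "HOLD"
    elif combined >= buy_at:
        decision = "BUY"
    elif combined <= sell_at:
        decision = "SELL"
    else:
        decision = "HOLD"
    return {"decision": decision, "score": round(combined, 4), "conflict": conflict}
```

Because the decision comes from plain arithmetic, the same inputs always give the same output, and the LLM only needs the returned dictionary to write its explanation.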
🔵 Enterprise RAG Chatbot — Live Demo ↗ · GitHub ↗ · Portfolio ↗
Production RAG pipeline — 87% retrieval accuracy, hallucination guardrails, live on AWS EC2
Full production RAG system: MMR retrieval for result diversity, metadata-injected citations, session-isolated upload mode, structured JSON logging, and real-time streaming. Every response includes [SOURCE N: filename | Page] citations, so each answer can be traced back to the document it came from.
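Metadata citation injection can be sketched as below. This is an illustrative example, not the repository's code: the `Chunk` dataclass and `build_context` helper are hypothetical names, and the tag layout simply follows the `[SOURCE N: filename | Page]` pattern mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str       # chunk content returned by the retriever
    filename: str   # source PDF name stored as metadata
    page: int       # page number stored as metadata


def build_context(chunks: list[Chunk]) -> str:
    """Prefix each retrieved chunk with a numbered source tag for the prompt."""
    blocks = []
    for n, chunk in enumerate(chunks, start=1):
        tag = f"[SOURCE {n}: {chunk.filename} | Page {chunk.page}]"
        blocks.append(f"{tag}\n{chunk.text}")
    return "\n\n".join(blocks)
```

Placing the tag next to each chunk lets a strict prompt ask the model to cite only the tags it was given.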
PDF (Library or Upload) → PyPDF → Chunking (chunk size 1000, overlap 200)
→ OpenAI Embeddings (text-embedding-3-small) → ChromaDB
→ MMR Retrieval (fetch_k=20, top-5) → Hallucination Guardrail
→ Strict Prompt + Citations → GPT-3.5-turbo → Streamlit Streaming UI
Key innovations: Similarity threshold guardrail · MMR diversity retrieval · Metadata citation injection · Upload-any-PDF mode · Session-isolated temp vectorstore · Structured query logging
Python LangChain OpenAI API ChromaDB Streamlit Docker AWS EC2 PyPDF
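For readers unfamiliar with MMR, here is a small pure-Python sketch of maximal marginal relevance combined with a similarity threshold guardrail. It is not the project's implementation (which uses LangChain and ChromaDB). The helper names, the `lambda_mult` value, and the `min_similarity` cut-off are assumptions for illustration, while `fetch_k=20` and `k=5` mirror the pipeline above.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query: list[float], docs: list[list[float]],
               k: int = 5, fetch_k: int = 20, lambda_mult: float = 0.5,
               min_similarity: float = 0.3) -> list[int]:
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    relevance = [cosine(query, d) for d in docs]
    # Candidate pool: the fetch_k documents most similar to the query.
    pool = sorted(range(len(docs)), key=lambda i: relevance[i], reverse=True)[:fetch_k]

    # Guardrail (assumed form): if even the best match is too weak, return nothing
    # so the caller can answer "not found" instead of letting the LLM guess.
    if not pool or relevance[pool[0]] < min_similarity:
        return []

    selected: list[int] = []
    while pool and len(selected) < k:
        def mmr_score(i: int) -> float:
            # Penalise candidates that resemble something already selected.
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(pool, key=mmr_score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With `lambda_mult=1.0` the selection reduces to plain top-k similarity. Lower values trade relevance for diversity among the returned chunks.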
🔴 Insurance Fraud Detection — GitHub ↗ · Portfolio ↗
Dual-model CV pipeline — 96.7% accuracy on 50k+ insurance claims at 50ms inference
YOLOv8 localises vehicle damage in real time. ResNet-50 classifies severity and flags fraud indicators. Served via FastAPI at 50ms average inference latency on a cloud-native AWS pipeline.
Claim Image → YOLOv8 (damage localisation) → ResNet-50 (severity classification)
→ FastAPI Inference API (50ms) → AWS S3 + Lambda + Kinesis → RDS Reporting
YOLOv8 ResNet-50 PyTorch OpenCV FastAPI AWS S3 Lambda Kinesis SageMaker RDS
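The two-stage flow (detect damage regions, then classify each region) can be sketched independently of any specific model. In this hedged example the detector and classifier are passed in as plain callables, standing in for YOLOv8 and ResNet-50. `Finding`, `crop`, and `assess_claim` are hypothetical names, and the image is a nested list rather than a real tensor.

```python
from dataclasses import dataclass
from typing import Callable

Box = tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Image = list[list[int]]           # stand-in for a real image array


@dataclass
class Finding:
    box: Box
    severity: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by the box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def assess_claim(image: Image,
                 detect: Callable[[Image], list[Box]],
                 classify: Callable[[Image], str]) -> list[Finding]:
    """Stage 1 finds damage regions; stage 2 classifies each cropped region."""
    return [Finding(box, classify(crop(image, box))) for box in detect(image)]
```

Keeping the two stages behind simple callables makes it straightforward to swap models or to unit-test the pipeline with stubs.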
🟢 Retail Sales Forecasting — GitHub ↗ · Portfolio ↗
LSTM + Prophet ensemble — 4.2% RMSE, 18% better than single-model baselines on AWS SageMaker
Engineered a batch ETL pipeline processing 500K+ retail transactions. LSTM captures complex temporal patterns; Prophet handles seasonality and holidays. Deployed as a SageMaker real-time endpoint with automated weekly retraining from S3.
LSTM Prophet PyTorch SageMaker S3 Pandas NumPy
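One simple way to combine two forecasters is a weighted average of their predictions, scored with RMSE. The sketch below assumes that approach. The README does not say how the LSTM and Prophet outputs are actually blended, and `ensemble_forecast`, `rmse`, and the 0.5 default weight are illustrative choices.

```python
import math


def ensemble_forecast(lstm_pred: list[float], prophet_pred: list[float],
                      lstm_weight: float = 0.5) -> list[float]:
    """Blend two forecasts point by point with a fixed weight."""
    if len(lstm_pred) != len(prophet_pred):
        raise ValueError("forecasts must cover the same horizon")
    return [lstm_weight * a + (1 - lstm_weight) * b
            for a, b in zip(lstm_pred, prophet_pred)]


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root mean squared error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

In practice the weight would be tuned on a validation window by comparing the blended RMSE against each single model.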
🟡 Customer Churn Prediction — GitHub ↗ · Portfolio ↗
MLOps pipeline — 91% accuracy, SHAP explainability, automated CI/CD retraining
XGBoost classifier with 21 engineered features including NLP-extracted signals from support tickets. CloudWatch + EventBridge triggers automated model retraining on data drift. MLflow tracks all experiments.
XGBoost SHAP MLflow AWS Comprehend CloudWatch EventBridge Docker GitHub Actions
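A drift check that could gate retraining is sketched below using the Population Stability Index (PSI). This is an assumption for illustration: the README does not state which drift metric the pipeline uses, and the `psi` and `should_retrain` helpers and the 0.2 threshold (a common rule of thumb) are not taken from the project.

```python
import math


def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 if the baseline is constant

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1   # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(expected: list[float], actual: list[float],
                   threshold: float = 0.2) -> bool:
    """Flag drift when the PSI of a feature exceeds the threshold."""
    return psi(expected, actual) > threshold
```

A scheduled job could run this per feature and publish the result as a metric, with an alarm on that metric starting the retraining workflow.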
Generative AI & LLM
LangGraph RAG Pipelines Vector Databases Prompt Engineering Embeddings Tool Calling
Machine Learning & CV
YOLOv8 ResNet-50 LSTM Prophet SHAP
Cloud & Infrastructure
EC2 S3 SageMaker Lambda Kinesis RDS CloudWatch EventBridge
MLOps & Tools
Weights & Biases Jupyter Linux Bash
- 🔧 AI Resume Analyzer — LLM-powered resume scoring and feedback system
- 🔍 Semantic Search Engine — production vector search with re-ranking and RAGAS evaluation
- 🤖 Advanced RAG — multi-modal RAG with images and tables, hybrid retrieval
- ⚙️ Advanced MLOps — model versioning, experiment tracking, CI/CD for ML
- 📡 Large-Scale Pipelines — Apache Spark, AWS Glue, real-time streaming architectures
I'm actively seeking AI/ML Engineer, LLM Engineer, and Applied AI roles at US tech companies and startups.
| 🌐 Portfolio | rajkumarai.dev |
| LinkedIn | linkedin.com/in/raj-kumar-nelluri |
| Email | rajkumarn2002@gmail.com |
| 🐙 GitHub | github.com/Rajkumar2002-Rk |