Susilkessav/README.md

Hey, I'm Susil 👋

I'm a grad student at Northeastern University finishing my MS in Data Analytics Engineering.

I like building systems that move, transform, and learn from data — from data pipelines and ML infrastructure to full-stack tools that people actually use. Most of the time, I’m building something around data engineering, AI/ML, or platform/backend systems.

Outside of work and school, I’m usually exploring new ideas, gaming with friends, trying things on Hack The Box, or learning something new just because it looks interesting.

I’m currently looking for full-time roles in Data Engineering, AI/ML Engineering, and Data Platform / SWE.

LinkedIn Email


A little about me

  • I enjoy building projects that solve real problems, not just demos
  • I’m especially interested in data systems, LLM/RAG applications, ML infra, and scalable backend design
  • I like understanding how things work under the hood and improving them
  • I’m big on self-growth, hands-on learning, and building things I’m genuinely excited about

Things I’ve built

RAG Chatbot — Amazon Automotive Reviews

LangChain ChromaDB SentenceTransformers Airflow Langfuse RAGAS FastAPI Docker

Built an LLM-powered Q&A system over 2M+ automotive reviews with query decomposition, vector retrieval, and agent-style answer generation. Reached ~90% faithfulness on RAGAS with response times under 2 seconds, and used Langfuse to track retrieval drift and latency so debugging felt like engineering, not guessing.

Predictive Analytics Pipeline — Telemetry Data

PySpark Spark SQL dbt XGBoost MLflow Delta Lake Kafka

Designed a distributed ML pipeline across 5+ telemetry sources and 10M+ records, with lag-aware joins and dbt contracts across 20+ feature tables. Improved reliability for downstream ML teams, reduced compute cost by 45%, and shipped a model that reached 0.73 ROC-AUC with a ~20% precision@k lift.
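For readers unfamiliar with the metric, here is a minimal sketch of how precision@k and a relative lift over a baseline can be computed. The README does not say what the ~20% lift was measured against or what value of k was used, so the baseline scores, the toy labels, and the helper names `precision_at_k` and `relative_lift` are all assumptions for illustration, not the project's actual evaluation code.

```python
def precision_at_k(y_true, scores, k):
    """Fraction of the k highest-scored items whose true label is 1."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(y_true[i] for i in ranked[:k]) / k


def relative_lift(new_value, baseline_value):
    """Relative improvement of new_value over baseline_value (0.2 means +20%)."""
    return (new_value - baseline_value) / baseline_value


# Toy data (hypothetical): 1 = positive event, 0 = negative.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
model_scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.3, 0.6]
baseline_scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.4, 0.5]

p_model = precision_at_k(y_true, model_scores, k=3)
p_baseline = precision_at_k(y_true, baseline_scores, k=3)
lift = relative_lift(p_model, p_baseline)
```

On this toy data the model ranks three true positives in its top 3 while the baseline ranks two, so the lift is positive; the numbers say nothing about the real pipeline.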

Task & Workflow Management Platform

FastAPI React PostgreSQL Redis JWT/RBAC Docker GitHub Actions pytest

Built a full-stack task platform with 20+ REST endpoints, authentication, RBAC, and caching. Optimized schema and Redis-backed reads to cut API latency by ~35%, and set up CI/CD with 88% test coverage so releases stayed predictable.

IoT Health Monitoring Platform

Kafka MySQL PostgreSQL Great Expectations Tableau

Developed ETL workflows for 120k+ IoT sensor records across 12 device streams and engineered time-series HRV features for mental health analytics. Also built 12+ Tableau dashboards to help analysts connect sleep, stress, and biometric trends without digging through raw data.
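As a rough illustration of what time-series HRV features can look like, the sketch below computes two common time-domain measures from RR intervals (the time between successive heartbeats, in milliseconds). The README does not list which HRV features the project used, so RMSSD and SDNN are assumed here as typical examples, and the sample intervals are made up.

```python
import math
import statistics


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    return statistics.stdev(rr_ms)


# Hypothetical RR intervals in milliseconds for one short window.
rr = [800, 810, 790, 805, 795]
features = {"rmssd": rmssd(rr), "sdnn": sdnn(rr)}
```

In a streaming setting, features like these would usually be computed per device over fixed or sliding time windows; the windowing is left out here to keep the sketch short.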


Tech I enjoy working with

Data & Platform — PySpark · Databricks · Delta Lake · dbt · Airflow · Kafka · Great Expectations · Snowflake
ML / AI — PyTorch · TensorFlow · Scikit-Learn · XGBoost · LangChain · RAG · LoRA/PEFT · ONNX · MLflow · Langfuse
Cloud / Infra — AWS · Azure · Docker · Kubernetes · GitHub Actions · Terraform
Languages — Python · SQL · Java · JavaScript · C++ · Bash
Databases — PostgreSQL · MongoDB · Snowflake · BigQuery · Redshift · ChromaDB · FAISS


Currently

  • building projects around RAG, data pipelines, and ML systems
  • exploring better ways to combine LLMs + structured data
  • looking for opportunities where I can build impactful data/AI systems at scale

Beyond code

I like spending time building what I’m passionate about, especially projects around AI/ML, data, and systems. I also enjoy gaming with friends, solving technical challenges, and constantly finding ways to improve myself.



Pinned

  1. amazon-automotive-mlops Public

    Python

  2. msd-continuous-manufacturing Public

    Continuous pharmaceutical manufacturing process

    Python

  3. Voice-Scheduling-Agent Public

    Real-time voice assistant

    TypeScript

  4. Kioptrix Public

    1

  5. PortSwigger Public

    1

  6. AI-Sports-Commentary-Engine Public

    Python