ManasaDeshagouni/Readme.MD


~/about

I'm Manasa, a Master's student in Computer Science at San José State University and a Graduate Research Assistant working on zero-shot malware attribution with metric-learned embeddings.

Previously, I spent 2+ years at Optum (UnitedHealth Group) shipping production ML, GenAI features, and backend systems for a secure file-transfer platform.

I care about the part of ML that actually gets used: retrieval, inference, guardrails, evaluation, rollout safety, and the systems that turn predictions into action.

  • 🔬 Currently researching: zero-shot retrieval, metric learning, FAISS-based search
  • 🧠 Interested in: production ML, GenAI/RAG, embeddings, backend systems
  • 🤝 Open to: SWE, MLE, Applied ML, and Research Engineer roles

projects/featured

  ╭───────────────────────────────────────────╮
  │  🔍  "What was that paper about           │
  │       attention mechanisms I read         │
  │       last month?"                        │
  │                                           │
  │  Found 3 results in 47ms                  │
  │  ├── 📄 attention_is_all_you_need.pdf     │
  │  │   ✦ 0.94 relevance · chunk 3/12        │
  │  ├── 📝 transformer_notes.md              │
  │  │   ✦ 0.89 relevance · tagged: #nlp      │
  │  └── 🖼️ architecture_diagram.png          │
  │      ✦ 0.81 relevance · CLIP matched      │
  ╰───────────────────────────────────────────╯

Your second brain, with semantic superpowers.

A local-first AI agent that ingests everything (PDFs, notes, receipts, images, code snippets) and makes it all searchable by meaning, not keywords. Built on multimodal embeddings (MiniLM for text, CLIP for images), a FAISS HNSW index, and adaptive retrieval with temporal reranking.

| What it handles | How it performs |
|---|---|
| 100k+ documents indexed | < 500ms retrieval |
| 5 features: search, Q&A, summarize, rank, discover | Metadata filtering + temporal reranking |
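Temporal reranking, as named above, can be sketched as blending each hit's similarity score with an exponential recency decay. This is only an illustration under stated assumptions: the `Hit` type, the `half_life_days` and `recency_weight` parameters, and the blending formula are hypothetical, not taken from the project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Hit:
    """One retrieval result (hypothetical shape, for illustration only)."""
    doc_id: str
    similarity: float      # e.g. cosine similarity from the vector index
    modified_at: datetime  # when the document was last touched


def temporal_rerank(hits, now, half_life_days=30.0, recency_weight=0.2):
    """Re-order hits by a blend of semantic similarity and recency.

    Recency decays exponentially: a document half_life_days old scores 0.5.
    Both parameters are assumed tuning knobs.
    """
    def score(hit):
        age_days = max((now - hit.modified_at).total_seconds() / 86400.0, 0.0)
        recency = 0.5 ** (age_days / half_life_days)
        return (1.0 - recency_weight) * hit.similarity + recency_weight * recency

    return sorted(hits, key=score, reverse=True)


now = datetime(2024, 6, 1)
hits = [
    Hit("old_paper.pdf", similarity=0.90, modified_at=now - timedelta(days=365)),
    Hit("recent_notes.md", similarity=0.88, modified_at=now - timedelta(days=7)),
]
ranked = temporal_rerank(hits, now)
print([h.doc_id for h in ranked])
```

With these toy numbers the week-old note overtakes the slightly more similar year-old paper. Setting `recency_weight=0` falls back to pure similarity ordering.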


🔔 NotifyOps

  03:14:22 ⚠  ALERT  disk_full on prod-db-03
  03:14:22 ⚠  ALERT  disk_full on prod-db-03   ← duplicate, suppressed
  03:14:23 ⚠  ALERT  disk_full on prod-db-03   ← duplicate, suppressed
  03:14:24 🔔 PAGED  @sarah (on-call: infra)    ← 1.9s from first alert
  03:14:26 ✅ ACK    @sarah acknowledged         ← 220ms ack→resolve

  ┌────────────────────────────┐
  │ 3 alerts → 1 page → 1 ack  │
  │ 58% noise eliminated       │
  └────────────────────────────┘

Pages the right engineer. Kills the noise.

Multi-tenant on-call alerting SaaS with real-time dedup, correlation, and idempotent workers. JWT/HMAC-secured ingest. React + WebSocket console for live ack/resolve. Runs standalone or as a pre-filter ahead of PagerDuty/Opsgenie.
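The dedup step can be illustrated with a minimal in-memory sketch: alerts are fingerprinted and repeats inside a time window are suppressed. The class name, the fields chosen for the fingerprint, and the 300-second window are all assumptions made for the example, not details of NotifyOps itself.

```python
import hashlib


class AlertDeduplicator:
    """Suppress repeat alerts that share a fingerprint within a time window."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds  # assumed suppression window
        self._first_seen = {}                 # fingerprint -> time of the alert that paged

    @staticmethod
    def fingerprint(tenant, check, host):
        # Hash the fields that identify "the same problem" (an assumed choice).
        raw = f"{tenant}|{check}|{host}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def should_page(self, tenant, check, host, timestamp):
        """Return True for the first alert in a window, False for duplicates."""
        key = self.fingerprint(tenant, check, host)
        first = self._first_seen.get(key)
        if first is not None and timestamp - first < self.window_seconds:
            return False  # duplicate inside the window: suppress
        self._first_seen[key] = timestamp
        return True


dedup = AlertDeduplicator(window_seconds=300)
events = [(0.0, "disk_full", "prod-db-03"),
          (0.4, "disk_full", "prod-db-03"),
          (1.1, "disk_full", "prod-db-03")]
decisions = [dedup.should_page("acme", check, host, ts) for ts, check, host in events]
print(decisions)  # [True, False, False]
```

With several workers, this map would presumably need to live in a shared store with an atomic set-if-absent operation so that two workers cannot both page for the same fingerprint.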

| Metric | Result |
|---|---|
| First-notify p95 | 1.9s @ 350 req/s |
| Duplicate suppression | 58% |
| Delivery success | 77% → 94% (retries + backoff + DLQ < 0.6%) |
| Ack→resolve p95 | 220ms |
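The "retries + backoff + DLQ" pattern behind the delivery-success row might look roughly like the sketch below. The function name, the default attempt count and delays, and the use of a plain list as the dead-letter queue are assumptions for illustration only.

```python
import random
import time


def deliver_with_retries(send, message, dead_letters, max_attempts=4,
                         base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call send(message), retrying failures with capped exponential backoff.

    Returns True on success. Once max_attempts is exhausted the message is
    appended to dead_letters (a plain list standing in for a real DLQ).
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:
            if attempt < max_attempts - 1:
                # "Full jitter": wait a random time up to the capped delay.
                cap = min(max_delay, base_delay * 2 ** attempt)
                sleep(random.uniform(0.0, cap))
    dead_letters.append(message)
    return False


attempts = {"n": 0}


def flaky_send(msg):
    """Fails twice, then succeeds (simulates a transient outage)."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")


def broken_send(msg):
    """Always fails (simulates a dead endpoint)."""
    raise ConnectionError("endpoint down")


dlq = []
no_sleep = lambda seconds: None  # skip real waiting in the demo
ok = deliver_with_retries(flaky_send, {"page": "@sarah"}, dlq, sleep=no_sleep)
failed = deliver_with_retries(broken_send, {"page": "@bob"}, dlq, sleep=no_sleep)
print(ok, failed, dlq)
```

The transient failure recovers on the third attempt, while the permanently failing message ends up in `dlq` for later inspection instead of being dropped.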


🎭 TruthReaper

          AUDIO                            TEXT
            │                               │
   ┌────────▼─────────┐          ┌──────────▼──────────┐
   │  Acoustic feats  │          │  DistilBERT + cues  │
   │  → BiLSTM        │          │  → text embedding   │
   └────────┬─────────┘          └──────────┬──────────┘
            │          confidence           │
            └────────────┐   ┌──────────────┘
                      ┌──▼───▼──┐
                      │ XGBoost │  ← late fusion
                      │  Fuser  │
                      └────┬────┘
                           │
                     ┌─────▼─────┐
                     │ TRUTH or  │
                     │ DECEPTION │
                     └───────────┘

   accuracy: 89.4%  ·  precision: 93.5%

Your voice says more than your words.

Multimodal deception detection that fuses what you say with how you say it. Temporal acoustic features encoded via BiLSTM, transcript text encoded via DistilBERT with explicit lexical/linguistic cues, late-fused through XGBoost for robust classification on short, noisy clips.
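Late fusion means each branch first produces its own confidence, and a second-stage model learns how to combine them. The sketch below shows that idea with a tiny hand-rolled logistic regression as a stand-in for the XGBoost fuser, so it runs with the standard library alone. The toy confidences, labels, and function names are invented for illustration and say nothing about the real model or its data.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_fuser(features, labels, lr=0.5, epochs=2000):
    """Fit p = sigmoid(w . x + b) with batch gradient descent.

    Stand-in for the XGBoost fuser: same role (second-stage model over
    per-modality confidences), much simpler model.
    """
    n = len(features)
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            err = _sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for i, v in enumerate(x):
                grad_w[i] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def fuse(weights, bias, audio_conf, text_conf):
    """Combine the two branch confidences into one deception probability."""
    return _sigmoid(weights[0] * audio_conf + weights[1] * text_conf + bias)


# Toy rows: [audio branch confidence, text branch confidence]; 1 = deception.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.3, 0.4], [0.1, 0.3]]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic_fuser(X, y)
print(fuse(w, b, 0.85, 0.75) > 0.5, fuse(w, b, 0.15, 0.20) > 0.5)
```

A gradient-boosted fuser would consume the same two-column feature matrix. The point of the sketch is only the data flow: branch outputs in, one fused decision out.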


🎮 More builds → QuizChronicles
  ┌─ ROOM: "algo-arena" ────────── 4/10 players ─┐
  │                                              │
  │  🟢 alice    142 pts   solving Q3...         │
  │  🟢 bob      138 pts   submitted ✓  180ms    │
  │  🟡 charlie  120 pts   idle                  │
  │  🟢 you      155 pts   🏆 leading            │
  │                                              │
  │  ⏱️ 02:34 remaining                          │
  │  fan-out: 120ms p95 across 800 sockets       │
  └──────────────────────────────────────────────┘

Interactive coding + quiz platform with modular Spring Boot backend, sandboxed code execution, React + Monaco editor, real-time rooms, leaderboards, proctor controls over WebSockets, and a timed game-themed solo coding mode.

| Metric | Result |
|---|---|
| Submission p95 | 180ms @ 200 users |
| Fan-out p95 | 120ms @ 800 sockets |
| Dropped updates | < 0.5% |
| Cache speedup | 230 → 140ms (−39%) |


work/production

Note

Can't open-source proprietary code, but here's what I built and what it did.

⚡ Predictive Reliability Engine
Kafka → FastAPI+ONNX → Spring Boot gates
     scoring: p95 85ms @ 1.2k msgs/s

shadow mode → canary (14d, 0 FP) → prod


🤖 GenAI Product Features
config/logs → sanitize → RAG retrieve
  → LLM (LLaMA-2 / Mistral-7B + LoRA)
  → validate schema → serve


🔧 ECG Platform Services
20+ UIs + Spring Boot APIs
dual-schema rollout → 0 breakages
correlation IDs: UI→API→workers



research/active

🦠 Neural Fingerprints for Malware

Graduate Research Assistant @ SJSU

malware binary → image → ResNet encoder
  → L2 normalize → hypersphere embedding
  → FAISS HNSW → nearest family match

  train: 47 families (MalNet + MalImg)
  test:  17 unseen families (zero-shot)

Learning domain-robust embeddings with ProxyAnchor, Triplet, and SupCon losses so unseen malware can be clustered and retrieved without retraining.

Key insight: Strong in-domain separation ≠ cross-domain generalization. The bottleneck is representation stability under dataset shift, not loss function choice.

| Metric | Result |
|---|---|
| Cross-domain retrieval | 88.02% (MalNet→MalImg) |
| Strict zero-shot | 57–67% (17 unseen families) |
| Best loss | ProxyAnchor |
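The last two pipeline steps (L2 normalize, then nearest family match) can be sketched in plain Python. Exhaustive search stands in here for the approximate FAISS HNSW lookup, and the 3-dimensional embeddings and `family_*` labels are toy values invented for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def nearest_family(query, gallery):
    """Return (family, cosine similarity) of the closest gallery embedding.

    gallery is a list of (family_label, embedding) pairs. Brute-force search
    replaces the approximate nearest-neighbor index for clarity.
    """
    q = l2_normalize(query)
    best_label, best_sim = None, -2.0
    for label, emb in gallery:
        # For unit vectors the dot product equals cosine similarity.
        sim = sum(a * b for a, b in zip(q, l2_normalize(emb)))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


# Toy 3-d embeddings; real encoder outputs would be much higher-dimensional.
gallery = [
    ("family_a", [1.0, 0.1, 0.0]),
    ("family_b", [0.0, 1.0, 0.2]),
    ("family_c", [0.1, 0.0, 1.0]),
]
label, sim = nearest_family([0.2, 2.0, 0.3], gallery)
print(label, round(sim, 3))
```

Because matching is a lookup against stored embeddings, supporting a new family in this sketch only means adding gallery entries, which is the property that makes retrieval without retraining possible.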

🏃 Cross-Domain HAR

Zero-shot Pocket Activity Recognition

4 source datasets (heterogeneous phones)
  → unified calibration pipeline
  → physics-aware temporal model
  → 3 zero-shot target datasets

  standing recall: 0% ──fix pipeline──→ ~99%

Multi-source domain adaptation for Sitting / Standing / Walking. Physics-aware calibrator auto-detects sampling rate, units, and orientation of unseen sensors.

Key insight: Standing recall collapsed to ~0% from preprocessing-distribution mismatch, not model weakness. Fixing the pipeline fixed the model.

| Metric | Result |
|---|---|
| Source-domain F1 | 94.1% (subject-disjoint) |
| Zero-shot transfer | ~95.9% (UTwente) |
| Standing recovery | 0% → ~99% |
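To make the calibrator idea concrete, here is a minimal sketch of two of the checks it describes: estimating the sampling rate from timestamps and guessing accelerometer units from the gravity magnitude. The function names and the log-distance heuristic are assumptions for illustration, and orientation detection is left out entirely.

```python
import math
import statistics

STANDARD_GRAVITY = 9.80665  # m/s^2


def estimate_sampling_rate_hz(timestamps_s):
    """Estimate sampling rate from the median gap between timestamps (seconds)."""
    gaps = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    return 1.0 / statistics.median(gaps)


def detect_accel_unit(samples):
    """Guess whether (x, y, z) accelerometer samples are in g or m/s^2.

    Physics prior: the median acceleration magnitude sits near gravity,
    i.e. about 1.0 when expressed in g and about 9.8 in m/s^2.
    """
    median_mag = statistics.median(
        math.sqrt(x * x + y * y + z * z) for x, y, z in samples
    )
    if median_mag <= 0.0:
        raise ValueError("cannot infer units from all-zero samples")
    # Compare in log space so being off by a factor is treated symmetrically.
    if abs(math.log(median_mag)) < abs(math.log(median_mag / STANDARD_GRAVITY)):
        return "g"
    return "m/s^2"


def to_m_per_s2(samples):
    """Convert samples to m/s^2 if they appear to be expressed in g."""
    if detect_accel_unit(samples) == "g":
        return [tuple(v * STANDARD_GRAVITY for v in s) for s in samples]
    return [tuple(s) for s in samples]


timestamps = [i * 0.02 for i in range(6)]  # evenly spaced, 20 ms apart
phone_in_g = [(0.01, -0.02, 0.99), (0.00, 0.01, 1.01), (0.02, 0.00, 1.00)]
print(round(estimate_sampling_rate_hz(timestamps)), detect_accel_unit(phone_in_g))
```

Using the median for both estimates keeps a few dropped samples or motion spikes from skewing the result, which matters when the sensor is unseen and nothing about it can be trusted in advance.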

papers/

| Year | Paper | Venue |
|---|---|---|
| 🏆 2024 | Brain Tumor Detection using Machine Learning | ICCDS 2024 · Best Paper Award |
| 2023 | Deep Learning Techniques for Detection of Deepfakes | IJSRSET (ICSCR 2023) |

stack.yml

languages:
  - Java
  - Python
  - Go
  - C++
  - TypeScript
  - SQL

ml_and_ai:
  - PyTorch
  - TensorFlow
  - Transformers
  - ONNX Runtime
  - FAISS
  - PEFT / LoRA
  - LangChain
  - scikit-learn
  - XGBoost

backend_and_systems:
  - Spring Boot
  - Spring Security
  - FastAPI
  - Kafka
  - Redis
  - PostgreSQL
  - MongoDB
  - Docker
  - Kubernetes
  - GCP
  - AWS

frontend:
  - React
  - Angular
  - Tailwind
  - Vite

observability:
  - Grafana
  - Prometheus
  - Selenium / Cucumber
  - JUnit
  - k6

"Research matters. Production proves it."

Pinned

  1. Mnemosyne (Public)

    A Personal Knowledge Intelligence System for Structured Memory, Semantic Retrieval, and Proactive Recall

    Python

  2. neural-fingerprints (Public)

    Image-based malware attribution using metric learning

    Python

  3. TruthReaper (Public)

    A dual-track deception detection system designed to classify spoken statements as truthful or deceptive.

    Python

  4. FetiiAI (Public)

    Intelligent Rideshare Analytics Platform

    Python