Shubhankar Kumar Shubhankar9934

Shubhankar Kumar

AI Research Engineer · Mathematics & Scientific Computing

Models are easy. Reliable systems are not.

Perspective

I work at the intersection of applied mathematics, machine learning, and distributed systems.

I am less interested in training another model and more interested in making intelligent systems behave correctly under real constraints: latency, partial data, failure modes, scale, and ambiguity.

My background in Mathematics & Scientific Computing shapes how I approach AI: I think in terms of assumptions, trade-offs, stability, and convergence, not just APIs.

What I Actually Build

Most of my work focuses on the last mile of AI systems:

Retrieval-Augmented Generation that does not hallucinate under pressure
Agentic workflows that degrade gracefully instead of failing silently
Low-latency inference pipelines (< 1s) with predictable behavior
Systems that can explain why they produced an output, not just what

I care about repeatability, observability, and failure analysis as much as accuracy.
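As an illustration of "degrade gracefully instead of failing silently", here is a minimal sketch rather than code from any production system. It assumes a primary path with a latency budget and a cheaper fallback; `answer_with_fallback`, `slow_model`, `cached_summary`, and the `Answer` fields are hypothetical names chosen for this example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    source: str  # which path produced the text: "primary" or "fallback"
    reason: str  # why that path was taken, kept for logs and failure analysis


async def answer_with_fallback(primary, fallback, query, timeout_s=1.0):
    """Try the primary path within a latency budget; otherwise degrade explicitly."""
    try:
        text = await asyncio.wait_for(primary(query), timeout=timeout_s)
        return Answer(text, "primary", "ok")
    except asyncio.TimeoutError:
        reason = f"primary exceeded {timeout_s}s budget"
    except Exception as exc:  # a primary failure is recorded, not swallowed
        reason = f"primary failed: {type(exc).__name__}"
    return Answer(await fallback(query), "fallback", reason)


async def slow_model(query):  # hypothetical stand-in for a slow model call
    await asyncio.sleep(0.2)
    return f"full answer to {query!r}"


async def cached_summary(query):  # hypothetical cheap fallback
    return f"cached summary for {query!r}"


result = asyncio.run(
    answer_with_fallback(slow_model, cached_summary, "rates", timeout_s=0.05)
)
print(result.source, "|", result.reason)
```

The caller always gets an answer plus the reason it took that path, so a degraded response is visible in logs instead of looking like a normal one.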

Technical Foundation

Core Strengths

Linear Algebra, Optimization, Probability
ML model behavior analysis (not blind fine-tuning)
Async systems and real-time data pipelines
Designing for latency, throughput, and cost

Tools I Use (Because They Solve Problems)

Area	Stack
LLM & RAG	HuggingFace, vLLM, FAISS, Milvus, BGE Rerankers
Agentic Systems	LangChain, LlamaIndex, custom orchestration
Backend	Python, FastAPI (async), WebSockets
Streaming / Infra	Kafka, Docker, Kubernetes
ML / Math	PyTorch, TensorFlow, NumPy, SciPy

I choose tools based on constraints, not trends.

Selected Engineering Work

Most recent production work is proprietary. Below is the nature of problems I solve.

Real-Time Financial Intelligence Systems

Context: Streaming market data, news, and macro signals
Problem: Generate reasoning-aware outputs with sub-second latency
Approach:

Hybrid RAG combining historical embeddings with live streams
Task-specialized models instead of one large general model
Aggressive caching + async pipelines

Outcome: ~85% latency reduction with more stable outputs
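The caching and async ideas in the approach above can be sketched roughly as follows. This is an illustrative toy, not the proprietary system: `TTLCache`, `search_historical`, `read_live`, and `build_context` are hypothetical names, the two fetch functions stand in for a vector-store lookup and a stream read, and the 60-second TTL is an arbitrary assumption.

```python
import asyncio
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]
        return None

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_s, value)


calls = {"historical": 0, "live": 0}  # counters that make cache hits visible


async def search_historical(query):  # hypothetical vector-store lookup
    calls["historical"] += 1
    await asyncio.sleep(0.01)
    return [f"historical context for {query}"]


async def read_live(query):  # hypothetical stream read
    calls["live"] += 1
    await asyncio.sleep(0.01)
    return [f"live tick for {query}"]


async def build_context(query, cache):
    hist = cache.get(query)
    if hist is None:
        # Cache miss: fetch both sources concurrently instead of one after the other.
        hist, live = await asyncio.gather(search_historical(query), read_live(query))
        cache.put(query, hist)
    else:
        live = await read_live(query)  # live data is never cached
    return hist + live


async def main():
    cache = TTLCache(ttl_s=60.0)
    first = await build_context("EURUSD", cache)
    second = await build_context("EURUSD", cache)
    return first, second


first, second = asyncio.run(main())
print(calls)
```

Only the slow-changing historical lookup is cached; the live read runs on every request, which keeps the output current while removing the most expensive call from the hot path.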

Multimodal RPA & Voice Systems

Context: Enterprise automation in noisy, changing environments
Problem: Traditional RPA breaks when UIs or workflows change
Approach:

Vision-Language models to interpret screens semantically
Speech pipelines with Whisper + VAD + diarization
Robust handling of partial or overlapping inputs

Outcome: Systems that adapt instead of failing on small changes

Research & Writing

I document the hard parts.

Applied Soft Computing (Under Review):
Co-author of a Dynamic Adaptive Large Neighborhood Search (DALNS) algorithm for probabilistic multi-objective routing problems.
Technical Writing:
Deep dives on topic modeling (BERTopic), object detection (YOLO), and applied ML systems.

I believe good research should survive contact with production.

About This GitHub

This profile contains:

Research experiments
System prototypes
Exploratory implementations
Iterative work (not just polished demos)

Some repositories are intentionally raw; they show how an idea evolved.

If you’re looking for flashy demos, this may disappoint.
If you care about engineering judgment, you’ll feel at home.

Open to discussions on architecture, trade-offs, and system behavior, not just prompt templates.
