Applied Research Engineer working on large-scale LLM inference and ML systems.
My work bridges research ideas and production systems, with an emphasis on GPU-level performance optimization and inference-time techniques. Areas of interest:
- LLM inference and serving systems
- GPU performance optimization (Triton / CUDA)
- Quantization and speculative decoding
- KV-cache optimization and batching strategies
- Production GenAI infrastructure
I occasionally write about GPU architecture, inference optimization, and ML systems.