
Ayushman Singh

Applied Research Engineer working on large-scale LLM inference and ML systems.

My work focuses on bridging research ideas and production systems, with an emphasis on GPU-level performance optimization and inference-time techniques.

Areas of focus

  • LLM inference and serving systems
  • GPU performance optimization (Triton / CUDA)
  • Quantization and speculative decoding
  • KV-cache optimization and batching strategies
  • Production GenAI infrastructure

Writing

I occasionally write about GPU architecture, inference optimization, and ML systems.

Contact

Pinned

  1. avalanche (Python), forked from ContinualAI/avalanche
     Avalanche: an End-to-End Library for Continual Learning.