Organizations

@GGUFloader @local-ai-zone

hussainnazary2/README.md

I work on LLM inference at the engine and runtime level, focusing on performance, memory efficiency, and predictable behavior in production environments.

My experience includes optimizing inference across CPU and GPU backends, with hands-on use of CUDA, cuBLAS, cuBLASLt, and custom CUDA kernels for transformer workloads. I focus on practical improvements such as quantization-aware execution, efficient KV-cache management, memory allocation strategies, and optimized execution paths tailored to specific model architectures and hardware constraints.

I build and adapt local, cloud-independent inference systems, customizing runtimes for different model families and deployment requirements rather than relying on fixed abstractions. The goal is stable, efficient inference that makes full use of available hardware under real operational conditions.

Pinned

  1. GGUFloader/gguf-loader Public

    Run ChatGPT OSS and Grok 2 locally: GGUF Loader with a floating button for AI models | Open Source & Offline

    Python 20 8

  2. local-ai-zone/local-ai-zone.github.io Public

    Discover the Best AI Models for Your PC

    HTML 18 9

  3. LLM-Toolkit Public

    Python 1

  4. GGUFloader/Mobile-AI-Assistant Public

    A lightweight, mobile-optimized AI assistant.

    Kotlin 1

  5. GPT-Calendar/smart-calendar Public

    Smart Calendar – An AI-powered Android assistant combining voice commands, finance tracking, location-based reminders, and task management. Features "Kiro" wake word detection, SMS transaction pars…

    Kotlin