zhinianqin

Popular repositories

  1. flash-attention-v100 Public

    Forked from vllm-project/flash-attention

    Python 46 12

  2. marlin_v100 Public

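    marlin_v100 is a minimal standalone Marlin development workspace extracted from the vLLM main tree, focused on source development, minimal builds, and lightweight validation for Marlin dense and Marlin MoE. It keeps the core CUDA/C++ implementation, a thin minimal Python wrapper, generator tests, and a mapping for writing changes back to the main tree, which makes it suitable for quickly refactoring and validating Marlin-related changes without interference from the main tree's full build.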

    C++ 8

  3. v100llm Public

    Forked from zh-nj/v100llm

    Python 4 1

  4. fastllm Public

    Forked from ztxz16/fastllm

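    fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with more than 10 GB of memory can run the full-size DeepSeek model. A dual-socket 9004/9005 server with a single GPU can serve the original full-size, full-precision DeepSeek model at 20 tps with a single concurrent request; the INT4-quantized model reaches 30 tps with a single concurrent request and 60+ tps with multiple concurrent requests.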

    C++

  5. DoodleJump_rg350 Public

    Doodle Jump for rg350

    C++

  6. 1CatV2-ai_bondFA Public

    Forked from haohervchb/GooseLLM

    1CatV2 with ai bond FA-v100 vibed in by gpt-5.4

    Python