xzf-thu

Pinned

  1. gpt-omni/mini-omni Public

    Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

    Python, 3.5k stars, 307 forks

  2. gpt-omni/mini-omni2 Public

    Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

    Python, 1.9k stars, 204 forks

  3. Audio-Reasoner Public

    The first Large Audio Language Model to enable native in-depth thinking, trained on large-scale audio Chain-of-Thought data.

    Python, 285 stars, 24 forks

  4. Mini-Omni-Reasoner Public

    Mini-Omni-Reasoner: a real-time speech reasoning framework that interleaves silent reasoning tokens with spoken response tokens (“thinking-in-speaking”), exploiting the LLM–audio throughput gap to …

    163 stars, 19 forks