Pinned
- gpt-omni/mini-omni (Public): An open-source multimodal large language model that can hear and talk while thinking. It features real-time, end-to-end speech input and streaming audio output for conversation.
- gpt-omni/mini-omni2 (Public): Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
- Audio-Reasoner (Public): The first large audio language model to enable native in-depth thinking, trained on large-scale audio chain-of-thought data.
- Mini-Omni-Reasoner (Public): A real-time speech reasoning framework that interleaves silent reasoning tokens with spoken response tokens (“thinking-in-speaking”), exploiting the LLM–audio throughput gap to …
