Discussed in #28
Originally posted by jianc99 February 18, 2026
Feel free to try it on SGLang. In our tests, it consistently delivers a 2x speedup at concurrency levels 1 through 32 on math, code, and chat tasks. For more details, see https://huggingface.co/z-lab/gpt-oss-20b-DFlash
The DFlash draft models for Qwen3-Coder-Next and gpt-oss-120b are on the way.