A curated list of resources, papers, code, and tutorials for Diffusion Language Models (DLMs), covering research progress, implementations, benchmarks, and applications.
What are Diffusion Language Models?
Xiaochen Zhu
[Website]
14 April 2025
Strengths and limitations of diffusion language models
Sean Goedecke
[Website]
22 May 2025
Mercury [Website] [Technical Report]
Gemini Diffusion [Website] [Blog]
LLaDA-8B [Website] [Model] [Code]
Dream-7B [Website] [Model] [Code]
Discrete Diffusion in Large Language and Multimodal Models: A Survey
Runpeng Yu, Qi Li, Xinchao Wang
Arxiv 2025 (Jun 16, 2025). [Paper]
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, Surya Ganguli
ICML 2015. [Paper] [Code]
Denoising Diffusion Probabilistic Models
Jonathan Ho, Ajay N. Jain, Pieter Abbeel
NeurIPS 2020. [Paper] [Code (official)] [Code (PyTorch)]
Structured Denoising Diffusion Models in Discrete State-Spaces
Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, Rianne van den Berg
NeurIPS 2021. [Paper] [Code]
Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions
Emiel Hoogeboom, Didrik Nielsen, Priyank Jaini, Patrick Forré, Max Welling
NeurIPS 2021. [Paper] [Code]
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, Hao Peng, Lingpeng Kong
ICLR 2025. [Paper] [Code]
Large Language Diffusion Models
Shen Nie, Fengqi Zhu, Zebin You, Xiaolu Zhang, Jingyang Ou, Jun Hu, Jun Zhou, Yankai Lin, Ji-Rong Wen, Chongxuan Li
Arxiv 2025. [Paper] [Code] [Model]
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
Fengqi Zhu, Rongzhen Wang, Shen Nie, Xiaolu Zhang, Chunwei Wu, Jun Hu, Jun Zhou, Jianfei Chen, Yankai Lin, Ji-Rong Wen, Chongxuan Li
Arxiv 2025 (May 25, 2025). [Paper] [Code]
A Continuous Time Framework for Discrete Denoising Models
Andrew Campbell, Joe Benton, Valentin De Bortoli, Thomas Rainforth, George Deligiannidis, Arnaud Doucet
NeurIPS 2022. [Paper] [Code]
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
Aaron Lou, Chenlin Meng, Stefano Ermon
ICML 2024. [Paper] [Code] [Model]
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao, Devaansh Gupta, Qinqing Zheng, Aditya Grover
Arxiv 2025. [Paper] [Code]
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong
NeurIPS 2024. [Paper] [Code]
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Zemin Huang, Zhiyang Chen, Zijun Wang, Tiancheng Li, Guo-Jun Qi
Arxiv 2025 (May 21, 2025). [Paper]
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang, Ye Tian, Bowen Li, Xinchen Zhang, Ke Shen, Yunhai Tong, Mengdi Wang
Arxiv 2025 (May 21, 2025). [Paper] [Code]
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
Zebin You, Shen Nie, Xiaolu Zhang, Jun Hu, Jun Zhou, Zhiwu Lu, Ji-Rong Wen, Chongxuan Li
Arxiv 2025 (May 22, 2025). [Paper] [Code]
LaViDa: A Large Diffusion Language Model for Multimodal Understanding
Shufan Li, Konstantinos Kallidromitis, Hritik Bansal, Akash Gokul, Yusuke Kato, Kazuki Kozuka, Jason Kuen, Zhe Lin, Kai-Wei Chang, Aditya Grover
Arxiv 2025 (May 23, 2025). [Paper] [Code]
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin T. Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
ICLR 2025. [Paper] [Code] [Model]
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
Chengyue Wu, Hao Zhang, Shuchen Xue, Zhijian Liu, Shizhe Diao, Ligeng Zhu, Ping Luo, Song Han, Enze Xie
Arxiv 2025. [Paper] [Code]
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Shansan Gong, Ruixiang Zhang, Huangjie Zheng, Jiatao Gu, Navdeep Jaitly, Lingpeng Kong, Yizhe Zhang
Arxiv 2025. [Paper] [Code]