A repository of links to publicly available dialog datasets. Feel free to open issues or send pull requests.
Please see the gh-pages website to view tables of all our gathered corpora: https://breakend.github.io/DialogDatasets
This repository accompanies our paper: https://arxiv.org/abs/1512.05742
If you find the repository helpful, please cite the paper:
@ARTICLE{2015serbansurvey,
  author = {{Vlad Serban}, I. and {Lowe}, R. and {Henderson}, P. and {Charlin}, L. and {Pineau}, J.},
  title = "{A Survey of Available Corpora for Building Data-Driven Dialogue Systems}",
  journal = {ArXiv e-prints},
  archivePrefix = "arXiv",
  eprint = {1512.05742},
  primaryClass = "cs.CL",
  keywords = {Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Learning, Statistics - Machine Learning, 68T01, 68T05, 68T35, 68T50, I.2.6, I.2.7, I.2.1},
  year = 2015,
  month = dec
}