dpdata is a Python package for manipulating atomistic data from computational science software.
If you use this software, please cite the following paper:
- Jinzhe Zeng, Xingliang Peng, Yong-Bin Zhuang, Haidi Wang, Fengbo Yuan, Duo Zhang, Renxi Liu, Yingze Wang, Ping Tuo, Yuzhi Zhang, Yixiao Chen, Yifan Li, Cao Thang Nguyen, Jiameng Huang, Anyang Peng, Marián Rynik, Wei-Hong Xu, Zezhong Zhang, Xu-Yuan Zhou, Tao Chen, Jiahao Fan, Wanrun Jiang, Bowen Li, Denan Li, Haoxi Li, Wenshuo Liang, Ruihao Liao, Liping Liu, Chenxing Luo, Logan Ward, Kaiwei Wan, Junjie Wang, Pan Xiang, Chengqian Zhang, Jinchao Zhang, Rui Zhou, Jia-Xin Zhu, Linfeng Zhang, Han Wang, dpdata: A Scalable Python Toolkit for Atomistic Machine Learning Data Sets, J. Chem. Inf. Model., 2025, DOI: 10.1021/acs.jcim.5c01767.
dpdata supports Python 3.8 and above. You can set up a conda or pip environment and then install dpdata using one of the following methods:
- Install via pip:
pip install dpdata
- Install via conda:
conda install -c conda-forge dpdata
- Install from source code:
git clone https://github.com/deepmodeling/dpdata && pip install ./dpdata
To check that the installation succeeded, run:
dpdata --version
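The same check can also be made from Python. The snippet below is a minimal sketch that uses only the standard library (`importlib.metadata`, available since Python 3.8); the helper name `installed_version` is hypothetical and not part of dpdata.

```python
from importlib import metadata


def installed_version(package="dpdata"):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("dpdata is not installed in this environment")
else:
    print(f"dpdata {version} is installed")
```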
dpdata aims to support different kinds of atomistic packages:
- Atomistic machine learning packages, such as DeePMD-kit;
- Molecular dynamics packages, such as LAMMPS and GROMACS;
- Quantum chemistry packages, such as VASP, Gaussian, and ABACUS;
- Atomistic visualization packages, such as 3Dmol.js;
- Other atomistic tools, such as ASE;
- Common formats, such as xyz.
All supported formats are listed here.
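The quickest way to convert a simple file from one format to another is to use the command line. For example, the following command reads a VASP OUTCAR file (input format vasp/outcar) and writes it in the deepmd/npy format to the deepmd_data directory: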
dpdata OUTCAR -i vasp/outcar -o deepmd/npy -O deepmd_data
For advanced usage with Python APIs, read the dpdata documentation.
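As a rough Python counterpart to the command-line conversion above, the sketch below loads an OUTCAR file and writes it in the deepmd/npy format. It assumes a file named `OUTCAR` exists in the working directory; the wrapper name `convert_outcar_to_deepmd` is hypothetical, while `dpdata.LabeledSystem` and its `to` method are the calls that do the work.

```python
def convert_outcar_to_deepmd(outcar_path="OUTCAR", out_dir="deepmd_data"):
    """Convert a VASP OUTCAR file to the deepmd/npy format.

    Hypothetical wrapper for illustration; the file and directory names are assumptions.
    """
    # Imported here so that defining the function does not require dpdata.
    import dpdata

    # A LabeledSystem stores energies and forces alongside coordinates and cells.
    system = dpdata.LabeledSystem(outcar_path, fmt="vasp/outcar")

    # Dump all frames as NumPy arrays under out_dir.
    system.to("deepmd/npy", out_dir)
    return system
```

Calling `convert_outcar_to_deepmd()` with the defaults mirrors the command shown earlier; pass other paths to convert a different file.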
- cp2kdata adds the latest CP2K support for dpdata.
To learn how to create your own plugin packages, read the dpdata documentation.