dpGMM is an R package for dynamic-programming-based Gaussian mixture modelling (GMM) clustering of 1D and 2D data.
Package functionality:
- Variety of criteria for selecting the number of components (AIC, AICc, BIC, ICL-BIC and LR),
- Possibility of merging components with a small standard deviation that lie within each other,
- Control of the minimum allowed variance of a component to avoid narrow peaks,
- Optional quick stop (on/off) when the LR test shows no improvement, with a controllable significance level,
- Analysis of single measurements (a vector) as well as binned data,
- Distribution plot with selected components in ggplot,
- QQ-plot of fitted distribution and standard normal distribution,
- Gaussian Mixture Modeling for 2D data.
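To make the model-selection criteria in the list above more concrete, here is a minimal base R sketch of how AIC, AICc and BIC are commonly computed for a 1D Gaussian mixture. It is an illustration only, not the package's internal code: the helper names `mixture_loglik` and `info_criteria` are hypothetical, and the parameter count of 3K - 1 (K means, K standard deviations, K - 1 free weights) is an assumption about the usual heteroscedastic 1D model. ICL-BIC is left out because it additionally needs the posterior membership probabilities.

```r
# Log-likelihood of a 1D Gaussian mixture with weights w, means mu and
# standard deviations sigma (hypothetical helper, not part of dpGMM).
mixture_loglik <- function(x, w, mu, sigma) {
  dens <- numeric(length(x))
  for (k in seq_along(w)) {
    dens <- dens + w[k] * dnorm(x, mean = mu[k], sd = sigma[k])
  }
  sum(log(dens))
}

# Standard information criteria for a K-component 1D mixture.
# Assumed parameter count: K means + K sds + (K - 1) free weights.
info_criteria <- function(loglik, n_comp, n_obs) {
  p <- 3 * n_comp - 1
  aic <- -2 * loglik + 2 * p
  aicc <- aic + (2 * p * (p + 1)) / (n_obs - p - 1)
  bic <- -2 * loglik + p * log(n_obs)
  c(AIC = aic, AICc = aicc, BIC = bic)
}

# Simulated two-component data, used only for illustration.
set.seed(1)
x <- c(rnorm(150, mean = 0, sd = 1), rnorm(100, mean = 4, sd = 0.7))

# One component, using the sample mean and the maximum likelihood sd.
ll1 <- mixture_loglik(x, w = 1, mu = mean(x),
                      sigma = sqrt(mean((x - mean(x))^2)))

# Two components, evaluated at the generating parameters (not a fitted model).
ll2 <- mixture_loglik(x, w = c(0.6, 0.4), mu = c(0, 4), sigma = c(1, 0.7))

rbind(K1 = info_criteria(ll1, n_comp = 1, n_obs = length(x)),
      K2 = info_criteria(ll2, n_comp = 2, n_obs = length(x)))
```

For each criterion, lower values indicate a better trade-off between fit and model complexity, so the component number with the smallest value would be preferred under that criterion.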
You can install the package from GitHub with:
# install.packages("devtools")
devtools::install_github("ZAEDPolSl/dpGMM")
After installation, load the library into R and load the example data.
library(dpGMM)
data(example)
Next, load the default GMM control parameters for 1D data and set the maximum number of iterations to 1000 (just to speed things up).
custom.settings <- GMM_1D_opts
custom.settings$max_iter <- 1000
To run GMM on a 1D data vector, use the following code.
mix_test <- runGMM(example$Dist, opts = custom.settings)
If you use this package in your research, please cite the following papers:
[1] Zyla, J., Szumala, K., Polanski, A., Polanska, J., & Marczyk, M. (2026). dpGMM: A new R package for efficient and robust Gaussian mixture modeling of 1D and 2D data. Journal of Computational Science, 95, 102811.
[2] Polanski, A., Marczyk, M., Pietrowska, M., Widlak, P., & Polanska, J. (2018). Initializing the EM algorithm for univariate Gaussian, multi-component, heteroscedastic mixture models by dynamic programming partitions. International Journal of Computational Methods, 15(03), 1850012.
