Shrinking and Regularizing Finite Mixture Models
Leader: Sylvia Frühwirth-Schnatter
Scientific staff: Gertraud Malsiner-Walli
Data are often suspected to contain groups of observations with different characteristics, but the group memberships are either unavailable or unobservable. Such a situation calls for a statistical method that explicitly accounts for these latent groups and that aims to determine both the group sizes and the group characteristics. The standard model-based tool for this problem is the finite mixture model.
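For concreteness, a finite mixture model is commonly written as a weighted sum of component densities. The notation below (K for the number of groups, \eta_k for the group sizes, \theta_k for the group-specific parameters, f for the component density) is introduced here for illustration and does not come from the project description itself.

```latex
p(y \mid \theta_1, \dots, \theta_K, \eta)
  = \sum_{k=1}^{K} \eta_k \, f(y \mid \theta_k),
\qquad \eta_k \ge 0, \quad \sum_{k=1}^{K} \eta_k = 1 .
```

In this form the weights \eta_k play the role of the group sizes and the parameters \theta_k describe the group characteristics.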
Finite mixture models have been in use for more than 100 years and are a flexible, generally applicable statistical tool for which many extensions and variations have been proposed. Some problems nevertheless remain unresolved: selecting the variables that drive the group structure, and choosing a model that avoids overfitting the heterogeneity so that results stay easy to interpret and parameters are estimated precisely.
This research project aims to improve the application of finite mixture models by providing tools based on shrinkage and regularization. These tools are intended to select a suitable model in which relevant and irrelevant variables are distinguished automatically and the parameters are chosen so as to avoid overfitting heterogeneity. Theoretical results will be complemented by applications and by software implementations as an add-on package for the open-source software R, an environment for statistical computing and graphics (http://www.R-project.org).
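As a rough illustration of how shrinkage can guard against overfitting the number of groups, the sketch below compares symmetric Dirichlet priors on the mixture weights with a small and a large concentration parameter. It is an assumption-laden toy example, not the project's method or its R package: the function names, the threshold of 0.01 and the choice of 10 components are all hypothetical, and Python is used only for self-containment.

```python
import random


def sample_dirichlet(concentration, n_components, rng):
    """Draw one weight vector from a symmetric Dirichlet distribution."""
    while True:
        gammas = [rng.gammavariate(concentration, 1.0) for _ in range(n_components)]
        total = sum(gammas)
        if total > 0.0:  # redraw in the unlikely case that every draw underflows to 0
            return [g / total for g in gammas]


def mean_active_components(concentration, n_components=10, n_draws=2000,
                           threshold=0.01, seed=1):
    """Average number of components whose prior weight exceeds `threshold`."""
    rng = random.Random(seed)
    active = 0
    for _ in range(n_draws):
        weights = sample_dirichlet(concentration, n_components, rng)
        active += sum(1 for w in weights if w > threshold)
    return active / n_draws


# Hypothetical settings: a sparse prior versus a diffuse prior on the weights.
sparse = mean_active_components(0.01)
diffuse = mean_active_components(10.0)
print(f"concentration 0.01: {sparse:.2f} active components on average")
print(f"concentration 10.0: {diffuse:.2f} active components on average")
```

Under the small concentration most of the ten components receive negligible weight a priori, so superfluous groups are shrunk towards empty; under the large concentration all components stay in play.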
Improved statistical methods combined with software implementations allow for better analysis and increased understanding of data in empirical quantitative research. Because finite mixture models are widely applicable, for example in astronomy, biology, economics, marketing, medicine and psychology, the results of this research project are expected to have an impact on other areas of research as well, by providing improved insights into latent group structures present in the data.
Malsiner-Walli G., Frühwirth-Schnatter S., Grün B. Identifying mixtures of mixtures using Bayesian estimation, in: Journal of Computational and Graphical Statistics, 2016.
Malsiner-Walli G., Frühwirth-Schnatter S., Grün B. Model-based clustering based on sparse finite Gaussian mixtures, in: Statistics and Computing, Volume 26, Page(s) 303-324, 2016.
Grün B., Malsiner-Walli G. Discussion of "How to Find an Appropriate Clustering for Mixed-Type Variables with Application to Socio-Economic Stratification" by Hennig and Liao, in: Journal of the Royal Statistical Society: Series C (Applied Statistics), Volume 62, Number 3, Page(s) 350-351, 2013.
FWF project number: P28740
Duration: 01.11.2016 - 31.10.2019