Research Seminar Series in Statistics and Mathematics

Location: Wirtschaftsuniversität Wien, Departments 4, D4.4.008
Date: 22 November 2019, starts at 09:00, ends at 10:30
Type: Lecture/Discussion
Language: English
Speaker: Keefe Murphy (School of Mathematics and Statistics, University College Dublin)
Organizer: Institute for Statistics and Mathematics
Contact: katrin.artner@wu.ac.at

Keefe Murphy (School of Mathematics and Statistics, University College Dublin) on "Infinite Mixtures of Infinite Factor Analysers"

The Institute for Statistics and Mathematics (Department of Finance, Accounting and Statistics) cordially invites everyone interested to attend the talks in our Research Seminar Series, where internationally renowned scholars from leading universities present and discuss their (working) papers.

No registration required.

The list of talks for the winter term 2019/20 is available via the following link: https://www.wu.ac.at/en/statmath/resseminar

Abstract:
Factor-analytic Gaussian mixtures are often employed as a model-based approach to clustering high-dimensional data. Typically, the numbers of clusters and latent factors must be fixed in advance of model fitting. The pair which optimises some model selection criterion is then chosen. For computational reasons, having the number of factors differ across clusters is rarely considered.
Here the infinite mixture of infinite factor analysers (IMIFA) model is introduced. IMIFA employs a Pitman-Yor process prior to facilitate automatic inference of the number of clusters using the stick-breaking construction and a slice sampler. Automatic inference of the cluster-specific numbers of factors is achieved using multiplicative gamma process shrinkage priors and an adaptive Gibbs sampler. IMIFA is presented as the flagship of a family of factor-analytic mixtures.
Applications to benchmark data, metabolomic spectral data, and a handwritten digit example illustrate the IMIFA model's advantageous features. These include obviating the need for model selection criteria, reducing the computational burden associated with the search of the model space, improving clustering performance by allowing cluster-specific numbers of factors, and quantifying uncertainty.
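
For readers unfamiliar with the two priors named in the abstract, the following is a minimal sketch, not the speaker's implementation. It assumes a finite truncation of the Pitman-Yor stick-breaking construction and the commonly cited form of the multiplicative gamma process, in which column precisions are cumulative products of gamma variables. The function names, the truncation level, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def pitman_yor_weights(alpha, discount, truncation, rng):
    """Truncated stick-breaking weights for a Pitman-Yor process (illustrative).

    Assumes 0 <= discount < 1 and alpha > -discount. With discount = 0 this
    reduces to the Dirichlet process stick-breaking construction.
    """
    k = np.arange(1, truncation + 1)
    # Stick proportions: v_k ~ Beta(1 - discount, alpha + k * discount)
    v = rng.beta(1.0 - discount, alpha + k * discount)
    # Stick length remaining before each break: prod_{j<k} (1 - v_j)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def mgp_column_precisions(a1, a2, n_factors, rng):
    """Column-wise precisions under a multiplicative gamma process (illustrative).

    Assumes delta_1 ~ Gamma(a1, 1), delta_h ~ Gamma(a2, 1) for h >= 2, and
    tau_h = delta_1 * ... * delta_h. When a2 > 1 the precisions tend to grow
    with h, so later loading columns are shrunk more strongly towards zero.
    """
    delta = np.concatenate((
        rng.gamma(a1, 1.0, size=1),
        rng.gamma(a2, 1.0, size=n_factors - 1),
    ))
    return np.cumprod(delta)


if __name__ == "__main__":
    rng = np.random.default_rng(2019)  # arbitrary seed
    weights = pitman_yor_weights(alpha=1.0, discount=0.25, truncation=50, rng=rng)
    tau = mgp_column_precisions(a1=2.0, a2=3.0, n_factors=10, rng=rng)
    print("mass captured by the first 50 sticks:", weights.sum())
    print("column precisions:", tau)
```

In this sketch the weights sum to slightly less than one because of the truncation; the slice sampler mentioned in the abstract is one way to avoid fixing a truncation level in advance, and it is not shown here.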


