Abstracts Research Seminar Summer Term 2023
Multivariate Sparse Clustering for Extremes
Studying the tail dependence of multivariate extremes is a major challenge in extreme value analysis. Under a regular variation assumption, the dependence structure of the positive extremes is characterized by a measure, the spectral measure, defined on the positive orthant of the unit sphere. This measure gathers information on the localization of large events and often has sparse support, since such events do not simultaneously occur in all directions. However, it is defined via weak convergence, which does not provide a natural way to capture this sparsity structure. In this talk, we introduce the notion of sparse regular variation, which makes it possible to better learn the tail structure of a random vector X. We use this concept in a statistical framework and provide a procedure that captures clusters of extremal coordinates of X. This approach also includes the identification of a threshold above which the values taken by X are considered extreme. It leads to an efficient algorithm called MUSCLE, which we illustrate in numerical experiments. We end our presentation with an application to extreme variability for financial and meteorological data.
Joint work with Nicolas Meyer (IMAG, University of Montpellier).
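As a rough illustration of the underlying idea (a simplified sketch, not the actual MUSCLE procedure; the function names and the choice of the L1 norm are illustrative), one can rescale the most extreme observations by a radial threshold, project them onto the unit simplex — a projection that naturally produces sparse vectors — and group them by the support of the projection:

```python
import numpy as np
from collections import Counter

def project_simplex(v):
    """Euclidean projection onto the unit simplex (Duchi et al., 2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def extremal_support_clusters(X, k):
    """Group the k most extreme observations (in L1 norm) by the support
    of their simplex-projected rescaled values."""
    norms = np.linalg.norm(X, ord=1, axis=1)
    t = np.sort(norms)[-k]                      # radial threshold
    extremes = X[norms >= t] / t                # rescaled extreme points
    supports = [tuple(np.nonzero(project_simplex(v))[0]) for v in extremes]
    return Counter(supports)                    # clusters of extremal coordinates
```

Because the projection sets small coordinates exactly to zero, the resulting support sets directly encode which groups of coordinates tend to be large together.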
Approximating PDEs With Wide Neural Networks
Neural networks are a rich family of function approximators, which perform particularly well in high dimension. In this talk, we will consider training neural networks as approximators of the solutions of PDEs, using the 'deep Galerkin' and 'Q-PDE' algorithms. We will show conditions under which, in the limiting regime where the neural network becomes infinitely wide, the approximator can be proven to converge to the true Sobolev solution of the PDE, and we will present numerical examples.
Consistent Tests of Independence via Rank Statistics
A number of modern applications call for statistical tests that are able to consistently detect non-linear dependencies between a pair of random variables based on a random sample drawn from the pair's joint distribution. This has led to renewed interest in the classical problem of designing measures of correlation. When the considered random variables are continuous, it is appealing to define correlations on the basis of the ranks of the data points as the resulting tests become distribution-free. In this talk, I will first review recent progress on rank correlations that yield consistent tests. In a second part, I will turn to the problem of detecting dependence between random vectors and discuss how to construct consistent and distribution-free tests with the help of a recently introduced approach to define multivariate ranks using optimal transport.
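As a concrete example of a rank correlation yielding a consistent test, the following sketch implements Chatterjee's rank correlation coefficient (assuming a continuous sample without ties); it is close to zero under independence and close to one when one variable is a noiseless measurable function of the other:

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation (no-ties version): ~0 under
    independence, ~1 when y is a measurable function of x."""
    n = len(x)
    order = np.argsort(x)                        # sort the pairs by x
    r = np.argsort(np.argsort(y[order])) + 1     # ranks of y in that order
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n**2 - 1)
```

Since the statistic depends on the data only through ranks, its null distribution does not depend on the (continuous) marginals: under independence, sqrt(n) times the statistic is asymptotically N(0, 2/5), which makes the resulting test distribution-free.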
Johanna G. Nešlehová:
Limiting Behaviour of Maxima under Dependence
When working with large insurance portfolios, say, the assumption of independence between claims may no longer be appropriate. As a consequence, classical limiting theory of maxima of iid variables may no longer apply. In this presentation, I will explore the weak limits of maxima of identically distributed random variables which are neither independent nor form a locally dependent time series and derive extensions of the Fisher–Tippett–Gnedenko theorem. It turns out that the possible weak limits of suitably scaled maxima are no longer extreme-value distributions in general, but an asymptotic theory can nonetheless be developed and is driven by the properties of the diagonal of the underlying copula. I will further derive results on uniform convergence and present various illustrative examples. This talk is based on joint work with Klaus Herrmann and Marius Hofert.
Alejandra Avalos Pacheco:
Integrative Large-Scale Bayesian Learning: From Factor Analysis to Graphical Models
Data integration is crucial when separate data sources bear on the same phenomenon. Integrative models provide gains in statistical power and help make accurate decisions sooner. However, the lack of proper integration tools can lead to unreliable and misleading inference.
In this talk I will present novel methods to integrate continuous and binary heterogeneous data: sparse factor regression (FR), multi-study factor regression (MSFR) and multiple Ising graphical (MIG) models. The FR model provides a tool for continuous data exploration via dimensionality reduction and sparse low-rank covariance estimation, while correcting for a range of systematic biases. MSFR models extend FR to jointly identify and estimate the group-specific covariances in addition to the common components. The MIG model studies the heterogeneity induced in a set of binary variables by external factors and provides the embedded network structures of distinct groups.
I will discuss the use of several priors that lead to more interpretable models, such as sparse priors (local and non-local), which learn the dimension of the latent factors; and Markov Random Field priors, which enable the borrowing of strength across different groups.
Finally, I will show the usefulness of our methods by providing a visual representation of the data and by answering data-specific questions, such as providing survival predictions for different cancer patients, associating cardio-metabolic disease risks with the dietary patterns of distinct Latino populations, or analysing the confidence in political institutions across different web-engagement segments.
I will highlight the benefits of our models compared to other techniques, not only in recovering the data signals more accurately but also in better prediction, and I will provide computationally efficient tools for inference.
Invariance and Causality in Transformation Models: Causal Feature Selection and Robust Prediction
Discovering causal relationships from observational data is a fundamental yet challenging task. For some applications, it may suffice to learn the causal drivers of a given response variable instead of the entire causal graph. Invariant causal prediction (ICP) is a method for causal feature selection which requires data from heterogeneous settings. ICP assumes that the mechanism of the response is the same in all settings and exploits invariance of the conditional distribution of the response given its parents across those settings. The original formulation of ICP for linear models has been extended to general independent additive noise models and to nonparametric settings using conditional independence testing. However, additive noise models are not suitable for applications in which the response is not measured on a continuous scale, but rather reflects categories or counts, while nonparametric conditional independence testing often suffers from low power. To bridge this gap, we develop ICP for continuous, categorical, count-type, and uninformatively censored responses in parametric transformation models. We propose procedures for testing invariance based on score residuals, establish coverage guarantees and empirically show gains in power over nonparametric alternatives when the model is correctly specified. Our proposed method is implemented in the R package 'tramicp', which we demonstrate on data from a randomized controlled trial. Lastly, we explore how invariance can aid in prediction under distribution shift when some assumptions of ICP, such as causal sufficiency, are violated.
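For intuition, here is a toy sketch of invariant causal prediction in its original linear form (not the transformation-model version of the talk); the invariance checks via one-way ANOVA and Levene's test are simple stand-ins for the score-residual-based tests, and all names are illustrative:

```python
import itertools
import numpy as np
from scipy import stats

def linear_icp(X, y, env, alpha=0.05):
    """Toy linear ICP: accept every subset S of predictors whose pooled
    least-squares residuals look identically distributed across environments,
    then return the intersection of accepted subsets (a conservative
    estimate of the causal parents of y)."""
    d = X.shape[1]
    accepted = []
    subsets = itertools.chain.from_iterable(
        itertools.combinations(range(d), k) for k in range(d + 1))
    for S in subsets:
        Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in S])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        res = y - Z @ beta
        groups = [res[env == e] for e in np.unique(env)]
        # crude invariance check: equal residual means and variances
        if min(stats.f_oneway(*groups).pvalue,
               stats.levene(*groups).pvalue) > alpha / 2:
            accepted.append(set(S))
    return set.intersection(*accepted) if accepted else set()
```

Subsets whose residual distribution shifts with the environment are rejected; the intersection of the surviving subsets is, with high probability, contained in the true parent set.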
Parametric Statistical Inference for High-Dimensional Diffusions
This talk is dedicated to the problem of parametric estimation in the diffusion setting and mostly concentrates on properties of the Lasso estimator of the drift component. More specifically, we consider a multivariate parametric diffusion model X observed continuously over the interval [0, T] and investigate drift estimation under sparsity constraints. We allow the dimensions of the model and the parameter space to be large. We obtain an oracle inequality for the Lasso estimator and derive an error bound for the L2-distance using concentration inequalities for linear functionals of diffusion processes. The probabilistic part is based upon elements of empirical process theory and, in particular, on the chaining method. Some alternative estimation procedures, such as the adaptive Lasso and Slope, will also be discussed to give a perspective on improving the obtained results.
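A minimal simulated illustration (a sketch assuming an Euler–Maruyama discretization of an Ornstein–Uhlenbeck-type model with a sparse drift matrix; all parameter choices are illustrative, and the continuous-observation setting of the talk is replaced by a fine discrete grid): the drift row acting on the first coordinate can be recovered by a Lasso regression of rescaled increments on the observed state.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# simulate dX_t = A X_t dt + dW_t with a sparse drift matrix A
d, n, dt = 10, 20000, 0.01
A = np.zeros((d, d))
np.fill_diagonal(A, -1.0)
A[0, 3] = 0.5                       # the only off-diagonal drift entry
X = np.zeros(d)
path = np.empty((n, d))
for i in range(n):
    path[i] = X
    X = X + A @ X * dt + np.sqrt(dt) * rng.standard_normal(d)

# Lasso for the first drift row: regress rescaled increments of X^1 on the state
y = np.diff(path[:, 0]) / dt        # (X^1_{t+dt} - X^1_t) / dt
fit = Lasso(alpha=0.05, fit_intercept=False).fit(path[:-1], y)
```

The fitted coefficient vector is sparse: the entries corresponding to coordinates 0 and 3 are clearly non-zero, while the remaining coordinates are shrunk to (near) zero.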
Statistical Aspects of High-Dimensional Data
High-dimensional data appear naturally in many applications and are an increasingly active research field. We review recent developments in high-dimensional mean estimation and discuss optimal estimation procedures in the context of regression and convex stochastic optimisation problems. We then focus on the nonparametric estimation of probability distributions and discuss several results on the interplay between the error measured in Wasserstein distance and the behaviour of extremal singular values of certain random matrices.
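As a small numerical illustration of the Wasserstein error of empirical measures (a sketch under illustrative choices, unrelated to the specific results of the talk): in one dimension the Wasserstein-1 distance between two empirical measures with the same number of atoms is attained by sorting both samples, and its typical size shrinks as the samples grow.

```python
import numpy as np

def w1_empirical(x, y):
    """Wasserstein-1 distance between two empirical measures with the same
    number of atoms: in one dimension the optimal coupling sorts both samples."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

rng = np.random.default_rng(1)
# average W1 distance between two independent standard normal samples
errs = {n: np.mean([w1_empirical(rng.standard_normal(n), rng.standard_normal(n))
                    for _ in range(50)])
        for n in (100, 10000)}
```

The average distance at n = 10000 is an order of magnitude smaller than at n = 100, consistent with the roughly root-n decay expected in one dimension.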
Covariance Estimation for Random Surfaces Beyond Separability
Non-parametric covariance estimation lies at the heart of functional data analysis. When working with random surfaces/fields, considerations of statistical and computational efficiency often compel the use of "separability" of the covariance. This talk explores two efficient alternatives to this convenient yet restrictive assumption. Firstly, we study a setting where the covariance structure may fail to be separable locally, either due to noise contamination or due to the presence of a non-separable short-range dependent signal component. For such a covariance structure, we introduce non-parametric estimators hinging on shifted partial tracing, a novel concept enjoying strong denoising properties. Secondly, we propose a distinctive decomposition of the covariance, which exposes separability as an unconventional form of low-rankness. Allowing for a higher "rank" suggests a structured class in which any covariance can be approximated to arbitrary precision. An abstraction of the power iteration method to general Hilbert spaces allows one to estimate the aforementioned decomposition on the level of data. Truncation and retention of the leading terms automatically induces a non-parametric estimator of the covariance, whose parsimony is dictated by the truncation level governing the bias-variance trade-off.
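The "separability as low-rankness" viewpoint can be illustrated in finite dimensions (a sketch; the rearrangement below is the classical Van Loan–Pitsianis device, with power iteration extracting the leading separable term):

```python
import numpy as np

def leading_separable_term(C, d1, d2, iters=50):
    """Extract the leading separable term A (x) B of a covariance C on a
    d1-by-d2 grid via power iteration on its Van Loan--Pitsianis
    rearrangement; C is separable iff this single term reproduces it."""
    # rearrange C[(i,j),(k,l)] into R[(i,k),(j,l)]: R is rank one
    # exactly when C = A (x) B
    R = C.reshape(d1, d2, d1, d2).transpose(0, 2, 1, 3).reshape(d1**2, d2**2)
    b = np.ones(d2**2) / d2
    for _ in range(iters):             # power iteration for the top singular pair
        a = R @ b
        a /= np.linalg.norm(a)
        b = R.T @ a
    sigma = np.linalg.norm(b)
    return (sigma * a).reshape(d1, d1), (b / sigma).reshape(d2, d2)
```

Retaining further singular pairs of the rearranged operator yields the higher-"rank" approximations mentioned in the abstract; the power iteration step is exactly what generalizes to the Hilbert-space setting.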
Sample Size Planning for Latent Variable Models: Classical Approaches and Recent Developments
Planning an empirical study in the social sciences involves an estimation of the sample size that is necessary to detect a particular effect. In this talk, I discuss methods to address the problem of sample size estimation for latent variable models that are widely used in psychology and related fields. An initial approach involves classical methods of power analysis, which allow an analytical estimation of sample size. A drawback of these approaches is that they are computationally expensive under some realistic scenarios. As an alternative, a second approach is presented that uses machine learning models to estimate sample size. This second approach was found to lead to promising results under realistic conditions in a variety of simulation studies. All of the presented approaches are available in R packages.
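The simulation-based end of this spectrum can be sketched in a few lines (illustrative settings with a plain two-sample t-test rather than a latent variable model): estimate power by Monte Carlo at each candidate sample size and return the smallest one reaching the target.

```python
import numpy as np
from scipy import stats

def required_n(effect, power=0.8, alpha=0.05, reps=400):
    """Smallest per-group sample size on a coarse grid whose simulated
    power for a two-sample t-test reaches the target."""
    rng = np.random.default_rng(0)
    for n in range(10, 200, 10):
        rejections = sum(
            stats.ttest_ind(rng.standard_normal(n) + effect,
                            rng.standard_normal(n)).pvalue < alpha
            for _ in range(reps))
        if rejections / reps >= power:   # estimated power at this n
            return n
    return None
```

For a medium effect size of 0.5, classical analytical power analysis gives roughly 64 observations per group at 80% power, and the simulated search lands near that value; for latent variable models, where no closed-form answer exists, the same simulate-and-search template applies with a model fit in place of the t-test.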
The Role of Correlation in Diffusion Control Ranking Games
In this talk we analyze a stochastic differential game with two players, where each player can control the diffusion intensity of an individual dynamic state process. We suppose that the Brownian motions, driving the players' state equations, are correlated. At a deterministic finite time horizon the players receive a reward depending on their state processes' difference. As long as the correlation of the Brownian motions does not exceed a certain bound there exists a saddle point and the game has a value. For correlations exceeding this bound, however, the game does not have a value. To overcome this issue we introduce the notion of relaxed controls that may be regarded as mixed controls in the differential game setting. We prove that in this class of controls there exists a saddle point and the game has a value for all correlations. The talk is based on joint work with Nabil Kazi-Tani and Julian Wendt.
A Hybrid Random Forest Approach for Modeling and Prediction of International Football Matches
Many approaches that analyze and predict the results of international matches in football/soccer are based on statistical models incorporating several potentially influential features with respect to a national team's sportive success, such as the bookmakers' ratings or the FIFA ranking. Based on all matches from the four previous FIFA World Cups 2002–2014, we compare the most common regression models that are based on the teams' feature information with regard to their predictive performance. Furthermore, an alternative modeling class is investigated, so-called random forests, which can be seen as a mixture of machine learning and statistical modeling and are known for their high predictive power.
Within the framework of Generalized Linear Models (GLMs), the most frequently used type of regression model in the literature is the Poisson model. It can easily be combined with different regularization methods such as penalization or boosting.
Our main focus, however, is on the incorporation of so-called hybrid predictors, i.e. features which were obtained by a separate statistical model. We are particularly interested in how those can improve the predictive performance of the models.
For these different modeling techniques, the predictive performance with regard to several goodness-of-fit measures is compared. Based on the estimates of the best-performing method, all match outcomes of the FIFA World Cup 2018 in Russia are repeatedly simulated (1,000,000 times), resulting in winning probabilities for all participating national teams.
Finally, we briefly sketch how we have progressed in this research line since the FIFA World Cup 2018.
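The tournament simulation step can be sketched as follows (a toy four-team knockout with hypothetical Poisson scoring rates, not the models or data of the talk):

```python
import numpy as np

rng = np.random.default_rng(7)
strength = {"A": 1.8, "B": 1.4, "C": 1.1, "D": 0.9}  # hypothetical scoring rates

def simulate_match(s1, s2):
    """Draw a score from independent Poissons; break ties with a coin flip
    (a crude stand-in for extra time and penalty shoot-outs)."""
    g1, g2 = rng.poisson(s1), rng.poisson(s2)
    return g1 > g2 or (g1 == g2 and rng.random() < 0.5)

def knockout_winner(order):
    """Semifinals (1st vs 4th, 2nd vs 3rd) followed by a final."""
    w1 = order[0] if simulate_match(strength[order[0]], strength[order[3]]) else order[3]
    w2 = order[1] if simulate_match(strength[order[1]], strength[order[2]]) else order[2]
    return w1 if simulate_match(strength[w1], strength[w2]) else w2

sims = 10_000
wins = {t: 0 for t in strength}
for _ in range(sims):
    wins[knockout_winner(["A", "B", "C", "D"])] += 1
probs = {t: w / sims for t, w in wins.items()}   # tournament winning probabilities
```

Replacing the hand-picked rates by fitted Poisson (or random forest) predictions per pairing, and the four-team bracket by the full World Cup schedule, yields winning probabilities of the kind reported in the abstract.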
Markovian Transition Semigroups Under Model Uncertainty
When considering stochastic processes for the modelling of real-world phenomena, a major issue is so-called model uncertainty or epistemic uncertainty, which refers to the impossibility of perfectly capturing information about the future in a single stochastic framework. In a dynamic setting, this leads to the task of constructing consistent families of nonlinear transition semigroups. In this talk, we give an overview of our results on semigroups of convex monotone operators on spaces of continuous functions. As an application, we discuss different types of perturbations of Markov processes. Moreover, we show that LLN and CLT type results for convex expectations can be systematically obtained by the so-called Chernoff approximation. The talk is based on joint work with Jonas Blessing, Robert Denk, Max Nendel and Sven Schweizer.
Tobias Fissler: Spotlights on the Theory of Elicitability
After a general introduction into the role of loss functions in statistical learning and forecast evaluation following Fissler, Lorentzen, Mayer (2022), I will recall and develop some necessary and sufficient conditions for elicitability, which is the crucial condition of consistent M-estimation (Dimitriadis, Fissler, Ziegel, 2023). In particular, I will talk about the necessity of the convex level set property and its failure of being sufficient, which is exemplified by the negative results on the mode (Heinrich-Mertsching and Fissler, 2021). Moreover, I will talk about mitigation strategies for linear combinations of so-called Bayes risks, exemplified in the paper on Range Value at Risk (Fissler and Ziegel, 2021). Finally, I will consider extensions of the theory to set-valued functionals, elaborated in Fissler, Frongillo, Hlavinova and Rudloff (2021).
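As a small worked example of consistent M-estimation with a strictly consistent scoring function (an illustrative sketch, not taken from the cited papers): the pinball loss elicits the α-quantile, so minimizing its empirical average over forecasts recovers the empirical quantile.

```python
import numpy as np

def pinball(x, y, alpha):
    """Pinball (quantile) loss: a strictly consistent scoring function
    for the alpha-quantile of y, evaluated at forecast x."""
    return np.mean((np.where(y <= x, 1.0, 0.0) - alpha) * (x - y))

rng = np.random.default_rng(3)
y = rng.standard_normal(100_000)
grid = np.linspace(-3.0, 3.0, 1201)
# minimizing the empirical score recovers the empirical 0.9-quantile,
# close to the true N(0,1) quantile of about 1.2816
best = grid[np.argmin([pinball(x, y, 0.9) for x in grid])]
```

The level sets of the quantile functional are convex, which is the necessary condition discussed in the talk; the mode, by contrast, fails to admit any such consistent scoring function relative to unimodal distributions.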
T. Dimitriadis, T. Fissler, J. Ziegel (2023). Characterizing M-estimators.
Biometrika (forthcoming). https://doi.org/10.1093/biomet/asad026
T. Fissler, C. Lorentzen, M. Mayer (2022).
Model Comparison and Calibration Assessment: User Guide for Consistent Scoring Functions in Machine Learning and Actuarial Practice. https://doi.org/10.48550/arXiv.2202.12780
T. Fissler, R. Frongillo, J. Hlavinová, B. Rudloff (2021).
Forecast evaluation of quantiles, prediction intervals, and other set-valued functionals. Electronic Journal of Statistics 15 (1), 1034–1084. https://doi.org/10.1214/21-EJS1808
T. Fissler, J. F. Ziegel (2021).
On the elicitability of range value at risk. Statistics & Risk Modeling 25 (1–2), 25–46. https://doi.org/10.1515/strm-2020-0037
C. Heinrich-Mertsching, T. Fissler (2021).
Is the mode elicitable relative to unimodal distributions? Biometrika 109 (4), 1157–1164. https://doi.org/10.1093/biomet/asab065