# Abstracts Research Seminar Winter Term 2019/20

#### Gareth Roberts: Principled subsampling and super-efficiency for Bayesian inference

This talk will discuss the problem of Bayesian computation for posterior densities which are expensive to compute, typically due to the size of the data set under consideration. While subsampling is being used effectively for optimisation with large data sets, the problem of fully Bayesian posterior exploration is harder and invariably leads to systematic biases in estimation. Two potential solutions to this problem will be presented. Although both use subsampling, they are examples of so-called “exact approximate” algorithms with no systematic bias. The first is the SCaLE algorithm, which works in a framework that combines MCMC and SMC to realise an evanescent Markov process whose quasi-stationary distribution is the target distribution. The second is the so-called Zig-Zag algorithm, an example of a Piecewise Deterministic Markov Process, which also utilises a continuous-time non-reversible Markov process whose stationary distribution is the required target.
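As a rough illustration of the Zig-Zag idea named in the abstract above, the following Python sketch simulates a one-dimensional Zig-Zag process whose stationary distribution is a standard normal. It is not the speaker's implementation: the toy target, the function name `zigzag_gaussian`, and the parameter values are assumptions made for illustration, and no subsampling is used, because for this target the switching times can be sampled exactly.

```python
import math
import random


def zigzag_gaussian(n_events, seed=0):
    """Simulate a 1D Zig-Zag process targeting N(0, 1) (toy example).

    Returns the time averages of x and x**2 along the piecewise-linear path.
    """
    rng = random.Random(seed)
    x, theta = 0.0, 1.0           # position and velocity (+1 or -1)
    total_time = 0.0
    int_x = 0.0                   # running integral of x(t) dt
    int_x2 = 0.0                  # running integral of x(t)**2 dt
    for _ in range(n_events):
        # With U(x) = x**2 / 2 the switching rate along the current segment
        # is max(0, theta * x + t); its integral can be inverted in closed form.
        a = theta * x
        e = rng.expovariate(1.0)
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        x_new = x + theta * tau
        # Exact integrals of x and x**2 over the linear segment.
        int_x += x * tau + theta * tau * tau / 2.0
        int_x2 += (x_new ** 3 - x ** 3) / (3.0 * theta)
        total_time += tau
        x, theta = x_new, -theta  # flip the velocity at the switching event
    return int_x / total_time, int_x2 / total_time


mean_est, second_moment_est = zigzag_gaussian(50_000)
```

For a long enough run, the two time averages should be close to the target moments (mean 0 and second moment 1).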

#### Lukas Gonon: Dynamic learning based on random recurrent neural networks and reservoir computing systems

In this talk we present our recent results on a mathematical explanation for the empirical success of dynamic learning based on reservoir computing.

Motivated by the task of realized volatility forecasting, we study approximation and learning based on random recurrent neural networks and more general reservoir computing systems. For different types of echo state networks we obtain high-probability bounds on the approximation error in terms of the network parameters. For a more general class of reservoir computing systems and weakly dependent (possibly non-i.i.d.) input data, we then also derive generalization error bounds based on a Rademacher-type complexity.

The talk is based on joint work with Lyudmila Grigoryeva and Juan-Pablo Ortega.

#### Christoph Belak: Stochastic Impulse Control: Recent Progress and Applications

Stochastic impulse control problems are continuous-time optimization problems in which a stochastic system is controlled through finitely many impulses causing a discontinuous displacement of the state process. The objective is to construct impulses which optimize a given performance functional of the state process. This type of optimization problem arises in many branches of applied probability and economics such as optimal portfolio management under transaction costs, optimal forest harvesting, inventory control, and valuation of real options. In this talk, I will give an introduction to stochastic impulse control and discuss classical solution techniques. I will then introduce a new method to solve impulse control problems based on superharmonic functions and a stochastic analogue of Perron’s method, which allows one to construct optimal impulse controls under a very general set of assumptions. Finally, I will show how the general results can be applied to optimal investment problems in the presence of transaction costs.

This talk is based on joint work with Sören Christensen (Christian-Albrechts-University Kiel), Lukas Mich (Trier University), and Frank T. Seifried (Trier University).

#### Patrick Cheridito: Deep optimal stopping

I present a deep learning method for optimal stopping problems which directly learns the optimal stopping rule from Monte Carlo samples. As such it is broadly applicable in situations where the underlying randomness can efficiently be simulated. The approach is tested on different problems. In all cases it produces very accurate results in high-dimensional situations with short computing times.

#### Keefe Murphy: Infinite Mixtures of Infinite Factor Analysers

Factor-analytic Gaussian mixtures are often employed as a model-based approach to clustering high-dimensional data. Typically, the numbers of clusters and latent factors must be fixed in advance of model fitting. The pair which optimises some model selection criterion is then chosen. For computational reasons, having the number of factors differ across clusters is rarely considered.

Here the infinite mixture of infinite factor analysers (IMIFA) model is introduced. IMIFA employs a Pitman-Yor process prior to facilitate automatic inference of the number of clusters using the stick-breaking construction and a slice sampler. Automatic inference of the cluster-specific numbers of factors is achieved using multiplicative gamma process shrinkage priors and an adaptive Gibbs sampler. IMIFA is presented as the flagship of a family of factor-analytic mixtures.

Applications to benchmark data, metabolomic spectral data, and a handwritten digit example illustrate the IMIFA model's advantageous features. These include obviating the need for model selection criteria, reducing the computational burden associated with the search of the model space, improving clustering performance by allowing cluster-specific numbers of factors, and uncertainty quantification.
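The stick-breaking construction mentioned in the abstract can be sketched in a few lines of Python. This is only a truncated illustration under assumed parameter names (`alpha` for the concentration, `discount` for the Pitman-Yor discount) and assumed values; it does not reproduce the IMIFA slice sampler or the shrinkage priors on the factors.

```python
import random


def pitman_yor_stick_breaking(alpha, discount, n_sticks, seed=0):
    """Draw the first n_sticks mixing weights of a Pitman-Yor process.

    Assumes 0 <= discount < 1 and alpha > -discount. Returns the weights
    and the stick mass left over after truncation.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0
    for k in range(1, n_sticks + 1):
        # k-th break proportion: V_k ~ Beta(1 - discount, alpha + k * discount)
        v = rng.betavariate(1.0 - discount, alpha + k * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining


weights, leftover = pitman_yor_stick_breaking(alpha=1.0, discount=0.25, n_sticks=50)
```

Setting `discount=0` in this sketch gives the stick-breaking weights of a Dirichlet process; the weights plus the leftover mass always sum to one.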

#### Kenneth Benoit: More than Unigrams Can Say: Detecting Meaningful Multi-word Expressions from Political Texts

Almost universal among existing approaches to text mining is the adoption of the bag-of-words approach, counting each word as a feature without regard to grammar or order. This approach remains extremely useful despite being an obviously inaccurate model of how observed words are generated in natural language. Many substantively meaningful textual features, however, occur not as unigram words but rather as multi-word expressions (MWEs): pairs of words or phrases that together form a single conceptual entity whose meaning is distinct from its individual elements. Here we present a new model for detecting meaningful multi-word expressions, based on the novel application of a statistical method for detecting variable-length term collocations. Combined with frequency and part-of-speech filtering, we show how to detect meaningful MWEs with an application to public policy, political economy, and law. We extract and validate a dictionary of meaningful collocations from three large corpora totalling over 1 billion words, drawn from political manifestos, legislative floor debates, and US federal and Supreme Court briefs. Applying the collocations to replicate published studies in each field that used unigrams only, we demonstrate that using collocations can improve accuracy and validity over the standard unigram bag-of-words model.
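The abstract does not name the collocation statistic it uses, so the sketch below substitutes a simple stand-in: pointwise mutual information (PMI) on adjacent word pairs with a frequency filter. The function name `bigram_pmi`, the threshold `min_count`, and the toy sentence are all assumptions, and the code handles only two-word expressions, not variable-length ones.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:      # frequency filter
            continue
        p_pair = count / n_bi
        p_indep = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = math.log(p_pair / p_indep)
    return scores


text = ("the supreme court ruled on the tax bill while the supreme court "
        "of the state reviewed the tax code and the court adjourned")
scores = bigram_pmi(text.split())
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy text the pair `("supreme", "court")` receives the highest score, whereas a unigram bag-of-words representation would only record the separate counts of "supreme" and "court".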

#### Dan Zhu: Automated IPA for Bayesian MCMC: A New Approach for Local Prior Robustness and Convergence Analysis with Application to Multidimensional Macroeconomic Time Series with Shrinkage Priors

Infinitesimal perturbation analysis (IPA) is a widely used approach to assess local robustness of stochastic dynamic systems. In Bayesian inference, assessing local robustness of posterior Markov chain Monte Carlo (MCMC) inference poses a challenge for existing methods such as finite differencing, symbolic differentiation and likelihood ratio methods, due to the complex stochastic dependence structure and computational intensity of dependent sampling-based methods. In this paper we introduce an efficient numerical approach based on automatic differentiation (AD) methods to allow for a comprehensive and exact local sensitivity analysis of MCMC output with respect to all input parameters, i.e. prior hyper-parameters (prior robustness) and chain starting values (convergence). Building on recent developments in AD methods in the classical simulation setting, we develop an AD scheme to differentiate MCMC algorithms in order to compute the sensitivities based on exact (up to computer floating point error) first-order derivatives of MCMC draws (Jacobians) alongside the estimation algorithm. We focus on methods for Gibbs-based MCMC inference that are applicable to algorithms composed of both continuous and discontinuous high-dimensional mappings but show how the approach may be extended to cases when Gibbs updates are not available. We illustrate how the methods can be used to help practitioners to assess convergence and prior robustness in an application of Bayesian Vector Autoregression (VAR) based analysis with shrinkage priors for US macroeconomic time series data and forecasting.
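To make the idea of differentiating MCMC draws concrete, here is a minimal forward-mode AD sketch in Python. It is not the authors' scheme. The model (normal data with a normal prior on the mean and a gamma prior on the precision), the `Dual` class, the function names, and all hyper-parameter values are assumptions chosen for illustration. The sketch tracks the derivative of each Gibbs draw of the mean with respect to the prior mean `m0` only, and compares the result with a finite difference computed on the same random numbers.

```python
import math
import random


class Dual:
    """A value together with its derivative w.r.t. one chosen input."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, o):
        o = Dual.lift(o)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))

    def __rtruediv__(self, o):
        return Dual.lift(o) / self


def dsqrt(x):
    r = math.sqrt(x.val)
    return Dual(r, x.der / (2.0 * r))


def gibbs(y, m0, p0=1.0, a0=2.0, b0=2.0, n_iter=200, seed=1):
    """Gibbs sampler for (mu, tau); m0 is a Dual so derivatives propagate."""
    rng = random.Random(seed)
    n, ybar = len(y), sum(y) / len(y)
    tau = Dual(1.0)                      # starting value of the precision
    draws = []
    for _ in range(n_iter):
        # mu | tau: normal draw written as mean + z / sqrt(precision)
        prec = p0 + n * tau
        mean = (p0 * m0 + n * ybar * tau) / prec
        mu = mean + rng.gauss(0.0, 1.0) / dsqrt(prec)
        # tau | mu: Gamma(shape, rate) draw written as Gamma(shape, 1) / rate
        rate = b0 + 0.5 * sum(((yi - mu) * (yi - mu) for yi in y), Dual(0.0))
        tau = rng.gammavariate(a0 + n / 2.0, 1.0) / rate
        draws.append(mu)
    return draws


def mean_of_draws(y, m0_value, m0_seed):
    draws = gibbs(y, Dual(m0_value, m0_seed))
    return (sum(d.val for d in draws) / len(draws),
            sum(d.der for d in draws) / len(draws))


y = [1.2, 0.7, 1.9, 1.4, 0.3, 1.1]       # made-up data
post_mean, ad_sensitivity = mean_of_draws(y, 0.0, 1.0)
h = 1e-5                                 # finite-difference check, same seed
fd_sensitivity = (mean_of_draws(y, h, 0.0)[0]
                  - mean_of_draws(y, -h, 0.0)[0]) / (2.0 * h)
```

Because the gamma shape parameter does not depend on `m0`, the random number stream is identical across the three runs, so `ad_sensitivity` and `fd_sensitivity` should agree closely; the AD version obtains the derivative in a single pass alongside the sampler.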

#### Cinzia Viroli: Recent advances in Deep Mixture Models

Deep learning is a hierarchical inference method formed by multiple successive layers of learning, able to describe complex relationships more efficiently. In this talk, Deep Mixture Models are introduced and discussed. A Deep Gaussian Mixture model (DGMM) is a network of multiple layers of latent variables, where, at each layer, the variables follow a mixture of Gaussian distributions. Thus, the deep mixture model consists of a set of nested mixtures of linear models, which globally provide a nonlinear model able to describe the data in a very flexible way. In order to avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture, thus resulting in deep mixtures of factor analysers.

#### Michael Hecht: Multivariate Newton & Lagrange Interpolation

#### Natalie Packham: Correlation stress testing of stock and credit portfolios

We develop a general approach for stress testing correlations in stock and credit portfolios. Using Bayesian variable selection methods, we build a sparse factor structure, linking individual names or stocks with country and industry factors. Based on correlation modelling methods from interest rate modelling, especially in the context of market models, we calibrate a parametric correlation matrix, where correlations of stocks / names are represented as a function of the country and industry factors. Economically meaningful stress scenarios on the factors can then be translated into stressed correlations. The method also lends itself as a reverse stress testing framework: using, e.g., the Mahalanobis distance on the joint risk factor distribution allows one to infer worst-case correlation scenarios. In a previous related paper (https://doi.org/10.1016/j.jbankfin.2019.01.020), we developed a specific correlation stress model to analyse a USD 6.2 bn loss by JP Morgan in 2012 (known as the "London Whale").

Joint work with Fabian Woebbeking.
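The Mahalanobis distance mentioned in the abstract measures how far a scenario lies from the centre of a distribution once variances and correlations are taken into account. The Python sketch below computes it for two risk factors. The factor values, the covariance matrix, and the function name `mahalanobis_2d` are hypothetical and are not taken from the paper.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2D point x from `mean` under covariance `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form dx^T cov^{-1} dx using the closed-form 2x2 inverse.
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.sqrt(q)


mean = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 1.0))      # two positively correlated factors

d_same = mahalanobis_2d((1.0, 1.0), mean, cov)       # about 1.155
d_opposite = mahalanobis_2d((1.0, -1.0), mean, cov)  # 2.0
```

Both scenarios have the same Euclidean size, but the one in which the factors move against their positive correlation is further from the centre in Mahalanobis terms, which is the sense in which the distance can rank the plausibility of stress scenarios.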