# Abstracts Research Seminar Summer Term 2019

#### Stefan Thurner: Elimination of systemic risk in financial markets

Systemic risk in financial markets arises, to a large extent, through the interconnectedness of agents via financial contracts. We show that the systemic risk level of every player in a financial system can be quantified by simple network measures. Using actual central bank data from Austria and Mexico, we are able to compute the total expected systemic losses of an economy, a number that allows us to estimate the cost of a financial crisis. We can further show on real data that it is possible to compute the systemic risk contribution of every single transaction in the financial system. We propose a smart financial transaction tax that incentivizes players to avoid systemically risky transactions. Avoiding this tax effectively restructures the topology of financial networks so that large-scale contagion events become impossible. We can prove the existence of a systemically risk-optimal equilibrium under this tax. An agent-based model demonstrates that this Systemic Risk Tax practically eliminates the network component of systemic risk in a system.

#### Antonietta Mira: Bayesian dimensionality reduction via identifications of data intrinsic dimensions

Even when defined on a high-dimensional space, data points usually lie on one or more hypersurfaces, or manifolds, with much smaller intrinsic dimension (ID). The recent TWO-NN method (Facco et al., 2017, Scientific Reports) allows estimating the ID when all points lie on a single sub-manifold.

TWO-NN only assumes that the density of points is approximately constant in a small neighborhood around each point. Under this hypothesis, the ratio of the distances from a point to its second and first nearest neighbors follows a Pareto distribution that depends parametrically only on the ID. We first extend the TWO-NN model to the case in which the data lie on several sub-manifolds, each with its own ID. While the idea behind the model extension is simple (the Pareto is replaced by a finite mixture of K Pareto distributions), a non-trivial Bayesian algorithm is required for estimating the model and assigning each point to its own manifold. Applying this method, which we dub Hidalgo (Heterogeneous Intrinsic Dimension ALGOrithm), we uncover a surprising ID variability in several real-world datasets. In fact, we are able to show how this methodology helps to discover latent clusters hidden in data of different nature, ranging from protein folding trajectories to financial indexes computed on balance sheets. Hidalgo obtains remarkable results, but its main limitation is that the number of sub-manifolds, i.e. of components in the mixture, must be fixed a priori. To overcome this issue we employ a flexible Bayesian nonparametric approach and model the data as an infinite mixture of Pareto distributions using a Dirichlet Process Mixture Model. This framework allows evaluating the uncertainty relative to the number of mixture components and to the assignments of data points to sub-manifolds. Since the posterior distribution has no closed form, we employ the Slice Sampler algorithm to perform inference. In preliminary analyses on simulated and well-known datasets (e.g. Fisher's Iris dataset), the full Bayesian nonparametric version of TWO-NN provides promising results, making it possible to recover a rich data structure starting from the intrinsic dimension, a purely geometric data feature, and requiring only the definition of a distance measure.

Joint work with Michele Allegra, Francesco Denti, Elena Facco, Alessandro Laio and Michele Guindani.
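As a rough illustration of the single-manifold TWO-NN idea described in this abstract, the sketch below computes the ratio of second to first nearest-neighbor distances for every point and fits the Pareto shape parameter by maximum likelihood. The function name `two_nn_intrinsic_dimension`, the maximum-likelihood fit, and the toy data are assumptions made for this example; they are not taken from the talk and need not match the fitting procedure used by the authors.

```python
import numpy as np


def two_nn_intrinsic_dimension(points):
    """Estimate a single global intrinsic dimension from neighbor-distance ratios."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]

    # Pairwise Euclidean distances (O(n^2) memory; fine for a small demo).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors

    # Distances to the first and second nearest neighbor of every point.
    nearest_two = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = nearest_two[:, 0], nearest_two[:, 1]

    # mu = r2 / r1 is modeled as Pareto(scale=1, shape=ID); the
    # maximum-likelihood estimate of the shape is n / sum(log mu).
    mu = r2 / r1
    return n / np.sum(np.log(mu))


# Hypothetical demo: a 2-dimensional square linearly embedded in 4 dimensions.
rng = np.random.default_rng(0)
latent = rng.uniform(size=(1000, 2))
embedding = np.array([[1.0, 0.0, 1.0, 0.0],
                      [0.0, 1.0, 0.0, 1.0]])
print(two_nn_intrinsic_dimension(latent @ embedding))  # should be close to 2
```

The estimate depends only on pairwise distances, which mirrors the remark that the method requires nothing beyond a distance measure. The mixture extensions (Hidalgo and the Dirichlet process version) are not attempted here.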

#### Harald Baayen: Wide learning in language modeling

Convolutional neural networks are widely and successfully used in natural language processing. However, it turns out that there are tasks in which learning with ‘wide’ networks, i.e., simple networks with just an input and an output layer and very large numbers of units, can be surprisingly successful when carefully chosen features (based on domain knowledge) are used. I will illustrate this finding with three case studies: French baboons learning to discriminate between English words and pseudowords, human auditory word recognition, and the computational modeling of inflectional morphology with what amounts to multivariate multiple regression.

#### Simon Wood: Large smooth models for big data and space time modelling of daily pollution data

Motivated by the attempt to develop spatio-temporal models of four decades' worth of daily air pollution measurements from the UK black smoke monitoring network, this talk discusses the challenges associated with generalized additive (or Gaussian latent process) modelling of 10 million data points using models with around 10000 coefficients and 10 to 30 smoothing parameters. It is shown how parallelization can be achieved, provided that fitting methods are developed that are sufficiently block oriented to scale well, and how discretization of covariates can be exploited for further substantial gains in efficiency. The developed methods reduced computation times for the motivating pollution model from weeks to around 5 minutes, and are available in the R package mgcv.

#### Nadja Klein: Implicit Copulas from Bayesian Regularized Regression Smoothers

We show how to extract the implicit copula of a response vector from a Bayesian regularized regression smoother with Gaussian disturbances. The copula can be used to compare smoothers that employ different shrinkage priors and function bases. We illustrate with three popular choices of shrinkage priors – a pairwise prior, the horseshoe prior and a g prior augmented with a point mass as employed for Bayesian variable selection – and both univariate and multivariate function bases. The implicit copulas are high-dimensional, have flexible dependence structures that are far from that of a Gaussian copula, and are unavailable in closed form. However, we show how they can be evaluated by first constructing a Gaussian copula conditional on the regularization parameters, and then integrating over these. Combined with non-parametric margins, the regularized smoothers can be used to model the distribution of non-Gaussian univariate responses conditional on the covariates. Efficient Markov chain Monte Carlo schemes for evaluating the copula are given for this case. Using both simulated and real data, we show how such copula smoothing models can improve the quality of the resulting function estimates and predictive distributions.

#### Michaela Szölgyenyi: Convergence order of Euler-type schemes for SDEs in dependence of the Sobolev regularity of the drift

Stochastic differential equations with irregular (non-globally Lipschitz) coefficients are a very active topic of research. We study the strong convergence rate of the Euler-Maruyama scheme for scalar SDEs with additive noise and irregular drift. We provide a novel framework for the error analysis by reducing it to a weighted quadrature problem for irregular functions of Brownian motion. By analysing the quadrature problem we obtain for arbitrarily small ε > 0 a strong convergence order of (1+κ)/2 − ε for a non-equidistant Euler-Maruyama scheme, if the drift has Sobolev-Slobodeckij-type regularity of order κ. In the multi-dimensional setting we allow the drift coefficient to be non-Lipschitz on a set of positive reach. We prove strong convergence of an Euler-type scheme, which uses adaptive step-sizing for a better resolution close to the discontinuity. We obtain a numerical method which has – up to logarithmic terms – strong convergence order 1/2 with respect to the average computational cost.
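For readers unfamiliar with the scheme, the following is a minimal sketch of the textbook Euler-Maruyama method on an equidistant grid for a scalar SDE with additive noise, dX = μ(X) dt + dW. The function name `euler_maruyama`, the discontinuous example drift, and the equidistant grid are assumptions for illustration only; the non-equidistant and adaptive schemes analysed in the talk are not reproduced here.

```python
import numpy as np


def euler_maruyama(drift, x0, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + dW on an equidistant grid."""
    dt = t_end / n_steps
    # Brownian increments are independent N(0, dt) random variables.
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + drift(x[k]) * dt + dw[k]
    return x


# Hypothetical example: a drift with a jump at zero (not Lipschitz).
path = euler_maruyama(lambda x: -np.sign(x), x0=0.5, t_end=1.0,
                      n_steps=1000, rng=np.random.default_rng(0))
print(path[-1])
```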

#### Radu Ioan Bot: Proximal algorithms for nonconvex and nonsmooth minimization problems

In this talk, we discuss proximal algorithms for nonconvex and nonsmooth minimization problems. We begin with a short survey of the convergence results of the proximal-gradient algorithm for convex optimization problems. Further, we introduce a proximal-gradient algorithm with inertial and memory effects for the minimization of the sum of a proper and lower semicontinuous function and a possibly nonconvex smooth function. We prove that the sequence of iterates converges to a critical point of the objective, provided that a regularization of the latter function satisfies the Kurdyka-Łojasiewicz property. This applies to semialgebraic, real subanalytic, uniformly convex and convex functions satisfying a growth condition. In the last part of the talk we propose an algorithm for solving d.c. (difference-of-convex) optimization problems which allows the evaluation of both the concave and the convex part by their proximal points. Additionally, we allow a smooth part, which is evaluated via its gradient. For this algorithm we show the connection to the Toland dual problem and that a descent property holds for the objective function of a primal-dual formulation of the problem. Convergence of the iterates is guaranteed if this objective function satisfies the Kurdyka-Łojasiewicz property.
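To make the basic iteration concrete, here is a small sketch of a proximal-gradient step with an optional inertial term, applied to the convex lasso problem min 0.5·||Ax − b||² + λ·||x||₁. The choice of test problem, the function names, the step size 1/L, and the constant inertial weight `beta` are all assumptions for this example; the algorithm presented in the talk (including its memory effects and nonconvex setting) may differ.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def inertial_proximal_gradient(A, b, lam, beta=0.0, n_iter=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1; beta=0 gives plain proximal gradient."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Lipschitz constant of the gradient of the smooth part.
    lipschitz = np.linalg.norm(A, 2) ** 2
    step = 1.0 / lipschitz
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        # Gradient step on the smooth part plus an inertial (momentum) term.
        y = x - step * grad + beta * (x - x_prev)
        # Proximal step on the nonsmooth part.
        x_prev, x = x, soft_threshold(y, step * lam)
    return x


# With A = I the minimizer is the soft-thresholded right-hand side.
print(inertial_proximal_gradient(np.eye(3), [3.0, -0.5, 1.5], lam=1.0))
```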

#### Sylvia Kaufmann: The bank lending channel in Switzerland: Capturing cross-section heterogeneity and asymmetry over time

This paper studies the bank lending channel of monetary policy transmission in Switzerland over the last three decades. In contrast to early empirical analysis of the bank lending channel (Kashyap and Stein, 1995, 2000), we adopt an approach that is agnostic about which bank characteristic drives the heterogeneous response of bank lending to changes in interest rates. In addition, our empirical model allows the lending reaction to change over time in a state-dependent way. Our results are consistent with the existence of a bank lending channel in Switzerland during the period from 1987 until 2016. However, the bank lending channel does not seem to work continuously, as we find episodes during which it is muted. These episodes were marked by increased economic uncertainty, which had a negative impact on loan growth.

#### Raffaele Argiento: From infinity to here: a Bayesian nonparametric perspective of finite mixture models

Modelling via finite mixtures is one of the most fruitful Bayesian approaches, particularly useful for clustering when there is unobserved heterogeneity in the data. The most popular algorithm under this approach is the reversible jump MCMC, which can be nontrivial to design, especially in high-dimensional spaces. We will show how nonparametric methods can be transferred into the parametric framework. We first introduce a class of almost surely finite discrete random probability measures obtained by normalization of finite point processes. Then, we use the new class as the mixing measure of a mixture model and derive its posterior characterization. The resulting class encompasses the popular finite Dirichlet mixture model. In order to compute posterior statistics, we propose an alternative to the reversible jump: borrowing notation from the nonparametric Bayesian literature, we set up a conditional MCMC algorithm based on the posterior characterization of the unnormalized point process. The flexibility of the model and the performance of our algorithm are illustrated on simulated and real data.

#### Katia Colaneri: A class of recursive optimal stopping problems with an application to stock trading

In this paper we introduce and solve a class of optimal stopping problems of recursive type. In particular, the stopping payoff depends directly on the value function of the problem itself. In a multi-dimensional Markovian setting we show that the problem is well posed, in the sense that the value is indeed the unique solution to a fixed point problem in a suitable space of continuous functions, and an optimal stopping time exists. We then apply our class of problems to a model for stock trading in two different market venues and we determine the optimal stopping rule in that case. This is joint work with Tiziano De Angelis.