Abstracts Research Seminar Winter Term 2021/22
Arbitrage Principles in Insurance
In this work we study the valuation of insurance contracts from a fundamental viewpoint. We start from the observation that insurance contracts are inherently linked to financial markets, be it via interest rates or, as in hybrid products, equity-linked life insurance and variable annuities, directly via stocks or indices. By defining portfolio strategies on an insurance portfolio and combining them with financial trading strategies, we arrive at the notion of insurance-finance arbitrage (IFA). A fundamental theorem provides two sufficient conditions, for the presence and absence of IFA, respectively; the first utilizes a conditional law of large numbers and risk-neutral valuation. As a key result we obtain a simple valuation rule, called the QP-rule, which is market-consistent and excludes IFA. Utilizing the theory of enlargements of filtrations, we construct a tractable framework for general valuation results under weak assumptions. The generality of the approach allows us to incorporate many important aspects, such as mortality risk or the dependence between mortality and stock markets, which is of utmost importance in the recent COVID-19 crisis. For practical applications, we provide an affine formulation which leads to explicit valuation formulas for a large class of hybrid products.
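As a minimal numerical sketch of this style of valuation (not the authors' general framework), consider an equity-linked claim paying a call-option payoff at time T contingent on the policyholder's survival. Assuming independence of mortality and the market, a QP-style rule prices the financial payoff under a risk-neutral measure Q and takes the survival probability under the physical measure P; all parameter values below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (all hypothetical).
S0, K, r, sigma, T = 100.0, 100.0, 0.02, 0.2, 10.0  # market data, Q-dynamics
mu_mort = 0.01                                      # constant force of mortality under P

# Risk-neutral Monte Carlo price of the financial payoff max(S_T - K, 0).
n = 200_000
Z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
q_price = np.exp(-r * T) * np.mean(np.maximum(ST - K, 0.0))

# QP-style rule under independence: Q-price times P-survival probability.
p_survive = np.exp(-mu_mort * T)
value = p_survive * q_price
print(round(value, 2))
```

With dependence between mortality and the market, this factorization no longer applies, which is precisely the case the general framework is built to handle.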
Using Subset Log-Likelihoods to Trim Outliers in Gaussian Mixture Models
Mixtures of Gaussian distributions are a popular choice in model-based clustering. Outliers can affect parameter estimation and, as such, must be accounted for. Correctly predicting the proportion of outliers is paramount, as it minimizes misclassification error. It is proved that, for a finite Gaussian mixture model, the log-likelihoods of the subset models are distributed according to a mixture of beta distributions. An algorithm is then proposed that predicts the proportion of outliers by measuring the adherence of a set of subset log-likelihoods to a beta-mixture reference distribution. The algorithm removes the least likely points, which are deemed outliers, until the model assumptions are met.
Joint work with Katharine M. Clark.
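A rough sketch of the trimming idea in Python, using scikit-learn's `GaussianMixture`. The stopping rule is deliberately simplified: a Kolmogorov-Smirnov test against a single fitted beta distribution stands in for the beta-mixture reference of the paper, and the rescaling of the subset log-likelihoods is my own choice, not the authors' implementation.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

def trim_outliers(X, n_components=2, alpha=0.01, max_trim=None):
    """Illustrative sketch: iteratively remove the least likely point until the
    subset log-likelihoods are plausibly beta-distributed (KS test).
    A simplification of the published algorithm, not a reimplementation."""
    X = np.asarray(X, float)
    max_trim = max_trim if max_trim is not None else len(X) // 4
    removed = []
    for _ in range(max_trim):
        gm = GaussianMixture(n_components=n_components, random_state=0).fit(X)
        ll_full = gm.score(X) * len(X)   # total log-likelihood of current subset
        dens = gm.score_samples(X)       # per-point log-density
        # Subset log-likelihoods: total minus each point's contribution.
        ll_sub = ll_full - dens
        # Rescale to (0, 1) and test adherence to a fitted beta distribution.
        u = (ll_sub - ll_sub.min()) / (np.ptp(ll_sub) + 1e-12)
        u = np.clip(u, 1e-6, 1 - 1e-6)
        a, b, _, _ = stats.beta.fit(u, floc=0, fscale=1)
        if stats.kstest(u, "beta", args=(a, b)).pvalue > alpha:
            break                        # assumptions plausibly met: stop trimming
        worst = np.argmin(dens)          # least likely point is deemed an outlier
        removed.append(X[worst])
        X = np.delete(X, worst, axis=0)
    return X, np.array(removed)
```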
Strategies and Software for Robust Color Palettes in Data Visualizations
Color is an integral element in many data visualizations such as maps, heat maps, bar plots, scatter plots, or time series displays. Well-chosen colors can make graphics more appealing and, more importantly, help to clearly communicate the underlying information. Conversely, poorly chosen colors can obscure information or confuse readers.
To avoid problems and misinterpretations, we introduce general strategies for selecting robust color palettes that are intuitive for many audiences, including readers with color vision deficiencies. The construction of sequential, diverging, or qualitative palettes is based on appropriate light-dark "luminance" contrasts while suitably controlling the "hue" and the colorfulness ("chroma").
The strategies are also easy to put into practice using computations based on the so-called Hue-Chroma-Luminance (HCL) color model, e.g., as provided in our "colorspace" software package (https://colorspace.R-Forge.R-project.org/). To aid the selection and application of these palettes, the package provides scales for use with ggplot2; shiny (and tcltk) apps for interactive exploration (see also https://hclwizard.org/); visualizations of palette properties; accompanying manipulation utilities (like desaturation and lightening/darkening); and emulation of color vision deficiencies.
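To illustrate the underlying idea, here is a sketch of a sequential palette built from a monotone luminance ramp at fixed hue. Note that the colorspace package works in polar CIELUV; the sketch below uses polar CIELAB (LCh) with standard D65 conversion formulas as a simplified stand-in, and clips out-of-gamut colors.

```python
import numpy as np

# D65 white point; standard CIELAB -> XYZ -> sRGB conversion formulas.
XN, YN, ZN = 95.047, 100.0, 108.883

def lch_to_rgb(L, C, h):
    """Convert one CIE LCh(ab) color to sRGB in [0, 1] (out-of-gamut clipped)."""
    hr = np.deg2rad(h)
    a, b = C * np.cos(hr), C * np.sin(hr)
    fy = (L + 16) / 116
    fx, fz = fy + a / 500, fy - b / 200
    def finv(t):
        return t**3 if t > 6/29 else 3 * (6/29)**2 * (t - 4/29)
    X, Y, Z = XN * finv(fx) / 100, YN * finv(fy) / 100, ZN * finv(fz) / 100
    rgb_lin = [
        3.2406 * X - 1.5372 * Y - 0.4986 * Z,
        -0.9689 * X + 1.8758 * Y + 0.0415 * Z,
        0.0557 * X - 0.2040 * Y + 1.0570 * Z,
    ]
    def gamma(c):  # linear RGB -> sRGB transfer function
        return 12.92 * c if c <= 0.0031308 else 1.055 * c**(1/2.4) - 0.055
    return np.clip([gamma(c) for c in rgb_lin], 0.0, 1.0)

def sequential_palette(n, hue=260, c_max=60, l_range=(30, 90)):
    """n colors at a fixed hue with monotone luminance (a robust sequential
    palette); chroma is reduced toward the light end."""
    Ls = np.linspace(l_range[0], l_range[1], n)
    Cs = np.linspace(c_max, 10, n)
    return [lch_to_rgb(L, C, hue) for L, C in zip(Ls, Cs)]
```

The key point carried over from the talk is that the ordering is encoded in luminance, which survives both grayscale conversion and most color vision deficiencies.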
Herding in Probabilistic Forecasts
Decision makers often ask experts to forecast a future state. Experts, however, can be biased. In the economics and psychology literature, one extensively studied behavioral bias is herding. Under strong herding, the disclosure of public information may lower forecasting accuracy. This result, however, has been derived only for point forecasts. In this work, we consider experts' probabilistic forecasts under herding, find closed-form expressions for the first two moments of a unique equilibrium forecast, and show that, due to herding, experts report overly similar locations and inflate the variances of their forecasts. Furthermore, we show that the negative externality of public information no longer holds. In addition to reacting to new information as expected, probabilistic forecasts contain more information about the experts' full beliefs and interpersonal structure, which facilitates model estimation. To this end, we consider a one-shot setting with one forecast per expert and show that our model is identifiable only up to an infinite number of solutions from point forecasts, but up to two solutions from probabilistic forecasts. We then provide a Bayesian estimation procedure for these two solutions and apply it to economic forecasting data collected by the European Central Bank and the Federal Reserve Bank of Philadelphia. We find that, on average, the experts invest around 19% of their effort in making similar forecasts. The level of herding shows an increasing trend from 1999 to 2007, drops sharply during the 2007-2009 financial crisis, and then rises again until 2019.
Herman K. van Dijk:
Quantifying Time-Varying Forecast Uncertainty and Risk for the Real Price of Oil
A novel and numerically efficient quantification approach is proposed to forecast uncertainty about the real price of oil, using a combination of probabilistic individual model forecasts. This combination method extends earlier approaches applied to oil price forecasting by allowing for sequential updating of time-varying combination weights, estimation of time-varying forecast biases and facets of miscalibration of individual forecast densities, and time-varying inter-dependencies among models. To illustrate the usefulness of the method, an extensive set of empirical results is presented on time-varying forecast uncertainty and risk for the real price of oil over the period 1974-2018. It is shown that the combination approach systematically outperforms commonly used benchmark models and combination approaches in terms of both point and density forecasts. The dynamic patterns of the estimated individual model weights are highly time-varying, reflecting large time variation in the relative performance of the individual models. The combination approach has built-in diagnostic information measures about forecast inaccuracy and/or model set incompleteness, which provide clear signals of model incompleteness during three crisis periods. To highlight that the approach can also be useful for policy analysis, a basic analysis of profit-loss and hedging against price risk is presented.
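The combination machinery itself is involved; as a toy stand-in, the following sketch updates model weights from exponentially discounted past log predictive scores, which conveys the flavor of sequentially updated, time-varying weights. The discounting rule and softmax weighting are my simplifications, not the paper's method.

```python
import numpy as np

def combine_weights(density_of_realization, decay=0.95):
    """density_of_realization: (T, M) array; entry (t, m) is model m's
    predictive density evaluated at the realized outcome y_t.
    Returns (T, M) combination weights, each row formed from data up to t-1."""
    T, M = density_of_realization.shape
    scores = np.zeros(M)                    # discounted cumulative log scores
    weights = np.full((T, M), 1.0 / M)      # start from equal weights
    for t in range(T):
        if t > 0:
            w = np.exp(scores - scores.max())   # softmax over cumulated scores
            weights[t] = w / w.sum()
        scores = decay * scores + np.log(density_of_realization[t] + 1e-300)
    return weights
```

The discount factor lets the weights adapt when the relative performance of models shifts, e.g. around crisis periods, at the cost of noisier weights.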
Stefano M. Iacus:
Subjective Well-Being and Social Media
In this talk we briefly review the literature on measuring well-being through official statistics and surveys, and then focus on how to extract a version of subjective well-being from social media posts.
We present the basics of the sentiment analysis approach used to construct a new social-media-based indicator of subjective well-being.
We focus mainly on an application to Italy and Japan, for which we construct the SWB-I and SWB-J indexes using Twitter data from 2013 to mid-2020.
These countries are interesting because of their similarities and differences, which are captured to some extent by the indicators.
We then discuss how these subjective well-being indexes relate to traditional measures of well-being, and their decrease during the COVID-19 pandemic.
An extensive discussion is presented in the newly published book on this topic. The SWB-I indicator has been introduced in the Journal of Official Statistics, along with a method to control for social media bias. Its Japanese counterpart, SWB-J, is available here. Finally, those interested in the impact of COVID-19 on these subjective well-being indicators may want to read this preprint.
Alexander J. McNeil:
Time Series Models With Infinite-Order Partial Copula Dependence
Stationary and ergodic time series can be constructed using an s-vine decomposition based on sets of bivariate copula functions. The extension of such processes to infinite copula sequences is considered and shown to yield a rich class of models that generalizes Gaussian ARMA and ARFIMA processes to allow both non-Gaussian marginal behaviour and a non-Gaussian description of the serial partial dependence structure. Extensions of classical causal and invertible representations of linear processes to general s-vine processes are proposed and investigated. A practical and parsimonious method for parameterizing s-vine processes using the Kendall partial autocorrelation function is developed. The potential of the resulting models to give improved statistical fits in many applications is indicated with examples using macroeconomic data.
Joint work with Martin Bladt.
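For intuition, a first-order special case of an s-vine process is a stationary copula Markov chain: each consecutive pair (U_{t-1}, U_t) follows a fixed bivariate copula, and simulation proceeds via the inverse h-function. The sketch below uses a Gaussian pair copula, so the normal-score-transformed series is exactly a Gaussian AR(1); the higher-order (and in the paper, infinite-order) trees of the vine are truncated away.

```python
import numpy as np
from scipy.stats import norm

def simulate_copula_markov(T, rho, seed=0):
    """Simulate a stationary first-order copula Markov chain on (0, 1) whose
    consecutive pairs follow a Gaussian copula with parameter rho."""
    rng = np.random.default_rng(seed)
    u = np.empty(T)
    u[0] = rng.uniform()                 # stationary Uniform(0, 1) start
    w = rng.uniform(size=T)
    for t in range(1, T):
        # Inverse h-function of the Gaussian copula: draw u_t | u_{t-1}.
        z_prev = norm.ppf(u[t - 1])
        u[t] = norm.cdf(np.sqrt(1 - rho**2) * norm.ppf(w[t]) + rho * z_prev)
    return u

# Any marginal distribution F can then be imposed via x_t = F^{-1}(u_t).
```

Replacing the Gaussian pair copula by, say, a Clayton or t copula changes the serial dependence structure (e.g. tail dependence between consecutive observations) while leaving the uniform margins intact, which is the flexibility the s-vine construction exploits.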
Gael M. Martin:
Loss-Based Variational Bayes Prediction
We propose a new method for Bayesian prediction that caters for models with a large number of parameters and is robust to model misspecification. Given a class of high-dimensional (but parametric) predictive models, the new approach constructs a posterior predictive using a variational approximation to a loss-based, or Gibbs, posterior that is directly focused on predictive accuracy. The theoretical behavior of the new prediction approach is analyzed and a form of optimality demonstrated. Applications to both simulated and empirical data, using high-dimensional Bayesian neural network and autoregressive mixture models, demonstrate that the approach provides more accurate results than various alternatives, including misspecified likelihood-based predictions. Whilst the current paper uses only observation-driven models as the predictive models, the presentation will contain theoretical results, and some preliminary numerical results, related to state space models; in this setting, the variational approximation must be handled with more care.
Joint work with David T Frazier, Ruben Loaiza-Maya and Bonsoo Koo.
Multivariate data are often visualized using linear projections, produced by techniques such as principal component analysis, linear discriminant analysis, and projection pursuit. A problem with projections is that they obscure low- and high-density regions near the center of the distribution. Sections, or slices, can help to reveal them. In this talk I will introduce section pursuit, a new method for searching for interesting slices of the data. Linear projections are used to define sections of the parameter space, and interestingness is calculated by comparing the distributions of observations inside and outside a section. By optimizing this index, it is possible to reveal features such as holes (low density) or grains (high density), which can be useful when data distributions depart from uniform or normal, as in visually exploring nonlinear manifolds and functions in multivariate space. I will show how section pursuit can be applied to explore decision boundaries from classification models, or subspaces induced by complex inequality conditions in a multiple-parameter model.
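The basic slicing step can be sketched as follows: given an orthonormal 2-d projection basis, keep the observations whose orthogonal distance to the projection plane falls below a thickness h. Function and parameter names here are mine, not the authors' implementation.

```python
import numpy as np

def slice_mask(X, basis, center=None, h=0.3):
    """Boolean mask of points inside a thin slice: orthogonal distance to the
    2-d plane spanned by the orthonormal columns of `basis`, passing through
    `center`, is below h. A minimal sketch of the slicing idea."""
    X = np.asarray(X, float)
    center = X.mean(axis=0) if center is None else np.asarray(center, float)
    Xc = X - center
    proj = Xc @ basis @ basis.T                  # component within the plane
    resid = np.linalg.norm(Xc - proj, axis=1)    # orthogonal distance to plane
    return resid < h
```

Comparing the distribution of the in-slice points with that of the remaining points then yields an interestingness score for the slice, and optimizing that score over bases is the "pursuit" part.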
Stripping the Discount Curve – a Robust Machine Learning Approach
We propose a non-parametric method, kernel ridge regression, for estimating the discount curve from Treasury securities, with regularization penalties in terms of the smoothness of the approximating discount curve. We provide analytical solutions, which are straightforward to implement. We then apply our method to a large data set of U.S. Treasury securities to extract term structure estimates at daily frequency. The resulting term structure estimates closely match benchmarks from the literature but have smaller pricing errors.
Joint work with Kay Giesecke, Markus Pelger, and Ye Ye.
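A toy version of the estimator, assuming directly observed zero-coupon prices: the paper handles coupon-bearing Treasuries and uses a purpose-built smoothness penalty, whereas the RBF kernel, bandwidth, and ridge parameter below are illustrative stand-ins.

```python
import numpy as np

def fit_discount_curve(maturities, prices, lam=1e-3, bw=2.0):
    """Kernel ridge regression of the discount curve d(t) on synthetic
    zero-coupon prices with an RBF kernel. Note that d(0) = 1 is not
    enforced in this sketch."""
    t = np.asarray(maturities, float)
    p = np.asarray(prices, float)
    K = np.exp(-(t[:, None] - t[None, :])**2 / (2 * bw**2))
    # Closed-form KRR solution: alpha = (K + lam * I)^{-1} p.
    alpha = np.linalg.solve(K + lam * np.eye(len(t)), p)
    def d(s):
        s = np.atleast_1d(np.asarray(s, float))
        k = np.exp(-(s[:, None] - t[None, :])**2 / (2 * bw**2))
        return k @ alpha
    return d
```

The closed-form solve mirrors the analytical tractability highlighted in the abstract: the fitted curve is a kernel expansion over the observed maturities, with the ridge parameter trading pricing error against smoothness.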