# Abstracts Research Seminar Summer Term 2014

## Marius Hofert: An extreme value approach for modeling operational risk losses depending on covariates

A general methodology for modeling loss data depending on covariates is developed. The parameters of the frequency and severity distributions of the losses may depend on covariates. The loss frequency over time is modeled via a non-homogeneous Poisson process with integrated rate function depending on the covariates. This corresponds to a generalized additive model which can be estimated with spline smoothing via penalized maximum likelihood estimation. The loss severity over time is modeled via a nonstationary generalized Pareto model depending on the covariates. Whereas spline smoothing cannot be directly applied in this case, an efficient algorithm based on orthogonal parameters is suggested. The methodology is applied to a database of operational risk losses. Estimates, including confidence intervals, for Value-at-Risk (also depending on the covariates) as required by the Basel II/III framework are computed.
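As a rough illustration of the last step, the sketch below evaluates the textbook peaks-over-threshold Value-at-Risk formula for a generalized Pareto tail. It is not the covariate-dependent estimator of the talk: the function name `pot_var`, its parameters, and the numbers are assumptions chosen for illustration, and in a covariate-dependent setting the inputs would be re-evaluated for each covariate value.

```python
import math

def pot_var(u, xi, beta, exceed_prob, alpha):
    """Textbook peaks-over-threshold VaR at confidence level alpha.

    u: threshold; xi, beta: GPD shape and scale of the excesses over u;
    exceed_prob: probability that a loss exceeds u (should be >= 1 - alpha).
    All names and values are illustrative assumptions.
    """
    ratio = exceed_prob / (1.0 - alpha)
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exponential tail.
        return u + beta * math.log(ratio)
    # Solve exceed_prob * (1 + xi * (x - u) / beta) ** (-1 / xi) = 1 - alpha for x.
    return u + beta / xi * (ratio ** xi - 1.0)

var_99 = pot_var(u=10.0, xi=0.5, beta=2.0, exceed_prob=0.1, alpha=0.99)
```

When `alpha` equals `1 - exceed_prob`, the formula returns the threshold `u` itself, which is a quick sanity check on the inputs.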

## Håvard Rue: Penalising model component complexity: A principled practical approach to constructing priors

Selecting appropriate prior distributions for parameters in a statistical model is the Achilles heel of Bayesian statistics. Although the prior distribution should ideally encode the user's prior knowledge about the parameters, this level of knowledge transfer seems to be unattainable in practice; often standard priors are used without much thought and with the implicit hope that the obtained results are not too prior sensitive. Despite the development of so-called objective priors, which are only available (due to mathematical issues) for a few selected and highly restricted model classes, the applied statistician has in practice few guidelines to follow when choosing the priors. An easy way out of this dilemma is to re-use prior choices of others, with an appropriate reference, to avoid further questions about this issue.

In a Bayesian-software system like R-INLA, where models are built by adding up various model components to construct a linear predictor, we are facing a real and practical challenge. Default priors must be set for the parameters of a large number of model components which are supposed to work well in a large number of scenarios and situations. None of the (limited) guidelines in the literature seem to be able to approach such a task.

In this paper we introduce a new concept for constructing prior distributions where we make use of the natural nested structure inherent to many model components. This nested structure defines a model component as a flexible extension of a base model, which allows us to define proper priors which penalize the complexity induced from the natural base model. Based on this observation, we can compute the prior distribution after the input of a user-defined (weak) scale-parameter for that model component. These priors are invariant to reparameterisations, have a natural connection to Jeffreys priors, are designed to support Occam’s razor and seem to have excellent robustness properties, all of which are highly desirable and allow us to use this approach to define default prior distributions.
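To make the role of the user-defined scale concrete, here is a minimal sketch for one commonly reported special case, which the abstract itself does not spell out: a Gaussian random effect with standard deviation σ and base model σ = 0, where the construction is usually stated to give an exponential prior on σ. The user supplies a statement of the form P(σ > U) = α. The function names and the values of `U` and `alpha` are assumptions for illustration and are not the R-INLA interface.

```python
import math
import random

def pc_prior_rate(U, alpha):
    """Rate lam of the exponential prior on sigma such that P(sigma > U) = alpha."""
    return -math.log(alpha) / U

def pc_prior_density_precision(tau, lam):
    """Implied density on the precision tau = sigma**-2 (change of variables)."""
    return 0.5 * lam * tau ** -1.5 * math.exp(-lam / math.sqrt(tau))

# Hypothetical user statement: sigma exceeds 1 with prior probability 0.01.
lam = pc_prior_rate(U=1.0, alpha=0.01)

# Monte Carlo check that the prior honours the tail statement.
rng = random.Random(1)
sigmas = [rng.expovariate(lam) for _ in range(100_000)]
tail = sum(s > 1.0 for s in sigmas) / len(sigmas)
```

The single number `U` (with its tail probability) is the only input; the shape of the prior follows from the distance to the base model.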

We will illustrate our approach on a series of examples using the R-INLA package for doing fast approximate Bayesian inference for the class of latent Gaussian models. The Student-t case will be discussed in detail as it is the simplest non-trivial example. Then we will discuss other cases, like classical unstructured random effect models, spline smoothing and disease mapping.

Joint work with Thiago G. Martins, Daniel P. Simpson, Andrea Riebler (NTNU) and Sigrunn H. Sørbye (Univ. of Tromsø).

## Steve Scott: Bayes and Big Data: The Consensus Monte Carlo Algorithm

A useful definition of “big data” is data that is too big to comfortably process on a single machine, either because of processor, memory, or disk bottlenecks. Graphics processing units can alleviate the processor bottleneck, but memory or disk bottlenecks can only be eliminated by splitting data across multiple machines. Communication between large numbers of machines is expensive (regardless of the amount of data being communicated), so there is a need for algorithms that perform distributed approximate Bayesian analyses with minimal communication. Consensus Monte Carlo operates by running a separate Monte Carlo algorithm on each machine, and then averaging individual Monte Carlo draws across machines. Depending on the model, the resulting draws can be nearly indistinguishable from the draws that would have been obtained by running a single machine algorithm for a very long time. Examples of consensus Monte Carlo are shown for simple models where single-machine solutions are available, for large single-layer hierarchical models, and for Bayesian additive regression trees (BART).

Joint work with Alexander W. Blocker, Fernando V. Bonassi, Hugh A. Chipman, Edward I. George, and Robert E. McCulloch.
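The averaging step can be sketched on a toy problem where the single-machine answer is known: the mean of Gaussian data with known variance and a zero-mean Gaussian prior. Each "machine" is simulated as a data shard, the prior is split evenly across shards, and draws are combined with weights equal to each shard's posterior precision. The weighting, the prior split, and all names and numbers are assumptions made for this illustration; the abstract does not specify them.

```python
import numpy as np

def consensus_draws(shards, sigma2, prior_var, n_draws, rng):
    """Average per-shard posterior draws of a Gaussian mean, weighted by precision."""
    S = len(shards)
    draws = np.empty((S, n_draws))
    weights = np.empty(S)
    for s, y in enumerate(shards):
        # Each shard gets the prior raised to the power 1/S, i.e. N(0, S * prior_var).
        prec = 1.0 / (S * prior_var) + len(y) / sigma2
        mean = (y.sum() / sigma2) / prec
        # Stand-in for the shard's own Monte Carlo run (exact draws here).
        draws[s] = rng.normal(mean, np.sqrt(1.0 / prec), size=n_draws)
        weights[s] = prec
    # Weighted average of the g-th draw across shards, for every g.
    return (weights[:, None] * draws).sum(axis=0) / weights.sum()

rng = np.random.default_rng(0)
sigma2, prior_var = 4.0, 100.0
y = rng.normal(1.5, np.sqrt(sigma2), size=10_000)
shards = np.array_split(y, 10)
theta = consensus_draws(shards, sigma2, prior_var, n_draws=20_000, rng=rng)

# Single-machine posterior for comparison.
full_prec = 1.0 / prior_var + len(y) / sigma2
full_mean = (y.sum() / sigma2) / full_prec
```

In this Gaussian case the combined draws have the same mean and variance as the single-machine posterior; for non-Gaussian models the agreement is only approximate, as the abstract notes.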

## Mattias Villani: Speeding up MCMC with Efficient Data Subsampling

The computing time for Markov Chain Monte Carlo (MCMC) algorithms can be prohibitively large for datasets with many observations, especially when the data density for each observation is costly to evaluate. We propose a framework based on a Pseudo-marginal MCMC where the likelihood function is unbiasedly estimated from a random subset of the data, resulting in substantially fewer density evaluations. The subsets are selected using efficient sampling schemes, such as Probability Proportional-to-Size (PPS) sampling where the inclusion probability of an observation is proportional to an approximation of its contribution to the likelihood function. We illustrate the approach on a bivariate probit model with an endogenous treatment effect fitted to a microeconomic dataset of Swedish firms with half a million observations.
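The subsampling idea can be sketched with a Hansen-Hurwitz estimator of the log-likelihood sum under PPS sampling with replacement. This is a swapped-in textbook estimator, not necessarily the one used in the talk, and it targets the log-likelihood only; turning it into a likelihood estimate suitable for pseudo-marginal MCMC needs further steps that are not shown. The Gaussian model, the proxy evaluated at a reference parameter value, and all names are assumptions for illustration.

```python
import numpy as np

def pps_loglik_estimate(loglik_fn, size, m, rng):
    """Estimate sum_i loglik_i from m PPS draws (with replacement).

    loglik_fn(idx) returns the exact log-density terms for the sampled indices only;
    size holds a cheap positive proxy for each term's magnitude.
    """
    p = size / size.sum()                       # selection probabilities
    idx = rng.choice(len(size), size=m, replace=True, p=p)
    return float(np.mean(loglik_fn(idx) / p[idx]))

rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, size=50_000)

def gauss_loglik(theta, idx):
    # Log-density of N(theta, 1) at the selected observations.
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y[idx] - theta) ** 2

all_idx = np.arange(len(y))
# Proxy sizes: term magnitudes at a reference value 0.25, computed once.
size = np.abs(gauss_loglik(0.25, all_idx))

theta = 0.3
estimate = pps_loglik_estimate(lambda idx: gauss_loglik(theta, idx), size, m=500, rng=rng)
exact = float(gauss_loglik(theta, all_idx).sum())
```

Only 500 of the 50,000 density terms are evaluated at the current parameter value; the closer the proxy tracks the true terms, the smaller the variance of the estimate.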

## Mark Steel: Incorporating unobserved heterogeneity in Weibull survival models: A Bayesian approach

Flexible classes of survival models are proposed that naturally deal with both outlying observations and unobserved heterogeneity. We present the family of Rate Mixtures of Weibull distributions, for which a random effect is introduced through the rate parameter. This family contains i.a. the well-known Lomax distribution and can accommodate flexible hazard functions. Covariates are introduced through an Accelerated Failure Time model and we explicitly take censoring into account. We construct a weakly informative prior that combines the structure of the Jeffreys prior with a proper (informative) prior. This prior is shown to lead to a proper posterior distribution under mild conditions. Bayesian inference is implemented by means of a Metropolis-within-Gibbs algorithm. The mixing structure is exploited in order to provide an outlier detection method. Our methods are illustrated using a real dataset on cerebral palsy.

(joint work with C. Vallejos)

## Peter Rossi: Valuation of Patents and Product Features: A Structural Approach

We develop a market-based paradigm to value the enhancement or addition of features to a product. We define the market value of a product or feature enhancement as the change in the equilibrium profits that would prevail with and without the enhancement. In order to compute changes in equilibrium profits, a valid demand system must be constructed to value the feature. The demand system must be supplemented by information on competitive offerings and cost. In many situations, demand data is either not available or not informative with respect to demand for a product feature. Conjoint methods can be used to construct the demand system via a set of designed survey-based experiments. We illustrate our methods using data on the demand for digital cameras and demonstrate how the profits-based metric provides very different answers than the standard welfare or Willingness-To-Pay calculations.

## Markus Pauly: Resampling methods for randomly censored survival data

Studies in biomedical research typically face the problem of incomplete observations, e.g. right-censored survival times. In this context the famous weighted logrank tests are frequently applied to compare two samples of randomly right-censored survival times.

In the first (and major) part of this talk we address the question of how to combine several weighted logrank statistics to achieve good power of the corresponding survival test for a whole linear space or cone of alternatives which are given by hazard rates. This leads to a new class of semiparametric projection-type tests which can be carried out as permutation tests and possess desirable properties.

The second part of the talk deals with more complex time-to-event data structures, such as those given by competing risks models. Here we also show how to apply different resampling techniques, such as the (wild) bootstrap, to construct adequate test decisions for quantities of interest.

## Wolfgang Hörmann: Risk simulation with optimally stratified importance sampling

It is well accepted in the simulation literature that importance sampling is a useful variance reduction technique for risk quantification problems. In this talk we first demonstrate that stratification with optimal sample size allocation is also well suited for rare event simulation problems. In fact, it turns out that the variance reduction can be clearly increased by combining importance sampling with this optimal stratification. Optimal stratification is also well suited to optimize the variance reduction for tail-loss simulations with multiple loss thresholds. With optimal stratification there even exists a simple closed-form solution that minimizes the sum of squared errors and the sum of squared relative errors of all simulation estimates. The simulation examples in this talk include portfolio loss and option pricing problems.
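A minimal sketch of stratification with variance-based sample size allocation, without the importance sampling step: the unit interval is cut into equal-probability strata, a pilot run estimates each stratum's standard deviation, and the main budget is allocated in proportion to probability times standard deviation (Neyman allocation). The toy target, the expected excess of an exponential loss over a threshold of 1, and all names are assumptions for illustration, not the examples of the talk.

```python
import numpy as np

def stratified_estimate(g, n_total, n_pilot, K, rng):
    """Estimate E[g(U)], U ~ Uniform(0,1), with K strata and Neyman allocation."""
    edges = np.linspace(0.0, 1.0, K + 1)
    p = np.diff(edges)                          # stratum probabilities (all 1/K)

    def sample(k, n):
        return g(rng.uniform(edges[k], edges[k + 1], size=n))

    # Pilot run: per-stratum standard deviations.
    sigma = np.array([sample(k, n_pilot).std(ddof=1) for k in range(K)])
    w = p * sigma
    if w.sum() == 0.0:
        w = p                                   # fall back to proportional allocation
    # Allocate n_k proportional to p_k * sigma_k, with at least 2 draws per stratum.
    n_k = np.maximum(2, np.round(n_total * w / w.sum()).astype(int))
    means = np.array([sample(k, n_k[k]).mean() for k in range(K)])
    return float(np.sum(p * means)), n_k

def excess_loss(u):
    # Loss X = -log(1 - u) is Exp(1); payoff is the excess over the threshold 1.
    return np.maximum(-np.log1p(-u) - 1.0, 0.0)

rng = np.random.default_rng(0)
estimate, n_k = stratified_estimate(excess_loss, n_total=20_000, n_pilot=100, K=20, rng=rng)
exact = float(np.exp(-1.0))                     # E[max(X - 1, 0)] for X ~ Exp(1)
```

Strata in which the payoff is identically zero receive only the minimum number of draws, so most of the budget goes to the tail stratum where the variance is concentrated.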