Abstracts Research Seminar Winter Term 2021/22

Thorsten Schmidt:
Arbitrage Principles in Insurance

In this work we study the valuation of insurance contracts from a fundamental viewpoint. We start from the observation that insurance contracts are inherently linked to financial markets, be it via interest rates, or – as in hybrid products, equity-linked life insurance and variable annuities – directly to stocks or indices. By defining portfolio strategies on an insurance portfolio and combining them with financial trading strategies we arrive at the notion of insurance-finance arbitrage (IFA). A fundamental theorem provides two sufficient conditions, one for the presence and one for the absence of IFA. For the first one, it utilizes the conditional law of large numbers and risk-neutral valuation. As a key result we obtain a simple valuation rule, called the QP-rule, which is market consistent and excludes IFA. Utilizing the theory of enlargements of filtrations we construct a tractable framework for general valuation results, working under weak assumptions. The generality of the approach allows us to incorporate many important aspects, like mortality risk or the dependence between mortality and stock markets, which is of utmost importance in the recent corona crisis. For practical applications, we provide an affine formulation which leads to explicit valuation formulas for a large class of hybrid products.

Paul McNicholas:
Using Subset Log-Likelihoods to Trim Outliers in Gaussian Mixture Models

Mixtures of Gaussian distributions have been a popular choice in model-based clustering. Outliers can affect parameter estimation and, as such, must be accounted for. Predicting the proportion of outliers correctly is paramount as it minimizes misclassification error. It is proved that, for a finite Gaussian mixture model, the log-likelihoods of the subset models are distributed according to a mixture of beta distributions. An algorithm is then proposed that predicts the proportion of outliers by measuring the adherence of a set of subset log-likelihoods to a beta mixture reference distribution. This algorithm removes the least likely points, which are deemed outliers, until model assumptions are met.
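The idea of subset log-likelihoods can be illustrated with a minimal sketch. It makes strong simplifying assumptions that are not part of the abstract: a single univariate Gaussian component instead of a mixture, and a fixed number of removals (`n_trim`, a hypothetical parameter) in place of the beta mixture stopping rule. All function names are illustrative.

```python
import math

def gaussian_loglik(xs):
    """Maximized log-likelihood of a univariate Gaussian fitted to xs."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # maximum likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def subset_logliks(xs):
    """Log-likelihood of the model refitted with each point left out in turn."""
    return [gaussian_loglik(xs[:j] + xs[j + 1:]) for j in range(len(xs))]

def trim(xs, n_trim):
    """Repeatedly drop the point whose removal gives the largest subset log-likelihood.

    Assumption: n_trim is fixed in advance; the talk instead stops once the
    subset log-likelihoods adhere to a beta mixture reference distribution.
    """
    kept, removed = list(xs), []
    for _ in range(n_trim):
        ll = subset_logliks(kept)
        j = max(range(len(kept)), key=ll.__getitem__)
        removed.append(kept.pop(j))
    return kept, removed

data = [0.1, -0.2, 0.05, 0.3, -0.1, 0.2, 8.0]
kept, removed = trim(data, n_trim=1)
print(removed)
```

In this toy data set the point 8.0 is the one whose removal raises the subset log-likelihood the most, so it is trimmed first.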

Joint work with Katharine M. Clark.

Achim Zeileis:
Strategies and Software for Robust Color Palettes in Data Visualizations

Color is an integral element in many data visualizations such as maps, heat maps, bar plots, scatter plots, or time series displays. Well-chosen colors can make graphics more appealing and, more importantly, help to clearly communicate the underlying information. Conversely, poorly-chosen colors can obscure information or confuse the readers.

To avoid problems and misinterpretations, we introduce general strategies for selecting robust color palettes that are intuitive for many audiences, including readers with color vision deficiencies. The construction of sequential, diverging, or qualitative palettes is based on appropriate light-dark "luminance" contrasts while suitably controlling the "hue" and the colorfulness ("chroma").

The strategies are also easy to put into practice using computations based on the so-called Hue-Chroma-Luminance (HCL) color model, e.g., as provided in our "colorspace" software package (https://colorspace.R-Forge.R-project.org/). To aid selection and application of these palettes, the package provides scales for use with ggplot2; shiny (and tcltk) apps for interactive exploration (see also https://hclwizard.org/); visualizations of palette properties; accompanying manipulation utilities (like desaturation and lighten/darken); and emulation of color vision deficiencies.
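To make the HCL idea concrete, the following is a rough from-scratch sketch, not the "colorspace" package itself. It converts hue, chroma and luminance (polar CIELUV coordinates, D65 white point) to sRGB hex codes and builds a simple sequential palette with fixed hue and linearly varying chroma and luminance. The function names and the default values (hue 260, chroma 40 to 10, luminance 30 to 90) are assumptions chosen for illustration; out-of-gamut values are simply clipped.

```python
import math

# D65 reference white (CIE XYZ with Y normalized to 1) and its u', v' chromaticity.
XN, YN, ZN = 0.95047, 1.0, 1.08883
_DEN = XN + 15 * YN + 3 * ZN
UN, VN = 4 * XN / _DEN, 9 * YN / _DEN

def _gamma(c):
    """Linear RGB channel to sRGB-encoded channel, clipped to [0, 1]."""
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def hcl_to_hex(h, c, l):
    """Convert hue (degrees), chroma and luminance (0 to 100) to an sRGB hex string."""
    if l <= 0:
        return "#000000"
    # Polar to Cartesian CIELUV coordinates.
    u = c * math.cos(math.radians(h))
    v = c * math.sin(math.radians(h))
    # CIELUV to CIE XYZ.
    y = YN * ((l + 16) / 116) ** 3 if l > 8 else YN * l / 903.3
    up, vp = u / (13 * l) + UN, v / (13 * l) + VN
    x = y * 9 * up / (4 * vp)
    z = y * (12 - 3 * up - 20 * vp) / (4 * vp)
    # CIE XYZ to linear sRGB.
    r = 3.2406 * x - 1.5372 * y - 0.4986 * z
    g = -0.9689 * x + 1.8758 * y + 0.0415 * z
    b = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return "#" + "".join(f"{round(255 * _gamma(ch)):02X}" for ch in (r, g, b))

def sequential_palette(n, h=260, c=(40, 10), l=(30, 90)):
    """n colors with fixed hue and linearly varying chroma and luminance."""
    steps = [i / (n - 1) for i in range(n)]
    return [hcl_to_hex(h, c[0] + t * (c[1] - c[0]), l[0] + t * (l[1] - l[0]))
            for t in steps]

palette = sequential_palette(5)
print(palette)
```

Because luminance increases monotonically along the palette, the ordering of the colors remains readable in grayscale, which is the main point of the luminance-based construction.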

Ville Satopää:
Herding in Probabilistic Forecasts

Decision makers often ask experts to forecast a future state. Experts, however, can be biased. In the economics and psychology literature, one extensively studied behavioral bias is called herding. Under strong levels of herding, disclosure of public information may lower forecasting accuracy. This result, however, has been derived only for point forecasts. In this work, we consider experts' probabilistic forecasts under herding, find a closed-form expression for the first two moments of a unique equilibrium forecast, and show that, due to herding, the experts report locations that are too similar and inflate the variance of their forecasts. Furthermore, we show that the negative externality of public information no longer holds. In addition to reacting to new information as expected, probabilistic forecasts contain more information about the experts' full beliefs and interpersonal structure. This facilitates model estimation. To this end, we consider a one-shot setting with one forecast per expert and show that our model is identifiable up to an infinite number of solutions based on point forecasts, but up to two solutions based on probabilistic forecasts. We then provide a Bayesian estimation procedure for these two solutions and apply it to economic forecasting data collected by the European Central Bank and the Federal Reserve Bank of Philadelphia. We find that, on average, the experts invest around 19% of their efforts into making similar forecasts. The level of herding shows an increasing trend from 1999 to 2007 but drops sharply during the financial crisis of 2007-2009, and then rises again until 2019.

Herman K. van Dijk:
Quantifying Time-Varying Forecast Uncertainty and Risk for the Real Price of Oil

A novel and numerically efficient approach is proposed to quantify forecast uncertainty of the real price of oil using a combination of probabilistic individual model forecasts. This combination method extends earlier approaches that have been applied to oil price forecasting, by allowing for sequential updating of time-varying combination weights, estimation of time-varying forecast biases and facets of miscalibration of individual forecast densities, and time-varying inter-dependencies among models. To illustrate the usefulness of the method, an extensive set of empirical results is presented about time-varying forecast uncertainty and risk for the real price of oil over the period 1974-2018. It is shown that the combination approach systematically outperforms commonly used benchmark models and combination approaches, both in terms of point and density forecasts. The dynamic patterns of the estimated individual model weights are highly time-varying, reflecting a large time variation in the relative performance of the various individual models. The combination approach has built-in diagnostic information measures about forecast inaccuracy and/or model set incompleteness, which provide clear signals of model incompleteness during three crisis periods. To highlight that the approach can also be useful for policy analysis, a basic analysis of profit-loss and hedging against price risk is presented.
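As a rough illustration of sequentially updated, time-varying combination weights, the sketch below uses a much simpler stand-in than the method of the talk: each model's weight is proportional to the exponential of its discounted cumulative log predictive score. The two Gaussian forecasting models, the observations, and the discount factor 0.9 are all made up for the example; bias correction, calibration and model inter-dependence are not covered.

```python
import math

def normal_logpdf(y, mean, sd):
    """Log density of a Normal(mean, sd^2) forecast evaluated at the outcome y."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2

def update_weights(scores, log_densities, discount=0.9):
    """One sequential update of discounted log scores and combination weights.

    scores: running discounted log score per model (before the update)
    log_densities: log predictive density of the new observation per model
    """
    new_scores = [discount * s + ld for s, ld in zip(scores, log_densities)]
    m = max(new_scores)  # subtract the maximum for numerical stability
    unnorm = [math.exp(s - m) for s in new_scores]
    total = sum(unnorm)
    return new_scores, [u / total for u in unnorm]

# Two hypothetical models with fixed predictive densities N(0, 1) and N(2, 1).
models = [(0.0, 1.0), (2.0, 1.0)]
observations = [0.0] * 5 + [2.0] * 5  # the better model changes halfway through

scores = [0.0, 0.0]
weight_path = []
for y in observations:
    log_dens = [normal_logpdf(y, mean, sd) for mean, sd in models]
    scores, weights = update_weights(scores, log_dens)
    weight_path.append(weights)

print(weight_path[4], weight_path[-1])
```

The discount factor controls how quickly past performance is forgotten, so the weights can move from the first model to the second once the data change regime.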

Stefano M. Iacus:
Subjective Well-Being and Social Media

In this talk we briefly review some literature on the measurement of well-being through official statistics and surveys, and then focus on how to extract a version of subjective well-being from social media posts.
We present the basics of the sentiment analysis approach used to construct a new social media based indicator of subjective well-being.
We will focus mainly on an application to Italy and Japan, for which we construct the SWB-I and SWB-J indexes using Twitter data from 2013 until mid-2020.
These countries are interesting because of their similarities and differences, which are captured to some extent by the indicators.
We then discuss how these subjective well-being indexes relate to traditional measures of well-being, and how they decreased during the COVID-19 pandemic.

References:
An extensive discussion is presented in the newly published book on this topic. The indicator SWB-I has been introduced in the Journal of Official Statistics along with a method to control for social media bias. Its Japanese counterpart, namely SWB-J, is available here. Finally, those interested in the COVID-19 impact on these subjective well-being indicators may want to read this preprint.

Alexander J. McNeil:
Time Series Models With Infinite-Order Partial Copula Dependence

Stationary and ergodic time series can be constructed using an s-vine decomposition based on sets of bivariate copula functions. The extension of such processes to infinite copula sequences is considered and shown to yield a rich class of models that generalizes Gaussian ARMA and ARFIMA processes to allow both non-Gaussian marginal behaviour and a non-Gaussian description of the serial partial dependence structure. Extensions of classical causal and invertible representations of linear processes to general s-vine processes are proposed and investigated. A practical and parsimonious method for parameterizing s-vine processes using the Kendall partial autocorrelation function is developed. The potential of the resulting models to give improved statistical fits in many applications is indicated with examples using macroeconomic data.

Joint work with Martin Bladt.

Gael M. Martin:
Loss-Based Variational Bayes Prediction

We propose a new method for Bayesian prediction that caters for models with a large number of parameters and is robust to model misspecification. Given a class of high-dimensional (but parametric) predictive models, this new approach constructs a posterior predictive using a variational approximation to a loss-based, or Gibbs, posterior that is directly focused on predictive accuracy. The theoretical behavior of the new prediction approach is analyzed and a form of optimality demonstrated. Applications to both simulated and empirical data using high-dimensional Bayesian neural network and autoregressive mixture models demonstrate that the approach provides more accurate results than various alternatives, including misspecified likelihood-based predictions. Whilst the current paper uses observation-driven models only as the predictive models, the presentation will contain theoretical results, and some preliminary numerical results, related to state space models. In this setting, the use of the variational approximation has to be handled with more care.

Joint work with David T Frazier, Ruben Loaiza-Maya and Bonsoo Koo.

Ursula Laa:
Section Pursuit

Multivariate data is often visualized using linear projections, produced by techniques such as principal component analysis, linear discriminant analysis, and projection pursuit. A problem with projections is that they obscure low and high-density regions near the center of the distribution. Sections, or slices, can help to reveal them. In this talk I will introduce section pursuit, a new method to search for interesting slices of the data. Linear projections are used to define sections of the parameter space, and we calculate interestingness by comparing the distribution of observations inside and outside a section. By optimizing this index, it is possible to reveal features such as holes (low density) or grains (high density), which can be useful when data distributions depart from uniform or normal, as in visually exploring nonlinear manifolds and functions in multivariate space. I will show how section pursuit can be applied when exploring decision boundaries from classification models or when exploring subspaces induced by complex inequality conditions from a multiple parameter model.
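The notion of a section and of an inside-versus-outside comparison can be sketched in a few lines. A point is taken to lie inside the section if its orthogonal distance to the projection plane (through a chosen center) is below a thickness threshold. As a stand-in for the index used in the talk, which the abstract does not specify, the sketch compares binned projections inside and outside the section by total variation distance. All names, the bin count and the thickness are assumptions.

```python
import math
import random

def in_slice(point, basis, center, thickness):
    """Return (inside, coords): slice membership and coordinates in the plane.

    basis is a list of orthonormal vectors spanning the projection plane.
    """
    z = [p - c for p, c in zip(point, center)]
    coords = [sum(zi * bi for zi, bi in zip(z, b)) for b in basis]
    # Component of z lying in the plane; the remainder is orthogonal to it.
    in_plane = [sum(cj * b[i] for cj, b in zip(coords, basis)) for i in range(len(z))]
    dist = math.sqrt(sum((zi - pi) ** 2 for zi, pi in zip(z, in_plane)))
    return dist < thickness, coords

def bin_index(coords, bins, extent):
    """Grid cell of plane coordinates, or None if outside [-extent, extent)."""
    cell = []
    for v in coords:
        k = math.floor((v + extent) / (2 * extent) * bins)
        if not 0 <= k < bins:
            return None
        cell.append(k)
    return tuple(cell)

def slice_index(data, basis, center, thickness, bins=4, extent=1.0):
    """Total variation distance between binned projections inside and outside."""
    counts = {True: {}, False: {}}
    for point in data:
        inside, coords = in_slice(point, basis, center, thickness)
        cell = bin_index(coords, bins, extent)
        if cell is not None:
            counts[inside][cell] = counts[inside].get(cell, 0) + 1
    n_in = sum(counts[True].values()) or 1
    n_out = sum(counts[False].values()) or 1
    cells = set(counts[True]) | set(counts[False])
    return 0.5 * sum(abs(counts[True].get(c, 0) / n_in - counts[False].get(c, 0) / n_out)
                     for c in cells)

# Toy data: a hollow shell in 3-D (radius between 0.8 and 1.0), i.e. a hole at the center.
rng = random.Random(0)
data = []
for _ in range(500):
    g = [rng.gauss(0, 1) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in g))
    radius = rng.uniform(0.8, 1.0)
    data.append([radius * x / norm for x in g])

basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # project onto the first two coordinates
index = slice_index(data, basis, center=[0.0, 0.0, 0.0], thickness=0.2)
print(index)
```

In the plain projection the hole is hidden because points above and below the plane fill the center; inside the thin section the center stays empty, which is what the index picks up.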

Damir Filipović:
Stripping the Discount Curve – a Robust Machine Learning Approach

We propose a non-parametric method, kernel ridge regression, for estimating the discount curve from Treasury securities, with regularization penalties in terms of the smoothness of the approximating discount curve. We provide analytical solutions, which are straightforward to implement. We then apply our method to a large data set of U.S. Treasury securities to extract term structure estimates at daily frequency. The resulting term structure estimates closely match benchmarks from the literature but have smaller pricing errors.
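A generic kernel ridge regression for a discount curve can be sketched as follows; it is meant only to show why the solution is available in closed form, and it is not the authors' specification. The curve is written as g(t) = 1 + h(t), and with cash flow matrix C, prices P and kernel matrix K on the cash flow dates, the minimizer of the squared pricing errors plus lam times the squared kernel norm of h is h(t) = k(t, x)' beta with beta = C' (C K C' + lam I)^(-1) (P - C 1). As an assumption, the kernel here is min(s, t), which penalizes the integrated squared first derivative of h and enforces g(0) = 1. The bond data are synthetic.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def kernel(s, t):
    """Assumed kernel: min(s, t), a first-derivative smoothness penalty with h(0) = 0."""
    return min(s, t)

def fit_discount_curve(C, P, x, lam):
    """Closed-form kernel ridge estimate of the discount curve g(t) = 1 + h(t)."""
    m, n = len(C), len(x)
    K = [[kernel(xi, xj) for xj in x] for xi in x]
    CK = [[sum(C[i][k] * K[k][j] for k in range(n)) for j in range(n)] for i in range(m)]
    G = [[sum(CK[i][k] * C[j][k] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    y = [P[i] - sum(C[i]) for i in range(m)]  # prices minus undiscounted cash flows
    alpha = solve(G, y)
    beta = [sum(C[i][j] * alpha[i] for i in range(m)) for j in range(n)]
    return lambda t: 1.0 + sum(kernel(t, xj) * bj for xj, bj in zip(x, beta))

# Synthetic example: ten annual-coupon bonds priced off a known curve exp(-0.03 t).
x = [float(t) for t in range(1, 11)]          # cash flow dates in years
true_curve = lambda t: math.exp(-0.03 * t)
coupon = 0.02
C = [[coupon if j < i else (1 + coupon if j == i else 0.0) for j in range(10)]
     for i in range(10)]
P = [sum(c * true_curve(t) for c, t in zip(row, x)) for row in C]

g = fit_discount_curve(C, P, x, lam=1e-6)
print([round(g(t), 4) for t in (0.0, 1.0, 5.0, 10.0)])
```

With a small penalty the fitted curve reprices the synthetic bonds almost exactly; a larger `lam` trades pricing accuracy for a smoother curve.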

Joint work with Kay Giesecke, Markus Pelger, and Ye Ye.

