Abstracts Research Seminar Winter Term 2015/16
Marius Hofert: Improved Algorithms for Computing Worst Value-at-Risk: Numerical Challenges and the Adaptive Rearrangement Algorithm
Numerical challenges inherent in algorithms for computing worst Value-at-Risk in homogeneous portfolios are identified and solutions as well as words of warning concerning their implementation are provided. Furthermore, both conceptual and computational improvements to the Rearrangement Algorithm for approximating worst Value-at-Risk for portfolios with arbitrary marginal loss distributions are given. In particular, a novel Adaptive Rearrangement Algorithm is introduced and investigated. These algorithms are implemented using the R package qrmtools.
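The core step of the (non-adaptive) Rearrangement Algorithm can be sketched as follows: discretize the upper tails of the marginal quantile functions into a matrix, then repeatedly oppositely order each column against the row sums of the remaining columns. This is an illustrative sketch only (grid size, tolerance and the Pareto margins in the example are our choices, not the talk's; the adaptive version and the qrmtools implementation add refinements beyond this).

```python
import numpy as np

def rearrangement_algorithm(qfuns, alpha=0.99, N=500, tol=1e-6, max_iter=100):
    """Basic Rearrangement Algorithm: lower-bound approximation of worst VaR_alpha.

    qfuns: list of marginal quantile functions F_j^{-1}.
    """
    # Discretize each marginal's upper alpha-tail on an equispaced midpoint grid.
    p = alpha + (1 - alpha) * (np.arange(N) + 0.5) / N
    X = np.column_stack([q(p) for q in qfuns])
    last = -np.inf
    for _ in range(max_iter):
        for j in range(X.shape[1]):
            rest = X.sum(axis=1) - X[:, j]      # row sums of the other columns
            # Oppositely order column j with respect to 'rest':
            # largest entries go where the rest-sum is smallest.
            X[np.argsort(rest), j] = np.sort(X[:, j])[::-1]
        bound = X.sum(axis=1).min()             # current approximation of worst VaR
        if abs(bound - last) <= tol:
            break
        last = bound
    return bound
```

For instance, with three Pareto(2) margins, `rearrangement_algorithm([lambda p: (1 - p) ** -0.5 - 1] * 3)` returns a bound above the comonotone value of about 27, since each opposite-ordering step can only increase the minimum row sum.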
Claudia Klüppelberg: Modelling, estimation and model assessment of extreme space-time data
Max-stable processes can be viewed as the natural infinite-dimensional generalisation of multivariate extreme value distributions. We focus on the Brown-Resnick space-time process, a prominent max-stable model. We extend existing spatially isotropic models to anisotropic versions and use pairwise likelihood to estimate the model parameters. For regular grid observations we prove strong consistency and asymptotic normality of the estimators for fixed and increasing spatial domain, when the number of observations in time tends to infinity. We also present a statistical test for spatial isotropy versus anisotropy, which is based on asymptotic confidence intervals of the pairwise likelihood estimators. We fit the spatially anisotropic Brown-Resnick model and apply the proposed test to precipitation measurements in Florida. In addition, we present some recent diagnostic tools for model assessment.
This is joint work with Sven Buhl.
Andreas Hamel: From Multi-Utility Representations to Stochastic Orders and Central Regions - A Set Optimization Perspective
Many important order relations in Economics, Finance and Statistics can be represented by or are defined via families of scalar functions. This includes some non-complete preferences in Economics, in particular Bewley preferences, and stochastic orders. It will be demonstrated that each such order generates (i) a specific closure (hull) operator and (ii) a specific complete lattice of sets. In turn, the latter will be used as image space for optimization problems with set-valued objective functions. A solution concept for such problems will be discussed and applied to particular cases. A current project aims at extending the approach to risk evaluation of multivariate positions via statistical depth functions.
Nicolas Chopin: Sequential quasi-Monte Carlo and extensions
(joint work with Mathieu Gerber) In this talk, I will discuss SQMC (Sequential quasi-Monte Carlo), a class of algorithms obtained by introducing QMC point sets in particle filtering. Like particle filters, SQMC makes it possible to compute the likelihood and the sequence of filtering distributions of a given state-space (hidden Markov) model. But SQMC converges faster than Monte Carlo filters. I will also discuss how to perform smoothing and how to apply SQMC within PMCMC (Particle Markov chain Monte Carlo), and present some numerical illustrations.
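The particle-filtering skeleton that SQMC builds on can be sketched for a simple linear-Gaussian state-space model. This sketch uses ordinary pseudo-random numbers rather than QMC point sets (the model parameters and the crude initialization are illustrative choices, not from the talk):

```python
import numpy as np

def bootstrap_filter(y, n_particles=500, rho=0.9, sigma_x=1.0, sigma_y=1.0, seed=0):
    """Bootstrap particle filter for X_t = rho X_{t-1} + sigma_x eps_t,
    Y_t = X_t + sigma_y eta_t. Returns filtering means and a log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n_particles)            # crude initial particles
    loglik = 0.0
    means = np.empty(len(y))
    const = -0.5 * np.log(2 * np.pi * sigma_y ** 2)
    for t, yt in enumerate(y):
        if t > 0:
            x = rho * x + rng.normal(0.0, sigma_x, n_particles)   # propagate
        logw = const - 0.5 * ((yt - x) / sigma_y) ** 2            # observation density
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())                   # likelihood increment
        w /= w.sum()
        means[t] = np.sum(w * x)                         # filtering mean
        x = x[rng.choice(n_particles, n_particles, p=w)] # multinomial resampling
    return means, loglik
```

SQMC replaces the propagation and resampling randomness above with low-discrepancy point sets, which is the source of its faster convergence.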
Christian Brownlees: Realized Networks
In this work we introduce a LASSO-based regularization procedure for large dimensional realized covariance estimators of log-prices. The procedure consists of shrinking the off-diagonal entries of the inverse realized covariance matrix towards zero. This technique produces covariance estimators that are positive definite and have a sparse inverse. We name the regularized estimator realized network, since estimating a sparse inverse covariance matrix is equivalent to detecting the partial correlation network structure of the log-prices. We focus in particular on applying this technique to the Multivariate Realized Kernel and the Two-Scales Realized Covariance estimators based on refresh time sampling. These are consistent covariance estimators that allow for market microstructure effects and asynchronous trading. The paper studies the large sample properties of the regularized estimators and establishes conditions for consistent estimation of the integrated covariance and for consistent selection of the partial correlation network. As a by-product of the theory, we also establish novel concentration inequalities for the Multivariate Realized Kernel estimator. The methodology is illustrated with an application to a panel of US blue chips throughout 2009. Results show that the realized network estimator outperforms its unregularized counterpart in an out-of-sample global minimum variance portfolio prediction exercise.
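The link between a sparse precision matrix and a partial correlation network can be illustrated with a crude sketch: compute partial correlations from the sample precision matrix and threshold small entries. Note that hard thresholding is only a stand-in for the LASSO penalty used in the paper (which guarantees positive definiteness), and the threshold value is an arbitrary illustrative choice:

```python
import numpy as np

def partial_correlation_network(returns, threshold=0.1):
    """Sketch: detect a partial correlation network from return data.

    returns: T x d matrix of (log-)returns.
    Returns the partial correlation matrix and a boolean adjacency matrix.
    """
    S = np.cov(returns, rowvar=False)
    K = np.linalg.inv(S)                      # sample precision matrix
    d = np.sqrt(np.diag(K))
    pcorr = -K / np.outer(d, d)               # partial correlations: -K_ij / sqrt(K_ii K_jj)
    np.fill_diagonal(pcorr, 1.0)
    # Edge i-j present if the partial correlation exceeds the threshold.
    adj = (np.abs(pcorr) > threshold) & ~np.eye(len(d), dtype=bool)
    return pcorr, adj
```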
Martyn Plummer: Cuts in Bayesian Graphical Models
Large Bayesian models that combine data from different sources are sometimes difficult to manage, and may exhibit numerical problems such as lack of convergence of MCMC. There is a developing interest in "modularization" as an alternative to full probability modelling, i.e. dividing large models into smaller "modules" and controlling the degree of communication between them. Cuts are an extreme form of modularization. Informally, a cut works as a valve in the model graph, preventing information from flowing back from the data to certain parameters. Cuts have been used in many applications, but are particularly common in pharmacokinetic / pharmacodynamic (PK/PD) models. They have been popularized by the OpenBUGS software which provides a cut function and a modified MCMC algorithm. Unfortunately, cuts do not work. I have shown that the OpenBUGS cut algorithm does not converge to a well-defined distribution, in the sense that the limiting distribution of the Markov chain depends on which sampling methods are used. This leaves us in a situation with a popular idea that is widely used but has no underlying theory and no valid implementation. I will speculate on where to go next.
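The intent of a cut can be sketched with a two-module conjugate normal example (the model and all numbers are illustrative, not from the talk): draw theta from its posterior given the first data set only, then draw phi conditionally on theta and the second data set, so that no information flows back from the second data set to theta. This sequential scheme is well defined, unlike the within-chain OpenBUGS algorithm criticized in the talk:

```python
import numpy as np

def cut_sampler(data1, data2, n_draws=5000, seed=0):
    """Cut as sequential sampling in an illustrative conjugate model:
    theta ~ N(0,1), data1_i ~ N(theta, 1); phi ~ N(0,1), data2_i ~ N(theta + phi, 1).
    theta sees only data1; phi sees theta and data2."""
    rng = np.random.default_rng(seed)
    n1, n2 = len(data1), len(data2)
    # Conjugate posterior for theta given data1 alone (no feedback from data2).
    var1 = 1.0 / (1.0 + n1)
    mu1 = var1 * np.sum(data1)
    theta = rng.normal(mu1, np.sqrt(var1), n_draws)
    # For each theta draw, conjugate posterior of phi given theta and data2.
    var2 = 1.0 / (1.0 + n2)
    mu2 = var2 * (np.sum(data2) - n2 * theta)
    phi = rng.normal(mu2, np.sqrt(var2))
    return theta, phi
```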
Yee Whye Teh: Bayesian Nonparametrics in Mixture and Admixture Modelling
Mixture and admixture models are ubiquitous across many disciplines where data exhibit clustering structure. Examples include document topic modelling, genetic admixture modelling and subgroup analysis. One of the difficulties in applying such methods is model selection, where one needs to determine the appropriate number of clusters in the data. In this talk I will overview the Bayesian nonparametric approach to mitigating the model selection difficulty. The idea is to allow for an unbounded number of clusters to potentially explain the data, and to use the Bayesian approach to inference to avoid overfitting. The approach builds upon the Dirichlet process and the hierarchical Dirichlet process, and I will also describe more recent work trying to model time varying clustering structure.
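The "unbounded number of clusters" idea can be illustrated by the Chinese restaurant process, the clustering scheme induced by the Dirichlet process: each new item joins an existing cluster with probability proportional to its size, or starts a new cluster with probability proportional to the concentration parameter alpha (a standard construction; the specific code is just a sketch):

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Sample a random partition of n items from the Chinese restaurant process."""
    rng = random.Random(seed)
    tables = []        # tables[k] = number of items in cluster k
    assignments = []   # cluster index of each item
    for i in range(n):
        # New cluster with prob alpha/(i+alpha); cluster k with prob tables[k]/(i+alpha).
        r = rng.uniform(0, i + alpha)
        if r < alpha or not tables:
            tables.append(1)
            assignments.append(len(tables) - 1)
        else:
            r -= alpha
            for k, c in enumerate(tables):
                if r < c:
                    tables[k] += 1
                    assignments.append(k)
                    break
                r -= c
    return assignments, tables
```

The number of occupied clusters grows without bound as n grows (at rate roughly alpha log n), which is what lets the model sidestep choosing the number of clusters in advance.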
Christian Robert: Approximate Bayesian computation for model choice via random forests
Introduced in the late 1990s, the ABC method can be considered from several perspectives, ranging from a purely practical motivation towards handling complex likelihoods to non-parametric justifications. We propose here a different analysis of ABC techniques and in particular of ABC model selection. Our exploration focuses on the idea that generic machine learning tools like random forests (Breiman, 2001) can help in conducting model selection among the highly complex models covered by ABC algorithms. Both theoretical and empirical results indicate that posterior probabilities are poorly estimated by ABC. I will describe how our search for an alternative first led us to abandon the use of posterior probabilities of the models under comparison as evidence tools. As a first substitute, we proposed to select the most likely model via a random forest procedure and to compute posterior predictive performances of the corresponding ABC selection method. It is only recently that we realised that random forest methods can also be adapted to the further estimation of the posterior probability of the selected model. I will also discuss our recommendation towards sparse implementation of the random forest tree construction, using severe subsampling and reduced reference tables. The performance, in terms of power in model choice and gain in computation time, of the resulting ABC-random forest methodology is illustrated on several population genetics datasets.
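The baseline ABC model-choice scheme that the random forest approach improves upon can be sketched as plain rejection sampling: simulate (model, data) pairs from the prior, and approximate the posterior model probability by the model frequencies among simulations whose summaries fall closest to the observed ones. The two toy models, the summaries and the tolerance here are illustrative choices; the talk's method replaces the acceptance step with a random forest classifier trained on the reference table:

```python
import numpy as np

def abc_model_choice(y_obs, n_sims=10000, eps_quantile=0.01, seed=0):
    """ABC rejection for choosing between M1: N(0,1) and M2: Laplace(0,1).
    Returns the ABC estimate of P(M2 | y_obs) under a uniform model prior."""
    rng = np.random.default_rng(seed)
    n = len(y_obs)

    def summary(x):
        # (std, mean absolute value) separates the two models' tail behaviour.
        return np.array([np.std(x), np.mean(np.abs(x))])

    s_obs = summary(y_obs)
    models = rng.integers(0, 2, n_sims)           # uniform prior over {M1, M2}
    dists = np.empty(n_sims)
    for i, m in enumerate(models):
        x = rng.normal(0, 1, n) if m == 0 else rng.laplace(0, 1, n)
        dists[i] = np.linalg.norm(summary(x) - s_obs)
    eps = np.quantile(dists, eps_quantile)        # keep the closest simulations
    accepted = models[dists <= eps]
    return accepted.mean()
```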
[This is joint work with Jean-Marie Cornuet, Arnaud Estoup, Jean-Michel Marin and Pierre Pudlo. The current version is available as arxiv.org/pdf/1406.6288v3]
Ivan Mizera: Borrowing Strength from Experience: Empirical Bayes Methods and Convex Optimization
We consider classical compound decision models, mixtures of normal and Poisson distributions, from the nonparametric perspective, that is, with general mixing distribution. In this context, the otherwise rudimentary predictions for unobserved random effects can be remarkably improved by using experience: casting the problem in the classical empirical Bayes framework and elucidating either the unknown prior distribution or directly the optimal prediction rule from the estimates of the marginal distribution of the data. Prominent examples include the prediction of the individual success proportion in popular sports, or estimating the Poisson rate in actuarial science. We discuss two methods that owe their feasibility to modern convex optimization: the first introduces a nonparametric maximum likelihood estimator of the mixture density subject to a monotonicity constraint on the resulting Bayes rule; the second implements, as an alternative to earlier-proposed EM-algorithm strategies, a new approach to the Kiefer-Wolfowitz nonparametric maximum likelihood estimator for mixtures, with the resulting reduction in computational effort of several orders of magnitude for typical problems. The procedures are compared with several existing alternatives in simulations, which focus in particular on situations with sparse mixing distributions.
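The Kiefer-Wolfowitz NPMLE for a Gaussian location mixture can be sketched via the EM strategy that the talk's convex-optimization approach is designed to replace: fix a grid of support points and iterate the standard weight updates. The grid, sample sizes and iteration count below are illustrative (the interior-point method advocated in the talk reaches the same optimum orders of magnitude faster):

```python
import numpy as np

def npmle_mixture_em(y, grid, sigma=1.0, n_iter=200):
    """EM iterations for the Kiefer-Wolfowitz NPMLE of a Gaussian location
    mixture, with the mixing distribution supported on a fixed grid."""
    # Likelihood matrix L[i, k] proportional to phi((y_i - grid_k) / sigma).
    L = np.exp(-0.5 * ((y[:, None] - grid[None, :]) / sigma) ** 2)
    w = np.full(len(grid), 1.0 / len(grid))       # uniform initial mixing weights
    for _ in range(n_iter):
        post = L * w                              # E-step: responsibilities
        post /= post.sum(axis=1, keepdims=True)
        w = post.mean(axis=0)                     # M-step: update mixing weights
    return w

def posterior_mean_rule(y, grid, w, sigma=1.0):
    """Empirical Bayes prediction: posterior mean of the effect given each y_i."""
    L = np.exp(-0.5 * ((y[:, None] - grid[None, :]) / sigma) ** 2)
    post = L * w
    return (post * grid).sum(axis=1) / post.sum(axis=1)
```

Plugging the estimated mixing weights into the posterior mean rule yields the "borrowing strength" effect: the empirical Bayes predictions improve on using the raw observations themselves.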
Omiros Papaspiliopoulos: Building MCMC
The talk provides an overview of techniques for building Markov chain Monte Carlo algorithms, connecting some of the classic works in the area to very recent methods used for sampling numerically intractable distributions, methods based on transformations, and methods based on diffusions. The work in the talk is based on a book jointly written with Gareth O. Roberts.
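The most classical of these building blocks, random-walk Metropolis, can be sketched in a few lines (a generic textbook construction; the step size and target in the example are arbitrary illustrative choices):

```python
import numpy as np

def random_walk_metropolis(logpdf, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis for a one-dimensional target.

    logpdf: unnormalized log density of the target distribution.
    """
    rng = np.random.default_rng(seed)
    x = float(x0)
    lp = logpdf(x)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()            # symmetric Gaussian proposal
        lp_prop = logpdf(prop)
        # Accept with probability min(1, pi(prop) / pi(x)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[i] = x
    return chain
```

Because the proposal is symmetric, the Hastings correction cancels and only the ratio of target densities enters the acceptance step; the more elaborate transformation- and diffusion-based methods of the talk modify the proposal while keeping this accept/reject structure.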