Research Seminar Series in Statistics and Mathematics

Wirtschaftsuniversität Wien, Departments 4, D4.4.008, 09:00 - 10:30

Type Lecture / discussion
Speaker Wayne Oldford (Department of Statistics and Actuarial Science, University of Waterloo)
Organizer Institut für Statistik und Mathematik

Visualization is an important asset to data analysis, both in communicating results and in explicating the analysis narrative which led to them. However, it is sometimes at its most powerful when used prior to commitment to any analysis narrative, simply to explore the data with minimal prejudice. This is exploratory visualization and its goal is to reveal structure in the data, especially unanticipated structure. Insights gained from exploratory visualization can inform and possibly significantly affect any subsequent analysis narrative.
The size of modern data, in dimensionality and in numbers of observations, poses a formidable challenge for exploratory visualization. First, dimensionality is limited to at most three physical dimensions both by the human visual system and by modern display technology. Second, the number of observations that can be individually displayed on any device is constrained by the magnitude and resolution of its display screen. The challenge is to develop methods and tools that enable exploratory visualization of modern data in the face of such constraints.
Some methods and software which we have designed to address this challenge will be presented in this talk (based on joint work with Adrian Waddell, Adam Rahman, Marius Hofert, or Catherine Hurley). Most of the talk will focus on the problem of exploring higher-dimensional spaces, largely through defining, following, and presenting “interesting” low-dimensional trajectories through high-dimensional space. Both spatial and temporal strategies will be used to allow visual traversal of the trajectories. Software which facilitates exploration via these trajectories will be demonstrated (based mainly on the interactive and extendible exploratory visualization system ‘loon’ and on ‘zenplots’, each of which is available as an ‘R’ package from CRAN). If time permits, our methodology (and software) for reducing the number of observations (without compromising too much either the empirical distribution or important geometric features of the high-dimensional point cloud) will also be presented.

