Visualization

Team: Axel Benner, Manuela Hummel, Christina Kunz, Maral Saadati (former members: Xiaoqi Jiang, Martin Sill, Alla Slynko, Manuela Zucknick)

As biomedical research continues to develop and new data types arise as a consequence, appropriate visualization techniques are needed. Well-designed statistical graphics are necessary to summarize and effectively communicate data, so visualization and exploratory data analysis are traditional statistical research fields that matter for all types of statistical data analysis. With high-dimensional data sets in particular, tables and summary statistics are often not sufficient to gain a deeper understanding of the data. Exploratory analysis of such data becomes even more challenging when data of different types, such as clinical, gene expression, and copy number variation data, must be combined. In these cases we must design new statistical graphics and develop visualization software tailored to the specific research questions and data sets that we are working with.

Ongoing projects

The DORES plot

The DORES (dose-response screening) plot serves several visualization purposes: it can be used to screen and compare dose-response data from a large number of experiments, and also to optimize the dose (concentration) range and levels in Dose Range Finding (DRF) studies. Its greatest advantage is that a single plot gives an overview of multiple dose-response experiments and conveys a lot of information about the experimental data, namely which concentration levels and ranges were chosen for each experiment and whether all experiments exhibit explicit and complete dose-response relationships. Additionally, an effective concentration estimate of interest (e.g., ECx or ICx) and its corresponding confidence interval are displayed for each experiment. Experimental data from a dose-response study involving a large number of experiments can therefore be compared easily using a DORES plot.
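To make the ECx quantity concrete, the following is a minimal sketch that assumes a four-parameter log-logistic dose-response curve. The model choice, the function names (`response`, `ec_x`, `within_tested_range`) and the parameter names are illustrative assumptions; the text above does not state which model or software the DORES plot uses. Under this assumed model, ECx follows in closed form from the EC50 and the slope parameter.

```python
def response(conc, lower, upper, ec50, hill):
    """Assumed four-parameter log-logistic curve.

    The response falls from `upper` (at concentration 0) towards `lower`
    as the concentration increases, for hill > 0.
    """
    if conc <= 0:
        return upper
    return lower + (upper - lower) / (1.0 + (conc / ec50) ** hill)


def ec_x(x, ec50, hill):
    """Concentration producing x percent of the maximal effect (0 < x < 100).

    Derived by solving (c / ec50) ** hill = x / (100 - x) for c.
    """
    if not 0 < x < 100:
        raise ValueError("x must lie strictly between 0 and 100")
    return ec50 * (x / (100.0 - x)) ** (1.0 / hill)


def within_tested_range(concentrations, ci_lower, ci_upper):
    """Hypothetical helper: does the confidence interval of an ECx estimate
    lie inside the range of tested (positive) concentrations?"""
    tested = [c for c in concentrations if c > 0]
    return min(tested) <= ci_lower and ci_upper <= max(tested)
```

A confidence interval that extends beyond the tested concentrations, as flagged by a check like `within_tested_range`, is one possible indication that the chosen concentration range does not cover the complete dose-response relationship.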


Visualization of MCMC results for high-dimensional Bayesian hierarchical models

Full posterior inference for Bayesian models provides much more information than classical (frequentist) modeling, because the entire posterior distribution rather than just individual parameter estimates can be explored. In a Bayesian model or variable selection framework, it is of particular interest that the joint probability distribution of variables and models can be studied in addition to the marginal (relative) importance of individual variables. However, for high-dimensional models the space over which this posterior distribution is defined becomes immense and highly complex, which makes visualization techniques necessary to summarize the available information.
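The distinction between marginal and joint summaries can be sketched as follows. The example assumes that the MCMC output of a variable selection model is available as one 0/1 inclusion vector per iteration; this data layout and the function name `summarize_models` are assumptions made for illustration, not a description of the group's software. The sketch estimates marginal inclusion frequencies, pairwise joint inclusion frequencies, and the relative frequency of each visited model, which are the kinds of quantities a visualization would then have to display.

```python
from collections import Counter
from itertools import combinations


def summarize_models(samples):
    """Summarize sampled inclusion vectors from a variable selection MCMC run.

    samples: sequence of equal-length 0/1 sequences, one per MCMC iteration.
    Returns (marginal, joint, model_freq):
      marginal   - list of inclusion frequencies, one per variable
      joint      - dict mapping variable pairs (i, j), i < j, to the
                   frequency with which both are included together
      model_freq - dict mapping each visited model (as a tuple) to its
                   relative frequency, most frequent model first
    """
    samples = [tuple(s) for s in samples]
    n = len(samples)
    p = len(samples[0])

    marginal = [sum(s[j] for s in samples) / n for j in range(p)]
    joint = {
        (i, j): sum(1 for s in samples if s[i] and s[j]) / n
        for i, j in combinations(range(p), 2)
    }
    model_freq = {m: count / n for m, count in Counter(samples).most_common()}
    return marginal, joint, model_freq
```

With p candidate variables there are up to 2^p distinct models, so the `model_freq` table quickly becomes too large to read directly, which illustrates why graphical summaries are needed in the high-dimensional case.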

Completed projects

SEURAT: Visual analytics for the integrated analysis of microarray data

SEURAT is a software tool that provides interactive visualization capabilities for the integrated analysis of high-dimensional gene expression data. Gene expression data can be analyzed together with associated clinical data, array CGH (comparative genomic hybridization) data, SNP (single nucleotide polymorphism) array data, and available gene annotations in an integrated manner, with the different data types organized by a comprehensive data manager. Interactive tools are provided for all graphics: heat maps, dendrograms, bar charts, histograms, event charts and a chromosome browser, which displays genetic variations along the genome. For exploratory data analysis, the software also provides unsupervised methods such as clustering, biclustering and seriation algorithms. To run the clustering and seriation algorithms, SEURAT establishes a connection to the R statistical software via the Rserve client. In principle, this connection allows the software to use all functions implemented in R and Bioconductor. SEURAT runs under Windows, Mac OS and Linux.
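As an illustration of what seriation means for a heat map, the sketch below reorders the rows of a small data matrix so that similar rows end up next to each other. It uses a simple greedy nearest-neighbour heuristic with Euclidean distance, chosen here only for brevity; it is not the algorithm used by SEURAT, which delegates clustering and seriation to R. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_seriation(rows, start=0):
    """Return an ordering of row indices for display in a heat map.

    Starting from row `start`, each step appends the not-yet-placed row
    that is closest to the row placed last (ties broken by lower index).
    This is a heuristic and does not guarantee an optimal ordering.
    """
    if not rows:
        return []
    order = [start]
    remaining = set(range(len(rows))) - {start}
    while remaining:
        last = rows[order[-1]]
        nearest = min(remaining, key=lambda i: (euclidean(last, rows[i]), i))
        order.append(nearest)
        remaining.remove(nearest)
    return order
```

Applying the returned ordering to the rows (and, analogously, to the columns) before drawing tends to turn a scattered heat map into one with visible blocks of similar values.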

Software: http://seurat.r-forge.r-project.org/

Publication:

  • Gribov A, Sill M, Lück S, Rücker F, Döhner K, Bullinger L, Benner A, Unwin A. SEURAT: visual analytics for the integrated analysis of microarray data. BMC Med Genomics. 2010 Jun 3;3:21. doi: 10.1186/1755-8794-3-21. URL: http://www.biomedcentral.com/1755-8794/3/21

Graphical Displays for Biomarker Data

Analysis of high-dimensional biomarker data is generally exploratory in nature and aims to discover or dissect subgroups of patients who share a specific pattern of biomarker measurements. One major challenge is to extract the relevant markers from the extremely large pool of measured markers. Specific techniques, such as grouping, ordering, and dimension reduction, allow us to aggregate huge amounts of data into single meaningful graphics. These graphics can guide the direction of exploration during the analysis. We present graphical tools for unsupervised and supervised objectives, based on gene expression data from multiple myeloma patients that are part of the MAQC-II project.

Software: Available from the homepage of the book: http://www.elmo.ch/doc/life-science-graphics

Publication:

  • Zucknick M, Hielscher T, Sill M, Benner A. Graphical Displays for Biomarker Data. In: Krause A, O’Connell M (eds.), A Picture is Worth a Thousand Tables: Graphics in Life Sciences. Springer; 2012. doi: 10.1007/978-1-4614-5329-1_8
