Predictive modeling with clinical and molecular data

Establishing statistical models that predict patient prognosis and/or estimate the (individual) treatment effect from clinical and molecular information is at the core of translational cancer research. Such models are a prerequisite for identifying biologically distinct tumor subgroups and allow treatment decisions to be tailored to a patient's specific needs. Prognostic biomarkers group patients according to their overall prognosis, e.g., with respect to overall survival. In contrast, predictive biomarkers refine the individual treatment effect and thus form the basis for personalized medicine. Whenever a study fails to show a global treatment effect, one hopes to find a treatment effect in at least one subgroup of patients. Tree-based methods are appealing for this purpose, as they do not require an exhaustive search over all possible subgroups. Moreover, they can easily be applied in high-dimensional settings when used in conjunction with ensemble methods, in particular random forests. We extended a model-based recursive partitioning method for subgroup analyses to specifically identify predictive biomarkers by reparametrizing the base model. We tailored our method to the randomized clinical trial setting and recently adapted it for the analysis of observational data as well.
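The distinction between prognostic and predictive biomarkers can be illustrated with a single tree-style split. The following is a minimal sketch, not the model-based recursive partitioning method described above: it assumes a randomized trial with a normally distributed endpoint and scans cutpoints of one marker for the largest standardized difference in treatment effect between the two resulting subgroups. All function names, the simulated data, and the cutpoint 0.6 are hypothetical.

```python
import random


def mean_var(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((u - m) ** 2 for u in values) / (len(values) - 1)
    return m, v


def arm_effect(y, trt):
    """Treated-minus-control mean difference and its variance (None if an arm is too small)."""
    treated = [yi for yi, ti in zip(y, trt) if ti == 1]
    control = [yi for yi, ti in zip(y, trt) if ti == 0]
    if len(treated) < 2 or len(control) < 2:
        return None
    m1, v1 = mean_var(treated)
    m0, v0 = mean_var(control)
    return m1 - m0, v1 / len(treated) + v0 / len(control)


def interaction_z(y, trt, x, cut):
    """Standardized difference in treatment effect between x > cut and x <= cut."""
    low = [i for i, xi in enumerate(x) if xi <= cut]
    high = [i for i, xi in enumerate(x) if xi > cut]
    eff_low = arm_effect([y[i] for i in low], [trt[i] for i in low])
    eff_high = arm_effect([y[i] for i in high], [trt[i] for i in high])
    if eff_low is None or eff_high is None:
        return 0.0
    return (eff_high[0] - eff_low[0]) / (eff_low[1] + eff_high[1]) ** 0.5


def best_split(y, trt, x, min_size=30):
    """Scan observed values of x as cutpoints; return (cut, z) with the largest |z|."""
    xs = sorted(x)
    best_cut, best_z = None, 0.0
    for cut in xs[min_size:-min_size]:
        z = interaction_z(y, trt, x, cut)
        if abs(z) > abs(best_z):
            best_cut, best_z = cut, z
    return best_cut, best_z


def simulate_trial(n=600, seed=1):
    """Toy trial (assumed): x_pred modifies the treatment effect, x_prog only shifts the outcome."""
    rng = random.Random(seed)
    trt = [rng.randint(0, 1) for _ in range(n)]
    x_pred = [rng.random() for _ in range(n)]
    x_prog = [rng.random() for _ in range(n)]
    y = [x_prog[i] + 2.0 * trt[i] * (x_pred[i] > 0.6) + rng.gauss(0, 0.5)
         for i in range(n)]
    return y, trt, x_pred, x_prog


y, trt, x_pred, x_prog = simulate_trial()
print("predictive marker:", best_split(y, trt, x_pred))
print("prognostic marker:", best_split(y, trt, x_prog))
```

In this toy setting the predictive marker should yield a far larger interaction statistic than the purely prognostic one, even though both are associated with the outcome. Because the cutpoint is selected by maximization, the resulting statistic is exploratory and should not be read as a test statistic without adjustment.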
Regularization approaches are an alternative to ensemble methods for coping with high-dimensional data. For modeling survival endpoints, we have extensively investigated regularization approaches based on penalized partial likelihood maximization and developed recommendations for their use. When performing regression modeling in very high covariate dimensions, it is often necessary to reduce the number of covariates through preliminary screening. A large number of variable screening methods are now available, but there is a lack of guidance on how to select an appropriate method in practice. Specifically for survival analysis, we provided an overview of marginal variable screening methods and made recommendations for their application.
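Marginal screening evaluates each covariate on its own against the survival endpoint and keeps only the highest-ranked ones for subsequent modeling. The sketch below uses Harrell's concordance index as the marginal statistic; this is one simple choice assumed for illustration and is not necessarily among the methods recommended in the overview mentioned above. The function names and the simulated data are hypothetical.

```python
import math
import random


def concordance(time, event, score):
    """Harrell's C-index: share of usable pairs in which the subject with the
    earlier observed event has the higher score (ties in score count 0.5)."""
    concordant, usable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(len(time)):
            if time[j] > time[i]:
                usable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5
    return concordant / usable


def marginal_screen(X, time, event, keep):
    """Rank covariates by |C - 0.5| computed one at a time; return the top `keep` indices."""
    p = len(X[0])
    strength = []
    for j in range(p):
        column = [row[j] for row in X]
        strength.append(abs(concordance(time, event, column) - 0.5))
    order = sorted(range(p), key=lambda j: strength[j], reverse=True)
    return order[:keep]


def simulate_survival(n=200, p=30, seed=1):
    """Toy data (assumed): only covariates 0 and 1 affect the hazard,
    with log hazard ratios +1.5 and -1.5; censoring is independent."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    time, event = [], []
    for row in X:
        rate = math.exp(1.5 * row[0] - 1.5 * row[1])
        t_event = rng.expovariate(rate)
        t_cens = rng.expovariate(0.3)
        time.append(min(t_event, t_cens))
        event.append(t_event <= t_cens)
    return X, time, event


X, time, event = simulate_survival()
print("selected covariates:", sorted(marginal_screen(X, time, event, keep=2)))
```

Using the distance of C from 0.5 rather than C itself lets the screen pick up covariates that lower the hazard as well as those that raise it. A marginal screen of this kind can miss covariates that matter only jointly with others, which is one reason the choice of screening method needs care.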
Not only the covariate space but also the survival endpoint itself can be complex. Common prediction models predominantly use composite endpoints such as event-free survival (EFS) or relapse-free survival (RFS). However, time-to-first-event endpoints do not incorporate important aspects of the individual course of the disease. For modeling competing risks data in higher dimensions, we provided a penalized cause-specific hazards approach. The idea is to link the independently penalized cause-specific hazards models by choosing the combination of tuning parameters that yields the best prediction with respect to the incidence of the event of interest at a fixed time point. A multi-state model is required to capture pathogenic disease processes and underlying etiologies more accurately. To decompose EFS and/or RFS accordingly, taking into account high-dimensional molecular factors, we are currently extending model selection and model reduction methods based on stratified reparametrization to combine homogeneous effects for different transitions and data-driven covariate selection using regularization methods.
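The linking idea behind the penalized cause-specific hazards approach can be written out as follows. This is a sketch under stated assumptions rather than the exact published formulation: it assumes two competing event types (k = 1 being the event of interest), a lasso-type penalty, and an unspecified prediction error measure Err (for example a cross-validated Brier score), none of which are specified in the text above.

```latex
% Cause-specific proportional hazards model for event type k
h_k(t \mid x) = h_{0k}(t)\,\exp\!\left(x^\top \beta_k\right), \qquad k = 1, 2

% Each model is penalized separately with its own tuning parameter \lambda_k;
% \ell_k is the log partial likelihood treating the other event type as censoring
\hat\beta_k(\lambda_k) = \arg\max_{\beta_k}
  \left\{ \ell_k(\beta_k) - \lambda_k \lVert \beta_k \rVert_1 \right\}

% The cumulative incidence of the event of interest depends on both models,
% with H_k(u \mid x) = \int_0^u h_k(s \mid x)\,ds the cumulative cause-specific hazard
F_1(t \mid x) = \int_0^t \exp\!\left(-\sum_{k=1}^{2} H_k(u \mid x)\right) h_1(u \mid x)\,du

% The tuning parameters are chosen jointly by the prediction error for F_1 at a fixed time t^*
(\hat\lambda_1, \hat\lambda_2) = \arg\min_{(\lambda_1, \lambda_2)}
  \widehat{\mathrm{Err}}\!\left(t^*;\, \hat\beta_1(\lambda_1), \hat\beta_2(\lambda_2)\right)
```

Because the cumulative incidence of the event of interest involves the hazards of both event types, tuning each model separately for its own partial likelihood does not necessarily give the best prediction of that incidence, which motivates searching over combinations of tuning parameters.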
We applied predictive modeling approaches in a multitude of situations. For instance, to identify patients with chronic lymphocytic leukemia who particularly benefit from chemoimmunotherapy with Fludarabine, Cyclophosphamide, and Rituximab (FCR), we used integrative penalized Cox regression models combining established prognostic factors and gene expression profiles from a phase III clinical trial comparing first-line treatment with FC or FCR. In accompanying research to clinical trials on acute myeloid leukemia (AML), we characterized the mutation landscape of AML patients. Targeted sequencing data were evaluated by various statistical approaches to reconstruct the temporal order of mutational evolution. A hierarchical Dirichlet process extracted possible biological subtypes of AML, and random survival forests were fitted to evaluate the impact of clinical and genetic features. As the data sets we use to build prediction models often involve molecular data, appropriate statistical analysis methods for molecular data from various sources are needed. We addressed the evaluation of statistical methods for accurate detection of methylated and hydroxymethylated CpGs using methylation arrays and investigated the use of Ago-RIP-Seq experiments for the identification of microRNA targets. In cooperation with the Section of Allogeneic Stem Cell Transplantation at Heidelberg University Hospital, we investigate the usefulness of EASIX as a prognostic and predictive biomarker for several diseases and endpoints. For instance, we illustrate the prognostic and predictive value of EASIX for time-to-sepsis, the effectiveness of statin-based prophylaxis for non-relapse mortality in different EASIX subgroups, and the prognostic value of EASIX for severe complications after CAR-T cell therapy.


  • Benner, A., Zucknick, M., Hielscher, T., Ittrich, C. and Mansmann, U. High-dimensional Cox models: The choice of penalty as part of the model building process. Biom. J., 52: 50-69 (2010)
  • Bloehdorn J. et al. Integrative prognostic models predict long-term survival after immunochemotherapy in chronic lymphocytic leukemia patients. Haematologica 107(3): 615 (2022)
  • Edelmann D. et al. Marginal variable screening for survival endpoints. Biom. J. 62: 610-626 (2020)
  • Krzykalla J. et al. Exploratory identification of predictive biomarkers in randomized trials with normal endpoints. Stat. Med. 39: 923-939 (2020)
  • Krzykalla J. et al. Tree-based exploratory identification of predictive biomarkers in observational data. arXiv preprint arXiv:2212.08460 (2022)
  • Luft T. et al. EASIX in patients with acute graft-versus-host disease: a retrospective cohort analysis. Lancet Haematol. 4(9): e414-e423 (2017). doi: 10.1016/S2352-3026(17)30108-4
  • Saadati M., Benner A. Statistical challenges of high-dimensional methylation data. Stat. Med. 33(30): 5347-5357 (2014)
  • Saadati M. et al. Prediction accuracy and variable selection for penalized cause-specific hazards models. Biom. J. 60: 288-306 (2018)
  • Slynko A., Benner A. Statistical methods for classification of 5hmC levels based on the Illumina Infinium HumanMethylation450 (450k) array data, under the paired bisulfite (BS) and oxidative bisulfite (oxBS) treatment. PLoS One. 14(6) (2019)
  • Tichy D. et al. Experimental design and data analysis of Ago-RIP-Seq experiments for the identification of microRNA targets. Brief. Bioinform. 19: 918-929 (2018)
