Document: Combining variables in clinical data using statistical ensemble methods

Title: Combining variables in clinical data using statistical ensemble methods
Alternative title: Kombination klinischer Variablen unter Verwendung von statistischen Ensemble-Methoden
Bookmark URL: https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=59330
URN (NBN): urn:nbn:de:hbz:061-20220427-081419-1
Collection: Dissertations
Language: English
Document type: Academic theses » Dissertation
Media type: Text
Author: Tietz, Tobias [Author]
Files:
Adobe PDF, 5.18 MB in one file
Files dated 18.04.2022 / modified 18.04.2022
Contributors: Prof. Dr. Schwender, Holger [Doctoral supervisor]
Prof. Dr. Schwender, Holger [Reviewer]
Prof. Dr. Ickstadt, Katja [Reviewer]
Keywords: structural MRI, spatial hierarchical clustering, ensemble clustering, voxel-based morphometry, brain parcellation, clustering stability, logic regression, variable selection, importance measure, logicFS, time-to-event data, ensemble prediction
Dewey Decimal Classification: 500 Natural sciences and mathematics » 510 Mathematics
Description: In many clinical studies, it is not the original variables but combinations of these variables that explain the outcome of interest. Finding those combined features using statistical ensemble methods not only improves prediction but also helps to gain a better understanding of the underlying data-generating processes.
Two different types of clinical data are considered in two different parts of this thesis, i.e., genotype data relating binarized genetic variations to a time-to-event in Part I and neuroimaging data consisting of structural brain scans in Part II.

In Part I, the combined features are complex interactions of binarized genetic variations, as they are often the actual explanatory features for predicting, e.g., the time to recurrence of a disease. survivalFS is an existing ensemble method that searches for such interactions and ranks them according to an importance measure based on the predictive partial log-likelihood. To improve the ranking of the identified interactions, further importance measures are proposed, which are based on two other popular goodness-of-fit measures as well as on a newly introduced adaptation of Harrell's concordance index, referred to as the DPO-based C-index. Moreover, noise-adjusted importance measures are introduced that correct for noise variables falsely reducing the estimated importance of explanatory interactions.
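As a point of reference for the concordance-based importance measure mentioned above, the following is a minimal sketch of the standard Harrell's concordance index for right-censored data. It is not the DPO-based C-index proposed in the thesis, whose definition is not given in this record. The function name `harrell_c_index`, its arguments, and the choice to skip pairs with tied observation times are assumptions made for illustration only.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Standard Harrell's C for right-censored data (illustrative sketch).

    times:       observed times (event or censoring)
    events:      1/True if the event was observed, 0/False if censored
    risk_scores: higher score = higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): the third subject is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs. An importance measure built on such an index would typically compare its value with and without a given interaction in the model.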
Part II builds upon the crucial and widely accepted concept that the human brain is organized into spatially contiguous, specialized brain regions, which are inter-connected by large-scale networks.
Such spatially contiguous brain regions, i.e., the combined features, are identified using existing spatial hierarchical agglomerative clustering methods as well as the newly proposed SPARTACUS (SPAtial hieRarchical agglomeraTive vAriable ClUStering) method for clustering variables. Subsampling-based clustering stability and clustering quality approaches are employed to identify interesting numbers of brain regions, and higher-quality brain regions are searched for using ensemble clustering methods.
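The general idea of spatially constrained hierarchical agglomerative clustering can be sketched as follows. This is a generic toy illustration on a 2D grid, not the SPARTACUS method, whose merging criterion is not described in this record. The function name `spatial_agglomerative`, the 4-neighbourhood, and the squared Euclidean distance between cluster mean profiles are all assumptions chosen for brevity.

```python
def spatial_agglomerative(values, shape, n_clusters):
    """Merge only spatially adjacent clusters until n_clusters remain.

    values: one feature vector per grid cell (row-major order),
            e.g. the measurements of one voxel across subjects
    shape:  (rows, cols) of the grid
    """
    rows, cols = shape
    n = rows * cols
    members = {i: [i] for i in range(n)}       # cluster id -> grid cells
    neighbors = {i: set() for i in range(n)}   # cluster id -> adjacent ids
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                   # right neighbour
                neighbors[i].add(i + 1)
                neighbors[i + 1].add(i)
            if r + 1 < rows:                   # lower neighbour
                neighbors[i].add(i + cols)
                neighbors[i + cols].add(i)

    def centroid(cid):
        vecs = [values[v] for v in members[cid]]
        return [sum(x) / len(vecs) for x in zip(*vecs)]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(centroid(a), centroid(b)))

    while len(members) > n_clusters:
        # Candidate merges are restricted to adjacent clusters only.
        candidates = [(dist(a, b), a, b)
                      for a in members for b in neighbors[a] if a < b]
        if not candidates:
            break
        _, a, b = min(candidates)
        members[a].extend(members.pop(b))
        for nb in neighbors.pop(b):            # rewire b's neighbours to a
            neighbors[nb].discard(b)
            if nb != a:
                neighbors[nb].add(a)
                neighbors[a].add(nb)

    labels = [0] * n
    for label, cid in enumerate(sorted(members)):
        for v in members[cid]:
            labels[v] = label
    return labels


# Toy 2 x 4 grid (made-up data): left half near 0, right half near 10.
toy_values = [[0.0, 0.1], [0.1, 0.0], [10.0, 10.1], [10.1, 9.9],
              [0.2, 0.1], [0.0, 0.2], [9.9, 10.0], [10.0, 10.2]]
toy_labels = spatial_agglomerative(toy_values, (2, 4), 2)
```

Because merges are limited to neighbouring clusters, every resulting cluster is spatially contiguous by construction. Recomputing all candidate distances in each step keeps the sketch short but would be far too slow for voxel-level brain data.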

The performance of the ensemble methods to find combined features is evaluated and compared with popular competing methods, i.e., an importance measure for bivariate variable interactions from random survival forests and spatial spectral clustering, in application to simulated and real data. These applications show that the ensemble methods are able to stably identify combined features and to outperform the competing methods.
License: In Copyright
Copyright protection
Department / Institution: Faculty of Mathematics and Natural Sciences » WE Mathematik » Mathematical Optimization
Document created on: 27.04.2022
Files modified on: 27.04.2022
Doctoral application submitted on: 09.11.2021
Date of doctorate: 21.03.2022