Document: Drug response prediction with multi-output machine learning methods: pitfalls and new directions

Title: Drug response prediction with multi-output machine learning methods: pitfalls and new directions
Alternative title: Drug response prediction with multi-output machine learning methods: pitfalls and new directions
Bookmark URL: https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=72297
URN (NBN): urn:nbn:de:hbz:061-20260218-110016-0
Collection: Dissertations
Language: English
Document type: Academic theses » Dissertation
Media type: Text
Author: Tran, Nguyen Khoa [Author]
Files: Adobe PDF, 8.56 MB in one file
Files dated 14.02.2026 / modified 14.02.2026
Contributors: Klau, Gunnar W. [Doctoral supervisor]
Prof. Dr. Ebenhöh, Oliver [Reviewer]
Dewey Decimal Classification: 500 Natural sciences and mathematics » 570 Life sciences; biology
Description: Predicting the effect of anti-cancer drugs on cancer cells is essential to understanding biological and chemical mechanisms relevant to cancer treatment. Current drug response prediction methods are typically machine learning models trained on two inputs, multiomics features of cancer cell lines and chemical features of drugs, to predict outputs, commonly represented by drug response metrics such as IC50 or AUC. In this dissertation, limitations of existing methods as well as currently used data and metrics are discussed, and shifts in research focus are suggested.

First, it is explored whether multi-output support vector regression can capture correlations across different outputs. The findings indicate that support vector regression is unsuitable for this task, in contrast to artificial neural networks.

Next, the performance of neural networks on the large-scale cancer research dataset from the DepMap project is investigated. In particular, TGSA, currently the leading deep learning method, is examined. While TGSA sometimes learns correlations between outputs, these correlations lack biological and chemical relevance. Furthermore, TGSA does not always exceed baseline performance, and when it does, it still fails to surpass a simple multilayer perceptron. These deficiencies are attributed to inconsistencies in both input and output data as well as unsuitable modeling of pharmacodynamics, and alternative modeling approaches are proposed.

As an initial step toward addressing the deficiencies regarding output data, an alternative to traditional drug response metrics such as IC50 and AUC, which are derived from dose-response curves, is needed. Dose-response curves are modeled as four-parameter logistic (4PL) curves, which rely on relatively few measurements and are prone to instability. Live-cell imaging captures images of cell cultures at user-defined time intervals (e.g., every 15 minutes), adding a time dimension that increases the number of measurements and improves model stability. By combining the 4PL curve with a logistic function, a new model, VUScope, is introduced to fit dose-time-response surfaces. Along with VUScope, a new drug response metric, GRIVUS, is proposed to replace IC50 and AUC in the long run, as GRIVUS allows for more equitable comparisons across different cell lines and drugs. VUScope also enables the prediction of long-term drug responses based on short-term data and still yields reliable results when applied to datasets with few time points (e.g., measurements taken every 24 hours). This makes VUScope compatible with traditional high-throughput screening (HTS) data, so labs without live-cell imaging systems can use it. Moderate to high correlations are observed between drug response results obtained from live-cell imaging and traditional HTS data.
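For orientation, the traditional metrics mentioned above can be sketched from a four-parameter logistic curve as follows. This is a minimal illustration only: the parameter names (`bottom`, `top`, `ec50`, `hill`), the chosen parameterization, and the normalization of the AUC over the log-dose range are assumptions made for this sketch, and it does not reproduce VUScope or GRIVUS, whose definitions are not given in this record.

```python
import math


def four_pl(dose, bottom, top, ec50, hill):
    """Four-parameter logistic response at a positive dose.

    Assumed parameterization: the response tends to `top` at low doses and
    to `bottom` at high doses (for hill > 0), and equals their midpoint at
    dose == ec50.
    """
    return bottom + (top - bottom) / (1.0 + (dose / ec50) ** hill)


def ic50(bottom, top, ec50, hill, level=0.5):
    """Dose at which the curve crosses `level` (0.5 = half of control).

    Returns NaN when the curve never reaches `level`, which is one reason
    this metric can be unstable or undefined for weakly responding pairs.
    """
    if not (min(bottom, top) < level < max(bottom, top)):
        return math.nan
    return ec50 * ((top - bottom) / (level - bottom) - 1.0) ** (1.0 / hill)


def auc(doses, responses):
    """Trapezoidal area under the response over log10(dose),
    divided by the log-dose range so the value is a mean response."""
    xs = [math.log10(d) for d in doses]
    area = sum(
        (xs[i + 1] - xs[i]) * (responses[i] + responses[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )
    return area / (xs[-1] - xs[0])


if __name__ == "__main__":
    doses = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical dose grid
    params = dict(bottom=0.1, top=1.0, ec50=1.0, hill=2.0)  # hypothetical fit
    responses = [four_pl(d, **params) for d in doses]
    print("IC50:", ic50(**params))
    print("AUC:", auc(doses, responses))
```

Both metrics depend on the tested dose range and on a handful of fitted parameters, which is consistent with the instability the abstract attributes to curves fitted from few measurements.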

In conclusion, pitfalls in current drug response prediction research are identified and new directions are outlined, providing valuable insights for both wet-lab and dry-lab experiments aimed at advancing cancer treatment.

License: This work is licensed under a Creative Commons Attribution 4.0 International License
Department / Institution: Faculty of Mathematics and Natural Sciences » Computer Science » Algorithmic Bioinformatics
Document created on: 18.02.2026
Files modified on: 18.02.2026
Doctoral application submitted on: 10.10.2025
Date of doctorate: 02.02.2026