Document: Detection of functional modules in genomic and metagenomic datasets
Title: | Detection of functional modules in genomic and metagenomic datasets
Alternative title: | Detektion funktioneller Module in genomischen und metagenomischen Datensätzen
Bookmark URL: | https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=38592
URN (NBN): | urn:nbn:de:hbz:061-20160620-144052-7
Collection: | Dissertations
Language: | English
Document type: | Academic theses » Dissertation
Media type: | Text
Author: | Dr. Konietzny, Sebastian Gil Anthony [Author]
Contributors: | Prof. Dr. McHardy, Alice [Reviewer] Prof. Dr. Lercher, Martin [Reviewer]
Keywords: | latent Dirichlet allocation, probabilistic topic models, metabolic pathways, metagenomes
Dewey Decimal Classification: | 000 Computer science, information, general works » 004 Data processing; computer science
Description: | Cellular processes typically correspond to one or more functional modules, which are groups of functionally interacting proteins. Common examples of functional modules are metabolic pathways, protein complexes, and signal transduction chains. Studying the composition of functional modules is an important challenge because it paves the way to exploiting microbial proteins to improve biotechnological methods. The problem is to identify interacting proteins given only their gene sequences, and to understand how individual protein functions affect one another.
Proteins are encoded in the genes of organisms and are the products of gene expression. With modern DNA sequencing techniques, accessing the gene repertoires (‘genomes’) of organisms has become a highly automated and relatively cheap process. As a consequence, thousands of sequenced genomes have become available in public databases, and their number is increasing rapidly. Moreover, modern techniques have enabled metagenome studies of microbial communities, i.e. the sequencing of environmental DNA samples without the need to cultivate organisms in the laboratory. A so-called metagenome thus represents the mixed genetic material of a microbial community of species.
A common approach for detecting interacting proteins is referred to as phylogenetic profiling. Its basic assumption is that functionally coupled genes tend to co-evolve, which suggests that protein-protein interactions (PPIs) are detectable from gene co-occurrence patterns across sets of genomes. This principle enables the computational identification of pairwise protein interactions and of groups of interacting proteins.
The key challenge of this PhD project was to develop new machine-learning-based methods for the computational detection of functional modules based on the principles of phylogenetic profiling. Notably, only a few previous studies had analyzed the applicability of standard phylogenetic profiling methods to large collections of genomes, and the analysis of metagenomic datasets had remained largely unexplored. The author’s main scientific contributions are the development and evaluation of two new methods for functional module inference from genomic and metagenomic input datasets (Konietzny et al., 2011, 2014). These methods are based on probabilistic topic models, which originally stem from the field of text mining, and the idea of applying such models to gene sets of (meta)genomes is new. Topic models are Bayesian graphical models which are known to be robust against noise in the input data. This property is important for the analysis of gene presence/absence patterns because currently available methods for DNA sequencing and gene prediction can produce erroneous outputs.
Moreover, the newly developed methods discussed in this thesis enable the identification of genomic elements, that is, proteins and entire functional modules, that are linked to specific capabilities of cells (‘phenotypic traits’ of organisms, or ‘phenotype’ for short). Therefore, they represent valuable instruments for identifying biocatalysts from microbes, which might enable innovations in biotechnology and health care.
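The co-occurrence principle behind phylogenetic profiling can be illustrated with a minimal sketch. This is not the topic-model method developed in the thesis; it only shows the pairwise idea the abstract describes, using Jaccard similarity as one possible (assumed) measure of profile similarity. The gene names, genome labels, threshold, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: gene family -> set of genomes in which it was detected.
profiles = {
    "geneA": {"g1", "g2", "g3", "g5"},
    "geneB": {"g1", "g2", "g3", "g5"},
    "geneC": {"g2", "g4"},
    "geneD": {"g1", "g2", "g3"},
}


def jaccard(a, b):
    """Jaccard similarity of two genome sets (1.0 means identical profiles)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.7):
    """Return (gene, gene, score) for pairs with similar presence/absence profiles.

    The threshold is an illustrative assumption, not a recommended value.
    """
    pairs = []
    for x, y in combinations(sorted(profiles), 2):
        score = jaccard(profiles[x], profiles[y])
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs


pairs = cooccurring_pairs(profiles)
for x, y, score in pairs:
    print(f"{x} - {y}: {score:.2f}")
```

Pairs that pass the threshold are candidate functional links; a module-level method would go further and group many genes at once rather than scoring them two at a time.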
License: | Protected by copyright
Faculty / Institution: | Faculty of Mathematics and Natural Sciences » Computer Science (WE Informatik) » Bioinformatics
Document created on: | 20.06.2016
Files modified on: | 20.06.2016
Doctoral application submitted on: | 26.11.2015
Date of doctorate: | 09.05.2016