Document: Inference and analysis of recurring genomic variation in human populations

Title: Inference and analysis of recurring genomic variation in human populations
Bookmark URL: https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=71747
URN (NBN): urn:nbn:de:hbz:061-20251218-105019-2
Collection: Dissertations
Language: English
Document type: Academic theses » Dissertation
Media type: Text
Author: Ashraf, Hufsah [Author]
Files: Adobe PDF, 49.81 MB in one file
Files dated 17.12.2025 / modified 17.12.2025
Contributors: Prof. Dr. Marschall, Tobias [Reviewer]
Prof. Dr. Dilthey, Alexander [Reviewer]
Prof. Dr. Ossowski, Stephan [Reviewer]
Dewey Decimal Classification: 500 Natural sciences and mathematics » 570 Life sciences; biology
Description: Genomic variation is fundamental to understanding human biology, from population-level diversity to disease susceptibility. Recent advances in both sequencing technologies and computational methods have markedly improved our ability to detect and interpret variation across human genomes, yet challenges persist. This dissertation presents a series of computational approaches aimed at improving the identification, characterization, and analysis of genomic variants, with a particular focus on inversions and their recurrence across human populations.

The first part of this thesis introduces k-merald, a method developed to improve the accuracy of allele detection from error-prone sequencing data. By leveraging platform-specific error profiles, k-merald enhances alignment reliability and substantially reduces genotyping error rates, especially under low-coverage conditions.

The second part presents ArbiGent, a tool for genotyping inversions and copy number variants using Strand-seq data. ArbiGent corrects for alignment artifacts by normalizing Strand-seq read counts based on locus-specific mappability, improving genotyping reliability in repetitive regions. Its role in generating a high-confidence inversion callset, as part of a project under the Human Genome Structural Variation Consortium (HGSVC), is also presented in this part. This callset revealed that inversions affect a larger portion of the genome than other variant types and are enriched in highly repetitive and disease-associated regions of the genome.
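
As a rough illustration of the mappability normalization idea described above, and not ArbiGent's actual implementation, the sketch below scales strand-specific read counts by the mappable fraction of a locus. The function name `normalize_counts`, the `min_mappable` cutoff, and the simple division model are assumptions made purely for illustration.

```python
def normalize_counts(watson, crick, mappable_fraction, min_mappable=0.1):
    """Scale strand-specific read counts by a locus's mappable fraction.

    Hypothetical helper (not part of ArbiGent): returns None when the locus
    is too poorly mappable to support a call, otherwise the counts that
    would be expected if the whole locus were mappable.
    """
    if not 0.0 <= mappable_fraction <= 1.0:
        raise ValueError("mappable_fraction must be in [0, 1]")
    if mappable_fraction < min_mappable:
        return None  # assumed policy: refuse to normalize near-unmappable loci
    return watson / mappable_fraction, crick / mappable_fraction
```

Under this assumed model, a locus where only half the positions are uniquely mappable has its observed counts doubled before genotyping, so that repetitive regions are not systematically mistaken for copy-number losses.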

The third part introduces the new concept of toggling-indicating SNPs (tiSNPs) and describes an inversion recurrence detection approach that analyzes allelic patterns of within-inversion SNPs across haplotypes to distinguish between single and recurrent inversion events. This approach, supported by orthogonal validation, revealed widespread inversion recurrence, including events that overlap known disease-associated loci.
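
One plausible reading of the allelic-pattern test described above can be sketched as follows. It assumes, as an illustration only, that a within-inversion SNP is flagged when both of its alleles are observed on both direct and inverted haplotypes; the thesis's actual criteria, filters, and validation steps are not given in this abstract. The function name, the `'D'`/`'I'` orientation labels, and the 0/1 allele coding are hypothetical.

```python
def toggling_candidates(orientations, haplotype_alleles):
    """Return indices of SNPs whose two alleles both occur on both orientations.

    orientations: list of 'D' (direct) or 'I' (inverted), one per haplotype.
    haplotype_alleles: list of per-haplotype allele lists (0 = ref, 1 = alt).
    Assumed criterion for illustration only, not the published method.
    """
    required = {(o, a) for o in ("D", "I") for a in (0, 1)}
    n_snps = len(haplotype_alleles[0]) if haplotype_alleles else 0
    flagged = []
    for j in range(n_snps):
        # Collect every (orientation, allele) combination seen at SNP j.
        seen = {(o, hap[j]) for o, hap in zip(orientations, haplotype_alleles)}
        if required <= seen:
            flagged.append(j)
    return flagged
```

For example, with four haplotypes oriented `["D", "D", "I", "I"]` and alleles `[[0, 0], [1, 0], [0, 1], [1, 1]]`, only the first SNP shows all four orientation-allele combinations; the second is perfectly correlated with orientation, which is the pattern expected from a single inversion event.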

The fourth part introduces Pivot, a tool for detecting recurrent inversions within a graph-based pangenomic framework, eliminating reference bias. By incorporating all within-inversion variants into the analysis, Pivot displays high sensitivity for recurrence detection. Applied to diverse haplotype panels from the HGSVC and the Human Pangenome Reference Consortium (HPRC), Pivot detected novel recurrent inversions and revealed recurrence evidence in multiple disease-relevant regions.

The final part of this dissertation extends the investigation of recurrence beyond inversions and presents an approach to detect recurrent deletions. In a cohort of 1,019 samples from the 1000 Genomes Project (1KGP), this analysis identified several candidate recurrent deletions flanked by transposable elements, implicating non-allelic homologous recombination as a potential mechanism.
License: This work is licensed under a Creative Commons Attribution 4.0 International License
Department / Institution: Faculty of Mathematics and Natural Sciences
Document created on: 18.12.2025
Files modified on: 18.12.2025
Doctoral application submitted on: 24.04.2025
Date of doctorate: 12.12.2025