Visual Analytics Approaches to Genomic Data Representation: From Molecular Storage to Multidimensional Projection
Publisher
Philipps-Universität Marburg
Abstract
With the increase in data generation, particularly in DNA-based storage and phylogenomics, advanced visualization techniques and effective data-user congruence are essential. This thesis addresses the multifaceted challenges at this intersection, focusing on innovative approaches to manage, visualize, and analyze genomic data. At the heart of our exploration is the quest to bridge the gap between raw data and its meaningful representation. The initial analysis critically examines the potential of DNA as a green data storage medium, highlighting its superior density, energy efficiency, and retention time compared to conventional storage paradigms. However, the path from theory to practice is fraught with challenges. The physical storage of synthetic DNA, its vulnerability to degradation under adverse conditions, and the balance between data redundancy for error correction and logical data density are discussed in detail. Furthermore, the exploration delves into the nuances of the DNA data storage channel, focusing on the design considerations for DNA-based storage, especially considering its long-term archiving and environmental impact. Building on these insights, DNAsmart, a novel tool developed for the visual exploration of genomics data within the context of DNA-based storage, is introduced. Domain experts conventionally use intricate, non-visual methods to evaluate and compare encoding sequences. However, DNAsmart integrates these evaluation methods with interactive visual techniques, creating an accessible and user-friendly framework for visual analytics. This tool not only upholds the integrity of data stored within DNA but also provides a straightforward interpretation of the storage process, enabling users to identify inefficiencies, detect patterns, and fully utilize the potential of DNA-based storage.
The thesis then moves from storing data at the molecular level to the taxonomic classification of organisms, an area undergoing significant disruption due to advances in genome sequencing. Here, the spotlight is on the Context-Aware Phylogenetic Trees (CAPT) tool, highlighting its transformative potential to reshape traditional taxonomic practice. By using context-aware visual solutions, CAPT addresses the challenges inherent in reconciling traditional taxonomic methods with genomic data. The potential of CAPT to revolutionize taxonomy, both in terms of accuracy and scalability, is discussed, and its implications for broader biological research are highlighted. Expanding beyond the realm of genomics, this thesis also deals with multidimensional projections and presents an extensive and innovative design space of visual techniques designed for multidimensional data projections. The systematic exploration and implementation of the design space further enable the development of effective visual techniques suitable for genomic data. However, while rooted in genomic data, the methods and insights presented have broader application in the visualization domain, underscoring the universality of the challenges and solutions discussed. In conclusion, this thesis provides a systematic exploration of the challenges and opportunities at the intersection of genomics and visual analytics. By systematically addressing the separation between data and its visualization, this thesis not only provides a roadmap for current challenges but also lays the groundwork for future innovations in the domain. Additionally, the transformative potential of visual analytics in genomics is highlighted, expanding the possibilities for further interdisciplinary research.
Metadata
Dates
Created: 2023
Issued: 2025-03-13
Updated: 2025-03-13
Faculty
Fachbereich Mathematik und Informatik
Language
eng
Data types
DoctoralThesis
Keywords
Bioinformatics, Taxonomy, Machine Learning, Multidimensional Projection, DNA Data Storage, Visualization, Genomics, Visual Analytics
DDC-Numbers
500
Ezekannagha, Chisom (Dr.) (0000-0003-3433-3340): Visual Analytics Approaches to Genomic Data Representation: From Molecular Storage to Multidimensional Projection. Philipps-Universität Marburg, 2025-03-13. DOI: https://doi.org/10.17192/z2024.0224.