Topological Machine Learning for High-Dimensional Data Analysis

Description
This dissertation explores machine learning (ML) and topological data analysis (TDA), with applications to diagnostics and prognostics in engineering and clinical settings. The interface of TDA and ML is called topological machine learning (TML). The key benefit of the proposed TML is automated, consistent, and robust handling of high-dimensional data, particularly the complexities inherent in spatial-temporal datasets. TML's ability to capture and quantify high-dimensional geometric and topological features (such as homology) facilitates a deeper understanding of the underlying structure of the data. The associated dimension reduction capabilities enhance the accuracy and interpretability of diagnostics and prognostics. TML is first demonstrated in an unsupervised learning setting, where label information is not required. Spatial-temporal data from resting-state functional magnetic resonance imaging (rs-fMRI) are collected and analyzed for Parkinson's disease. Fractal analysis is used to extract topological characteristics of the signal, and the extracted features are used in a manifold embedding and projection model for visualization in a low-dimensional space. The low-dimensional data are then passed to a neural network-based classifier for disease diagnosis. A similar methodology is extended to structural health monitoring problems in engineering. TML is then developed for a supervised learning setting, where the main application is regression and prediction. Euler characteristics computed over a filtration serve as the topological features, and the extracted features are used in Gaussian process (GP) modeling for regression analysis. The methodology is first demonstrated on a toy random field problem in which a time-dependent field is characterized by varying topological features.
The developed method is then demonstrated on crack growth problems using numerical and experimental data. Finally, the topological data analysis is [...] By advancing theoretical knowledge while demonstrating tangible applications, this work charts a course for future progress in the field and enriches the understanding of machine learning, structural health monitoring, and predictive modeling. The exploration initiated in this dissertation is a starting point, with each chapter opening directions for further research.
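
To make the feature extraction step mentioned in the description more concrete, the following is a minimal sketch of an Euler characteristic curve. The description does not state which complex or filtration the dissertation uses, so this sketch assumes a 2D scalar field on a regular grid, a cubical complex in which each selected cell is a closed unit square, and a sublevel-set filtration. The function names `euler_characteristic` and `euler_characteristic_curve` are hypothetical and chosen for illustration only.

```python
def euler_characteristic(mask):
    """Euler characteristic chi = V - E + F of a binary 2D grid.

    Assumption: every truthy cell is a closed unit square, so cells that
    touch only at a corner still belong to the same connected component.
    """
    vertices, edges, faces = set(), set(), 0
    for i, row in enumerate(mask):
        for j, on in enumerate(row):
            if not on:
                continue
            faces += 1
            # The four corners of cell (i, j).
            vertices.update({(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)})
            # The four sides, keyed by orientation and top-left endpoint,
            # so a side shared by two neighbouring cells is counted once.
            edges.update({("h", i, j), ("h", i + 1, j),
                          ("v", i, j), ("v", i, j + 1)})
    return len(vertices) - len(edges) + faces


def euler_characteristic_curve(field, thresholds):
    """Euler characteristic of each sublevel set {field <= t}."""
    return [
        euler_characteristic([[value <= t for value in row] for row in field])
        for t in thresholds
    ]


# Toy field: four low corner cells separated by a higher cross.
field = [
    [1, 3, 1],
    [3, 3, 3],
    [1, 3, 1],
]
curve = euler_characteristic_curve(field, thresholds=[0, 1, 3])
print(curve)
```

For the toy field, the sublevel set is empty at threshold 0, consists of four isolated cells at threshold 1, and fills the whole grid at threshold 3. A vector of this kind, evaluated on a fixed set of thresholds, is one plausible form for the topological features that are passed to a regression model such as a Gaussian process; the regression step itself is not shown here.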
Date Created
2024
Agent