Optimal Sampling Designs for Functional Data Analysis

Description
Functional regression models are widely used in practice. To precisely understand an underlying functional mechanism, a good sampling schedule for collecting informative functional data is necessary, especially when data collection is limited. However, little research has so far been conducted on optimal sampling schedule design for the functional regression model. To address this design issue, efficient approaches are proposed for generating the best sampling plan in the functional regression setting. First, three optimal experimental designs are considered under a function-on-function linear model: the schedule that maximizes the relative efficiency for recovering the predictor function, the schedule that maximizes the relative efficiency for predicting the response function, and the schedule that maximizes a mixture of the relative efficiencies for both the predictor and response functions. The obtained sampling plan allows a precise recovery of the predictor function and a precise prediction of the response function. The proposed approach can also be reduced to identify the optimal sampling plan for a scalar-on-function linear regression model. In addition, the optimality criterion for predicting a scalar response using a functional predictor is derived for the case in which a quadratic relationship between these two variables is present, and proofs of important properties of the derived optimality criterion are also provided. To find such designs, a comparatively fast algorithm that generates nearly optimal designs is proposed. As the optimality criterion includes quantities that must be estimated from prior knowledge (e.g., a pilot study), the effectiveness of the suggested optimal design depends heavily on the quality of the estimates. However, in many situations the estimates are unreliable; thus, a bootstrap aggregating (bagging) approach is employed to enhance the quality of the estimates and to find sampling schedules that are stable under misspecification of the estimates. Through case studies, it is demonstrated that the proposed designs outperform other designs in terms of accurately predicting the response and recovering the predictor. It is also suggested that the bagging-enhanced approach generates a more robust sampling design under misspecification of the estimated quantities.
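The bagging idea in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the design criterion (variance of the whole curve explained by the sampled points), the greedy search, the vote-based aggregation, and the names `greedy_schedule` and `bagged_schedule` are all assumptions made for illustration. Pilot curves are resampled with replacement, a schedule is chosen from each bootstrap covariance estimate, and the time points selected most often form the final schedule.

```python
import numpy as np

def greedy_schedule(cov, k):
    """Greedily pick k grid indices under a stand-in criterion:
    the total curve variance explained by the sampled points."""
    chosen = []
    for _ in range(k):
        scores = {}
        for j in range(cov.shape[0]):
            if j in chosen:
                continue
            idx = chosen + [j]
            # Small ridge keeps the sub-covariance invertible (assumption).
            S = cov[np.ix_(idx, idx)] + 1e-8 * np.eye(len(idx))
            C = cov[:, idx]
            scores[j] = np.trace(C @ np.linalg.solve(S, C.T))
        chosen.append(max(scores, key=scores.get))
    return sorted(chosen)

def bagged_schedule(pilot, k, n_boot=50, seed=0):
    """Aggregate schedules over bootstrap resamples of the pilot curves.

    pilot: array of shape (n_curves, n_grid_points), hypothetical pilot data.
    """
    rng = np.random.default_rng(seed)
    n, p = pilot.shape
    votes = np.zeros(p, dtype=int)
    for _ in range(n_boot):
        boot = pilot[rng.integers(0, n, size=n)]   # resample curves with replacement
        cov = np.cov(boot, rowvar=False)           # bootstrap covariance estimate
        votes[greedy_schedule(cov, k)] += 1        # one vote per selected grid point
    # Keep the k grid points chosen most often across resamples.
    return sorted(np.argsort(-votes, kind="stable")[:k].tolist())
```

Voting on schedules is only one way to aggregate; averaging the bootstrap estimates before optimizing the criterion is another, and the abstract does not say which variant the author used.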
Date Created
2020

Novel Deep Learning Models for Medical Imaging Analysis

Description
Deep learning is a sub-field of machine learning in which models are developed to imitate the workings of the human brain in processing data and creating patterns for decision making. This dissertation focuses on developing deep learning models for medical imaging analysis of different modalities for different tasks, including detection, segmentation and classification. Imaging modalities including digital mammography (DM), magnetic resonance imaging (MRI), positron emission tomography (PET) and computed tomography (CT) are studied in the dissertation for various medical applications. The first phase of the research is to develop a novel shallow-deep convolutional neural network (SD-CNN) model for improved breast cancer diagnosis. This model takes one type of medical image as input and synthesizes different modalities as additional feature sources; both the original image and the synthetic image are used for feature generation. The proposed architecture is validated in the application of breast cancer diagnosis and shown to outperform the competing models. Motivated by the success of the first phase, the second phase focuses on improving medical image synthesis performance with an advanced deep learning architecture. A new architecture named deep residual inception encoder-decoder network (RIED-Net) is proposed. RIED-Net has the advantages of preserving pixel-level information and transferring features across modalities. The applicability of RIED-Net is validated in breast cancer diagnosis and Alzheimer’s disease (AD) staging. Recognizing that medical imaging research often involves multiple inter-related tasks, namely detection, segmentation and classification, the third phase of my research is to develop a multi-task deep learning model. Specifically, a feature transfer enabled multi-task deep learning model (FT-MTL-Net) is proposed to transfer high-resolution features from the segmentation task to the low-resolution feature-based classification task. The application of FT-MTL-Net to breast cancer detection, segmentation and classification using DM images is studied. As a continuing effort to explore transfer learning in deep models for medical applications, the last phase is to develop a deep learning model that transfers both features and knowledge from a pre-training age prediction task to the new domain of predicting conversion from mild cognitive impairment (MCI) to AD. It is validated in the application of predicting MCI patients’ conversion to AD with 3D MRI images.
Date Created
2019

Stochastic models of patient access management in healthcare

Description
This dissertation addresses access management problems that occur in both emergency and outpatient clinics with the objective of allocating the available resources to improve performance measures by considering the trade-offs. Two main settings are considered: an outpatient setting, which involves estimating patient willingness-to-wait (WtW) behavior for outpatient appointments through statistical analyses of data and allocating the limited booking horizon to patients of different priorities by using time windows that account for patient behavior; and an emergency setting, which involves allocating hospital beds to admitted Emergency Department (ED) patients. For each chapter, a different approach based on the problem context is developed, and its performance is analyzed by implementing analytical and simulation models. Real hospital data is used in the analyses to provide evidence that the methodologies introduced are beneficial in addressing real-life problems, and that real improvements are achievable by using the suggested policies.

This dissertation starts by studying an outpatient clinic context to develop an effective resource allocation mechanism that can improve patient access to clinic appointments. I first identify patient behavior in terms of willingness-to-wait for an outpatient appointment. Two statistical models are developed to estimate the patient WtW distribution by using data on booked appointments and appointment requests. Several analyses are conducted on simulated data to assess the effectiveness and accuracy of the estimates.

Then, this dissertation introduces a time-window-based policy that utilizes patient behavior to improve access by using appointment delay as a lever. The policy improves patient access by allocating the available capacity among patients of different priorities: the booking horizon is divided into time intervals assigned to each priority group, which strategically delays lower-priority patients.

Finally, patient routing between the ED and inpatient units is studied with the aim of improving patient access to hospital beds. The strategy that captures the trade-off between patient safety and quality of care is characterized as a threshold-type policy. Through simulation experiments built on real data collected from a hospital, the improvement achievable by implementing such a strategy, which considers the safety-quality of care trade-off, is illustrated.
Date Created
2019

Novel Semi-Supervised Learning Models to Balance Data Inclusivity and Usability in Healthcare Applications

Description
Semi-supervised learning (SSL) is a sub-field of statistical machine learning that is useful for problems that involve only a few labeled instances with predictor (X) and target (Y) information, and an abundance of unlabeled instances that have only predictor (X) information. SSL harnesses the target information available in the limited labeled data, as well as the information in the abundant unlabeled data, to build strong predictive models. However, not all the included information is useful. For example, some features may correspond to noise, and including them will hurt the predictive model's performance. Additionally, some instances may not be as relevant to model building, and their inclusion will increase training time and potentially hurt model performance. The objective of this research is to develop novel SSL models to balance data inclusivity and usability. My dissertation research focuses on applications of SSL in healthcare, driven by problems in brain cancer radiomics, migraine imaging, and Parkinson’s Disease telemonitoring.

The first topic introduces an integration of machine learning (ML) and a mechanistic model (PI) to develop an SSL model applied to predicting cell density of glioblastoma brain cancer using multi-parametric medical images. The proposed ML-PI hybrid model integrates imaging information from unbiopsied regions of the brain as well as underlying biological knowledge from the mechanistic model to predict spatial tumor density in the brain.

The second topic develops a multi-modality imaging-based diagnostic decision support system (MMI-DDS). MMI-DDS consists of modality-wise principal components analysis to incorporate imaging features at different aggregation levels (e.g., voxel-wise, connectivity-based, etc.), a constrained particle swarm optimization (cPSO) feature selection algorithm, and a clinical utility engine that utilizes inverse operators on chosen principal components for white-box classification models.

The final topic develops a new SSL regression model with integrated feature and instance selection called s2SSL (with “s2” referring to selection in two different ways: feature and instance). s2SSL integrates cPSO feature selection and graph-based instance selection to simultaneously choose the optimal features and instances and build accurate models for continuous prediction. s2SSL was applied to smartphone-based telemonitoring of Parkinson’s Disease patients.
Date Created
2019

Image-based process monitoring via generative adversarial autoencoder with applications to rolling defect detection

Description
Image-based process monitoring has recently attracted increasing attention due to advances in sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images because of their complex characteristics, including high dimensionality and complex spatial structures. Recent advances in unsupervised deep models such as the generative adversarial network (GAN) and the generative adversarial autoencoder (AAE) have made it possible to learn these complex spatial structures automatically. Inspired by these advances, we propose an AAE-based framework for unsupervised anomaly detection in images. AAE combines the power of GAN with the variational autoencoder, which serves as a nonlinear dimension reduction technique with regularization from the discriminator. Based on this, we propose a monitoring statistic that efficiently captures changes in the image data. The performance of the proposed AAE-based anomaly detection algorithm is validated through a simulation study and a real case study on rolling defect detection.
Date Created
2019

Computational design and study of structural and dynamic nucleic acid systems

Description
DNA and RNA are generally regarded as among the central molecules in molecular biology. Recent advances in the field of DNA/RNA nanotechnology have demonstrated the successful use of DNA/RNA as programmable molecules to construct nano-objects with predefined shapes and dynamic molecular machines for various functions. From the perspective of structural design with nucleic acids, there are basically two types of assembly methods, DNA tile-based assembly and DNA origami-based assembly, used to construct infinite-sized crystal structures and finite-sized molecular structures. The assembled structures can be used to arrange other molecules or nanoparticles with nanometer resolution to create new types of materials. Dynamic nucleic acid machines are based on DNA strand displacement, which allows two nucleic acid strands to hybridize with each other and displace one or more prehybridized strands in the process. Strand displacement reactions have been implemented to construct a variety of dynamic molecular systems, such as molecular computers, oscillators, and in vivo devices for gene expression control.

This thesis will focus on the computational design of structural and dynamic nucleic acid systems, particularly for new types of DNA structure design and high-precision control of gene expression in vivo. First, a new type of fundamental DNA structural motif, the layered-crossover motif, will be introduced. The layered crossover allows non-parallel alignment of DNA helices at precisely controlled angles. By using the layered-crossover motif, the scaffold can be routed through 3D framework DNA origami structures. The precise angle control of the layered-crossover tiles can also be used to assemble 2D and 3D crystals. On the dynamic control side, a de-novo-designed riboregulator is developed that can recognize single-nucleotide variations. The riboregulators can also be used to develop paper-based diagnostic devices.
Date Created
2019

Sensing and regulation from nucleic acid devices

Description
The highly predictable structural and thermodynamic behavior of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) has made them versatile tools for creating artificial nanostructures over a broad range. Moreover, DNA and RNA are able to interact with biological ligands as either synthetic aptamers or natural components, conferring direct biological functions on nucleic acid devices. The applications of nucleic acids rely greatly on their bio-reactivity and specificity when applied to highly complex biological systems.

This dissertation aims to 1) develop a new strategy to identify high-affinity nucleic acid aptamers against biological ligands; and 2) explore highly orthogonal RNA riboregulators in vivo for constructing multi-input gene circuits with NOT logic. With the aid of a DNA nanoscaffold, pairs of hetero-bivalent aptamers for human alpha thrombin were identified with ultra-high binding affinity in the femtomolar range, displaying potent biological modulation of the enzyme's activity. The newly identified bivalent aptamers enrich the aptamer toolbox for future therapeutic applications in hemostasis, and the strategy can potentially be developed for other target molecules. Secondly, by incorporating a three-way junction (3WJ) structure into the riboregulator through de-novo design, we identified a family of high-performance RNA-sensing translational repressors that down-regulate gene translation in response to cognate RNAs with remarkable dynamic range and orthogonality. Harnessing the 3WJ repressors as modular parts, we integrate them into biological circuits that execute universal NAND and NOR logic with up to four independent RNA inputs in Escherichia coli.
Date Created
2019

Fast forward and inverse wave propagation for tomographic imaging of defects in solids

Description
Aging-related damage and failure in structures, such as fatigue cracking, corrosion, and delamination, are critical for structural integrity. Most engineering structures have embedded defects such as voids, cracks, and inclusions from manufacturing. The properties and locations of embedded defects are generally unknown and hard to detect in complex engineering structures. Therefore, early detection of damage is beneficial for prognosis and risk management of aging infrastructure systems.

Non-destructive testing (NDT) and structural health monitoring (SHM) are widely used for this purpose. Different types of NDT techniques have been proposed for damage detection, such as optical imaging, ultrasound waves, thermography, eddy current, and microwaves. The focus of this study is on wave-based detection methods, which are grouped into two major categories: feature-based damage detection and model-assisted damage detection. Both approaches have their own pros and cons. Feature-based damage detection is usually very fast and does not involve solving the physical model. The key idea is to reduce the dimension of the signals to achieve efficient damage detection. The disadvantage is that the loss of information due to feature extraction can induce significant uncertainties and reduce the resolution. The resolution of the feature-based approach depends heavily on the sensing path density. Model-assisted damage detection has the opposite profile. It is capable of high-resolution imaging with a limited number of sensing paths, since the entire signal histories are used for damage identification. However, model-based methods are time-consuming because they require the inverse wave propagation solution, which is especially costly for large 3D structures.

The motivation of the proposed method is to develop an efficient and accurate model-based damage imaging technique with limited data. The special focus is on the efficiency of the damage imaging algorithm, as it is the major bottleneck of the model-assisted approach. The computational efficiency is achieved by two complementary components. First, a fast forward wave propagation solver is developed; it is verified against the classical finite element method (FEM) solution and is 10-20 times faster. Next, an efficient inverse wave propagation algorithm is proposed. Classical gradient-based optimization algorithms usually require the finite difference method for gradient calculation, which is prohibitively expensive for a large number of degrees of freedom. An adjoint method-based optimization algorithm is proposed, which avoids the repetitive finite difference calculations for every imaging variable. Thus, superior computational efficiency can be achieved by combining these two methods for damage imaging. A coupled piezoelectric (PZT) damage imaging model is proposed to include the interaction between the PZT and the host structure. Following the formulation of the framework, experimental validation is performed on isotropic and anisotropic materials with defects such as cracks, delamination, and voids. The results show that the proposed method can detect and reconstruct multiple damage sites simultaneously and efficiently, which makes it promising for application to complex large-scale engineering structures.
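The cost argument for the adjoint method can be illustrated on a toy problem. The sketch below is not the dissertation's wave solver: it assumes a small static linear system A(m) u = f with A(m) = K + diag(m) as a stand-in forward model, a least-squares misfit J(m) = 0.5 ||u(m) - d||^2, and hypothetical function names. The adjoint gradient needs one forward solve and one adjoint solve regardless of how many imaging variables m contains, whereas central finite differences need two forward solves per variable.

```python
import numpy as np

def assemble(m):
    """Stand-in forward operator A(m) = K + diag(m), with K a 1D stiffness-like matrix."""
    n = m.size
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return K + np.diag(m)

def misfit(m, f, d):
    """J(m) = 0.5 * ||u(m) - d||^2, where u solves A(m) u = f."""
    u = np.linalg.solve(assemble(m), f)
    return 0.5 * np.sum((u - d) ** 2)

def adjoint_gradient(m, f, d):
    """Gradient of J from one forward solve and one adjoint solve."""
    A = assemble(m)
    u = np.linalg.solve(A, f)            # forward problem
    lam = np.linalg.solve(A.T, u - d)    # adjoint problem: A^T lam = u - d
    # dJ/dm_i = -lam^T (dA/dm_i) u, and dA/dm_i = e_i e_i^T for this A(m).
    return -lam * u

def fd_gradient(m, f, d, h=1e-6):
    """Central finite differences: two forward solves per variable."""
    g = np.zeros_like(m)
    for i in range(m.size):
        e = np.zeros_like(m)
        e[i] = h
        g[i] = (misfit(m + e, f, d) - misfit(m - e, f, d)) / (2.0 * h)
    return g
```

On this toy problem the two gradients agree to within finite difference error, while the adjoint version replaces 2n forward solves with a single pair of solves for n imaging variables.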
Date Created
2019

Strategies to enhance RNA-origami-based immunotherapeutics for an induction of long-term tumor-regression

Description
Recently, we have demonstrated that a novel RNA origami (RNA-OG) nanostructure functions as a TLR3 agonist both in vitro and in vivo. This RNA nanostructure could induce effective antitumor immunity in a CT26-OVA-iRFP tumor model that expresses both ovalbumin (OVA) and near-infrared fluorescent protein (iRFP), resulting in a significant delay in tumor growth or complete tumor regression. However, in a similar tumor line that expresses iRFP but not OVA, i.e. a CT26-Neo-iRFP model, RNA-OG induced responses that were consistently inferior to those observed in CT26-OVA-iRFP. Interestingly, the antitumor immunity initially generated against CT26-OVA-iRFP was found to render the mice immune to a challenge with the more malignant CT26-Neo-iRFP line. In addition to OVA expression, the two cell lines also showed different levels of MHC-I. Ongoing research is focused on deciphering the molecular nature of the different responses. With that understanding, we can search for strategies that increase tumor immunogenicity and thereby improve the therapeutic efficacy of RNA-OG for inducing long-term tumor regression.
Date Created
2019-05

New statistical transfer learning models for health care applications

Description
Transfer learning is a sub-field of statistical modeling and machine learning. It refers to methods that integrate the knowledge of other domains (called source domains) and the data of the target domain in a mathematically rigorous and intelligent way, to develop a better model for the target domain than a model using the data of the target domain alone. While transfer learning is a promising approach in various application domains, my dissertation research focuses on the particular application in health care, including telemonitoring of Parkinson’s Disease (PD) and radiomics for glioblastoma.

The first topic is a Mixed Effects Transfer Learning (METL) model that can flexibly incorporate mixed effects and a general-form covariance matrix to better account for similarity and heterogeneity across subjects. I further develop computationally efficient procedures to handle unknown parameters and large covariance structures. Domain relations, such as domain similarity and domain covariance structure, are automatically quantified in the estimation steps. I demonstrate METL in an application of smartphone-based telemonitoring of PD.

The second topic focuses on an MRI-based transfer learning algorithm for non-invasive surgical guidance of glioblastoma patients. The limited number of biopsy samples per patient makes it challenging to build a patient-specific model for glioblastoma. A transfer learning framework helps to leverage other patients’ knowledge for building a better predictive model. When modeling a target patient, not every patient’s information is helpful. Deciding the subset of other patients from which to transfer information to the modeling of the target patient is an important task in building an accurate predictive model. I define the subset of “transferrable” patients as those who have a positive rCBV-cell density correlation, because a positive correlation is confirmed by imaging theory and the respective literature.
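The selection rule described above, keeping only patients whose rCBV and cell density are positively correlated, can be sketched as follows. This is an illustration under assumptions rather than the author's implementation: the function name `transferable_patients`, the input layout, the use of the Pearson correlation, and the minimum of three biopsy samples per patient are all hypothetical choices.

```python
import numpy as np

def transferable_patients(samples, min_samples=3):
    """Return IDs of patients with a positive rCBV-cell density correlation.

    samples: dict mapping patient ID -> (rcbv values, cell density values),
             one pair of values per biopsy sample (hypothetical layout).
    """
    selected = []
    for pid, (rcbv, density) in samples.items():
        rcbv = np.asarray(rcbv, dtype=float)
        density = np.asarray(density, dtype=float)
        # Skip patients with too few samples or no variation (assumption).
        if rcbv.size < min_samples or rcbv.std() == 0 or density.std() == 0:
            continue
        r = np.corrcoef(rcbv, density)[0, 1]   # Pearson correlation
        if r > 0:
            selected.append(pid)
    return selected

# Hypothetical biopsy data for three patients.
data = {
    "A": ([1.0, 2.0, 3.0, 4.0], [0.2, 0.4, 0.5, 0.9]),   # positive correlation
    "B": ([1.0, 2.0, 3.0, 4.0], [0.9, 0.6, 0.5, 0.1]),   # negative correlation
    "C": ([1.0, 2.0], [0.1, 0.2]),                        # too few samples
}
print(transferable_patients(data))
```

With so few biopsy samples per patient the sample correlation is noisy, so a real selection step would likely need a more careful criterion than the sign of a single estimate.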

The last topic is a Privacy-Preserving Positive Transfer Learning (P3TL) model. Although negative transfer has been recognized as an important issue by the transfer learning research community, there is a lack of theoretical studies in evaluating the risk of negative transfer for a transfer learning method and identifying what causes the negative transfer. My work addresses this issue. Driven by the theoretical insights, I extend Bayesian Parameter Transfer (BPT) to a new method, i.e., P3TL. The unique features of P3TL include intelligent selection of patients to transfer in order to avoid negative transfer and maintain patient privacy. These features make P3TL an excellent model for telemonitoring of PD using an At-Home Testing Device.
Date Created
2018