Programming Nucleic Acid Systems through Computation Design: from Dynamic Reaction to Complex Self Assembly

Description
As a rapidly evolving field, nucleic acid nanotechnology focuses on creating functional nanostructures and dynamic devices by harnessing the programmability of nucleic acids, including deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), enabled by predictable Watson-Crick base pairing. Precise control over sequence and structure, along with the development of simulation software for predicting experimental outcomes, provides the basis for designing structures and devices with arbitrary topology and operational logic at the nanoscale. Over the past 40 years, this thriving field has pushed the boundaries of nucleic acids from biological macromolecules to functional building blocks, and applications in biomedicine, molecular diagnostics and imaging, materials science, electronics, crystallography, and more have emerged through programming sequences and generating various structures and devices. The underlying logic of nucleic acid programming is the base pairing rule, which is straightforward and robust. For the complicated design of sequences and the quantitative understanding of the programmed results, however, computational tools markedly reduce the level of difficulty and can even meet challenges that cannot be addressed by manual effort. This thesis presents three individual projects, all of which interweave theory/computation and experiment. At a higher level of abstraction, this dissertation covers the biophysical understanding of dynamic reactions, the design and realization of complex self-assembly systems, and finally super-resolution imaging. More specifically, Chapter 2 describes a study of RNA strand displacement kinetics with a dedicated model for extracting the reaction rates, providing guidelines for the rational design and regulation of strand displacement reactions and, eventually, biochemical processes. Chapter 3 presents a platform for designing self-assembly targets with complex symmetry, along with the first experimental implementation of the assembly of pyrochlore lattices from DNA origami, which can potentially be applied as optical materials to manipulate light. Chapter 4 focuses on the in-solution characterization of the periodicity of DNA origami lattices with super-resolution microscopy, with algorithms in development for three-dimensional structural reconstruction.
Date Created
2023
Agent

Allosteric Control of RNA Molecular Clamp through Mechanical Tension

Description

Molecular engineering is an emerging field that aims to create functional devices for modular purposes, particularly through the bottom-up design of nano-assemblies that use mechanical and chemical methods to perform complex tasks. In this study, we present a novel method for constructing an RNA clamp using circularized RNA and a broccoli aptamer for fluorescence sensing. By designing a circular RNA containing the broccoli aptamer, together with a complementary DNA strand, we created a molecular clamp that can stabilize the aptamer. The broccoli aptamer displays enhanced fluorescence when bound to its ligand, DFHBI-1T. Upon addition of this small molecule, the clamp can either permit or suppress fluorescence. We demonstrated that the level of fluorescence of the RNA clamp can be regulated by introducing different complementary DNA strands. Additionally, we designed allosteric control by introducing new DNA strands, making the system reversible. We explored the use of mechanical tension to regulate RNA function by using the RNA clamp as a spring-like element attached to two points on the RNA surface. By adjusting the stiffness of the spring, we could control the tension between the two points and induce reversible conformational changes, effectively turning RNA function on and off. Our approach offers a simple and versatile method for creating RNA clamps with various applications, including RNA detection, regulation, and future nanodevice design. Our findings highlight the crucial role of mechanical forces in regulating RNA function, paving the way for new strategies for RNA manipulation and potentially advancing molecular engineering. Although this work is ongoing, we report current progress on both the theoretical calculations and the experimental results.

Date Created
2023-05
Agent

Novel Computational Algorithms for Imaging Biomarker Identification

Description
Over the past few decades, medical imaging has become increasingly important in medicine for disease diagnosis, prognosis, treatment assessment, and health monitoring. As medical imaging has progressed, imaging biomarkers have been rapidly developed for early diagnosis and staging of disease. Detecting and segmenting objects in images are often the first steps in the quantitative measurement of these biomarkers. While large objects can often be delineated automatically or semi-automatically, segmenting small objects (blobs) is challenging. The small objects of particular interest in this dissertation are glomeruli in kidney magnetic resonance (MR) images. This problem has its own unique challenges. First, glomeruli are extremely small and closely resemble image noise. Second, a kidney contains a massive number of glomeruli (e.g., over 1 million in a human kidney), and their intensity distribution is heterogeneous. Third, a large portion of glomeruli overlap or touch one another in the images. The goal of this dissertation is to develop computational algorithms to identify and discover glomeruli-related imaging biomarkers. The first phase is to develop a U-Net joint with Hessian-based Difference of Gaussians (UH-DoG) blob detector. Incorporating deep learning alleviates the over-detection issue of Hessian analysis. Next, as an extension of UH-DoG, a small blob detector using Bi-Threshold Constrained Adaptive Scales (BTCAS) is proposed. Deep learning is treated as a prior for the Difference of Gaussians (DoG) to improve its efficiency. By adopting BTCAS, the under-segmentation issue of deep learning is addressed. The second phase is to develop a denoising convexity-consistent Blob Generative Adversarial Network (BlobGAN). BlobGAN achieves high denoising performance and selectively denoises the image without affecting the blobs. These detectors are validated on datasets of 2D fluorescent images, 3D synthetic images, and 3D MR images (18 mice, 3 humans) and are shown to outperform the competing detectors. In the last phase, a Fréchet Descriptors Distance based Coreset approach (FDD-Coreset) is proposed for accelerating BlobGAN's training. Experiments show that BlobGAN trained on the FDD-Coreset not only significantly reduces the training time, but also achieves higher denoising performance and maintains comparable blob identification performance relative to training on the entire dataset.
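As a rough illustration of the Difference of Gaussians idea named in the abstract above, the sketch below detects bright blobs in a 1D signal by subtracting a wide Gaussian blur from a narrow one and keeping thresholded local maxima. It is a toy under stated assumptions, not the dissertation's UH-DoG or BTCAS detector: there is no U-Net, no Hessian analysis, and no adaptive scale, and the function names, the scale ratio of 1.6, and the threshold are all illustrative choices.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve the signal with a Gaussian, replicating edge samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp to valid range
            acc += w * signal[idx]
        out.append(acc)
    return out

def dog_blob_centers(signal, sigma, ratio=1.6, threshold=0.05):
    """Return indices of local maxima of the DoG response above threshold.

    ratio and threshold are assumed values chosen for illustration only.
    """
    narrow = smooth(signal, sigma)
    wide = smooth(signal, sigma * ratio)
    dog = [a - b for a, b in zip(narrow, wide)]
    centers = []
    for i in range(1, len(dog) - 1):
        if dog[i] > threshold and dog[i] > dog[i - 1] and dog[i] >= dog[i + 1]:
            centers.append(i)
    return centers

# Synthetic signal: two bright bumps on a dark background.
signal = [math.exp(-((i - 15) ** 2) / 8.0) + math.exp(-((i - 35) ** 2) / 8.0)
          for i in range(50)]
centers = dog_blob_centers(signal, sigma=2.0)
```

A plain DoG response of this kind fires on any bump near the chosen scale, including noise, which is one way to read the over-detection issue that the abstract says the deep learning component alleviates.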
Date Created
2022
Agent

Software Tools for Design, Simulation, and Characterization of DNA and RNA Nanostructures

Description
Nucleic acid nanotechnology is a field of nanoscale engineering where the sequences of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) molecules are carefully designed to create self-assembled nanostructures with higher spatial resolution than is available to top-down fabrication methods. In the 40-year history of the field, the structures created have scaled from small tile-like structures constructed from a few hundred individual nucleotides to micron-scale structures assembled from millions of nucleotides using the technique of "DNA origami". One of the key drivers of advancement in any modern engineering field is the parallel development of software that facilitates the design of components and performs in silico simulation of the target structure to determine its structural properties and dynamic behavior and to identify defects. For nucleic acid nanotechnology, CaDNAno and oxDNA are the most popular choices for design and simulation, respectively. In this dissertation I will present my work on the oxDNA software ecosystem, including an analysis toolkit, a web-based graphical interface, and a new molecular visualization tool that doubles as a free-form design editor and covers some of the weaknesses of CaDNAno's lattice-based design paradigm. Finally, as a demonstration of the utility of these new tools, I show oxDNA simulation and subsequent analysis of a nanoscale leaf-spring engine capable of converting chemical energy into dynamic motion. OxDNA simulations were used to investigate the effects of design choices on the behavior of the system and to rationalize experimental results.
Date Created
2022
Agent

Hierarchical Sequential Event Prediction and Translation from Aviation Accident Report Data

Description
Sequential event prediction, or sequential pattern mining, is a well-studied topic in the literature. In many real-world scenarios, data arrive sequentially. It is widely believed that event sequences contain repetitive patterns from which future events can be predicted. For example, many companies build recommender systems to predict the next product a user is likely to want based on purchase history. Healthcare systems discover relationships among patients' sequential symptoms to mitigate the adverse effects of a treatment (drugs or surgery). Modern engineering systems, such as aviation, distributed computing, and energy systems, diagnose failure event logs and take prompt action to avoid disaster when a similar failure pattern occurs. In this dissertation, I specifically focus on building a scalable algorithm for event prediction and extraction in the aviation domain. Understanding accident events is a major safety concern in the aviation system. A flight accident is often caused by a sequence of failure events. Accurate modeling of the failure event sequence and how it leads to the final accident is important for aviation safety. This work aims to study the relationships within the failure event sequence and to evaluate the risk of the final accident according to these failure events. I address three major challenges. (1) Modeling sequential events with hierarchical structure: I aim to improve the prediction accuracy by taking advantage of the multi-level, or hierarchical, representation of these rare events. Specifically, I propose a sequential Encoder-Decoder framework with a hierarchical embedding representation of the events. (2) Lack of high-quality and consistent event log data: To acquire more accurate event data from aviation accident reports, I convert the problem into a multi-label classification task. An attention-based Bidirectional Encoder Representations from Transformers (BERT) model is developed to achieve good performance and interpretability. (3) Ontology-based event extraction: To extract detailed events, I propose to solve the problem as a hierarchical classification task, and I improve the model performance by incorporating an event ontology. By solving these three challenges, I provide a framework to extract events from narrative reports and estimate the risk level of aviation accidents through event sequence modeling.
Date Created
2022
Agent

Development of Tools for Planning and Coordinating the Production of Small Farmers as a Response to Market Opportunities

Description
For multiple reasons, the consumption of fresh fruits and vegetables in the United States has progressively increased. This has resulted in increased domestic production and importation of these products. The associated logistics are complex due to the perishability of these products, and most current logistics systems rely on marketing and supply chain practices that result in high levels of food waste and limited diversity of offerings. For instance, given their lack of critical mass, small growers are conspicuously absent from mainstream distribution channels. One way to obtain this critical mass is through associative schemes such as co-ops. However, the success of traditional associative schemes has been mixed at best. This dissertation develops decision support tools to facilitate the formation of coalitions of small growers in complementary production regions that act as a single supplier. This dissertation demonstrates the benefits and efficiency that could be achieved by these coalitions, presents a methodology to efficiently distribute the value of a newly identified market opportunity among the growers participating in the coalition, and develops a negotiation framework between one or more buyers and the agent representing the coalition that results in a prototype contract. There are four main areas of research contributions in this dissertation. The first is the development of optimization tools to allocate a market opportunity to potential production regions while considering consumer preferences for special denomination labels such as "local", "organic", etc. The second is the development of a stochastic optimization and revenue-distribution framework for the formation of coalitions of growers to maximize the captured value of a market opportunity. The framework considers the growers' individual preferences and production characteristics (yields, resources, etc.) to develop supply contracts that entice their participation in the coalition. The third is the development of a negotiation mechanism to design contracts between buyers and groups of growers, considering profit expectations and the variability of future demand. The final contribution is the integration of these models and tools into a framework capable of transforming new market opportunities into implementable production plans and contractual agreements between the different supply chain participants.
Date Created
2022
Agent

Uncertainty Quantification and Prognostics using Bayesian Statistics and Machine Learning

Description
Uncertainty quantification is critical for engineering design and analysis. Determining appropriate ways of dealing with uncertainties has been a constant challenge in engineering. Statistical methods provide a powerful aid to describe and understand uncertainties. Among statistical methods, this work focuses on applying Bayesian methods and machine learning to uncertainty quantification and prognostics. The main engineering application is the mechanical properties of materials, both static and fatigue. This work can be summarized in the following items. First, maintaining the safety of vintage pipelines requires accurately estimating their strength. The objective is to predict the reliability-based strength using nondestructive multimodality surface information. Bayesian model averaging (BMA) is implemented to fuse multimodality nondestructive testing results for gas pipeline strength estimation. Several incremental improvements are proposed in the algorithm implementation. Second, the objective is to develop a statistical uncertainty quantification method for fatigue stress-life (S-N) curves with sparse data. Hierarchical Bayesian data augmentation (HBDA) is proposed to integrate hierarchical Bayesian modeling (HBM) and Bayesian data augmentation (BDA) to deal with sparse data problems for fatigue S-N curves. The third objective is to develop a physics-guided machine learning model to overcome limitations in parametric regression models and classical machine learning models for fatigue data analysis. A Probabilistic Physics-guided Neural Network (PPgNN) is proposed for probabilistic fatigue S-N curve estimation. This model is further developed for missing data and arbitrary output distribution problems. Fourth, multi-fidelity modeling combines the advantages of low- and high-fidelity models to achieve a required accuracy at a reasonable computational cost. The fourth objective is to develop a neural network approach for multi-fidelity modeling by learning the correlation between low- and high-fidelity models. Finally, conclusions are drawn, and future work is outlined based on the current study.
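For readers unfamiliar with Bayesian model averaging, mentioned in the abstract above, the sketch below shows the generic textbook form: each candidate model receives a posterior weight proportional to its prior times its marginal likelihood, and the fused prediction is the weighted mixture. This is not the dissertation's pipeline-strength implementation; the function names are hypothetical, and the numbers in the usage lines are made up purely to exercise the arithmetic.

```python
import math

def bma_weights(log_evidences, priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Uses a uniform prior over models when none is given (an assumption).
    """
    m = len(log_evidences)
    if priors is None:
        priors = [1.0 / m] * m
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)  # subtract the max for numerical stability
    unnorm = [math.exp(lp - shift) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_predict(means, variances, weights):
    """Mean and variance of the model-averaged (mixture) prediction."""
    mean = sum(w * mu for w, mu in zip(weights, means))
    # Total variance = within-model variance + between-model spread.
    var = sum(w * (v + (mu - mean) ** 2)
              for w, mu, v in zip(weights, means, variances))
    return mean, var

# Two hypothetical models with equal evidence and different predictions.
weights = bma_weights([0.0, 0.0])
mean, var = bma_predict([400.0, 420.0], [100.0, 100.0], weights)
```

The between-model term in the variance is what lets the averaged prediction reflect disagreement among models, rather than only the uncertainty each model reports for itself.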
Date Created
2022
Agent

Outlier-Aware Applications in High-Dimensional Industrial Systems

Description
High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant can take images of millions of pixels, or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers typically exist in such data, and they may be of direct or indirect interest given the nature of the problem being solved. Current research does not address the interdependent nature of dimensionality reduction and outliers. Some works ignore the existence of outliers altogether, which undermines the robustness of these methods in real life, while others provide suboptimal, often band-aid solutions. In this dissertation, I propose novel methods to achieve outlier-awareness in various dimensionality reduction methods. The problem is considered from many different angles depending on the dimensionality reduction technique used (e.g., deep autoencoder, tensors), the nature of the application (e.g., manufacturing, transportation), and the outlier structure (e.g., sparse point anomalies, novelties).
Date Created
2021
Agent

A Disease Progression Modeling Framework for Nonalcoholic Steatohepatitis Using Multiparametric Serial Magnetic Resonance Imaging and Elastography

Description
Nonalcoholic Steatohepatitis (NASH) is a severe form of nonalcoholic fatty liver disease that is caused by excessive calorie intake and a sedentary lifestyle in the absence of heavy alcohol consumption. It is widely prevalent in the United States and in many other developed countries, affecting up to 25 percent of the population. Because it is asymptomatic, it usually goes unnoticed and may lead to liver failure if not treated in time. Currently, liver biopsy is the gold standard for diagnosing NASH, but as an invasive procedure it comes with its own complications, along with the inconvenience of repeated sampling over a period of time. Hence, noninvasive procedures to assess NASH are urgently required. Magnetic Resonance Elastography (MRE) based shear stiffness and loss modulus, along with Magnetic Resonance Imaging based proton density fat fraction, have been successfully combined to predict NASH stages. However, their role in the prediction of disease progression remains to be investigated. This thesis therefore combines features from serial MRE observations to develop statistical models that predict NASH progression. It uses data from an experiment conducted on male mice to develop progressive and regressive NASH, and it trains two ordinal models, ordered probit regression and ordinal forest, on labels generated from a logistic regression model. The models are assessed on histological data collected at the end point of the experiment. The models developed provide a framework for using a noninvasive tool to predict NASH disease progression.
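As a brief illustration of the ordered probit regression mentioned in the abstract above, the sketch below computes category probabilities in the standard textbook formulation: a linear predictor is compared against ordered cutpoints through the standard normal CDF. The coefficients, cutpoints, and single feature are hypothetical placeholders, not values or variables from the thesis, and no model fitting is shown.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(x, beta, cutpoints):
    """Probabilities of each ordinal category for one observation.

    x and beta are equal-length lists; cutpoints must be strictly
    increasing. With K-1 cutpoints the result has K entries:
    P(y = k) = Phi(c_k - x.beta) - Phi(c_{k-1} - x.beta).
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cdf = [normal_cdf(c - eta) for c in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]

# Hypothetical example: one feature, three ordered stages.
low = ordered_probit_probs([0.0], beta=[1.0], cutpoints=[-1.0, 1.0])
high = ordered_probit_probs([2.0], beta=[1.0], cutpoints=[-1.0, 1.0])
```

Raising the linear predictor shifts probability mass toward the higher categories, which is the property that makes the model suitable for ordered outcomes such as disease stages.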
Date Created
2021
Agent

Engineered Excitonic Complex Directed by Programmable DNA Architectures

Description
Efficient light collection and utilization are essential for developing effective photonic devices and materials. Nature is the master of organizing photosynthetic pigments into a densely packed state without self-quenching and of conducting efficient, directed energy transfer by using sophisticated proteins as scaffolds. The natural light-harvesting complex inspires the design of artificial photonic systems that use synthetic templates to control the spatial arrangement and energy landscape of photoactive components. Self-assembled DNA nanostructures are highly programmable and intrinsically addressable, which makes them excellent templates for the precise organization of chromophores with the desired complexity, yielding artificial light-harvesting systems and photonic nanodevices for efficient photon capture and excitation energy transport. This dissertation focuses on the fundamental understanding and rational engineering of a series of artificial excitonic systems that use programmable DNA architectures as templates to direct the self-assembly of cyanine dye aggregates. First, DNA-templated pseudoisocyanine (PIC) dye aggregates were systematically studied to explore the effect of the sequence and length of the DNA templates on their excitonic properties. The results revealed that the PIC dye aggregates enable energy transfer along a defined track. Next, the benzothiazole cyanine dye K21 was introduced to form dye aggregates on double-stranded DNA templates. The strong intermolecular coupling and weak sequence dependence of the K21 aggregates make it possible to mediate efficient directional energy transfer over distances of up to 30 nm. Finally, DNA helix-bundle structures with extended size and complicated geometries were employed to organize the K21 dye into scalable, addressable, and programmable excitonic complexes that conduct sub-micron-scale directional exciton transport and serve as robust and modular building blocks for constructing higher-order excitonic architectures.
Date Created
2021
Agent