Bayesian Approach in Addressing Simultaneous Gene Network Model Selection and Parameter Estimation with Snapshot Data

Description
Gene expression models are key to understanding and predicting transcriptional dynamics. This thesis devises a computational method that efficiently explores a large, highly correlated parameter space, allowing the author to accurately deduce the underlying gene network model from discrete, stochastic mRNA counts derived through the non-invasive imaging method of single-molecule fluorescence in situ hybridization (smFISH). An underlying gene network model consists of the number of gene states (distinguished by distinct production rates) and all associated kinetic rate parameters. In this thesis, the author constructs an algorithm based on Bayesian parametric and nonparametric theory, extending traditional single-gene network inference tools. First, the efficiency of classic Markov chain Monte Carlo (MCMC) sampling is increased by combining three schemes known in the Bayesian statistical computing community: 1) Adaptive Metropolis-Hastings (AMH), 2) Hamiltonian Monte Carlo (HMC), and 3) Parallel Tempering (PT). Combining these three methods decreases the autocorrelation between sequential MCMC samples, reducing the number of samples required to represent the posterior probability distribution accurately. Second, by employing Bayesian nonparametric methods, the author is able to evaluate discrete and continuous parameters simultaneously, enabling the method to infer the structure of the gene network and all kinetic parameters, respectively. Because the approach is Bayesian, uncertainty is evaluated for the gene network model jointly with the kinetic parameters. Tools from Bayesian nonparametric theory equip the method to sample from the posterior distribution over all possible gene network models without pre-defining the gene network structure, i.e., the number of gene states.
The author verifies the method’s robustness through the use of synthetic snapshot data, designed to closely represent experimental smFISH data sets, across a range of gene network model structures, parameters and experimental settings (number of probed cells and timepoints).
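As a rough illustration of one of the three sampling schemes named above, the following is a minimal sketch of parallel tempering on a toy problem. It is not the thesis's implementation: the bimodal target, the temperature ladder `betas`, the step size, and all function names are assumptions chosen for illustration, and the within-chain update is a plain random-walk Metropolis step rather than AMH or HMC.

```python
import numpy as np

def log_target(x):
    """Log-density (up to a constant) of an equal mixture of N(-4, 1) and N(4, 1).

    A toy stand-in for a multimodal posterior; not a gene network model.
    """
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def parallel_tempering(n_iter=5000, betas=(1.0, 0.5, 0.25, 0.1), step=1.0, seed=0):
    """Run one chain per inverse temperature and return the beta = 1 samples.

    Also returns the fraction of iterations in which a swap was accepted.
    """
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    x = np.zeros(betas.size)          # current state of each tempered chain
    samples = np.empty(n_iter)        # trace of the untempered (beta = 1) chain
    n_swaps_accepted = 0
    for t in range(n_iter):
        # Random-walk Metropolis update of every chain on its tempered target pi(x)^beta.
        prop = x + step * rng.normal(size=x.size)
        log_ratio = betas * (log_target(prop) - log_target(x))
        accept = np.log(rng.uniform(size=x.size)) < log_ratio
        x = np.where(accept, prop, x)
        # Propose exchanging the states of one randomly chosen adjacent pair of chains.
        i = rng.integers(betas.size - 1)
        log_swap = (betas[i] - betas[i + 1]) * (log_target(x[i + 1]) - log_target(x[i]))
        if np.log(rng.uniform()) < log_swap:
            x[i], x[i + 1] = x[i + 1], x[i]
            n_swaps_accepted += 1
        samples[t] = x[0]
    return samples, n_swaps_accepted / n_iter

samples, swap_rate = parallel_tempering()
print("swap acceptance rate:", swap_rate)
print("fraction of samples in the right-hand mode:", np.mean(samples > 0))
```

The hotter chains (smaller `betas`) see a flattened target and cross between modes more easily; accepted swaps pass those states down to the untempered chain, which is the mechanism by which tempering can reduce autocorrelation in multimodal or strongly correlated posteriors.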
Date Created
2024

Misinformation on the Russian-Ukrainian War: A Case Study

Description

As online media, including social media platforms, become the primary resource for communication, news and information are more present and accessible to consumers than ever before. This research analyzes Twitter data on the ongoing Russian-Ukrainian War to understand the significance of social media during this period in comparison to previous conflicts. The significance of social media in political conflict will be examined through Twitter user analysis and sentiment analysis. This case study will conduct sentiment analysis on a random sample of tweets from a given dataset, followed by user analysis and classification methods. The study will explore the implications for understanding public opinion on the conflict, the strengths and limitations of Twitter as a data source, and the next steps for future research. Highlighting the implications of the research findings will allow consumers and political stakeholders to make more informed decisions in the future.

Date Created
2023-05

Three essays on shrinkage estimation and model selection of linear and nonlinear time series models

Description
The primary objective in time series analysis is forecasting. Raw data often exhibits nonstationary behavior: trends, seasonal cycles, and heteroskedasticity. After the data is transformed to a weakly stationary process, autoregressive moving average (ARMA) models may capture the remaining temporal dynamics to improve forecasting. ARMA models can be estimated by regressing current values on previous realizations and proxy innovations. The classic paradigm fails when dynamics are nonlinear; in this case, parametric, regime-switching specifications model changes in level, ARMA dynamics, and volatility using a finite number of latent states. If the states can be identified using past endogenous or exogenous information, a threshold autoregressive (TAR) or logistic smooth transition autoregressive (LSTAR) model may simplify complex nonlinear associations to conditionally weakly stationary processes. For ARMA, TAR, and STAR models, order parameters quantify the extent to which past information is associated with the future. Unfortunately, even if model orders are known a priori, the possibility of over-fitting can lead to sub-optimal forecasting performance. By intentionally overestimating these orders, a linear representation of the full model is exploited and Bayesian regularization can be used to achieve sparsity. Global-local shrinkage priors for AR, MA, and exogenous coefficients are adopted to pull posterior means toward 0 without over-shrinking relevant effects. This dissertation introduces, evaluates, and compares Bayesian techniques that automatically perform model selection and coefficient estimation of ARMA, TAR, and STAR models. Multiple Monte Carlo experiments illustrate the accuracy of these methods in finding the "true" data generating process. Practical applications demonstrate their efficacy in forecasting.
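The idea of overestimating the order and then shrinking coefficients toward zero can be sketched on a pure autoregressive example. This is a simplified stand-in, not the dissertation's method: it uses a ridge penalty (equivalent to a single Gaussian prior on all coefficients) in place of global-local shrinkage priors, and the simulated AR(2) series, the fitted order `p=8`, the penalty `lam`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def lagged_design(y, p):
    """Return (X, target) where row t of X holds y[t-1], ..., y[t-p]."""
    n = len(y)
    X = np.column_stack([y[p - k : n - k] for k in range(1, p + 1)])
    return X, y[p:]

def ridge_ar(y, p, lam):
    """Fit AR(p) coefficients by regressing y[t] on its p lags with a ridge penalty.

    With lam = 0 this is ordinary least squares; lam > 0 pulls all
    coefficients toward zero by the same amount (no local adaptation).
    """
    X, target = lagged_design(y, p)
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ target)

# Simulate a stationary AR(2) series (assumed coefficients 0.6 and -0.3).
rng = np.random.default_rng(1)
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

# Deliberately overestimate the order, then compare unpenalized and shrunken fits.
coef_ols = ridge_ar(y, p=8, lam=0.0)
coef_ridge = ridge_ar(y, p=8, lam=5.0)
print("OLS coefficients:  ", np.round(coef_ols, 3))
print("ridge coefficients:", np.round(coef_ridge, 3))
```

A ridge penalty shrinks relevant and irrelevant lags alike, which is the over-shrinking problem the abstract says global-local priors are adopted to avoid; the sketch only shows the regression-on-lags setup and the basic effect of regularization.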
Date Created
2018

Fundamentals of Blockchain in the Supply Chain

Description
The objective of this paper is to provide an educational overview of blockchain technology and its application to the supply chain. Education on the topic is important to prevent misinformation about the capabilities of blockchain. As a new technology, blockchain can be confusing to grasp given the wide range of possibilities it offers, and an overly broad definition can muddle the topic. Instead, the focus will be kept on explaining the technical details of how and why this technology improves the supply chain. The explanation will not be limited to solutions; it will also detail current problems. Both public and private blockchain networks will be explained, along with the solutions they provide in supply chains. In addition, non-blockchain systems will be described that supply important pieces of supply chain operations that blockchain cannot provide. When applied to the supply chain, blockchain provides improved consumer transparency, management of resources, logistics, trade finance, and liquidity.
Date Created
2018-05