Intracellular Amplification for Applications in Single-cell DNA Sequencing

Description
Single-cell DNA sequencing (scDNA-seq) can identify genetic differences between individual cells and has broad applications in studying biology. For example, because scDNA-seq preserves haplotypes, it allows studies that quantify the fitness of individual mutations to also capture the fitness of different combinations of mutations. However, it requires separating cells manually or using machinery, which is time-consuming and costly as every cell requires a separate reaction. Thus, most studies are limited to a few hundred cells, and scaling up is expensive and challenging. This problem also makes it difficult to multiplex samples or to study multiple sample types in the same experiment. To solve these problems, I introduce a novel method for sequencing DNA in heterogeneous cell populations by using the cell itself as a container for sequencing reactions, eliminating the need to isolate individual cells. The method involves diffusing DNA polymerase and barcoded primers into intact cells and amplifying their DNA intracellularly. To ensure that DNA from each cell can be uniquely identified, I use combinatorial barcoding, which assigns a specific barcode to each cell using a unique combination of non-unique nucleotide block sequences. This allows for the pooling of cells, making the method multiplexable and enabling the analysis of dozens of samples containing thousands of cells. The method is flexible, supporting both targeted sequencing of a region of interest and whole-genome sequencing. I optimize the method for various organisms and applications so it can be made accessible to a wide range of research groups.
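The idea behind combinatorial barcoding can be illustrated with a minimal sketch. The block sequences, the number of rounds, and the function name below are hypothetical and are not taken from the thesis; the sketch only shows how a small set of non-unique blocks multiplies into a much larger set of unique combined barcodes.

```python
from itertools import product


def combinatorial_barcodes(blocks_per_round):
    """Yield every barcode formed by choosing one block from each round."""
    for combo in product(*blocks_per_round):
        yield "".join(combo)


# Hypothetical block sets (illustration only, not the thesis's sequences).
round1 = ["ACGT", "TGCA", "GATC"]
round2 = ["AAGG", "CCTT"]
round3 = ["AGAG", "CTCT", "GTGT", "TCTC"]

barcodes = list(combinatorial_barcodes([round1, round2, round3]))
n_unique = len(set(barcodes))

# 3 * 2 * 4 block choices give 24 distinct combined barcodes,
# even though only 9 block sequences were synthesized.
print(n_unique)
```

The number of distinguishable cells grows as the product of the block counts per round, which is why a modest number of blocks can in principle label thousands of pooled cells.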
Date Created
2024
Agent

The Impact of Laboratory Conditions on the Estimation of Nucleotide Mutation Rates.

Description
Mutation is the source of heritable variation of genotype and phenotype, on which selection may act. The mutation rate is a fundamental parameter of living things that influences the rate at which evolution may occur, from viral pathogens to human crops and even to aging cells and the emergence of cancer. An understanding of the variables that impact mutation rates and their estimation is necessary to place mutation rate estimates in their proper contexts. To better understand mutation rate estimates, this research investigates the impact of temperature upon transcription error rate estimates; the impact of growing cells in liquid culture vs. on agar plates; the impact of many in vitro variables upon the estimation of deoxyribonucleic acid (DNA) mutation rates from a single sample; and the mutational hazard induced by expressing clustered regularly interspaced short palindromic repeat (CRISPR) proteins in yeast. This research finds that many of the variables tested did not significantly alter the estimation of mutation rates, strengthening the claims of previous mutation rate estimates made across the tree of life by diverse experimental approaches. However, it is clear that sonication is a mutagen of DNA, a finding that was part of an effort which has reduced the sequencing error rate of circle-seq by over 1,000-fold. This research also demonstrates that growth in liquid culture modestly skews the mutation spectrum of MMR- Escherichia coli, though it does not significantly impact the overall mutation rate. Finally, this research demonstrates a modest mutational hazard of expressing Cas9 and similar CRISPR proteins in yeast cells at an un-targeted genomic locus, though it is possible the indel rate has been increased by an order of magnitude.
Date Created
2023
Agent

Computational Genomics of DNA Viruses: Novel Insights into Bacteriophage and Human Cytomegalovirus Evolution

Description
Viruses are the most abundant biological entities on Earth, infecting all types of cellular organisms. Yet less than 1% of the virosphere on our planet has been characterized to date. Viruses are an important driver of bacterial evolution and have significant implications for human health; understanding the relative contributions of various evolutionary forces in shaping their genomic landscapes is therefore of critical importance both mechanistically and clinically. In my thesis, I use computational genomic approaches to gain novel insights into bacteriophage and human cytomegalovirus evolution. In my first two chapters and associated appendices, I characterized the complete genomes of the Cluster P bacteriophage Phegasus and Cluster DR bacteriophage BiggityBass, whose isolation hosts were Mycobacterium smegmatis mc²155 and Gordonia terrae CAG3, respectively. I also determined the bacteriophages' phylogenetic placement and computationally inferred their putative host ranges. For my fourth chapter, I assessed the performance of several of these computational host range prediction tools using a dataset of bacteriophages whose host ranges have been experimentally validated. Finally, in my fifth chapter, I reviewed the key parameters for developing an evolutionary baseline model of another virus, human cytomegalovirus.
Date Created
2023
Agent

Growth-related Mutational Effects and Phenotypic Evolution in Escherichia coli

Description
Phenotypic evolution is of great significance within biology, as it is the culmination of the influence of key evolutionary factors on the expression of genotypes. Deeper studies of the fundamental components, such as fitness effects of mutations and genetic variance within a population, allow one to predict the trajectory of phenotypic evolution. In this regard, the extent to which changes in mutational variance and ongoing natural selection influence the rate of phenotypic evolution has yet to be fully understood. Therefore, this study measured mutational variances and the rate of increase in genetic variance during the experimental evolution of Escherichia coli populations, focusing on two growth-related traits: the population maximum growth rate and carrying capacity. Mutational variances were measured by mutation-accumulation experiments, which allowed for the analysis of the effects of spontaneous mutations on growth-related traits in the absence of selection. This analysis revealed that some evolved populations developed a higher mutational variance for growth-related traits. Further investigation showed that most evolved populations have also developed a greater mutational effect, which could explain the increase in mutational variance. Finally, the genetic variances for most evolved populations are lower than expected in the absence of selection, and the involvement of either stabilizing or directional selection is evident. Future experiments with a larger sample size of experimentally evolved populations, as well as more intermediate timepoints during experimental evolution, may provide further insight regarding the complexities of the evolutionary outcomes of these traits.
Date Created
2023
Agent

Characterizing and Releasing Biological Constraints for Lignocellulosic Bioconversion

Description
Lignocellulose, the major structural component of plant biomass, represents a renewable substrate of enormous biotechnological value. Microbial production of chemicals from lignocellulosic biomass is an attractive alternative to chemical synthesis. However, to create industrially competitive strains that efficiently convert lignocellulose to high-value chemicals, current challenges must be addressed. Redox constraints, allosteric regulation, and transport-related limitations are important bottlenecks limiting the commercial production of renewable chemicals from lignocellulose. Advances in metabolic engineering techniques have enabled researchers to engineer microbial strains that overcome some of these challenges, but new approaches that facilitate the commercial viability of lignocellulose valorization are needed. Biological systems are complex, with a plethora of regulatory systems that must be carefully modulated to efficiently produce and excrete the desired metabolites. In this work, I explore metabolic engineering strategies to address some of the biological constraints limiting bioproduction, such as redox, allosteric, and transport constraints, to facilitate cost-effective lignocellulose bioconversion.
Date Created
2023
Agent

Understanding the Heterogeneity in Gene Regulatory Responses to Misfolded Protein Toxicity

Description
Protein misfolding is a problem faced by all organisms, but the reasons behind misfolded protein toxicity are largely unknown. It is difficult to pinpoint one exact mechanism, as the effects of misfolded proteins can be widespread and variable between cells. To better understand their impacts, here I explore the consequences of misfolded proteins and whether they affect all cells equally or affect some cells more than others. To investigate cell subpopulations, I built and optimized a cutting-edge single-cell RNA sequencing platform (scRNAseq) for yeast. By using scRNAseq, I can study the expression variability of many genes (i.e., how the transcriptomes of single cells differ from one another). To induce misfolding and study how single cells deal with this stress, I use engineered strains with varying degrees of an orthogonal misfolded protein. When I computationally cluster the cells expressing misfolded proteins by their sequenced transcriptomes, I see more cells with the severely misfolded protein in subpopulations undergoing canonical stress responses. For example, I see that these cells tend to overexpress chaperones and upregulate mitochondrial biogenesis and transmembrane transport. These are hallmarks of the “Generalized” or “Environmental Stress Response” (ESR) in yeast. Interestingly, I do not see all components of the ESR upregulated in all cells, which may suggest that the massive transcriptional changes characteristic of the ESR are an artifact of having defined the ESR in bulk studies. Instead, I see some cells activate chaperones, while others activate respiration in response to stress. Another intriguing finding is that growth-supporting proteins, such as ribosomes, have particularly heterogeneous expression levels in cells expressing misfolded proteins. This suggests that these cells potentially reallocate their metabolic functions at the expense of growth, but that not all cells respond in the same way.
In sum, by using my novel single-cell approach, I have gleaned new insights about how cells respond to stress, which can help me better understand diseased cells. These results also show how cells contend with mutation, which commonly causes protein misfolding and is the raw material of evolution. My results are the first to explore single-cell transcriptional responses to protein misfolding and suggest that the toxicity from misfolded proteins may affect some cells’ transcriptomes differently than others.
Date Created
2023
Agent

Investigating the Mutational Hazard of CRISPR-Cas9 in Saccharomyces cerevisiae

Description

A mutation rate refers to the frequency at which DNA mutations occur in an organism over time. In organisms, mutations are the ultimate source of genetic variation on which selection may act. However, a large number of mutations over time can be detrimental to the cell. Measuring mutation rates can give great insight into the capabilities of DNA repair mechanisms as well as the mutagenic potential of selected factors. CRISPR-Cas9 is a powerful tool for genome editing, but its off-target effects are not yet fully studied or understood. With its increasing implementation in science and medicine, it is crucial to understand the mutagenic potential of the tool. S. cerevisiae is a model organism for studying genetics due to its fast growth rate and eukaryotic nature. By integrating CRISPR-Cas9 systems into S. cerevisiae, the mutational burden of the technology can be quantified using fluctuation assays. In this experiment, a fluctuation assay using canavanine selective plates was conducted to determine the mutational burden of CRISPR-Cas9 in S. cerevisiae. Multiple trials revealed that various strains carrying CRISPR-Cas9 systems had a mutation rate up to 3-fold higher than that of wild-type S. cerevisiae. This information is essential for improving the precision and safety of CRISPR-Cas9 editing in various applications, including gene therapy and biotechnology.
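As a rough illustration of how a fluctuation assay yields a mutation rate, the sketch below applies the classical p0 method of Luria and Delbrück, in which the expected number of mutations per culture is m = -ln(p0) and p0 is the fraction of parallel cultures with no resistant colonies. The abstract does not state which estimator was used, so this choice is an assumption, and the colony counts and culture size are hypothetical.

```python
import math


def p0_mutation_rate(mutant_counts, cells_per_culture):
    """Estimate mutations per cell per division with the p0 method.

    mutant_counts: resistant colonies observed on each parallel culture's plate.
    cells_per_culture: final number of cells per culture (assumed equal).
    """
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for count in mutant_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        # p0 of 0 or 1 makes -ln(p0) undefined or uninformative.
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = n_zero / n_cultures
    m = -math.log(p0)  # expected mutational events per culture
    return m / cells_per_culture


# Hypothetical canavanine-resistant colony counts from 10 parallel cultures.
counts = [0, 0, 3, 0, 1, 0, 12, 0, 0, 2]
rate = p0_mutation_rate(counts, cells_per_culture=1e7)
print(f"{rate:.3e} mutations per cell per division")
```

Comparing such estimates between a CRISPR-Cas9 strain and the wild type gives a fold change in mutation rate. Estimators that use the full distribution of counts are generally preferred when few cultures have zero mutants.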

Date Created
2023-05
Agent

Precisely calculating relative fitness advantage (s) for diverse drug resistant mutants to better inform treatment models

Description

Pathogenic drug resistance is a major global health concern. Thus, there is great interest in modeling the behavior of resistant mutations: how quickly they will rise in frequency within a population, and whether they come with fitness tradeoffs that can form the basis of treatment strategies. These models often depend on precise measurements of the relative fitness advantage (s) for each mutation and the strength of the fitness tradeoff that each mutation suffers in other contexts. Precisely quantifying s helps us create more accurate models of how mutants behave under different treatment strategies. For example, P. falciparum acquires antimalarial drug resistance through a series of mutations to a single gene. Prior work in yeast expressing this P. falciparum gene demonstrated that mutations come with tradeoffs. Computational work has demonstrated the possibility of a treatment strategy that enriches for a particular resistant mutation, which then makes the population grow poorly once the drug is removed. This treatment strategy requires knowledge of s and how it changes when multiple mutants are competing across various drug concentrations. Here, we precisely quantified s in varying drug concentrations for five resistant mutants, each of which provides a different degree of resistance to antimalarial drugs. DNA barcodes were used to label each strain, allowing the mutants to be pooled together for direct competition in different concentrations of drug. This will provide data that can make the models more accurate, potentially facilitating more effective drug treatments in the future.
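One common way to estimate s from pooled barcode data is to take the slope of ln(mutant count / reference count) against generations. The sketch below assumes that definition and a simple least-squares fit; the abstract does not specify the calculation used, and the counts, time points, and function name are hypothetical.

```python
import math


def selection_coefficient(generations, mutant_counts, reference_counts):
    """Least-squares slope of ln(mutant/reference) versus generations."""
    log_ratios = [math.log(m / r) for m, r in zip(mutant_counts, reference_counts)]
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(log_ratios) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(generations, log_ratios))
    sxx = sum((t - mean_t) ** 2 for t in generations)
    return sxy / sxx


# Synthetic data: a mutant barcode growing with s = 0.05 per generation
# relative to a reference barcode held at a constant count.
generations = [0, 8, 16, 24]
reference = [1000.0, 1000.0, 1000.0, 1000.0]
mutant = [1000.0 * math.exp(0.05 * t) for t in generations]

s_hat = selection_coefficient(generations, mutant, reference)
print(round(s_hat, 4))
```

With real sequencing data the counts are noisy, so replicate competitions and several time points per drug concentration would be needed to put a confidence interval on each estimate.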

Date Created
2022-05
Agent

Understanding the Heterogeneity in Gene Regulatory Responses to Misfolded Protein Toxicity

Description

Protein misfolding is a problem across all organisms, but the reasons behind misfolded protein (MP) toxicity to cells are largely unknown. To better understand toxicity, I investigate whether toxicity from MPs affects all cells equally or affects some cell subpopulations more than others, such as older cells. To define cell subpopulations, I optimized a cutting-edge single-cell RNA sequencing platform (scRNAseq) for yeast. By using scRNAseq in yeast, I studied the expression variability of many genes across populations of thousands of cells. I studied how the transcriptomes of single cells differ from one another in various conditions: at different stages in the growth phase and with different engineered MPs. Differences in gene expression between strains expressing misfolded vs. properly folded proteins were found, confirming previous proteomic data. Further, I found a greater number of cell subpopulations in an MP-expressing strain compared to a strain expressing a properly folded protein, implying more differentiated subpopulations, potentially in response to toxicity from MPs. This observation is consistent with previous observations that heterogeneity within microbial populations can be beneficial to their fitness by allowing the population to thrive in stressful environments. Thus, my data provide insights about evolutionary biology and how strains respond to stress. Further, after identifying subpopulations with a more severe transcriptional response to MPs, I studied the cells’ physiology to gain insights about why those subpopulations are sensitive to MPs and found an upregulation of markers of aging, stress response, and shortening of lifespan. Observing characteristics of cell subpopulations, I also found differences dependent on stages of the cell cycle. Overall, this study provides insights on the gene regulatory responses associated with MP toxicity by revealing which types of cells are most sensitive to this intracellular threat.

Date Created
2022-05
Agent

Quantifying the Evolution of Fluconazole Resistance in S. cerevisiae Using Molecular Barcodes

Description

One of the largest problems facing modern medicine is drug resistance. Many classes of drugs can be rendered ineffective if their target is able to acquire beneficial mutations. While this is an excellent showcase of the power of evolution, it necessitates the development of increasingly stronger drugs to combat resistant pathogens. Not only is this strategy costly and time-consuming, it is also unsustainable. To contend with this problem, many multi-drug treatment strategies are being explored. Previous studies have shown that resistance to some drug combinations is not possible; for example, resistance to a common antifungal drug, fluconazole, seems impossible in the presence of radicicol. We believe that in order to understand the viability of multi-drug strategies in combating drug resistance, we must understand the full spectrum of resistance mutations that an organism can develop, not just the most common ones. It is possible that rare mutations exist that confer resistance to both drugs. Knowing the frequency of such mutations is important for making predictions about how problematic they will be when multi-drug strategies are used to treat human disease. This experiment aims to expand on previous research on the evolution of drug resistance in S. cerevisiae by using molecular barcodes to track ~100,000 evolving lineages simultaneously. The barcoded cells were evolved with serial transfers for seven weeks (200 generations) in three concentrations of the antifungal fluconazole, three concentrations of the Hsp90 inhibitor radicicol, and four combinations of fluconazole and radicicol. Sequencing data were used to track barcode frequencies over the course of the evolution, allowing us to observe resistant lineages as they rise and quantify differences in resistance evolution across the different conditions. We were able to successfully observe over 100,000 replicates simultaneously, revealing many adaptive lineages in all conditions. Our results also show clear differences across drug concentrations and combinations, with the highest drug concentrations exhibiting distinct behaviors.

Date Created
2021-05
Agent