Bayesian Hierarchical Model for Testing Allele Specific Expression Towards the Alternative Allele

Description
Identifying associations between genotypes and gene expression levels using next-generation technology has enabled systematic interrogation of regulatory variation underlying complex phenotypes. Understanding the source of expression variation has important implications for disease susceptibility, phenotypic diversity, and adaptation (Main, 2009). Interest in the existence of allele-specific expression in autosomal genes evolved with the increased awareness of the important role that variation in non-coding DNA sequences can play in determining phenotypic diversity, and the essential role parent-of-origin expression has in early development (Knight, 2004). As new applications of high-throughput sequencing are conceived, it is becoming increasingly important to develop statistical methods tailored to large and formidably complex data sets in order to maximize the biological insights derived from next-generation sequencing experiments. Here, a Bayesian hierarchical probability model based on the beta-binomial distribution is proposed as a possible approach for quantifying allele-specific expression from whole-genome (WGS) and whole-transcriptome (RNA-seq) data. A pipeline for the analysis of WGS and RNA-seq data sets from ten samples was developed and implemented, and allele-specific expression (ASE) was quantified from both haplotypes in individuals heterozygous at the tested variants using the described methodology. Both the computational and the statistical frameworks accurately quantified ASE, reproducing with high consistency the allele-specific genes already described in the literature. In conclusion, the described methodology provides a solid starting point for quantifying allele-specific expression across whole genomes.
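As a rough illustration of the beta-binomial building block named in the abstract, the sketch below evaluates the probability of observing a given number of alternative-allele reads at a heterozygous site. It is not the thesis's model: the function names, the parameters `alpha` and `beta`, and the tail-probability helper are assumptions introduced here, and the hierarchical priors of the actual method are not described in the abstract.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(k: int, n: int, alpha: float, beta: float) -> float:
    """Log-probability of k alternative-allele reads out of n total reads.

    alpha and beta are hypothetical shape parameters describing how the
    alternative-allele read fraction varies between sites.
    """
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def prob_alt_at_least(k: int, n: int, alpha: float, beta: float) -> float:
    """Upper-tail probability P(K >= k): chance of seeing at least k
    alternative-allele reads under the assumed beta-binomial."""
    return sum(exp(beta_binomial_logpmf(i, n, alpha, beta)) for i in range(k, n + 1))
```

With `alpha == beta`, the distribution is symmetric around balanced expression, so a small value of `prob_alt_at_least(k, n, alpha, beta)` would point to an excess of reads from the alternative allele under these assumptions.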
Date Created
2012-12

The Biological Misunderstandings of HIV/AIDS Make It a Public Health Threat

Description
Affecting over 34 million people worldwide (0.5% of the world population), HIV/AIDS is a pandemic that is not loosening its hold anytime soon (Sidibe 2011). Thirty years since the chaotic emergence of fear and misunderstanding, our knowledge of the virus and its subsequent syndrome has grown exponentially, but how much of this information is really getting to the people who need it? In the corners of the Earth where scientific knowledge rarely reaches, what can we do to stop the deadly spread of this virus? And what of the countries with a large amount of knowledge but still a ravaging problem within their borders? When this information is disseminated (sometimes a matter of ‘if’ in certain countries), it is primarily through unreliable sources; in most of the countries examined it travels through the media, which has a tendency to skew and misinterpret information, especially scientific information. This is the information that enters the lives of people in several different countries, for example the United States, France, China, Brazil, Uganda, and South Africa: misunderstandings of how to protect themselves from the virus and of its effects on the body. These misunderstandings have led to millions of lives lost as myths, such as that showering cures AIDS and that AIDS only infects the ‘sinners’, continue to surface throughout the globe. The public health threat is due to knowledge deficits and to incorrect perceived ‘knowledge’ and ‘awareness of the problem’. Here, a six-country analysis of common misconceptions and the subsequent policies and prevalence rates begins to make clear that the hardest-hit areas are those with the most stigma, the most misguided policies, the most uninformed leadership, and, because of this, the most misled citizens. The biological misunderstandings of HIV/AIDS are at the root of the public health threat that continues to keep its hold on the modern world, 30 years after its documented outbreak.
Date Created
2012-12

Genetically-Related Health Disparities in Sports: Frequency Analysis of Two Single Nucleotide Polymorphisms in NCAA Student Athletes with Sickle Cell Trait

Description
The NCAA recently declared sickle cell trait (SCT) to be a risk factor for sudden illness and death among student athletes. Fetal hemoglobin (HbF) concentration in adults is negatively correlated with disease severity in sickle cell anemia, although its effect on SCT is not fully understood and the concentration is found to have high variability across populations. Two single nucleotide polymorphisms (SNPs) at the human beta globin gene cluster, rs7482144 and rs10128556, contribute to the heritable variation in HbF levels and are associated with increased HbF concentrations in adults. A sample population of NCAA football student athletes was genotyped for these two polymorphisms, and their allele frequencies were compared to those of other populations. The minor alleles of both polymorphisms had a frequency of 0.091 in the sample population, which compared closely with other populations of recent African heritage but was significantly different from European populations. The results of this study will be included in a larger study to predict whether these, among other polymorphisms, can be used as markers to predict susceptibility to heat-related emergencies in NCAA student athletes with SCT, although the small sample size will delay this process until participation in the study increases. Since rs7482144 and rs10128556 exhibit high levels of linkage disequilibrium, and as their contributions to the heritable variability of HbF concentrations tend to differ greatly between populations of different ancestry, further investigations should be aimed at distinguishing between the effects of each SNP in African American, European, and other populations represented in NCAA football before conclusions can be drawn as to their practical use as genetic markers of heat susceptibility in student athletes with SCT.
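For readers unfamiliar with how an allele frequency such as the one reported above is obtained from genotype data, a minimal sketch follows. The function name and the genotype counts in the usage note are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(hom_ref: int, het: int, hom_alt: int) -> float:
    """Minor allele frequency at a biallelic SNP from diploid genotype counts.

    hom_ref, het, hom_alt: numbers of individuals carrying zero, one, and two
    copies of the alternative allele, respectively.
    """
    individuals = hom_ref + het + hom_alt
    if individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    # Each individual carries two alleles, so the denominator is 2 * individuals.
    alt_frequency = (2 * hom_alt + het) / (2 * individuals)
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_frequency, 1.0 - alt_frequency)
```

For example, with made-up counts of 40, 9, and 1 individuals in the three genotype classes, the alternative allele accounts for 11 of 100 allele copies.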
Date Created
2014-05

Modeling the Relationship Between Migration, Selection, and Drift in Populations of Blind Cave Fish

Description
The evolution of blindness in cave animals has been heavily studied; however, little research has been done on the interaction of migration and drift in the development of blindness in these populations. In this study, a model is used to compare the effect that genetic drift has on the fixation of a blindness allele for varying amounts of migration and selection. For populations where the initial frequency is quite low, genetic drift plays a much larger role in the fixation of blindness than in populations where the initial frequency is high. In populations where the initial frequency is high, genetic drift plays almost no role in fixation. Our results suggest that migration plays a greater role in the fate of the blindness allele than selection.
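The abstract does not specify the model, so the following is only a generic Wright-Fisher-style sketch of how selection, migration, and drift can be combined in one generation. Every name and parameter here (`s`, `m`, `p_migrant`, the haploid population of size `n`, the order of the three steps) is an assumption made for illustration.

```python
import random


def next_frequency(p, n, s, m, p_migrant, rng):
    """Advance the blindness-allele frequency p by one generation."""
    # Selection: carriers have relative fitness 1 + s (requires s > -1).
    p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
    # Migration: a fraction m of the population is replaced by migrants
    # whose blindness-allele frequency is p_migrant.
    p_mig = (1 - m) * p_sel + m * p_migrant
    # Drift: binomial sampling of n allele copies for the next generation.
    copies = sum(rng.random() < p_mig for _ in range(n))
    return copies / n


def simulate(p0, n, s, m, generations, p_migrant=0.0, seed=0):
    """Return the allele frequency after the given number of generations."""
    rng = random.Random(seed)
    p = p0
    for _ in range(generations):
        p = next_frequency(p, n, s, m, p_migrant, rng)
    return p


def mean_final_frequency(p0, n, s, m, generations, replicates,
                         p_migrant=0.0, seed=0):
    """Average final frequency over independent replicate populations."""
    finals = [simulate(p0, n, s, m, generations, p_migrant, seed + i)
              for i in range(replicates)]
    return sum(finals) / replicates
```

Comparing `mean_final_frequency` across a grid of `s` and `m` values, for low and high `p0`, is one way such a sketch could be used to explore the kind of comparison the abstract describes.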
Date Created
2014-05

Selection of the AMA-1 Gene in Plasmodium falciparum and Plasmodium vivax

Description
Plasmodium falciparum and Plasmodium vivax are two of the main propagators of human malaria. Both species contain the protein Apical Membrane Antigen 1 (AMA-1), which is involved in the process of host cell invasion. However, the high degree of polymorphism and antigenic diversity in this protein has prevented consistent single-vaccine success. Furthermore, the three main domains within AMA-1 (Domains I, II, and III) possess variable polymorphic features and levels of diversity. Overcoming this issue may require an understanding of the type of selection acting on AMA-1 in P. falciparum and P. vivax. Therefore, this investigation aimed to determine the type of selection acting on the whole AMA-1 coding sequence and on each domain for P. falciparum and P. vivax. Population structure was investigated on a global scale and among individual countries. AMA-1 sequences were obtained from the National Center for Biotechnology. For P. falciparum, 649 complete and 382 partial sequences were obtained. For P. vivax, 395 sequences were obtained (370 partial). The AMA-1 gene in P. falciparum was found to possess high nonsynonymous polymorphism and disproportionately low synonymous polymorphism. Domain I was found to be the most diverse region, with consistently high nonsynonymous substitutions across all countries. Large, positive, and significant Z-test scores indicated the presence of positive selection, while FST and NST values showed low genetic differentiation across populations. Data trends were relatively consistent between the global and country-based analyses. The only country to deviate was Venezuela, which was the only South American country analyzed. Network analyses did not show distinguishable groupings. For P. falciparum, it was concluded that positive diversifying selection was acting on the AMA-1 gene, particularly in Domain I. In AMA-1 of P. vivax, nonsynonymous and synonymous polymorphisms were relatively equal across all analyses. FST and NST values were high, indicating that countries were genetically distinct populations. Network analyses did not show distinguishable groupings; however, the data were limited to small sample sizes. From the data, it was concluded that AMA-1 in P. vivax was evolving neutrally, with selective pressures that did not strongly favor either positive or purifying selection. In addition, different AMA-1 P. vivax strains were genetically distinct, and this genetic identity correlated with geographic region. Therefore, AMA-1 strains in P. falciparum and P. vivax not only evolve differently and undergo different forms of selection, but also require different vaccine development strategies. A combination of strain-specific vaccines along with preventative measures on an environmental level will likely be more effective than trying to achieve a single, comprehensive vaccine.
Date Created
2015-05

Population Structure in the Roundtail Chub (Gila Robusta Complex) of the Gila River Basin as Determined by Microsatellites: Evolutionary and Conservation Implications

Description

Ten microsatellite loci were characterized for 34 locations from roundtail chub (Gila robusta complex) to better resolve patterns of genetic variation among local populations in the lower Colorado River basin. This group has had a complex taxonomic history and previous molecular analyses failed to identify species diagnostic molecular markers. Our results supported previous molecular studies based on allozymes and DNA sequences, which found that most genetic variance was explained by differences among local populations. Samples from most localities were so divergent species-level diagnostic markers were not found. Some geographic samples were discordant with current taxonomy due to admixture or misidentification; therefore, additional morphological studies are necessary. Differences in spatial genetic structure were consistent with differences in connectivity of stream habitats, with the typically mainstem species, G. robusta, exhibiting greater genetic connectedness within the Gila River drainage. No species exhibited strong isolation by distance over the entire stream network, but the two species typically found in headwaters, G. nigra and G. intermedia, exhibited greater than expected genetic similarity between geographically proximate populations, and usually clustered with individuals from the same geographic location and/or sub-basin. These results highlight the significance of microevolutionary processes and importance of maintaining local populations to maximize evolutionary potential for this complex. Augmentation stocking as a conservation management strategy should only occur under extreme circumstances, and potential source populations should be geographically proximate stocks of the same species, especially for the headwater forms.

Date Created
2015-10-16

A Generalized Formula for Converting Chi-Square Tests to Effect Sizes for Meta-Analysis

Description

The common formula used for converting a chi-square test into a correlation coefficient for use as an effect size in meta-analysis has a hidden assumption which may be violated in specific instances, leading to an overestimation of the effect size. A corrected formula is provided.
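For context, the conversion usually given in meta-analysis texts is r = sqrt(chi-square / N) for a one-degree-of-freedom test. The sketch below shows that common formula on a 2x2 contingency table, where it matches the magnitude of the phi coefficient. The corrected formula from the article is not stated in this abstract and is therefore not reproduced; the function names are illustrative only.

```python
from math import sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column must have a non-zero total")
    return n * (a * d - b * c) ** 2 / margins


def r_from_chi_square(chi2: float, n: int) -> float:
    """Common conversion r = sqrt(chi2 / N); assumes a 1-df chi-square test."""
    return sqrt(chi2 / n)


def phi_2x2(a: int, b: int, c: int, d: int) -> float:
    """Signed phi coefficient of the same 2x2 table, for comparison."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
```

Note that the common conversion discards the sign of the association, so the direction of the effect has to be recovered from the table itself.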

Date Created
2010-04-07

Evolution of multigene families and single copy genes in Plasmodium spp

Description
The complex life cycle and widespread range of infection of Plasmodium parasites, the causal agents of malaria in humans, make them the perfect organisms for the study of various evolutionary mechanisms. In particular, multigene families are considered one of the main sources for genome adaptability and innovation. Within Plasmodium, numerous species- and clade-specific multigene families have major functions in the development and maintenance of infection. Nonetheless, while the evolutionary mechanisms predominant in many species- and clade-specific multigene families have been previously studied, there are far fewer studies dedicated to analyzing genus common multigene families (GCMFs). I studied the patterns of natural selection and recombination in 90 GCMFs with diverse numbers of gene gain/loss events. I found that the majority of GCMFs are formed by duplication events that predate speciation of mammal Plasmodium species, with many paralogs being neutrally maintained thereafter. In general, multigene families involved in immune evasion and host cell invasion commonly showed signs of positive selection and species-specific gain/loss events, particularly in Plasmodium species in the simian and rodent clades. A particular multigene family, the merozoite surface protein-7 (msp7) family, is found in all Plasmodium species and has functions related to erythrocyte invasion. Within Plasmodium vivax, differences in the number of paralogs in this multigene family have been previously explained, at least in part, as potential adaptations to the human host. To investigate this, I studied msp7 orthologs in closely related non-human primate parasites where homology was evident. I also estimated the paralogs’ evolutionary history and genetic polymorphism. The emerging patterns were compared with those of Plasmodium falciparum. I found that the evolution of the msp7 multigene family is consistent with a Birth-and-Death model where duplications, pseudogenization, and gene loss events are common. In order to study additional aspects of the evolution of Plasmodium, I evaluated the trends of long-term and short-term evolution and the putative effects of the vertebrate host’s immune pressure on gametocytes across various Plasmodium species. Gametocytes represent the only sexual stage within the Plasmodium life cycle, and are also the transition stages from the vertebrate to the mosquito vector. I found that, while male and female gametocytes showed different levels of immunogenicity, signs of positive selection were not entirely related to the location and presence of immune epitope regions. Overall, these studies further highlight the complex evolutionary patterns observed in Plasmodium.
Date Created
2016

Contextual Cross-Referencing of Species Names for Fiddler Crabs (Genus Uca): An Experiment in Cyber-Taxonomy

Description

Cyber-taxonomy of name usage has focused primarily on producing authoritative lists of names or cross-linking names and data across disparate databases. A feature missing from much of this work is the recording and analysis of the context in which a name was used, context which can be critical for understanding not only what name an author used, but to which currently recognized species they actually refer. An experiment on recording contextual information associated with name usage was conducted for the fiddler crabs (genus Uca). Data from approximately one quarter of all publications that mention fiddler crabs, including 95% of those published prior to 1924 and 67% of those published prior to 1976, have currently been recorded in a database. Approaches and difficulties in recording and analyzing the context of name use are discussed. These results are not meant to be a full solution, but rather to highlight problems which have not been previously investigated and to act as a springboard for broader approaches and discussion. Some data on the accessibility of the literature, including in particular electronic forms of publication, are also presented. The resulting data have been integrated for general browsing into the website http://www.fiddlercrab.info; the raw data and code used to construct the website are available at https://github.com/msrosenberg/fiddlercrab.info.

Date Created
2014-07-08

Plasmodium population structure in the context of malaria control and elimination

Description
Malaria is a vector-borne parasitic disease affecting tropical and subtropical regions. Despite control efforts, malaria incidence is still incredibly high, with 219 million clinical cases and an estimated 660,000 related deaths (WHO, 2012). In this project, different population genetic approaches were explored to characterize parasite populations. The goal was to create a framework that considered temporal and spatial changes of Plasmodium populations in malaria surveillance. This is critical for a vector-borne disease in areas of low transmission, where there is no accurate information on when and where a patient was infected. In this study, fragment analysis data and single nucleotide polymorphisms (SNPs) from South American samples were used to characterize Plasmodium population structure and patterns of migration and gene flow, and to discuss approaches to differentiate reinfection vs. recrudescence cases in clinical trials. A Bayesian approach was also applied to analyze the Plasmodium population history by inferring genealogies using microsatellite data. Specifically, fluctuations in the parasite population and the age of different parasite lineages were evaluated through time in order to relate them to the malaria control plan in force. These studies are important for understanding the turnover or persistence of "clones" circulating in a specific area through time and for considering them in drug efficacy studies. Moreover, this methodology is useful for assessing changes in malaria transmission and for managing resources more efficiently to deploy control measures in locations that act as parasite "sources" for other regions. Overall, these results stress the importance of monitoring malaria demographic changes when assessing the success of elimination programs in areas of low transmission.
Date Created
2014