Modeling the Relationship Between Migration, Selection, and Drift in Populations of Blind Cave Fish

Description
The evolution of blindness in cave animals has been heavily studied; however, little research has been done on the interaction of migration and drift on the development of blindness in these populations. In this study, a model is used to compare the effect that genetic drift has on the fixation of a blindness allele for varying amounts of migration and selection. For populations where the initial frequency is quite low, genetic drift plays a much larger role in the fixation of blindness than in populations where the initial frequency is high. In populations where the initial frequency is high, genetic drift plays almost no role in fixation. Our results suggest that migration plays a greater role in the fate of the blindness allele than selection.
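
The abstract does not spell out the model, so the following is only a minimal sketch of how drift, migration, and selection can be combined in one simulation. It assumes a haploid Wright-Fisher population; the function names and all parameters (`s` for the selection coefficient, `m` for the migration rate, `q_surface` for the surface frequency of the blindness allele, `n` for the number of gene copies) are hypothetical and not taken from the study.

```python
import random


def next_frequency(q, s, m, q_surface, n, rng):
    """Advance the blindness-allele frequency q by one generation."""
    # Selection: the blindness allele has relative fitness 1 + s (haploid assumption).
    q_sel = q * (1.0 + s) / (1.0 + s * q)
    # Migration: a fraction m of the cave gene pool comes from the surface.
    q_mig = (1.0 - m) * q_sel + m * q_surface
    # Drift: binomial sampling of n gene copies for the next generation.
    copies = sum(1 for _ in range(n) if rng.random() < q_mig)
    return copies / n


def fixation_fraction(q0, s, m, q_surface=0.0, n=100,
                      generations=500, replicates=200, seed=1):
    """Fraction of replicate caves whose final blindness-allele frequency is 1."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        q = q0
        for _ in range(generations):
            q = next_frequency(q, s, m, q_surface, n, rng)
        if q == 1.0:
            fixed += 1
    return fixed / replicates
```

Calling `fixation_fraction` with a low and a high starting frequency `q0`, while varying `m` and `s`, gives a rough way to explore the kind of comparison the abstract describes.
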
Date Created
2014-05

Genie: A Population Genetics Simulation Built with JavaScript

Description
The modern web presents an opportunity for educators and researchers to create tools that are highly accessible. Because of the near-ubiquity of modern web browsers, developers who hope to create educational and analytical tools can reach a large audience by creating web applications. Using JavaScript, HTML, and other modern web development technologies, Genie was developed as a simulator to help educators in biology, genetics, and evolution classrooms teach their students about population genetics. Because Genie was designed for the modern web, it is highly accessible to both educators and students, who can access the web application using any modern web browser on virtually any device. Genie demonstrates the efficacy of web development technologies for demonstrating and simulating complex processes, and it will be a unique educational tool for educators who teach population genetics.
Date Created
2015-05

Low Base-Substitution Mutation Rate in the Germline Genome of the Ciliate 'Tetrahymena Thermophila'

Description
Mutation is the ultimate source of all genetic variation and is, therefore, central to evolutionary change. Previous work on Paramecium tetraurelia found an unusually low germline base-substitution mutation rate in this ciliate. Here, we tested the generality of this result among ciliates using Tetrahymena thermophila. We sequenced the genomes of 10 lines of T. thermophila that had each undergone approximately 1,000 generations of mutation accumulation (MA). We applied an existing mutation-calling pipeline and developed a new probabilistic mutation detection approach that directly models the design of an MA experiment and accommodates the noise introduced by mismapped reads. Our probabilistic mutation-calling method provides a straightforward way of estimating the number of sites at which a mutation could have been called if one were present, providing the denominator for our mutation rate calculations. From these methods, we find that T. thermophila has a germline base-substitution mutation rate of 7.61 × 10⁻¹² per site per cell division, which is consistent with the low base-substitution mutation rate in P. tetraurelia. Over the course of the evolution experiment, genomic exclusion lines derived from the MA lines experienced a fitness decline that cannot be accounted for by germline base-substitution mutations alone, suggesting that other genetic or epigenetic factors must be involved. Because selection can only operate to reduce mutation rates based upon the "visible" mutational load, asexual reproduction with a transcriptionally silent germline may allow ciliates to evolve extremely low germline mutation rates.

Date Created
2016-09-15

Inferring Rates and Length-Distributions of Indels Using Approximate Bayesian Computation

Description
The most common evolutionary events at the molecular level are single-base substitutions, as well as insertions and deletions (indels) of short DNA segments. A large body of research has been devoted to developing probabilistic substitution models and to inferring their parameters using likelihood and Bayesian approaches. In contrast, relatively little has been done to model indel dynamics, probably due to the difficulty in writing explicit likelihood functions. Here, we contribute to the effort of modeling indel dynamics by presenting SpartaABC, an approximate Bayesian computation (ABC) approach to infer indel parameters from sequence data (either aligned or unaligned). SpartaABC circumvents the need to use an explicit likelihood function by extracting summary statistics from simulated sequences. First, summary statistics are extracted from the input sequence data. Second, SpartaABC samples indel parameters from a prior distribution and uses them to simulate sequences. Third, it computes summary statistics from the simulated sets of sequences. By computing a distance between the summary statistics extracted from the input and each simulation, SpartaABC can provide an approximation to the posterior distribution of indel parameters as well as point estimates. We study the performance of our methodology and show that it provides accurate estimates of indel parameters in simulations. We next demonstrate the utility of SpartaABC by studying the impact of alignment errors on the inference of positive selection. A C++ program implementing SpartaABC is freely available at http://spartaabc.tau.ac.il.
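
The three steps described above follow the general pattern of ABC rejection sampling, which can be sketched generically as below. This is not SpartaABC's code: the simulator, the single summary statistic, the uniform prior, and every function name here are toy stand-ins assumed purely for illustration.

```python
import random


def abc_rejection(observed_stats, sample_prior, simulate, summarize,
                  distance, n_sims, keep_fraction, rng):
    """Return the prior draws whose simulated statistics lie closest to the data."""
    draws = []
    for _ in range(n_sims):
        theta = sample_prior(rng)                 # sample a parameter from the prior
        stats = summarize(simulate(theta, rng))   # simulate data, then summarize it
        draws.append((distance(stats, observed_stats), theta))
    draws.sort(key=lambda pair: pair[0])          # smallest distance first
    n_keep = max(1, int(n_sims * keep_fraction))
    return [theta for _, theta in draws[:n_keep]]


def toy_simulate(rate, rng, n_sites=200):
    # Hypothetical stand-in for a sequence simulator:
    # each site independently carries an indel with probability `rate`.
    return [rng.random() < rate for _ in range(n_sites)]


def toy_summarize(sites):
    # A single toy summary statistic: the fraction of sites with an indel.
    return (sum(sites) / len(sites),)


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


rng = random.Random(0)
posterior = abc_rejection(
    observed_stats=(0.1,),
    sample_prior=lambda r: r.uniform(0.0, 0.5),
    simulate=toy_simulate,
    summarize=toy_summarize,
    distance=euclidean,
    n_sims=2000,
    keep_fraction=0.05,
    rng=rng,
)
point_estimate = sum(posterior) / len(posterior)
```

The retained draws in `posterior` approximate the posterior distribution, and their mean serves as a simple point estimate.
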

Date Created
2017-05-01

The Effect of the Dispersal Kernel on Isolation-By-Distance in a Continuous Population

Description
Under models of isolation-by-distance, population structure is determined by the probability of identity-by-descent between pairs of genes according to the geographic distance between them. Well-established analytical results indicate that the relationship between geographical and genetic distance depends mostly on the neighborhood size of the population, which represents a standardized measure of gene flow. To test this prediction, we model local dispersal of haploid individuals on a two-dimensional landscape using seven dispersal kernels: Rayleigh, exponential, half-normal, triangular, gamma, Lomax and Pareto. When neighborhood size is held constant, the distributions produce similar patterns of isolation-by-distance, confirming predictions. Considering this, we propose that the triangular distribution is the appropriate null distribution for isolation-by-distance studies. Under the triangular distribution, dispersal is uniform over the neighborhood area, which suggests that the common description of neighborhood size as a measure of an effective, local panmictic population is valid for popular families of dispersal distributions. We further show how to draw random variables from the triangular distribution efficiently and argue that it should be utilized in other studies in which computational efficiency is important.
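
Because dispersal under the triangular kernel is described as uniform over the neighborhood area, one standard way to draw such an offset is to sample a point uniformly from a disc. The sketch below uses inverse-transform sampling of the distance; it is an assumption-laden illustration, not necessarily the sampling scheme the authors propose, and `disperse_uniform_disc` and `radius` are hypothetical names.

```python
import math
import random


def disperse_uniform_disc(x, y, radius, rng):
    """Return a new position drawn uniformly from the disc around (x, y)."""
    # Distance density is proportional to r on [0, radius] (a triangular shape),
    # so the inverse CDF gives r = radius * sqrt(u) for uniform u.
    r = radius * math.sqrt(rng.random())
    # Direction is uniform on the circle.
    angle = 2.0 * math.pi * rng.random()
    return x + r * math.cos(angle), y + r * math.sin(angle)
```

Each draw needs only two uniform random numbers, one square root, and one sine/cosine pair.
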

Date Created
2016-03-29

Equations of the End: Teaching Mathematical Modeling Using the Zombie Apocalypse

Description
Mathematical models of infectious diseases are a valuable tool in understanding the mechanisms and patterns of disease transmission. Modeling is, however, a difficult subject to teach, requiring both mathematical expertise and extensive subject-matter knowledge of a variety of disease systems. In this article, we explore several uses of zombie epidemics to make mathematical modeling and infectious disease epidemiology more accessible to public health professionals, students, and the general public. We further introduce a web-based simulation, White Zed (http://cartwrig.ht/apps/whitezed/), that can be deployed in classrooms to allow students to explore models before implementing them. In our experience, zombie epidemics are familiar, approachable, flexible, and an ideal way to introduce basic concepts of infectious disease epidemiology.

Date Created
2016-03

Inferring Indel Parameters Using a Simulation-Based Approach

Description
In this study, we present a novel methodology to infer indel parameters from multiple sequence alignments (MSAs) based on simulations. Our algorithm searches for the set of evolutionary parameters describing indel dynamics which best fits a given input MSA. In each step of the search, we use parametric bootstraps and the Mahalanobis distance to estimate how well a proposed set of parameters fits input data. Using simulations, we demonstrate that our methodology can accurately infer the indel parameters for a large variety of plausible settings. Moreover, using our methodology, we show that indel parameters vary substantially between three genomic data sets: mammals, bacteria, and retroviruses. Finally, we demonstrate how our methodology can be used to simulate MSAs based on indel parameters inferred from real data sets.
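
As a rough illustration of the fit measure mentioned above, the sketch below computes the squared Mahalanobis distance between an observed vector of summary statistics and a set of simulated (bootstrap) vectors. Which summary statistics the authors use is not stated here, so the inputs are left abstract; `mahalanobis_sq` is a hypothetical helper written with the standard library only.

```python
def mahalanobis_sq(observed, simulated):
    """Squared Mahalanobis distance of `observed` from the cloud of `simulated` vectors."""
    n, k = len(simulated), len(observed)
    mean = [sum(row[j] for row in simulated) / n for j in range(k)]
    # Sample covariance matrix of the simulated summary statistics.
    cov = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in simulated) / (n - 1)
            for b in range(k)] for a in range(k)]
    diff = [observed[j] - mean[j] for j in range(k)]
    # Solve cov * x = diff by Gaussian elimination with partial pivoting.
    aug = [cov[i][:] + [diff[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (aug[i][k] - sum(aug[i][c] * x[c] for c in range(i + 1, k))) / aug[i][i]
    # diff^T * cov^{-1} * diff
    return sum(d * xi for d, xi in zip(diff, x))
```

A smaller value means the observed statistics sit closer to the center of the simulated cloud, measured in units that account for the spread and correlation of the simulated statistics.
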

Date Created
2015-11-03

Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium Vivax From Colombia

Description
Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first-line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information, we developed 18 polymorphic and easy-to-score microsatellite loci that can be used in epidemiological investigations in South America.

Date Created
2015-12-28

The Importance of Selection in the Evolution of Blindness in Cavefish

Description
Background: Blindness has evolved repeatedly in cave-dwelling organisms, and many hypotheses have been proposed to explain this observation, including both accumulation of neutral loss-of-function mutations and adaptation to darkness. Investigating the loss of sight in cave dwellers presents an opportunity to understand the operation of fundamental evolutionary processes, including drift, selection, mutation, and migration.

Results: Here we model the evolution of blindness in caves. This model captures the interaction of three forces: (1) selection favoring alleles causing blindness, (2) immigration of sightedness alleles from a surface population, and (3) mutations creating blindness alleles. We investigated the dynamics of this model and determined selection-strength thresholds that result in blindness evolving in caves despite immigration of sightedness alleles from the surface. We estimate that the selection coefficient for blindness would need to be at least 0.005 (and maybe as high as 0.5) for blindness to evolve in the model cave-organism, Astyanax mexicanus.

Conclusions: Our results indicate that strong selection is required for the evolution of blindness in cave-dwelling organisms, which is consistent with recent work suggesting a high metabolic cost of eye development.
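
The interplay of the three forces listed in the Results can be illustrated with a simple deterministic recursion. This is only a sketch under stated assumptions: it treats the population as haploid and applies mutation, selection, and immigration in that order each generation, which may differ from the published model. The names `s` (selection coefficient), `m` (immigration rate), `u` (mutation rate), and `q_surface` are hypothetical.

```python
def blindness_frequency(q0, s, m, u, generations, q_surface=0.0):
    """Iterate the blindness-allele frequency q under mutation, selection, and immigration."""
    q = q0
    for _ in range(generations):
        q = q + u * (1.0 - q)                # mutation creates blindness alleles
        q = q * (1.0 + s) / (1.0 + s * q)    # selection favors blindness alleles
        q = (1.0 - m) * q + m * q_surface    # immigrants carry the surface frequency
    return q
```

Scanning `s` for a fixed `m` and `u` shows where the long-run frequency switches from staying near zero to approaching one, which is the kind of selection-strength threshold the abstract refers to.
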

Date Created
2017-02-07

A composite genome approach to identify phylogenetically informative data from next-generation sequencing

Description
Background
Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results
In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions
SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.
Date Created
2015-06-11