Revealing microbial responses to environmental dynamics: developing methods for analysis and visualization of complex sequence datasets

Description
The greatest barrier to understanding how life interacts with its environment is the complexity with which biology operates. In this work, I present experimental designs, analysis methods, and visualization techniques to overcome the challenges of deciphering complex biological datasets. First, I examine an iron limitation transcriptome of Synechocystis sp. PCC 6803 using a new methodology. Until now, iron limitation in gene expression experiments on Synechocystis sp. PCC 6803 has been achieved through media chelation. Notably, chelation also reduces the bioavailability of other metals, whereas naturally occurring low-iron settings likely result from a lack of iron influx rather than from chelation. The overall metabolic trends of previous studies are well characterized, but within those trends there is significant variability in single-gene expression responses. I compare previous transcriptomics analyses with our protocol, which limits the addition of bioavailable iron to growth media, to identify consistent gene expression signals resulting from iron limitation. Second, I describe a novel method of improving the reliability of centroid-linkage clustering results. The size and complexity of modern sequencing datasets often prohibit constructing distance matrices, which prevents the use of many common clustering algorithms. Centroid-linkage circumvents the need for a distance matrix, but has the adverse effect of producing input-order-dependent results. In this chapter, I describe a method of counting cluster edges across iterated centroid-linkage results and reconstructing aggregate clusters from a ranked edge list, without a distance matrix or input-order dependence. Finally, I introduce dendritic heat maps, a new figure type that visualizes heat map responses through expanding and contracting sequence clustering specificities. Heat maps are useful for comparing data across a range of possible states.
However, data binning is sensitive to clustering cutoffs which are often arbitrarily introduced by researchers and can substantially change the heat map response of any single data point. With an understanding of how the architectural elements of dendrograms and heat maps affect data visualization, I have integrated their salient features to create a figure type aimed at viewing multiple levels of clustering cutoffs, allowing researchers to better understand the effects of environment on metabolism or phylogenetic lineages.
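
The edge-counting idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the 1-D points, the `radius` cutoff, the `min_support` threshold, the use of co-clustered pairs as "edges", and all function names are assumptions made for illustration. A greedy centroid-linkage pass is repeated over shuffled input orders, pair co-occurrences are counted, and aggregate clusters are rebuilt from the ranked edge list with a union-find structure.

```python
import random
from collections import Counter
from itertools import combinations


def centroid_clusters(points, order, radius):
    """Greedy centroid-linkage over 1-D points, visited in the given order.

    Each point joins the first cluster whose centroid lies within `radius`,
    which is what makes a single pass depend on input order.
    """
    clusters = []  # each cluster is a list of point indices
    for i in order:
        for members in clusters:
            centroid = sum(points[j] for j in members) / len(members)
            if abs(points[i] - centroid) <= radius:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def consensus_clusters(points, radius, runs=50, min_support=0.5, seed=0):
    """Count co-clustered pairs over shuffled runs, then rebuild clusters."""
    rng = random.Random(seed)
    edge_counts = Counter()
    order = list(range(len(points)))
    for _ in range(runs):
        rng.shuffle(order)
        for members in centroid_clusters(points, order, radius):
            for a, b in combinations(sorted(members), 2):
                edge_counts[(a, b)] += 1

    # Union-find over the ranked edge list, strongest edges first.
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in edge_counts.most_common():
        if count / runs < min_support:
            break  # remaining edges are weaker still
        parent[find(a)] = find(b)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Enumerating every pair inside a cluster is quadratic in cluster size, so this toy version is only suitable for small inputs; a production method would need a more economical edge definition.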
Date Created
2017
Agent

Deep Freeze: Could Life Exist on Europa?

Description
Finding life beyond Earth could change our understanding of life and habitability. The best place to look for life beyond Earth is Jupiter's moon Europa. It has been estimated that Europa may have a liquid, salt-water subsurface with 2 to 3 times the volume of all Earth's oceans. Because all known life requires water, it is in our best interest to explore Europa. This thesis explored the plausibility of life on Europa in four of its environments: on the surface, under the ice shell, in the liquid subsurface, and at the bottom of the liquid subsurface. Each of these environments was defined from the scientific literature and compared to known Earth analogs. Europa's surface is not likely to support life, as no liquid water is present. The surface also experiences extremely high radiation bombardment and extremely low temperatures, both estimated to be well outside the range that supports life. It is more plausible that life could exist under Europa's ice shell than on the surface. Under the surface, radiation exposure decreases dramatically. Researchers have also found organisms on Earth that can live in environments similar to Europa's ice, although these organisms require some interaction with liquid water. Uncertainties about the thickness of Europa's ice shell and the radiation load it experiences with depth, as well as limited research on organisms in ice environments, prevent a definitive assessment of the plausibility of life under the surface. The best environment in which to look for life on Europa is the subsurface. Many uncertainties about the subsurface remain, however, that make it difficult to assess the plausibility of finding life. These uncertainties include its depth, water activity, salinity, temperature, pressure, and structure. The subsurface may be suitable for life, but until we better understand its environment, we cannot draw definite conclusions.
As for assessing the plausibility of life at the bottom of Europa's subsurface, there is not much we know about this environment either. It has been suggested there may be hydrothermal vents, but no evidence has either supported or rejected this idea. Without a clear understanding of the environment at the bottom of the subsurface, the plausibility of life here cannot be definitively answered. It is apparent we need to further study Europa. In particular, we need to focus on understanding the subsurface. When the subsurface is better defined, we can better assess the plausibility of life being present. Fortunately, both NASA and the ESA are currently planning missions to Europa that are scheduled to launch in the 2020s.
Date Created
2017-05
Agent

Analysis of Variables in Relation to Dissolved Organic Carbon in Yellowstone National Park

Description
Yellowstone National Park has a vibrant variety of flora, fauna, and hydrothermal systems, all collected together in one large and complex system. Studies have been conducted for at least several decades to make sense of this system in ways that may be relevant to similar geologic settings around the world. The latest update in this ongoing study involves the collection and analysis of water samples from 2016. These samples have been analyzed for conductivity, pH, temperature, dissolved organic carbon, dissolved inorganic carbon, carbon isotopes, dissolved oxygen, ferrous iron, sulfide, silica, and more. While few trends were found in these data with regard to dissolved organic carbon values, this is a substantial addition to a growing body of information that could yield more insight in the future. In addition, factors that have yet to be analyzed for the 2016 data, such as concentrations of metals and metalloids, may provide some insights when put through a chloride vs. sulfate framework to separate out different reaction regions.
Date Created
2016-12
Agent

Stable Isotope Labeling Confirms Mixotrophic Nature of Streamer Biofilm Communities at Alkaline Hot Springs

Description

Streamer biofilm communities (SBC) are often observed within chemosynthetic zones of Yellowstone hot spring outflow channels, where temperatures exceed those conducive to photosynthesis. Nearest the hydrothermal source (75–88°C), SBC comprise thermophilic Archaea and Bacteria, often mixed communities including Desulfurococcales and uncultured Crenarchaeota, as well as Aquificae and Thermus, each carrying diagnostic membrane lipid biomarkers. We tested the hypothesis that SBC can alternate their metabolism between autotrophy and heterotrophy depending on substrate availability. Feeding experiments were performed at two alkaline hot springs in Yellowstone National Park, Octopus Spring and “Bison Pool,” using various 13C-labeled substrates (bicarbonate, formate, acetate, and glucose) to determine the relative uptake of these different carbon sources. The highest 13C uptake, at both sites, was from acetate into almost all bacterial fatty acids, particularly into methyl-branched C15, C17 and C19 fatty acids that are diagnostic for Thermus/Meiothermus and some Firmicutes, as well as into the universally common C16:0 and C18:0 fatty acids. 13C-glucose showed a similar pattern, but with 10–30 times lower uptake across most fatty acids. 13C-bicarbonate uptake, signifying the presence of autotrophic communities, was only significant at “Bison Pool” and was observed predominantly in non-specific saturated C16, C18, C20, and C22 fatty acids. Incorporation of 13C-formate occurred only at very low rates at “Bison Pool” and was almost undetectable at Octopus Spring, suggesting that formate is not an important carbon source for SBC. 13C uptake into archaeal lipids occurred predominantly with 13C-acetate, also suggesting that archaeal communities at both springs have primarily heterotrophic carbon assimilation pathways.
We hypothesize that these communities are energy-limited and predominantly nurtured by input of exogenous organic material, with only a small fraction being sustained by autotrophic growth.

Date Created
2015-02-05
Agent

High pH Microbial Ecosystems in a Newly Discovered, Ephemeral, Serpentinizing Fluid Seep at Yanartaş (Chimera), Turkey

Description

Gas seeps emanating from Yanartaş (Chimera), Turkey, have been documented for thousands of years. Active serpentinization produces hydrogen and a range of carbon gases that may provide fuel for life. Here we report a newly discovered, ephemeral fluid seep emanating from a small gas vent at Yanartaş. Fluids and biofilms were sampled at the source and at points downstream. We describe site conditions and provide microbiological data in the form of enrichment cultures, scanning electron microscopy (SEM), carbon and nitrogen isotopic composition of solids, and PCR screens of nitrogen cycle genes. Source fluids are pH 11.95, with a Ca:Mg of ~200, and sediments under the ignited gas seep measure 60°C. Collectively, these data suggest the fluid is the product of active serpentinization at depth. Source sediments are primarily calcite and alteration products (chlorite and montmorillonite). Downstream, biofilms are mixed with montmorillonite. SEM shows biofilms distributed homogeneously with carbonates. Organic carbon accounts for 60% of the total carbon at the source, decreasing downstream to <15% as inorganic carbon precipitates. δ13C ratios of the organic carbon fraction of solids are depleted (−25 to −28‰) relative to the carbonates (−11 to −20‰). We conclude that heterotrophic processes are dominant throughout the surface ecosystem, and carbon fixation may be key down channel. δ15N ratios of ~3‰ and the absence of nifH in extracted DNA suggest that nitrogen fixation is not occurring in sediments. However, the presence of narG and nirS at most locations and in enrichments indicates genomic potential for nitrate and nitrite reduction. This small seep with shallow run-off is likely ephemeral, but abundant preserved microterracettes in the outflow and the surrounding area suggest it has been present for some time. This site and others like it present an opportunity for investigations of preserved deep biosphere signatures and subsurface-surface interactions.

Date Created
2015-01-19
Agent

A Metastable Equilibrium Model for the Relative Abundances of Microbial Phyla in a Hot Spring

Description

Many studies link the compositions of microbial communities to their environments, but the energetics of organism-specific biomass synthesis as a function of geochemical variables have rarely been assessed. We describe a thermodynamic model that integrates geochemical and metagenomic data for biofilms sampled at five sites along a thermal and chemical gradient in the outflow channel of the hot spring known as “Bison Pool” in Yellowstone National Park. The relative abundances of major phyla in individual communities sampled along the outflow channel are modeled by computing metastable equilibrium among model proteins with amino acid compositions derived from metagenomic sequences. Geochemical conditions are represented by temperature and activities of basis species, including pH and oxidation-reduction potential quantified as the activity of dissolved hydrogen. By adjusting the activity of hydrogen, the model can be tuned to closely approximate the relative abundances of the phyla observed in the community profiles generated from BLAST assignments. The findings reveal an inverse relationship between the energy demand to form the proteins at equal thermodynamic activities and the abundance of phyla in the community. The distance from metastable equilibrium of the communities, assessed using an equation derived from energetic considerations that is also consistent with the information-theoretic entropy change, decreases along the outflow channel. Specific divergences from metastable equilibrium, such as an underprediction of the relative abundances of phototrophic organisms at lower temperatures, can be explained by considering additional sources of energy and/or differences in growth efficiency. 
Although the metabolisms used by many members of these communities are driven by chemical disequilibria, the results support the possibility that higher-level patterns of chemotrophic microbial ecosystems are shaped by metastable equilibrium states that depend on both the composition of biomass and the environmental conditions.
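
As an illustration of the information-theoretic side of the comparison described above, the following sketch computes the relative entropy (Kullback-Leibler divergence) between observed and model-predicted relative abundances. This is the generic textbook quantity, not the equation used in the paper; the function name and the dictionary-based inputs are assumptions made for illustration.

```python
import math


def relative_entropy(observed, predicted):
    """Kullback-Leibler divergence D(observed || predicted), in nats.

    Both arguments map taxon name -> abundance (any positive scale); each is
    normalized to relative abundances first. Every taxon with a nonzero
    observed abundance must have a nonzero predicted abundance.
    """
    total_obs = sum(observed.values())
    total_pred = sum(predicted.values())
    divergence = 0.0
    for taxon, count in observed.items():
        p = count / total_obs
        if p == 0.0:
            continue  # a zero observed abundance contributes nothing
        q = predicted[taxon] / total_pred
        divergence += p * math.log(p / q)
    return divergence
```

The value is zero when the two abundance profiles coincide and grows as they diverge, so smaller values along a transect would correspond to communities closer to the modeled state.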

Date Created
2013-09-02
Agent

Korarchaeota Diversity, Biogeography, and Abundance in Yellowstone and Great Basin Hot Springs and Ecological Niche Modeling Based on Machine Learning

Description

Over 100 hot spring sediment samples were collected from 28 sites in 12 areas/regions, while recording as many coincident geochemical properties as feasible (>60 analytes). PCR was used to screen samples for Korarchaeota 16S rRNA genes. Over 500 Korarchaeota 16S rRNA genes were screened by RFLP analysis and 90 were sequenced, resulting in identification of novel Korarchaeota phylotypes and exclusive geographical variants. Korarchaeota diversity was low, as in other terrestrial geothermal systems, suggesting a marine origin for Korarchaeota with subsequent niche-invasion into terrestrial systems. Korarchaeota endemism is consistent with endemism of other terrestrial thermophiles and supports the existence of dispersal barriers. Korarchaeota were found predominantly in >55°C springs at pH 4.7–8.5 at concentrations up to 6.6×10⁶ 16S rRNA gene copies g⁻¹ wet sediment. In Yellowstone National Park (YNP), Korarchaeota were most abundant in springs with a pH range of 5.7 to 7.0. High sulfate concentrations suggest these fluids are influenced by contributions from hydrothermal vapors that may be neutralized to some extent by mixing with water from deep geothermal sources or meteoric water. In the Great Basin (GB), Korarchaeota were most abundant at spring sources of pH<7.2 with high particulate C content and high alkalinity, which are likely to be buffered by the carbonic acid system. It is therefore likely that at least two different geological mechanisms in YNP and GB springs create the neutral to mildly acidic pH that is optimal for Korarchaeota. A classification support vector machine (C-SVM) trained on single analytes, two-analyte combinations, or vectors from non-metric multidimensional scaling models was able to predict springs as Korarchaeota-optimal or sub-optimal habitats with accuracies up to 95%.
To our knowledge, this is the most extensive analysis of the geochemical habitat of any high-level microbial taxon and the first application of a C-SVM to microbial ecology.

Date Created
2012-05-04
Agent

Coordinating Environmental Genomics and Geochemistry Reveals Metabolic Transitions in a Hot Spring Ecosystem

Description

We have constructed a conceptual model of biogeochemical cycles and metabolic and microbial community shifts within a hot spring ecosystem via coordinated analysis of the “Bison Pool” (BP) Environmental Genome and a complementary contextual geochemical dataset of ∼75 geochemical parameters. 2,321 16S rRNA clones and 470 megabases of environmental sequence data were produced from biofilms at five sites along the outflow of BP, an alkaline hot spring in Sentinel Meadow (Lower Geyser Basin) of Yellowstone National Park. This channel acts as a >22 m gradient of decreasing temperature, increasing dissolved oxygen, and changing availability of biologically important chemical species, such as those containing nitrogen and sulfur. Microbial life at BP transitions from a 92°C chemotrophic streamer biofilm community in the BP source pool to a 56°C phototrophic mat community. We improved automated annotation of the BP environmental genomes using BLAST-based Markov clustering. We have also assigned environmental genome sequences to individual microbial community members by complementing traditional homology-based assignment with nucleotide word-usage algorithms, allowing more than 70% of all reads to be assigned to source organisms. This assignment yields high genome coverage in dominant community members, facilitating reconstruction of nearly complete metabolic profiles and in-depth analysis of the relation between geochemical and metabolic changes along the outflow. We show that changes in environmental conditions and energy availability are associated with dramatic shifts in microbial communities and metabolic function. We have also identified an organism constituting a novel phylum in a metabolic “transition” community, located physically between the chemotroph- and phototroph-dominated sites. 
The complementary analysis of biogeochemical and environmental genomic data from BP has allowed us to build ecosystem-based conceptual models for this hot spring, reconstructing whole metabolic networks in order to illuminate community roles in shaping and responding to geochemical variability.
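
The nucleotide word-usage assignment mentioned above can be sketched in general terms as follows. This is a toy illustration rather than the algorithm used in the study: the word length `k`, the Manhattan distance between frequency profiles, and the names `kmer_profile`, `profile_distance`, and `assign_read` are all assumptions.

```python
from collections import Counter


def kmer_profile(seq, k=4):
    """Relative frequencies of length-k nucleotide words in a sequence.

    Words containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(w for w in words if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def profile_distance(p, q):
    """Manhattan distance between two word-frequency profiles."""
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))


def assign_read(read, references, k=4):
    """Return the name of the reference whose word usage is closest."""
    read_profile = kmer_profile(read, k)
    return min(
        references,
        key=lambda name: profile_distance(
            read_profile, kmer_profile(references[name], k)
        ),
    )
```

For example, `assign_read("ATATATAT", {"at_rich": "ATATATATATATAT", "gc_rich": "GCGCGCGCGCGCGC"}, k=2)` selects `"at_rich"`, because the read shares no two-letter words with the GC-rich reference. Composition-based assignment of this kind complements homology searches because it does not require a close match in a sequence database.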

Date Created
2012-06-04
Agent

Calculation of the Relative Chemical Stabilities of Proteins as a Function of Temperature and Redox Chemistry in a Hot Spring

Description

Uncovering the chemical and physical links between natural environments and microbial communities is becoming increasingly tractable owing to geochemical observations and metagenomic sequencing. At the hot spring known as Bison Pool in Yellowstone National Park, the cooling of the water in the outflow channel is associated with an increase in oxidation potential estimated from multiple field-based measurements. Representative groups of proteins whose sequences were derived from metagenomic data also exhibit an increase in average oxidation state of carbon in the protein molecules with distance from the hot-spring source. The energetic requirements of reactions to form selected proteins used in the model were computed using amino-acid group additivity for the standard molal thermodynamic properties of the proteins, and the relative chemical stabilities of the proteins were investigated by varying temperature, pH and oxidation state, expressed as activity of dissolved hydrogen. The relative stabilities of the proteins were found to track the locations of the sampling sites when the calculations included a function for hydrogen activity that increases with temperature and is higher, or more reducing, than values consistent with measurements of dissolved oxygen, sulfide and oxidation-reduction potential in the field. These findings imply that spatial patterns in the amino acid compositions of proteins can be linked, through energetics of overall chemical reactions representing the formation of the proteins, to the environmental conditions at this hot spring, even if microbial cells maintain considerably different internal conditions. Further applications of the thermodynamic calculations are possible for other natural microbial ecosystems.
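
The average oxidation state of carbon referred to above can be computed from an elemental formula. The sketch below assumes the usual oxidation-state conventions (H = +1, N = -3, O = -2, S = -2) for a molecule containing only C, H, N, O and S; the function name and signature are illustrative and are not taken from the paper.

```python
def carbon_oxidation_state(c, h, n=0, o=0, s=0, charge=0):
    """Average oxidation state of carbon for the formula C_c H_h N_n O_o S_s.

    Assumes H = +1, N = -3, O = -2 and S = -2, so the carbon atoms must
    balance whatever charge the other elements leave over.
    """
    if c <= 0:
        raise ValueError("formula must contain at least one carbon atom")
    return (charge - h + 3 * n + 2 * o + 2 * s) / c
```

As a check, methane (CH4) gives -4, carbon dioxide (CO2) gives +4, and glycine (C2H5NO2) gives +1. For a protein, the same expression would be applied to the summed elemental composition of its amino acid residues.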

Date Created
2011-08-11
Agent

Energy transfer between the geosphere and biosphere

Description
One goal of geobiochemistry is to follow geochemical energy supplies from the external environment to the inside of microbial cells. This can be accomplished by combining thermodynamic calculations of energy supplies from geochemical processes and energy demands for biochemical processes. Progress towards this goal is summarized here. A critique of all thermodynamic data for biochemical compounds involved in the citric acid cycle (CAC) and the formulation of metabolite properties allow predictions of the energy involved in each step of the cycle, as well as in the full forward and reverse cycles, over wide ranges of temperature and pressure. These results allow evaluation of energy demands at the center of many microbial metabolic systems. Field work, sampling, and lab analyses from two low-temperature systems, a serpentinizing system and a subglacial setting, provide the data used in these thermodynamic analyses of energy supplies. An extensive literature summary of microbial and molecular data from serpentinizing systems is used to guide the evaluation and ranking of energy supplies used by chemolithoautotrophic microbes. These results constrain models of the distribution of microbial metabolisms throughout the low-temperature serpentinization systems in the Samail ophiolite in Oman (including locales of primary and subsequent alteration processes). Data collected from Robertson Glacier in Alberta, Canada, together with literature data from Lake Vida in Antarctica and bottom seawater, allowed thermodynamic analyses of low-temperature energy supplies in a glacial system. Results for 1460 inorganic redox reactions are used to fully inventory the geochemical energy sources that support the globally extensive cold biosphere.
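
The energy supply of a single redox reaction is conventionally evaluated from the standard relation ΔG = ΔG° + RT ln Q. The sketch below applies that relation; it is not the dissertation's code, the species names in the usage example are placeholders, and any ΔG° value passed in must come from a thermodynamic database at the temperature and pressure of interest.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def gibbs_energy(delta_g0, temperature_k, activities, stoichiometry):
    """Overall Gibbs energy of reaction, J per mole of reaction.

    delta_g0      -- standard Gibbs energy of reaction at temperature_k (J/mol)
    activities    -- dict: species name -> activity (dimensionless, > 0)
    stoichiometry -- dict: species name -> coefficient (negative for
                     reactants, positive for products)
    """
    ln_q = sum(nu * math.log(activities[sp]) for sp, nu in stoichiometry.items())
    return delta_g0 + R * temperature_k * ln_q
```

A call such as `gibbs_energy(-1000.0, 298.15, {"A": 1.0, "B": 1.0}, {"A": -1, "B": 1})`, with hypothetical species A and B at unit activity, simply returns the standard value. A negative result means the reaction can supply energy under the stated conditions, so ranking reactions by this quantity (often normalized per electron transferred) is one way to compare candidate energy sources.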
Date Created
2016
Agent