Biology-based matched signal processing and physics-based modeling for improved detection

Peptide microarrays have been used in molecular biology to profile immune responses and develop diagnostic tools. When the microarrays are printed with random peptide sequences, they can be used to identify antigen-antibody binding patterns, or immunosignatures. In this thesis, an advanced signal processing method is proposed to estimate epitope antigen subsequences and to identify mimotope antigen subsequences, which mimic the structure of epitopes, from random-sequence peptide microarrays. The method first maps peptide sequences to linear expansions of highly localized one-dimensional (1-D) time-varying signals and then uses a time-frequency processing technique to detect recurring patterns in subsequences. This technique is matched to the mapping scheme, and it allows for an inherent analysis of how substitutions in the subsequences can affect antibody binding strength. The performance of the proposed method is demonstrated by estimating epitopes and identifying potential mimotopes for eight monoclonal antibody samples.

The proposed mapping is generalized to express information on a protein's sequence location, structure, and function as a highly localized three-dimensional (3-D) Gaussian waveform. In particular, because analysis of protein homology has shown that incorporating different kinds of information into an alignment process can yield more robust alignment results, a pairwise protein structure alignment method is proposed based on a joint similarity measure of multiple mapped protein attributes. The 3-D mapping allocates protein properties to distinct regions in the time-frequency plane, which simplifies the alignment process by including all relevant information in a single, highly customizable waveform. Simulations demonstrate the improved performance of the joint alignment approach in inferring relationships between proteins, and they provide information on mutations that cause changes to both the sequence and structure of a protein.

In addition to the biology-based signal processing methods, a statistical method is considered that uses a physics-based model to improve processing performance. In particular, an externally developed physics-based model for sea clutter is examined for detecting a target with a low radar cross-section in heavy sea clutter. This novel model couples a process that generates random, dynamic sea clutter from the governing physics of gravity and capillary water waves with a finite-difference time-domain electromagnetic simulation, based on Maxwell's equations, that propagates the radar signal. A subspace clutter suppression detector is applied to remove dominant clutter eigenmodes, and its improved performance over matched filtering is demonstrated using simulations.

Dense non-natural sequence peptide microarrays for epitope mapping and diagnostics

The healthcare system in this country is currently unacceptable. New technologies may contribute to reducing cost and improving outcomes. Early diagnosis and treatment represent the least risky option for addressing this issue. A technology that enables this needs to be inexpensive, highly sensitive, highly specific, and amenable to adoption in a clinic. This thesis explores an immunodiagnostic technology based on highly scalable, non-natural sequence peptide microarrays designed to profile the humoral immune response and address the healthcare problem. The primary aim of this thesis is to explore the ability of these arrays to map continuous (linear) epitopes. I discovered that, using a technique termed subsequence analysis, epitopes could be decisively mapped to an eliciting protein with a high success rate. This led to the discovery of novel linear epitopes from Plasmodium falciparum (malaria) and Treponema pallidum (syphilis), as well as validation of previously discovered epitopes in dengue and monoclonal antibodies. Next, I developed and tested a classification scheme based on support vector machines for development of a dengue fever diagnostic, achieving higher sensitivity and specificity than current FDA-approved techniques. The software underlying this method is available for download under the BSD license. Following this, I developed a kinetic model for immunosignatures and tested it against existing data driven by previously unexplained phenomena. This model provides a framework and informs ways to optimize the platform for maximum stability and efficiency. I also explored the role of sequence composition in explaining an immunosignature binding profile, determining a strong role for charged residues that seems to have some predictive ability for disease. Finally, I developed a database, software, and an indexing strategy based on Apache Lucene for searching motif patterns (regular expressions) in large biological databases. 
These projects as a whole have advanced knowledge of how to approach high throughput immunodiagnostics and provide an example of how technology can be fused with biology in order to affect scientific and health outcomes.
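To make the idea of searching motif patterns expressed as regular expressions concrete, here is a minimal sketch that scans a small set of protein sequences with Python's standard `re` module. It is a plain linear scan and does not reproduce the Apache Lucene indexing strategy or the software described in the abstract; the function name `find_motif` and the example sequences are hypothetical.

```python
import re

def find_motif(pattern, sequences):
    """Return (name, start, matched_text) for each non-overlapping match.

    `sequences` maps a sequence name to a one-letter amino acid string.
    """
    regex = re.compile(pattern)
    hits = []
    for name, seq in sequences.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return hits

if __name__ == "__main__":
    # Hypothetical sequences, invented for the example.
    proteins = {
        "protein_1": "MKRGDSLLAV",
        "protein_2": "AAAAGGGG",
        "protein_3": "RGDTWWRGDS",
    }
    # Motif: R, G, D followed by either S or T.
    for hit in find_motif(r"RGD[ST]", proteins):
        print(hit)
```

A linear scan like this touches every residue of every sequence for each query, which is the cost an index is meant to avoid on large databases. Note also that `finditer` reports only non-overlapping matches.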

Antibody based strategies for multiplexed diagnostics

Peptide microarrays are to proteomics as sequencing is to genomics. As microarrays become more content-rich, higher-resolution proteomic studies will parallel deep sequencing of nucleic acids. Antigen-antibody interactions can be studied at a much higher resolution using microarrays than was possible only a decade ago. My dissertation focuses on testing the feasibility of creating sensitive diagnostics with either the Immunosignature platform, which is based on non-natural peptide sequences, or a pathogen peptide microarray, which uses bioinformatically selected peptides from pathogens. Both diagnostic applications use relatively little serum from infected individuals, but each approaches diagnosis of disease differently. The first project compares pathogen epitope peptide (life-space) and non-natural (random-space) peptide microarrays while using them for the early detection of coccidioidomycosis (Valley Fever). The second project uses NIAID category A, B, and C priority pathogen epitope peptides in a multiplexed microarray platform to assess the feasibility of using epitope peptides to diagnose multiple exposures simultaneously in a single assay. Cross-reactivity is a consistent feature of several antigen-antibody-based immunodiagnostics. This work uses microarray optimization and bioinformatic approaches to distill the underlying disease-specific antibody signature pattern. Circumventing the inherent cross-reactivity observed in antibody binding to peptides was crucial to achieving the goal of this work: accurately distinguishing multiple exposures simultaneously.

Using antibodies to characterize healthy, disease, and age states

The advent of new high-throughput technology allows for increasingly detailed characterization of the immune system in healthy, disease, and age states. The immune system is composed of two main branches, the innate and the adaptive immune systems, though the border between these two branches is appearing less distinct. The adaptive immune system is further split into two main categories: humoral and cellular immunity. The humoral immune response produces antibodies against specific targets, and these antibodies can be used to learn about disease and normal states. In this document, I use antibodies to characterize the immune system in two ways: (1) I determine the Antibody Status (AbStat) from data collected by applying sera to an array of non-natural sequence peptides, and I demonstrate that this AbStat measure can distinguish between disease, normal, and aged samples as well as produce a single AbStat number for each sample; (2) I search for antigens for use in a cancer vaccine, and this search results in several candidates as well as a new hypothesis. Antibodies provide us with a powerful tool for characterizing the immune system, and this natural tool combined with emerging technologies allows us to learn more about healthy and disease states.

Identification of neo-antigens for a cancer vaccine by transcriptome analysis

We propose a novel solution to prevent cancer by developing a prophylactic cancer vaccine. Several sources of antigens for cancer vaccines have been published. Among these, antigens that contain a frame-shift (FS) peptide or viral peptide are quite attractive for a variety of reasons. FS sequences, arising from errors either in RNA processing or in genomic DNA, may lead to the generation of neo-peptides that are foreign to the immune system. Viral peptides presumably would originate from exogenous but integrated viral nucleic acid sequences. Both are non-self and therefore lessen concerns about the development of autoimmunity. I have developed a bioinformatic approach to identify these aberrant transcripts in the cancer transcriptome. Their suitability for use in a vaccine is evaluated by establishing their frequencies and predicting possible epitopes along with their population coverage according to the prevalence of major histocompatibility complex (MHC) types. Viral transcripts and transcripts with FS mutations from gene fusion, insertion/deletion at coding microsatellite DNA, and alternative splicing were identified in the NCBI Expressed Sequence Tag (EST) database. Forty-eight FS chimeric transcripts were validated by RT-PCR and sequencing confirmation in 50 breast cell lines and 68 primary breast tumor samples, with frequencies ranging from 4% to 98%. These 48 FS peptides, if translated and presented, could be used to protect more than 90% of the population in Northern America based on the prediction of epitopes derived from them. Furthermore, we synthesized 150 peptides corresponding to FS and viral peptides that we predicted would exist in tumor patients, and we tested them against more than 200 different cancer patient sera. We found a number of serologically reactive peptide sequences in cancer patients that had little to no reactivity in healthy controls, which strongly supports our bioinformatic approach. 
This study describes a process used to identify aberrant transcripts that lead to a new source of antigens that can be tested and used in a prophylactic cancer vaccine. The vast amount of transcriptome data of various cancers from the Cancer Genome Atlas (TCGA) project will enhance our ability to further select better cancer antigen candidates.
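The reason a frame-shift yields a peptide that is foreign to the immune system can be shown with a small sketch: deleting a single nucleotide changes every downstream codon, so the translated tail no longer matches the wild-type protein. The code below uses the standard genetic code and a made-up 21-nucleotide sequence; it illustrates the principle only and is not the transcript-identification pipeline from the thesis. The names `translate` and `neo_peptide` and the example sequences are hypothetical.

```python
# Standard genetic code, codons ordered with bases T, C, A, G; '*' is a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)

def neo_peptide(wild_dna, mutant_dna):
    """Return the part of the mutant peptide after it diverges from wild type."""
    wild, mutant = translate(wild_dna), translate(mutant_dna)
    i = 0
    while i < min(len(wild), len(mutant)) and wild[i] == mutant[i]:
        i += 1
    return mutant[i:]

if __name__ == "__main__":
    # Made-up coding sequence; the mutant lacks the base at index 3.
    wild = "ATGGCTGAACGTCTGAAATAA"
    mutant = wild[:3] + wild[4:]
    print("wild-type peptide:", translate(wild))
    print("mutant peptide   :", translate(mutant))
    print("neo-peptide tail :", neo_peptide(wild, mutant))
```

In practice the shifted reading frame often runs into a premature stop codon, as it does in this example, so the length of the resulting neo-peptide varies from transcript to transcript.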

Immunosignature of Alzheimer's disease

The goal of this thesis is to test whether Alzheimer's disease (AD) is associated with distinctive humoral immune changes that can be detected in plasma and tracked across time. This is relevant because AD is the principal cause of dementia, and yet no specific diagnostic tests are universally employed in clinical practice to predict, diagnose, or monitor disease progression. In particular, I describe herein a proteomic platform developed at the Center for Innovations in Medicine (CIM) consisting of a slide with 10,000 random-sequence peptides printed on its surface, which is used as the solid phase of an immunoassay in which antibodies of interest are allowed to react and are subsequently detected with a labeled secondary antibody. The pattern of antibody binding to the microarray is unique for each individual animal or person. This thesis will evaluate the versatility of the microarray platform and how it can be used to detect and characterize the binding patterns of antibodies relevant to the pathophysiology of AD, as well as those in plasma samples from animal models of AD and from elderly humans with or without dementia. My specific aims were to evaluate the emergence and stability of the immunosignature in mice with cerebral amyloidosis and to characterize the immunosignature of humans with AD. Plasma samples from APPswe/PSEN1-dE9 transgenic mice were evaluated longitudinally from 2 to 15 months of age to compare the evolving immunosignature with that of non-transgenic control mice. Immunological variation across different time points was assessed, with particular emphasis on the time of emergence of a characteristic pattern. In addition, plasma samples from AD patients and age-matched individuals without dementia were assayed on the peptide microarray, and binding patterns were compared. It is hoped that these experiments will be the basis for a larger study of the diagnostic merits of the microarray-based immunoassay in dementia clinics.

Characterization and analysis of a novel platform for profiling the antibody response

Immunosignaturing is a new immunodiagnostic technology that uses random-sequence peptide microarrays to profile the humoral immune response. Though the peptides have little sequence homology to any known protein, binding of serum antibodies may be detected, and the pattern correlated to disease states. The aim of my dissertation is to analyze the factors affecting the binding patterns using monoclonal antibodies and to determine how much information may be extracted from the sequences. Specifically, I examined the effects of antibody concentration, competition, peptide density, and antibody valence. Peptide binding could be detected at the low concentrations relevant to immunosignaturing, and a monoclonal's signature could be detected even in the presence of a 100-fold excess of naive IgG. I also found that peptide density was important, but this effect was not due to bivalent binding. Next, I examined in more detail how a polyreactive antibody binds to the random-sequence peptides compared to protein-sequence-derived peptides, and found that it bound to many peptides from both sets, but with low apparent affinity. An in-depth look at the peptide physicochemical properties and sequence complexity revealed some correlations with properties, but they were generally small and varied greatly between antibodies. However, on a larger peptide library with limited diversity, I found that sequence complexity was important for antibody binding. The redundancy in that library did enable the identification of specific sub-sequences recognized by an antibody. The current immunosignaturing platform has little repetition of sub-sequences, so I evaluated several methods to infer antibody epitopes. I found two methods that had modest prediction accuracy, and I developed a software application called GuiTope to facilitate the epitope prediction analysis. None of the methods had sufficient accuracy to identify an unknown antigen from a database. 
In conclusion, the characteristics of the immunosignaturing platform observed through monoclonal antibody experiments demonstrate its promise as a new diagnostic technology. However, a major limitation is the difficulty in connecting the signature back to the original antigen, though larger peptide libraries could facilitate these predictions.