Exploring the Sequence vs. Binding Relationships for Monoclonal Antibodies and Other Proteins

Description
Molecular recognition forms the basis of all protein interactions and is therefore crucial for maintaining biological functions and pathways. It can be governed by many factors, but in the case of proteins and peptides, the amino acid sequences of the interacting entities play a major role. It is molecular recognition that helps a protein identify the correct sequence of residues necessary for an interaction among the vast number of possibilities in the combinatorial sequence space. Therefore, it is fundamental to study how the interacting amino acid sequences define the molecular interactions of proteins. In this work, sparsely sampled peptide sequences from the combinatorial sequence space were used to study the molecular recognition observed in proteins, especially monoclonal antibodies. A machine-learning-based approach was used to study the molecular recognition characteristics of 11 monoclonal antibodies, where a neural network (NN) was trained on data from protein binding experiments performed on high-throughput random-sequence peptide microarrays. The use of random-sequence microarrays allowed the peptides to be sparsely sampled from sequence space. After training, the NN yielded a sequence vs. binding relationship for each antibody. This in silico relationship was then extended to larger libraries of random peptides, as well as to biologically relevant sequences (target antigens and proteomes). The NN models performed well in all aspects in predicting the pertinent interactions for 6 of the 11 monoclonal antibodies. The interactions of the other 5 monoclonal antibodies could not be predicted well by the models, due to their poor recognition of the residues that were omitted from the array. Furthermore, NN-predicted sequence vs. binding relationships for 3 other proteins were experimentally probed using surface plasmon resonance (SPR). This was done to explore the relationship between the observed and predicted binding to the arrays and the observed binding on different assay platforms. A general, motif-dependent correlation was noted between predicted and SPR-measured binding. This study also indicated that a combined iterative approach using in silico and in vitro techniques is a powerful tool for optimizing the selectivity of protein-binding peptides.
Date Created
2022
Agent

Understanding and Utilizing Protein Interactions in Diverse Environments

Description
Transient protein-protein and protein-molecule interactions fluctuate between associated and dissociated states. They are widespread in nature and mediate most biological processes. These interactions are complex and are strongly influenced by factors such as concentration, structure, and environment. Understanding and utilizing these types of interactions is useful from both a fundamental and a design perspective. In this dissertation, transient protein interactions are used as the sensing element of a biosensor for small molecule detection. This is done by using a transcription factor-small molecule pair that mediates the activation of a CRISPR/Cas12a complex. Activation of the Cas12a enzyme results in an amplified readout mechanism that is either fluorescence-based or paper-based. This biosensor can successfully detect 9 different small molecules, including antibiotics, with a tunable detection limit ranging from low µM to low nM. By combining protein and nucleic acid-based systems, this biosensor has the potential to report on almost any protein-molecule interaction, linking it to the intrinsic amplification that is possible when working with nucleic acid-based technologies. The second part of this dissertation focuses on understanding protein-molecule interactions at a more fundamental level and, in so doing, on exploring the design rules required to generalize sensors like the ones described above. This is done by training a neural network algorithm with binding data from high-density peptide microarrays incubated with specific protein targets. Because the peptide sequences were chosen simply to evenly, though sparsely, represent all of sequence space, the resulting network provides a comprehensive sequence/binding relationship for a given target protein. While past work had shown that this works well on the arrays, here I have explored how well the neural networks thus trained predict sequence-dependent binding in the context of protein-protein and peptide-protein interactions. Amino acid sequences, either free in solution or embedded in a protein structure, will display somewhat different binding properties than sequences affixed to the surface of a high-density array. However, the neural network trained on array sequences was able both to identify binding regions between proteins and to predict surface plasmon resonance-based binding propensities for peptides with statistically significant levels of accuracy.
Date Created
2022
Agent

Engineered Excitonic Complex Directed by Programmable DNA Architectures

Description
Efficient light collection and utilization are essential for developing effective photonic devices and materials. Nature is the master of organizing photosynthetic pigments into a densely packed state without self-quenching and of conducting efficient, directed energy transfer by using sophisticated proteins as scaffolds. The natural light-harvesting complex inspires the design of artificial photonic systems that use synthetic templates to control the spatial arrangement and energy landscape of photoactive components. Self-assembled DNA nanostructures are highly programmable and intrinsically addressable, which makes them excellent templates for the precise organization of chromophores, with the desired complexity, into artificial light-harvesting systems and photonic nanodevices for efficient photon capture and excitation energy transport. This dissertation focuses on the fundamental understanding and rational engineering of a series of artificial excitonic systems using programmable DNA architectures as templates to direct the self-assembly of cyanine dye aggregates. First, DNA-templated pseudoisocyanine (PIC) dye aggregates were systematically studied to explore the effect of the sequence and length of the DNA templates on their excitonic properties. The results revealed that the PIC dye aggregates enable energy transfer along a defined track. Next, the benzothiazole cyanine dye K21 was introduced to form dye aggregates on double-stranded DNA templates. The strong intermolecular coupling and weak sequence dependency of the K21 aggregates make it possible to mediate efficient directional energy transfer over distances of up to 30 nm. Finally, DNA helix-bundle structures with extended sizes and complex geometries were employed to organize K21 dye into scalable, addressable, and programmable excitonic complexes that conduct sub-micron-scale directional exciton transport and serve as robust, modular building blocks for constructing higher-order excitonic architectures.
Date Created
2021
Agent

Exploring the nature of protein-peptide interactions on surfaces

Description
Protein-surface interactions, whether structured or unstructured, are important in both biological and man-made systems. Unstructured interactions are more difficult to study with conventional techniques due to the lack of a specific binding structure. In this dissertation, a novel approach is employed to study the unstructured interactions between proteins and heterogeneous surfaces, by looking at a large number of different binding partners at surfaces and using the binding information to understand the chemistry of binding. In this regard, surface-bound peptide arrays are used as a model for the study. Specifically, in Chapter 2, the effects of the charge, hydrophobicity, and length of surface-bound peptides on binding affinity for specific globular proteins (β-galactosidase and α1-antitrypsin) and on the relative binding of different proteins were examined with the LC Sciences peptide array platform. While the general charge and hydrophobicity of the peptides are certainly important, more surprising is that β-galactosidase affinity for the surface does not simply increase with the length of the peptide. Another interesting observation, which leads to the next part of the study, is that even very short surface-bound peptides can have both strong and selective interactions with proteins. Hence, in Chapter 3, selected tetrapeptide sequences with known binding characteristics to β-galactosidase are used as building blocks to create longer sequences to see whether their binding functions are additive. The conclusion is that while adding two component sequences together can either greatly increase or decrease overall binding and specificity, the contribution of the individual binding components to the binding affinity and specificity is strongly dependent on their position in the peptide. Finally, in Chapter 4, another array platform is utilized to overcome the limitations associated with the LC Sciences platform. It is found that the effects of peptide sequence properties on IgG binding on the HealthTell array are quite similar to what was observed with β-galactosidase on the LC Sciences array surface. In summary, the approach presented in this dissertation can provide binding information for both structured and unstructured interactions taking place at complex surfaces and has the potential to help develop surfaces covered with specific short peptide sequences with relatively specific protein interaction profiles.
Date Created
2014
Agent

Protein folding & dynamics using multi-scale computational methods

Description
This thesis explores a wide array of topics related to the protein folding problem, ranging from the folding mechanism, ab initio structure prediction, and protein design to the mechanism of protein functional evolution, using multi-scale approaches. To investigate the role of native topology in the folding mechanism, the native topology is dissected into non-local and local contacts. The number of non-local contacts and the non-local contact orders are both negatively correlated with folding rates, suggesting that the non-local contacts dominate the barrier-crossing process. However, local contact orders show a positive correlation with folding rates, indicating the role of a diffusive search in the denatured basin. Additionally, the folding rate distributions of the E. coli and yeast proteomes are predicted from native topology. The distributions are fitted well by a diffusion-drift population model and are also directly compared with experimentally measured half-lives. The results indicate that proteome folding kinetics is limited by protein half-life. The crucial role of local contacts in protein folding is further explored through simulations of WW domains using the Zipping and Assembly Method. The correct formation of the N-terminal β-turn turns out to be important for the folding of WW domains. A classification model based on the contact probabilities of five critical local contacts is constructed to predict the foldability of WW domains with 81% accuracy. By introducing mutations to stabilize those critical local contacts, a new protein design approach is developed to re-design unfoldable WW domains and make them foldable. Once folded, proteins exhibit inherent conformational dynamics that are required for function. Using molecular dynamics simulations in conjunction with Perturbation Response Scanning, it is demonstrated that the divergence of functions can occur through the modification of conformational dynamics within an existing fold for β-lactamases and GFP-like proteins: i) the modern TEM-1 lactamase shows a comparatively rigid active-site region, likely reflecting adaptation for efficient degradation of a specific substrate, while the resurrected ancient lactamases indicate enhanced active-site flexibility, which likely allows for the binding and subsequent degradation of different antibiotic molecules; ii) the chromophore and attached peptides of the photoconversion-competent GFP-like protein exhibit higher flexibility than those of the photoconversion-incompetent one, consistent with the evolution of photoconversion capacity.
Date Created
2014
Agent

The role of protein dielectric relaxation on modulating the electron transfer process in photosynthetic reaction centers

Description
The photosynthetic reaction center is a type of pigment-protein complex found widely in photosynthetic bacteria, algae, and higher plants. Its function is to convert the energy of sunlight into a chemical form that can be used to support other life processes. Its high efficiency and structural simplicity make the bacterial reaction center a paradigm for studying electron transfer in biomolecules. This thesis starts with a comparison of the primary electron transfer process in the reaction centers from the Rhodobacter sphaeroides bacterium and those from its thermophilic homolog, Chloroflexus aurantiacus. Different temperature dependences of the primary electron transfer were found in these two types of reaction centers. Analyses of the structural differences between these two proteins suggested that the excess of charged surface amino acids, as well as a larger solvent-exposed area, in the Chloroflexus aurantiacus reaction center could explain the different temperature dependence. The conclusion from this work is that the electrostatic interaction potentially has a major effect on the electron transfer. Inspired by these results, a single point mutant was designed for Rhodobacter sphaeroides reaction centers by placing an ionizable amino acid in the protein interior to perturb the dielectrics. The ionizable group at the mutation site is largely deprotonated in the ground state, judging from the cofactor absorption spectra as a function of pH. By contrast, a fast charge recombination associated with protein dielectric relaxation was observed in this mutant, suggesting the possibility that dynamic protonation/deprotonation may be taking place during the electron transfer. The fast protein dielectric relaxation occurring in this mutant complicates the electron transfer pathway and reduces the yield of electron transfer to QA. Considering the importance of the protein dielectric environment, efforts have been made to quantify variations of the internal field during charge separation. An analysis protocol based on the Stark effect of reaction center cofactor spectra during charge separation has been developed to characterize the charge-separated radical field acting on probe chromophores. The field change, monitored by the dynamic Stark shift, correlates with, but is not identical to, the electron transfer kinetics. The dynamic Stark shift results have led to a dynamic model for the time-dependent dielectric that is complementary to the static dielectric asymmetry observed in past steady-state experiments. Taken together, the work in this thesis emphasizes the importance of protein electrostatics and its dielectric response to electron transfer.
Date Created
2012
Agent

Computational modeling of peptide-protein binding

Description
Peptides offer great promise as targeted affinity ligands, but the space of possible peptide sequences is vast, making experimental identification of lead candidates expensive, difficult, and uncertain. Computational modeling can narrow the search by estimating the affinity and specificity of a given peptide in relation to a predetermined protein target. The predictive performance of computational models of interactions of intermediate-length peptides with proteins can be improved by taking into account the stochastic nature of the encounter and binding dynamics. A theoretical case is made for the hypothesis that, because of the flexibility of the peptide and the structural complexity of the target protein, interactions are best characterized by an ensemble of possible bound configurations rather than a single “lock and key” fit. A model incorporating these factors is proposed and evaluated. A comprehensive dataset of 3,924 peptide-protein interface structures was extracted from the Protein Data Bank (PDB) and descriptors were computed characterizing the geometry and energetics of each interface. The characteristics of these interfaces are shown to be generally consistent with the proposed model, and heuristics for design and selection of peptide ligands are derived. The curated and energy-minimized interface structure dataset and a relational database containing the detailed results of analysis and energy modeling are made publicly available via a web repository. A novel analytical technique based on the proposed theoretical model, Virtual Scanning Probe Mapping (VSPM), is implemented in software to analyze the interaction between a target protein of known structure and a peptide of specified sequence, producing a spatial map indicating the most likely peptide binding regions on the protein target. The resulting predictions are shown to be superior to those of two other published methods, and support the validity of the stochastic binding model.
Date Created
2010
Agent