Description
In recent years, experimental and theoretical evidence has pointed to the existence of biologically active proteins that either include unstructured regions or are entirely unstructured. Referred to as intrinsically disordered proteins (IDPs), they are now known to be involved in diverse functions, much like any folded protein. Mutations in IDPs have been implicated in multiple neurodegenerative diseases. Because IDPs are disordered, few structural features are available to quantify the disordered state. One such pair of variables is the radius of gyration (Rg) and the corresponding Flory’s scaling exponent, both of which characterize the dimension and size of the protein. It is generally understood that the sequence of an IDP affects its Rg and scaling exponent. Properties such as amino acid hydrophobicity and charge can play important roles in determining the Rg of an IDP, much as they affect the structure of a folded protein. However, it is nontrivial to predict Rg and the scaling exponent directly from an IDP sequence. In this thesis, a coarse-grained model is used to simulate the Rg and scaling exponents of 10,000 randomly generated sequences that mimic the amino acid propensities of a typical IDP sequence. This database is then fed into an artificial neural network model to predict the scaling exponent directly from the sequence. The framework not only makes accurate and precise predictions (<1% error) compared with the simulation-obtained scaling exponent, but also suggests important sequence descriptors for such predictions. In addition, by varying the number of sequences used to train the model, we suggest that a minimum dataset of 100 sequences might be sufficient to achieve a 5% prediction error, which points toward possible predictive models that rely on experimental inputs alone.
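
The two quantities named in the abstract can be illustrated with a short sketch. It is not the thesis's code: the function names, the equal-mass bead assumption, and the choice to fit a power law R = b * s^nu to internal distances versus sequence separation are all assumptions made here for illustration, and the abstract does not say how the scaling exponent was actually extracted from the simulations.

```python
import math

def radius_of_gyration(coords):
    """Rg for equally weighted beads: RMS distance of the beads from their centroid.

    `coords` is a list of (x, y, z) tuples, one per residue (hypothetical input format).
    """
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)

def fit_scaling_exponent(separations, distances):
    """Least-squares fit of log(distance) against log(separation).

    Returns (nu, b) for the assumed power law distance = b * separation**nu.
    """
    xs = [math.log(s) for s in separations]
    ys = [math.log(d) for d in distances]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    nu = sxy / sxx                   # slope of the log-log line
    b = math.exp(my - nu * mx)       # prefactor from the intercept
    return nu, b

# Toy check on a straight four-bead rod with unit spacing.
rod = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
rg_rod = radius_of_gyration(rod)

# Synthetic distances generated from a known power law, used only to check the fit.
seps = list(range(1, 51))
dists = [3.8 * s ** 0.6 for s in seps]
nu_fit, b_fit = fit_scaling_exponent(seps, dists)
```

On the synthetic data the fit recovers the exponent and prefactor that generated it, which checks only the fitting step. Real simulation output would be ensemble-averaged before fitting, and the fitted value would depend on the range of separations included.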


Download restricted.
Restrictions Statement

Barrett Honors College theses and creative projects are restricted to ASU community members.

Details

Title
  • Predicting Dimensions of Intrinsically Disordered Proteins
Contributors
Date Created
2019-05
Resource Type
  • Text