PgmNr P2150: Folding and misfolding of evolutionarily young proteins.

Authors:
Joanna Masel 1; Scott Foy 1; Ben Wilson 1; Rafik Neme 2; Matt Cordes 1


Institutes:
1) University of Arizona, Tucson, AZ, USA; 2) Max Planck Institute for Evolutionary Biology, Ploen, Germany.


Abstract:

Random polypeptides are expected to form amyloids and other toxic aggregates, making de novo gene birth from non-coding sequences a difficult transition. We find that evolutionarily young mouse proteins contain more hydrophilic amino acids than old proteins do, and that old proteins are in turn more hydrophilic than intergenic sequences would be if they were translated. Low hydrophobicity is presumably a precondition for avoiding harmful aggregation; computationally predicted aggregation propensity tracks amino acid composition. Surprisingly, however, when amino acid composition is held constant, young proteins (but not old proteins) have a higher predicted aggregation propensity than scrambled controls. Preliminary results suggest that this might be explained by the degree of dispersion of hydrophobic amino acids along the primary sequence. Previous work concluded that amino acids with different properties, e.g. polar vs. non-polar or hydrophobic vs. hydrophilic, are overdispersed (more evenly interspersed) relative to a random ordering. We find that this conclusion holds only for the very oldest proteins, which are overrepresented in protein structural databases. Young proteins instead show significant underdispersion, i.e. clustering. We hypothesize that this clustering to form locally structured regions may be a precondition for a de novo evolved protein to fold, with a greater tendency to misfold or aggregate arising as an inevitable byproduct. Over long evolutionary timescales, more subtle folding strategies allow proteins to become more hydrophobic while still avoiding misfolding, and to have less clustering of their hydrophobicity while still being able to fold.
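
The comparison against scrambled controls described above can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a simple block-variance measure of clustering, and the hydrophobic residue set, the block size, and the function names (block_counts, clustering_ratio) are all hypothetical choices made for illustration. The sequence is cut into consecutive blocks, hydrophobic residues are counted per block, and the variance of those counts is divided by the mean variance over shuffled versions of the same sequence, so amino acid composition is held constant. A ratio above 1 corresponds to clustering (underdispersion in the sense used here), and a ratio below 1 to hydrophobic residues that are more evenly interspersed than in a random ordering.

```python
import random
from statistics import mean, pvariance

# Assumption: an illustrative hydrophobic residue set, not taken from the abstract.
HYDROPHOBIC = set("AVILMFWC")


def block_counts(seq, block):
    """Count hydrophobic residues in consecutive non-overlapping blocks.

    Trailing residues that do not fill a whole block are ignored.
    """
    n_blocks = len(seq) // block
    return [
        sum(aa in HYDROPHOBIC for aa in seq[i * block:(i + 1) * block])
        for i in range(n_blocks)
    ]


def clustering_ratio(seq, block=6, n_shuffles=200, seed=0):
    """Observed block-count variance divided by the mean variance of
    composition-preserving scrambled controls.

    > 1: hydrophobic residues are clustered along the sequence.
    < 1: hydrophobic residues are more evenly spread than in a random order.
    Returns NaN when the scrambled controls have zero variance
    (e.g. a sequence with no hydrophobic residues).
    """
    rng = random.Random(seed)
    observed = pvariance(block_counts(seq, block))
    residues = list(seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # scrambling keeps amino acid composition fixed
        null.append(pvariance(block_counts("".join(residues), block)))
    expected = mean(null)
    if expected == 0:
        return float("nan")
    return observed / expected


if __name__ == "__main__":
    clustered = ("L" * 12 + "S" * 12) * 2   # hydrophobic residues in long runs
    alternating = "LS" * 24                 # hydrophobic residues evenly spread
    print(clustering_ratio(clustered))
    print(clustering_ratio(alternating))
```

In this toy example the run-rich sequence gives a ratio above 1 and the strictly alternating sequence gives a ratio of 0, because every block holds the same number of hydrophobic residues. A real analysis would need a justified hydrophobicity scale and block size, and a significance test across many proteins.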