Emergence of stable and fast folding protein structures
The number of protein structures is far smaller than the number of sequences. By imposing simple generic features of proteins (low energy and compaction) on all possible sequences, we show that the structure space is sparse compared to the sequence space. Even though the sequence space grows exponentially with N (the number of amino acids), we conjecture that the number of low energy compact structures scales only as ln N. This implies that many sequences must map onto a countable number of basins in the structure space. The number of sequences for which a given fold emerges as a native structure is further reduced by the dual requirements of stability and kinetic accessibility. The factor that determines whether the dual requirements are met is related to the sequence-dependent temperatures T_\theta (collapse transition temperature) and T_F (folding transition temperature). Sequences for which \sigma = (T_\theta - T_F)/T_\theta is small typically fold fast by generically collapsing to native-like structures and then rapidly assembling to the native state. Such sequences satisfy the dual requirements over a wide temperature range. We also suggest that the functional requirement may further reduce the number of sequences that are biologically competent. The scheme developed here for thinning the sequence space, which leads to foldable structures, arises naturally from simple physical characteristics of proteins. The reduction in sequence space leading to the emergence of foldable structures is demonstrated using lattice models of proteins.
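As a rough illustration of the two quantities the abstract relies on, the sketch below computes \sigma = (T_\theta - T_F)/T_\theta and contrasts the exponential growth of sequence space with ln N. The function names, the 20-letter alphabet default, and the sample temperatures are assumptions made for this example only. The abstract gives no prefactor for the ln N scaling, so only the bare logarithm is shown.

```python
import math


def sigma(t_theta: float, t_f: float) -> float:
    """Return sigma = (T_theta - T_F) / T_theta as defined in the abstract.

    t_theta: collapse transition temperature (must be positive).
    t_f: folding transition temperature.
    """
    if t_theta <= 0:
        raise ValueError("t_theta must be positive")
    return (t_theta - t_f) / t_theta


def sequence_count(n: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of length n.

    alphabet_size=20 is an assumption (the standard amino acids);
    lattice models often use a smaller alphabet.
    """
    return alphabet_size ** n


def structure_scale(n: int) -> float:
    """Bare ln N, the conjectured scaling of low energy compact structures.

    The proportionality constant is not given in the abstract, so this is
    a scale, not a count.
    """
    return math.log(n)


if __name__ == "__main__":
    # Hypothetical temperatures in arbitrary units, chosen only to
    # contrast a small sigma with a larger one.
    for t_theta, t_f in [(1.0, 0.9), (1.0, 0.4)]:
        print(f"T_theta={t_theta}, T_F={t_f}, sigma={sigma(t_theta, t_f):.2f}")

    # Exponential sequence space against logarithmic structure scale.
    for n in (10, 20, 40):
        print(f"N={n}: sequences={sequence_count(n):.3e}, ln N={structure_scale(n):.2f}")
```

In this toy comparison, the pair with T_F close to T_\theta gives the smaller \sigma, which is the regime the abstract associates with fast folding.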