
arxiv: 0806.4838 · v2 · pith:TWJF4UH6new · submitted 2008-06-30 · 🧬 q-bio.BM

On the performance of combined dichotomic predictors of natively unfolded proteins

classification 🧬 q-bio.BM
keywords proteins, predictors, performance, unfolded, natively, single, dataset, unclassified

The performance of single folding predictors and of combination scores is critically evaluated. We test mean packing, mean pairwise energy and the new index gVSL2 on a dataset of 743 folded proteins and 81 natively unfolded proteins. Individually, these predictors perform comparably to, or even better than, other proposed methods. We introduce a strictly unanimous score S_{SU} that combines them but leaves undecided those sequences that are classified differently by two single predictors. The performance of the single predictors increases significantly on a dataset purged of the proteins left unclassified by S_{SU}, indicating that unclassified proteins are mainly false predictions. Amino acid composition is the main determinant considered by these predictors; unclassified proteins therefore have a composition compatible with both folded and unfolded status. This is why purging a dataset of these ambiguous proteins increases the performance of single predictors. The percentages of proteins predicted as natively unfolded by S_{SU} in the three kingdoms are 4.1% for Bacteria, 1.0% for Archaea and 20.0% for Eukarya, compatible with previous determinations. Evidence is given of a scaling law relating the number of natively unfolded proteins to the total number of proteins in a genome; a first estimate of the critical exponent is 1.95 ± 0.21.
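
The combination rule behind S_{SU} can be illustrated with a minimal sketch. It assumes each single predictor has already been reduced to a binary call (True for natively unfolded, False for folded). The function name `strictly_unanimous` and the boolean encoding are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Optional, Sequence


def strictly_unanimous(votes: Sequence[bool]) -> Optional[bool]:
    """Combine dichotomic predictor calls by strict unanimity.

    Assumed encoding (not from the paper): True = natively unfolded,
    False = folded. Returns the shared call when every predictor agrees,
    and None (left unclassified) as soon as any two predictors disagree.
    """
    if not votes:
        raise ValueError("at least one predictor vote is required")
    first = votes[0]
    # Unanimity check: every vote must match the first one.
    if all(vote == first for vote in votes):
        return first
    return None


# Hypothetical calls from three predictors for three sequences.
calls = {
    "seq_a": [True, True, True],     # all say unfolded
    "seq_b": [False, False, False],  # all say folded
    "seq_c": [True, False, True],    # disagreement
}
for name, votes in calls.items():
    print(name, strictly_unanimous(votes))
```

Returning None rather than falling back to a majority vote mirrors the abstract's description: sequences on which the single predictors disagree are left undecided, so they can be purged from the dataset before performance is re-evaluated.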
