Simplified amino acid alphabets based on deviation of conditional probability from random background
The primitive data for deducing the Miyazawa-Jernigan contact energy or the BLOSUM score matrix consist of pair frequency counts. Each amino acid corresponds to a conditional probability distribution. Based on the deviation of this conditional probability from the random background, a scheme for reducing the amino acid alphabet is proposed. The reduced alphabets obtained from the raw Miyazawa-Jernigan and BLOSUM residue pair counts differ markedly. Using the homologous sequence database SCOP40 as a test set, we detect homology with the resulting coarse-grained substitution matrices and verify that the reduced alphabets preserve well the information contained in the original 20-letter alphabet.
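As a rough illustration of the first step, the sketch below turns a table of pair counts into one conditional distribution per letter and measures how far each one lies from the background. The abstract does not say which deviation measure is used, so the Kullback-Leibler divergence here is an assumption. The three-letter alphabet, the counts, and all function names are also hypothetical and chosen only to keep the example small.

```python
import math

# Hypothetical toy alphabet and symmetric pair counts (illustrative only,
# not taken from Miyazawa-Jernigan or BLOSUM data).
LETTERS = ["A", "B", "C"]
COUNTS = {
    "A": {"A": 8, "B": 1, "C": 1},
    "B": {"A": 1, "B": 4, "C": 5},
    "C": {"A": 1, "B": 5, "C": 4},
}


def conditional_distributions(counts, letters):
    """Return P(b | a) for every letter a by normalising each row of counts."""
    cond = {}
    for a in letters:
        row_total = sum(counts[a][b] for b in letters)
        cond[a] = {b: counts[a][b] / row_total for b in letters}
    return cond


def background_distribution(counts, letters):
    """Return the marginal frequency of each letter over all pair counts."""
    total = sum(counts[a][b] for a in letters for b in letters)
    return {b: sum(counts[a][b] for a in letters) / total for b in letters}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); an assumed deviation measure."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p if p[k] > 0)


cond = conditional_distributions(COUNTS, LETTERS)
bg = background_distribution(COUNTS, LETTERS)
deviation = {a: kl_divergence(cond[a], bg) for a in LETTERS}

for a in LETTERS:
    print(a, round(deviation[a], 4))
```

In this toy table, B and C have the same deviation from the background, while A, which pairs mostly with itself, deviates more. Letters with similar deviation profiles are natural candidates for sharing a group in a reduced alphabet, although the actual grouping rule is not given in the abstract.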