
arxiv: physics/0103028 · v1 · submitted 2001-03-10 · ⚛️ physics.bio-ph · q-bio


Compositional representation of protein sequences and the number of Eulerian loops

keywords: protein, sequences, acid, amino, euler, eulerian, formula, loops

An amino acid sequence of a protein may be decomposed into consecutive overlapping strings of length K. How unique is the converse, i.e., the reconstruction of amino acid sequences from the set of K-strings obtained in the decomposition? This problem may be transformed into the problem of counting the number of Eulerian loops in an Euler graph, though the well-known formula must be modified. By exhaustive enumeration and by using the modified formula we show that the reconstruction is unique for K greater than or equal to 5 for an overwhelming majority of the proteins in the PDB.seq database. The corresponding Euler graphs provide a means to study the structure of repeated segments in protein sequences.
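
The decomposition and the exhaustive-enumeration side of the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `k_strings` and `count_reconstructions` are hypothetical, the graph construction (nodes are (K-1)-strings, arcs are K-strings) is an assumption about how the Euler graph is built, and the walk is assumed to start at the first (K-1)-string of the original sequence. It counts distinct sequences by brute force rather than with the modified formula, which the abstract does not state.

```python
from collections import Counter


def k_strings(seq, k):
    """Return the consecutive overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_reconstructions(seq, k):
    """Count distinct sequences that use every K-string of seq exactly
    as often as it occurs, starting from seq's first (k-1)-string.

    Assumption: nodes are (k-1)-strings and each K-string is an arc
    from its prefix to its suffix. Repeated K-strings are treated as
    indistinguishable, so each distinct sequence is counted once.
    """
    remaining = Counter(k_strings(seq, k))
    total = sum(remaining.values())

    # Group the distinct K-strings by the node they leave from.
    by_prefix = {}
    for s in remaining:
        by_prefix.setdefault(s[:-1], []).append(s)

    def walk(node, used):
        if used == total:
            return 1  # every K-string has been consumed
        count = 0
        for s in by_prefix.get(node, []):
            if remaining[s] > 0:
                remaining[s] -= 1
                count += walk(s[1:], used + 1)
                remaining[s] += 1  # backtrack
        return count

    return walk(seq[:k - 1], 0)


if __name__ == "__main__":
    toy = "ABCBDB"  # made-up toy string, not a real protein
    for k in (2, 3):
        print(k, count_reconstructions(toy, k))
```

For the toy string `ABCBDB`, the 2-strings admit two reconstructions (`ABCBDB` and `ABDBCB`), while the 3-strings admit only the original, which mirrors on a small scale how uniqueness improves as K grows. The recursion is exponential in the worst case and one level deep per K-string, so this sketch is only suitable for short inputs.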

