Measure representation and multifractal analysis of complete genomes
read the original abstract
This paper introduces the notion of measure representation of DNA sequences. Spectral analysis and multifractal analysis are then performed on the measure representations of a large number of complete genomes. The main aim of this paper is to discuss the multifractal property of the measure representation and the classification of bacteria. From the measure representations and the values of the $D_{q}$ spectra and related $C_{q}$ curves, it is concluded that these complete genomes are not random sequences. In fact, spectral analyses performed indicate that these measure representations considered as time series, exhibit strong long-range correlation. For substrings with length K=8, the $D_{q}$ spectra of all organisms studied are multifractal-like and sufficiently smooth for the $C_{q}$ curves to be meaningful. The $C_{q}$ curves of all bacteria resemble a classical phase transition at a critical point. But the 'analogous' phase transitions of chromosomes of non-bacteria organisms are different. Apart from Chromosome 1 of {\it C. elegans}, they exhibit the shape of double-peaked specific heat function.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.