
arxiv: chao-dyn/9403002 · v1 · submitted 1994-03-22 · chao-dyn · nlin.CD · q-bio.GN

Understanding Long-range Correlations in DNA Sequences

classification chao-dyn nlin.CD q-bio.GN
keywords sequences, correlation, long-range, complexity, duplication, model, observed, correlations

In this paper, we review the literature on statistical long-range correlations in DNA sequences. We examine the current evidence for these correlations and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. This complexity often makes it hard, or impossible, to decompose a sequence into a few statistically stationary regions. We suggest that, given this complexity, a fruitful approach to understanding long-range correlations is to model duplication and other rearrangement processes in DNA sequences. One model, called the "expansion-modification system", contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the contribution of DNA duplication to the observed long-range correlations in DNA sequences.
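
To make the expansion-modification idea concrete, here is a minimal Python sketch of a process built only from point duplication and point mutation. The abstract does not spell out the update rule, so the details are assumptions of this sketch: a binary alphabet, a parallel update in which each symbol is either flipped (with probability `p_mutate`) or copied twice, and the names `expand_modify`, `generate`, and `power`. The `power` helper is a plain discrete Fourier transform at a single frequency index, included only so the spectrum of a generated sequence can be inspected; it is not taken from the paper.

```python
import cmath
import math
import random


def expand_modify(seq, p_mutate, rng):
    """Apply one parallel update to a binary sequence.

    Each symbol is either flipped (point mutation, probability p_mutate)
    or replaced by two copies of itself (point duplication).
    """
    out = []
    for s in seq:
        if rng.random() < p_mutate:
            out.append(1 - s)       # point mutation: 0 -> 1, 1 -> 0
        else:
            out.extend((s, s))      # point duplication: 0 -> 00, 1 -> 11
    return out


def generate(n_min, p_mutate=0.1, seed=0):
    """Iterate the update from a single symbol until length >= n_min.

    Assumes p_mutate < 1, so that duplication events occur and the
    sequence keeps growing.
    """
    rng = random.Random(seed)
    seq = [0]
    while len(seq) < n_min:
        seq = expand_modify(seq, p_mutate, rng)
    return seq


def power(seq, k):
    """Power of the mean-subtracted sequence at frequency index k (naive DFT)."""
    n = len(seq)
    mean = sum(seq) / n
    acc = sum((s - mean) * cmath.exp(-2j * math.pi * k * t / n)
              for t, s in enumerate(seq))
    return abs(acc) ** 2 / n


if __name__ == "__main__":
    seq = generate(1024, p_mutate=0.1, seed=1)
    # Print the power at a few logarithmically spaced frequency indices.
    for k in (1, 2, 4, 8, 16, 32, 64):
        print(k, power(seq, k))
```

Plotting `power(seq, k)` against `k` on log-log axes, averaged over several seeds, is one way to check how close the spectrum is to 1/f for a given `p_mutate`. The result depends on the mutation probability and on the sequence length, so this sketch makes no claim about the exact exponent.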
