pith. sign in

arxiv: 2606.02408 · v1 · pith:KAAZPWIKnew · submitted 2026-06-01 · 💻 cs.CC · q-bio.QM

Structure-Informed Multiple Sequence Alignment: A Formal Model and Hardness Results

classification 💻 cs.CC q-bio.QM
keywords fixedalignmentdesignatedlambdamodelmultipleproblemscore
0
0 comments X
read the original abstract

We formulate a structure-informed multiple sequence alignment problem, denoted MSA-S. The model abstracts biological sequences as strings and structural information as designated position-pairs. It augments a fixed pairwise string score, defined by a fixed non-gap symbol-pair scoring rule and fixed affine gap penalties, with a binary overlap score on designated position-pairs, which can be interpreted as a contact-map overlap score in structural applications. This yields a fixed-score, integer-valued optimization model suitable for complexity-theoretic analysis. Under this formulation, we show that the decision problem MSA-S-DEC is NP-complete for a broad class of fixed pairwise string scoring schemes. We also show that NP-hardness persists even under the restriction that every designated position-pair set is nonempty and the pair-overlap threshold is strictly positive. For the associated scalarized optimization problem MSA-S-OPT(lambda) with any fixed rational constant lambda >= 1, we further show that, under the canonical unit scheme for the non-gap symbol-pair scoring rule, MSA-S-OPT(lambda) admits no polynomial-time approximation scheme (PTAS) even for two input strings (k = 2), unless P = NP. These results establish a formal complexity-theoretic baseline for structure-informed multiple sequence alignment.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.