pith. sign in

arxiv: 1907.08865 · v1 · pith:LYMTYMTLnew · submitted 2019-07-20 · 💻 cs.CC · cs.DS· math.CO

Complexity of Modification Problems for Reciprocal Best Match Graphs

Pith reviewed 2026-05-24 18:29 UTC · model grok-4.3

classification 💻 cs.CC cs.DSmath.CO
keywords reciprocal best match graphsNP-hardnessgraph editingedge deletionvertex-colored graphscographsfixed-parameter tractabilityorthology detection
0
0 comments X

The pith

The decision versions of deleting or editing at most k edges to obtain reciprocal best match graphs from vertex-colored graphs are NP-hard.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Reciprocal best match graphs model pairs of genes that are closest relatives under an unknown species tree, with vertices colored by species. When sequence data produces noisy versions of these graphs, the task is to correct them by removing or changing a limited number of edges. The paper establishes that both the deletion problem and the editing problem are NP-hard in general. It also shows fixed-parameter tractability for the editing problem when the graph has only two colors, via a reduction to bicluster editing. The same NP-completeness carries over to the special case of modifying graphs into hierarchically colored cographs.

Core claim

The decision version of the deletion and editing problem to obtain RBMGs from vertex colored graphs is NP-hard. Using known results for the so-called bicluster editing, the RBMG editing problem for 2-colored graphs is fixed-parameter tractable. The decision problem of modifying a vertex-colored graph (either by edge-deletion or editing) into an RBMG with cograph structure or, equivalently, to an hierarchically colored cograph is NP-complete.

What carries the argument

Polynomial-time reductions from bicluster editing and other NP-complete problems that map the parameter k while preserving the property of being an RBMG or hierarchically colored cograph.

If this is right

  • Exact correction of noisy gene-relationship graphs cannot be done in polynomial time for general inputs unless P equals NP.
  • For two-color instances the editing problem admits algorithms whose running time is exponential only in the number of edits k.
  • Error correction for orthology detection via hierarchically colored cographs faces the same computational barrier.
  • Practical methods must use heuristics or approximation schemes rather than exact solutions on large instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hardness result implies that scaling exact RBMG correction to genome-wide data sets will require either strong restrictions on the input or a shift to different target structures.
  • The two-color FPT result may generalize to a fixed number of colors if analogous parameter-preserving reductions exist.
  • The equivalence between RBMG-cograph modification and hierarchical cograph modification suggests that orthology pipelines could import techniques from cograph editing literature.

Load-bearing premise

The reductions from bicluster editing preserve the exact parameter k and the biological target of an RBMG is the structure that noise-correction algorithms must reach.

What would settle it

A polynomial-time algorithm that solves the deletion problem for arbitrary vertex-colored graphs would show the claimed NP-hardness does not hold.

Figures

Figures reproduced from arXiv: 1907.08865 by Manuela Gei{\ss}, Marc Hellmuth, Peter F. Stadler.

Figure 1
Figure 1. Figure 1: The disconnected 3-RBMG (G, σ) is also an orthology graph and thus, by Thm. 1, an hc-cograph. For the tree (T, t, σ) we have G(T, σ) = (G, σ) as well as t(lcaT (x, y)) = 1 iff xy ∈ E(G). In other words, y is a best match of x if y is the closest relative of x in comparison with all other leaves from species σ(y), see [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
read the original abstract

Reciprocal best match graphs (RBMGs) are vertex colored graphs whose vertices represent genes and the colors the species where the genes reside. Edges identify pairs of genes that are most closely related with respect to an underlying evolutionary tree. In practical applications this tree is unknown and the edges of the RBMGs are inferred by quantifying sequence similarity. Due to noise in the data, these empirically determined graphs in general violate the condition of being a ``biologically feasible'' RBMG. Therefore, it is of practical interest in computational biology to correct the initial estimate. Here we consider deletion (remove at most $k$ edges) and editing (add or delete at most $k$ edges) problems. We show that the decision version of the deletion and editing problem to obtain RBMGs from vertex colored graphs is NP-hard. Using known results for the so-called bicluster editing, we show that the RBMG editing problem for $2$-colored graphs is fixed-parameter tractable. A restricted class of RBMGs appears in the context of orthology detection. These are cographs with a specific type of vertex coloring known as hierarchical coloring. We show that the decision problem of modifying a vertex-colored graph (either by edge-deletion or editing) into an RBMG with cograph structure or, equivalently, to an hierarchically colored cograph is NP-complete.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 4 minor

Summary. The manuscript proves that the decision versions of the minimum edge deletion and minimum edge editing problems for transforming a vertex-colored graph into a reciprocal best match graph (RBMG) are NP-hard. It further shows that the editing problem restricted to 2-colored graphs is fixed-parameter tractable by reduction to the known bicluster editing problem, and establishes NP-completeness of both deletion and editing when the target is restricted to RBMGs that are cographs (equivalently, hierarchically colored cographs).

Significance. If the reductions hold, the results delineate the computational boundary between tractable and intractable noise-correction tasks for RBMGs arising in orthology detection. The FPT result for 2-colored instances supplies a concrete algorithmic pathway for small k by direct appeal to bicluster editing, while the NP-completeness proof for the hierarchically colored cograph case isolates hardness within a subclass already studied for practical orthology applications.

minor comments (4)
  1. [§2] §2 (Preliminaries): the formal definition of an RBMG via the underlying species tree and the 'reciprocal best match' condition should be stated before the first hardness claim, as the subsequent reductions rely on it.
  2. [§4] The 2-colored FPT claim in §4 would be strengthened by an explicit statement that the reduction from bicluster editing preserves both the 2-coloring and the parameter k exactly.
  3. Figure 1 (or the first illustrative figure) would benefit from an explicit caption noting which edges are added or deleted in the editing example.
  4. [§6] A short paragraph comparing the new NP-completeness result for hierarchically colored cographs with the earlier cograph editing literature would help readers situate the contribution.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive assessment of the significance of our results on the complexity of modification problems for RBMGs. The recommendation for minor revision is noted; since no specific major comments were raised, we will incorporate any minor editorial suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity; standard reductions from independent problems

full rationale

The paper's central results are NP-hardness proofs for deletion/editing to RBMGs and NP-completeness for hierarchically colored cographs, obtained via (presumably polynomial) reductions from known problems such as bicluster editing. The FPT claim for 2-colored RBMG editing is obtained by direct appeal to existing bicluster-editing results. No equations or claims reduce by construction to the paper's own inputs, no parameters are fitted and renamed as predictions, and no load-bearing uniqueness theorems or ansatzes are imported via self-citation. The derivation chain relies on external, independently established hardness results rather than self-referential definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on the standard definition of RBMGs from prior computational-biology literature and on the known NP-completeness/FPT status of bicluster editing; no new free parameters or invented entities are introduced.

axioms (2)
  • standard math Bicluster editing is fixed-parameter tractable (used for the 2-colored case).
    Invoked to obtain the FPT result for 2-colored RBMG editing.
  • domain assumption RBMGs are defined by closest-gene pairs under an unknown species tree.
    Central modeling assumption that makes the modification problems biologically motivated.

pith-pipeline@v0.9.0 · 5786 in / 1308 out tokens · 23726 ms · 2026-05-24T18:29:21.495716+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 1 internal anchor

  1. [1]

    A. M. Altenhoff and C. Dessimoz. Phylogenetic and functional assessment of orthologs infer- ence projects and methods. PLoS Comput Biol , 5:e1000262, 2009

  2. [2]

    Standardized benchmarking in the quest for orthologs

    Adrian M Altenhoff, Brigitte Boeckmann, Salvador Capella-Gutierrez, Daniel A Dalquen, Todd DeLuca, Kristoffer Forslund, Jaime Huerta-Cepas, Benjamin Linard, C´ ecile Pereira, Leszek P Pryszcz, Fabian Schreiber, Alan Sousa da Silva, Damian Szklarczyk, Cl´ ement-Marie Train, Peer Bork, Odile Lecompte, Christian von Mering, Ioannis Xenarios, Kimmen Sj¨ olander...

  3. [3]

    N. Amit. The bicluster graph editing problem. Master’s thesis, Tel Aviv University, School of Mathematical Sciences, 2004

  4. [4]

    Sebastian B¨ ocker, Sebastian Briesemeister, and Gunnar W. Klau. Exact algorithms for cluster editing: Evaluation and experiments. Algorithmica, 60:316–334, 2008

  5. [5]

    Pardalos

    Stanislav Busygin, Oleg Prokopyev, and Panos M. Pardalos. Biclustering in data mining. Computers & Operations Research, 35(9):2964 – 2987, 2008. Part Special Issue: Bio-inspired Methods in Combinatorial Optimization

  6. [6]

    Fixed-parameter tractability of graph modification problems for hereditary prop- erties

    Leizhen Cai. Fixed-parameter tractability of graph modification problems for hereditary prop- erties. Information Processing Letters, 58(4):171 – 176, 1996

  7. [7]

    Sullivan, and Michael R

    Guanhua Chen, Patrick F. Sullivan, and Michael R. Kosorok. Biclustering with heterogeneous variance. Proceedings of the National Academy of Sciences , 110(30):12253–12258, 2013

  8. [8]

    Biclustering of expression data

    Y Cheng and GM Church. Biclustering of expression data. In Proceedings. International Conference on Intelligent Systems for Molecular Biology , volume 8, pages 93–103, 2000

  9. [9]

    C. A. Christen and S. M. Selkow. Some perfect coloring properties of graphs. J. Comb. Th., Ser. B, 27:49–59, 1979

  10. [10]

    D. G. Corneil, H. Lerchs, and L. Steward Burlingham. Complement reducible graphs. Discr. Appl. Math., 3:163–174, 1981

  11. [11]

    On bipartite and multipartite clique problems

    Milind Dawande, Pinar Keskinocak, Jayashankar M Swaminathan, and Sridhar Tayur. On bipartite and multipartite clique problems. J. Algorithms, 41:388–403, 2001

  12. [12]

    de Sousa Filho, Teobaldo L

    Gilberto F. de Sousa Filho, Teobaldo L. Bulh˜ oes J´ unior, Lucidio A. F. Cabral, Luiz Satoru Ochi, and F´ abio Protti. New heuristics for the bicluster editing problem.Annals of Operations Research, 258:781–814, 2017

  13. [13]

    Hybrid metaheuristic for bicluster editing problem

    Gilberto F de Sousa Filho, F Cabral Lucidio dos Anjos, Luiz Satoru Ochi, and F´ abio Protti. Hybrid metaheuristic for bicluster editing problem. Electronic Notes in Discrete Mathematics, 39:35–42, 2012

  14. [14]

    P. G. Drange, F Reidl, F. S. Villaamil, and S. Sikdar. Fast biclustering by dual parameteri- zation. In Thore Husfeldt and Iyad Kanj, editors, 10th International Symposium on Parame- terized and Exact Computation (IPEC 2015) , volume 43 of Leibniz International Proceedings in Informatics (LIPIcs) , pages 402–413, Dagstuhl, Germany, 2015. Schloss Dagstuhl–...

  15. [15]

    Drange, Felix Reidl, Fernando S´ anchez Villaamil, and Somnath Sikdar

    P.G. Drange, Felix Reidl, Fernando S´ anchez Villaamil, and Somnath Sikdar. Fast Biclustering by Dual Parameterization. In Thore Husfeldt and Iyad Kanj, editors, 10th International Symposium on Parameterized and Exact Computation (IPEC 2015) , volume 43 of Leibniz International Proceedings in Informatics (LIPIcs) , pages 402–413, Dagstuhl, Germany, 2015. ...

  16. [16]

    Walter M. Fitch. Homology: a personal view on some of the problems. Trends Genet., 16:227– 231, 2000

  17. [17]

    Toni Gabald´ on and Eugene V. Koonin. Functional and evolutionary implications of gene orthology. Nat Rev Genet. , 14:360–366, 2013. 10

  18. [18]

    M. Geiß, M. Gonz´ alez Laffitte, A. L´ opez S´ anchez, D. I. Valdivia, M. Hellmuth, M. Hern´ andez Rosales, and P. F. Stadler. Best match graphs and reconciliation of gene trees with species trees. CoRR, 2019. arXiv:1904.12021

  19. [19]

    Valdivia, Marc Hellmuth, Maribel Hern´ andez Rosales, and Peter F Stadler

    Manuela Geiß, Edgar Ch´ avez, Marcos Gonz´ alez Laffitte, Alitzel L´ opez S´ anchez, B¨ arbel M R Stadler, Dulce I. Valdivia, Marc Hellmuth, Maribel Hern´ andez Rosales, and Peter F Stadler. Best match graphs. J. Math. Biol. , 78(7):2015–2057, 2019

  20. [20]

    Manuela Geiß, Marc Hellmuth, and Peter F. Stadler. Reciprocal best match graphs. CoRR,

  21. [21]

    arXiv q-bio 1903.07920

  22. [22]

    RGFA: powerful and convenient handling of assembly graphs

    Giorgio Gonnella and Stefan Kurtz. RGFA: powerful and convenient handling of assembly graphs. Peer J., 4:e2681, 2016

  23. [23]

    Improved algorithms for bicluster editing

    Jiong Guo, Falk H¨ uffner, Christian Komusiewicz, and Yong Zhang. Improved algorithms for bicluster editing. In Manindra Agrawal, Dingzhu Du, Zhenhua Duan, and Angsheng Li, editors, Theory and Applications of Models of Computation , pages 445–456, Berlin, Heidelberg, 2008. Springer

  24. [24]

    Hellmuth, M

    M. Hellmuth, M. Hernandez-Rosales, K. T. Huber, V. Moulton, P. F. Stadler, and N. Wieseke. Orthology relations, symbolic ultrametrics, and cographs. J. Math. Biology, 66:399–420, 2013

  25. [25]

    Marc Hellmuth, Nicolas Wieseke, Marcus Lechner, Hans-Peter Lenhof, Martin Middendorf, and Peter F. Stadler. Phylogenetics from paralogs. Proc. Natl. Acad. Sci. USA , 112:2058– 2063, 2015

  26. [26]

    Hochbaum

    Dorit S. Hochbaum. Approximating clique and biclique problems. J. Algorithms, 29:174–200, 1998

  27. [27]

    Jamison and S

    B. Jamison and S. Olariu. Recognizing p4-sparse graphs in linear time. SIAM J. Computing , 21:381–406, 1992

  28. [28]

    Huang, and J

    Mihee Lee, Haipeng Shen, Jianhua Z. Huang, and J. S. Marron. Biclustering via sparse singular value decomposition. Biometrics, 66(4):1087–1095, 2010

  29. [29]

    Complexity and parameterized algorithms for cograph editing

    Yunlong Liu, Jianxin Wang, Jiong Guo, and Jianer Chen. Complexity and parameterized algorithms for cograph editing. Theor. Comp. Sci. , 461:45–54, 2012

  30. [30]

    R. Peeters. The maximum edge biclique problem is NP-complete. Discrete Applied Mathe- matics, 131:651–654, 2003

  31. [31]

    Applying modular decom- position to parameterized cluster editing problems

    F´ abio Protti, Maise Dantas da Silva, and Jayme Luiz Szwarcfiter. Applying modular decom- position to parameterized cluster editing problems. Theory of Computing Systems , 44:91–104, 2009

  32. [32]

    Correlation clustering and biclustering with locally bounded errors

    Gregory J Puleo and Olgica Milenkovic. Correlation clustering and biclustering with locally bounded errors. IEEE Transactions on Information Theory , 64(6):4105–4119, 2018

  33. [33]

    Setubal and Peter F

    Jo˜ ao C. Setubal and Peter F. Stadler. Gene phyologenies and orthologous groups. In Jo˜ ao C. Setubal, Peter F. Stadler, and Jens Stoye, editors, Comparative Genomics, volume 1704, pages 1–28. Springer, Heidelberg, 2018

  34. [34]

    Bi-Force: large- scale bicluster editing and its application to gene expression data biclustering

    Nora K Speicher, Richard R¨ ottger, Peng Sun, Jan Baumbach, and Jiong Guo. Bi-Force: large- scale bicluster editing and its application to gene expression data biclustering. Nucleic Acids Research, 42:e78–e78, 2014

  35. [35]

    Tatusov, Eugene V

    Roman L. Tatusov, Eugene V. Koonin, and David J. Lipman. A genomic perspective on protein families. Science, 278:631–637, 1997

  36. [36]

    Improved biclustering of microarray data demonstrated through systematic performance tests

    Heather Turner, Trevor Bailey, and Wojtek Krzanowski. Improved biclustering of microarray data demonstrated through systematic performance tests. Computational Statistics & Data Analysis, 48(2):235 – 254, 2005

  37. [37]

    D. I. Valdivia, M. Geiß, M. Hellmuth, M. Hern´ andez Rosales, and P. F. Stadler. Hierarchical colorings of cographs. CoRR, 2019. arXiv:1906.10031. 11