High-Order Epistasis Detection Using Factorization Machine with Quadratic Optimization Annealing and MDR-Based Evaluation
Pith reviewed 2026-05-16 17:33 UTC · model grok-4.3
The pith
A factorization machine with quadratic-optimization annealing (FMQA) recovers planted high-order epistasis in simulated data by minimizing the MDR classification error rate as a black-box objective.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We define the epistasis detection problem as a black-box optimization problem and solve it with a factorization machine with quadratic-optimization annealing (FMQA). The classification error rate (CER) computed by MDR is used as a black-box objective function. Experimental evaluations were conducted using simulated case-control datasets with predefined high-order epistasis. The results demonstrate that the proposed method successfully identified ground-truth epistasis across various interaction orders and the numbers of genetic loci within a limited number of iterations.
What carries the argument
A factorization machine with quadratic-optimization annealing (FMQA) that treats the MDR classification error rate (CER) as a black-box objective to be minimized.
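To make the black-box objective concrete, here is a minimal sketch of an MDR-style classification error rate for one candidate locus subset. It rests on assumptions not confirmed by this page: genotypes are coded 0/1/2, labels are 1 for cases and 0 for controls, the high-risk threshold is the overall case/control ratio, and the error is computed on the full dataset without cross-validation. The function name `mdr_cer` is hypothetical.

```python
import numpy as np

def mdr_cer(genotypes, labels, loci):
    """Classification error rate of an MDR-style classifier on the chosen loci.

    genotypes: (n_samples, n_loci) integers in {0, 1, 2} (assumed coding)
    labels:    (n_samples,) with 1 = case, 0 = control (assumed coding)
    loci:      column indices of the candidate locus subset
    """
    genotypes = np.asarray(genotypes)
    labels = np.asarray(labels)
    sub = genotypes[:, list(loci)]

    # Map each multi-locus genotype to a single base-3 cell index.
    cells = sub @ (3 ** np.arange(sub.shape[1]))
    n_cells = 3 ** sub.shape[1]

    cases = np.bincount(cells[labels == 1], minlength=n_cells)
    controls = np.bincount(cells[labels == 0], minlength=n_cells)
    n_cases, n_controls = cases.sum(), controls.sum()

    # A cell is high risk when its case/control ratio reaches the overall
    # ratio; cross-multiplying avoids division by zero in empty cells.
    high_risk = cases * n_controls >= controls * n_cases

    # High-risk cells predict "case", so their controls are errors;
    # low-risk cells predict "control", so their cases are errors.
    errors = np.where(high_risk, controls, cases).sum()
    return errors / labels.size
```

An optimizer only ever sees the returned number for a proposed subset, which is what makes the objective a black box from its point of view.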
If this is right
- High-order interactions become searchable without enumerating every possible locus combination.
- The number of loci and interaction order that can be examined grows beyond the reach of exhaustive MDR.
- MDR error rates integrate directly into an iterative optimizer that converges in limited steps on simulated data.
- Detection performance holds across multiple tested interaction orders and total locus counts.
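The enumeration cost behind the first two points can be put in numbers. The sketch below counts candidate locus subsets for a given interaction order and, as a reference point, gives the chance that uniform random sampling without replacement evaluates a single planted subset within a fixed budget. The locus counts, orders, and function names are illustrative assumptions, not values taken from the paper.

```python
from math import comb

def search_space_size(n_loci: int, order: int) -> int:
    """Number of distinct locus subsets of the given interaction order."""
    return comb(n_loci, order)

def random_hit_probability(n_loci: int, order: int, budget: int) -> float:
    """Chance that `budget` distinct uniformly drawn subsets include
    the one planted subset (sampling without replacement)."""
    total = search_space_size(n_loci, order)
    return min(budget, total) / total

if __name__ == "__main__":
    # Illustrative settings only; not the paper's experimental grid.
    for n_loci in (20, 100, 1000):
        for order in (2, 3, 5):
            print(n_loci, order, search_space_size(n_loci, order))
```

For example, 100 loci at order 3 already give 161,700 subsets, each of which exhaustive MDR would have to score.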
Where Pith is reading between the lines
- The same black-box formulation could be applied to other evaluation functions beyond MDR if they can be computed for candidate subsets.
- Real-world genetic studies would still need separate validation on data with unknown ground truth to confirm transfer from simulation.
- Hybrid pipelines that seed FMQA with prior biological knowledge could further reduce the number of required evaluations.
Load-bearing premise
Success on simulated data that contain artificially planted high-order epistasis will translate to real genetic data whose interaction patterns and noise structures are unknown.
What would settle it
The optimizer fails to recover the planted loci when run on a new collection of simulated datasets generated by the same process but with higher noise levels or altered interaction strengths.
Original abstract
Detecting high-order epistasis is a fundamental challenge in genetic association studies due to the combinatorial explosion of candidate locus combinations. Although multifactor dimensionality reduction (MDR) is a widely used method for evaluating epistasis, exhaustive MDR-based searches become computationally infeasible as the number of loci or the interaction order increases. In this paper, we define the epistasis detection problem as a black-box optimization problem and solve it with a factorization machine with quadratic-optimization annealing (FMQA). We propose an efficient epistasis detection method based on FMQA, in which the classification error rate (CER) computed by MDR is used as a black-box objective function. Experimental evaluations were conducted using simulated case-control datasets with predefined high-order epistasis. The results demonstrate that the proposed method successfully identified ground-truth epistasis across various interaction orders and the numbers of genetic loci within a limited number of iterations. These results indicate that the proposed method is effective and computationally efficient for high-order epistasis detection.
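FMQA's surrogate is a second-order factorization machine, whose prediction on binary inputs is a quadratic function and can therefore be handed to a QUBO solver. The sketch below shows that conversion and nothing more. It assumes one bit per locus, which this page does not confirm, and the names `fm_predict` and `fm_to_qubo` are hypothetical.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    x: (n,) input vector, w0: bias, w: (n,) linear weights,
    V: (n, k) factor matrix whose rows are the latent vectors v_i.
    """
    x = np.asarray(x, dtype=float)
    s = V.T @ x
    # Rendle's identity for sum_{i<j} <v_i, v_j> x_i x_j.
    pair = 0.5 * (s @ s - ((V ** 2).T @ (x ** 2)).sum())
    return w0 + w @ x + pair

def fm_to_qubo(w, V):
    """Upper-triangular Q with fm_predict(x) == w0 + x @ Q @ x for binary x."""
    Q = np.triu(V @ V.T, k=1)          # pairwise couplings <v_i, v_j>, i < j
    Q[np.diag_indices_from(Q)] = w     # linear terms sit on the diagonal (x_i^2 = x_i)
    return Q
```

The linear weights can sit on the diagonal only because `x_i ** 2 == x_i` for binary variables, so the identity does not hold for general real-valued inputs.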
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript frames high-order epistasis detection as a black-box combinatorial optimization problem and proposes solving it with a factorization machine with quadratic-optimization annealing (FMQA), using the classification error rate (CER) obtained from multifactor dimensionality reduction (MDR) as the objective function. Experiments on simulated case-control datasets containing predefined high-order epistatic interactions report that the method recovers the planted ground-truth combinations across varying interaction orders and numbers of loci within a limited iteration budget.
Significance. If the recovery performance is shown to be robust and superior to standard optimizers, the approach could offer a scalable heuristic for high-order interaction searches that are otherwise intractable by exhaustive MDR enumeration, potentially aiding genetic association studies.
major comments (3)
- [Abstract / Experimental evaluations] Abstract and Experimental evaluations section: the central claim that the method 'successfully identified ground-truth epistasis' is presented without any quantitative recovery rates, success fractions over repeated trials, mean or median iteration counts, or failure-mode statistics, leaving the empirical support for the claim unquantified.
- [Experimental evaluations] Experimental evaluations section: no baseline comparisons are reported against standard black-box optimizers (e.g., plain simulated annealing, genetic algorithms, or random sampling) or against exhaustive MDR where computationally feasible, so it is impossible to determine whether the observed recoveries are attributable to FMQA or to the simulation design.
- [Experimental evaluations] Experimental evaluations section: all reported results use only simulated datasets with artificially planted interactions; the absence of any real GWAS panel experiments means the method's behavior under realistic noise structures, linkage disequilibrium, and unknown interaction patterns remains untested.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, indicating where revisions have been made to strengthen the empirical support and clarify the scope of the work.
- Referee: [Abstract / Experimental evaluations] Abstract and Experimental evaluations section: the central claim that the method 'successfully identified ground-truth epistasis' is presented without any quantitative recovery rates, success fractions over repeated trials, mean or median iteration counts, or failure-mode statistics, leaving the empirical support for the claim unquantified.
  Authors: We agree that the original presentation lacked quantitative backing for the recovery claim. In the revised manuscript, we have added explicit metrics including success fractions (recovery rate over 50 repeated trials per setting), mean and median iteration counts to first recovery of the ground-truth combination, and failure-mode analysis (e.g., convergence to non-ground-truth local optima). These are now reported in the Experimental evaluations section and summarized in the abstract.
  Revision: yes
- Referee: [Experimental evaluations] Experimental evaluations section: no baseline comparisons are reported against standard black-box optimizers (e.g., plain simulated annealing, genetic algorithms, or random sampling) or against exhaustive MDR where computationally feasible, so it is impossible to determine whether the observed recoveries are attributable to FMQA or to the simulation design.
  Authors: We acknowledge the need for baselines to isolate the contribution of FMQA. We have incorporated new experiments comparing FMQA against plain simulated annealing, genetic algorithms, and uniform random sampling on identical simulated datasets and iteration budgets. For the smallest instances (where exhaustive MDR remains tractable), we also report direct recovery comparisons against exhaustive enumeration. The revised results show FMQA achieving higher recovery rates and lower iteration counts than the baselines.
  Revision: yes
- Referee: [Experimental evaluations] Experimental evaluations section: all reported results use only simulated datasets with artificially planted interactions; the absence of any real GWAS panel experiments means the method's behavior under realistic noise structures, linkage disequilibrium, and unknown interaction patterns remains untested.
  Authors: The study is deliberately scoped to simulated data with known ground-truth interactions to enable precise quantitative recovery evaluation. We do not include real GWAS results in the current manuscript, as such validation would require separate handling of unknown interactions, population structure, and linkage disequilibrium and is beyond the scope of this work. We have expanded the discussion section to explicitly note this limitation and identify real-data application as future research.
  Revision: partial
- Real GWAS panel experiments, as the manuscript is designed around controlled simulations with known ground truth and adding such experiments would require substantial new data access, preprocessing, and evaluation methodology.
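The recovery metrics named in the first response (success fraction over repeated trials, mean and median iterations to first recovery) can be computed along the following lines. This is a sketch under assumptions: each trial is summarized by the iteration at which the ground-truth combination was first evaluated, or `None` if it never was, and the function name `summarize_recovery` is hypothetical. Any numbers passed in are the caller's own; none are taken from the paper.

```python
from statistics import mean, median

def summarize_recovery(first_hit_iterations):
    """Summarize repeated optimization trials.

    first_hit_iterations: one entry per trial, holding the iteration at
    which the planted combination was first found, or None on failure.
    """
    hits = [it for it in first_hit_iterations if it is not None]
    n_trials = len(first_hit_iterations)
    return {
        "trials": n_trials,
        "success_fraction": len(hits) / n_trials if n_trials else float("nan"),
        # Iteration statistics are taken over successful trials only.
        "mean_iterations": mean(hits) if hits else None,
        "median_iterations": median(hits) if hits else None,
    }
```

Reporting iteration statistics over successful trials only is a design choice; it should be read alongside the success fraction, since a method that rarely succeeds can still show a low mean.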
Circularity Check
No circularity; MDR objective is independent of FMQA optimizer
The paper frames epistasis detection as a black-box optimization task solved by FMQA, using classification error rate (CER) computed via the established MDR procedure as the objective function. This objective is supplied externally and is not derived from or fitted within the FMQA component. No equations reduce by construction to their own inputs, no parameters are fitted on a subset and then relabeled as predictions, and no load-bearing claims rest on self-citations whose content is unverified or equivalent to the present result. The reported success consists of empirical recovery on simulated datasets with planted interactions; this is an application of an external optimizer to an independent metric rather than a self-referential derivation.
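The separation described in this rationale can be seen in the shape of a generic surrogate-based loop, sketched below. The objective enters only as a callable, and the surrogate is refit on its returned values. This is a structural skeleton under assumptions, not the paper's implementation: `fit` stands in for training the factorization machine, `propose` stands in for annealing on the resulting quadratic model, and all names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]

def surrogate_loop(
    black_box: Callable[[Bits], float],
    fit: Callable[[Sequence[Bits], Sequence[float]], object],
    propose: Callable[[object, set], Bits],
    initial: Sequence[Bits],
    n_iterations: int,
):
    """Alternate between fitting a surrogate and evaluating its proposal."""
    xs = list(initial)
    ys = [black_box(x) for x in xs]      # objective is only ever called, never inspected
    for _ in range(n_iterations):
        model = fit(xs, ys)              # placeholder for FM training
        x_new = propose(model, set(xs))  # placeholder for QUBO annealing
        xs.append(x_new)
        ys.append(black_box(x_new))
    best = min(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best]
```

Swapping in a different scoring function requires no change to the loop, which is the sense in which the objective is supplied externally.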
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We define the epistasis detection problem as a black-box optimization problem and solve it with a factorization machine with quadratic-optimization annealing (FMQA)... CER computed by MDR is used as a black-box objective function."
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The FM model... can be converted into a QUBO formulation... solved using Ising machines."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Improving FMQA via Initial Training Data Design Considering Marginal Bit Coverage in One-Hot Encoding
  Ensuring complete marginal bit coverage in initial data for one-hot encoded FMQA improves mean optimization performance on wing-shape benchmarks with 17 and 32 variables.