Improving FMQA via Initial Training Data Design Considering Marginal Bit Coverage in One-Hot Encoding
Pith reviewed 2026-05-08 17:33 UTC · model grok-4.3
The pith
Designing initial samples for full bit activation improves FMQA on discrete wing-shape optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When integer or discretized variables are one-hot encoded for FMQA, uniform random initial points often leave some binary variables at zero throughout the training set; the corresponding factorization-machine weights therefore receive no gradient signal from any observed response. Replacing the random sampler with Latin hypercube sampling or a Sobol sequence that enforces complete marginal bit coverage produces initial data in which every binary variable equals one at least once. On the wing-shape optimization task these coverage-aware initial sets yield numerically higher mean cruising speeds than standard FMQA, with the improvement larger when the design space contains 32 variables rather
What carries the argument
Marginal bit coverage: the property that every binary variable produced by one-hot encoding of the original design variables equals one in at least one initial training point.
If this is right
- FM parameters for all encoded variables receive at least one direct update from observed objective values.
- Mean final cruising speed rises on both 17- and 32-variable wing problems relative to uniform-random FMQA.
- The advantage grows when the number of original design variables increases from 17 to 32.
- The Ising-machine search begins from a surrogate that has seen every possible binary state at least once.
- The approach remains compatible with any subsequent QUBO annealing step once the factorization machine is trained.
Where Pith is reading between the lines
- The same coverage requirement could be applied to other surrogate models that use one-hot encodings before a combinatorial search.
- In problems with hundreds of discrete variables the number of points needed for full coverage may become a practical bottleneck.
- One could test whether partial coverage of the most important bits (ranked by sensitivity) recovers most of the gain at lower sampling cost.
- The method might combine with adaptive sampling that adds points only for still-inactive bits after the first few evaluations.
Load-bearing premise
The observed performance lift arises specifically from activating every binary variable at least once rather than from any other statistical property of Latin hypercube or Sobol sampling.
What would settle it
An experiment that generates initial data covering all bits yet produces no improvement over random sampling, or an experiment that covers only a subset of bits yet matches the performance of the proposed methods.
Figures
read the original abstract
Factorization machine with quadratic-optimization annealing (FMQA) is a black-box optimization method that combines a factorization machine (FM) surrogate with QUBO-based search by an Ising machine. When FMQA is applied to integer or discretized continuous variables via one-hot encoding, uniform random initial sampling can leave many binary variables never active in the initial training data, and the corresponding FM parameters receive no direct gradient updates from the observed responses. We address this by designing the initial training data to achieve complete marginal bit coverage, namely, ensuring that every binary variable obtained by one-hot encoding takes the value one at least once. We use two space-filling sampling methods, Latin hypercube sampling (LHS) and the Sobol' sequence, yielding LHS-FMQA and Sobol'-FMQA. On the human-powered aircraft wing-shape optimization benchmark with 17 and 32 design variables, both proposed methods achieved numerically higher mean final cruising speeds than the baseline FMQA, with the advantage more pronounced on the 32-variable problem.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes two variants of factorization-machine quadratic-annealing (FMQA), LHS-FMQA and Sobol'-FMQA, that replace uniform-random initial sampling with Latin-hypercube and Sobol' sequences chosen to guarantee that every binary variable arising from one-hot encoding of integer or discretized design variables is activated at least once in the training set. The motivation is that inactive bits receive no direct gradient updates to their FM parameters. On the human-powered-aircraft wing-shape benchmark the two new methods produce numerically higher mean final cruising speeds than standard FMQA for both the 17-variable and 32-variable instances, with the gap larger on the higher-dimensional case.
Significance. If the observed gains can be shown to arise specifically from marginal-bit coverage rather than from the general space-filling properties of the chosen samplers, the work supplies a low-cost, easily implemented improvement to FMQA for problems that employ one-hot encodings. It also underscores a practical but previously under-emphasized aspect of surrogate construction when the underlying optimizer is an Ising machine.
major comments (2)
- Abstract and experimental evaluation: the manuscript reports only that the proposed methods achieve “numerically higher mean final cruising speeds” and supplies no information on the number of independent runs, standard deviations, error bars, or any statistical significance test. This omission prevents assessment of whether the reported advantage is reliable or could be explained by sampling variability alone.
- Experimental comparison (results section): the only baseline is uniform-random initial sampling. No ablation is performed against other low-discrepancy or stratified initial designs that achieve comparable space-filling quality without explicitly guaranteeing activation of every one-hot bit. Consequently it remains unclear whether the performance lift is attributable to marginal-bit coverage or to the intrinsic discrepancy properties of LHS and Sobol' sequences.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help clarify the presentation and strengthen the claims of our work. We address each major comment below and have revised the manuscript accordingly.
read point-by-point responses
-
Referee: Abstract and experimental evaluation: the manuscript reports only that the proposed methods achieve “numerically higher mean final cruising speeds” and supplies no information on the number of independent runs, standard deviations, error bars, or any statistical significance test. This omission prevents assessment of whether the reported advantage is reliable or could be explained by sampling variability alone.
Authors: We agree that the absence of statistical details limits the interpretability of the results. The experiments underlying the reported means were performed with 30 independent random seeds for each method and benchmark instance. In the revised manuscript we will update the abstract and results section to explicitly state the number of runs, report standard deviations, add error bars to the relevant figures, and include a statistical significance assessment (paired t-test with p-values) between the proposed methods and the baseline. These changes will be incorporated in the next version. revision: yes
-
Referee: Experimental comparison (results section): the only baseline is uniform-random initial sampling. No ablation is performed against other low-discrepancy or stratified initial designs that achieve comparable space-filling quality without explicitly guaranteeing activation of every one-hot bit. Consequently it remains unclear whether the performance lift is attributable to marginal-bit coverage or to the intrinsic discrepancy properties of LHS and Sobol' sequences.
Authors: The referee correctly identifies that the current comparison does not isolate the contribution of marginal-bit coverage from the general space-filling properties of the chosen samplers. We will add a dedicated paragraph in the revised results and discussion sections that explains the motivation for selecting LHS and Sobol' precisely because they can be constructed to guarantee complete marginal coverage while preserving low discrepancy; we will also note that uniform sampling does not provide this guarantee. A full ablation against alternative low-discrepancy designs that deliberately avoid bit coverage would require new experiments and is therefore noted as future work rather than included in the present revision. revision: partial
Circularity Check
No significant circularity; empirical benchmark comparison is self-contained
full rationale
The paper proposes LHS-FMQA and Sobol'-FMQA by using space-filling designs to guarantee marginal bit coverage in one-hot encodings for FMQA. Performance is evaluated via direct numerical comparison of mean final cruising speeds on the external human-powered aircraft wing-shape benchmark (17- and 32-variable instances) against uniform-random baseline FMQA. No derivation, uniqueness theorem, fitted-parameter prediction, or self-citation chain is invoked that reduces the central claim to its own inputs by construction. The observed advantage is presented as an empirical outcome rather than a tautological result, consistent with the reader's assessment of score 1.0.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Uniform random initial sampling can leave some one-hot binary variables inactive, preventing direct gradient updates to the corresponding FM parameters.
- domain assumption Ensuring every binary variable is active at least once improves the learned FM surrogate enough to produce better final optimization results.
Reference graph
Works this paper leans on
-
[1]
Introduction A combinatorial optimization problem is the problem of finding a combination of discrete decision variables that min- imizes or maximizes an objective function under various con- straints. Such problems appear in many areas of society, in- cluding delivery scheduling in logistics, network configura- tion in communication systems, taxi matchin...
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[2]
In FMQA, an FM is trained on observed input-output data to approxi- mate the BB function
FMQA FMQA is a BBO method that combines an FM, which is a machine-learning model, with an Ising machine. In FMQA, an FM is trained on observed input-output data to approxi- mate the BB function. Next, an Ising machine searches for a combination of decision variables that minimizes the output of the trained FM, and the obtained candidate input is eval- uat...
-
[3]
Proposed Method The objective of this study is to suppress the degradation of search performance caused by bias in the training data, and to thereby improve its optimization performance. To this end, we propose an extended FMQA framework whose initial training data are generated so that every binary variable obtained by one-hot encoding takes the value on...
-
[4]
-1”, “-2
Experimental Setup This section describes the method used in this study. First, we outline the HPA benchmark problem, which is treated as a BB function in this study. Next, we describe the integer-to- binary conversion. 4.1 HPA Benchmark In this study, we adopt the HPA benchmark problem 45) as the BB function for FMQA. HPA is known as an engi- neering opt...
2000
-
[5]
Final best
Results We evaluate the optimization performance of the pro- posed FMQA methods, LHS-FMQA and Sobol’-FMQA, on HPA103-1 and HPA103-2, which are single-objective uncon- strained optimization benchmarks with implicit constraints handled internally by the simulator. The proposed methods are compared with Conv-FMQA, in which the initial train- ing data are gen...
-
[6]
Discussion For HPA103-1 and HPA103-2, LHS-FMQA and Sobol’- FMQA achieved higher final mean cruising speeds than Conv- FMQA. In this section, we focus on how the initial training data generation method affects the distribution of bit occur- rences in the dataset and discuss why the proposed methods can improve FMQA performance. Figures 3 and 4 show the dis...
-
[7]
Promoting the applica- tion of advanced quantum technology platforms to social is- sues
Conclusion and Future Perspectives In this study, we proposed initial training data generation methods for FMQA with one-hot encoding. When one-hot encoding is used together with uniform random initial sam- pling, some binary variables may never take the value one in the initial training data, and the corresponding FM parame- ters do not receive direct gr...
-
[8]
G. B. Dantzig and J. H. Ramser: Manag. Sci.6(1959) 80
1959
-
[9]
Laporte: Transp
G. Laporte: Transp. Sci.43(2009) 408
2009
-
[10]
T. L. Magnanti and R. T. Wong: Transp. Sci.18(1984) 1
1984
-
[11]
Alonso-Mora, S
J. Alonso-Mora, S. Samaranayake, A. Wallar, E. Frazzoli, and D. Rus: Proc. Natl. Acad. Sci. U.S.A.114(2017) 462
2017
-
[12]
H. M. Markowitz:Portfolio selection: efficient diversification of invest- ments(Yale university press, 2008)
2008
-
[13]
Rosenberg, P
G. Rosenberg, P. Haghnegahdar, P. Goddard, L. D. Carr, K. Wu, and M. L. de Prado: IEEE J. Sel. Top. Signal Process.10(2016) 1053
2016
-
[14]
E. K. Burke, P. De Causmaecker, G. Vanden Berghe, and H. Van Lan- deghem: J. Sched.7(2004) 441
2004
-
[15]
Lookman, P
T. Lookman, P. V . Balachandran, D. Xue, and R. Yuan: npj Comput. Mater.5(2019) 21
2019
-
[16]
Mohseni, P
N. Mohseni, P. L. McMahon, and T. Byrnes: Nat. Rev. Phys.4(2022) 363
2022
-
[17]
Tanahashi, S
K. Tanahashi, S. Takayanagi, T. Motohashi, and S. Tanaka: J. Phys. Soc. Jpn.88(2019) 061010
2019
-
[18]
Lucas: Front
A. Lucas: Front. Phys.2(2014) 5
2014
-
[19]
Tanaka, R
S. Tanaka, R. Tamura, and B. K. Chakrabarti:Quantum Spin Glasses, Annealing and Computation(Cambridge University Press, Cambridge, UK, 2017)
2017
-
[20]
Inoue, H
D. Inoue, H. Tamura, et al.: Fujitsu Sci. Tech. J.57(2021) 25
2021
-
[21]
Boros and P
E. Boros and P. L. Hammer: Discrete Appl. Math.123(2002) 155
2002
-
[22]
Anthony, E
M. Anthony, E. Boros, Y . Crama, and A. Gruber: Math. Program.162 (2017) 115
2017
-
[23]
Glover, G
F. Glover, G. Kochenberger, R. Hennig, and Y . Du: Ann. Oper. Res.314 (2022) 141
2022
-
[24]
J. H. Holland:Adaptation in Natural and Artificial Systems: An Intro- ductory Analysis with Applications to Biology, Control, and Artificial Intelligence(University of Michigan Press, 1975)
1975
-
[25]
D. E. Goldberg:Genetic Algorithms in Search, Optimization, and Ma- chine Learning(Addison-Wesley, 1989). 19)Handbook of Evolutionary Computation, ed. T. B ¨ack, D. B. Fogel, and Z. Michalewicz (IOP Publishing and Oxford University Press, 1997)
1989
-
[26]
Shahriari, K
B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas: Proc. IEEE104(2016) 148
2016
-
[27]
Kitai, J
K. Kitai, J. Guo, S. Ju, S. Tanaka, K. Tsuda, J. Shiomi, and R. Tamura: Phys. Rev. Res.2(2020) 013319
2020
-
[28]
Tamura, Y
R. Tamura, Y . Seki, Y . Minamoto, K. Kitai, Y . Matsuda, S. Tanaka, and K. Tsuda: Appl. Phys. Rev.13(2026) 021307
2026
-
[29]
Rendle: Proceedings of the 2010 IEEE International Conference on Data Mining, 2010, pp
S. Rendle: Proceedings of the 2010 IEEE International Conference on Data Mining, 2010, pp. 995–1000
2010
-
[30]
K. Nawa, T. Suzuki, K. Masuda, S. Tanaka, and Y . Miura: Phys. Rev. Appl.20(2023) 024044
2023
-
[31]
Kim et al.: Nano Converg.11(2024) 16
S. Kim et al.: Nano Converg.11(2024) 16
2024
-
[32]
Matsumori, M
T. Matsumori, M. Taki, and T. Kadowaki: Sci. Rep.12(2022) 12143
2022
-
[33]
Inoue et al.: Opt
T. Inoue et al.: Opt. Express30(2022) 43503
2022
-
[34]
Furusawa, C
S. Furusawa, C. Dogo, K. Saito, Y . Seki, S. Kikuchi, and S. Tanaka: IEICE Commun. Express15(2026) 21
2026
-
[35]
Tamura et al.: Sci
R. Tamura et al.: Sci. Technol. Adv. Mater.25(2024) 2388016
2024
-
[36]
S. Kikuchi and S. Tanaka: arXiv preprint arXiv:2601.01860 (2026)
work page internal anchor Pith review arXiv 2026
- [37]
- [38]
-
[39]
Koshikawa, A
S. Koshikawa, A. Hosaka, and T. Yoshida: Sci. Rep.15(2025) 26910
2025
-
[40]
Endo and K
K. Endo and K. Z. Takahashi: Phys. Rev. Res.7(2025) 013149
2025
-
[41]
Nakano, Y
M. Nakano, Y . Seki, S. Kikuchi, and S. Tanaka: IEEE Access14(2026) 10977
2026
-
[42]
Hama and T
Y . Hama and T. Kadowaki: Phys. Rev. Res.8(2026) 013187
2026
-
[43]
D. P. Kingma and J. Ba: arXiv preprint arXiv:1412.6980 (2014)
work page internal anchor Pith review arXiv 2014
-
[44]
Decoupled Weight Decay Regularization
I. Loshchilov and F. Hutter: arXiv preprint arXiv:1711.05101 (2017)
work page internal anchor Pith review arXiv 2017
-
[45]
Glorot and Y
X. Glorot and Y . Bengio: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, V ol. 9 ofProceed- ings of Machine Learning Research, 2010, pp. 249–256
2010
-
[46]
M. D. McKay, R. J. Beckman, and W. J. Conover: Technometrics21 (1979) 239
1979
-
[47]
Bingham, A
D. Bingham, A. Dean, M. Morris, and J. Stufken, Design and Analy- sis of Computer Experiments, In A. Dean, M. Morris, J. Stufken, and D. Bingham (eds),Handbook of Design and Analysis of Experiments, pp. 593–626. CRC Press, 2015
2015
-
[48]
I. M. Sobol: USSR Comput. Math. Math. Phys.7(1967) 86
1967
-
[49]
Niederreiter:Random Number Generation and Quasi-Monte Carlo Methods(SIAM, 1992), V ol
H. Niederreiter:Random Number Generation and Quasi-Monte Carlo Methods(SIAM, 1992), V ol. 63 ofCBMS-NSF Regional Conference Series in Applied Mathematics
1992
-
[50]
A. B. Owen: Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing: Proceedings of a conference at the University of Nevada, Las Vegas, Nevada, USA, June 23–25, 1994, 1995, pp. 299–317
1994
-
[51]
Namura: Evolutionary Multi-Criterion Optimization, V ol
N. Namura: Evolutionary Multi-Criterion Optimization, V ol. 15512 of Lecture Notes in Computer Science, 2025, pp. 224–241
2025
-
[52]
Annealing Engine.https://amplify.fixstars
Fixstars Amplify. Annealing Engine.https://amplify.fixstars. com/en/engine, 2025. Accessed May 4, 2026
2025
-
[53]
Akiba, S
T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’19, 2019, pp. 2623–2631
2019
-
[54]
D. R. Jones, M. Schonlau, and W. J. Welch: J. Glob. Optim.13(1998) 455
1998
-
[55]
K. Deb, A. Pratap, S. Agarwal, and T. A. M. T. Meyarivan: IEEE Trans. Evol. Comput.6(2002) 182. 12
2002
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.