Hardware-Compatible Single-Shot Feasible-Space Heuristics for Solving the Quadratic Assignment Problem

Arne Heittmann; Chan-Woo Yang; Dmitrii Dobrynin; Dmitri Strukov; Elisabetta Valiante; Giacomo Pedretti; Haesol Im; Ignacio Rozada; John Paul Strachan; Masoud Mohseni

arxiv: 2503.09676 · v3 · pith:S3UL3QXMnew · submitted 2025-03-12 · 🧮 math.OC · cs.AR

Hardware-Compatible Single-Shot Feasible-Space Heuristics for Solving the Quadratic Assignment Problem

Haesol Im , Chan-Woo Yang , Moslem Noori , Dmitrii Dobrynin , Elisabetta Valiante , Giacomo Pedretti , Arne Heittmann , Thomas Van Vaerenbergh

show 5 more authors

Masoud Mohseni John Paul Strachan Dmitri Strukov Ray Beausoleil Ignacio Rozada

This is my paper

Pith reviewed 2026-05-22 23:47 UTC · model grok-4.3

classification 🧮 math.OC cs.AR

keywords quadratic assignment problemfeasible-space heuristicin-memory computinglocal searchhardware-aware optimizationcombinatorial optimizationparallel neighborhood evaluation

0 comments

The pith

Co-designing local search with hardware allows single-shot evaluation of full feasible neighborhoods for the quadratic assignment problem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a hardware-aware optimization framework for the quadratic assignment problem that works on in-memory computing devices. It co-designs the local search so that the full set of feasible neighboring solutions can be evaluated together in one parallel step, then picks only the moves that keep the solution feasible. This sidesteps the usual overhead of turning constrained problems into unconstrained binary forms. A reader would care because the method runs on both ordinary CPUs and specialized analog hardware while staying competitive with existing heuristics.

Core claim

By co-designing the local search heuristic with the underlying hardware, full neighbourhoods can be computed simultaneously to make update decisions. Binary solutions remain feasible by selecting local moves that lead to neighbouring feasible solutions. The approach is compatible with both digital computers and analog hardware and demonstrates effectiveness in CPU implementations by comparison with state-of-the-art heuristics.

What carries the argument

Single-shot feasible-space heuristics that use hardware parallelism to evaluate full feasible neighborhoods in one step and select only moves preserving feasibility.

If this is right

The method runs on both CPU and IMC hardware without first mapping the problem to QUBO form.
It produces solutions competitive with existing state-of-the-art heuristics when run on ordinary computers.
Each iteration can exploit massive parallelism to evaluate every feasible neighbor at once.
Feasibility is maintained through choice of moves rather than through extra penalty terms or repair steps.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same single-step neighborhood evaluation could reduce runtime on other permutation-based problems whose feasible moves have similar structure.
Actual hardware runs would be needed to confirm whether the claimed single parallel step incurs truly negligible overhead.
If the parallelism scales, the approach could handle larger QAP instances than conventional local search allows on CPUs.

Load-bearing premise

The hardware can compute full feasible neighborhoods in a single parallel step with negligible overhead and without introducing new constraint violations.

What would settle it

Executing the heuristic on physical in-memory computing hardware and checking whether iteration time stays constant as neighborhood size grows on standard QAP benchmark instances.

Figures

Figures reproduced from arXiv: 2503.09676 by Arne Heittmann, Chan-Woo Yang, Dmitrii Dobrynin, Dmitri Strukov, Elisabetta Valiante, Giacomo Pedretti, Haesol Im, Ignacio Rozada, John Paul Strachan, Masoud Mohseni, Moslem Noori, Ray Beausoleil, Thomas Van Vaerenbergh.

**Figure 2.1.** Figure 2.1: FIG. 2.1: Relative optimality gaps between objective values observed by three heuristic solvers and optimal [PITH_FULL_IMAGE:figures/full_fig_p005_2_1.png] view at source ↗

**Figure 2.2.** Figure 2.2: FIG. 2.2: Relative optimality gap as a function of the number of iterations for the QAPLIB [27] instances [PITH_FULL_IMAGE:figures/full_fig_p006_2_2.png] view at source ↗

**Figure 4.1.** Figure 4.1: FIG. 4.1: (a) Relationship between two feasible binary solutions 2 [PITH_FULL_IMAGE:figures/full_fig_p012_4_1.png] view at source ↗

**Figure 4.2.** Figure 4.2: FIG. 4.2: The transition of the full-neighbourhood lookup matrices [PITH_FULL_IMAGE:figures/full_fig_p015_4_2.png] view at source ↗

**Figure 4.** Figure 4: (d), (e), and (f) for an instance of size [PITH_FULL_IMAGE:figures/full_fig_p015_4.png] view at source ↗

**Figure 4.3.** Figure 4.3: FIG. 4.3: Diagram of the computation of the equality (4.17) at variable assignment [PITH_FULL_IMAGE:figures/full_fig_p017_4_3.png] view at source ↗

**Figure 5.1.** Figure 5.1: FIG. 5.1: Objective value trajectories of the [PITH_FULL_IMAGE:figures/full_fig_p023_5_1.png] view at source ↗

**Figure 5.2.** Figure 5.2: FIG. 5.2: Objective value trajectories of the [PITH_FULL_IMAGE:figures/full_fig_p023_5_2.png] view at source ↗

**Figure 5.** Figure 5: summarizes the number of times the approximate gradient gradient method outperforms the [PITH_FULL_IMAGE:figures/full_fig_p023_5.png] view at source ↗

**Figure 5.3.** Figure 5.3: FIG. 5.3: (a) Range of relative optimality gaps between approximate and exact gradient computations [PITH_FULL_IMAGE:figures/full_fig_p024_5_3.png] view at source ↗

**Figure 5.4.** Figure 5.4: FIG. 5.4: Proportion of 100 trials in which the approximate or exact gradient method reaches the optimal [PITH_FULL_IMAGE:figures/full_fig_p024_5_4.png] view at source ↗

**Figure 5.5.** Figure 5.5: FIG. 5.5: Ratio of the average runtime using the exact gradient method versus the approximate gradient [PITH_FULL_IMAGE:figures/full_fig_p024_5_5.png] view at source ↗

**Figure 5.** Figure 5: shows rel [PITH_FULL_IMAGE:figures/full_fig_p025_5.png] view at source ↗

**Figure 5.6.** Figure 5.6: shows rel error over the first 1000 iterations for instances formulated by Skorin–Kapov (“sko”), Li and Pardalos (“lipa”), and Taillard (“tai”). We observe a consistent trend across all three classes, indicating that the contribution of Ei relative to ∆i diminishes as the problem size increases, hence, the impact of omitting the error terms becomes negligible in large-scale instances. 0 500 1000 Iteratio… view at source ↗

**Figure 5.7.** Figure 5.7: FIG. 5.7: Computation load distribution on six QAPLIB instances. The runtimes over 10 [PITH_FULL_IMAGE:figures/full_fig_p026_5_7.png] view at source ↗

read the original abstract

Research into the development of special-purpose computing architectures designed to solve quadratic unconstrained binary optimization (QUBO) problems has flourished in recent years. It has been demonstrated in the literature that such special-purpose solvers can outperform traditional complementary metal--oxide--semiconductor architectures by orders of magnitude with respect to timing metrics on synthetic problems. However, they face challenges with constrained problems such as the quadratic assignment problem (QAP), where mapping to binary formulations such as QUBO introduces overhead and limits parallelism. In-memory computing (IMC) devices, such as memristor-based analog Ising machines, offer significant speed-ups and efficiency gains over traditional CPU-based solvers, particularly for solving combinatorial optimization problems. In this work, we present a novel hardware-aware QAP optimization framework designed for IMC hardware. By co-designing the local search heuristic with the underlying hardware, we exploit the intrinsic massive parallelism that allows for computing of full neighbourhoods simultaneously to make update decisions. We ensure binary solutions remain feasible by selecting local moves that lead to neighbouring feasible solutions, leveraging feasible-space search heuristics and the underlying structure of a given problem. Our approach is compatible with both digital computers and analog hardware. We demonstrate its effectiveness in CPU implementations by comparing it with state-of-the-art heuristics for solving the QAP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The QAP heuristic is presented as hardware-compatible but only CPU results are shown.

read the letter

The key takeaway is that this work gives a feasible-space local search for QAP meant to run in one parallel step on IMC hardware, but the results are all from CPU runs. The new part is the way they combine feasible neighborhood selection with the single-shot evaluation that analog hardware can do. They leverage the QAP structure to pick only valid moves, which avoids the usual penalty terms or repairs that hurt parallelism. That part is solid on paper and the CPU comparisons look reasonable against SOTA heuristics. The issue is the hardware side. The abstract and presumably the paper claim compatibility with memristor-based Ising machines, but there is no hardware experiment, no crossbar simulation, and no analysis of how the move selection would actually be implemented in analog without additional digital logic. The load-bearing assumption stays untested. This paper is aimed at people working on special-purpose hardware for optimization. A reader in that niche might get some ideas from the heuristic design. It is worth sending to peer review so the method can be checked in detail and the authors can address whether the hardware mapping is feasible as described.

Referee Report

2 major / 1 minor

Summary. The paper proposes a hardware-aware local-search framework for the Quadratic Assignment Problem that co-designs a feasible-space heuristic with in-memory computing (IMC) architectures. By evaluating full neighborhoods in one parallel step and restricting updates to feasible neighboring solutions, the method aims to avoid QUBO-mapping overhead while remaining compatible with both digital CPUs and analog memristor-based Ising machines; effectiveness is shown via CPU implementations benchmarked against existing QAP heuristics.

Significance. A validated single-shot feasible-space heuristic on IMC hardware would remove a key barrier to applying analog solvers to constrained combinatorial problems and could deliver substantial speed and energy gains over CPU baselines. The CPU results supply a necessary first benchmark, but the absence of any hardware mapping or timing measurements means the central hardware-compatibility claim remains untested.

major comments (2)

[Abstract] Abstract: the assertion that the heuristic 'allows for computing of full neighbourhoods simultaneously' on IMC hardware is unsupported; the manuscript reports only CPU implementations and provides neither a hardware mapping, crossbar simulation, nor any analog timing/energy data that would confirm single-step parallel evaluation of feasible neighborhoods.
[Abstract] Abstract: the claim that binary solutions 'remain feasible by selecting local moves that lead to neighbouring feasible solutions' is presented as hardware-compatible, yet no mechanism is described for realizing the feasible-move filter in analog IMC circuitry without extra digital correction logic or risk of introducing uncorrectable constraint violations.

minor comments (1)

The abstract refers to comparisons with 'state-of-the-art heuristics' but does not name the specific algorithms or report quantitative metrics (runtime, solution quality) in the provided text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed review and for identifying points where the abstract could be read as overstating hardware results. The manuscript presents an algorithmic co-design whose CPU implementation is fully benchmarked; the hardware-compatibility claims rest on the design choices that map directly onto IMC parallelism and feasible-space moves. We will revise the abstract and add clarifying text to ensure no unsupported performance assertions appear. No hardware experiments or crossbar simulations were performed, so we limit claims to algorithmic compatibility.

read point-by-point responses

Referee: [Abstract] Abstract: the assertion that the heuristic 'allows for computing of full neighbourhoods simultaneously' on IMC hardware is unsupported; the manuscript reports only CPU implementations and provides neither a hardware mapping, crossbar simulation, nor any analog timing/energy data that would confirm single-step parallel evaluation of feasible neighborhoods.

Authors: We agree the phrasing in the abstract can be misread as claiming executed single-shot evaluation on IMC hardware. The heuristic is constructed so that a single parallel read-out of the entire neighborhood is sufficient to select the next feasible move; this property is a direct consequence of the feasible-space formulation and matches the native parallelism of IMC crossbars. The CPU implementation evaluates the same neighborhood structure sequentially to obtain the reported benchmarks. We will revise the abstract to read that the heuristic is designed to exploit single-shot neighborhood evaluation on IMC hardware, while making explicit that all reported results are from CPU runs and that hardware validation remains future work. revision: yes
Referee: [Abstract] Abstract: the claim that binary solutions 'remain feasible by selecting local moves that lead to neighbouring feasible solutions' is presented as hardware-compatible, yet no mechanism is described for realizing the feasible-move filter in analog IMC circuitry without extra digital correction logic or risk of introducing uncorrectable constraint violations.

Authors: The feasible-space restriction is enforced at the algorithmic level by enumerating only the moves that preserve the permutation structure of QAP solutions; this enumeration can be performed either in digital pre-processing or by a lightweight digital controller that gates the analog update. The manuscript does not claim an all-analog realization of the filter. We will add a short paragraph in the methods section describing this hybrid interface and will tone down the abstract to state that the approach remains compatible with IMC hardware when a feasible-move selector (digital or hybrid) is present. revision: yes

Circularity Check

0 steps flagged

No circularity: heuristic description contains no derivations, fits, or self-referential reductions

full rationale

The manuscript describes an algorithmic framework for feasible-space local search co-designed for IMC hardware. No equations, parameters, or first-principles derivations appear in the provided text. The central claims rest on stated design choices and CPU-based empirical comparisons rather than any quantity derived from itself by construction. No self-citations, ansatzes, or uniqueness theorems are invoked in a load-bearing way. The derivation chain is therefore self-contained and does not reduce to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the central claim implicitly assumes standard definitions of QAP and local search plus an unverified hardware parallelism property.

pith-pipeline@v0.9.0 · 5817 in / 1091 out tokens · 25940 ms · 2026-05-22T23:47:26.252240+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages

[1]

We provide an efficient construction of the full neighbourhood at a given feasible binary solution

work page
[2]

We design a parallel evaluation framework that computes objective differences across all neighbouring solutions in one iteration

work page
[3]

We introduce a framework tailored to the architectural requirements of application-driven IMC hard- ware. 1.3. Outline Section 2 provides a background on the QAP and a discussion on the motivation for using the feasible- space method. In Section 3, we show formulas that leverage problems’ underlying structure to optimize performance. Section 4 introduces ...

work page
[4]

Being an NP-hard problem, the QAP is commonly tackled using local- search heuristic algorithms, such as those presented in Refs

BACKGROUND In this section, we review existing approaches for solving the QAP and explain the rationale behind pursuing the feasible-space method. Being an NP-hard problem, the QAP is commonly tackled using local- search heuristic algorithms, such as those presented in Refs. [3, 9, 12]. They are CPU-based solvers that offer fast runtimes, and have been sh...

work page
[5]

binary formulation

AN IMPROVED LOCAL SEARCH HEURISTIC FOR HARDWARE The QAP problem QAP( F, D) can be formulated as a series of matrix multiplications by representing solutions as permutation matrices. Recall that Pn is the set of n × n permutation matrices. Then, the 6 103 105 0 10 3 10 2 10 1 100 Optimality Gap chr12a 103 105 rou15 103 105 nug18 103 105 Iteration 0 10 3 10...

work page
[6]

neighbour

FULL-NEIGHBOURHOOD SEARCH In the native QAP space, given a solution represented as a permutation matrix X ∈ Pn, we call Y a “neighbour” of X if Y is obtained by swapping two distinct columns of X. A “full neighbourhood” of X is the set of all such neighbours of X. Hence, the full neighbourhood of X consists of n 2 = n(n−1) 2 members. In this section, we p...

work page
[7]

Note that by the definition (4.1), the elements in this neighbour are related by the equalities z∗ 1 = j ¯z1 n k · n + mod(¯z2, n) and z∗ 2 = j ¯z2 n k · n + mod(¯z1, n)

∈ N (x). Note that by the definition (4.1), the elements in this neighbour are related by the equalities z∗ 1 = j ¯z1 n k · n + mod(¯z2, n) and z∗ 2 = j ¯z2 n k · n + mod(¯z1, n). (4.4) 12 Now, consider the neighbour that results from the quadruple bit-flip via (¯z1, ¯z2, z∗ 2 , z∗ 1), y = x − e¯z1 − e¯z2 + ez∗ 1 + ez∗ 2 , and its full neighbourhood N (y)...

work page
[8]

For all i ∈ R1, let S12(i, ¯z1) = 0 and S12(i, z∗

work page
[9]

For all i ∈ R2, let S12(i, ¯z2) = 0 and S12(i, z∗

work page
[10]

Note that this update involves changing 2n − 2 elements of S12 from 0 to 1, and another 2 n − 2 changes from 1 to 0

For i ∈ R3, let S12(i, [¯z1, ¯z2]) = 0 and S12(i, [z∗ 1 , z∗ 2]) = 1. Note that this update involves changing 2n − 2 elements of S12 from 0 to 1, and another 2 n − 2 changes from 1 to 0. See Figure 4.2 (a), (b), and (c) for an illustration of the transition of this update for an instance of size n = 12. The update procedure for S12 is outlined in Algorith...

work page
[11]

= 1, i ∈ R1 3: S12(i, z2) = 0, S12(i, z∗

work page
[12]

1” to “0

= 1, i ∈ R2 4: S12(i, [z1, z2]) = 0, S12(i, [z∗ 1 , z∗ 2]) = 1, i ∈ R3 5: return S12 6: end function Recall that a neighbour has an analogous meaning in the space of permutation matrices. Once a neighbour is chosen from N (vec(X)), it analogously means that two columns of X ∈ Pn are swapped while the other n − 2 columns remain invariant. The indices of th...

work page
[13]

identifier n version

NUMERICAL EXPERIMENTS We used instances from the QAPLIB [27] of various sizes ranging from n = 12 to n = 100 to evaluate the performance of Algorithm 4.5. All instances from the QAPLIB follow a size-based naming convention, “identifier n version”, where n indicates the problem size. The relative optimality gap was chosen as the performance metric. Let fbe...

work page 2000
[14]

CONCLUSIONS To sustain the computational performance growth rate in the post-Moore’s law era, new technologies such as IMC have emerged. Mapping the native formulation of binary constrained optimization problems onto the unconstrained binary formulation incurs a massive overhead and remains the primary obstacle limiting the performance of special-purpose ...

work page
[15]

Aramon, G

M. Aramon, G. Rosenberg, E. Valiante, T. Miyazawa, H. Tamura, and H. G. Katzgraber , Physics-inspired optimization for quadratic unconstrained problems using a digital annealer, Frontiers in Physics, 7 (2019), p. 48

work page 2019
[16]

Bagherbeik, W

M. Bagherbeik, W. Xu, S. F. Mousavi, K. Kanda, H. Tamura, and A. Sheikholeslami , Maqo: A scalable many-core annealer for quadratic optimization, in 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), 2022, pp. 76–77

work page 2022
[17]

Bashiri and H

M. Bashiri and H. Karimi, Effective heuristics and meta-heuristics for the quadratic assignment problem with tuned parameters and analytical comparisons, Journal of industrial engineering international, 8 (2012), pp. 1–9

work page 2012
[18]

Bhattacharya, G

T. Bhattacharya, G. H. Hutchinson, G. Pedretti, X. Sheng, J. Ignowski, T. Van Vaerenbergh, R. Beausoleil, J. P. Strachan, and D. B. Strukov , Computing high-degree polynomial gradients in memory, Nature Communications, 15 (2024), p. 8211

work page 2024
[19]

R. E. Burkard, E. Cela, P. M. Pardalos, and L. S. Pitsoulis, The quadratic assignment problem, Springer, 1998

work page 1998
[20]

F. Cai, S. Kumar, T. Van Vaerenbergh, X. Sheng, R. Liu, C. Li, Z. Liu, M. Foltin, S. Yu, Q. Xia, et al., Power-efficient combinatorial optimization using intrinsic noise in memristor hopfield neural networks, Nature Electronics, 3 (2020), pp. 409–418

work page 2020
[21]

S. A. de Carvalho Jr. and S. Rahmann , Microarray layout as quadratic assignment problem, in German Conference on Bioinformatics, Gesellschaft f¨ ur Informatik eV, 2006

work page 2006
[22]

Dobrynin, A

D. Dobrynin, A. Renaudineau, M. Hizzani, D. Strukov, M. Mohseni, and J. P. Strachan , Energy landscapes of combinatorial optimization in Ising machines, Phys. Rev. E, 110 (2024), p. 045308

work page 2024
[23]

Drezner, A new genetic algorithm for the quadratic assignment problem, INFORMS journal on computing, 15 (2003), pp

Z. Drezner, A new genetic algorithm for the quadratic assignment problem, INFORMS journal on computing, 15 (2003), pp. 320–330

work page 2003
[24]

Fahimi, M

Z. Fahimi, M. Mahmoodi, H. Nili, V. Polishchuk, and D. Strukov, Combinatorial optimization by weight annealing in memristive hopfield networks, Scientific Reports, 11 (2021), p. 16383

work page 2021
[25]

Finke, R

G. Finke, R. E. Burkard, and F. Rendl , Quadratic assignment problems, in North-Holland Mathematics Studies, vol. 132, Elsevier, 1987, pp. 61–82

work page 1987
[26]

L. M. Gambardella, ´E. D. Taillard, and M. Dorigo , Ant colonies for the quadratic assignment problem, The Journal of the Operational Research Society, 50 (1999), pp. 167–176

work page 1999
[27]

Glover, G

F. Glover, G. Kochenberger, R. Hennig, and Y. Du, Quantum bridge analytics I: a tutorial on formulating and using QUBO models, Annals of Operations Research, 314 (2022), pp. 141–183

work page 2022
[28]

H. Goto, K. Tatsumura, and A. R. Dixon, Combinatorial optimization by simulating adiabatic bifurcations 29 in nonlinear Hamiltonian systems, Science Advances, 5 (2019)

work page 2019
[29]

Hamerly, T

R. Hamerly, T. Inagaki, P. L. McMahon, D. Venturelli, A. Marandi, T. Onodera, E. Ng, C. Lan- grock, K. Inaba, T. Honjo, K. Enbutsu, T. Umeki, R. Kasahara, S. Utsunomiya, S. Kako, K. I. Kawarabayashi, R. L. Byer, M. M. Fejer, H. Mabuchi, D. Englund, E. Rieffel, H. Takesue, and Y. Yamamoto, Experimental investigation of performance differences between coher...

work page 2019
[30]

Johnson, M

M. Johnson, M. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson, R. Harris, A. Berkley, J. Jo- hansson, P. Bunyk, E. Chapple, C. Enderud, J. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Perminov, C. Rich, and G. Rose , Quantum annealing with manufactured spins, Nature, 473 (2011), pp. 194–8

work page 2011
[31]

T. C. Koopmans and M. Beckmann , Assignment problems and the location of economic activities, Econo- metrica, 25 (1957), pp. 53–76

work page 1957
[32]

J. Li, M. Moghaddam, and S. Y. Nof , Dynamic storage assignment with product affinity and abc classification—a case study, The International Journal of Advanced Manufacturing Technology, 84 (2016), pp. 2179–2194

work page 2016
[33]

A. Lu, J. Hur, Y.-C. Luo, H. Li, D. E. Nikonov, I. A. Young, Y.-K. Choi, and S. Yu, Scalable in-memory clustered annealer with temporal noise of charge trap transistor for large scale travelling salesman problems, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 13 (2023), pp. 422–435

work page 2023
[34]

Lucas, Ising formulations of many NP problems, Frontiers in Physics, 2 (2014)

A. Lucas, Ising formulations of many NP problems, Frontiers in Physics, 2 (2014)

work page 2014
[35]

Matsubara, H

S. Matsubara, H. Tamura, M. Takatsu, D. Yoo, B. Vatankhahghadim, H. Yamasaki, T. Miyazawa, S. Tsukamoto, Y. Watanabe, K. Takemoto, and A. Sheikholeslami , Ising-model optimizer with parallel-trial bit-sieve engine, in Complex, Intelligent, and Software Intensive Systems – Proceedings of the 11th International Conference on Complex, Intelligent, and Softwa...

work page 2017
[36]

Mohseni, D

M. Mohseni, D. Eppens, J. Str ¨umpfer, R. Marino, V. S. Denchev, A. K. Ho, S. V. Isakov, S. Boixo, F. Ricci-Tersenghi, and H. Neven, Nonequilibrium monte carlo for unfreezing variables in hard combinatorial optimization, arXiv.2111.13628, (2021)

work page arXiv 2021
[37]

Munera, D

D. Munera, D. Diaz, and S. Abreu , Hybridization as cooperative parallelism for the quadratic assignment problem, in Hybrid Metaheuristics, M. J. Blesa, C. Blum, A. Cangelosi, V. Cutello, A. DI NUOVO, M. Pavone, and E.-G. Talbi, eds., Cham, 2016, Springer International Publishing, pp. 47–61

work page 2016
[38]

Okuyama, T

T. Okuyama, T. Sonobe, K. Kawarabayashi, and M. Yamaoka , Binary optimization by momentum annealing, Physics Review E, 100 (2019), p. 012111

work page 2019
[39]

Okuyama, C

T. Okuyama, C. Yoshimura, M. Hayashi, and M. Yamaoka , Computing architecture to perform approximated simulated annealing for Ising models, in IEEE International Conference on Rebooting Computing (ICRC), 2016, pp. 1–8

work page 2016
[40]

Pedretti, F

G. Pedretti, F. B ¨ohm, T. Bhattacharya, A. Heittman, X. Zhang, M. Hizzani, G. Hutchinson, D. Kwon, J. Moon, E. Valiante, I. Rozada, C. E. Graves, J. Ignowski, M. Mohseni, J. P. Stra- chan, D. Strukov, R. Beausoleil, and T. V. Vaerenbergh , Solving Boolean satisfiability problems with resistive content addressable memories, arXiv:2501.07733, (2025)

work page arXiv 2025
[41]

https://coral.ise.lehigh.edu/data- sets/qaplib/

QAPLIB, QAPLIB: A library of quadratic assignment problem instances. https://coral.ise.lehigh.edu/data- sets/qaplib/

work page
[42]

Selman, H

B. Selman, H. Kautz, and B. Cohen, Noise strategies for improving local search, Proceedings of the National Conference on Artificial Intelligence, 1 (1999)

work page 1999
[43]

W. Song, M. Rao, Y. Li, C. Li, Y. Zhuo, F. Cai, M. Wu, W. Yin, Z. Li, Q. Wei, S. Lee, H. Zhu, L. Gong, M. Barnell, Q. Wu, P. A. Beerel, M. S.-W. Chen, N. Ge, M. Hu, Q. Xia, and J. J. Yang, Programming memristor arrays with arbitrarily high precision for analog computing, Science, 383 (2024), pp. 903–910

work page 2024
[44]

J. Su, T. Tu, and L. He , A quantum annealing approach for Boolean satisfiability problem, in 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016, pp. 1–6

work page 2016
[45]

Taillard, Robust tabu search for the quadratic assignment problem, Parallel Computing, 17 (1991), pp

E. Taillard, Robust tabu search for the quadratic assignment problem, Parallel Computing, 17 (1991), pp. 443– 455

work page 1991
[46]

R. H. Warren, Solving the traveling salesman problem on a quantum annealer, SN applied sciences, 2 (2020), p. 75

work page 2020
[47]

Yoshimura, M

C. Yoshimura, M. Hayashi, T. Okuyama, and M. Yamaoka, Implementation and evaluation of FPGA-based annealing processor for Ising model by use of resource sharing, International Journal of Networking and Com- puting, 7 (2017), pp. 154–172

work page 2017
[48]

Zhang, F

X. Zhang, F. B ¨ohm, E. Valiante, M. Noori, T. V. Vaerenbergh, C.-W. Yang, G. Pedretti, M. Mohseni, R. Beausoleil, and I. Rozada , Distributed Binary Optimization with In-Memory Computing: An Application for the SAT Problem, arXiv:2409.09152, (2024)

work page arXiv 2024

[1] [1]

We provide an efficient construction of the full neighbourhood at a given feasible binary solution

work page

[2] [2]

We design a parallel evaluation framework that computes objective differences across all neighbouring solutions in one iteration

work page

[3] [3]

We introduce a framework tailored to the architectural requirements of application-driven IMC hard- ware. 1.3. Outline Section 2 provides a background on the QAP and a discussion on the motivation for using the feasible- space method. In Section 3, we show formulas that leverage problems’ underlying structure to optimize performance. Section 4 introduces ...

work page

[4] [4]

Being an NP-hard problem, the QAP is commonly tackled using local- search heuristic algorithms, such as those presented in Refs

BACKGROUND In this section, we review existing approaches for solving the QAP and explain the rationale behind pursuing the feasible-space method. Being an NP-hard problem, the QAP is commonly tackled using local- search heuristic algorithms, such as those presented in Refs. [3, 9, 12]. They are CPU-based solvers that offer fast runtimes, and have been sh...

work page

[5] [5]

binary formulation

AN IMPROVED LOCAL SEARCH HEURISTIC FOR HARDWARE The QAP problem QAP( F, D) can be formulated as a series of matrix multiplications by representing solutions as permutation matrices. Recall that Pn is the set of n × n permutation matrices. Then, the 6 103 105 0 10 3 10 2 10 1 100 Optimality Gap chr12a 103 105 rou15 103 105 nug18 103 105 Iteration 0 10 3 10...

work page

[6] [6]

neighbour

FULL-NEIGHBOURHOOD SEARCH In the native QAP space, given a solution represented as a permutation matrix X ∈ Pn, we call Y a “neighbour” of X if Y is obtained by swapping two distinct columns of X. A “full neighbourhood” of X is the set of all such neighbours of X. Hence, the full neighbourhood of X consists of n 2 = n(n−1) 2 members. In this section, we p...

work page

[7] [7]

Note that by the definition (4.1), the elements in this neighbour are related by the equalities z∗ 1 = j ¯z1 n k · n + mod(¯z2, n) and z∗ 2 = j ¯z2 n k · n + mod(¯z1, n)

∈ N (x). Note that by the definition (4.1), the elements in this neighbour are related by the equalities z∗ 1 = j ¯z1 n k · n + mod(¯z2, n) and z∗ 2 = j ¯z2 n k · n + mod(¯z1, n). (4.4) 12 Now, consider the neighbour that results from the quadruple bit-flip via (¯z1, ¯z2, z∗ 2 , z∗ 1), y = x − e¯z1 − e¯z2 + ez∗ 1 + ez∗ 2 , and its full neighbourhood N (y)...

work page

[8] [8]

For all i ∈ R1, let S12(i, ¯z1) = 0 and S12(i, z∗

work page

[9] [9]

For all i ∈ R2, let S12(i, ¯z2) = 0 and S12(i, z∗

work page

[10] [10]

Note that this update involves changing 2n − 2 elements of S12 from 0 to 1, and another 2 n − 2 changes from 1 to 0

For i ∈ R3, let S12(i, [¯z1, ¯z2]) = 0 and S12(i, [z∗ 1 , z∗ 2]) = 1. Note that this update involves changing 2n − 2 elements of S12 from 0 to 1, and another 2 n − 2 changes from 1 to 0. See Figure 4.2 (a), (b), and (c) for an illustration of the transition of this update for an instance of size n = 12. The update procedure for S12 is outlined in Algorith...

work page

[11] [11]

= 1, i ∈ R1 3: S12(i, z2) = 0, S12(i, z∗

work page

[12] [12]

1” to “0

= 1, i ∈ R2 4: S12(i, [z1, z2]) = 0, S12(i, [z∗ 1 , z∗ 2]) = 1, i ∈ R3 5: return S12 6: end function Recall that a neighbour has an analogous meaning in the space of permutation matrices. Once a neighbour is chosen from N (vec(X)), it analogously means that two columns of X ∈ Pn are swapped while the other n − 2 columns remain invariant. The indices of th...

work page

[13] [13]

identifier n version

NUMERICAL EXPERIMENTS We used instances from the QAPLIB [27] of various sizes ranging from n = 12 to n = 100 to evaluate the performance of Algorithm 4.5. All instances from the QAPLIB follow a size-based naming convention, “identifier n version”, where n indicates the problem size. The relative optimality gap was chosen as the performance metric. Let fbe...

work page 2000

[14] [14]

CONCLUSIONS To sustain the computational performance growth rate in the post-Moore’s law era, new technologies such as IMC have emerged. Mapping the native formulation of binary constrained optimization problems onto the unconstrained binary formulation incurs a massive overhead and remains the primary obstacle limiting the performance of special-purpose ...

work page

[15] [15]

Aramon, G

M. Aramon, G. Rosenberg, E. Valiante, T. Miyazawa, H. Tamura, and H. G. Katzgraber , Physics-inspired optimization for quadratic unconstrained problems using a digital annealer, Frontiers in Physics, 7 (2019), p. 48

work page 2019

[16] [16]

Bagherbeik, W

M. Bagherbeik, W. Xu, S. F. Mousavi, K. Kanda, H. Tamura, and A. Sheikholeslami , Maqo: A scalable many-core annealer for quadratic optimization, in 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), 2022, pp. 76–77

work page 2022

[17] [17]

Bashiri and H

M. Bashiri and H. Karimi, Effective heuristics and meta-heuristics for the quadratic assignment problem with tuned parameters and analytical comparisons, Journal of industrial engineering international, 8 (2012), pp. 1–9

work page 2012

[18] [18]

Bhattacharya, G

T. Bhattacharya, G. H. Hutchinson, G. Pedretti, X. Sheng, J. Ignowski, T. Van Vaerenbergh, R. Beausoleil, J. P. Strachan, and D. B. Strukov , Computing high-degree polynomial gradients in memory, Nature Communications, 15 (2024), p. 8211

work page 2024

[19] [19]

R. E. Burkard, E. Cela, P. M. Pardalos, and L. S. Pitsoulis, The quadratic assignment problem, Springer, 1998

work page 1998

[20] [20]

F. Cai, S. Kumar, T. Van Vaerenbergh, X. Sheng, R. Liu, C. Li, Z. Liu, M. Foltin, S. Yu, Q. Xia, et al., Power-efficient combinatorial optimization using intrinsic noise in memristor hopfield neural networks, Nature Electronics, 3 (2020), pp. 409–418

work page 2020

[21] [21]

S. A. de Carvalho Jr. and S. Rahmann , Microarray layout as quadratic assignment problem, in German Conference on Bioinformatics, Gesellschaft f¨ ur Informatik eV, 2006

work page 2006

[22] [22]

Dobrynin, A

D. Dobrynin, A. Renaudineau, M. Hizzani, D. Strukov, M. Mohseni, and J. P. Strachan , Energy landscapes of combinatorial optimization in Ising machines, Phys. Rev. E, 110 (2024), p. 045308

work page 2024

[23] [23]

Drezner, A new genetic algorithm for the quadratic assignment problem, INFORMS journal on computing, 15 (2003), pp

Z. Drezner, A new genetic algorithm for the quadratic assignment problem, INFORMS journal on computing, 15 (2003), pp. 320–330

work page 2003

[24] [24]

Fahimi, M

Z. Fahimi, M. Mahmoodi, H. Nili, V. Polishchuk, and D. Strukov, Combinatorial optimization by weight annealing in memristive hopfield networks, Scientific Reports, 11 (2021), p. 16383

work page 2021

[25] [25]

Finke, R

G. Finke, R. E. Burkard, and F. Rendl , Quadratic assignment problems, in North-Holland Mathematics Studies, vol. 132, Elsevier, 1987, pp. 61–82

work page 1987

[26] [26]

L. M. Gambardella, ´E. D. Taillard, and M. Dorigo , Ant colonies for the quadratic assignment problem, The Journal of the Operational Research Society, 50 (1999), pp. 167–176

work page 1999

[27] [27]

Glover, G

F. Glover, G. Kochenberger, R. Hennig, and Y. Du, Quantum bridge analytics I: a tutorial on formulating and using QUBO models, Annals of Operations Research, 314 (2022), pp. 141–183

work page 2022

[28] [28]

H. Goto, K. Tatsumura, and A. R. Dixon, Combinatorial optimization by simulating adiabatic bifurcations 29 in nonlinear Hamiltonian systems, Science Advances, 5 (2019)

work page 2019

[29] [29]

Hamerly, T

R. Hamerly, T. Inagaki, P. L. McMahon, D. Venturelli, A. Marandi, T. Onodera, E. Ng, C. Lan- grock, K. Inaba, T. Honjo, K. Enbutsu, T. Umeki, R. Kasahara, S. Utsunomiya, S. Kako, K. I. Kawarabayashi, R. L. Byer, M. M. Fejer, H. Mabuchi, D. Englund, E. Rieffel, H. Takesue, and Y. Yamamoto, Experimental investigation of performance differences between coher...

work page 2019

[30] [30]

Johnson, M

M. Johnson, M. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson, R. Harris, A. Berkley, J. Jo- hansson, P. Bunyk, E. Chapple, C. Enderud, J. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Perminov, C. Rich, and G. Rose , Quantum annealing with manufactured spins, Nature, 473 (2011), pp. 194–8

work page 2011

[31] [31]

T. C. Koopmans and M. Beckmann , Assignment problems and the location of economic activities, Econo- metrica, 25 (1957), pp. 53–76

work page 1957

[32] [32]

J. Li, M. Moghaddam, and S. Y. Nof , Dynamic storage assignment with product affinity and abc classification—a case study, The International Journal of Advanced Manufacturing Technology, 84 (2016), pp. 2179–2194

work page 2016

[33] [33]

A. Lu, J. Hur, Y.-C. Luo, H. Li, D. E. Nikonov, I. A. Young, Y.-K. Choi, and S. Yu, Scalable in-memory clustered annealer with temporal noise of charge trap transistor for large scale travelling salesman problems, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 13 (2023), pp. 422–435

work page 2023

[34] [34]

Lucas, Ising formulations of many NP problems, Frontiers in Physics, 2 (2014)

A. Lucas, Ising formulations of many NP problems, Frontiers in Physics, 2 (2014)

work page 2014

[35] [35]

Matsubara, H

S. Matsubara, H. Tamura, M. Takatsu, D. Yoo, B. Vatankhahghadim, H. Yamasaki, T. Miyazawa, S. Tsukamoto, Y. Watanabe, K. Takemoto, and A. Sheikholeslami , Ising-model optimizer with parallel-trial bit-sieve engine, in Complex, Intelligent, and Software Intensive Systems – Proceedings of the 11th International Conference on Complex, Intelligent, and Softwa...

work page 2017

[36] [36]

Mohseni, D

M. Mohseni, D. Eppens, J. Str ¨umpfer, R. Marino, V. S. Denchev, A. K. Ho, S. V. Isakov, S. Boixo, F. Ricci-Tersenghi, and H. Neven, Nonequilibrium monte carlo for unfreezing variables in hard combinatorial optimization, arXiv.2111.13628, (2021)

work page arXiv 2021

[37] [37]

Munera, D

D. Munera, D. Diaz, and S. Abreu , Hybridization as cooperative parallelism for the quadratic assignment problem, in Hybrid Metaheuristics, M. J. Blesa, C. Blum, A. Cangelosi, V. Cutello, A. DI NUOVO, M. Pavone, and E.-G. Talbi, eds., Cham, 2016, Springer International Publishing, pp. 47–61

work page 2016

[38] [38]

Okuyama, T

T. Okuyama, T. Sonobe, K. Kawarabayashi, and M. Yamaoka , Binary optimization by momentum annealing, Physics Review E, 100 (2019), p. 012111

work page 2019

[39] [39]

Okuyama, C

T. Okuyama, C. Yoshimura, M. Hayashi, and M. Yamaoka , Computing architecture to perform approximated simulated annealing for Ising models, in IEEE International Conference on Rebooting Computing (ICRC), 2016, pp. 1–8

work page 2016

[40] [40]

Pedretti, F

G. Pedretti, F. B ¨ohm, T. Bhattacharya, A. Heittman, X. Zhang, M. Hizzani, G. Hutchinson, D. Kwon, J. Moon, E. Valiante, I. Rozada, C. E. Graves, J. Ignowski, M. Mohseni, J. P. Stra- chan, D. Strukov, R. Beausoleil, and T. V. Vaerenbergh , Solving Boolean satisfiability problems with resistive content addressable memories, arXiv:2501.07733, (2025)

work page arXiv 2025

[41] [41]

https://coral.ise.lehigh.edu/data- sets/qaplib/

QAPLIB, QAPLIB: A library of quadratic assignment problem instances. https://coral.ise.lehigh.edu/data- sets/qaplib/

work page

[42] [42]

Selman, H

B. Selman, H. Kautz, and B. Cohen, Noise strategies for improving local search, Proceedings of the National Conference on Artificial Intelligence, 1 (1999)

work page 1999

[43] [43]

W. Song, M. Rao, Y. Li, C. Li, Y. Zhuo, F. Cai, M. Wu, W. Yin, Z. Li, Q. Wei, S. Lee, H. Zhu, L. Gong, M. Barnell, Q. Wu, P. A. Beerel, M. S.-W. Chen, N. Ge, M. Hu, Q. Xia, and J. J. Yang, Programming memristor arrays with arbitrarily high precision for analog computing, Science, 383 (2024), pp. 903–910

work page 2024

[44] [44]

J. Su, T. Tu, and L. He , A quantum annealing approach for Boolean satisfiability problem, in 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016, pp. 1–6

work page 2016

[45] [45]

Taillard, Robust tabu search for the quadratic assignment problem, Parallel Computing, 17 (1991), pp

E. Taillard, Robust tabu search for the quadratic assignment problem, Parallel Computing, 17 (1991), pp. 443– 455

work page 1991

[46] [46]

R. H. Warren, Solving the traveling salesman problem on a quantum annealer, SN applied sciences, 2 (2020), p. 75

work page 2020

[47] [47]

Yoshimura, M

C. Yoshimura, M. Hayashi, T. Okuyama, and M. Yamaoka, Implementation and evaluation of FPGA-based annealing processor for Ising model by use of resource sharing, International Journal of Networking and Com- puting, 7 (2017), pp. 154–172

work page 2017

[48] [48]

Zhang, F

X. Zhang, F. B ¨ohm, E. Valiante, M. Noori, T. V. Vaerenbergh, C.-W. Yang, G. Pedretti, M. Mohseni, R. Beausoleil, and I. Rozada , Distributed Binary Optimization with In-Memory Computing: An Application for the SAT Problem, arXiv:2409.09152, (2024)

work page arXiv 2024