Memristive Devices for Computation-In-Memory
Pith reviewed 2026-05-24 19:44 UTC · model grok-4.3
The pith
Memristive devices enable two computation-in-memory accelerators that reduce latency, energy, and area versus conventional designs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The preliminary results of these two accelerators show significant improvement in terms of latency, energy and area as compared to today's architectures and design.
What carries the argument
Memristive devices integrated for in-memory computation within the Memristive Vector Processor and RRAM Automata Processor.
If this is right
- The accelerators deliver lower latency for targeted workloads than standard designs.
- Energy consumption drops compared with conventional architectures.
- Physical area requirements shrink relative to today's implementations.
- Computation-in-memory approaches can address emerging applications with tight power and speed constraints.
Where Pith is reading between the lines
- Success would encourage exploration of similar in-memory designs using other emerging memory technologies.
- The reported gains might extend to workloads beyond those tested in the preliminary results.
- Practical deployment would require solving integration questions left open by the current work.
Load-bearing premise
Memristive devices can be practically fabricated, integrated, and operated in the proposed accelerator architectures at scale without major unforeseen issues in reliability or compatibility.
What would settle it
Fabrication and testing of the accelerators at scale that yields no measurable gains in latency, energy, or area, or that reveals major reliability failures.
Original abstract
CMOS technology and its continuous scaling have made electronics and computers accessible and affordable for almost everyone on the globe; in addition, they have enabled the solutions of a wide range of societal problems and applications. Today, however, both the technology and the computer architectures are facing severe challenges/walls making them incapable of providing the demanded computing power with tight constraints. This motivates the need for the exploration of novel architectures based on new device technologies; not only to sustain the financial benefit of technology scaling, but also to develop solutions for extremely demanding emerging applications. This paper presents two computation-in-memory based accelerators making use of emerging memristive devices; they are Memristive Vector Processor and RRAM Automata Processor. The preliminary results of these two accelerators show significant improvement in terms of latency, energy and area as compared to today's architectures and design.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents two computation-in-memory accelerators based on emerging memristive devices: the Memristive Vector Processor and the RRAM Automata Processor. It states that preliminary results of these accelerators demonstrate significant improvements in latency, energy, and area compared to today's architectures and designs.
Significance. If the performance claims can be substantiated with detailed, reproducible results that account for device non-idealities, the work could contribute to alternatives for overcoming CMOS scaling walls and the von Neumann bottleneck in data-intensive applications.
Major comments (2)
- [Abstract] The central claim that the accelerators show 'significant improvement in terms of latency, energy and area' rests on unspecified 'preliminary results' but supplies no data, error bars, methods, benchmarks, or verification, so the support for the performance claim cannot be evaluated.
- [Abstract] The architectural proposals do not appear to include sensitivity analysis or Monte-Carlo variation sweeps; without bounding performance under realistic memristor statistics (resistance-state variability, read/write disturb, limited endurance, sneak-path effects), the reported latency/energy/area advantages may not survive translation from idealized models to fabricated devices.
Minor comments (1)
- [Abstract] The abstract would benefit from naming the specific applications or benchmarks used for the latency/energy/area comparisons.
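The variation sweep asked for in the second major comment can be illustrated with a small Monte-Carlo experiment. The sketch below is not the paper's method or device model. It assumes a single column of `n_cells` resistive cells read in parallel, a lognormal spread `sigma` around hypothetical nominal resistances `r_low` and `r_high`, and a fixed reference current placed midway between the two nominal cases. All function names and parameter values are illustrative assumptions.

```python
import random


def column_current(states, v_read, r_low, r_high, sigma, rng):
    """Sum the read currents of all cells on one column.

    Assumption: each cell's resistance is its nominal value multiplied by a
    lognormal factor with log-standard-deviation `sigma`.
    """
    total = 0.0
    for bit in states:
        nominal = r_low if bit else r_high
        resistance = nominal * rng.lognormvariate(0.0, sigma)
        total += v_read / resistance
    return total


def misread_rate(n_cells, r_low, r_high, sigma, trials=2000, v_read=0.2, seed=0):
    """Estimate how often 'all cells high' and 'one cell low' are confused."""
    rng = random.Random(seed)
    # Nominal column currents for the two cases that must be told apart.
    i_zero = n_cells * v_read / r_high
    i_one = v_read / r_low + (n_cells - 1) * v_read / r_high
    # Hypothetical sensing scheme: fixed reference midway between the two.
    i_ref = 0.5 * (i_zero + i_one)
    zero_col = [0] * n_cells
    one_col = [1] + [0] * (n_cells - 1)
    errors = 0
    for _ in range(trials):
        if column_current(zero_col, v_read, r_low, r_high, sigma, rng) >= i_ref:
            errors += 1
        if column_current(one_col, v_read, r_low, r_high, sigma, rng) < i_ref:
            errors += 1
    return errors / (2 * trials)


if __name__ == "__main__":
    # Sweep the variability parameter for an illustrative 32-cell column.
    for sigma in (0.0, 0.05, 0.1, 0.2, 0.4):
        rate = misread_rate(32, 1e4, 1e6, sigma)
        print(f"sigma={sigma:.2f}  misread rate={rate:.4f}")
```

Sweeping `sigma` or `n_cells` with `misread_rate` gives a first indication of how quickly the sense margin erodes under this assumed model. Sneak paths, read/write disturb, and endurance are not modeled here.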
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment point-by-point below, indicating planned revisions where appropriate.
- Referee: [Abstract] The central claim that the accelerators show 'significant improvement in terms of latency, energy and area' rests on unspecified 'preliminary results' but supplies no data, error bars, methods, benchmarks, or verification, so the support for the performance claim cannot be evaluated.
  Authors: The abstract summarizes results presented in the body of the manuscript, but we agree that it is insufficiently specific. In the revision we will expand the abstract to name the benchmarks (vector operations and automata workloads), the comparison baselines, and the simulation methodology, while retaining the high-level claim and pointing readers to the quantitative data and methods in the main text.
  Revision: yes
- Referee: [Abstract] The architectural proposals do not appear to include sensitivity analysis or Monte-Carlo variation sweeps; without bounding performance under realistic memristor statistics (resistance-state variability, read/write disturb, limited endurance, sneak-path effects), the reported latency/energy/area advantages may not survive translation from idealized models to fabricated devices.
  Authors: The presented results use idealized device models, and we accept that this limits the strength of the claims. The revision will add a subsection that discusses the listed non-idealities, provides first-order analytical bounds on their impact, and states the modeling assumptions explicitly. Full Monte-Carlo sweeps are beyond the current scope and will be noted as future work.
  Revision: partial
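One form that a first-order analytical bound on device variation could take is a worst-case sense-margin condition. The derivation below is an illustrative sketch and is not taken from the paper. It assumes that $N$ cells on one column are read in parallel at voltage $V$, that every resistance deviates from its nominal value $R_L$ (low state) or $R_H$ (high state) by at most a relative amount $\varepsilon$ with $0 \le \varepsilon < 1$, and that the read must distinguish "all cells high" from "exactly one cell low". The symbols and the two current names are assumptions introduced for this example.

```latex
% Largest current when all N cells are in the high-resistance state
I_0^{\max} = \frac{N V}{R_H (1 - \varepsilon)}

% Smallest current when exactly one cell is in the low-resistance state
I_1^{\min} = \frac{V}{R_L (1 + \varepsilon)} + \frac{(N - 1) V}{R_H (1 + \varepsilon)}

% The two cases remain separable in the worst case when I_1^{\min} > I_0^{\max}:
\frac{R_H}{R_L} + N - 1 > N \, \frac{1 + \varepsilon}{1 - \varepsilon}
\quad\Longleftrightarrow\quad
\frac{R_H}{R_L} > 1 + \frac{2 N \varepsilon}{1 - \varepsilon}
\approx 1 + 2 N \varepsilon \quad (\varepsilon \ll 1)
```

Under these assumptions, and for small $\varepsilon$, the required resistance ratio grows linearly with both the column height and the variation bound.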
Circularity Check
No circularity; architectural claims rest on preliminary results without derivations or self-referential fitting.
The paper presents two computation-in-memory accelerators (Memristive Vector Processor and RRAM Automata Processor) and states that their preliminary results show improvements in latency, energy, and area. No equations, derivations, fitted parameters, or load-bearing self-citations appear in the abstract or described content. The central claim is an empirical assertion about unspecified simulation outcomes rather than any reduction of a prediction to its own inputs by construction, imported uniqueness theorems, or ansatz smuggling. This is the normal case of a proposal paper whose validity hinges on external validation of the results, not on internal circularity.