arxiv: 1907.07898 · v1 · pith:OSXRWFNEnew · submitted 2019-07-18 · 💻 cs.ET

Memristive Devices for Computation-In-Memory

Pith reviewed 2026-05-24 19:44 UTC · model grok-4.3

classification 💻 cs.ET
keywords memristive devices · computation-in-memory · accelerators · vector processor · automata processor · RRAM

The pith

Memristive devices enable two computation-in-memory accelerators that reduce latency, energy, and area versus conventional designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces two accelerators that perform computation inside memory using memristive devices: the Memristive Vector Processor and the RRAM Automata Processor. It positions these designs as responses to the limits of CMOS scaling and the rising demands of new applications. The central evidence consists of preliminary results showing gains in latency, energy consumption, and physical area compared with today's architectures. If accurate, the work indicates that emerging devices can sustain computing progress where scaling alone no longer suffices.

Core claim

The preliminary results of these two accelerators show significant improvement in terms of latency, energy and area as compared to today's architectures and design.

What carries the argument

Memristive devices integrated for in-memory computation within the Memristive Vector Processor and RRAM Automata Processor.

If this is right

  • The accelerators deliver lower latency for targeted workloads than standard designs.
  • Energy consumption drops compared with conventional architectures.
  • Physical area requirements shrink relative to today's implementations.
  • Computation-in-memory approaches can address emerging applications with tight power and speed constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Success would encourage exploration of similar in-memory designs using other emerging memory technologies.
  • The reported gains might extend to workloads beyond those tested in the preliminary results.
  • Practical deployment would require solving integration questions left open by the current work.

Load-bearing premise

Memristive devices can be practically fabricated, integrated, and operated in the proposed accelerator architectures at scale without major unforeseen issues in reliability or compatibility.

What would settle it

Fabrication and testing of the accelerators at scale that yields no measurable gains in latency, energy, or area, or that reveals major reliability failures.

Figures

Figures reproduced from arXiv: 1907.07898 by Hoang Anh Du Nguyen, Jintao Yu, Lei Xie, Mottaqiallah Taouil, Said Hamdioui.

Figure 2: Memristive Vector Processor architecture.
Figure 4: Evaluation results for MVP and multicore architectures.
Figure 5: Example notations for NFAs and homogeneous automata.
Figure 6: General architecture for automata processors.
Figure 7: Vector dot product operator used as switches and STEs.
Figure 8: Different implementations of a configurable bit.
Figure 9: SPICE simulation results of a vector dot product operator.
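
The captions of Figures 5 to 7 mention homogeneous automata, STEs (state transition elements), and a vector dot product operator. As an illustration only, the sketch below shows how one step of a homogeneous automaton could be expressed with binary dot products: one set of dot products checks which STEs accept the current input symbol, and another checks which STEs are enabled by active predecessors. This is a software analogy under stated assumptions, not the RRAM Automata Processor circuit; the function names, the always-enabled start states, and the example pattern are all hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length 0/1 vectors."""
    return sum(a * b for a, b in zip(u, v))


def one_hot(symbol, alphabet):
    """Encode the input symbol as a one-hot vector over the alphabet."""
    return [1 if symbol == a else 0 for a in alphabet]


def step(active, symbol_vec, ste_columns, routing_columns, start):
    """Advance the automaton by one input symbol."""
    # An STE matches when its stored symbol class overlaps the one-hot input.
    matched = [1 if dot(symbol_vec, col) > 0 else 0 for col in ste_columns]
    # An STE is enabled when it is a start state (assumed always enabled here)
    # or when at least one active predecessor routes to it.
    enabled = [
        1 if s or dot(active, col) > 0 else 0
        for s, col in zip(start, routing_columns)
    ]
    return [m & e for m, e in zip(matched, enabled)]


def run(text, alphabet, ste_classes, edges, start, accepting):
    """Return the input positions at which an accepting STE becomes active."""
    n = len(ste_classes)
    # Column j of the STE array holds the symbol class of STE j.
    ste_columns = [[1 if a in cls else 0 for a in alphabet] for cls in ste_classes]
    # Column j of the routing array lists the predecessors of STE j.
    routing_columns = [
        [1 if (i, j) in edges else 0 for i in range(n)] for j in range(n)
    ]
    active = [0] * n
    hits = []
    for pos, ch in enumerate(text):
        active = step(active, one_hot(ch, alphabet), ste_columns,
                      routing_columns, start)
        if dot(active, accepting) > 0:
            hits.append(pos)
    return hits
```

For the hypothetical pattern `ab+c`, with STE classes `[{"a"}, {"b"}, {"c"}]`, edges `{(0, 1), (1, 1), (1, 2)}`, start vector `[1, 0, 0]`, and accepting vector `[0, 0, 1]`, calling `run("xabbcabc", "abcx", ...)` returns `[4, 7]`, the positions of the two closing `c` symbols.
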
Original abstract

CMOS technology and its continuous scaling have made electronics and computers accessible and affordable for almost everyone on the globe; in addition, they have enabled the solutions of a wide range of societal problems and applications. Today, however, both the technology and the computer architectures are facing severe challenges/walls making them incapable of providing the demanded computing power with tight constraints. This motivates the need for the exploration of novel architectures based on new device technologies; not only to sustain the financial benefit of technology scaling, but also to develop solutions for extremely demanding emerging applications. This paper presents two computation-in-memory based accelerators making use of emerging memristive devices; they are Memristive Vector Processor and RRAM Automata Processor. The preliminary results of these two accelerators show significant improvement in terms of latency, energy and area as compared to today's architectures and design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents two computation-in-memory accelerators based on emerging memristive devices: the Memristive Vector Processor and the RRAM Automata Processor. It states that preliminary results of these accelerators demonstrate significant improvements in latency, energy, and area compared to today's architectures and designs.

Significance. If the performance claims can be substantiated with detailed, reproducible results that account for device non-idealities, the work could contribute to alternatives for overcoming CMOS scaling walls and the von Neumann bottleneck in data-intensive applications.

major comments (2)
  1. [Abstract] The central claim that the accelerators show 'significant improvement in terms of latency, energy and area' rests on unspecified 'preliminary results' but supplies no data, error bars, methods, benchmarks, or verification, so the support for the performance claim cannot be evaluated.
  2. [Abstract] The architectural proposals do not appear to include sensitivity analysis or Monte-Carlo variation sweeps; without bounding performance under realistic memristor statistics (resistance-state variability, read/write disturb, limited endurance, sneak-path effects), the reported latency/energy/area advantages may not survive translation from idealized models to fabricated devices.
minor comments (1)
  1. [Abstract] The abstract would benefit from naming the specific applications or benchmarks used for the latency/energy/area comparisons.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment point-by-point below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the accelerators show 'significant improvement in terms of latency, energy and area' rests on unspecified 'preliminary results' but supplies no data, error bars, methods, benchmarks, or verification, so the support for the performance claim cannot be evaluated.

    Authors: The abstract summarizes results presented in the body of the manuscript. We agree the abstract is insufficiently specific. In revision we will expand the abstract to name the benchmarks (vector operations and automata workloads), comparison baselines, and simulation methodology, while retaining the high-level claim and pointing readers to the quantitative data and methods in the main text.
    Revision: yes

  2. Referee: [Abstract] The architectural proposals do not appear to include sensitivity analysis or Monte-Carlo variation sweeps; without bounding performance under realistic memristor statistics (resistance-state variability, read/write disturb, limited endurance, sneak-path effects), the reported latency/energy/area advantages may not survive translation from idealized models to fabricated devices.

    Authors: The presented results use idealized device models. We accept that this limits the strength of the claims. The revision will add a new subsection discussing the listed non-idealities, providing first-order analytical bounds on their impact and stating the modeling assumptions explicitly. Full Monte-Carlo sweeps are beyond the current scope but will be noted as future work.
    Revision: partial
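
The exchange above turns on Monte-Carlo variation sweeps that the paper does not include. As an illustration only, the sketch below shows what a very small sweep of that kind might look like for a single binary dot-product column. It is not the authors' model: the conductances `G_ON` and `G_OFF`, the read voltage `V_READ`, the lognormal spread `sigma`, and every function name are assumptions made for this example, and only resistance-state variability is modeled (no read/write disturb, endurance limits, or sneak paths).

```python
import random

# Hypothetical device parameters, chosen only for illustration.
G_ON = 100e-6   # conductance of a low-resistance cell, in siemens
G_OFF = 1e-6    # conductance of a high-resistance cell, in siemens
V_READ = 0.2    # read voltage applied to an active row, in volts


def column_current(inputs, weights, sigma, rng):
    """Sum the cell currents of one column, with lognormal conductance spread."""
    total = 0.0
    for x, w in zip(inputs, weights):
        nominal = G_ON if w else G_OFF
        # sigma is the standard deviation of ln(G); sigma == 0 is the ideal device.
        factor = rng.lognormvariate(0.0, sigma) if sigma > 0 else 1.0
        total += x * V_READ * nominal * factor
    return total


def decode(current):
    """Round the sensed current to the nearest whole number of ON cells."""
    return round(current / (V_READ * G_ON))


def error_rate(n_rows, sigma, trials=2000, seed=0):
    """Fraction of random trials whose decoded dot product is wrong."""
    rng = random.Random(seed)  # fixed seed keeps the sweep repeatable
    errors = 0
    for _ in range(trials):
        inputs = [rng.randint(0, 1) for _ in range(n_rows)]
        weights = [rng.randint(0, 1) for _ in range(n_rows)]
        exact = sum(x * w for x, w in zip(inputs, weights))
        if decode(column_current(inputs, weights, sigma, rng)) != exact:
            errors += 1
    return errors / trials
```

A sweep such as `[error_rate(16, s) for s in (0.0, 0.1, 0.3)]` would indicate how quickly the decoded result degrades under these assumed statistics. With `sigma = 0` and 16 rows the sketch reports no errors, because the summed OFF-state leakage stays below half of one ON-cell current.
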

Circularity Check

0 steps flagged

No circularity; architectural claims rest on preliminary results without derivations or self-referential fitting

full rationale

The paper presents two computation-in-memory accelerators (Memristive Vector Processor and RRAM Automata Processor) and states that their preliminary results show improvements in latency, energy, and area. No equations, derivations, fitted parameters, or load-bearing self-citations appear in the abstract or described content. The central claim is an empirical assertion about unspecified simulation outcomes rather than any reduction of a prediction to its own inputs by construction, imported uniqueness theorems, or ansatz smuggling. This is the normal case of a proposal paper whose validity hinges on external validation of the results, not on internal circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no mathematical derivations, fitted parameters, axioms, or new postulated entities; the proposal relies on existing concepts of memristive devices without additional detail.

pith-pipeline@v0.9.0 · 5681 in / 997 out tokens · 19302 ms · 2026-05-24T19:44:24.723927+00:00 · methodology

