pith. machine review for the scientific record.
arxiv: 2604.20178 · v1 · submitted 2026-04-22 · 📡 eess.SY · cs.SY

Design Space Exploration for ReRAM-based Architectures to Address Scaling Non-idealities

Pith reviewed 2026-05-10 00:23 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords ReRAM · in-memory computing · design space exploration · non-idealities · scaling · energy efficiency · IMC · parameter extraction

The pith

A parameter-extraction testbench lets designers optimize ReRAM array size, ADC resolution, and frequency without exhaustive simulations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that extracts modeling parameters from a small number of representative transistor-level simulations and uses them to predict performance across many possible ReRAM-based in-memory computing (IMC) architectures. This matters because larger arrays amortize the cost of DACs and ADCs over more cells but also increase parasitics and non-idealities, while current practice requires slow, iterative full simulations at the early design stage. The method is demonstrated on two design cases that maximize energy efficiency, measured in TOPs/s/W, while staying within given power and error budgets.
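
The amortization argument can be made concrete with a first-order sketch. It assumes, purely for illustration, an N×N array in which each matrix-vector multiplication costs one DAC conversion per row and one ADC conversion per column; the symbols E_DAC, E_ADC, and E_cell are placeholders introduced here, not quantities taken from the paper.

```latex
E_{\mathrm{MAC}}(N) \approx \frac{N\,(E_{\mathrm{DAC}} + E_{\mathrm{ADC}}) + N^{2} E_{\mathrm{cell}}}{N^{2}}
                    = \frac{E_{\mathrm{DAC}} + E_{\mathrm{ADC}}}{N} + E_{\mathrm{cell}}
```

Under these assumptions the peripheral share of the energy per multiply-accumulate falls as 1/N, so doubling N halves it. The sketch deliberately omits the parasitic effects that grow with N, which are what limit how far the array can usefully be scaled.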

Core claim

A specialized testbench extracts parameters from limited transistor-level simulations; these parameters then accurately predict the behavior of arbitrary ReRAM IMC architectures, enabling selection of optimal array size, ADC resolution, and system frequency that maximize energy efficiency under power and error constraints.

What carries the argument

The specialized testbench that extracts parameters from a limited set of representative transistor-level simulations to model scaling non-idealities and predict performance of arbitrary ReRAM IMC configurations.

If this is right

  • Designers can identify valid configurations that satisfy power and error limits without running exhaustive simulations for each candidate.
  • Energy efficiency measured in TOPs/s/W can be maximized at the architectural level by trading array size against peripheral overhead.
  • Early-stage design decisions on array size, ADC bits, and clock frequency become feasible before committing to detailed circuit simulation.
  • The trade-off between amortizing peripheral power and incurring more parasitic effects can be quantified quickly for scaling ReRAM arrays.
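
The kind of constrained search described in these points can be sketched as a small grid search. Everything below is an illustrative assumption rather than the paper's model: the coefficient values, the toy `energy_per_mvm` and `error` formulas, and the function names are hypothetical, and tera-operations per joule is used as a stand-in for the efficiency metric.

```python
from itertools import product

# Placeholder coefficients for illustration only; none are taken from the paper.
E_DAC = 1.0e-12        # energy per DAC conversion, in joules
E_ADC_STEP = 0.05e-12  # ADC energy per conversion per quantization level, in joules
E_CELL = 1.0e-15       # energy per cell read, in joules
K_PARASITIC = 2.0e-4   # error added per unit of array size (hypothetical)


def energy_per_mvm(n, adc_bits):
    """Energy of one n-by-n matrix-vector multiply: n DACs, n ADCs, n*n cells."""
    return n * E_DAC + n * E_ADC_STEP * 2 ** adc_bits + n * n * E_CELL


def power(n, adc_bits, freq_hz):
    """Average power when one multiply is issued per clock cycle."""
    return energy_per_mvm(n, adc_bits) * freq_hz


def error(n, adc_bits):
    """Toy error model: parasitic term grows with n, quantization term shrinks with bits."""
    return K_PARASITIC * n + 2.0 ** (-adc_bits)


def tera_ops_per_joule(n, adc_bits):
    """Efficiency, counting each multiply-accumulate as two operations."""
    return 2 * n * n / energy_per_mvm(n, adc_bits) / 1e12


def explore(sizes, adc_bit_options, freqs_hz, p_max, err_max):
    """Return the feasible configuration with the highest efficiency, or None."""
    best = None
    for n, bits, freq in product(sizes, adc_bit_options, freqs_hz):
        if power(n, bits, freq) > p_max or error(n, bits) > err_max:
            continue  # violates the power or error budget
        candidate = {
            "n": n,
            "adc_bits": bits,
            "freq_hz": freq,
            "efficiency": tera_ops_per_joule(n, bits),
            "throughput": 2 * n * n * freq,
        }
        # Ties in efficiency are broken in favour of higher throughput.
        key = (candidate["efficiency"], candidate["throughput"])
        if best is None or key > (best["efficiency"], best["throughput"]):
            best = candidate
    return best
```

In a real flow the toy formulas would be replaced by whatever surrogate model the extracted parameters define; the search loop itself only needs callable power, error, and efficiency estimates.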

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same extraction approach could be reused for other memory technologies that face analogous scaling versus peripheral trade-offs.
  • Integration of the framework into automated design flows would allow larger system-level explorations that include multiple IMC tiles.
  • Validation against measured silicon results for at least one non-representative configuration would strengthen confidence in the parameter transfer.

Load-bearing premise

Parameters taken from a small number of representative transistor-level simulations remain accurate when applied to any arbitrary architecture.

What would settle it

A full transistor-level simulation of a new array size or ADC configuration outside the representative set shows large errors in predicted power, error rate, or TOPs/s/W compared with the framework output.
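
One way to run such a check is sketched below: compare framework predictions with reference simulation values for configurations outside the representative set. The function names and the tolerance argument are hypothetical, and the caller must supply the measured or simulated values; no data from the paper is assumed.

```python
import math


def relative_errors(predicted, simulated):
    """Per-configuration relative error |p - s| / |s|; simulated values must be non-zero."""
    return [abs(p - s) / abs(s) for p, s in zip(predicted, simulated)]


def rmse(predicted, simulated):
    """Root-mean-square error between predictions and reference simulations."""
    squared = [(p - s) ** 2 for p, s in zip(predicted, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def prediction_holds(predicted, simulated, tolerance):
    """True if every held-out configuration is predicted within the given relative tolerance."""
    return max(relative_errors(predicted, simulated)) <= tolerance
```

The same comparison would be repeated for each predicted metric (power, error rate, and energy efficiency), with the tolerance chosen by the designer.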

Figures

Figures reproduced from arXiv: 2604.20178 by Ching-Yi Lin, Sahil Shah.

Figure 1: ReRAM array scaling on energy and parasitics: (a) A …
Figure 2: Testbench setup: (a) Each input DAC sequentially …
Figure 3: (a) Spatial distribution of $G_{\mathrm{eff}}$ in a 256×256 array. (b) Cumulative conductance $\sum_{i}^{M}\sum_{j}^{M} G_{N,\mathrm{eff}}[i,j]$ for varying N and M, demonstrating that the total conductance of an M×M array can be approximated using the extracted conductance $G_{N,\mathrm{eff}}$ from a larger array. The per-cell RMSE is visualized in …
Figure 4: (a-b) RMSE distribution for a smaller 64×64 array and a larger 256×256 array. (c) Normalized RMSE as a function of normalized array size. The consistency of curves demonstrates the same error distribution for various array sizes N. [Plot residue: maximum RMSE versus array size for ADC resolutions of 3, 4, 6, 8, 12, and 16 bits; maximum RMSE versus ADC resolution for N = 16, 32, 64, 128, 192, …]
Figure 6: Size-frequency tradeoff under a power constraint: (a) …
Original abstract

ReRAM-based in-memory computing (IMC) architectures are promising candidates for energy-efficient matrix-vector multiplication. While scaling the size of ReRAM arrays allows for the amortization of power-hungry peripheral circuits like DACs and ADCs, it simultaneously introduces more parasitics along the signal path. Current design methodologies often lack practical guidelines for balancing these effects at the early design stage, forcing designers to rely on time-consuming, iterative transistor-level simulations. In this work, we propose a comprehensive framework for design space exploration that enables the selection of optimal array size, ADC resolution, and system frequency without requiring exhaustive simulations. The framework utilizes a specialized testbench to extract parameters from a limited set of representative transistor-level simulations. These parameters are then used to accurately predict the performance of arbitrary architectures. We demonstrate the effectiveness of this framework through two realistic design cases aimed at maximizing energy efficiency (TOPs/s/W). The results show that the framework successfully identifies optimal architectural configurations under strict power and error constraints, providing an efficient path for high-performance IMC design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a framework for design space exploration of ReRAM-based in-memory computing (IMC) architectures to address scaling non-idealities. It extracts parameters from a limited set of representative transistor-level simulations via a specialized testbench and uses these parameters to predict performance metrics for arbitrary array sizes, ADC resolutions, and frequencies without exhaustive simulations. Effectiveness is demonstrated through two realistic design cases focused on maximizing energy efficiency (TOPs/s/W) under strict power and error constraints.

Significance. If the extracted parameters enable reliable predictions across design spaces, the framework could substantially accelerate early-stage IMC architecture optimization by reducing dependence on time-intensive transistor-level simulations, allowing designers to efficiently explore trade-offs in array scaling, peripheral circuits, and operating conditions for energy-efficient ReRAM systems.

major comments (2)
  1. [Abstract] Abstract: the central claim that the extracted parameters 'accurately predict the performance of arbitrary architectures' lacks any quantitative support such as prediction error metrics, validation plots against full simulations, or explicit description of captured non-idealities, leaving the generalization from limited simulations unverified.
  2. [Demonstration] Demonstration section (two design cases): while optimal configurations are identified under power/error constraints, the absence of reported accuracy metrics or cross-validation against exhaustive simulations for the predicted TOPs/s/W values makes it impossible to assess whether the parameter-based model is load-bearing for the claimed efficiency gains.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the thorough review and valuable suggestions. The comments have helped us identify areas where the manuscript can be improved by providing stronger quantitative evidence for the framework's predictive capabilities. We will incorporate the suggested validations in the revised version.

  1. Referee: [Abstract] Abstract: the central claim that the extracted parameters 'accurately predict the performance of arbitrary architectures' lacks any quantitative support such as prediction error metrics, validation plots against full simulations, or explicit description of captured non-idealities, leaving the generalization from limited simulations unverified.

    Authors: We thank the referee for highlighting this important point. While the manuscript details the parameter extraction from representative simulations and applies them to predict performance in the design cases, we recognize that quantitative validation of the prediction accuracy (e.g., error metrics or plots comparing to full simulations) is not explicitly provided. This omission weakens the support for the generalization claim. In the revised manuscript, we will add a dedicated validation section that includes cross-validation results, prediction error metrics (such as RMSE for energy efficiency and error rates), and plots for various array sizes and ADC resolutions to explicitly demonstrate the accuracy of the extracted parameters. revision: yes

  2. Referee: [Demonstration] Demonstration section (two design cases): while optimal configurations are identified under power/error constraints, the absence of reported accuracy metrics or cross-validation against exhaustive simulations for the predicted TOPs/s/W values makes it impossible to assess whether the parameter-based model is load-bearing for the claimed efficiency gains.

    Authors: We agree that reporting accuracy metrics for the predictions in the demonstration cases would strengthen the paper and allow readers to better evaluate the framework's reliability. The current demonstration uses the framework to explore the design space and identify optima, but does not include direct comparisons to exhaustive simulations for those specific predictions. We will revise the demonstration section to include such validation, for example by performing full simulations at the identified optimal points and reporting the discrepancy in TOPs/s/W and other metrics, thereby confirming that the efficiency gains are indeed supported by the model. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes a framework that extracts parameters via a specialized testbench from a limited set of representative transistor-level simulations and then applies those parameters to predict performance metrics for arbitrary array sizes, ADC resolutions, and frequencies. This constitutes a standard calibration-and-prediction modeling workflow rather than any self-definitional loop, fitted input renamed as prediction by construction, or load-bearing self-citation. No equations are presented that reduce the claimed predictions to the input simulation data by algebraic identity or statistical necessity; the exploration of design space under power and error constraints retains independent content. The provided text contains no uniqueness theorems, ansatzes smuggled via citation, or renamings of known results that would trigger circularity under the specified criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no concrete free parameters, axioms, or invented entities can be extracted from the manuscript text.

pith-pipeline@v0.9.0 · 5483 in / 1073 out tokens · 28693 ms · 2026-05-10T00:23:39.792025+00:00 · methodology

