Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends
Pith reviewed 2026-05-23 22:15 UTC · model grok-4.3
The pith
FeFET is best suited for large synaptic crossbar arrays due to its small layout height and high state distinguishability, with ReRAM matching it on higher bit-slices and complex datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with CIFAR-100), we found that ReRAM matches the performance of FeFET. On ResNet-20 (with CIFAR-10), PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement.
What carries the argument
Evaluation of array size, bit-slice number, partial wordline activation (PWA), and custom ADC reference levels on 8T SRAM, FeFET, ReRAM, and SOT-MRAM crossbars for DNN inference accuracy.
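The bit-slice knob can be illustrated with a minimal sketch. It assumes unsigned integer weights and ideal devices; the function names `slice_weight` and `sliced_dot` are hypothetical and do not come from the paper. Each weight is split into slices of `bits_per_slice` bits, each slice position is accumulated as its own column sum, and the column sums are recombined by shift-and-add.

```python
def slice_weight(weight, total_bits, bits_per_slice):
    """Split an unsigned integer weight into slices, least significant first."""
    if not 0 <= weight < (1 << total_bits):
        raise ValueError("weight does not fit in total_bits")
    mask = (1 << bits_per_slice) - 1
    n_slices = -(-total_bits // bits_per_slice)  # ceiling division
    return [(weight >> (i * bits_per_slice)) & mask for i in range(n_slices)]


def sliced_dot(inputs, weights, total_bits, bits_per_slice):
    """Dot product computed one slice column at a time, then shift-and-add."""
    n_slices = -(-total_bits // bits_per_slice)
    sliced = [slice_weight(w, total_bits, bits_per_slice) for w in weights]
    total = 0
    for s in range(n_slices):
        # Ideal column sum for slice position s (no non-idealities modeled).
        column_sum = sum(x * sl[s] for x, sl in zip(inputs, sliced))
        total += column_sum << (s * bits_per_slice)
    return total


# Illustrative values only (assumption, not data from the paper).
inputs = [1, 0, 1, 1]
weights = [13, 7, 2, 9]
direct = sum(x * w for x, w in zip(inputs, weights))
print(direct, sliced_dot(inputs, weights, 4, 2), sliced_dot(inputs, weights, 4, 1))
```

In this idealized setting every slicing gives the same result. A higher bit-slice uses fewer columns per weight but requires each device to hold more distinguishable states.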
If this is right
- FeFET is preferred for large arrays over SRAM, ReRAM, and SOT-MRAM.
- ReRAM matches FeFET performance at higher bit-slices with ResNet-50 on CIFAR-100.
- PWA can improve accuracy by up to 32.56% (ResNet-20 on CIFAR-10).
- Custom ADC reference levels can improve accuracy by up to 31.62% (ResNet-20 on CIFAR-10).
- The technologies respond differently to these accuracy-enhancing techniques.
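A toy sketch of partial wordline activation may help make the idea concrete. It assumes a single column with binary inputs and weights, and it uses a saturating ADC as a stand-in for hardware non-idealities. The paper's actual non-ideality models are not reproduced here, and the names `adc` and `column_mac` are hypothetical.

```python
def adc(value, max_code):
    """Toy ADC: passes integer partial sums through but saturates at max_code."""
    return min(max(value, 0), max_code)


def column_mac(inputs, weights, rows_per_cycle, max_code):
    """Accumulate one column, activating rows_per_cycle wordlines per cycle."""
    if rows_per_cycle <= 0:
        raise ValueError("rows_per_cycle must be positive")
    total = 0
    for start in range(0, len(inputs), rows_per_cycle):
        stop = start + rows_per_cycle
        partial = sum(x * w for x, w in zip(inputs[start:stop], weights[start:stop]))
        total += adc(partial, max_code)  # digitize each partial sum, add digitally
    return total


# Illustrative values only (assumption, not data from the paper).
inputs = [1] * 8
weights = [1, 1, 0, 1, 1, 1, 1, 0]
exact = sum(x * w for x, w in zip(inputs, weights))

full = column_mac(inputs, weights, rows_per_cycle=8, max_code=4)  # all rows at once
pwa = column_mac(inputs, weights, rows_per_cycle=4, max_code=4)   # two cycles of 4 rows
print(exact, full, pwa)
```

In this toy model, activating all eight rows drives the column sum past the ADC range, while two four-row cycles keep each partial sum within range. The price is one extra cycle.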
Where Pith is reading between the lines
- Designers of in-memory computing hardware may favor FeFET when scaling array sizes.
- Testing these findings on additional DNN models could confirm the trends.
- The results highlight the importance of co-optimizing circuit solutions with memory technology choice.
- Future work might explore power or latency implications of these choices.
Load-bearing premise
The device compact models and circuit simulations at the 7 nm node accurately capture all relevant hardware non-idealities and their quantitative impact on end-to-end DNN inference accuracy.
What would settle it
Fabricating and measuring 7 nm crossbar arrays in these technologies would settle it: reversed accuracy rankings, or no benefit from PWA, would disprove the claims.
Original abstract
Crossbar memory arrays have been touted as the workhorse of in-memory computing (IMC)-based acceleration of Deep Neural Networks (DNNs), but the associated hardware non-idealities limit their efficacy. To address this, cross-layer design solutions that reduce the impact of hardware non-idealities on DNN accuracy are needed. In Part 1 of this paper, we established the co-optimization strategies for various memory technologies and their crossbar arrays, and conducted a comparative technology evaluation in the context of IMC robustness. In this part, we analyze various design knobs such as array size and bit-slice (number of bits per device) and their impact on the performance of 8T SRAM, ferroelectric transistor (FeFET), Resistive RAM (ReRAM) and spin-orbit-torque magnetic RAM (SOT-MRAM) in the context of inference accuracy at 7nm technology node. Further, we study the effect of circuit design solutions such as Partial Wordline Activation (PWA) and custom ADC reference levels that reduce the hardware non-idealities and comparatively analyze the response of each technology to such accuracy enhancing techniques. Our results on ResNet-20 (with CIFAR-10) show that PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement. We observe that compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with Cifar-100) we found that ReRAM matches the performance of FeFET.
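The idea behind custom ADC reference levels can be sketched under stated assumptions. The state currents below are hypothetical and chosen to compress at higher output states. Placing each reference at the midpoint between adjacent state currents is one plausible rule, not necessarily the one used in the paper.

```python
from bisect import bisect_right


def decode(current, references):
    """Output code = number of reference levels at or below the current."""
    return bisect_right(references, current)


def linear_references(n_states, unit_current):
    """Equally spaced references, halfway between ideal state currents."""
    return [(k + 0.5) * unit_current for k in range(n_states - 1)]


def custom_references(state_currents):
    """References at midpoints between adjacent observed state currents."""
    return [(a + b) / 2 for a, b in zip(state_currents, state_currents[1:])]


# Hypothetical non-ideal currents for ideal outputs 0..4 (unit current = 1.0).
state_currents = [0.0, 0.95, 1.8, 2.45, 2.9]

linear = linear_references(len(state_currents), 1.0)
custom = custom_references(state_currents)

linear_codes = [decode(i, linear) for i in state_currents]
custom_codes = [decode(i, custom) for i in state_currents]
print(linear_codes)
print(custom_codes)
```

With these made-up currents, the linear references misread the two highest states because the currents fall below the equally spaced thresholds. The midpoint references track the compressed currents and recover every state.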
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript compares the performance of 8T SRAM, FeFET, ReRAM, and SOT-MRAM in synaptic crossbar arrays for DNN inference at the 7 nm technology node. It examines the influence of design parameters including array size and bit-slice on accuracy, as well as the benefits of Partial Wordline Activation (PWA) and custom ADC reference levels. Key findings include accuracy improvements of up to 32.56% with PWA and 31.62% with custom ADC reference levels on ResNet-20/CIFAR-10, with FeFET identified as best suited for large arrays and ReRAM performing comparably at higher bit-slices on ResNet-50/CIFAR-100.
Significance. Should the compact models and simulations prove accurate, this work contributes comparative data on memory technologies for in-memory computing, highlighting effective design strategies to mitigate hardware non-idealities. It extends prior work in Part 1 by focusing on accuracy trends and circuit-level solutions.
major comments (2)
- [Abstract] Abstract: The reported accuracy deltas (32.56% from PWA, 31.62% from custom ADCs) and technology rankings (FeFET superiority for large arrays, ReRAM parity at higher bit-slices) rest entirely on 7 nm SPICE simulations; no error bars, sensitivity analysis to model parameters, or silicon validation of the compact models is referenced, directly affecting the load-bearing quantitative claims.
- [Device modeling and simulation methodology] Device modeling and simulation methodology sections: The central assumption that the chosen compact models quantitatively reproduce all dominant non-idealities (device variability, IR drop, ADC quantization, sense-amplifier offset) and that their aggregate effect on end-to-end DNN accuracy is faithfully captured lacks any cross-check against measured array data or ablation on model fidelity; any systematic mismatch would scale the reported deltas and rankings.
minor comments (2)
- [Abstract] Abstract: Reference to 'Part 1' lacks a citation or brief summary of its co-optimization strategies, reducing standalone readability.
- [Results] Results presentation: Tables and figures reporting accuracy trends should specify the number of Monte Carlo runs or variability sources considered to allow assessment of statistical significance.
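The statistical reporting requested above could look roughly like the following sketch. The function `simulate_accuracy` is a hypothetical placeholder for one crossbar-aware inference run under a given variability seed. Its nominal accuracy and spread are arbitrary illustration values, not results from the paper.

```python
import random
import statistics


def simulate_accuracy(seed, nominal=0.90, sigma=0.01):
    """Hypothetical stand-in for one Monte Carlo run; returns an accuracy in [0, 1]."""
    rng = random.Random(seed)  # seeded so every run is reproducible
    return min(1.0, max(0.0, rng.gauss(nominal, sigma)))


def summarize(n_runs):
    """Mean, sample standard deviation, and standard error over n_runs seeds."""
    runs = [simulate_accuracy(seed) for seed in range(n_runs)]
    mean = statistics.mean(runs)
    std = statistics.stdev(runs)
    stderr = std / n_runs ** 0.5
    return mean, std, stderr


n_runs = 50  # arbitrary choice for illustration
mean, std, stderr = summarize(n_runs)
print(f"accuracy = {mean:.4f} +/- {stderr:.4f} (n={n_runs}, std={std:.4f})")
```

Reporting the run count alongside the mean and spread lets a reader judge whether a difference between two technologies exceeds the run-to-run variation.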
Simulated Author's Rebuttal
We thank the referee for the constructive comments. Below we respond point-by-point to the major comments. This work is a simulation study at the 7 nm node; we address the implications of that scope honestly.
- Referee: [Abstract] Abstract: The reported accuracy deltas (32.56% from PWA, 31.62% from custom ADCs) and technology rankings (FeFET superiority for large arrays, ReRAM parity at higher bit-slices) rest entirely on 7 nm SPICE simulations; no error bars, sensitivity analysis to model parameters, or silicon validation of the compact models is referenced, directly affecting the load-bearing quantitative claims.
Authors: We agree that the quantitative deltas and rankings are obtained exclusively from the described 7 nm SPICE simulations. The manuscript will be revised to state explicitly in the abstract that all reported accuracy numbers and technology comparisons are simulation results. No error bars or parameter sensitivity sweeps were performed; adding them would require a substantial additional study that is outside the present scope. Silicon validation data for a consistent cross-technology comparison at 7 nm is not available to us.
Revision: partial
- Referee: [Device modeling and simulation methodology] Device modeling and simulation methodology sections: The central assumption that the chosen compact models quantitatively reproduce all dominant non-idealities (device variability, IR drop, ADC quantization, sense-amplifier offset) and that their aggregate effect on end-to-end DNN accuracy is faithfully captured lacks any cross-check against measured array data or ablation on model fidelity; any systematic mismatch would scale the reported deltas and rankings.
Authors: The methodology section specifies the compact models employed and the non-idealities that are included. We do not possess measured array-level data that could serve as an independent cross-check for all four technologies. We can add a short paragraph in the revised manuscript discussing the reliance on published compact-model fidelity and the consequent limitations on absolute accuracy predictions, while preserving the relative trends that are the paper's focus.
Revision: partial
- Silicon validation or measured array data against which to cross-check the compact models and the aggregate impact of all modeled non-idealities
Circularity Check
No circularity: results are direct simulation outputs
The paper reports DNN inference accuracy trends obtained from circuit simulations of crossbar arrays using compact models for 8T SRAM, FeFET, ReRAM and SOT-MRAM at the 7 nm node. Accuracy deltas (e.g., PWA up to 32.56 %, custom ADC up to 31.62 %) and technology rankings are stated as outputs of these end-to-end simulations under varying array sizes and bit-slices. No equations, fitted parameters, or self-referential definitions are present that would reduce any reported prediction to its own inputs by construction. The single reference to 'Part 1' concerns prior co-optimization strategies and is not load-bearing for the accuracy numbers or rankings presented here. The derivation chain therefore consists of independent simulation runs rather than any of the enumerated circular patterns.
Axiom & Free-Parameter Ledger