X-CHANGR: Changing Memristive Crossbar Mapping for Mitigating Line-Resistance Induced Accuracy Degradation in Deep Neural Networks
Pith reviewed 2026-05-25 12:22 UTC · model grok-4.3
The pith
Remapping the most sensitive kernels closer to the drivers in memristive crossbars cuts line-resistance induced accuracy loss for VGG16 on CIFAR10 from 5.6% to 2.1-2.9%, without retraining.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By performing sensitivity analysis on DNN weights and kernels and re-arranging crossbar columns so that the most sensitive kernels are placed closer to the drivers, the static remapping strategy (SRS) and dynamic remapping strategy (DRS) reduce the accuracy degradation due to line resistance from 5.6% to 2.9% and 2.1%, respectively, for a VGG16 network on CIFAR10 without retraining the weights.
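For scale, the reported figures can be restated as relative reductions in accuracy degradation. This is plain arithmetic on the numbers quoted above, not an additional result from the paper:

```latex
\[
\frac{5.6 - 2.9}{5.6} \approx 0.48 \quad (\text{SRS}),
\qquad
\frac{5.6 - 2.1}{5.6} = 0.625 \quad (\text{DRS})
\]
```

That is, SRS removes roughly 48% of the line-resistance induced degradation and DRS roughly 62.5%, both measured against the as-is mapping.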
What carries the argument
Sensitivity-ranked column remapping of kernels in resistive crossbars to minimize the effect of line-resistance induced voltage drops.
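The core remapping step can be sketched in a few lines. This is a minimal illustration, assuming a per-kernel sensitivity score is already available; the function names (`remap_columns`, `inverse_order`, `apply_column_order`) and the example scores are hypothetical and are not taken from the paper, which does not spell out SRS or DRS in the text summarized here.

```python
def remap_columns(sensitivity):
    """Return kernel indices in column order; position 0 is nearest the drivers.

    sensitivity[k] is a hypothetical per-kernel score (higher = more sensitive).
    """
    # Sort kernel indices by descending sensitivity; ties keep original order.
    return sorted(range(len(sensitivity)), key=lambda k: -sensitivity[k])


def inverse_order(order):
    """Permutation that restores the original kernel order after readout."""
    inverse = [0] * len(order)
    for pos, k in enumerate(order):
        inverse[k] = pos
    return inverse


def apply_column_order(weights, order):
    """Permute the columns of a row-major matrix given as a list of rows."""
    return [[row[k] for k in order] for row in weights]


scores = [0.1, 0.9, 0.4, 0.7]           # made-up sensitivities for 4 kernels
order = remap_columns(scores)            # [1, 3, 2, 0]
weights = [[10, 11, 12, 13],
           [20, 21, 22, 23]]
remapped = apply_column_order(weights, order)
restored = apply_column_order(remapped, inverse_order(order))
print(order)
print(remapped)
print(restored)
```

Because the columns are permuted, the crossbar outputs come out in permuted order as well, so whatever consumes them has to apply the inverse permutation (or have its own inputs permuted to match).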
If this is right
- SRS and DRS reduce accuracy degradation without retraining the network weights.
- The remapping strategies can be combined with existing mitigation methods such as in-situ compensation.
- Benefits are shown for VGG16 trained on CIFAR10, with SRS achieving 2.9% degradation and DRS achieving 2.1%.
- Static remapping uses a fixed arrangement while dynamic remapping allows adjustments during operation.
Where Pith is reading between the lines
- The same sensitivity ranking might be reused across similar network architectures if line resistance remains the dominant error source.
- Hardware implementations could incorporate the ranking step as a one-time preprocessing cost before deployment.
- Extending the method to multi-layer mappings might require checking interactions between layers' column assignments.
Load-bearing premise
A sensitivity analysis performed on the pre-trained network accurately identifies which kernels are most affected by column voltage drops in the physical crossbar.
What would settle it
Fabricating a crossbar array, applying the same VGG16 weights with and without the proposed remapping, and measuring the resulting accuracy on CIFAR10 test images.
Original abstract
There is widespread interest in emerging technologies, especially resistive crossbars, for accelerating Deep Neural Networks (DNNs). Resistive crossbars offer a highly parallel and efficient matrix-vector-multiplication (MVM) operation. MVM being the most dominant operation in DNNs makes crossbars ideally suited. However, various sources of device and circuit non-idealities lead to errors in the MVM output, thereby reducing DNN accuracy. Towards that end, we propose crossbar re-mapping strategies to mitigate line-resistance induced accuracy degradation in DNNs, without having to re-train the learned weights, unlike most prior works. Line-resistances degrade the voltage levels along the crossbar columns, thereby inducing more errors at the columns away from the drivers. We rank the DNN weights and kernels based on a sensitivity analysis, and re-arrange the columns such that the most sensitive kernels are mapped closer to the drivers, thereby minimizing the impact of errors on the overall accuracy. We propose two algorithms, static remapping strategy (SRS) and dynamic remapping strategy (DRS), to optimize the crossbar re-arrangement of a pre-trained DNN. We demonstrate the benefits of our approach on a standard VGG16 network trained on the CIFAR10 dataset. Our results show that SRS and DRS limit the accuracy degradation to 2.9% and 2.1%, respectively, compared to a 5.6% drop from an as-is mapping of weights and kernels to crossbars. We believe this work brings an additional aspect for optimization, which can be used in tandem with existing mitigation techniques, such as in-situ compensation, technology aware training and re-training approaches, to enhance system performance.
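The position dependence described in the abstract can be illustrated with a toy resistive-ladder model. This is a sketch under stated assumptions, not the paper's simulation setup: a single driven wire, each column tapping it through a cell conductance to ground, identical wire segments between taps, and no resistance on the orthogonal wires. The function name and all parameter values are hypothetical.

```python
def wire_voltages(v_in, conductances, r_wire):
    """Node voltages along one driven wire, nearest column first.

    Toy model: the driver holds v_in at the start of the wire; each column
    taps the wire through a cell conductance to ground, and adjacent taps
    are separated by a wire segment of resistance r_wire.
    """
    n = len(conductances)
    # Kirchhoff's current law at node k, multiplied by r_wire:
    #   -V[k-1] + (2 + r*G[k]) * V[k] - V[k+1] = 0   (interior nodes)
    #   -V[n-2] + (1 + r*G[n-1]) * V[n-1]      = 0   (last node)
    diag = [2.0 + r_wire * g for g in conductances]
    diag[-1] = 1.0 + r_wire * conductances[-1]
    rhs = [0.0] * n
    rhs[0] = v_in  # the known driver voltage moves to the right-hand side

    # Thomas algorithm; both off-diagonals of the system are -1.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] + c_prime[i - 1]
        c_prime[i] = -1.0 / m
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / m
    volts = [0.0] * n
    volts[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        volts[i] = d_prime[i] - c_prime[i] * volts[i + 1]
    return volts


# Hypothetical values: 64 columns, 100 uS cells, 2 ohm per wire segment.
volts = wire_voltages(1.0, [1e-4] * 64, 2.0)
ideal = wire_voltages(1.0, [1e-4] * 8, 0.0)
print("nearest column: %.4f V, farthest column: %.4f V" % (volts[0], volts[-1]))
```

In this model the voltage seen by a cell falls monotonically with its distance from the driver, which is the effect the remapping exploits. A realistic evaluation would also need wire resistance in the other direction and the actual programmed conductances.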
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents X-CHANGR, a set of crossbar remapping strategies (SRS and DRS) for mitigating line-resistance induced accuracy loss in memristive crossbar implementations of DNNs. The approach ranks kernels by sensitivity computed on the pre-trained model and remaps columns to place high-sensitivity kernels closer to the drivers. On VGG16 trained on CIFAR-10, it reports reducing accuracy degradation from 5.6% (naive mapping) to 2.9% (SRS) and 2.1% (DRS) without retraining the weights.
Significance. If the empirical results hold under more rigorous validation, this provides a low-overhead, post-training optimization for resistive-crossbar DNN accelerators that can be used alongside existing compensation or training-aware techniques. The concrete numbers on a standard VGG16/CIFAR-10 benchmark indicate practical relevance for hardware mapping.
major comments (2)
- [Abstract] Abstract and results: the central claims report accuracy degradation limited to 2.9% (SRS) and 2.1% (DRS) versus 5.6% baseline, yet supply no error bars, no description of how the sensitivity ranking is computed, and no ablation of the ranking method itself. These omissions leave the numerical support for the mitigation only partially substantiated.
- [Proposed remapping strategies] Proposed approach: sensitivity is derived from the ideal pre-trained network without line resistance. Column voltage drop is a position-dependent, conductance-weighted effect; the manuscript provides no simulation or test confirming that the pre-trained ranking predicts actual per-kernel error once kernels are mapped to specific columns. If this correlation is weak, the observed benefit may be specific to the VGG16 mapping rather than a general property of the method.
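The ranking ablation asked for in the first major comment can be prototyped on a proxy before any full inference run. The sketch below assumes a hypothetical cost, the sum over columns of kernel sensitivity times a position-dependent error, with made-up scores and an error profile assumed to grow linearly with distance from the driver. It compares the sensitivity-ranked order against random permutations; it is not the paper's accuracy metric.

```python
import random


def proxy_cost(order, sensitivity, position_error):
    """Sum over columns of (kernel sensitivity) x (error at that column)."""
    return sum(sensitivity[k] * position_error[pos]
               for pos, k in enumerate(order))


rng = random.Random(0)
n = 32
sensitivity = [rng.random() for _ in range(n)]             # made-up scores
position_error = [0.01 * (pos + 1) for pos in range(n)]    # assumed profile

# Sensitivity-ranked order: most sensitive kernel at position 0.
ranked = sorted(range(n), key=lambda k: -sensitivity[k])
identity = list(range(n))                                  # as-is mapping

random_costs = []
for _ in range(200):
    perm = identity[:]
    rng.shuffle(perm)
    random_costs.append(proxy_cost(perm, sensitivity, position_error))

ranked_cost = proxy_cost(ranked, sensitivity, position_error)
identity_cost = proxy_cost(identity, sensitivity, position_error)
mean_random = sum(random_costs) / len(random_costs)
print(ranked_cost, identity_cost, mean_random)
```

Under this proxy the ranked order is optimal by the rearrangement inequality, so the comparison mainly shows how large the gap to a random mapping is. Whether the same gap appears in end-to-end accuracy is exactly what a full ablation would have to establish.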
minor comments (1)
- [Abstract] The abstract would benefit from a one-sentence statement of the line-resistance model parameters (e.g., resistance per segment, crossbar dimensions) used to generate the reported numbers.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on substantiating the numerical claims and validating the sensitivity-based approach. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.
- Referee: [Abstract] Abstract and results: the central claims report accuracy degradation limited to 2.9% (SRS) and 2.1% (DRS) versus 5.6% baseline, yet supply no error bars, no description of how the sensitivity ranking is computed, and no ablation of the ranking method itself. These omissions leave the numerical support for the mitigation only partially substantiated.
  Authors: We agree that the abstract and results would benefit from greater detail. In revision we will add an explicit description of the sensitivity computation (ranking kernels by the accuracy impact of position-dependent weight perturbations in the ideal pre-trained model). We will also insert an ablation comparing sensitivity-ranked remapping against random column permutation to isolate the contribution of the ranking. The reported figures are deterministic simulation outcomes on a fixed model and mapping; we will clarify this and note the absence of stochastic variation rather than add error bars.
  revision: partial
- Referee: [Proposed remapping strategies] Proposed approach: sensitivity is derived from the ideal pre-trained network without line resistance. Column voltage drop is a position-dependent, conductance-weighted effect; the manuscript provides no simulation or test confirming that the pre-trained ranking predicts actual per-kernel error once kernels are mapped to specific columns. If this correlation is weak, the observed benefit may be specific to the VGG16 mapping rather than a general property of the method.
  Authors: The ranking is deliberately computed on the ideal model because the method is a post-training, no-retrain optimization. The rationale is that kernels whose outputs are most sensitive to perturbations will suffer disproportionately from the larger voltage drops at distant columns. While the original manuscript relies on end-to-end accuracy improvement as evidence, we acknowledge the value of an explicit correlation check. We will add a new analysis (and figure) that maps each kernel’s sensitivity score against its measured output error at different column positions, thereby testing whether the pre-trained ranking predicts per-kernel degradation under line resistance.
  revision: yes
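One way to frame the promised correlation check is a Spearman rank correlation between each kernel's sensitivity score and its measured error at a fixed column position. The sketch below is an assumption about how such a check could be run, not the authors' analysis; the helper names and the sample values are placeholders, not data from the paper.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder inputs: one entry per kernel.
sensitivity = [0.1, 0.9, 0.4, 0.7, 0.2]
measured_error = [0.02, 0.30, 0.05, 0.11, 0.03]
rho = spearman(sensitivity, measured_error)
print(rho)
```

A coefficient near 1 would support the premise that the ideal-model ranking predicts per-kernel degradation; a value near 0 would suggest the end-to-end gain has another explanation. Repeating the check at several column positions would address the referee's point about position dependence.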
Circularity Check
No circularity: empirical remapping validated by direct measurement on fixed network
The paper presents SRS and DRS as heuristic algorithms that reorder columns according to a sensitivity ranking computed once from the ideal pre-trained VGG16 on CIFAR-10; the reported accuracy figures (2.9% / 2.1% vs 5.6%) are obtained by applying those fixed reorderings and measuring the resulting MVM error under line-resistance simulation. No equation, fitted parameter, or self-citation is invoked to derive or guarantee those numbers; the outcome is an independent empirical observation on the chosen mapping. The derivation chain therefore contains no self-definitional, fitted-input, or self-citation reduction and remains self-contained.
Forward citations
Cited by 1 Pith paper
- Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends
Simulation study at 7 nm finds FeFET best for large arrays on ResNet-20/CIFAR-10, ReRAM competitive at higher bit-slices on ResNet-50/CIFAR-100, with partial wordline activation and custom ADC levels each raising accu...