Sequential Learning and Catastrophic Forgetting in Differentiable Resistor Networks
Pith reviewed 2026-05-09 14:05 UTC · model grok-4.3
The pith
Differentiable resistor networks learn single input-output mappings via conductance tuning but suffer catastrophic forgetting when trained sequentially on conflicting tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Although individual input-output mappings can be learned by gradient-based adjustment of edge conductances in resistor networks governed by Kirchhoff's laws, sequential training on conflicting tasks produces catastrophic forgetting. Forgetting is controlled by task conflict and by the degree of adaptation to the new task. Uniform anchoring and normalised gradient-weighted anchoring reduce forgetting only by increasing the final loss on the new task. Forgetting is associated with localised conductance changes on high-current edges, giving a physical interpretation as reconfiguration of dominant transport pathways. Broader random-task ensembles show that the strongest forgetting occurs when a
What carries the argument
Gradient-based adjustment of edge conductances in networks obeying Kirchhoff's current and voltage laws, which enforces physical equilibrium at every training step while allowing the conductances to serve as trainable parameters.
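The mechanism described here can be made concrete with a small sketch. The code below is not the authors' implementation: it assumes voltage-clamped input nodes, a squared-error loss on chosen free output nodes, and plain gradient descent with a positivity floor on the conductances. All function names, the five-edge bridge network, the learning rate, and the target value are hypothetical. The gradient is obtained with the standard adjoint identity for linear resistor networks, so equilibrium is re-solved at every step.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def equilibrium(n, edges, g, fixed):
    """Node voltages satisfying Kirchhoff's current law at every free node.

    fixed maps clamped nodes to their imposed voltages. Assumes a connected
    graph with positive conductances, so the reduced system is non-singular.
    """
    L = [[0.0] * n for _ in range(n)]  # weighted graph Laplacian
    for (i, j), ge in zip(edges, g):
        L[i][i] += ge
        L[j][j] += ge
        L[i][j] -= ge
        L[j][i] -= ge
    free = [k for k in range(n) if k not in fixed]
    A = [[L[a][b] for b in free] for a in free]
    rhs = [-sum(L[a][b] * vb for b, vb in fixed.items()) for a in free]
    v = [0.0] * n
    for k, val in fixed.items():
        v[k] = val
    for k, val in zip(free, solve_linear(A, rhs)):
        v[k] = val
    return v, free, A


def loss_and_grad(n, edges, g, fixed, targets):
    """Squared-error loss on free output nodes and its gradient w.r.t. g."""
    v, free, A = equilibrium(n, edges, g, fixed)
    loss = 0.5 * sum((v[k] - t) ** 2 for k, t in targets.items())
    dloss_dv = [v[k] - targets[k] if k in targets else 0.0 for k in free]
    lam = [0.0] * n  # adjoint voltages, zero on clamped nodes
    for k, val in zip(free, solve_linear(A, dloss_dv)):
        lam[k] = val
    grad = [-(lam[i] - lam[j]) * (v[i] - v[j]) for i, j in edges]
    return loss, grad


def train(n, edges, g, fixed, targets, lr=0.5, steps=200, g_min=1e-3):
    """Gradient descent on conductances, clipped to stay positive."""
    g = list(g)
    for _ in range(steps):
        _, grad = loss_and_grad(n, edges, g, fixed, targets)
        g = [max(g_min, ge - lr * d) for ge, d in zip(g, grad)]
    return g


# Hypothetical bridge network: node 0 clamped at 1 V, node 3 grounded,
# node 1 is the output and should settle at 0.7 V.
edges = [(0, 1), (1, 3), (0, 2), (2, 3), (1, 2)]
g0 = [1.0] * len(edges)
fixed = {0: 1.0, 3: 0.0}
targets = {1: 0.7}
loss_before, grad_before = loss_and_grad(4, edges, g0, fixed, targets)
g_trained = train(4, edges, g0, fixed, targets)
loss_after, _ = loss_and_grad(4, edges, g_trained, fixed, targets)
print(loss_before, loss_after)
```

Training a second task would reuse `train` starting from `g_trained` with a different `targets` dictionary; re-evaluating the first task's loss afterwards gives one simple measure of forgetting.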
If this is right
- Sequential training on tasks whose output orderings oppose each other produces the largest forgetting.
- Anchoring conductances reduces forgetting only at the expense of higher error on the newly learned task.
- Forgetting appears as concentrated conductance shifts along the highest-current pathways, reconfiguring the dominant routes for current flow.
- Across different random graph families the forgetting-adaptation balance changes, with topology acting as an independent control variable.
- The same networks can be used to quantify how much task similarity modulates the severity of forgetting.
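The anchoring trade-off in the list above can be illustrated with a quadratic penalty that pulls conductances back toward their values after the first task. This is a sketch under assumptions: the summary does not give the paper's exact penalty, so the quadratic form, the rescaling of importance weights to mean one, and every name below are hypothetical. The importance scores passed to `normalised_gradient_weights` might, for instance, be gradient magnitudes accumulated while training on the first task.

```python
def uniform_weights(num_edges):
    """Uniform anchoring: every edge is held back equally."""
    return [1.0] * num_edges


def normalised_gradient_weights(importance, eps=1e-12):
    """Weights proportional to |importance_e|, rescaled to have mean 1.

    The mean-one normalisation is an assumption made for this sketch.
    """
    mags = [abs(d) for d in importance]
    mean = sum(mags) / len(mags)
    return [m / (mean + eps) for m in mags]


def anchor_penalty(g, g_anchor, weights, strength):
    """0.5 * strength * sum_e w_e * (g_e - g_anchor_e)^2."""
    return 0.5 * strength * sum(
        w * (a - b) ** 2 for w, a, b in zip(weights, g, g_anchor)
    )


def anchored_step(g, grad_new, g_anchor, weights, strength, lr, g_min=1e-3):
    """One descent step on new-task loss plus the anchoring penalty."""
    return [
        max(g_min, ge - lr * (dn + strength * w * (ge - ga)))
        for ge, dn, w, ga in zip(g, grad_new, weights, g_anchor)
    ]


# With no new-task gradient, the step only pulls g back toward the anchor.
g_now = [1.0, 2.0]
g_anchor = [1.0, 1.0]
w_uniform = uniform_weights(2)
penalty = anchor_penalty(g_now, g_anchor, w_uniform, strength=1.0)
g_next = anchored_step(g_now, [0.0, 0.0], g_anchor, w_uniform, 1.0, lr=0.1)
w_grad = normalised_gradient_weights([0.1, 0.3])
```

A larger `strength` keeps the conductances closer to the first-task solution, which is the same mechanism that limits how far the loss on the new task can fall.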
Where Pith is reading between the lines
- The link between forgetting and high-current edge reconfiguration could be tested by deliberately protecting those edges in future physical prototypes to improve retention.
- If topology alters the forgetting-adaptation trade-off, then choosing or evolving network structure becomes a design parameter for building physical continual learners.
- The resistor-network testbed offers a low-dimensional way to compare forgetting across equilibrium-based physical systems such as fluidic or optical networks that obey similar conservation laws.
- Task reversal as the worst-case conflict suggests that output ordering, rather than input statistics alone, may be the dominant driver of interference in any equilibrium-constrained learner.
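One way to put a number on the output-ordering idea in the last point is to count how many pairs of outputs are ranked oppositely by the two tasks. The score below is a hypothetical illustration, not the conflict metric used in the paper, which this summary does not define.

```python
def ordering_conflict(targets_a, targets_b):
    """Fraction of output pairs that the two tasks order oppositely.

    Returns 0.0 when both tasks rank all outputs the same way and 1.0 when
    the second task fully reverses the first. Tied pairs are ignored.
    """
    discordant = 0
    total = 0
    n = len(targets_a)
    for i in range(n):
        for j in range(i + 1, n):
            sa = targets_a[i] - targets_a[j]
            sb = targets_b[i] - targets_b[j]
            if sa == 0 or sb == 0:
                continue
            total += 1
            if sa * sb < 0:
                discordant += 1
    return discordant / total if total else 0.0


same_order = ordering_conflict([0.2, 0.5, 0.8], [0.1, 0.4, 0.9])
reversed_order = ordering_conflict([0.2, 0.5, 0.8], [0.8, 0.5, 0.2])
partial = ordering_conflict([0.2, 0.5, 0.8], [0.5, 0.2, 0.8])
```

Plotting forgetting against such a score across random task pairs would show whether ordering alone tracks interference, as the point above suggests.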
Load-bearing premise
That the simulated resistor networks with gradient-based conductance updates sufficiently capture the mechanisms of learning and forgetting to yield generalizable insights about continual learning in physical systems.
What would settle it
Running the same sequential training protocol on a physical analog resistor network and finding that conductance changes do not concentrate on high-current edges or that task reversal does not produce the largest forgetting would falsify the reported mechanism.
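The first half of this test needs a measure of how strongly conductance changes concentrate on high-current edges. A minimal sketch follows, assuming edge currents are measured at the first task's equilibrium and that "high-current" means the top `top_k` edges by current magnitude. Both choices, and all names, are assumptions for illustration.

```python
def edge_currents(edges, g, v):
    """Magnitude of the current on each edge, |g_e * (v_i - v_j)|."""
    return [abs(ge * (v[i] - v[j])) for (i, j), ge in zip(edges, g)]


def change_share_on_top_current_edges(currents, g_before, g_after, top_k):
    """Share of total |delta g| that falls on the top_k highest-current edges."""
    deltas = [abs(a - b) for a, b in zip(g_after, g_before)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    ranked = sorted(range(len(currents)), key=lambda e: currents[e], reverse=True)
    return sum(deltas[e] for e in ranked[:top_k]) / total


# Hypothetical measurements on a five-edge network.
currents = [0.5, 0.1, 0.3, 0.05, 0.02]
g_before = [1.0, 1.0, 1.0, 1.0, 1.0]
g_after = [1.4, 1.0, 1.1, 1.0, 1.0]
share = change_share_on_top_current_edges(currents, g_before, g_after, top_k=1)
single = edge_currents([(0, 1)], [2.0], [1.0, 0.25])
```

A share near `top_k / len(currents)` would indicate no concentration, while a share near one would be consistent with the reported mechanism.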
Original abstract
Differentiable physical networks provide a simple setting in which learning can be studied through the interaction between trainable parameters and physical equilibrium constraints. We investigate sequential learning in differentiable resistor networks governed by Kirchhoff's laws. Although individual input–output mappings can be learned by gradient-based adjustment of edge conductances, sequential training on conflicting tasks produces catastrophic forgetting. We show that forgetting is controlled by task conflict and by the degree of adaptation to the new task. Uniform anchoring and normalised gradient-weighted anchoring reduce forgetting only by increasing the final loss on the new task, giving a clear forgetting–adaptation trade-off. We also show that forgetting is associated with localised conductance changes on high-current edges, giving a physical interpretation as reconfiguration of dominant transport pathways. Broader random-task ensembles show that the strongest forgetting occurs when the second task reverses the output ordering imposed by the first task. Finally, comparisons across Erdős–Rényi, small-world, scale-free, and random-geometric graph ensembles show that topology changes the forgetting–adaptation balance. These results position differentiable resistor networks as compact, physically interpretable testbeds for studying continual learning in tunable matter.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies sequential learning in differentiable resistor networks governed by Kirchhoff's laws. It shows that gradient-based adjustment of edge conductances can learn individual input-output mappings, but sequential training on conflicting tasks produces catastrophic forgetting. Forgetting is shown to depend on task conflict and the degree of adaptation to the new task; uniform and normalised gradient-weighted anchoring reduce forgetting only at the cost of higher final loss on the new task. Forgetting correlates with localised conductance changes on high-current edges. Strongest forgetting occurs when the second task reverses the output ordering of the first; comparisons across Erdős–Rényi, small-world, scale-free and random-geometric graphs show that topology modulates the forgetting–adaptation trade-off. The work positions these networks as compact, physically interpretable testbeds for continual learning.
Significance. If the simulation results hold, the manuscript supplies a physically grounded, low-dimensional model in which continual-learning phenomena can be studied through explicit equilibrium constraints and measurable transport pathways. Credit is due for the use of well-defined task ensembles, explicit conflict metrics, topology comparisons, and the demonstration of a clear forgetting–adaptation trade-off under anchoring. These elements make the networks a potentially useful testbed for exploring mechanisms in physical or neuromorphic continual learning.
minor comments (3)
- [Abstract] The phrases 'task conflict' and 'normalised gradient-weighted anchoring' are used without definition; a brief parenthetical gloss of each would aid readers who have not yet reached the methods section.
- [Introduction or Methods] The manuscript would benefit from an explicit statement, early in the text, of the precise form of the Kirchhoff-law equilibrium equations that are differentiated for gradient computation.
- [Figures] Figure captions and axis labels should consistently indicate whether error bars represent standard deviation across graph realizations or across task pairs.
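As an illustration of the kind of statement the second comment asks for, a standard formulation for a linear resistor network with voltage-clamped boundary nodes is given below. It is a textbook form written under assumptions (clamped inputs, squared-error loss on output nodes), not a transcription of the paper's equations. The symbols $g_e$, $v_i$, $t_o$, $\lambda_i$ and the index sets $F$ (free nodes), $B$ (clamped nodes), and $O$ (output nodes) are introduced here for illustration only.

```latex
% Kirchhoff's current law at every free node i, with clamped boundary voltages
\sum_{j \in \mathcal{N}(i)} g_{ij}\,(v_i - v_j) = 0 \quad (i \in F),
\qquad v_b = \bar{v}_b \quad (b \in B).

% Equivalent matrix form using the weighted Laplacian
% L(g) = \sum_e g_e\, b_e b_e^{\top}, where b_e is the incidence vector of edge e
L_{FF}\, v_F = -\,L_{FB}\, \bar{v}_B .

% Squared-error loss on the output nodes
\mathcal{L} = \tfrac{1}{2} \sum_{o \in O} (v_o - t_o)^2 .

% Adjoint system and resulting gradient for edge e = (i, j)
L_{FF}\, \lambda_F = \frac{\partial \mathcal{L}}{\partial v_F},
\qquad \lambda_B = 0,
\qquad
\frac{\partial \mathcal{L}}{\partial g_e} = -\,(\lambda_i - \lambda_j)\,(v_i - v_j).
```

The last identity follows from differentiating $L_{FF} v_F = -L_{FB}\bar{v}_B$ with respect to $g_e$ and using the symmetry of $L_{FF}$. It shows that an edge receives a large update only when both its voltage drop and its adjoint voltage drop are large.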
Simulated Author's Rebuttal
We thank the referee for their accurate summary of our work, the positive assessment of its significance as a testbed for continual learning, and the recommendation for minor revision. The report correctly identifies the core findings on catastrophic forgetting, the role of task conflict and adaptation, the forgetting-adaptation trade-off under anchoring, the association with high-current edges, the effect of output-order reversal, and the topology dependence across graph ensembles.
Circularity Check
No significant circularity identified
The paper reports outcomes from explicit forward simulations of Kirchhoff-governed resistor networks with gradient-based conductance updates on defined task sequences. All central claims (catastrophic forgetting under conflict, anchoring trade-offs, localization to high-current edges, and topology-dependent balances) are presented as direct results of these computational experiments rather than as derivations that reduce to their own inputs by definition, fitted-parameter renaming, or self-citation chains. No load-bearing step equates a prediction to a prior fit or invokes an unverified uniqueness theorem; the work remains self-contained against the stated simulation benchmarks.