arxiv: 2605.19403 · v1 · pith:EHCBDALEnew · submitted 2026-05-19 · 💻 cs.LG

TIDE: Asymmetric Neural Circuits for Stabilized Temporal Inhibitory-Excitatory Dynamics

Pith reviewed 2026-05-20 07:42 UTC · model grok-4.3

classification 💻 cs.LG
keywords excitatory-inhibitory networks · neural dynamics · Wilson-Cowan model · image classification · network stability · Dale's principle · asymmetric circuits · temporal inhibition

The pith

The TIDE paper argues that asymmetric excitatory-inhibitory networks stabilize neural dynamics while training in under half the time of the Continuous Thought Machine (CTM) and raising top-1 accuracy on perturbed ImageNet tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TIDE to model internal neural dynamics with asymmetric excitatory-inhibitory (E-I) networks that incorporate Wilson-Cowan dynamics and lateral inhibition. The networks are expressed as energy-based systems optimized through a game-theoretic loss, and Dale's principle is enforced to maintain an 80:20 E-I ratio for biological realism. The goal is to add the stability guarantees that earlier Continuous Thought Machine (CTM) models lacked, along with proofs of convergence and complexity bounds. If the approach holds, it would let neuro-inspired architectures achieve both theoretical stability and practical gains in efficiency and robustness under input perturbations.
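For orientation, the classical two-population Wilson-Cowan rate model is commonly written as below. This is the simplified textbook form with refractory terms omitted, not necessarily the parameterization used in TIDE; the symbols $w_{ab}$, $\tau_a$, $P$, $Q$, and the logistic nonlinearity $S$ are notation assumed here for illustration.

```latex
\begin{aligned}
\tau_E \frac{dE}{dt} &= -E + S\big(w_{EE} E - w_{EI} I + P\big),\\
\tau_I \frac{dI}{dt} &= -I + S\big(w_{IE} E - w_{II} I + Q\big),
\qquad S(x) = \frac{1}{1 + e^{-x}},
\end{aligned}
```

Here $E$ and $I$ are the excitatory and inhibitory population activities, and all $w_{ab} \ge 0$, so the sign of every connection is fixed by its presynaptic population.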

Core claim

TIDE is a neuro-inspired architecture that computes internal representations through neural dynamics stabilized by asymmetric Excitatory-Inhibitory (E-I) networks, Wilson-Cowan dynamics, and lateral inhibition. It pursues biological realism by using Hierarchical Receptive Fields and by enforcing Dale's principle, which yields a realistic 80:20 E-I ratio, while remaining end-to-end trainable. The paper presents proofs of convergence, stability, and complexity bounds, and reports that TIDE surpasses the Continuous Thought Machine (CTM) with under 50% of the training time while improving top-1 accuracy by an average of +1.65% on ImageNet under various perturbations.

What carries the argument

Asymmetric excitatory-inhibitory networks that embed Wilson-Cowan dynamics and lateral inhibition, formulated as energy-based systems optimized via game-theoretic loss and constrained by Dale's principle to enforce 80:20 E-I balance.
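Dale's principle with an 80:20 split can be sketched as a sign constraint on a recurrent weight matrix. The snippet below is a minimal illustration, assuming one common parameterization (absolute value of unconstrained parameters times a fixed sign per presynaptic unit); the function name `dale_weights` and the layout "column j = outgoing weights of unit j" are hypothetical choices, not the paper's confirmed implementation.

```python
import numpy as np

def dale_weights(raw, n_exc):
    """Return a weight matrix that obeys Dale's principle.

    Column j holds the outgoing weights of presynaptic unit j.
    The first n_exc units are excitatory (non-negative outgoing weights);
    the remaining units are inhibitory (non-positive outgoing weights).
    """
    n = raw.shape[0]
    signs = np.ones(n)
    signs[n_exc:] = -1.0          # inhibitory units get a negative sign
    return np.abs(raw) * signs    # broadcasting scales column j by signs[j]

rng = np.random.default_rng(0)
n_units = 100
n_exc = (4 * n_units) // 5        # 80:20 E-I split (assumed fixed, not learned)
W = dale_weights(rng.normal(size=(n_units, n_units)), n_exc)
```

Because the signs are fixed and only the magnitudes come from `raw`, a gradient step on `raw` cannot flip a unit from excitatory to inhibitory.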

If this is right

  • TIDE supplies provable convergence and stability for the modeled neural dynamics.
  • The architecture maintains a biologically realistic 80:20 E-I ratio through Dale's principle.
  • Training requires under 50% of the time needed by the Continuous Thought Machine.
  • Top-1 accuracy rises by an average of +1.65% on ImageNet under perturbations.
  • Complexity bounds are established for the stabilized dynamics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The stability mechanism could transfer to other recurrent architectures that currently lack convergence guarantees.
  • Enforcing Dale's principle may make internal representations more interpretable by aligning them with known biological constraints.
  • The game-theoretic loss offers a template for designing new objectives that directly penalize unstable dynamics in energy-based models.

Load-bearing premise

Embedding Wilson-Cowan dynamics plus lateral inhibition into asymmetric E-I networks with enforced Dale's principle will produce both provable stability and the reported empirical gains without additional post-hoc tuning.

What would settle it

Run an ablation that removes the lateral inhibition term, then check whether the claimed convergence and stability proofs still hold and whether the +1.65% accuracy lift on perturbed ImageNet vanishes.

Figures

Figures reproduced from arXiv: 2605.19403 by Alexander Kyuroson, Denis Kleyko, Marcus Liwicki.

Figure 1: Schematic architectural comparison between CTM and TIDE. TIDE’s architectural compo…
Figure 2: Temporal evolution of mean attention as saliency per computation step for TIDE and CTM.
Figure 3: Robustness analysis using ImageNet-C [48] for corrupted images with perturbations. Left panel presents results for TIDE, center panel corresponds to CTM, right panel reports differences.

Ablation studies: MNIST and Fashion-MNIST are used to analyze the effects of various hyperparameter choices on learning outcomes and the stability of TIDE. All models are trained for 50 K steps, with an identical simple …

Original abstract

Recent Continuous Thought Machine architecture decouples internal computation from external inputs via neural dynamics, but relies on multi-layer perceptrons without stability guarantees. We propose to model neural dynamics using asymmetric Excitatory-Inhibitory (E-I) networks, which can be stabilized via principles from network theory and can be expressed as energy-based systems optimized through a game-theoretic loss. Building on this perspective, we introduce Temporal Inhibitory-Excitatory Dynamic Engine (TIDE), a neuro-inspired architecture that computes internal representations through neural dynamics stabilized by incorporating the Wilson-Cowan dynamics and lateral inhibition. TIDE balances biological realism by, for instance, using Hierarchical Receptive Fields and enforcing Dale's principle to ensure a realistic $80:20$ E-I balance ratio with an end-to-end trainable architecture. The aim of this paper is to introduce a new architecture that brings neuro-inspired learning to the forefront. We present proofs of convergence, stability, and complexity bounds, along with empirical ablation studies. Overall, TIDE surpasses CTM with under $50\%$ of the training time and improves $\texttt{top-1}$ accuracy by an average of $+1.65\%$ on ImageNet under various perturbations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Temporal Inhibitory-Excitatory Dynamic Engine (TIDE), an architecture that incorporates asymmetric excitatory-inhibitory (E-I) networks, Wilson-Cowan dynamics, and lateral inhibition to stabilize internal representations in continuous-time neural computation. Building on the Continuous Thought Machine (CTM), TIDE enforces Dale's principle with an 80:20 E-I ratio, claims to provide proofs of convergence, stability, and complexity bounds via network theory and a game-theoretic loss, and reports empirical results showing an average +1.65% top-1 accuracy improvement and under 50% training time versus CTM on perturbed ImageNet.

Significance. If the stability guarantees transfer from continuous Wilson-Cowan dynamics to the discrete, trained implementation and the reported efficiency/accuracy gains prove robust, the work could meaningfully advance neuro-inspired architectures that prioritize biological constraints like E-I balance for more stable and efficient dynamic neural networks.
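Why the continuous-to-discrete step matters can be seen on a toy example. The sketch below integrates a two-population Wilson-Cowan-style model with forward Euler; the integrator, the function name `euler_wilson_cowan`, and all parameter values are assumptions chosen for illustration, since this summary does not state which discretization TIDE uses.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def euler_wilson_cowan(dt, steps, tau=1.0,
                       w_ee=1.5, w_ei=1.0, w_ie=1.0, w_ii=0.5,
                       p=0.2, q=0.0):
    """Forward-Euler trajectory of a toy two-population rate model.

    Returns an array of shape (steps, 2) holding (E, I) after each step.
    All weights and inputs are illustrative placeholders.
    """
    e, i = 0.1, 0.1
    traj = np.empty((steps, 2))
    for k in range(steps):
        de = (-e + sigmoid(w_ee * e - w_ei * i + p)) / tau
        di = (-i + sigmoid(w_ie * e - w_ii * i + q)) / tau
        e, i = e + dt * de, i + dt * di   # simultaneous Euler update
        traj[k] = (e, i)
    return traj

fine = euler_wilson_cowan(dt=0.1, steps=500)   # dt <= tau
coarse = euler_wilson_cowan(dt=3.0, steps=3)   # dt > 2 * tau

print("fine step stays in [0, 1]:", fine.min() >= 0.0 and fine.max() <= 1.0)
print("coarse step leaves [0, 1]:", coarse.max() > 1.0)
```

With `dt <= tau`, each update is a convex combination of the previous state and a sigmoid output, so activities that start in [0, 1] stay there. A larger step loses that guarantee even though the underlying continuous system is unchanged, which is the kind of gap a discretization analysis would need to close.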

major comments (2)
  1. [Abstract and §3] Abstract and §3 (Theoretical Analysis): The manuscript asserts proofs of convergence, stability, and complexity bounds based on continuous-time Wilson-Cowan dynamics and network-theory principles, yet provides no derivation steps, discretization analysis, or verification that the end-to-end trained discrete implementation preserves these properties. This is load-bearing for the central claim that TIDE achieves provable stability without post-hoc tuning.
  2. [§5] §5 (Experiments): The key empirical claims (+1.65% top-1 accuracy and <50% training time on ImageNet under perturbations) are stated without baseline implementation details for CTM, explicit perturbation definitions, or statistical significance measures (e.g., standard error over runs). This directly affects verifiability of the practical gains that rest on the chosen E-I ratio and Wilson-Cowan parameters.
minor comments (2)
  1. [§2] The 80:20 E-I balance ratio is referenced as an example of biological realism but should be explicitly tied to the loss function and architecture equations in the main text for clarity.
  2. [Figures] Figure captions and architecture diagrams would benefit from explicit annotation of lateral inhibition pathways and how they interact with the asymmetric E-I connections during training.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major comment point by point below and indicate the revisions we will incorporate to strengthen the manuscript.

point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (Theoretical Analysis): The manuscript asserts proofs of convergence, stability, and complexity bounds based on continuous-time Wilson-Cowan dynamics and network-theory principles, yet provides no derivation steps, discretization analysis, or verification that the end-to-end trained discrete implementation preserves these properties. This is load-bearing for the central claim that TIDE achieves provable stability without post-hoc tuning.

    Authors: We appreciate the referee's emphasis on this foundational aspect. Section 3 presents the theoretical analysis based on Wilson-Cowan dynamics, network theory, and a game-theoretic loss, including outlines of the convergence and stability arguments. However, we acknowledge that the current presentation would be strengthened by including more explicit derivation steps, a dedicated discretization analysis, and verification that the stability properties carry over to the discrete trained model. We will revise §3 accordingly in the next version of the manuscript.

    revision: yes

  2. Referee: [§5] §5 (Experiments): The key empirical claims (+1.65% top-1 accuracy and <50% training time on ImageNet under perturbations) are stated without baseline implementation details for CTM, explicit perturbation definitions, or statistical significance measures (e.g., standard error over runs). This directly affects verifiability of the practical gains that rest on the chosen E-I ratio and Wilson-Cowan parameters.

    Authors: We agree that these details are important for reproducibility and assessment of the results. In the revised manuscript we will add full implementation details for the CTM baseline, explicit definitions of the perturbations applied to ImageNet, and statistical significance measures including standard errors computed over multiple runs. These changes will improve the verifiability of the reported accuracy and training-time improvements.

    revision: yes
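The "standard errors computed over multiple runs" requested above amount to a small calculation. The sketch below is one plausible way to report it; the helper name `mean_and_sem` is hypothetical, and the input values are synthetic placeholders, not results from the paper.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for per-run metrics."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Synthetic placeholder values standing in for per-seed accuracy differences.
placeholder_deltas = [1.0, 2.0, 3.0]
mean, sem = mean_and_sem(placeholder_deltas)
print(f"mean = {mean:.2f}, standard error = {sem:.2f}")
```

Reporting the mean together with its standard error lets a reader judge whether an average gain is distinguishable from run-to-run variation.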

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained with independent theoretical and empirical content.

full rationale

The paper defines TIDE by incorporating Wilson-Cowan dynamics and lateral inhibition into asymmetric E-I networks with Dale's principle, then separately presents proofs of convergence/stability/complexity and reports empirical results on ImageNet. No quoted step shows a prediction or first-principles result reducing by construction to a fitted hyperparameter, self-citation chain, or renamed input. The stability claims rest on network-theory principles and game-theoretic loss applied to the defined architecture rather than tautological re-expression of the inputs. Empirical gains (+1.65% accuracy, <50% training time) are presented as measured outcomes distinct from the model definition. This is the expected non-finding for a paper whose central claims retain independent content from its assumptions and experiments.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The architecture rests on biological modeling choices treated as fixed rather than derived; the 80:20 ratio and Wilson-Cowan equations are imported without new justification inside the paper.

free parameters (1)
  • 80:20 E-I balance ratio
    Explicitly enforced to match biological observation; treated as a hard constraint rather than learned.
axioms (2)
  • domain assumption: Wilson-Cowan dynamics stabilize asymmetric E-I networks when combined with lateral inhibition
    Invoked to guarantee convergence and stability of the temporal dynamics.
  • domain assumption: Dale's principle holds and produces a realistic 80:20 E-I ratio
    Used to constrain neuron types and connection signs throughout the network.
invented entities (1)
  • TIDE architecture (no independent evidence)
    purpose: Computes internal representations via stabilized neural dynamics
    New proposed model that integrates the listed biological constraints.

pith-pipeline@v0.9.0 · 5747 in / 1518 out tokens · 47780 ms · 2026-05-20T07:42:43.169875+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
    Relation between the paper passage and the cited Recognition theorem: unclear

    We propose to model neural dynamics using asymmetric Excitatory-Inhibitory (E-I) networks, which can be stabilized via principles from network theory and can be expressed as energy-based systems optimized through a game-theoretic loss. ... incorporating the Wilson-Cowan dynamics and lateral inhibition.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 3 internal anchors

  [1] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  [2] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90, 2017.
  [3] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
  [4] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), volume 30, pages 5998–6008, 2017.
  [5] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR), 2021.
  [6] ChiYan Lee, Hideyuki Hasegawa, and Shangce Gao. Complex-valued neural networks: A comprehensive survey. IEEE/CAA Journal of Automatica Sinica, 9(8):1406–1426, 2022.
  [7] Timothy P. Lillicrap, Daniel Cownden, Douglas B. Tweed, and Colin J. Akerman. Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7:13276, 2016.
  [8] Benjamin Scellier and Yoshua Bengio. Equilibrium propagation: Bridging the gap between energy-based models and backpropagation. Frontiers in Computational Neuroscience, 11:24, 2017.
  [9] Alexandre Payeur, Jordan Guerguiev, Friedemann Zenke, Blake A. Richards, and Richard Naud. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nature Neuroscience, 24:1010–1019, 2021.
  [10] Anthony M. Zador. A critique of pure learning and what artificial neural networks can learn from animal brains. Nature Communications, 10(1):3770, 2019.
  [11] Jeffry S. Isaacson and Massimo Scanziani. How inhibition shapes cortical activity. Neuron, 72(2):231–243, 2011.
  [12] Adam Haber and Elad Schneidman. The computational and learning benefits of Daleian neural networks. In Advances in Neural Information Processing Systems (NeurIPS), volume 35, 2022.
  [13] Jonathan Cornford, Damjan Kalajdzievski, Marco Leite, Amélie Lamarquette, Dimitri M. Kullmann, and Blake Richards. Learning to live with Dale’s principle: ANNs with separate excitatory and inhibitory units. In International Conference on Learning Representations (ICLR), pages 1–27, 2021.
  [14] Bilal Haider, Alvaro Duque, Andrea R. Hasenstaub, and David A. McCormick. Neocortical network activity in vivo is generated through a dynamic balance of excitation and inhibition. The Journal of Neuroscience, 26(17):4535–4545, 2006.
  [15] Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, and Llion Jones. Continuous thought machines. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
  [16] Henry Dale. Pharmacology and nerve-endings. Proceedings of the Royal Society of Medicine, 28(3):319–332, 1935.
  [17] John C. Eccles, Paul Fatt, and Kyozo Koketsu. Cholinergic and inhibitory synapses in a pathway from motor-axon collaterals to motoneurones. The Journal of Physiology, 126(3):524–562, 1954.
  [18] Simone Betteti, William Retnaraj, Alexander Davydov, Jorge Cortés, and Francesco Bullo. Competition, stability, and functionality in excitatory-inhibitory neural circuits. arXiv:2512.05252, 2025.
  [19] Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Furong Huang, Uzi Vishkin, Micah Goldblum, and Tom Goldstein. Can you learn an algorithm? Generalizing from easy to hard problems with recurrent networks. In Advances in Neural Information Processing Systems (NeurIPS), pages 6695–6706, 2021.
  [20] Shaojie Bai, J. Zico Kolter, and Vladlen Koltun. Deep equilibrium models. In Advances in Neural Information Processing Systems (NeurIPS), pages 688–699, 2019.
  [21] Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
  [22] Alex Graves. Adaptive computation time for recurrent neural networks. arXiv preprint arXiv:1603.08983, 2016.
  [23] Andrea Banino, Jan Balaguer, and Charles Blundell. PonderNet: Learning to ponder. In ICML Workshop on Automated Machine Learning, 2021.
  [24] Hugh R. Wilson and Jack D. Cowan. Excitatory and inhibitory interactions in localized populations of model neurons. Biophysical Journal, 12(1):1–24, 1972.
  [25] Hugh R. Wilson and Jack D. Cowan. A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue. Kybernetik, 13(2):55–80, 1973.
  [26] Carl van Vreeswijk and Haim Sompolinsky. Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science, 274(5293):1724–1726, 1996.
  [27] Nicolas Brunel. Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. Journal of Computational Neuroscience, 8(3):183–208, 2000.
  [28] Tim P. Vogels, Henning Sprekeler, Friedemann Zenke, Claudia Clopath, and Wulfram Gerstner. Inhibitory plasticity balances excitation and inhibition in sensory pathways and memory networks. Science, 334(6062):1569–1573, 2011.
  [29] Gina G. Turrigiano. The self-tuning neuron: Synaptic scaling of excitatory synapses. Cell, 135(3):422–435, 2008.
  [30] Kenneth D. Harris and Thomas D. Mrsic-Flogel. Cortical connectivity and sensory coding. Nature, 503(7474):51–58, 2013.
  [31] J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8):2554–2558, 1982.
  [32] Atlas Kazemian, Eric Elmoznino, and Michael F. Bonner. Convolutional architectures are cortex-aligned de novo. Nature Machine Intelligence, 7:1834–1844, 2025.
  [33] Maximilian Riesenhuber and Tomaso Poggio. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11):1019–1025, 1999.
  [34] Daniel L. K. Yamins, Ha Hong, Charles F. Cadieu, Ethan A. Solomon, Darren Seibert, and James J. DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111(23):8619–8624, 2014.
  [35] Martin Schrimpf, Jonas Kubilius, Michael J. Lee, N. Apurva Ratan Murty, Robert Ajemian, and James J. DiCarlo. Integrative benchmarking to advance neurally mechanistic models of human intelligence. Neuron, 108(3):413–423, 2020.
  [36] Ali Behrouz, Peilin Zhong, and Vahab Mirrokni. Titans: Learning to memorize at test time. In Advances in Neural Information Processing Systems (NeurIPS), pages 1–38, 2025.
  [37] Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, and Vahab Mirrokni. It’s all connected: A journey through test-time memorization, attentional bias, retention, and online optimization. arXiv:2504.13173, 2025.
  [38] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv:2312.00752, 2023.
  [39] Maximilian Beck, Korbinian Pöppel, Markus Spanring, Andreas Auer, Oleksandra Prudnikova, Michael Kopp, Günter Klambauer, Johannes Brandstetter, and Sepp Hochreiter. xLSTM: Extended long short-term memory. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
  [40] Biao Zhang and Rico Sennrich. Root mean square layer normalization. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
  [41] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2019.
  [42] Ilya Loshchilov and Frank Hutter. SGDR: Stochastic gradient descent with warm restarts. In International Conference on Learning Representations (ICLR), 2017.
  [43] Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory F. Diamos, Erich Elsen, David García, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, and Hao Wu. Mixed precision training. In International Conference on Learning Representations (ICLR), 2018.
  [44] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
  [45] David Marr and Ellen Hildreth. Theory of edge detection. Proceedings of the Royal Society of London. Series B, Biological Sciences, 207(1167):187–217, 1980.
  [46] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747, 2017.
  [47] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
  [48] Dan Hendrycks and Thomas Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. In International Conference on Learning Representations (ICLR), 2019.
  [49] Ya Le and Xuan Yang. Tiny ImageNet visual recognition challenge. Technical report, Stanford University, 2015.
  [50] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, and Justin Gilmer. The many faces of robustness: A critical analysis of out-of-distribution generalization. In IEEE/CVF International Conference on Computer Vision (ICCV), pages 8340–8349, 2021.
  [51] Andrzej Granas and James Dugundji. Fixed Point Theory. Springer Monographs in Mathematics. Springer, 2003.
  [52] Carl van Vreeswijk and Haim Sompolinsky. Chaotic balanced state in a model of cortical circuits. Neural Computation, 10(6):1321–1371, 1998.
  [53] Yashar Ahmadian and Kenneth D. Miller. What is the dynamical regime of cerebral cortex? Neuron, 109(21):3373–3391, 2021.
  [54] Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T. Chiu, Alexander Rush, and Volodymyr Kuleshov. Simple and effective masked diffusion language models. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
  [55] Hassan K. Khalil. Nonlinear Systems. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 2002.
  [56] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, New York, NY, USA, 2nd edition, 2013.
  [57] Nicolas Brunel and Vincent Hakim. Fast global oscillations in networks of integrate-and-fire neurons with low firing rates. Neural Computation, 11(7):1621–1671, 1999.
  [58] Stephen W. Kuffler. Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16(1):37–68, 1953.
  [59] Dennis M. Dacey, Beth B. Peterson, Farrel R. Robinson, and Paul D. Gamlin. Fireworks in the primate retina: In vitro photodynamics reveals diverse LGN-projecting ganglion cell types. Neuron, 37(1):15–27, 2003.
  [60] David H. Hubel and Torsten N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106–154, 1962.
  [61] Tony Lindeberg. Scale-Space Theory in Computer Vision. The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers, Boston, MA, 1994.