Recognition: 1 theorem link
Multiple mechanisms of rhythm switching in recurrent neural networks with adaptive time constants
Pith reviewed 2026-05-15 02:03 UTC · model grok-4.3
The pith
RNNs switch between four rhythms using multiple coexisting mechanisms that differ across networks
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Rhythm switching was supported by multiple coexisting mechanisms: turnover of the active subpopulation, network-wide baseline shifts that reposition the operating point near distinct unstable fixed points, and inter-neuronal phase reorganization that selectively cancels or supports band components in the population output. The mechanism deployed for each mode pair varied across training runs, exposing a degeneracy of learned solutions.
What carries the argument
Neuron-specific learnable time constants in leaky-integrator units that control participation in different frequency bands and enable the observed switching mechanisms.
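The leaky-integrator dynamics named here can be sketched in a few lines. The Euler discretization, network size, and tau range below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def leaky_integrator_step(x, W, inp, tau, dt=1.0):
    """One Euler step of leaky-integrator dynamics.

    dx/dt = (-x + W @ tanh(x) + inp) / tau, with a per-neuron tau,
    so short-tau neurons can track faster (higher-frequency) components.
    """
    r = np.tanh(x)                      # firing rates
    dx = (-x + W @ r + inp) / tau       # per-neuron leak rate 1/tau
    return x + dt * dx

rng = np.random.default_rng(0)
N = 64
W = rng.normal(0, 1.0 / np.sqrt(N), (N, N))
tau = rng.uniform(2.0, 20.0, N)         # neuron-specific time constants
x = rng.normal(0, 0.1, N)
for _ in range(100):
    x = leaky_integrator_step(x, W, np.zeros(N), tau)
```

In training, `tau` would be a learnable parameter alongside `W`, which is what distinguishes this model class from fixed-timescale RNNs.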
If this is right
- High-frequency rhythms rely on small subpopulations of short-time-constant neurons while low-frequency rhythms recruit distributed participation.
- The strength of the negative correlation between a neuron’s time constant and its contribution to a given band increases with frequency.
- Switching can be achieved by replacing the active neuron set, by shifting the whole network’s operating point, or by reorganizing phases among neurons.
- Different networks can solve the identical switching task with entirely different internal mechanisms.
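The band-participation claims above could be probed with a per-neuron spectral amplitude measure. The FFT-based definition and the conventional EEG band edges below are assumptions; the paper's exact analysis is not given in this excerpt:

```python
import numpy as np

def band_amplitude(traces, fs, band):
    """Mean spectral amplitude of each neuron's trace within a band.

    traces: (N, T) array of neuron activity; band: (lo, hi) in Hz.
    """
    freqs = np.fft.rfftfreq(traces.shape[1], d=1.0 / fs)
    spec = np.abs(np.fft.rfft(traces, axis=1))
    mask = (freqs >= band[0]) & (freqs < band[1])
    return spec[:, mask].mean(axis=1)

# Conventional band edges in Hz; the paper's exact edges may differ.
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 80)}

rng = np.random.default_rng(1)
fs, T, N = 200.0, 2000, 32
t = np.arange(T) / fs
traces = np.sin(2 * np.pi * rng.uniform(4, 80, (N, 1)) * t)  # toy traces
amps = {name: band_amplitude(traces, fs, b) for name, b in bands.items()}
```

Correlating each band's `amps` against the learned `tau` vector is then enough to test whether the negative correlation strengthens with frequency.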
Where Pith is reading between the lines
- Degeneracy implies that biological rhythm-switching circuits may also realize the same function through varied cellular or synaptic implementations.
- The time-constant–frequency correlation offers a testable signature for identifying rhythm-specialized neurons in recordings.
- The same training procedure could be applied to networks with more realistic connectivity or synaptic dynamics to check whether the same three mechanisms persist.
Load-bearing premise
That the mechanisms found in these simplified artificial networks on a four-band switching task can be used to interpret how real neural circuits differentiate frequency bands.
What would settle it
Observation that every independently trained network uses exactly the same mechanism for the same mode pair, or experimental removal of short-time-constant neurons that eliminates only high-frequency rhythms while leaving low-frequency rhythms intact.
Original abstract
Although recurrent neural networks (RNNs) trained on cognitive tasks have become a widely used framework for studying neural computation, the internal mechanisms by which RNNs switch between rhythms across multiple frequency bands, and how these mechanisms relate to neuronal time constants, have not been systematically analyzed. We trained leaky integrator RNNs with neuron-specific learnable time constants on a four-band (theta, alpha, beta, gamma) rhythm-switching task and analyzed 20 independently trained networks. Whereas low-frequency rhythms were produced by distributed participation of many neurons, high-frequency rhythms were dominated by a small subpopulation of short-time-constant neurons, and the negative correlation between time constant and matched-mode amplitude strengthened monotonically with frequency. Rhythm switching was supported by multiple coexisting mechanisms: turnover of the active subpopulation, network-wide baseline shifts that reposition the operating point near distinct unstable fixed points, and inter-neuronal phase reorganization that selectively cancels or supports band components in the population output. The mechanism deployed for each mode pair varied across training runs, exposing a degeneracy of learned solutions. These findings parallel the coexistence of rhythm-specific and multi-rhythm interneurons reported in biological circuits and provide a candidate framework for interpreting frequency-band-specific functional differentiation in neural systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript trains leaky-integrator RNNs with neuron-specific learnable time constants on a four-band (theta, alpha, beta, gamma) rhythm-switching task. Analysis of 20 independently trained networks reveals that low-frequency rhythms engage distributed neuronal populations while high-frequency rhythms are dominated by short-time-constant subpopulations, accompanied by a monotonically strengthening negative correlation between time constants and matched-mode amplitudes. Rhythm switching is supported by multiple mechanisms—active subpopulation turnover, network-wide baseline shifts near distinct unstable fixed points, and inter-neuronal phase reorganization—with the deployed mechanism varying across training runs, indicating degeneracy of learned solutions. These findings are offered as a candidate framework for frequency-band-specific functional differentiation in biological circuits.
Significance. If substantiated, the work supplies a concrete computational demonstration that adaptive time constants in RNNs can produce frequency-dependent subpopulation specialization and support rhythm switching through coexisting, degenerate mechanisms. The explicit identification of multiple switching strategies and their variability across runs provides a useful reference point for interpreting similar degeneracy in biological rhythm circuits. The use of 20 independent trainings offers a basic check on robustness, though the absence of quantitative metrics limits immediate impact.
Major comments (3)
- [Results] Results section (correlation and participation patterns): The claims of distributed vs. subpopulation dominance and the monotonic strengthening of the negative tau-amplitude correlation are stated without reported correlation coefficients, confidence intervals, p-values, or per-network variability measures across the 20 runs, rendering the strength and consistency of these central descriptive results difficult to evaluate.
- [Results] Analysis of switching mechanisms: The identification of three coexisting mechanisms (subpopulation turnover, baseline shifts near unstable fixed points, phase reorganization) is based on post-training observations, yet no quantitative prevalence statistics (e.g., fraction of mode pairs using each mechanism) or controls (e.g., perturbation experiments) are supplied to establish that these mechanisms are necessary or sufficient for the observed switches.
- [Methods] Methods: No details are provided on task performance metrics (e.g., switching accuracy, spectral power error), training hyperparameters, convergence criteria, or ablation experiments comparing learnable vs. fixed time constants, which are required to isolate the contribution of adaptive taus to the reported mechanisms.
Minor comments (2)
- [Figures] Figure captions and text should explicitly reference the exact frequency bands and provide scale bars or units for time-constant distributions and amplitude measures to improve reproducibility.
- [Abstract and Discussion] The abstract and main text use the term 'degeneracy of learned solutions' without a brief operational definition or reference to prior usage in the RNN literature.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below. Where the manuscript is missing quantitative details or methodological information, we will revise accordingly. We maintain that the core findings on frequency-dependent subpopulation specialization and mechanistic degeneracy are supported by the 20 independent trainings, but we agree that additional reporting will strengthen the presentation.
Point-by-point responses
Referee: [Results] Results section (correlation and participation patterns): The claims of distributed vs. subpopulation dominance and the monotonic strengthening of the negative tau-amplitude correlation are stated without reported correlation coefficients, confidence intervals, p-values, or per-network variability measures across the 20 runs, rendering the strength and consistency of these central descriptive results difficult to evaluate.
Authors: We agree that the Results section would benefit from explicit statistical reporting. In the revised manuscript we will add Pearson correlation coefficients with 95% confidence intervals and p-values for the tau-amplitude relationship at each frequency band, together with the mean and standard deviation of participation indices across the 20 networks. These values will be computed from the same post-training analyses already performed.
Revision: yes
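The promised correlation reporting can be sketched with a standard Fisher z-transform confidence interval. The authors' actual pipeline is unspecified, and the tau-amplitude data below are synthetic stand-ins:

```python
import numpy as np

def pearson_with_ci(x, y):
    """Pearson r with a Fisher z-transform 95% confidence interval."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z = np.arctanh(r)                   # Fisher z-transform
    se = 1.0 / np.sqrt(n - 3)
    zcrit = 1.959964                    # two-sided 95% normal quantile
    return r, (np.tanh(z - zcrit * se), np.tanh(z + zcrit * se))

# Toy check: a strongly negative tau-amplitude relation, n = 20 networks.
rng = np.random.default_rng(2)
tau = rng.uniform(2, 20, 20)
amp = -tau + rng.normal(0, 1.0, 20)
r, (lo, hi) = pearson_with_ci(tau, amp)
```

Reporting `r` with `(lo, hi)` per band, per network, would directly address the referee's point about consistency across the 20 runs.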
Referee: [Results] Analysis of switching mechanisms: The identification of three coexisting mechanisms (subpopulation turnover, baseline shifts near unstable fixed points, phase reorganization) is based on post-training observations, yet no quantitative prevalence statistics (e.g., fraction of mode pairs using each mechanism) or controls (e.g., perturbation experiments) are supplied to establish that these mechanisms are necessary or sufficient for the observed switches.
Authors: We will add prevalence statistics in the revised Results: for each of the 20 networks we will report the fraction of the six mode-pair transitions that are accounted for by each of the three mechanisms (with a small residual category for ambiguous cases). This quantification is directly obtainable from the existing trajectory and fixed-point analyses. Perturbation experiments that would test necessity or sufficiency were not performed; the study was designed as an observational characterization of learned solutions rather than a causal intervention study. We will explicitly state this scope limitation.
Revision: partial
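The prevalence statistics described here reduce to a tally over the six mode-pair transitions per network (C(4, 2) unordered pairs of the four modes). The mechanism labels below are hypothetical stand-ins for the paper's three mechanisms plus a residual category:

```python
from collections import Counter

MECHANISMS = ("turnover", "baseline_shift", "phase_reorg", "ambiguous")

def mechanism_fractions(labels):
    """Fraction of mode-pair transitions assigned to each mechanism.

    labels: one mechanism label per transition in a single network
    (six transitions for a four-mode task).
    """
    counts = Counter(labels)
    total = len(labels)
    return {m: counts.get(m, 0) / total for m in MECHANISMS}

# Toy example for one network's six transitions.
fracs = mechanism_fractions(
    ["turnover", "turnover", "baseline_shift", "phase_reorg",
     "phase_reorg", "ambiguous"])
```

Averaging such fractions over the 20 networks would give exactly the prevalence table the referee asks for.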
Referee: [Methods] Methods: No details are provided on task performance metrics (e.g., switching accuracy, spectral power error), training hyperparameters, convergence criteria, or ablation experiments comparing learnable vs. fixed time constants, which are required to isolate the contribution of adaptive taus to the reported mechanisms.
Authors: We will expand the Methods section to include: (i) quantitative task performance (mean switching accuracy and spectral power error across the 20 networks), (ii) the full set of training hyperparameters and convergence criteria, and (iii) results of the ablation experiments in which time constants were frozen after initial training or initialized as fixed. These data exist in our training logs and will be summarized concisely with reference to supplementary figures.
Revision: yes
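A "spectral power error" metric is not defined in the excerpt; one generic possibility, shown purely as an assumption, is a relative L2 error between output and target power spectra:

```python
import numpy as np

def spectral_power_error(output, target):
    """Relative L2 error between output and target power spectra.

    An illustrative definition; the paper's exact metric is not
    specified in the excerpt.
    """
    po = np.abs(np.fft.rfft(output)) ** 2
    pt = np.abs(np.fft.rfft(target)) ** 2
    return np.linalg.norm(po - pt) / np.linalg.norm(pt)

fs, T = 200.0, 1000
t = np.arange(T) / fs
target = np.sin(2 * np.pi * 10 * t)          # 10 Hz (alpha-range) target
err_same = spectral_power_error(target, target)
err_diff = spectral_power_error(np.sin(2 * np.pi * 40 * t), target)
```

The metric is zero for a perfect match and grows as output power moves into the wrong band, which is the failure mode relevant to a rhythm-switching task.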
Circularity Check
No significant circularity; results are observational from simulations
Full rationale
The paper trains leaky-integrator RNNs with learnable time constants on an artificial four-band rhythm-switching task, then performs post-training analysis on 20 independent networks to observe emergent mechanisms such as subpopulation turnover, baseline shifts, and phase reorganization. These are direct simulation outcomes rather than derivations that reduce by construction to fitted parameters or self-citations. No equations or claims equate a prediction to its own inputs; the negative tau-amplitude correlation and mechanism degeneracy are reported as empirical findings. The biological interpretation is framed as a candidate framework, not a derived equivalence. The analysis chain contains no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- neuron-specific time constants
axioms (1)
- standard math: leaky-integrator neuron dynamics
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tagged unclear
Unclear: the relation between the paper passage and the cited Recognition theorem is too indirect to confirm.
We trained leaky integrator RNNs with neuron-specific learnable time constants on a four-band (theta, alpha, beta, gamma) rhythm-switching task
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Buzsáki G, Draguhn A (2004) Neuronal oscillations in cortical networks. Science 304(5679):1926--1929. doi:10.1126/science.1099745
- [2] Byrd RH, Lu P, Nocedal J, et al (1995) A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16(5):1190--1208. doi:10.1137/0916069
- [3] Doya K, Yoshizawa S (1989) Memorizing oscillatory patterns in the analog neuron network. In: International 1989 Joint Conference on Neural Networks, pp 27--32. doi:10.1109/IJCNN.1989.118555
- [4] Edelman GM, Gally JA (2001) Degeneracy and complexity in biological systems. Proceedings of the National Academy of Sciences 98(24):13763--13768. doi:10.1073/pnas.231499798
- [5] Kingma DP, Ba J (2015) Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA. arXiv:1412.6980
- [6] Klausberger T, Somogyi P (2008) Neuronal diversity and temporal dynamics: The unity of hippocampal circuit operations. Science 321(5885):53--57. doi:10.1126/science.1149381
- [7] Maheswaranathan N, Williams A, Golub M, et al (2019) Universality and individuality in neural dynamics across large populations of recurrent networks. In: Wallach H, Larochelle H, Beygelzimer A, et al (eds) Advances in Neural Information Processing Systems, pp 15629--15641
- [8] Marder E, Taylor AL (2011) Multiple models to capture the variability in biological neurons and networks. Nature Neuroscience 14:133--138. doi:10.1038/nn.2735
- [9] Pals M, Macke JH, Barak O (2024) Trained recurrent neural networks develop phase-locked limit cycles in a working memory task. PLOS Computational Biology 20(2):e1011852. doi:10.1371/journal.pcbi.1011852
- [10] Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. In: Dasgupta S, McAllester D (eds) Proceedings of the 30th International Conference on Machine Learning, PMLR 28, Atlanta, Georgia, USA, pp 1310--1318
- [11] Perez-Nieves N, Leung VCH, Dragotti PL, et al (2021) Neural heterogeneity promotes robust learning. Nature Communications 12(1):5791. doi:10.1038/s41467-021-26022-3
- [12] Quax SC, D'Asaro M, Van Gerven MAJ (2020) Adaptive time scales in recurrent neural networks. Scientific Reports 10:11360. doi:10.1038/s41598-020-68169-x
- [13] Rungratsameetaweemana N, Kim R, Chotibut T, et al (2025) Random noise promotes slow heterogeneous synaptic dynamics important for robust working memory computation. Proceedings of the National Academy of Sciences 122(3):e2316745122. doi:10.1073/pnas.2316745122
- [14] Song HF, Yang GR, Wang XJ (2016) Training excitatory-inhibitory recurrent neural networks for cognitive tasks: a simple and flexible framework. PLoS Computational Biology 12(2):e1004792. doi:10.1371/journal.pcbi.1004792
- [15] Stern M, Istrate N, Mazzucato L (2023) A reservoir of timescales emerges in recurrent circuits with heterogeneous neural assemblies. eLife 12:e86552. doi:10.7554/eLife.86552
- [16] Sussillo D, Abbott LF (2009) Generating coherent patterns of activity from chaotic neural networks. Neuron 63(4):544--557. doi:10.1016/j.neuron.2009.07.018
- [17] Sussillo D, Barak O (2013) Opening the black box: Low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Computation 25(3):626--649. doi:10.1162/NECO_a_00409
- [18] Tomoda Y, Tsuda I, Yamaguti Y (2026) Emergence of functionally differentiated structures via mutual information minimization in recurrent neural networks. Cognitive Neurodynamics 20:5. doi:10.1007/s11571-025-10377-0
- [19] Virtanen P, Gommers R, Oliphant TE, et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods 17(3):261--272. doi:10.1038/s41592-019-0686-2
- [20] Yamaguti Y, Tsuda I (2021) Functional differentiations in evolutionary reservoir computing networks. Chaos: An Interdisciplinary Journal of Nonlinear Science 31(1):013137. doi:10.1063/5.0019116
- [21] Yamashita Y, Tani J (2008) Emergence of functional hierarchy in a multiple timescale neural network model: A humanoid robot experiment. PLoS Computational Biology 4(11):e1000220. doi:10.1371/journal.pcbi.1000220
- [22] Yang GR, Joglekar MR, Song HF, et al (2019) Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience 22(2):297--306. doi:10.1038/s41593-018-0310-2
- [23] Zemlianova K, Bose A, Rinzel J (2024) Dynamical mechanisms of how an RNN keeps a beat, uncovered with a low-dimensional reduced model. Scientific Reports 14:26388. doi:10.1038/s41598-024-77849-x
Discussion (0)