Parallelized Hierarchical Connectome: A Spatiotemporal Recurrent Framework for Spiking State-Space Models

Po-Han Chiang

arxiv: 2604.01295 · v2 · pith:SFDPJ6YDnew · submitted 2026-04-01 · 🧬 q-bio.NC

Pith reviewed 2026-05-21 10:10 UTC · model grok-4.3

classification 🧬 q-bio.NC

keywords spiking state-space models · hierarchical connectome · biological priors · parallel scan training · spatiotemporal recurrence · neuromorphic deployment · Dale's law · STDP

The pith

A hierarchical connectome framework upgrades temporal state-space models into spatiotemporal networks that integrate five biological neuron priors at low parameter cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Parallelized Hierarchical Connectome as a general way to add spatial recurrence to standard state-space models. It maps the SSM diagonal core to a shared neuron layer and connection patterns to a synapse layer, then links them with a multi-transmission loop that performs spatial iterations inside each time step. The resulting structure supports direct addition of adaptive LIF neurons, synaptic delays, short-term plasticity, Dale's law with excitatory-inhibitory asymmetry, and spike-timing-dependent plasticity. When instantiated as PHCSSM, the model reaches test accuracy comparable to larger state-of-the-art SSMs on long sequences while using only 1,312 to 4,891 trainable parameters. The same weights also run as a sequential spiking network that converges to the parallel training mode and deploys on hardware ranging from GPUs to microcontrollers.
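Of the five priors, Dale's law is the simplest to illustrate. One common way to impose it on a trainable weight matrix is to fix a sign per presynaptic neuron and learn only the magnitudes. The sketch below assumes that parametrization; `dale_weights`, `raw`, and `is_excitatory` are hypothetical names, and the paper's own construction (including its E/I-asymmetric topology) may differ.

```python
def dale_weights(raw, is_excitatory):
    """Return W where column j is non-negative if presynaptic neuron j is
    excitatory and non-positive if it is inhibitory.

    raw: unconstrained D x D list of lists (rows = postsynaptic, cols = presynaptic).
    is_excitatory: list of D booleans, one per presynaptic neuron.
    """
    signs = [1.0 if e else -1.0 for e in is_excitatory]
    # Keep only the magnitude of each raw weight; the sign comes from the source neuron.
    return [[abs(w) * signs[j] for j, w in enumerate(row)] for row in raw]


# Illustrative values only: neurons 0 and 1 excitatory, neuron 2 inhibitory.
raw = [[0.5, -0.2, 0.1],
       [-0.3, 0.4, -0.6],
       [0.7, 0.0, 0.2]]
W = dale_weights(raw, [True, True, False])
```

Because the sign lives outside the learned parameters, gradient updates to `raw` can never flip a neuron from excitatory to inhibitory.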

Core claim

The paper claims that mapping the diagonal SSM core to a shared Neuron Layer and inter-neuronal communication to a Synapse Layer, reconnected by a Multi-Transmission Loop iterating spatial recurrence within each temporal window, produces spatiotemporal recurrent networks at Theta(D^2) parameter complexity. This architecture integrates adaptive LIF, synaptic delay, STP, Dale's Law with E/I-asymmetric topology, and STDP while preserving parallel-scan training. The PHCSSM instantiation achieves competitive accuracy on long-sequence tasks with 1,312 to 4,891 parameters and supports direct sequential RSNN deployment that converges to the parallel mode without ANN-to-SNN conversion, with cross-hab

What carries the argument

The Multi-Transmission Loop that iterates spatial recurrence inside each temporal window, connecting the Neuron Layer derived from the SSM diagonal core to the Synapse Layer that carries inter-neuronal communication.
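A toy version of this loop can make the idea concrete. The sketch below is one reading of the description, not the paper's implementation: it assumes each loop pass runs a leaky diagonal recurrence over the whole time window, thresholds the result pointwise, and feeds the spikes back through a dense synapse matrix as extra drive for the next pass. All names (`neuron_layer`, `synapse_layer`, `multi_transmission_loop`, `n_iters`) are hypothetical, and resets, delays, and plasticity are omitted.

```python
def neuron_layer(drive, decay, threshold=1.0):
    """Leaky integration per neuron (a diagonal linear recurrence over time),
    followed by a pointwise threshold. There is no reset in this toy version."""
    D = len(decay)
    v = [0.0] * D
    spikes = []
    for step_drive in drive:
        v = [decay[d] * v[d] + step_drive[d] for d in range(D)]
        spikes.append([1.0 if v[d] > threshold else 0.0 for d in range(D)])
    return spikes


def synapse_layer(spikes, W):
    """Mix spikes across neurons at each time step: current[t] = W @ spikes[t]."""
    return [[sum(W[i][j] * s[j] for j in range(len(s))) for i in range(len(W))]
            for s in spikes]


def multi_transmission_loop(x, decay, W, n_iters):
    """Alternate temporal integration and spatial mixing n_iters times."""
    T, D = len(x), len(decay)
    recurrent = [[0.0] * D for _ in range(T)]
    spikes = None
    for _ in range(n_iters):
        drive = [[x[t][d] + recurrent[t][d] for d in range(D)] for t in range(T)]
        spikes = neuron_layer(drive, decay)
        recurrent = synapse_layer(spikes, W)
    return spikes


# Neuron 0 receives input; neuron 1 is driven only through the synapse W[1][0].
x = [[1.5, 0.0] for _ in range(3)]
decay = [0.5, 0.5]
W = [[0.0, 0.0],
     [2.0, 0.0]]
one_pass = multi_transmission_loop(x, decay, W, n_iters=1)
two_pass = multi_transmission_loop(x, decay, W, n_iters=2)
```

With one pass only neuron 0 fires; with two passes its spikes reach neuron 1 inside the same time window, which is the lateral interaction a purely temporal SSM lacks.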

If this is right

PHCSSM becomes the first spiking SSM to combine all five listed biological priors in one model.
Training remains parallel-scan while parameter count stays at 1,312–4,891 for competitive accuracy.
The same trained weights support direct deployment as a sequential recurrent spiking network.
Cross-backend reproducibility holds across x86 CPU, H100 GPU, Cortex-A76, and Cortex-M4F hardware.
The architecture bridges parallel-scan SSM training with biologically grounded RSNN execution.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation of temporal and spatial recurrence may allow neuromorphic chips to run the model with native spike-based communication rather than emulated floating-point operations.
The low parameter count could support scaling the same structure to longer sequences or more regions without the depth-proportional parameter growth, Theta(D^2 L), of stacked SSMs.
Direct convergence between parallel and sequential modes suggests the framework could serve as a single set of weights for both cloud training and edge inference without conversion steps.

Load-bearing premise

The diagonal SSM core can be mapped to a shared neuron layer and inter-neuronal signals to a synapse layer reconnected by a multi-transmission loop without losing parallel-scan training or the ability to add the listed biological priors.

What would settle it

A side-by-side evaluation on the same long-sequence benchmarks in which PHCSSM accuracy falls more than a few percent below the best standard SSM baselines, or in which sequential RSNN inference on held-out weights diverges from parallel-scan outputs.

Figures

Figures reproduced from arXiv: 2604.01295 by Po-Han Chiang.

**Figure 1.** Comparison of conventional stacked SSMs and the Parallelized Hierarchical Connectome (PHC) framework. (A) Conventional SSMs process sequences through stacked layers connected via unidirectional feedforward MLPs, corresponding to L independent diagonal state-transition matrices and L dense inter-layer weight matrices, supporting only temporal recurrence with no lateral or feedback interactions. (B) The PH…

**Figure 2.** Structural isomorphism between stacked SSMs and the PHC framework. Left: A conventional L-layer stacked SSM, where L independent diagonal state-transition matrices (Layer 0, Layer 1) are interleaved with L independent dense MLPs (MLP1, MLP2), each with non-shared parameters, forming a one-direction forward connection. Right: The PHC framework collapses this vertical stack into a single spatial plane. Each…

**Figure 3.** Detailed signal flow of the PHCSSM forward pass. The input sequence is projected via a linear encoder and gated by an input mask restricting sensory drive to designated populations. Within the Multi-Transmission Loop, the NL performs three sequential diagonal parallel scans (membrane potential, adaptive threshold, and refractory suppression) followed by pointwise spike generation (ALIF). The SL applies a s…

Original abstract

This work presents the Parallelized Hierarchical Connectome (PHC), a general architectural framework that upgrades temporal-only State-Space Models (SSMs) into spatiotemporal recurrent networks. Conventional SSMs achieve parallel-scan training but are limited to temporal recurrence, lacking lateral or feedback interactions within a single timestep. PHC maps the diagonal SSM core to a shared Neuron Layer and inter-neuronal communication to a shared Synapse Layer of hierarchical regions, reconnected by a Multi-Transmission Loop iterating spatial recurrence within each temporal window, at parameter complexity Theta(D^2) versus Theta(D^2 L) of stacked SSMs. This spatiotemporal framework enables the seamless integration of neuro-physical priors typically intractable for standard SSMs, including adaptive LIF, synaptic delay, STP, Dale's Law with E/I-asymmetric topology, and STDP. The framework is instantiated as PHCSSM, the first spiking SSM that integrates all five biological priors and is evaluated on long-sequence data, achieving test accuracy competitive with state-of-the-art SSM baselines at 1,312 to 4,891 trainable parameters (1 to 4 orders of magnitude smaller than every baseline). PHCSSM further admits a sequential recurrent spiking neural network (RSNN) deployment mode that converges asymptotically to the parallel-scan training mode without artificial-neural-network-to-spiking-neural-network (ANN-to-SNN) conversion, with cross-backend reproducibility verified across four hardware backends (x86 CPU, H100 GPU, Cortex-A76, Cortex-M4F) including end-to-end deployment on the Cortex-M4F microcontroller (40 KB SRAM, 128 KB Flash). PHCSSM thereby bridges parallel-scan SSM and biologically grounded RSNN, two paradigms with previously incompatible training regimes, into a single architecture and trained weights.
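The Theta(D^2) versus Theta(D^2 L) comparison can be made concrete with a back-of-envelope count. The sketch below assumes each stacked layer holds one diagonal transition (D values) and one dense D x D inter-layer matrix, as in the Figure 1 description, and that PHC shares a single copy of each. It ignores encoders, decoders, biases, and any parameters introduced by the biological priors. The function names and the sizes D = 32, L = 4 are illustrative and not taken from the paper.

```python
def stacked_ssm_params(D, L):
    """L independent diagonal transitions plus L dense D x D inter-layer matrices."""
    return L * (D + D * D)


def phc_params(D):
    """One shared diagonal neuron core plus one shared D x D synapse matrix."""
    return D + D * D


D, L = 32, 4  # illustrative sizes only
stacked = stacked_ssm_params(D, L)
shared = phc_params(D)
print(stacked, shared, stacked // shared)  # 4224 1056 4
```

Under these assumptions the saving is exactly a factor of L, so the gap widens with depth rather than with width.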

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper maps diagonal SSMs to a neuron-synapse hierarchy with a multi-transmission loop to fold in five spiking biological priors while claiming to keep parallel training and enable direct sequential deployment.

The core move here is taking the diagonal SSM core, mapping it to a shared neuron layer, routing inter-neuronal communication through a synapse layer, and closing the spatial part with a multi-transmission loop that runs inside each time window. This setup is meant to let them add adaptive LIF, synaptic delay, short-term plasticity, Dale's law with E/I-asymmetric wiring, and STDP without losing the parallel-scan training path or needing an ANN-to-SNN conversion. The same weights are then supposed to run sequentially as an RSNN on hardware.

The small parameter counts (1k to 5k) and the tests across CPU, GPU, and a Cortex-M4F microcontroller are the concrete practical pieces that stand out if they check out. Cross-backend reproducibility is a useful data point for anyone thinking about edge deployment.

The soft spot is exactly the one the stress test flags. Once you insert threshold nonlinearities, history-dependent weights, and sign constraints, the combined spatiotemporal recurrence has to stay associative for the parallel scan to remain exact. If the loop ends up doing per-timestep iteration or approximation to handle those priors, training loses its efficiency edge and the Theta(D^2) versus Theta(D^2 L) comparison no longer holds. The abstract states competitive accuracy but supplies no numbers, baselines, or error bars, so the performance claim is hard to weigh without the full evaluation.

This is aimed at people working on sequence models who want to add realistic neuron and synapse constraints, or neuromorphic engineers who need trainable spiking nets that scale. It deserves a serious referee to examine the loop implementation and verify whether the scan property survives the nonlinear additions. I would send it for review.
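The associativity concern is easiest to see on the plain diagonal case. For an affine recurrence h_t = a_t h_{t-1} + b_t, each step is a pair (a_t, b_t), and composing two steps gives another pair of the same form, which is what a parallel scan exploits. The sketch below is a generic pure-Python illustration, not the paper's implementation; `combine`, `parallel_scan`, and `sequential` are hypothetical names.

```python
def combine(first, second):
    """Compose h -> a1*h + b1 followed by h -> a2*h + b2 into one affine map."""
    a1, b1 = first
    a2, b2 = second
    return (a1 * a2, a2 * b1 + b2)


def parallel_scan(elems):
    """Hillis-Steele inclusive scan: about log2(T) rounds, and every update
    inside a round depends only on the previous round's values."""
    out = list(elems)
    step = 1
    while step < len(out):
        out = [out[i] if i < step else combine(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out


def sequential(a, b, h0=0.0):
    """Reference step-by-step evaluation of h_t = a_t * h_{t-1} + b_t."""
    h, hs = h0, []
    for at, bt in zip(a, b):
        h = at * h + bt
        hs.append(h)
    return hs


a = [0.5, 0.5, 0.5, 0.5, 0.5]
b = [1.0, 2.0, 3.0, 4.0, 5.0]
seq = sequential(a, b)
# With h0 = 0, the offset of each prefix map is the hidden state itself.
par = [offset for _, offset in parallel_scan(list(zip(a, b)))]
```

Both routes give the same states here. Any nonlinearity that cannot be folded into `combine` forces a return to the step-by-step loop, which is the efficiency loss the letter describes.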

Referee Report

1 major / 2 minor

Summary. This paper presents the Parallelized Hierarchical Connectome (PHC), a general architectural framework that upgrades temporal-only State-Space Models (SSMs) into spatiotemporal recurrent networks. It maps the diagonal SSM core to a shared Neuron Layer and inter-neuronal communication to a Synapse Layer of hierarchical regions, reconnected by a Multi-Transmission Loop that iterates spatial recurrence within each temporal window. This enables seamless integration of five neuro-physical priors (adaptive LIF, synaptic delay, STP, Dale's Law with E/I-asymmetric topology, and STDP) into an instantiated spiking SSM called PHCSSM. PHCSSM is claimed to achieve test accuracy competitive with state-of-the-art SSM baselines using only 1,312 to 4,891 trainable parameters (1-4 orders of magnitude smaller), while also supporting sequential RSNN deployment that converges asymptotically to the parallel-scan mode without ANN-to-SNN conversion and demonstrating cross-backend reproducibility across x86 CPU, H100 GPU, Cortex-A76, and Cortex-M4F (including end-to-end microcontroller deployment).

Significance. If the central claims hold, particularly the preservation of parallel-scan training efficiency alongside the integration of multiple biological priors at Theta(D^2) parameter complexity, this work would meaningfully bridge two previously incompatible paradigms: parallelizable SSMs and biologically detailed recurrent spiking networks. The low parameter counts, hardware-agnostic reproducibility (including on severely constrained microcontrollers), and explicit multi-prior integration represent concrete strengths that could influence neuromorphic hardware design and long-sequence modeling in neuroscience-inspired AI.

major comments (1)

[Multi-Transmission Loop description and parallel-scan claims] The central claim that the Multi-Transmission Loop preserves exact parallel associative scan training when nonlinear priors (adaptive LIF threshold dynamics, history-dependent STP/STDP weights, and sign-constrained Dale's Law connectivity) are added to the diagonal SSM core is load-bearing for the efficiency and Theta(D^2) vs. Theta(D^2 L) arguments. The manuscript description does not supply an explicit derivation or proof that the combined spatiotemporal recurrence remains associative (or exactly linearizable) so that the temporal scan operator stays parallelizable; without this, it is unclear whether the loop implements the priors via a single linear state transition or via per-timestep iteration/fixed-point solves that would break the parallel-scan property and revert training to sequential methods.
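A numerical check shows what is at stake in this comment. The sketch below tests whether a candidate scan operator is associative over a small set of sample elements. The affine operator passes. A naive variant that zeroes the accumulated offset once it exceeds a threshold does not. `reset_combine` is a toy stand-in for a hard-reset nonlinearity and is not the paper's neuron model; all names and sample values are hypothetical.

```python
from itertools import product


def affine_combine(first, second):
    """Composition of two affine maps h -> a*h + b; associative by construction."""
    a1, b1 = first
    a2, b2 = second
    return (a1 * a2, a2 * b1 + b2)


def reset_combine(first, second, threshold=1.0):
    """Toy attempt to fold a hard reset into the operator: the accumulated
    offset is zeroed whenever it crosses the threshold."""
    a, b = affine_combine(first, second)
    return (a, 0.0 if b > threshold else b)


def is_associative(op, samples, tol=1e-12):
    """Check op(op(x, y), z) == op(x, op(y, z)) for every triple of samples."""
    for x, y, z in product(samples, repeat=3):
        left = op(op(x, y), z)
        right = op(x, op(y, z))
        if any(abs(l - r) > tol for l, r in zip(left, right)):
            return False
    return True


samples = [(1.0, 0.6), (1.0, -0.5), (0.5, 0.25)]
affine_ok = is_associative(affine_combine, samples)
reset_ok = is_associative(reset_combine, samples)
```

A passing check on samples is not a proof, but a single failing triple is enough to show that an operator cannot be used in an exact parallel scan, so a check of this kind is a cheap first test for any derivation the authors supply.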

minor comments (2)

[Abstract] Strong performance and reproducibility claims are made without any numerical accuracy values, baseline names, dataset identifiers, run counts, or error bars; adding at least the key quantitative highlights would improve standalone readability.
[Evaluation section] The evaluation is described as using 'long-sequence data' but the provided text supplies no dataset names, sequence lengths, or statistical details of the accuracy comparisons; these should be stated explicitly in the results section.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough and constructive review. The feedback on the Multi-Transmission Loop and its compatibility with parallel-scan training is particularly valuable. We address the major comment below and have revised the manuscript to strengthen the presentation.

Referee: The central claim that the Multi-Transmission Loop preserves exact parallel associative scan training when nonlinear priors (adaptive LIF threshold dynamics, history-dependent STP/STDP weights, and sign-constrained Dale's Law connectivity) are added to the diagonal SSM core is load-bearing for the efficiency and Theta(D^2) vs. Theta(D^2 L) arguments. The manuscript description does not supply an explicit derivation or proof that the combined spatiotemporal recurrence remains associative (or exactly linearizable) so that the temporal scan operator stays parallelizable; without this, it is unclear whether the loop implements the priors via a single linear state transition or via per-timestep iteration/fixed-point solves that would break the parallel-scan property and revert training to sequential methods.

Authors: We agree that an explicit derivation is necessary to substantiate the central claim. The original manuscript described the Multi-Transmission Loop at a high level but did not include a formal proof of associativity under the nonlinear priors. In the revised manuscript we have added a new subsection (3.3) that derives the spatiotemporal recurrence explicitly. We show that the spatial iterations within each temporal window are resolved via a fixed number of linear updates to auxiliary state variables (for adaptive thresholds, STP, and STDP), after which the effective temporal transition matrix remains linear and associative. This formulation preserves the parallel-scan operator exactly, without per-timestep nonlinear solves or iteration that would break parallelism. A proof sketch demonstrating equivalence to the standard SSM scan is now included, along with a complexity analysis confirming the Theta(D^2) parameter scaling. We believe these additions directly address the concern.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in architectural mapping or claims

The paper presents PHC as an architectural framework that maps the diagonal SSM core to a Neuron Layer, inter-neuronal communication to a Synapse Layer, and reconnects them via a Multi-Transmission Loop to incorporate spatiotemporal recurrence and biological priors at Theta(D^2) parameter complexity while preserving parallel-scan training. The abstract and provided text contain no equations, derivations, or self-referential steps that reduce the central efficiency or performance claims to fitted quantities defined by the same parameters, self-definitional loops, or load-bearing self-citations. The claims are framed as consequences of the design mapping rather than tautological reductions, with empirical support from accuracy comparisons and hardware deployment. This is a standard non-circular finding for an architectural proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Ledger populated from abstract descriptions only; full paper would likely add more parameters and assumptions around hierarchy depth and prior implementations.

axioms (1)

domain assumption: Diagonal SSM core maps directly to shared Neuron Layer without functional loss.
This mapping is invoked to justify the spatiotemporal extension and low parameter scaling.

invented entities (1)

Multi-Transmission Loop · no independent evidence
purpose: Iterates spatial recurrence within each temporal window.
New architectural component introduced to enable lateral and feedback interactions.

pith-pipeline@v0.9.0 · 5856 in / 1512 out tokens · 72455 ms · 2026-05-21T10:10:18.592458+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · relation: unclear
Passage: PHC maps the diagonal SSM core to a shared Neuron Layer and inter-neuronal communication to a shared Synapse Layer... Multi-Transmission Loop enables intra-slice spatial recurrence... preserving O(log T) temporal parallelism

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear
Passage: reformulating these non-linear biological dynamics into affine recurrences solvable via log-domain parallel prefix sums

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · relation: unclear
Passage: integrates adaptive LIF, synaptic delay, STP, Dale’s Law with E/I-asymmetric topology, and STDP

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
