Burst-dependent plasticity and dendritic amplification support target-based learning and hierarchical imitation learning

Cosimo Lupo; Cristiano Capone; Paolo Muratore; Pier Stanislao Paolucci

arxiv: 2201.11717 · v1 · submitted 2022-01-27 · 🧬 q-bio.NC

Pith reviewed 2026-05-24 12:11 UTC · model grok-4.3

keywords: multi-compartment neurons · burst-dependent plasticity · target-based learning · hierarchical imitation learning · dendritic segregation · pyramidal neurons · biological learning models

The pith

A multi-compartment pyramidal neuron model uses bursts and dendritic segregation to enable target-based learning without backpropagation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a neuron model with multiple compartments where bursts carry an internal target pattern that guides the network toward a solution. This setup lets learning occur by matching activity to the suggested pattern rather than computing errors across layers. A sympathetic reader would care because it offers a way for biological circuits to achieve efficient learning on complex tasks without the non-biological machinery of backpropagation. The same structure also lets the network decompose long decision sequences into simpler imitation subtasks.

Core claim

In a multi-compartment model of pyramidal neurons, bursts and dendritic input segregation support biological target-based learning by suggesting an internal spatio-temporal pattern of bursts to the network. This bypasses error backpropagation and credit assignment. The architecture also naturally supports hierarchical imitation learning by enabling decomposition of long-horizon decision-making tasks into simpler subtasks.

What carries the argument

Multi-compartment pyramidal neuron model with burst-dependent plasticity and dendritic input segregation that supplies an internal target pattern.
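To make the role of dendritic input segregation concrete, here is a minimal sketch of a toy two-compartment unit. It is not the paper's model: the class name `TwoCompartmentNeuron`, both threshold values, and the rule that a somatic spike coinciding with apical activation yields a burst are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TwoCompartmentNeuron:
    """Toy unit: basal drive triggers somatic spikes, apical drive gates bursts."""
    basal_threshold: float = 1.0   # hypothetical value
    apical_threshold: float = 1.0  # hypothetical value

    def step(self, basal_input: float, apical_input: float) -> str:
        # The two input streams are kept separate and only meet at this check.
        spike = basal_input >= self.basal_threshold
        apical_active = apical_input >= self.apical_threshold
        if spike and apical_active:
            return "burst"   # coincidence of somatic spike and apical activation
        if spike:
            return "spike"   # isolated somatic spike
        return "silent"      # apical input alone does not fire the unit


def run(neuron: TwoCompartmentNeuron, basal: List[float], apical: List[float]) -> List[str]:
    """Feed paired basal/apical inputs through the unit, one time step each."""
    return [neuron.step(b, a) for b, a in zip(basal, apical)]


neuron = TwoCompartmentNeuron()
events = run(neuron, basal=[0.2, 1.3, 1.5, 0.1], apical=[1.4, 0.3, 1.2, 1.6])
print(events)  # ['silent', 'spike', 'burst', 'silent']
```

In this sketch the apical signal never produces output by itself; it only changes what a somatic spike means, which is one simple way to read "segregation" of the teaching or context signal from the feed-forward drive.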

If this is right

Learning proceeds by matching network activity to an internally suggested burst pattern rather than by propagating errors backward.
Complex long-horizon tasks can be broken into simpler subtasks that the same architecture learns by imitation.
Credit assignment is handled locally through the separation of dendritic inputs from somatic bursts.
The model achieves state-of-the-art task performance while remaining composed of biologically motivated neuron compartments.
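The first and third points above can be illustrated with a generic local delta rule. This is a sketch under stated assumptions, not the plasticity rule of the paper: the one-hot "clock" input, the sigmoid burst probability, the learning rate, and the function names `train_to_target` and `replay` are all hypothetical. Each weight update uses only the presynaptic input and the unit's own target burst, so no error is propagated between units.

```python
import math
import random


def sigmoid(u: float) -> float:
    return 1.0 / (1.0 + math.exp(-u))


def train_to_target(inputs, target, epochs=200, lr=0.5, seed=0):
    """Fit weights so each unit reproduces its own target burst train.

    inputs: list over time of input vectors.
    target: list over time of 0/1 burst indicators, one per unit.
    """
    rng = random.Random(seed)
    n_in, n_out = len(inputs[0]), len(target[0])
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, b_star in zip(inputs, target):
            for i in range(n_out):
                p = sigmoid(sum(w[i][j] * x[j] for j in range(n_in)))
                delta = b_star[i] - p  # local mismatch: unit i's target vs. its own activity
                for j in range(n_in):
                    w[i][j] += lr * delta * x[j]
    return w


def replay(w, inputs):
    """Deterministic readout: a unit bursts when its summed drive is positive."""
    return [[1 if sum(wi[j] * x[j] for j in range(len(x))) > 0 else 0 for wi in w]
            for x in inputs]


T = 6
clock = [[1.0 if j == t else 0.0 for j in range(T)] for t in range(T)]  # assumed input code
target_bursts = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 0, 1]]
weights = train_to_target(clock, target_bursts)
reproduced = replay(weights, clock)
print(reproduced == target_bursts)  # True
```

The sketch is feed-forward for brevity. In a recurrent version, the suggested target pattern would also supply the presynaptic activity during training, which is what makes the target-based framing attractive for recurrent networks.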

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The mechanism suggests a route by which cortical circuits could generate and use internal targets without an external teacher at every step.
Similar segregation of signals could be explored in artificial networks to improve sample efficiency on sequential decision problems.
The architecture implies that hierarchical task decomposition might emerge directly from the neuron-level separation of burst and dendritic computation.
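The idea of task decomposition can be sketched with a two-level controller for a simplified version of the button-and-food task shown in Figure 3. Everything here is an illustrative assumption: the policies are hand-written rather than learned by imitation, the grid has no walls, and the names `high_level`, `low_level`, and `run_episode` are hypothetical. The point is only that the high level emits a context and the low level solves a short-horizon subtask conditioned on it.

```python
from typing import List, Tuple

Pos = Tuple[int, int]


def high_level(button_pressed: bool) -> str:
    """High-level policy: emit the context (current subtask) for the low level."""
    return "food" if button_pressed else "button"


def low_level(pos: Pos, goal: Pos) -> Pos:
    """Low-level policy: one greedy grid step toward the goal chosen by the context."""
    x, y = pos
    gx, gy = goal
    if x != gx:
        x += 1 if gx > x else -1
    elif y != gy:
        y += 1 if gy > y else -1
    return (x, y)


def run_episode(start: Pos, button: Pos, food: Pos, max_steps: int = 50):
    """Return the visited path and whether the food was reached after the button."""
    goals = {"button": button, "food": food}
    pos, pressed = start, False
    path: List[Pos] = [start]
    for _ in range(max_steps):
        context = high_level(pressed)          # which subtask is active
        pos = low_level(pos, goals[context])   # act within that subtask
        path.append(pos)
        if pos == button:
            pressed = True
        if pressed and pos == food:
            return path, True
    return path, False


path, success = run_episode(start=(0, 0), button=(2, 1), food=(0, 3))
print(success, len(path))  # True 8
```

Neither level needs to represent the full long-horizon plan: the high level only tracks one bit of task state, and the low level only needs to reach whichever goal the context selects.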

Load-bearing premise

The proposed multi-compartment architecture with burst-dependent plasticity can be realized in real cortical circuits, and supplying an internal target pattern via bursts is biologically feasible without additional mechanisms for credit assignment.

What would settle it

Recording burst activity in cortical pyramidal neurons during a learning task to test whether the observed spatio-temporal burst patterns match an internal target solution when no external error signal is present.
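An analysis of this kind would need two steps: extracting bursts from recorded spike times, then scoring their overlap with a candidate target pattern. The sketch below shows one simple way to do both. It is an assumption-laden illustration, not a protocol from the paper: the inter-spike-interval criterion, the 10 ms and 5 ms values, and the function names `detect_bursts` and `match_fraction` are all hypothetical choices.

```python
from typing import List


def detect_bursts(spike_times_ms: List[float],
                  max_isi_ms: float = 10.0,
                  min_spikes: int = 2) -> List[float]:
    """Return onset times of bursts.

    A burst is a run of at least `min_spikes` spikes whose consecutive
    inter-spike intervals are all <= `max_isi_ms` (assumed criterion).
    """
    onsets: List[float] = []
    current: List[float] = []
    for t in sorted(spike_times_ms):
        if current and t - current[-1] <= max_isi_ms:
            current.append(t)
        else:
            if len(current) >= min_spikes:
                onsets.append(current[0])
            current = [t]
    if len(current) >= min_spikes:
        onsets.append(current[0])
    return onsets


def match_fraction(observed: List[float], target: List[float],
                   tolerance_ms: float = 5.0) -> float:
    """Fraction of target burst onsets with an observed onset within tolerance."""
    if not target:
        return 0.0
    hits = sum(any(abs(o - t) <= tolerance_ms for o in observed) for t in target)
    return hits / len(target)


spikes = [5.0, 9.0, 12.0, 60.0, 100.0, 104.0, 180.0]
onsets = detect_bursts(spikes)
print(onsets)  # [5.0, 100.0]
score = match_fraction(onsets, target=[4.0, 101.0, 150.0])
print(round(score, 3))  # 0.667
```

A real test would also need a null distribution (for example, from time-shuffled burst trains) to decide whether a given match fraction exceeds chance.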

Figures

Figures reproduced from arXiv: 2201.11717 by Cosimo Lupo, Cristiano Capone, Paolo Muratore, Pier Stanislao Paolucci.

**Figure 1.** Model structure. A. The model of a pyramidal neuron, consisting of two separated compartments, the basal and the apical ones. The latter is further divided into two regions, proximal (receiving recurrent connections from the network) and distal (receiving teaching/context signals from other areas of the cortex). B. In addition to isolated spike signals emitted by the soma, a coincidence mechanism between ba…

**Figure 2.** Apical signals for dynamics selection. A. Model of pyramidal neuron where a binary context signal (A or B) is projected on the apical distal compartment. The target to be reproduced by the network changes according to which context is active. B. The network is able to reproduce the correct output trajectory even if the context is provided only in the first time steps. An alternative model in which the cont…

**Figure 3.** Hierarchical Imitation Learning. A. A two-level network, where high-level neurons produce a signal that serves as a context for the neurons in the low-level network. The two subnetworks received two different but synchronized teaching signals in the training phase. B. Button-and-food task: an agent placed at an initial position (black cross) in a 2D maze has to first reach a button (red circle) and then th…

**Figure 4.** Convergence of the target pattern of bursts. (left) $D(B^\star_n, B^\star_{n-1})$/(number of bursts) as a function of the number $n$ of learning iterations, for different $\sigma_{\mathrm{targ}}$ values (lower to higher values, from dark to light). (middle) Distance between the target and spontaneous pattern of bursts $D(B^\star_n, B_n)$ after $n$ learning iterations. (right) Blue: average final $D(B^\star_n, B^\star_{n-1})$/(number of bursts) value as a func…
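The Figure 4 caption reports a distance between successive target burst patterns, divided by the number of bursts. The caption does not define the distance D, so the following is only a sketch under assumptions: D is taken to be the Hamming distance between binary neuron-by-time arrays, and the normalization uses the burst count of the first argument. The function names are hypothetical.

```python
from typing import List

Pattern = List[List[int]]  # rows: neurons, columns: time bins, entries: 0/1 bursts


def hamming(a: Pattern, b: Pattern) -> int:
    """Number of (neuron, time) entries at which the two patterns differ."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def normalized_distance(a: Pattern, b: Pattern) -> float:
    """Assumed form of D(a, b) / (number of bursts in a)."""
    n_bursts = sum(sum(row) for row in a)
    if n_bursts == 0:
        raise ValueError("pattern contains no bursts; normalization undefined")
    return hamming(a, b) / n_bursts


prev_target = [[1, 0, 0, 1],
               [0, 1, 0, 0]]
curr_target = [[1, 0, 1, 1],
               [0, 1, 0, 0]]
print(normalized_distance(curr_target, prev_target))  # 0.25
```

Under this reading, a value approaching zero across learning iterations would indicate that the target pattern has stopped changing.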

Original abstract

The brain can learn to solve a wide range of tasks with high temporal and energetic efficiency. However, most biological models are composed of simple single compartment neurons and cannot achieve the state-of-art performances of artificial intelligence. We propose a multi-compartment model of pyramidal neuron, in which bursts and dendritic input segregation give the possibility to plausibly support a biological target-based learning. In target-based learning, the internal solution of a problem (a spatio temporal pattern of bursts in our case) is suggested to the network, bypassing the problems of error backpropagation and credit assignment. Finally, we show that this neuronal architecture naturally support the orchestration of hierarchical imitation learning, enabling the decomposition of challenging long-horizon decision-making tasks into simpler subtasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The model uses dendritic bursts for target-based learning but the source of those target patterns is left underspecified.

The core idea is a multi-compartment pyramidal neuron where burst-dependent plasticity lets an internal spatio-temporal burst pattern act as the learning target. This is meant to sidestep backpropagation and credit assignment while also enabling hierarchical imitation learning by breaking long tasks into subtasks. That combination is the main new element relative to earlier burst or multi-compartment work.

The paper does a reasonable job of framing why dendritic segregation could make target-based signals biologically plausible and why that might help with hierarchical decomposition, which is a live question in both systems neuroscience and neuromorphic design. It deserves credit for trying to connect the cellular mechanism directly to the functional outcome rather than stopping at the single-neuron level.

The soft spot is exactly the stress-test concern: nothing in the abstract (and apparently little in the full text) explains how the correct burst target pattern is generated or routed to the right compartments in the first place. If that step requires an external oracle or additional circuitry that itself faces the credit-assignment problem, the bypass claim weakens. The paper would be stronger with explicit equations for the plasticity rule, simulations showing that learning actually occurs under the stated conditions, and a clear account of where the targets come from. Without those, the architecture risks embedding the desired behavior by fiat.

This is for readers already working on biologically grounded alternatives to backprop or on hierarchical control in spiking networks. It is coherent enough on its own terms to deserve referee time, though the central claim needs the missing mechanistic detail checked.

Referee Report

2 major / 2 minor

Summary. The paper proposes a multi-compartment pyramidal neuron model in which burst-dependent plasticity and segregation of dendritic inputs enable target-based learning: an internal spatio-temporal pattern of bursts is provided as the solution to a task, bypassing error backpropagation and credit assignment. The same architecture is claimed to naturally support hierarchical imitation learning by decomposing long-horizon tasks into simpler subtasks.

Significance. If the central claims were demonstrated with explicit mechanisms, derivations, and validation, the work would offer a concrete biological substrate for target-based learning rules that avoid the credit-assignment problem, with potential implications for both cortical computation and biologically inspired AI architectures. No machine-checked proofs, reproducible code, or falsifiable predictions are identified in the manuscript.

major comments (2)

[Abstract] The claim that 'bursts ... give the possibility to plausibly support a biological target-based learning' by 'suggesting' the internal solution rests on an unspecified mechanism for generating and routing the correct spatio-temporal burst pattern to the relevant compartments. This mechanism is load-bearing for both the target-based learning claim and the hierarchical imitation claim, yet the manuscript provides no description of how the pattern is produced without either an external oracle or additional circuitry that would itself require credit assignment.
[Abstract] The statement that the architecture 'naturally support[s] the orchestration of hierarchical imitation learning' is presented without any derivation, simulation protocol, or concrete mapping from burst patterns to subtask decomposition. Because the central contribution is the assertion that the architecture supports these learning modes, the absence of even a minimal formalization or result undermines evaluation of whether the claimed support is achieved or merely asserted.

minor comments (2)

[Abstract] 'state-of-art' should read 'state-of-the-art'.
[Abstract] 'naturally support the orchestration' should read 'naturally supports the orchestration'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to improve clarity on the assumptions underlying the target burst patterns and the hierarchical learning claims.

Referee: [Abstract] The claim that 'bursts ... give the possibility to plausibly support a biological target-based learning' by 'suggesting' the internal solution rests on an unspecified mechanism for generating and routing the correct spatio-temporal burst pattern to the relevant compartments. This mechanism is load-bearing for both the target-based learning claim and the hierarchical imitation claim, yet the manuscript provides no description of how the pattern is produced without either an external oracle or additional circuitry that would itself require credit assignment.

Authors: The model is explicitly framed within the target-based learning paradigm, in which an internal spatio-temporal burst pattern is provided as the solution (as stated in the abstract: 'the internal solution of a problem ... is suggested to the network'). The contribution centers on how burst-dependent plasticity and dendritic segregation enable local learning once such a target is available, thereby avoiding backpropagation-style credit assignment at the synaptic level. We do not provide a mechanism for generating the target pattern itself, as this is outside the scope of the local learning rule; the manuscript treats the pattern as an external input analogous to a teacher signal. We will revise the abstract and add a dedicated paragraph in the Discussion section to explicitly acknowledge this assumption and outline possible biological sources (e.g., top-down signals from prefrontal areas or recurrent dynamics) that could supply the pattern without requiring the same credit-assignment solution.
Revision: yes

Referee: [Abstract] The statement that the architecture 'naturally support[s] the orchestration of hierarchical imitation learning' is presented without any derivation, simulation protocol, or concrete mapping from burst patterns to subtask decomposition. Because the central contribution is the assertion that the architecture supports these learning modes, the absence of even a minimal formalization or result undermines evaluation of whether the claimed support is achieved or merely asserted.

Authors: The claim of natural support for hierarchical imitation learning follows from the architecture's capacity to represent distinct subtasks via separable burst patterns that can be combined or sequenced without altering the underlying plasticity rule. However, we agree that the manuscript presents this conceptually without an explicit derivation or example. We will add a short illustrative example (or minimal simulation protocol) in a new subsection or appendix that maps specific burst patterns to subtask decomposition in a simple long-horizon task, thereby providing the requested concrete mapping.
Revision: yes

Circularity Check

0 steps flagged

No circularity: model proposal posits architecture enabling target-based learning without self-referential reduction in equations

The paper proposes a multi-compartment pyramidal neuron model in which bursts and dendritic input segregation are claimed to support target-based learning by suggesting an internal spatio-temporal burst pattern, thereby bypassing backpropagation. No equations, derivations, or parameter-fitting steps are supplied in the abstract or description that would allow a prediction to reduce by construction to its own inputs, a fitted parameter, or a self-citation chain. The central claim is a biological-plausibility hypothesis rather than a closed mathematical derivation; the architecture is posited to enable the desired behavior, but the provided text contains no load-bearing step that can be shown to be equivalent to its inputs. This is the normal finding for an initial modeling proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the untested assertion that the described compartmental architecture can implement target-based learning and hierarchical imitation learning in a biologically plausible manner; no free parameters, axioms, or invented entities are quantified in the abstract.

axioms (1)

Domain assumption: Pyramidal neurons possess distinct somatic and dendritic compartments whose activity can be segregated.
Invoked implicitly when the abstract states that bursts and dendritic input segregation enable the learning modes.

invented entities (1)

Burst-dependent plasticity rule for target-based learning (no independent evidence)
Purpose: To allow direct provision of internal solution patterns without backpropagation.
Introduced in the abstract as the mechanism that supports target-based learning.
