
arxiv: 2401.10748 · v2 · submitted 2023-12-28 · 💻 cs.NE · cs.LG

Fast gradient-free activation maximization for neurons in spiking neural networks

Pith reviewed 2026-05-24 05:02 UTC · model grok-4.3

classification 💻 cs.NE cs.LG
keywords activation maximization · spiking neural networks · tensor train decomposition · gradient-free optimization · neuron selectivity · generative models · SN-GAN · VQ-VAE

The pith

A low-rank Tensor Train decomposition of the activation function enables gradient-free maximization of neuron responses in spiking neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an optimization technique to discover the stimuli that most strongly activate individual neurons in spiking neural networks. It uses low-rank Tensor Train decomposition to handle the discrete nature of spiking activations and performs the search in the latent space of generative models such as SN-GAN or VQ-VAE. This approach makes it possible to track how neuron selectivity develops over the course of training in convolutional spiking networks. A sympathetic reader would care because the method supplies a practical way to interpret the features learned by spiking models, which simulate biological neural activity and are otherwise difficult to probe with gradient-based tools.
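To make the "discrete nature of spiking activations" concrete, here is a minimal sketch of a leaky integrate-and-fire (LIF) neuron, the neuron type named in the figure captions. The decay factor `beta`, the threshold, the reset-by-subtraction rule, and the constant input current are illustrative assumptions, not the paper's settings. The only point is that the response is an integer spike count, which changes in steps as the input varies and so offers no useful gradient.

```python
def lif_spike_count(current, steps=50, beta=0.9, threshold=1.0):
    """Count spikes of a toy leaky integrate-and-fire neuron.

    All parameters are illustrative assumptions, not values from the paper.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(steps):
        membrane = beta * membrane + current  # leaky integration of the input
        if membrane >= threshold:             # hard threshold: non-differentiable
            spikes += 1
            membrane -= threshold             # reset by subtraction
    return spikes
```

With these toy settings a weak input such as `current=0.05` never reaches the threshold and yields 0 spikes, while `current=1.0` fires on every one of the 50 steps.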

Core claim

The optimization method based on the low-rank Tensor Train decomposition of the discrete activation function allows effective activation maximization (AM) for neurons in spiking neural networks (SNNs). To the authors' knowledge, this is the first time effective AM has been applied to SNNs. Highly selective neurons can already form in the early epochs of training and in the early layers of a convolutional spiking network. This formation of refined optimal stimuli is associated with an increase in classification accuracy. Some neurons, especially in the deeper layers, may gradually change the concepts they are selective for during learning.
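The distinction between neurons that keep one concept and neurons that change concepts can be sketched with the labels used in the Figure 10 caption: non-specialized, stable (selective to only 1 class), and labile (selective to 2 or more). The input format below, one preferred class per training checkpoint with `None` for "not selective", is a hypothetical convention for illustration; how the paper decides whether a neuron is selective at a given checkpoint is not specified here.

```python
def classify_neuron(classes_over_epochs):
    """Label a neuron from its preferred class at each training checkpoint.

    `classes_over_epochs` is a hypothetical input format: one entry per
    checkpoint, either a class name or None when the neuron was not selective.
    """
    distinct = {c for c in classes_over_epochs if c is not None}
    if not distinct:
        return "non-specialized"
    return "stable" if len(distinct) == 1 else "labile"
```

Under this convention, a neuron whose preferred class alternates between "bird" and "ship" across checkpoints would be labeled labile.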

What carries the argument

Low-rank Tensor Train decomposition of the discrete activation function, which approximates neuron responses to enable optimization over the latent parameter space of SN-GAN or VQ-VAE generative models without gradients.
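As a rough illustration of why the Tensor Train (TT) format helps, the sketch below stores a tensor as a chain of small three-way cores and evaluates a single entry as a product of small matrices, without ever forming the full tensor. The helper names, the nested-list layout `core[i][a][b]`, and the random cores are assumptions made for this example; the paper's actual pipeline and ranks are not shown here.

```python
import random


def random_tt(shape, rank, seed=0):
    """Build random TT cores; core k is indexed as core[i_k][a][b]."""
    rng = random.Random(seed)
    ranks = [1] + [rank] * (len(shape) - 1) + [1]
    return [
        [[[rng.uniform(-1.0, 1.0) for _ in range(ranks[k + 1])]
          for _ in range(ranks[k])]
         for _ in range(n)]
        for k, n in enumerate(shape)
    ]


def tt_element(cores, index):
    """Evaluate one tensor entry as a product of small matrices."""
    row = [1.0]  # 1 x r_0 row vector, with r_0 = 1
    for core, i in zip(cores, index):
        matrix = core[i]  # r_{k-1} x r_k slice selected by index i_k
        row = [
            sum(row[a] * matrix[a][b] for a in range(len(row)))
            for b in range(len(matrix[0]))
        ]
    return row[0]  # r_d = 1, so a single number remains


def tt_parameter_count(cores):
    """Number of stored values in the TT representation."""
    return sum(len(core) * len(core[0]) * len(core[0][0]) for core in cores)
```

For a grid with 6 dimensions of 4 values each, the full tensor has 4^6 = 4096 entries, while a TT with all internal ranks equal to 3 stores only 168 numbers. That gap is what makes a search over a discretized latent space affordable.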

If this is right

  • Highly selective neurons form in early epochs and early layers of convolutional spiking networks.
  • Formation of refined optimal stimuli correlates with gains in classification accuracy.
  • Neurons in deeper layers can shift the concepts they respond to during learning.
  • The framework supports efficient iterative stimulus adjustment to maximize responses with fewer iterations.
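One way to quantify "highly selective" is suggested by the Figure 6 caption, which bins neurons by the entropy of class prediction probabilities. The sketch below computes that entropy for a single probability vector; the threshold used to call a neuron selective is a hypothetical choice for illustration, not a value from the paper.

```python
import math


def class_entropy(probs):
    """Shannon entropy (in bits) of a class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def is_selective(probs, max_entropy_bits=1.0):
    """Hypothetical rule: a neuron counts as selective if entropy is low."""
    return class_entropy(probs) <= max_entropy_bits
```

A distribution concentrated on one class has entropy 0, while a uniform distribution over 10 classes has entropy log2(10), about 3.32 bits, so lower values indicate sharper class preference.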

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be adapted to probe selectivity in other non-differentiable models beyond SNNs.
  • Early selectivity formation might indicate that important features are learned rapidly in spiking architectures.
  • Applying the technique to real biological neuron recordings could test its relevance to living systems.
  • Reducing iterations in the feedback loop has implications for efficient interpretation of large networks.

Load-bearing premise

The low-rank Tensor Train approximation of the activation function in the generative model's latent space still identifies the true stimuli that maximize neuron activation without significant distortion.

What would settle it

Compare the activation levels produced by the Tensor Train-optimized stimuli against those from random sampling or other optimization baselines on the same trained SNN; failure to produce consistently higher activations would falsify the effectiveness claim.
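The comparison described above can be sketched as a paired, per-neuron summary. The function name and the choice of statistics are assumptions for illustration; the inputs would be activations measured on the same trained SNN, one value per neuron for the optimized stimulus and one for the best baseline stimulus.

```python
import statistics


def compare_to_baseline(optimized, baseline):
    """Paired comparison of per-neuron activations.

    optimized[i] and baseline[i] are the activations of neuron i for the
    optimized stimulus and for the best baseline stimulus, respectively.
    """
    if len(optimized) != len(baseline) or not optimized:
        raise ValueError("need two non-empty lists of equal length")
    gains = [o - b for o, b in zip(optimized, baseline)]
    return {
        "win_rate": sum(g > 0 for g in gains) / len(gains),
        "mean_gain": statistics.mean(gains),
        "median_gain": statistics.median(gains),
    }
```

A win rate near 0.5 with gains centered on zero would indicate that the optimizer does no better than the baseline, which is the outcome that would undercut the effectiveness claim.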

Figures

Figures reproduced from arXiv: 2401.10748 by Andrei Chertkov, Ivan Oseledets, Konstantin Anokhin, Maxim Beketov, Nikita Pospelov.

Figure 1: Spiking neuron models (image from snnTorch online tutorial [41])
Figure 2: Schematic of the PROTES method. For our problem we’ve tried several TT-based optimization methods, and found PROTES [51] to be the most effective. PROTES stands for “Probabilistic Optimization with Tensor Sampling” – it is a probabilistic optimization method. The main idea is similar to that of Simulated Annealing methods (see [52]) of the Monte-Carlo family. While gradient-based methods may be stuck in lo…
Figure 3: Schematic of the MANGO framework. Within the framework, one can select a dataset, a generator model, a target neural network model, and an optimization method. One can also generate MEI search using various hardware backends and save and analyze the results. The following list shows the options that have been added to the framework: 1. Datasets: MNIST [60], Fashion-MNIST [61], CIFAR10 [54], Imagenet [62] 2…
Figure 4: Activation of one selected neuron (unit 0, first LIF spiking layer, spiking …
Figure 5: “Pink horse” neuron 52 from LIF layer 1.1 of spiking ResNet18. Images …
Figure 6: Distribution of neurons by entropy of class prediction probabilities (lower …
Figure 7: Fraction of selective neurons. Horizontal axis - epoch number, vertical …
Figure 8: Emergence of neuronal specializations. Horizontal axis – layer depth (in …
Figure 9: Left: Dynamics of neurons’ activations in response to their MEIs. Right: the same for the first 100 epochs with the model classification accuracy in the same axes.
Figure 10: Left: fractions of non-specialized, stable (selective to only 1 class) and labile (selective to 2 or more) neurons vs layer depth. Right: oscillating MEIs of labile “bird-ship” neuron, rows: epoch # 40, 100, 300 of training; columns: results obtained with TT, TT-S, TT-B optimization methods. The captions above each MEI provide information about the predominant class according to classification results, t…
Figure 11: Diversity of specializations in terms of distances between MEIs in the …
Figure 12: Complexity of MEIs, measured as compressed file size (JPEG and ZIP …
Figure 13: MEIs of first spiking (sn1) layer neurons.
Figure 14: MEIs of intermediary (layer2.1.sn1) layer neurons.
Figure 15: MEIs of last spiking layer (layer4.1.sn1).
Figure 16: 10 classes of images in CIFAR10 dataset [54]
Figure 17: MEIs for unit 16 of the first spiking layer from spiking ResNet18.
Figure 18: MEIs for unit 35 of the first spiking layer from spiking ResNet18.
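
The Figure 2 caption describes PROTES as a probabilistic optimization method built on sampling. The sketch below shows only the general sample, select, and update loop on a discrete grid. It is a deliberately simplified stand-in, not PROTES: it uses independent per-dimension categorical distributions (in effect a cross-entropy-style method) rather than a Tensor Train distribution. The function name, all hyperparameters, and the objective interface are assumptions for illustration.

```python
import random


def sample_and_refine(objective, dims, n_values, iters=30, batch=40,
                      elite=5, lr=0.3, seed=0):
    """Maximize a black-box function on a discrete grid without gradients."""
    rng = random.Random(seed)
    # One categorical distribution per dimension, initially uniform.
    probs = [[1.0 / n_values] * n_values for _ in range(dims)]
    best_x, best_val, history = None, float("-inf"), []
    for _ in range(iters):
        # Draw a batch of candidate grid points from the current distribution.
        batch_x = [
            tuple(rng.choices(range(n_values), weights=p)[0] for p in probs)
            for _ in range(batch)
        ]
        scored = sorted(((objective(x), x) for x in batch_x), reverse=True)
        if scored[0][0] > best_val:
            best_val, best_x = scored[0]
        history.append(best_val)  # best value seen so far
        # Shift each distribution toward the values seen in the elite samples.
        for d in range(dims):
            counts = [0] * n_values
            for _, x in scored[:elite]:
                counts[x[d]] += 1
            probs[d] = [
                (1 - lr) * p + lr * c / elite for p, c in zip(probs[d], counts)
            ]
    return best_x, best_val, history
```

In the setting of the paper, `objective` would map a discretized latent code to a neuron's response through the generator and the SNN; here it can be any function of an index tuple that returns a number.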

Original abstract

Elements of neural networks, both biological and artificial, can be described by their selectivity for specific cognitive features. Understanding these features is important for understanding the inner workings of neural networks. For a living system, such as a neuron, whose response to a stimulus is unknown and not differentiable, the only way to reveal these features is through a feedback loop that exposes it to a large set of different stimuli. The properties of these stimuli should be varied iteratively in order to maximize the neuronal response. To utilize this feedback loop for a biological neural network, it is important to run it quickly and efficiently in order to reach the stimuli that maximize certain neurons' activation with the least number of iterations possible. Here we present a framework with an efficient design for such a loop. We successfully tested it on an artificial spiking neural network (SNN), which is a model that simulates the asynchronous spiking activity of neurons in living brains. Our optimization method for activation maximization is based on the low-rank Tensor Train decomposition of the discrete activation function. The optimization space is the latent parameter space of images generated by SN-GAN or VQ-VAE generative models. To our knowledge, this is the first time that effective AM has been applied to SNNs. We track changes in the optimal stimuli for artificial neurons during training and show that highly selective neurons can form already in the early epochs of training and in the early layers of a convolutional spiking network. This formation of refined optimal stimuli is associated with an increase in classification accuracy. Some neurons, especially in the deeper layers, may gradually change the concepts they are selective for during learning, potentially explaining their importance for model performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents a gradient-free activation maximization (AM) framework for spiking neural networks (SNNs) that decomposes the discrete, non-differentiable activation function via low-rank Tensor Train (TT) approximation and optimizes in the latent space of SN-GAN or VQ-VAE generative models. It claims this is the first effective application of AM to SNNs, reports that highly selective neurons form early in training (including in early layers of convolutional SNNs), and links this formation to gains in classification accuracy; some deeper-layer neurons are observed to shift their preferred concepts during learning.

Significance. If the TT approximation and generative-manifold optimization reliably recover the true maximizing stimuli without substantial distortion, the method would supply a practical tool for interpreting non-differentiable SNN neurons and could illuminate how selectivity emerges during training. The early-epoch selectivity observation, if quantitatively tied to accuracy, would be a useful empirical contribution to SNN training dynamics. The approach is novel in its use of TT decomposition for this discrete setting, but its utility hinges on unverified preservation of maxima.

major comments (3)
  1. [Abstract and §4] Abstract and §4 (Experiments): the central claim of 'effective AM' for SNNs is unsupported by any quantitative validation, baselines, error metrics, ablation studies, or statistical analysis; the abstract supplies only qualitative observations, so the headline result cannot be assessed.
  2. [§3] §3 (Method): the low-rank TT decomposition of the discrete activation function is asserted to enable optimization, yet no error bounds, rank-selection criteria, or analysis is provided to show that truncation preserves the location or value of the global maximum of the true spike-count map rather than smoothing sharp peaks.
  3. [§3 and §4] §3 and §4: the optimization occurs inside the latent manifold of SN-GAN/VQ-VAE; no verification is given that this manifold contains stimuli near the true optimum, nor any comparison to exhaustive search or direct (non-manifold) optimization to quantify distortion.
minor comments (2)
  1. [§2] Notation for the TT rank and the mapping from latent codes to stimuli should be defined explicitly in §2 or §3 to avoid ambiguity when reproducing the pipeline.
  2. [§4] Figure captions and axis labels in the results section would benefit from explicit mention of the number of trials or runs underlying each curve.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below. Where the comments correctly identify gaps in quantitative support or analysis, we agree that revisions are needed and will incorporate additional material in the next version of the manuscript.

  1. Referee: [Abstract and §4] Abstract and §4 (Experiments): the central claim of 'effective AM' for SNNs is unsupported by any quantitative validation, baselines, error metrics, ablation studies, or statistical analysis; the abstract supplies only qualitative observations, so the headline result cannot be assessed.

    Authors: We agree that the current version relies primarily on qualitative demonstrations. In the revised manuscript we will expand §4 to include quantitative metrics (maximum spike counts achieved versus random and baseline stimuli), ablation studies on TT rank and generative-model choice, and statistical reporting across multiple independent runs and network initializations. These additions will allow readers to assess the effectiveness of the method more rigorously. revision: yes

  2. Referee: [§3] §3 (Method): the low-rank TT decomposition of the discrete activation function is asserted to enable optimization, yet no error bounds, rank-selection criteria, or analysis is provided to show that truncation preserves the location or value of the global maximum of the true spike-count map rather than smoothing sharp peaks.

    Authors: The TT decomposition is introduced to make the otherwise intractable discrete optimization tractable by providing a low-rank continuous surrogate. Ranks were chosen empirically by monitoring reconstruction error on held-out activation samples. While we do not supply theoretical error bounds guaranteeing preservation of the exact global maximum, the empirical results indicate that the recovered stimuli elicit substantially higher responses than random inputs. We will add an explicit description of the rank-selection procedure and an empirical study of how approximation error correlates with the quality of the recovered maxima. revision: yes

  3. Referee: [§3 and §4] §3 and §4: the optimization occurs inside the latent manifold of SN-GAN/VQ-VAE; no verification is given that this manifold contains stimuli near the true optimum, nor any comparison to exhaustive search or direct (non-manifold) optimization to quantify distortion.

    Authors: The generative-model latent space is used to restrict the search to stimuli that are statistically similar to natural images, which is standard practice for activation maximization in vision models. Exhaustive search over the full pixel space is computationally infeasible. We will add a limitations paragraph discussing the potential bias introduced by the manifold and, where computationally feasible, include a controlled comparison of latent-space optimization versus direct pixel-space optimization with an image prior. revision: partial
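
The rank-selection procedure mentioned in the second response, monitoring reconstruction error on held-out activation samples, could look roughly like the sketch below. The error metric, the tolerance, and the rule of taking the smallest qualifying rank are assumptions for illustration; the simulated rebuttal does not state them.

```python
def relative_error(predicted, actual):
    """Relative L2 error between surrogate predictions and true activations."""
    num = sum((p - a) ** 2 for p, a in zip(predicted, actual)) ** 0.5
    den = sum(a ** 2 for a in actual) ** 0.5
    if den > 0:
        return num / den
    return float("inf") if num > 0 else 0.0


def select_rank(errors_by_rank, tolerance=0.1):
    """Pick the smallest rank whose held-out error is within tolerance.

    Falls back to the rank with the lowest error if none qualifies.
    """
    for rank in sorted(errors_by_rank):
        if errors_by_rank[rank] <= tolerance:
            return rank
    return min(errors_by_rank, key=errors_by_rank.get)
```

Here `errors_by_rank` would be filled by fitting the TT surrogate at each candidate rank and calling `relative_error` on held-out samples. A low held-out error does not by itself guarantee that the location of the maximum is preserved, which is the referee's concern in major comment 2.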

Circularity Check

0 steps flagged

No significant circularity; method presented as independent empirical framework

full rationale

The paper introduces a Tensor Train-based optimization framework for activation maximization in SNNs, operating in the latent space of generative models, and validates it through experiments on training dynamics and accuracy correlations. No equations, derivations, or self-citations are shown that reduce the central claims (effective AM, early selective neuron formation) to fitted quantities or prior results by construction. The approach is self-contained as an applied method with external testing, consistent with the default expectation of non-circularity for most papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no equations or implementation details, so no specific free parameters, axioms, or invented entities can be identified.



Reference graph

Works this paper leans on

92 extracted references · 92 canonical work pages · 6 internal anchors

  1. [1]

    D. H. Hubel and T. N. Wiesel, Brain and visual perception: the story of a 25-year collaboration . Oxford University Press, 2004

  2. [2]

    Bipoles is an optogenetic tool developed for bidirectional dual-color control of neurons,

    J. Vierock, S. Rodriguez-Rozada, A. Dieter, F. Pieper, R. Sims, F. Tenedini, A. C. Bergs, I. Bendifallah, F. Zhou, N. Zeitzschel, et al., “Bipoles is an optogenetic tool developed for bidirectional dual-color control of neurons,” Nature communications, vol. 12, no. 1, p. 4527, 2021

  3. [3]

    Single neuron responses underlying face recognition in the human midfusiform face-selective cortex,

    R. Quian Quiroga, M. Boscaglia, J. Jonas, H. G. Rey, X. Yan, L. Maillard, S. Colnat-Coulbois, L. Koessler, and B. Rossion, “Single neuron responses underlying face recognition in the human midfusiform face-selective cortex,” Nature Communications, vol. 14, Sept. 2023

  4. [4]

    Evolving images for visual neurons using a deep generative network reveals coding principles and neuronal preferences,

    C. R. Ponce, W. Xiao, P. F. Schade, T. S. Hartmann, G. Kreiman, and M. S. Livingstone, “Evolving images for visual neurons using a deep generative network reveals coding principles and neuronal preferences,” Cell, vol. 177, no. 4, pp. 999–1009, 2019

  5. [5]

    Face neurons encode nonsemantic features,

    A. Bardon, W. Xiao, C. R. Ponce, M. S. Livingstone, and G. Kreiman, “Face neurons encode nonsemantic features,” Proceedings of the national academy of sciences, vol. 119, no. 16, p. e2118705119, 2022

  6. [6]

    Neuroethics and animals: report and recommendations from the university of pennsylvania animal research neuroethics workshop,

    A. J. Shriver and T. M. John, “Neuroethics and animals: report and recommendations from the university of pennsylvania animal research neuroethics workshop,” ILAR journal , vol. 60, no. 3, pp. 424–433, 2019

  7. [7]

    Ai ethics: the case for including animals,

    P. Singer and Y. F. Tse, “Ai ethics: the case for including animals,” AI and Ethics , vol. 3, no. 2, pp. 539–551, 2023

  8. [8]

    A critique of pure learning and what artificial neural networks can learn from animal brains,

    A. M. Zador, “A critique of pure learning and what artificial neural networks can learn from animal brains,” Nature communications, vol. 10, no. 1, p. 3770, 2019

  9. [9]

    No free lunch from deep learning in neuroscience: A case study through models of the entorhinal-hippocampal circuit,

    R. Schaeffer, M. Khona, and I. Fiete, “No free lunch from deep learning in neuroscience: A case study through models of the entorhinal-hippocampal circuit,” Advances in Neural Information Processing Systems, vol. 35, pp. 16052–16067, 2022

  10. [10]

    Brain-like functional specialization emerges spontaneously in deep neural networks,

    K. Dobs, J. Martinez, A. J. Kell, and N. Kanwisher, “Brain-like functional specialization emerges spontaneously in deep neural networks,” Science advances, vol. 8, no. 11, p. eabl8913, 2022

  11. [11]

    Multimodal neurons in artificial neural networks,

    G. Goh, N. Cammarata, C. Voss, S. Carter, M. Petrov, L. Schubert, A. Radford, and C. Olah, “Multimodal neurons in artificial neural networks,” Distill, vol. 6, no. 3, p. e30, 2021.

  12. [12]

    Network dissection: Quantifying interpretability of deep visual representations,

    D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba, “Network dissection: Quantifying interpretability of deep visual representations,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6541–6549, 2017

  13. [13]

    Feature visualization,

    C. Olah, A. Mordvintsev, and L. Schubert, “Feature visualization,” Distill, 2017. https://distill.pub/2017/feature-visualization

  14. [14]

    Understanding Neural Networks via Feature Visualization: A Survey

    A. Nguyen, J. Yosinski, and J. Clune, Understanding Neural Networks via Feature Visualization: A Survey, p. 55–76. Springer International Publishing, 2019

  15. [15]

    High-performance evolutionary algorithms for online neuron control,

    B. Wang and C. R. Ponce, “High-performance evolutionary algorithms for online neuron control,” in Proceedings of the Genetic and Evolutionary Computation Conference , pp. 1308–1316, 2022

  16. [16]

    Generative adversarial nets,

    I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in neural information processing systems, vol. 27, 2014

  17. [17]

    Ttopt: A maximum volume quantized tensor train-based optimization and its application to reinforcement learning,

    K. Sozykin, A. Chertkov, R. Schutski, A.-H. Phan, A. S. Cichocki, and I. Oseledets, “Ttopt: A maximum volume quantized tensor train-based optimization and its application to reinforcement learning,” Advances in Neural Information Processing Systems, vol. 35, pp. 26052–26065, 2022

  18. [18]

    Tensor-train decomposition,

    I. V. Oseledets, “Tensor-train decomposition,” SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, 2011

  19. [19]

    Spiking neural networks and their applications: A review,

    K. Yamazaki, V.-K. Vo-Ho, D. Bulsara, and N. Le, “Spiking neural networks and their applications: A review,” Brain Sciences, vol. 12, no. 7, p. 863, 2022

  20. [20]

    The role of spike timing in the coding of stimulus location in rat somatosensory cortex,

    S. Panzeri, R. S. Petersen, S. R. Schultz, M. Lebedev, and M. E. Diamond, “The role of spike timing in the coding of stimulus location in rat somatosensory cortex,” Neuron, vol. 29, p. 769–777, Mar. 2001

  21. [21]

    Timing to be precise? an overview of spike timing-dependent plasticity, brain rhythmicity, and glial cells interplay within neuronal circuits,

    Y. Andrade-Talavera, A. Fisahn, and A. Rodríguez-Moreno, “Timing to be precise? an overview of spike timing-dependent plasticity, brain rhythmicity, and glial cells interplay within neuronal circuits,” Molecular Psychiatry, vol. 28, p. 2177–2188, Mar. 2023

  22. [22]

    Uncovering the representation of spiking neural networks trained with surrogate gradient,

    Y. Li, Y. Kim, H. Park, and P. Panda, “Uncovering the representation of spiking neural networks trained with surrogate gradient,” arXiv preprint arXiv:2304.13098 , 2023

  23. [23]

    Inception loops discover what excites neurons most using deep predictive models,

    E. Y. Walker, F. H. Sinz, E. Cobos, T. Muhammad, E. Froudarakis, P. G. Fahey, A. S. Ecker, J. Reimer, X. Pitkow, and A. S. Tolias, “Inception loops discover what excites neurons most using deep predictive models,” Nature neuroscience, vol. 22, no. 12, pp. 2060–2065, 2019

  24. [24]

    Visualizing higher-layer features of a deep network,

    D. Erhan, Y. Bengio, A. Courville, and P. Vincent, “Visualizing higher-layer features of a deep network,” University of Montreal, vol. 1341, no. 3, p. 1, 2009

  25. [25]

    Understanding Neural Networks Through Deep Visualization

    J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson, “Understanding neural networks through deep visualization,” arXiv preprint arXiv:1506.06579 , 2015

  26. [26]

    Evaluating the visualization of what a deep neural network has learned,

    W. Samek, A. Binder, G. Montavon, S. Lapuschkin, and K.-R. Müller, “Evaluating the visualization of what a deep neural network has learned,” IEEE transactions on neural networks and learning systems, vol. 28, no. 11, pp. 2660–2673, 2016

  27. [27]

    Methods for interpreting and understanding deep neural networks,

    G. Montavon, W. Samek, and K.-R. Müller, “Methods for interpreting and understanding deep neural networks,” Digital signal processing, vol. 73, pp. 1–15, 2018

  28. [28]

    Nevergrad - A gradient-free optimization platform

    J. Rapin and O. Teytaud, “Nevergrad - A gradient-free optimization platform.” https://GitHub.com/FacebookResearch/Nevergrad, 2018

  29. [29]

    Xdream: Finding preferred stimuli for visual neurons using generative networks and gradient-free optimization,

    W. Xiao and G. Kreiman, “Xdream: Finding preferred stimuli for visual neurons using generative networks and gradient-free optimization,” PLoS computational biology, vol. 16, no. 6, p. e1007973, 2020

  30. [30]

    An overview of gradient descent optimization algorithms,

    S. Ruder, “An overview of gradient descent optimization algorithms,” 2016.

  31. [31]

    Gradient-free activation maximization for identifying effective stimuli,

    W. Xiao and G. Kreiman, “Gradient-free activation maximization for identifying effective stimuli,” arXiv preprint arXiv:1905.00378 , 2019

  32. [32]

    Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation,

    N. Hansen and A. Ostermeier, “Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation,” in Proceedings of IEEE international conference on evolutionary computation, pp. 312–317, IEEE, 1996

  33. [33]

    Deep learning,

    Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” nature, vol. 521, no. 7553, pp. 436–444, 2015

  34. [34]

    Deep learning

    I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press, 2016

  35. [35]

    Simple model of spiking neurons,

    E. M. Izhikevich, “Simple model of spiking neurons,” IEEE Transactions on neural networks, vol. 14, no. 6, pp. 1569–1572, 2003

  36. [36]

    A logical calculus of the ideas immanent in nervous activity,

    W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” The bulletin of mathematical biophysics , vol. 5, pp. 115–133, 1943

  37. [37]

    The perceptron, a perceiving and recognizing automaton Project Para

    F. Rosenblatt, The perceptron, a perceiving and recognizing automaton Project Para. Cornell Aeronautical Laboratory, 1957

  38. [38]

    The perceptron: a probabilistic model for information storage and organization in the brain.,

    F. Rosenblatt, “The perceptron: a probabilistic model for information storage and organization in the brain.,” Psychological review, vol. 65, no. 6, p. 386, 1958

  39. [39]

    Learning representations by back-propagating errors,

    D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” nature, vol. 323, no. 6088, pp. 533–536, 1986

  40. [40]

    Networks of spiking neurons: the third generation of neural network models,

    W. Maass, “Networks of spiking neurons: the third generation of neural network models,” Neural networks, vol. 10, no. 9, pp. 1659–1671, 1997

  41. [41]

    Training spiking neural networks using lessons from deep learning,

    J. K. Eshraghian, M. Ward, E. Neftci, X. Wang, G. Lenz, G. Dwivedi, M. Bennamoun, D. S. Jeong, and W. D. Lu, “Training spiking neural networks using lessons from deep learning,” Proceedings of the IEEE, vol. 111, no. 9, pp. 1016–1054, 2023

  42. [42]

    Spikingjelly: An open-source machine learning infrastructure platform for spike-based intelligence,

    W. Fang, Y. Chen, J. Ding, Z. Yu, T. Masquelier, D. Chen, L. Huang, H. Zhou, G. Li, and Y. Tian, “Spikingjelly: An open-source machine learning infrastructure platform for spike-based intelligence,” Science Advances, vol. 9, no. 40, p. eadi1480, 2023

  43. [43]

    Deep residual learning for image recognition,

    K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition , pp. 770–778, 2016

  44. [44]

    A quantitative description of membrane current and its application to conduction and excitation in nerve,

    A. L. Hodgkin and A. F. Huxley, “A quantitative description of membrane current and its application to conduction and excitation in nerve,” The Journal of physiology , vol. 117, no. 4, p. 500, 1952

  45. [45]

    Lapicque’s 1907 paper: from frogs to integrate-and-fire,

    N. Brunel and M. C. Van Rossum, “Lapicque’s 1907 paper: from frogs to integrate-and-fire,” Biological cybernetics, vol. 97, no. 5-6, pp. 337–339, 2007

  46. [46]

    Tensorizing neural networks,

    A. Novikov, D. Podoprikhin, A. Osokin, and D. P. Vetrov, “Tensorizing neural networks,” Advances in neural information processing systems , vol. 28, 2015

  47. [47]

    Tt-tsvd: A multi-modal tensor train decomposition with its application in convolutional neural networks for smart healthcare,

    D. Liu, L. T. Yang, P. Wang, R. Zhao, and Q. Zhang, “Tt-tsvd: A multi-modal tensor train decomposition with its application in convolutional neural networks for smart healthcare,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 18, no. 1s, pp. 1–17, 2022

  48. [48]

    Optimization of functions given in the tensor train format,

    A. Chertkov, G. Ryzhakov, G. Novikov, and I. Oseledets, “Optimization of functions given in the tensor train format,” arXiv preprint arXiv:2209.14808 , 2022

  49. [49]

    Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions,

    A. Cichocki, N. Lee, I. Oseledets, A.-H. Phan, Q. Zhao, D. P. Mandic, et al., “Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions,” Foundations and Trends® in Machine Learning, vol. 9, no. 4-5, pp. 249–429, 2016

  50. [50]

    Tensor networks for dimensionality reduction and large-scale optimization: Part 2 applications and future perspectives,

    A. Cichocki, A.-H. Phan, Q. Zhao, N. Lee, I. Oseledets, M. Sugiyama, D. P. Mandic, et al., “Tensor networks for dimensionality reduction and large-scale optimization: Part 2 applications and future perspectives,” Foundations and Trends® in Machine Learning, vol. 9, no. 6, pp. 431–673, 2017.

  51. [51]

    PROTES: Probabilistic optimization with tensor sampling,

    A. Batsheva, A. Chertkov, G. Ryzhakov, and I. Oseledets, “PROTES: Probabilistic optimization with tensor sampling,” Advances in Neural Information Processing Systems , 2023

  52. [52]

    What is simulated annealing?,

    M. W. Trosset, “What is simulated annealing?,” Optimization and Engineering, vol. 2, pp. 201–213, 2001

  53. [53]

    Simple statistical gradient-following algorithms for connectionist reinforcement learning,

    R. J. Williams, “Simple statistical gradient-following algorithms for connectionist reinforcement learning,” Machine learning, vol. 8, pp. 229–256, 1992

  54. [54]

    Learning multiple layers of features from tiny images,

    A. Krizhevsky, G. Hinton, et al., “Learning multiple layers of features from tiny images,” 2009

  55. [55]

    Spectral Normalization for Generative Adversarial Networks

    T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” arXiv preprint arXiv:1802.05957, 2018

  56. [56]

    Auto-Encoding Variational Bayes

    D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv preprint arXiv:1312.6114, 2013

  57. [57]

    Neural discrete representation learning,

    A. Van Den Oord, O. Vinyals, et al., “Neural discrete representation learning,” Advances in neural information processing systems, vol. 30, 2017

  58. [58]

    Improved techniques for training gans,

    T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved techniques for training gans,” Advances in neural information processing systems, vol. 29, 2016

  59. [59]

    Gans trained by a two time-scale update rule converge to a local nash equilibrium,

    M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “Gans trained by a two time-scale update rule converge to a local nash equilibrium,” Advances in neural information processing systems, vol. 30, 2017

  60. [60]

    The mnist database of handwritten digits,

    Y. LeCun, “The mnist database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, 1998

  61. [61]

    Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

    H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747, 2017

  62. [62]

    Imagenet: A large-scale hierarchical image database,

    J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255, IEEE, 2009

  63. [63]

    Imagenet classification with deep convolutional neural networks,

    A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 25, 2012

  64. [64]

    Densely connected convolutional networks,

    G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708, 2017

  65. [65]

    Very Deep Convolutional Networks for Large-Scale Image Recognition

    K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014

  66. [66]

    Recent advances in physical reservoir computing: A review,

    G. Tanaka, T. Yamane, J. B. Héroux, R. Nakane, N. Kanazawa, S. Takeda, H. Numata, D. Nakano, and A. Hirose, “Recent advances in physical reservoir computing: A review,” Neural Networks, vol. 115, p. 100–123, July 2019

  67. [67]

    On the number of linear regions of deep neural networks,

    G. Montúfar, R. Pascanu, K. Cho, and Y. Bengio, “On the number of linear regions of deep neural networks,” in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14, (Cambridge, MA, USA), p. 2924–2932, MIT Press, 2014

  68. [68]

    Complexity of images: Experimental and computational estimates compared,

    V. Chikhman, V. Bondarko, M. Danilova, A. Goluzina, and Y. Shelepin, “Complexity of images: Experimental and computational estimates compared,” Perception, vol. 41, p. 631–647, Jan. 2012

  69. [69]

    Measuring complexity with zippers,

    A. Baronchelli, E. Caglioti, and V. Loreto, “Measuring complexity with zippers,” European Journal of Physics, vol. 26, p. S69–S77, July 2005

  70. [70]

    A theoretically based index of consciousness independent of sensory processing and behavior,

    A. G. Casali, O. Gosseries, M. Rosanova, M. Boly, S. Sarasso, K. R. Casali, S. Casarotto, M.-A. Bruno, S. Laureys, G. Tononi, and M. Massimini, “A theoretically based index of consciousness independent of sensory processing and behavior,” Science Translational Medicine, vol. 5, Aug. 2013

  71. [71]

    How transferable are features in deep neural networks?,

    J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?,” Advances in neural information processing systems, vol. 27, 2014

  72. [72]

    How does the brain solve visual object recognition?,

    J. J. DiCarlo, D. Zoccolan, and N. C. Rust, “How does the brain solve visual object recognition?,” Neuron, vol. 73, p. 415–434, Feb. 2012

  73. [73]

    Abstract representations emerge naturally in neural networks trained to perform multiple tasks,

    W. J. Johnston and S. Fusi, “Abstract representations emerge naturally in neural networks trained to perform multiple tasks,” Nature Communications, vol. 14, Feb. 2023

  74. [74]

    How ‘visual’ is the visual cortex? the interactions between the visual cortex and other sensory, motivational and motor systems as enabling factors for visual perception,

    C. M. A. Pennartz, M. N. Oude Lohuis, and U. Olcese, “How ‘visual’ is the visual cortex? the interactions between the visual cortex and other sensory, motivational and motor systems as enabling factors for visual perception,” Philosophical Transactions of the Royal Society B: Biological Sciences, vol. 378, Aug. 2023

  75. [75]

    Representation of naturalistic image structure in the primate visual cortex,

    J. A. Movshon and E. P. Simoncelli, “Representation of naturalistic image structure in the primate visual cortex,” Cold Spring Harbor Symposia on Quantitative Biology, vol. 79, p. 115–122, 2014

  76. [76]

    Neuronal selectivity to complex vocalization features emerges in the superficial layers of primary auditory cortex,

    P. Montes-Lourido, M. Kar, S. V. David, and S. Sadagopan, “Neuronal selectivity to complex vocalization features emerges in the superficial layers of primary auditory cortex,” PLOS Biology, vol. 19, p. e3001299, June 2021

  77. [77]

    Critical Learning Periods in Deep Neural Networks

    A. Achille, M. Rovere, and S. Soatto, “Critical learning periods in deep neural networks,” arXiv preprint arXiv:1711.08856, 2017

  78. [78]

    Calcium imaging reveals fast tuning dynamics of hippocampal place cells and ca1 population activity during free exploration task in mice,

    V. P. Sotskov, N. A. Pospelov, V. V. Plusnin, and K. V. Anokhin, “Calcium imaging reveals fast tuning dynamics of hippocampal place cells and ca1 population activity during free exploration task in mice,” International Journal of Molecular Sciences, vol. 23, p. 638, Jan. 2022

  79. [79]

    The log-dynamic brain: how skewed distributions affect network operations,

    G. Buzsáki and K. Mizuseki, “The log-dynamic brain: how skewed distributions affect network operations,” Nature Reviews Neuroscience, vol. 15, p. 264–278, Feb. 2014

  80. [80]

    Taming transformers for high-resolution image synthesis,

    P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high-resolution image synthesis,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 12873–12883, 2021

Showing first 80 references.