On sources to variabilities of simple cells in the primary visual cortex: A principled theory for the interaction between geometric image transformations and receptive field responses
Pith reviewed 2026-05-18 20:12 UTC · model grok-4.3
pith:4XG56AL5
The pith
Receptive fields of simple cells in the primary visual cortex expand their shapes over the degrees of freedom of geometric image transformations to match responses across viewing conditions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By postulating that the family of receptive fields should be covariant under these classes of geometric image transformations, it follows that the receptive field shapes should be expanded over the degrees of freedom of the corresponding image transformations, to enable a formal matching between the receptive field responses computed under different viewing conditions for the same scene or for a structurally similar spatio-temporal event.
What carries the argument
Covariance of receptive fields under geometric image transformations, which expands receptive-field shapes over the degrees of freedom of uniform spatial scaling, spatial affine, Galilean, and temporal scaling transformations.
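For the purely spatial part, the covariance property can be written out explicitly. The block below is a sketch in standard Gaussian scale-space notation. The symbols $f$, $L$, $g$, $s$, $\Sigma$, $S$ and $A$ are assumptions introduced here for illustration and may differ from the paper's own notation; the Galilean and temporal scaling cases follow the same pattern and are omitted.

```latex
% Affine Gaussian kernel with spatial covariance matrix \Sigma
% (the isotropic case is g(x; s) = g(x; s I))
g(x;\, \Sigma) = \frac{1}{2\pi \sqrt{\det \Sigma}}\, e^{-x^T \Sigma^{-1} x / 2},
\qquad
L(x;\, \Sigma) = \int_{\xi \in \mathbb{R}^2} f(x - \xi)\, g(\xi;\, \Sigma)\, d\xi

% Uniform spatial scaling: f'(x') = f(x) with x' = S x
L'(x';\, s') = L(x;\, s) \quad \text{provided that} \quad s' = S^2 s

% Spatial affine transformation: f'(x') = f(x) with x' = A x
L'(x';\, \Sigma') = L(x;\, \Sigma) \quad \text{provided that} \quad \Sigma' = A\, \Sigma\, A^T
```

Because $S$ and $A$ are not known to the observer in advance, the responses can be matched for every viewing condition only if the family contains a receptive field at each $s'$ or $\Sigma'$ reached in this way. This is the sense in which the shapes are expanded over the degrees of freedom of the transformations.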
If this is right
- Receptive fields expand over the degrees of freedom of uniform spatial scaling transformations.
- Receptive fields expand over the degrees of freedom of spatial affine transformations.
- Receptive fields remain covariant under Galilean transformations.
- Receptive fields expand over the degrees of freedom of temporal scaling transformations.
- Responses to the same scene or structurally similar event can be matched formally across different viewing conditions.
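The matching claim in the last bullet can be checked numerically for the simplest case, uniform spatial scaling in one dimension. The sketch below assumes plain Gaussian smoothing as the receptive field model; the test signal, the helper `gauss_smooth`, and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def gauss_smooth(signal, dx, s):
    """Smooth a sampled 1-D signal with a Gaussian of variance s
    (sampled kernel, truncated at 6 standard deviations, renormalised)."""
    half = int(np.ceil(6.0 * np.sqrt(s) / dx))
    xi = np.arange(-half, half + 1) * dx
    kernel = np.exp(-xi**2 / (2.0 * s))
    kernel /= kernel.sum()
    return np.convolve(signal, kernel, mode="same")

def f(x):
    # Arbitrary test signal: two Gaussian bumps of different widths (assumption).
    return np.exp(-x**2 / 2.0) + 0.5 * np.exp(-(x - 3.0)**2 / 0.5)

S = 2            # integer scaling factor, so that x' = S * x stays on the grid
s = 1.5          # scale parameter (variance) in the original domain
x = np.linspace(-40.0, 40.0, 8001)
dx = x[1] - x[0]
centre = 4000    # index of x = 0

# Original view, and the rescaled view f'(x') = f(x' / S) on the same grid.
L = gauss_smooth(f(x), dx, s)
L_matched = gauss_smooth(f(x / S), dx, S**2 * s)   # scale adapted: s' = S^2 s
L_unmatched = gauss_smooth(f(x / S), dx, s)        # scale not adapted

# Compare L'(S x) with L(x) at a set of sample points x = k * dx.
k = np.arange(-1000, 1001, 50)
matched_error = np.max(np.abs(L_matched[centre + S * k] - L[centre + k]))
mismatched_error = np.max(np.abs(L_unmatched[centre + S * k] - L[centre + k]))

print("max |L'(Sx; S^2 s) - L(x; s)| =", matched_error)
print("max |L'(Sx; s)     - L(x; s)| =", mismatched_error)
```

With the scale parameter adapted, the two responses agree up to discretisation error. With a fixed scale they differ visibly, which is why a single receptive field size cannot support the matching.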
Where Pith is reading between the lines
- The theory implies that measured variability across simple cells can be organized by the four listed transformation classes rather than appearing random.
- It suggests that invariance or equivariance in higher-level visual tasks could arise from this lower-level covariance property.
- A direct test would compare recorded receptive-field profiles against the predicted expansions under controlled geometric image changes.
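For the spatial affine case, the "predicted expansions" in the last bullet could be enumerated as a bank of model kernels against which recorded profiles are compared. The sketch below assumes an affine Gaussian first-derivative model of simple cells. The parameterisation by scale, orientation and elongation, the sampling of those parameters, and every function name are hypothetical choices made for illustration.

```python
import numpy as np

def covariance_matrix(scale, theta, elongation):
    """Covariance with variance `scale` along direction theta and
    `scale * elongation**2` along the perpendicular direction."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ np.diag([scale, scale * elongation**2]) @ R.T

def affine_gaussian(X, Y, Sigma):
    """2-D Gaussian with covariance matrix Sigma, evaluated on a grid."""
    P = np.linalg.inv(Sigma)
    q = P[0, 0] * X**2 + 2.0 * P[0, 1] * X * Y + P[1, 1] * Y**2
    return np.exp(-q / 2.0) / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))

def first_derivative_kernel(X, Y, Sigma, theta):
    """Directional derivative of the affine Gaussian in direction theta,
    using grad g = -Sigma^{-1} x g."""
    P = np.linalg.inv(Sigma)
    g = affine_gaussian(X, Y, Sigma)
    gx = -(P[0, 0] * X + P[0, 1] * Y) * g
    gy = -(P[1, 0] * X + P[1, 1] * Y) * g
    return np.cos(theta) * gx + np.sin(theta) * gy

coords = np.linspace(-12.0, 12.0, 97)
X, Y = np.meshgrid(coords, coords)

# Illustrative sampling of the degrees of freedom (assumed values).
scales = [1.0, 2.0, 4.0]
thetas = np.linspace(0.0, np.pi, 8, endpoint=False)
elongations = [1.0, 2.0, 4.0]

bank = {
    (sc, th, el): first_derivative_kernel(X, Y, covariance_matrix(sc, th, el), th)
    for sc in scales for th in thetas for el in elongations
}

# Covariance check for the underlying Gaussian under an arbitrary affine map A:
# g(A x; A Sigma A^T) * |det A| should equal g(x; Sigma).
A = np.array([[1.5, 0.4], [0.0, 0.8]])
Sigma = covariance_matrix(2.0, 0.3, 2.0)
XA = A[0, 0] * X + A[0, 1] * Y
YA = A[1, 0] * X + A[1, 1] * Y
lhs = affine_gaussian(XA, YA, A @ Sigma @ A.T) * abs(np.linalg.det(A))
identity_error = np.max(np.abs(lhs - affine_gaussian(X, Y, Sigma)))

print("kernels in bank:", len(bank))
print("affine covariance identity error:", identity_error)
```

A comparison with data would then fit each recorded profile to the nearest bank member and inspect how the fitted parameters are distributed; that fitting step is not shown here.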
Load-bearing premise
The family of receptive fields must be covariant under the listed geometric image transformations to allow matching of responses under different viewing conditions.
What would settle it
An experiment or recording that shows simple-cell receptive fields do not expand their shapes over the degrees of freedom of scaling, affine, Galilean, or temporal transformations when the same scene is viewed under those changed conditions.
Original abstract
This paper gives an overview of a theory for modelling the interaction between geometric image transformations and receptive field responses for a visual observer that views objects and spatio-temporal events in the environment. This treatment is developed over combinations of (i) uniform spatial scaling transformations, (ii) spatial affine transformations, (iii) Galilean transformations and (iv) temporal scaling transformations. By postulating that the family of receptive fields should be covariant under these classes of geometric image transformations, it follows that the receptive field shapes should be expanded over the degrees of freedom of the corresponding image transformations, to enable a formal matching between the receptive field responses computed under different viewing conditions for the same scene or for a structurally similar spatio-temporal event. We conclude the treatment by discussing and providing potential support for a working hypothesis that the receptive fields of simple cells in the primary visual cortex ought to be covariant under these classes of geometric image transformations, and thus have the shapes of their receptive fields expanded over the degrees of freedom of the corresponding geometric image transformations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This manuscript outlines a theoretical framework for the interaction between geometric image transformations (uniform spatial scaling, spatial affine, Galilean, and temporal scaling) and receptive field responses. By postulating that receptive field families must be covariant under these transformations, the paper derives that receptive field shapes should be expanded over the corresponding degrees of freedom. This expansion is argued to enable formal matching of responses computed under different viewing conditions for the same scene or similar spatio-temporal events. The treatment concludes with a working hypothesis that simple cells in primary visual cortex (V1) exhibit such covariance, resulting in expanded receptive field shapes.
Significance. If the covariance postulate can be independently grounded and the expanded families shown to match V1 statistics, the work would supply a principled, transformation-based account of receptive field variability in V1, linking environmental geometry directly to neural response properties. It offers a formal group-theoretic perspective on handling changes in viewing conditions. The significance remains provisional given the overview format and absence of explicit derivations or empirical checks.
major comments (2)
- [Abstract] Abstract (second paragraph): the central claim that receptive field shapes must be expanded over the degrees of freedom follows directly from the covariance postulate, yet no derivation from neural mechanisms, independent data, or comparison to alternative principles (e.g., invariance) is supplied; this postulate is load-bearing for the working hypothesis stated in the final paragraph.
- [Conclusion] Conclusion: the discussion of potential support for the V1 covariance hypothesis does not reference specific empirical measurements of receptive field shapes or population statistics that would allow a test of the predicted expansion over the listed transformation degrees of freedom.
minor comments (1)
- [Abstract] The abstract is dense; separating the postulate, the formal consequence, and the biological hypothesis into distinct sentences would improve readability.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. The feedback highlights important aspects of how the theoretical framework is presented. We address each major comment below and indicate the revisions made to the manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract (second paragraph): the central claim that receptive field shapes must be expanded over the degrees of freedom follows directly from the covariance postulate, yet no derivation from neural mechanisms, independent data, or comparison to alternative principles (e.g., invariance) is supplied; this postulate is load-bearing for the working hypothesis stated in the final paragraph.
  Authors: We agree that the expansion of receptive field shapes follows directly from the covariance postulate, which serves as the foundational assumption in this theoretical overview. The manuscript does not derive the postulate from neural mechanisms or independent data because it is presented as a working hypothesis motivated by the requirements for response matching under geometric transformations. In the main text, we elaborate on why covariance is preferred over invariance for preserving information about the scene under changes in viewing conditions. To address the comment, we have revised the abstract to explicitly label the covariance as a postulate and to briefly mention the distinction from invariance principles, which would typically involve different computational mechanisms such as max-pooling over transformed responses.
  Revision: partial
- Referee: [Conclusion] Conclusion: the discussion of potential support for the V1 covariance hypothesis does not reference specific empirical measurements of receptive field shapes or population statistics that would allow a test of the predicted expansion over the listed transformation degrees of freedom.
  Authors: The conclusion provides a general discussion of potential support based on established properties of V1 simple cells, such as their selectivity to orientation, spatial frequency, and motion. However, we acknowledge that it does not include specific references to empirical measurements or population statistics that directly test the predicted expansion over the degrees of freedom of scaling, affine, Galilean, and temporal transformations. This is because the current work is primarily theoretical, and a detailed empirical validation would constitute a separate study. We have revised the conclusion to include suggestions for how the hypothesis could be tested using existing V1 receptive field datasets and have added references to relevant studies on V1 variability where appropriate.
  Revision: yes
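The distinction drawn in the first response, between covariance and invariance obtained by max-pooling over transformed responses, can be illustrated with a small sketch. It assumes a 1-D Gaussian blob as input and a scale-normalised second-derivative Gaussian operator as the receptive field, for which the response at the blob centre has a closed form. The function name, the scale grid and the parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def normalised_response(s, s0):
    """Magnitude of s * L_xx at the centre of the blob exp(-x^2 / (2 * s0)),
    where L is the blob smoothed by a Gaussian of variance s (closed form)."""
    return np.sqrt(s0) * s / (s0 + s) ** 1.5

# Bank of scale channels on a half-octave grid (assumed sampling).
scales = 2.0 ** np.arange(-4, 8, 0.5)

S = 2.0     # spatial scaling factor between the two views
s0 = 1.0    # blob variance in the original view; S**2 * s0 in the scaled view

r_orig = normalised_response(scales, s0)
r_scaled = normalised_response(scales, S**2 * s0)

# Covariance: the response profile over scale channels is preserved but shifted,
# by log2(S^2) / 0.5 = 4 channels on this grid.
shift = int(np.argmax(r_scaled) - np.argmax(r_orig))
profiles_match = np.allclose(r_scaled[shift:], r_orig[:-shift])

# Invariance: max-pooling over the channels discards the shift.
pooled_orig = r_orig.max()
pooled_scaled = r_scaled.max()

print("channel shift:", shift)
print("shifted profiles match:", profiles_match)
print("pooled responses:", pooled_orig, pooled_scaled)
```

The covariant representation keeps the size of the blob as the position of the peak along the scale axis, while the pooled value is the same in both views and no longer carries that information. This is one way to read the authors' remark that covariance preserves information about the scene.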
Circularity Check
Covariance postulate restated as conclusion with expanded RF shapes following by definition
specific steps
- self-definitional [Abstract]
"By postulating that the family of receptive fields should be covariant under these classes of geometric image transformations, it follows that the receptive field shapes should be expanded over the degrees of freedom of the corresponding image transformations, to enable a formal matching between the receptive field responses computed under different viewing conditions for the same scene or for a structurally similar spatio-temporal event. We conclude the treatment by discussing and providing potential support for a working hypothesis that the receptive fields of simple cells in the primary visual cortex ..."
The text postulates covariance and immediately concludes that shapes 'should be expanded over the degrees of freedom' as what 'follows.' The expansion is not derived from additional premises, data, or mechanisms; it is the direct definitional consequence of requiring covariance under transformations that include scaling, affine, Galilean and temporal degrees of freedom. The final working hypothesis simply reasserts the same postulate, closing the loop.
full rationale
The paper's central derivation begins by explicitly postulating covariance of receptive-field families under the listed geometric transformations and then states that expanded shapes over those degrees of freedom follow directly to enable matching responses. This step is self-definitional: the claimed 'result' (expanded shapes) is the operational content of the covariance postulate itself rather than an independent derivation or external prediction. No separate neural mechanism, data fit, or uniqueness theorem is invoked to justify the postulate; the working hypothesis simply reasserts it. The remainder of the treatment develops formal consequences of the postulate but does not break the definitional loop. This produces moderate circularity (score 6) while still allowing the mathematical development of the covariance condition to have independent technical content.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The family of receptive fields should be covariant under uniform spatial scaling, spatial affine, Galilean, and temporal scaling transformations.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
By postulating that the family of receptive fields should be covariant under these classes of geometric image transformations, it follows that the receptive field shapes should be expanded over the degrees of freedom of the corresponding image transformations
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
idealized generalized Gaussian derivative model of visual receptive fields
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.