Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts
Pith reviewed 2026-05-21 05:52 UTC · model grok-4.3
The pith
An animate-inanimate distinction dominates expert partitioning in vision mixture-of-experts models from gating through readout and remains stable across independent trainings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Expert specialisation in vision mixture-of-experts models is dominated by an animate-inanimate distinction that appears from gating through to expert readout and proves stable across independently trained models. Although routing statistics indicate relatively sparse, categorical preferences, the experts themselves exhibit tuning to continuous visual and semantic dimensions that extend beyond category boundaries. Experts achieve similar levels of category separability despite maintaining distinct feature tuning, showing the explanatory gain from moving past category-level descriptions.
What carries the argument
The animate-inanimate distinction that organises expert partitioning, tracked by combining gating analysis with per-expert category separability, most-exciting-input tuning, semantic-dimension interpretation, and cross-model representational similarity.
If this is right
- Expert specialisation involves continuous feature tuning that crosses category lines rather than rigid category assignment.
- The animate-inanimate structure emerges reliably from the training process on natural images regardless of initialisation.
- Comparable category separability can be achieved through distinct underlying tunings across different experts.
- Analyses limited to routing statistics miss the graded visual and semantic dimensions that experts actually use.
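The contrast between routing statistics and graded tuning can be illustrated with a most-exciting-input lookup. This is a minimal sketch, not the paper's implementation: the activation values, the labels, and the helper names `most_exciting_inputs` and `category_mix` are all hypothetical.

```python
def most_exciting_inputs(activations, k):
    """Return the indices of the k inputs that drive one expert most strongly."""
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    return order[:k]


def category_mix(indices, labels):
    """Count how many of the selected inputs fall into each category label."""
    counts = {}
    for i in indices:
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    return counts


# Hypothetical toy data: one expert's response to six images.
labels = ["dog", "cat", "bird", "car", "chair", "teddy"]
expert_acts = [0.9, 0.8, 0.4, 0.1, 0.2, 0.7]

top = most_exciting_inputs(expert_acts, k=3)
mix = category_mix(top, labels)
```

In this toy case the top three inputs span two animals and one inanimate object, the kind of cross-category tuning that a per-category routing count alone would not reveal.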
Where Pith is reading between the lines
- Tools for measuring fine-grained tuning could be applied to other modular vision architectures to check whether similar continuous dimensions appear.
- If the animate-inanimate axis proves general, it may indicate an organisational bias that vision models acquire from natural-image statistics.
- Disrupting this distinction during training and measuring effects on downstream tasks would test its functional importance.
Load-bearing premise
That semantic dimensions derived from human behavioural judgements on object similarities supply a valid basis for interpreting the tuning of individual model experts.
What would settle it
Repeated training runs that show no consistent animate-inanimate separation in gating weights, expert activations, or most-exciting inputs would falsify the claim that this distinction dominates and stabilises expert partitioning.
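One way to make this test concrete is an animacy separation index computed per training run. The sketch below assumes hard (top-1) routing assignments and binary animacy labels; the function names and the toy runs are hypothetical and not taken from the paper.

```python
def animate_fraction_per_expert(assignments, is_animate, n_experts):
    """Fraction of images routed to each expert that are animate.

    Experts that receive no images get None.
    """
    routed = [0] * n_experts
    animate = [0] * n_experts
    for expert, flag in zip(assignments, is_animate):
        routed[expert] += 1
        animate[expert] += int(flag)
    return [animate[e] / routed[e] if routed[e] else None
            for e in range(n_experts)]


def animacy_separation(fractions):
    """Gap between the most animate-biased and least animate-biased expert."""
    used = [f for f in fractions if f is not None]
    return max(used) - min(used)


# Hypothetical toy data: six images, two experts, three training runs.
is_animate = [True, True, True, False, False, False]
run_a = [0, 0, 0, 1, 1, 1]  # clean animate/inanimate split
run_b = [1, 1, 1, 0, 0, 0]  # same split, expert indices swapped
run_c = [0, 1, 0, 1, 0, 1]  # routing largely unrelated to animacy

sep_a = animacy_separation(animate_fraction_per_expert(run_a, is_animate, 2))
sep_b = animacy_separation(animate_fraction_per_expert(run_b, is_animate, 2))
sep_c = animacy_separation(animate_fraction_per_expert(run_c, is_animate, 2))
```

The index ignores which expert carries which label, which matters because expert indices are arbitrary across seeds. Runs A and B score 1.0, while run C scores about 0.33; a set of runs that all looked like run C would count against the stability claim.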
Original abstract
Mixture-of-Experts (MoE) models are often interpreted by analysing which categories are routed to which experts. However, routing alone does not reveal what each expert actually encodes. We train sparsely-gated convolutional MoE models with a contrastive objective on natural images and characterise expert specialisation using tools from visual neuroscience. Extending from gating-level to expert-level analyses, we measure per-expert category separability, and per-expert tuning using the most exciting inputs. Extending from category-level to feature-level explanations, we interpret tuning via semantic dimensions derived from a dataset of human behavioural judgements (THINGS). Finally, we use tuning and representational similarity analysis to assess the stability of expertise-allocation across independent initialisations. We find that an animate-inanimate distinction dominates expert partitioning, apparent from gating through to expert readout, and is stable across independently trained models. Although routing statistics suggest relatively sparse, categorical preferences, expert analyses reveal broader tuning to continuous visual and semantic dimensions that extend beyond category boundaries. Experts exhibit similar category-separability to one another, despite distinct feature tuning, demonstrating the explanatory benefits of moving beyond category-level analyses. Together, these results show that expert specialisation in vision MoEs extends well beyond category routing and is better understood by probing fine-grained expert-level tuning and representational structure.
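For readers unfamiliar with sparse gating, the sketch below shows generic top-k softmax gating on a single input. The abstract does not state the number of experts, the value of k, or the exact gating function, so every value here is an assumption, and the helpers `top_k_gate` and `moe_output` are hypothetical names.

```python
import math


def top_k_gate(logits, k):
    """Softmax over the k largest gate logits; all other experts get weight 0."""
    top = sorted(range(len(logits)),
                 key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0
            for i in range(len(logits))]


def moe_output(expert_outputs, weights):
    """Gate-weighted sum of per-expert output vectors."""
    dim = len(expert_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, expert_outputs))
            for d in range(dim)]


# Hypothetical toy data: four experts, top-2 routing, 2-d expert outputs.
gate_logits = [2.0, 0.5, 2.0, -1.0]
weights = top_k_gate(gate_logits, k=2)
outputs = [[1.0, 0.0], [0.0, 1.0], [3.0, 2.0], [9.0, 9.0]]
mixed = moe_output(outputs, weights)
```

Routing analyses look at `weights` alone. The expert-level analyses described in the abstract look instead at what each entry of `outputs` encodes, which the gate weights do not determine.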
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript trains sparsely-gated convolutional Mixture-of-Experts models with a contrastive objective on natural images and characterises expert specialisation using visual neuroscience tools. Analyses extend from gating statistics to per-expert category separability and tuning measured via most-exciting inputs, interpreted through semantic dimensions extracted from the THINGS dataset of human behavioural judgements. Stability of expertise allocation is assessed via tuning and representational similarity analysis across independent initialisations. The central claims are that an animate-inanimate distinction dominates expert partitioning from gating through readout and remains stable across models, while routing appears sparse and categorical but expert tuning is broader and extends to continuous visual and semantic dimensions beyond category boundaries.
Significance. If the results hold, the work provides a valuable bridge between MoE interpretability and visual neuroscience methods, showing that category routing alone is insufficient and that expert-level tuning analyses reveal richer structure. The stability finding across initialisations and the demonstration that experts can share category separability while differing in feature tuning are useful for both theory and practical MoE design. The approach of using most-exciting inputs and THINGS dimensions is a strength when properly validated.
major comments (2)
- The interpretation that expert tuning extends to continuous visual and semantic dimensions beyond category boundaries rests on semantic dimensions derived from the THINGS human behavioural judgements dataset. The manuscript should supply direct evidence (e.g., comparison of THINGS axes to model-derived embeddings or ablation of the interpretation) that these dimensions align with the features actually encoded by the contrastively trained convolutional experts rather than imposing an external human similarity ontology. Without such validation the central move from routing statistics to expert-level tuning claims is weakened.
- §5 (stability analysis): The claim that the animate-inanimate distinction is stable across independently trained models is load-bearing for the robustness conclusion. The text should report the exact number of independent runs, the specific representational similarity metric employed, variance in routing preferences, and any statistical tests. The current description leaves these details underspecified, making it difficult to assess the strength of the stability result.
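The validation requested in the first major comment could start from something as simple as correlating each semantic dimension with each activation-derived component over a shared image set. The sketch below uses plain Pearson correlation as a simpler stand-in for canonical correlation. The dimension names, component names, and scores are hypothetical placeholders, not values from THINGS or from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def alignment_table(semantic_dims, components):
    """Correlate every semantic dimension with every model-derived component."""
    return {(dim, comp): pearson(dim_scores, comp_scores)
            for dim, dim_scores in semantic_dims.items()
            for comp, comp_scores in components.items()}


# Hypothetical toy data: per-image scores for five images.
semantic_dims = {
    "animal-related": [0.9, 0.8, 0.7, 0.1, 0.0],
    "metallic": [0.0, 0.1, 0.0, 0.9, 0.8],
}
components = {"pc1": [1.8, 1.6, 1.4, 0.2, 0.0]}

table = alignment_table(semantic_dims, components)
```

A table dominated by near-zero entries would suggest that the human-derived axes are being imposed on the model rather than recovered from it, which is the concern the comment raises.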
minor comments (2)
- Abstract: Adding at least one quantitative anchor (e.g., mean category separability across experts or average correlation with THINGS dimensions) would give readers an immediate sense of effect size and support the qualitative claims of dominance and broader tuning.
- Methods section: Provide the precise number of experts, gating temperature or sparsity schedule, and contrastive loss hyperparameters to facilitate exact reproduction of the reported routing and tuning behaviours.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and the opportunity to clarify and strengthen the manuscript. We address each major point below and will revise accordingly.
Point-by-point responses
-
Referee: The interpretation that expert tuning extends to continuous visual and semantic dimensions beyond category boundaries rests on semantic dimensions derived from the THINGS human behavioural judgements dataset. The manuscript should supply direct evidence (e.g., comparison of THINGS axes to model-derived embeddings or ablation of the interpretation) that these dimensions align with the features actually encoded by the contrastively trained convolutional experts rather than imposing an external human similarity ontology. Without such validation the central move from routing statistics to expert-level tuning claims is weakened.
Authors: We agree that direct validation would strengthen the link between THINGS dimensions and model features. In the revised manuscript we will add a comparison of the THINGS semantic axes against the leading principal components of per-expert activation vectors computed on the same image set, together with a quantitative alignment metric (e.g., canonical correlation). We will also include a brief ablation that substitutes model-derived dimensions for the THINGS axes and re-evaluates the reported tuning patterns. These additions will demonstrate that the continuous dimensions reflect structure present in the contrastively trained experts rather than an external ontology alone.
revision: yes
-
Referee: §5 (stability analysis): The claim that the animate-inanimate distinction is stable across independently trained models is load-bearing for the robustness conclusion. The text should report the exact number of independent runs, the specific representational similarity metric employed, variance in routing preferences, and any statistical tests. The current description leaves these details underspecified, making it difficult to assess the strength of the stability result.
Authors: We thank the referee for highlighting the missing methodological details. The stability results were obtained from five independent training runs that differed only in random seed. Representational similarity was measured with cosine similarity between expert tuning vectors (most-exciting-input embeddings). In the revision we will explicitly state the number of runs, report the observed variance in routing preferences across runs, name the similarity metric, and include statistical support (permutation tests on the animate-inanimate separability scores) to quantify the reliability of the stability finding.
revision: yes
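The two quantities named in this response, cosine similarity between tuning vectors and a permutation test on animate-inanimate scores, can be sketched as follows. This is an illustration under assumptions, not the authors' analysis: the test statistic (absolute difference in group means), the exact enumeration of label splits, and all toy values are hypothetical choices.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two tuning vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_difference(values, group_a):
    """Mean of the items indexed by group_a minus the mean of the rest."""
    inside = [values[i] for i in sorted(group_a)]
    outside = [values[i] for i in range(len(values)) if i not in group_a]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def exact_permutation_p(values, group_a):
    """Two-sided p-value from enumerating every relabelling of equal size."""
    observed = abs(mean_difference(values, set(group_a)))
    splits = list(combinations(range(len(values)), len(group_a)))
    extreme = sum(1 for s in splits
                  if abs(mean_difference(values, set(s))) >= observed - 1e-12)
    return extreme / len(splits)


# Hypothetical toy data: one expert's activation for 3 animate, 3 inanimate images.
acts = [0.9, 0.8, 0.85, 0.1, 0.2, 0.15]
animate_idx = {0, 1, 2}
p_value = exact_permutation_p(acts, animate_idx)

# Hypothetical tuning vectors for matched experts from two seeds.
similarity = cosine_similarity([1.0, 0.0, 1.0], [2.0, 0.0, 2.0])
```

With only six items there are 20 possible splits, so the smallest achievable two-sided p-value is 2/20. The same floor applies to any test built on a handful of runs, which is one reason the run count matters when reporting the test.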
Circularity Check
No significant circularity in derivation chain
The paper's core claims about animate-inanimate dominance in expert partitioning and broader tuning to continuous dimensions are derived from empirical analyses: routing statistics, per-expert category separability, most-exciting inputs, and representational similarity, all interpreted using external tools from visual neuroscience and the independent THINGS human behavioural dataset. No steps reduce by construction to self-defined quantities, fitted parameters renamed as predictions, or load-bearing self-citations. The stability assessment across independent initialisations and the move from category-level to feature-level explanations rely on standard methods applied to model outputs rather than internal redefinitions. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Tools from visual neuroscience, including category separability and most-exciting-input analysis, can be meaningfully applied to interpret activations in artificial neural network experts.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Passage: "We find that an animate-inanimate distinction dominates expert partitioning, apparent from gating through to expert readout, and is stable across independently trained models."
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Passage: "we interpret tuning via semantic dimensions derived from a dataset of human behavioural judgements (THINGS)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.