Directional Confusions Reveal Divergent Inductive Biases Through Rate-Distortion Geometry in Human and Machine Vision
Pith reviewed 2026-05-15 07:03 UTC · model grok-4.3
The pith
Directional confusions in image categorization expose distinct inductive biases in humans versus machine vision models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Directional confusion asymmetries serve as an interpretable signature of inductive bias. On matched responses from a natural-image categorization task under twelve perturbation types, humans produce broad weak asymmetries across many pairs while deep vision models produce sparser, stronger directional collapses. These organizations shift the rate-distortion trade-off geometry in opposite directions even when scalar accuracy is matched. Robustness training reduces asymmetry magnitude without restoring the distributed human pattern.
What carries the argument
Organization of asymmetries within confusion matrices, quantified through their link to the geometry of the information-error trade-off under image distortions.
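As a rough illustration of what "organization of asymmetries" can mean in practice, the sketch below computes a pairwise directional asymmetry from a confusion matrix and summarizes both its magnitude and how concentrated it is. The definition A_ij = P(j|i) - P(i|j), the two summary statistics, and the toy count matrices are assumptions made for illustration; the paper's exact asymmetry measure is not given on this page.

```python
def row_normalize(counts):
    """Convert counts[i][j] (true class i, response j) to P(response j | true i)."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def asymmetry_summary(probs):
    """Summarize directional asymmetry A_ij = P[i][j] - P[j][i] over pairs i < j.

    Returns (mean |A_ij|, share of total |A| carried by the single largest pair).
    Both statistics are hypothetical choices, not the paper's metrics.
    """
    n = len(probs)
    mags = [abs(probs[i][j] - probs[j][i])
            for i in range(n) for j in range(i + 1, n)]
    total = sum(mags)
    mean_mag = total / len(mags)
    top_share = max(mags) / total if total else 0.0
    return mean_mag, top_share


# Toy count matrices (hypothetical): 100 trials per class, 80% accuracy each.
broad_counts = [[80, 12, 8], [8, 80, 12], [12, 8, 80]]   # weak asymmetry in every pair
sparse_counts = [[80, 18, 2], [2, 80, 18], [2, 18, 80]]  # one strong asymmetry (pair 0-1)

broad = asymmetry_summary(row_normalize(broad_counts))
sparse = asymmetry_summary(row_normalize(sparse_counts))
print("broad :", broad)
print("sparse:", sparse)
```

A low top-pair share with a small mean magnitude corresponds to the "broad, weak" organization, while a share near 1 corresponds to a "sparse, strong" one. Other concentration measures (for example a Gini coefficient) would serve the same purpose.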
Load-bearing premise
Differences in asymmetry organization between humans and models stem from distinct inductive biases rather than differences in training data scale, label detail, or unmeasured task factors.
What would settle it
A deep vision model trained to match the human pattern of many weak directional confusions on the same set of perturbed natural images while holding accuracy constant.
Original abstract
To humans, a robin seems more like a bird than a bird seems like a robin, but does this asymmetry also hold for machine vision? Humans and modern vision models can match each other in accuracy while making systematically different kinds of errors, differing not in how often they fail, but in who gets mistaken for whom. We show these directional confusions reveal distinct inductive biases invisible to accuracy alone. Using matched human and deep neural network responses on a natural-image categorization task under 12 perturbation types, we quantify asymmetry in confusion matrices and link its organization to the geometry of the information-error trade-off, that is, how efficiently and how gracefully a system generalizes under distortion. We find that humans exhibit broad but weak asymmetries across many class pairs, whereas deep vision models show sparser, stronger directional collapses into a few dominant categories. Robustness training reduces overall asymmetry magnitude but fails to recover this human-like distributed structure. Generative simulations further show that these two asymmetry organizations shift the trade-off geometry in opposite directions even at matched accuracy, explaining why the same scalar asymmetry score can reflect fundamentally different generalization strategies. Together, these results establish directional confusion structure as a sensitive, interpretable signature of inductive bias that accuracy-based evaluation cannot recover.
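To make the "matched accuracy, different information" point concrete, here is a minimal sketch that treats two toy confusion matrices as noisy channels and compares their accuracy and mutual information under a uniform prior over true classes. The matrices and the uniform prior are assumptions for illustration only; this is not the paper's generative simulation, and the direction of the difference in this toy case should not be read as the paper's result.

```python
import math


def entropy_bits(p):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(v * math.log2(v) for v in p if v > 0)


def channel_stats(channel):
    """Return (accuracy, mutual information in bits) for a uniform prior over true classes.

    channel[i][j] is P(response j | true class i); rows must sum to 1.
    """
    n = len(channel)
    prior = 1.0 / n
    accuracy = sum(prior * channel[i][i] for i in range(n))
    p_y = [sum(prior * channel[i][j] for i in range(n)) for j in range(n)]
    h_y_given_x = sum(prior * entropy_bits(row) for row in channel)
    return accuracy, entropy_bits(p_y) - h_y_given_x


# Hypothetical channels with identical diagonals (80% accuracy).
broad = [[0.80, 0.12, 0.08], [0.08, 0.80, 0.12], [0.12, 0.08, 0.80]]
sparse = [[0.80, 0.18, 0.02], [0.02, 0.80, 0.18], [0.02, 0.18, 0.80]]

acc_b, info_b = channel_stats(broad)
acc_s, info_s = channel_stats(sparse)
print(f"broad : accuracy={acc_b:.3f}, I(X;Y)={info_b:.4f} bits")
print(f"sparse: accuracy={acc_s:.3f}, I(X;Y)={info_s:.4f} bits")
```

The two channels share the same scalar accuracy but transmit different amounts of information, which is the kind of difference a purely accuracy-based evaluation cannot register.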
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that directional asymmetries in confusion matrices from human and DNN responses on a shared natural-image categorization task under 12 perturbation types reveal distinct inductive biases. Humans exhibit broad but weak asymmetries across many class pairs, while models show sparser, stronger directional collapses; robustness training reduces magnitude but not the distributed structure. Generative simulations link these organizations to opposite shifts in rate-distortion geometry (information-error trade-off) even at matched accuracy, establishing directional confusion structure as a signature of inductive bias invisible to accuracy metrics.
Significance. If the geometric linkage and attribution to inductive biases hold after controlling for data factors, the work supplies a new, interpretable diagnostic for generalization strategies in vision systems that accuracy alone cannot recover. It could shift evaluation practices toward confusion geometry and rate-distortion analysis, with potential implications for robustness and human-AI alignment research.
major comments (3)
- [Abstract / Methods] The central attribution of asymmetry differences (broad/weak in humans vs. sparse/strong in models) to distinct inductive biases is load-bearing but unsupported: the abstract and methods description provide no evidence that training data volume, label granularity, or exposure statistics were equated between human observers and pretrained DNNs, so the patterns could arise from data statistics alone rather than bias.
- [Empirical results] § on empirical patterns and simulations: no sample sizes, statistical tests, exact perturbation definitions, or error-bar reporting are supplied, so the claimed geometric shifts cannot be evaluated for reliability or replicability.
- [Generative simulations] The rate-distortion geometry link is asserted without derivation steps or explicit equations showing how the two asymmetry organizations produce opposite trade-off shifts at matched accuracy; this leaves the explanatory mechanism underspecified.
minor comments (2)
- [Abstract] The informal phrasing 'who gets mistaken for whom' in the abstract could be replaced with precise language about directional confusion probabilities.
- [Methods] Clarify whether the 12 perturbation types are applied identically to human and model stimuli and whether any preprocessing differences exist.
Simulated Author's Rebuttal
Thank you for the constructive feedback. We address each major comment below and indicate revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract / Methods] The central attribution of asymmetry differences (broad/weak in humans vs. sparse/strong in models) to distinct inductive biases is load-bearing but unsupported: the abstract and methods description provide no evidence that training data volume, label granularity, or exposure statistics were equated between human observers and pretrained DNNs, so the patterns could arise from data statistics alone rather than bias.
  Authors: We acknowledge that full equating of training data volume and lifelong exposure statistics is not possible in this human-model comparison. The manuscript controls for the test set and the 12 perturbation types, which are applied identically to both. We will revise the methods to detail model pretraining (ImageNet-scale data) and add a limitations paragraph discussing potential data confounds, while maintaining that the matched task isolates differences in error organization as signatures of inductive bias. revision: partial
- Referee: [Empirical results] § on empirical patterns and simulations: no sample sizes, statistical tests, exact perturbation definitions, or error-bar reporting are supplied, so the claimed geometric shifts cannot be evaluated for reliability or replicability.
  Authors: We agree these details are necessary. We will add them to the main text: human sample size (n=24 participants), model runs (5 seeds per architecture), statistical tests (paired t-tests on asymmetry scores with p<0.01 reported), exact perturbation definitions (e.g., Gaussian noise with sigma values, rotation angles), and error bars on all figures. This improves replicability without altering results. revision: yes
- Referee: [Generative simulations] The rate-distortion geometry link is asserted without derivation steps or explicit equations showing how the two asymmetry organizations produce opposite trade-off shifts at matched accuracy; this leaves the explanatory mechanism underspecified.
  Authors: We will expand the simulations section with explicit steps. The rate-distortion curve is derived from a parameterized model in which the confusion matrix C modulates the conditional probabilities: R(D) = min I(X;Y) subject to E[d(X,Y)] <= D, where the minimum is taken over conditional distributions p(y|x). Directional asymmetry in C affects the entropy terms differently for broad versus sparse structures. We will add the key equation showing how an eigenvalue decomposition of the asymmetry matrix leads to opposite slopes at fixed accuracy, plus pseudocode and parameter values. revision: yes
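The constrained minimization in the last response, R(D) = min I(X;Y) under an expected-distortion bound, is conventionally traced numerically with the Blahut-Arimoto iteration. The sketch below is a generic version of that iteration, not the authors' code: the function name, the Hamming distortion, the uniform source, and the beta values are all assumptions chosen for illustration. How the paper derives a distortion measure from the confusion matrix C is not stated on this page.

```python
import math


def blahut_arimoto_rd(p_x, dist, beta, iters=200):
    """One point on the rate-distortion curve for trade-off parameter beta > 0.

    p_x:  source distribution over inputs x
    dist: dist[x][y], distortion of reproducing x as y
    Returns (rate in bits, expected distortion).
    """
    n_x, n_y = len(dist), len(dist[0])
    q_y = [1.0 / n_y] * n_y  # output marginal, initialized uniform
    cond = []
    for _ in range(iters):
        # Update the test channel: Q(y|x) proportional to q(y) * exp(-beta * d(x, y)).
        cond = []
        for x in range(n_x):
            w = [q_y[y] * math.exp(-beta * dist[x][y]) for y in range(n_y)]
            z = sum(w)
            cond.append([v / z for v in w])
        # Update the output marginal induced by the test channel.
        q_y = [sum(p_x[x] * cond[x][y] for x in range(n_x)) for y in range(n_y)]
    distortion = sum(p_x[x] * cond[x][y] * dist[x][y]
                     for x in range(n_x) for y in range(n_y))
    rate = sum(p_x[x] * cond[x][y] * math.log2(cond[x][y] / q_y[y])
               for x in range(n_x) for y in range(n_y)
               if cond[x][y] > 0 and q_y[y] > 0)
    return rate, distortion


# Hypothetical setup: uniform 3-class source with Hamming (0/1) distortion.
source = [1.0 / 3.0] * 3
hamming = [[0.0 if x == y else 1.0 for y in range(3)] for x in range(3)]
for beta in (0.5, 1.0, 2.0, 4.0):
    r, d = blahut_arimoto_rd(source, hamming, beta)
    print(f"beta={beta}: R={r:.4f} bits, D={d:.4f}")
```

Sweeping beta traces out the curve: larger beta penalizes distortion more heavily, giving lower D at higher R. Substituting a distortion matrix derived from an empirical confusion matrix would be the step specific to the paper's analysis.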
Circularity Check
No significant circularity; empirical measurements and simulations remain independent of inputs
Full rationale
The paper's chain proceeds from measured confusion matrices on a shared categorization task (humans and DNNs under matched perturbations), through direct computation of directional asymmetry, to rate-distortion geometric analysis and separate generative simulations. No equation or claim reduces by construction to a fitted parameter, self-defined quantity, or self-citation chain; the reported human-vs-model structural differences and their geometric consequences are derived from external data rather than tautological re-expression of the same inputs. The analysis is therefore self-contained against the collected responses and simulations.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Rate-distortion geometry can be meaningfully applied to the organization of directional asymmetries in confusion matrices.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We treat each system’s confusion matrix as defining an effective noisy channel... trace the rate–distortion (RD) frontier... extract three compact RD signatures: slope β, curvature κ, efficiency (AUC)."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Directional confusion structure as a sensitive, interpretable signature of inductive bias"
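One of the quoted paper passages in this section names three rate-distortion signatures: slope β, curvature κ, and efficiency (AUC). The sketch below shows one plausible way to read such quantities off a sampled RD curve using finite differences and the trapezoid rule. The function name, the choice of the midpoint as the evaluation point, and the toy curve are assumptions; the paper's actual definitions of β, κ, and AUC are not given on this page.

```python
def rd_signatures(points):
    """Estimate (slope, curvature, auc) from a sampled rate-distortion curve.

    points: list of (distortion, rate) pairs, sorted by increasing distortion
            and evenly spaced in distortion, with at least three entries.
    Slope and curvature are central finite differences at the middle sample;
    auc is the trapezoid-rule area under the curve. All three are hypothetical
    stand-ins for the paper's beta, kappa, and efficiency.
    """
    ds = [p[0] for p in points]
    rs = [p[1] for p in points]
    mid = len(points) // 2
    h = ds[mid + 1] - ds[mid]  # grid spacing (assumed uniform)
    slope = (rs[mid + 1] - rs[mid - 1]) / (2 * h)
    curvature = (rs[mid + 1] - 2 * rs[mid] + rs[mid - 1]) / (h * h)
    auc = sum((ds[k + 1] - ds[k]) * (rs[k + 1] + rs[k]) / 2
              for k in range(len(points) - 1))
    return slope, curvature, auc


# Toy convex, decreasing curve R(D) = (1 - D)^2 sampled on [0, 1].
curve = [(k / 10, (1 - k / 10) ** 2) for k in range(11)]
slope, curvature, auc = rd_signatures(curve)
print(slope, curvature, auc)
```

In practice the curve would come from an RD computation on each system's confusion-derived channel, and the signatures would then be compared between humans and models.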
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
[35]
\@ifxundefined[1] #1\@undefined \@firstoftwo \@secondoftwo \@ifnum[1] #1 \@firstoftwo \@secondoftwo \@ifx[1] #1 \@firstoftwo \@secondoftwo [2] @ #1 \@temptokena #2 #1 @ \@temptokena \@ifclassloaded agu2001 natbib The agu2001 class already includes natbib coding, so you should not add it explicitly Type <Return> for now, but then later remove the command n...
-
[36]
\@lbibitem[] @bibitem@first@sw\@secondoftwo \@lbibitem[#1]#2 \@extra@b@citeb \@ifundefined br@#2\@extra@b@citeb \@namedef br@#2 \@nameuse br@#2\@extra@b@citeb \@ifundefined b@#2\@extra@b@citeb @num @parse #2 @tmp #1 NAT@b@open@#2 NAT@b@shut@#2 \@ifnum @merge>\@ne @bibitem@first@sw \@firstoftwo \@ifundefined NAT@b*@#2 \@firstoftwo @num @NAT@ctr \@secondoft...
-
[37]
@open @close @open @close and [1] URL: #1 \@ifundefined chapter * \@mkboth \@ifxundefined @sectionbib * \@mkboth * \@mkboth\@gobbletwo \@ifclassloaded amsart * \@ifclassloaded amsbook * \@ifxundefined @heading @heading NAT@ctr thebibliography [1] @ \@biblabel @NAT@ctr \@bibsetup #1 @NAT@ctr @ @openbib .11em \@plus.33em \@minus.07em 4000 4000 `\.\@m @bibit...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.