Human face perception aligns with neural networks trained on inverse-generative and naturalistic discriminative tasks, as these best predict human dissimilarity judgments on controversial and random face pairs.
21 Pith papers cite this work, alongside 72,323 external citations. Polarity classification is still indexing.
citing papers explorer
- Human face perception reflects inverse-generative and naturalistic discriminative objectives
Human face perception aligns with neural networks trained on inverse-generative and naturalistic discriminative tasks, as these best predict human dissimilarity judgments on controversial and random face pairs.
- Machine individuality: Separating genuine idiosyncrasy from response bias in large language models
Crossed random-effects models on LLM word ratings show 16.9% variance from genuine stimulus-specific individuality, exceeding null models and forming coherent per-model fingerprints.
- SCOOTER: A Human Evaluation Framework for Unrestricted Adversarial Examples
SCOOTER supplies best-practice guidelines, open tools, and a 3K-image benchmark with 34K+ human ratings showing that six tested unrestricted attacks produce images humans can detect as fake.
- Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations
Robots detect underspecified reward features via demonstration variation and query targeted natural language explanations to improve reward recovery from imperfect demos.
- Generative AI-Based Monte Carlo Simulation for Method Evaluation Using Synthetic Multilevel Data
A framework using generative AI to produce synthetic multilevel data for Monte Carlo simulations that evaluate the performance and parameter recovery of quantitative methods.
- Quantifying the human visual exposome with vision language models
Vision language models applied to daily-life photos quantify visual environmental features that correlate with momentary affect and chronic stress, establishing a paradigm for visual exposomics.
- Can AI Debias the News? LLM Interventions Improve Cross-Partisan Receptivity but LLMs Overestimate Their Own Effectiveness
Substantive LLM reframing boosts cross-partisan receptivity to news headlines without backfire, but models overestimate effect sizes and lack fidelity in modeling human psychological responses.
- A paradox of AI fluency
Fluent AI users adopt an active, iterative collaboration mode that produces more visible failures but better recovery and success on hard tasks, whereas novices experience more invisible failures from passive use.
- The Effect of Idea Elaboration on the Automatic Assessment of Idea Originality
LLM originality raters exhibit self-preference bias toward artificial responses that disappears after controlling for idea elaboration in the Alternate Uses Task.
- Dual Alignment Between Language Model Layers and Human Sentence Processing
Later LLM layers align better with human cognitive effort in syntactic ambiguity than early layers do, indicating dual processing modes and complementary benefits from multi-layer probability updates.
- Implicit Bias-Like Patterns in Reasoning Models
Reasoning models expend more tokens on association-incompatible tasks than compatible ones, indicating greater effort on counter-stereotypical information, except for Claude 3.7 Sonnet, which shows the reverse pattern linked to its bias-focused reasoning.
- A systematic framework for generating novel experimental hypotheses from language models
A framework using language models to simulate non-existent experiments and derive novel testable hypotheses on dative verb acquisition and cross-structural generalization in children.
- Bringing Age Back In: Accounting for Population Age Distribution in Forecasting Migration
Introduces MASI to standardize net migration rates for age structure and applies a Bayesian hierarchical model to forecast adjusted total and age-sex specific migration rates through 2100, yielding narrower intervals and moderated decline projections.
- A Scalable Parametric Item Calibration Engine (SPICE) for Explanatory IRT with Sparse Data
SPICE is a scalable Bayesian MCMC engine for explanatory IRT calibration on sparsely linked persons and items in large assessment banks.
- ProfileGLMM: a R Package Extending Bayesian Profile Regression using Generalised Linear Mixed Models
ProfileGLMM is an R package extending Bayesian profile regression with GLMMs to support hierarchical data, random effects, and cluster-covariate interactions for continuous or binary outcomes.
- What's in an accent? The impact of accented synthetic speech on lexical choice in human-machine dialogue
Accented synthetic speech leads users to align their lexical choices with the perceived accent of the machine partner, mirroring human-human dialogue patterns.
- Tracing the ongoing emergence of human-like reasoning in Large Language Models
LLMs function as accurate semantic processors for conditionals but do not replicate the pragmatic inferences that define human reasoning.
- Performance of low vision individuals when selecting a target with head-pointing in virtual reality
Low vision individuals with central visual field loss can use head-pointing to select 2° targets in VR, reaching near-control performance with sufficiently large pointer activation zones.
- Visual Accessibility in a Virtual Kitchen: Effects of Open Shelving on Performance, Cognitive Load, and Experience in Older Adults with and without MCI
Open shelving in a virtual kitchen reduced task time and physical activity for older adults with and without MCI while increasing gaze entropy, with no change in subjective cognitive load or motivation.
- The Impact of LLM Self-Consistency and Reasoning Effort on Automated Scoring Accuracy and Cost
Strategic selection of LLMs and reasoning effort optimizes automated scoring accuracy and cost more effectively than self-consistency ensembling.
- Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation