Perception Is All You Need: A Neuroscience Framework for Low Cost Sensorless Gaze in HRI
Pith reviewed 2026-05-10 16:47 UTC · model grok-4.3
The pith
A cardboard robot with concave painted eyes makes viewers perceive mutual gaze by exploiting their own brain's assumptions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The core claim is that a concave eye design runs the brain's gaze-direction computation in reverse. The distributed face processing network, including the superior temporal sulcus, interprets the painted pupil as directed gaze for two reasons: the high-precision convexity prior forces the socket to be perceived as convex, and top-down predictions override the actual depth signals from the concavity.
What carries the argument
The brain's high-precision convexity prior in the predictive processing hierarchy, which overrides bottom-up depth cues with top-down face knowledge to perceive concave eye sockets as convex and thereby compute mutual gaze direction.
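The "high-precision prior overrides bottom-up cues" idea can be illustrated with a standard precision-weighted average of two Gaussian estimates. This is a minimal sketch, not the paper's model: the Gaussian simplification, the function name `fuse_depth_estimate`, the sign convention, and all numbers are assumptions chosen for illustration.

```python
def fuse_depth_estimate(prior_mean, prior_precision, sensory_mean, sensory_precision):
    """Combine a Gaussian prior and a Gaussian sensory estimate of surface depth.

    Sign convention (an assumption of this sketch): positive = convex
    (bulging toward the viewer), negative = concave.
    Returns (posterior_mean, posterior_precision).
    """
    if prior_precision <= 0 or sensory_precision <= 0:
        raise ValueError("precisions must be positive")
    posterior_precision = prior_precision + sensory_precision
    # Each source is weighted by its precision (inverse variance).
    posterior_mean = (
        prior_precision * prior_mean + sensory_precision * sensory_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision


# Illustrative numbers only, not measured values.
# The prior says "convex" (+1), the depth cues say "concave" (-1).
strong_prior_mean, _ = fuse_depth_estimate(+1.0, 9.0, -1.0, 1.0)
weak_prior_mean, _ = fuse_depth_estimate(+1.0, 1.0, -1.0, 9.0)
print(strong_prior_mean)  # positive: the socket is read as convex
print(weak_prior_mean)    # negative: the depth cues win
```

In this toy setting the percept flips sign once sensory precision exceeds prior precision, which is one way to picture why the boundary conditions (close viewing, strong depth cues) matter.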
If this is right
- Existing findings on how robot gaze improves attention and learning in children can now be applied using platforms costing under one dollar.
- Robot designs become open-source templates with interchangeable eye inserts that parameterize the effect.
- Privacy concerns disappear since no sensors or data processing are involved.
- Boundary conditions based on developmental stages, clinical populations, and viewing geometry predict where the effect holds.
- Two decades of HRI research on gaze become deliverable without the previous cost and complexity barriers.
Where Pith is reading between the lines
- Similar perceptual hacks could apply to other robot behaviors like facial expressions to reduce hardware needs.
- Field tests in real classrooms would reveal if the effect persists over time or with repeated exposure.
- Variations in the illusion strength across different cultures or age groups could inform refinements to the eye insert designs.
- Combining this with other low-tech robot elements might create fully functional educational robots from printed materials.
Load-bearing premise
The convexity prior and predictive processing hierarchy will reliably make people perceive mutual gaze from a concave painted eye across different angles, ages, and conditions in actual interactions.
What would settle it
A controlled study showing whether participants exhibit gaze-following behaviors or report eye contact with the concave-eye robot compared to control designs with flat or protruding eyes, especially under varying lighting or distances.
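One way such a comparison could be analyzed is a pooled two-proportion z-test on the share of "eye contact" reports for the concave design versus a control design. The sketch below is an assumption about a possible analysis, not something the paper specifies; the function name is hypothetical and no study data exist in the source.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test for independent groups.

    hits_a / n_a: "eye contact" reports and trials for design A (e.g. concave).
    hits_b / n_b: the same for design B (e.g. flat control).
    Returns (z, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("pooled proportion is 0 or 1; test is undefined")
    z = (p_a - p_b) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

This form assumes independent groups; if each participant sees every design, a paired or mixed-effects analysis would be the more appropriate choice.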
Original abstract
Gaze-following in child-robot interaction improves attention, recall, and learning, but requires expensive platforms ($30,000+), sensors, algorithms, and raises privacy concerns. We propose a framework that avoids sensors and computation entirely, instead relying on the human visual system's assumption of convexity to produce perceptual gaze-following between a robot and its viewer. Specifically, we motivate sub-dollar cardboard robot design that directly implements the brain's own gaze computation pipeline in reverse, making the viewer's perceptual system the robot's "actuator", with no sensors, no power, and no privacy concerns. We ground this framework in three converging lines of theoretical and empirical neuroscience evidence. Namely, the distributed face processing network that computes gaze direction via the superior temporal sulcus, the high-precision convexity prior that causes the brain to perceive concave faces as convex, and the predictive processing hierarchy in which top-down face knowledge overrides bottom-up depth signals. These mechanisms explain why a concave eye socket with a painted pupil produces the perception of mutual gaze from any viewing angle. We derive design constraints from perceptual science, present a sub-dollar open-template robot with parameterized interchangeable eye inserts, and identify boundary conditions (developmental, clinical, and geometric) that predict where the framework will succeed and where it will fail. If leveraged, two decades of HRI gaze findings become deliverable at population scale.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a neuroscience-inspired framework for sensorless gaze-following in HRI that uses a sub-dollar cardboard robot with concave eye sockets and painted pupils. It claims that the human visual system's convexity prior, face-processing network (STS), and predictive-processing hierarchy will cause viewers to perceive mutual gaze from any angle, effectively making the viewer's perception the robot's actuator. The work grounds the idea in three lines of existing literature, derives design constraints, presents an open-template robot with interchangeable eye inserts, and identifies developmental, clinical, and geometric boundary conditions under which the approach is predicted to succeed or fail, with the goal of scaling established HRI gaze benefits without sensors, computation, power, or privacy issues.
Significance. If the proposed translation from neuroscience to this low-fidelity artifact holds, the framework could enable population-scale deployment of gaze-following robots in education and therapy at negligible cost. The manuscript earns credit for its explicit grounding in established neuroscience results, the provision of a parameterized open-source template, and the clear statement of falsifiable boundary conditions rather than overclaiming universality.
major comments (2)
- [Abstract and Neuroscience Framework] The central claim that the cardboard design 'produces the perception of mutual gaze from any viewing angle' (Abstract) rests on an untested extrapolation; the cited convexity-prior and predictive-processing studies use near-photorealistic or mask stimuli in static adult lab settings, and the manuscript supplies no user studies, eye-tracking data, or validation experiments confirming the effect survives cardboard geometry, low-contrast painting, child viewers, autism-spectrum populations, or dynamic HRI angles.
- [Boundary Conditions] The listed developmental, clinical, and geometric limits are not quantified against the concrete design parameters (socket depth, pupil placement, viewing distance, contrast) that the open template actually uses, so the reliability prediction for the specific artifact remains unsupported by data internal to the manuscript.
minor comments (1)
- [Design] The open-template description would benefit from explicit CAD or cut-file parameters (e.g., exact concavity depth in mm) so that boundary-condition tests can be reproduced.
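As a rough illustration of what "explicit parameters" could look like, the sketch below records an eye insert's dimensions and derives two geometric quantities a boundary-condition test might report. Everything here is hypothetical: the class `EyeInsert`, its field names, the flat-floor and straight-wall socket approximation, and the distant-viewer (parallel projection) assumption are not taken from the paper or its template.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EyeInsert:
    """Hypothetical parameter record for one concave eye insert (all in mm)."""
    socket_depth_mm: float
    opening_radius_mm: float
    pupil_radius_mm: float

    def __post_init__(self):
        if min(self.socket_depth_mm, self.opening_radius_mm, self.pupil_radius_mm) <= 0:
            raise ValueError("all dimensions must be positive")
        if self.pupil_radius_mm > self.opening_radius_mm:
            raise ValueError("pupil cannot be wider than the socket opening")

    def apparent_pupil_offset_mm(self, view_angle_deg):
        """Shift of the pupil centre within the opening plane for a distant
        viewer standing view_angle_deg off the socket axis (simple parallax)."""
        return self.socket_depth_mm * math.tan(math.radians(view_angle_deg))

    def max_unoccluded_angle_deg(self):
        """Largest off-axis angle at which the rim does not hide any of the pupil."""
        clearance = self.opening_radius_mm - self.pupil_radius_mm
        return math.degrees(math.atan2(clearance, self.socket_depth_mm))


def visual_angle_deg(size_mm, distance_mm):
    """Visual angle subtended by an object of size_mm seen from distance_mm."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))
```

Reporting values like these alongside the cut files would let others state at which viewing angles and distances a given insert was tested.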
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The manuscript presents a theoretical neuroscience framework and open design template rather than an empirical validation study. We address each major comment below and outline targeted revisions to improve clarity and precision without altering the core contribution.
- Referee: [Abstract and Neuroscience Framework] The central claim that the cardboard design 'produces the perception of mutual gaze from any viewing angle' (Abstract) rests on an untested extrapolation; the cited convexity-prior and predictive-processing studies use near-photorealistic or mask stimuli in static adult lab settings, and the manuscript supplies no user studies, eye-tracking data, or validation experiments confirming the effect survives cardboard geometry, low-contrast painting, child viewers, autism-spectrum populations, or dynamic HRI angles.
  Authors: We agree that the manuscript contains no new empirical data and that the abstract phrasing could be read as asserting a proven outcome rather than a literature-derived prediction. The contribution is the reverse-engineering of established perceptual mechanisms (convexity prior, STS gaze computation, and predictive processing) into a low-cost artifact, with explicit boundary conditions stated as falsifiable predictions. In revision we will rephrase the abstract and introduction to emphasize that the mutual-gaze perception is a hypothesized outcome requiring future validation across the listed populations and conditions. We will also add a short paragraph in the discussion outlining planned or recommended empirical tests (e.g., forced-choice gaze-direction judgments with the open template). This change clarifies scope without weakening the grounding in the cited neuroscience.
  Revision: yes
- Referee: [Boundary Conditions] The listed developmental, clinical, and geometric limits are not quantified against the concrete design parameters (socket depth, pupil placement, viewing distance, contrast) that the open template actually uses, so the reliability prediction for the specific artifact remains unsupported by data internal to the manuscript.
  Authors: The boundary conditions are currently stated at the level of the general perceptual literature rather than tied to the template's specific dimensions. We acknowledge this gap. In the revised manuscript we will expand the section to map each condition to approximate quantitative ranges drawn from the referenced studies (for example, effective distances for convexity reversal effects, contrast thresholds for pupil visibility, and age ranges for mature face processing). Where the literature does not supply exact values for cardboard geometry, we will explicitly note the absence and flag it as an empirical question for users of the template. This will give readers clearer guidance on expected reliability for the provided design parameters.
  Revision: partial
Circularity Check
No significant circularity; proposal applies external neuroscience results to new design without self-referential reduction
The manuscript presents a conceptual framework that reverses known perceptual mechanisms (convexity prior, predictive processing hierarchy, STS gaze computation) to motivate a passive cardboard robot design. No equations, fitted parameters, or predictions are defined in terms of themselves. Design constraints are stated as derived from cited perceptual science literature rather than from any internal fit or self-citation chain. The central success claim is an untested extrapolation to HRI settings, but this is not circular by construction; it is simply unsupported by new data. No load-bearing step reduces to the paper's own inputs.
Axiom & Free-Parameter Ledger
axioms (3)
- domain assumption: The distributed face processing network computes gaze direction via the superior temporal sulcus.
- domain assumption: The brain applies a high-precision convexity prior that causes concave faces to be perceived as convex.
- domain assumption: Top-down face knowledge overrides bottom-up depth signals in the predictive processing hierarchy.
invented entities (1)
- Cardboard robot with concave eye sockets and painted pupils (no independent evidence)