GazeSync: A Mobile Eye-Tracking Tool for Analyzing Visual Attention on Dynamically Manipulated Content
Pith reviewed 2026-05-14 21:57 UTC · model grok-4.3
The pith
GazeSync synchronizes on-device gaze estimation with real-time UI transformation matrices to recover image-relative attention on dynamic mobile content.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GazeSync enables accurate reconstruction of image-relative gaze patterns by pairing on-device eye estimation with precise real-time UI transformation matrices, thereby decoupling visual attention from fixed device coordinates and outperforming static mapping approaches while exposing calibration boundaries under compound manipulations.
What carries the argument
GazeSync, the end-to-end mobile system that synchronizes gaze coordinates with live scale, rotation, and translation matrices to reconstruct content-relative attention.
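As a minimal sketch of what content-relative reconstruction involves, the snippet below inverts a single scale, rotation, and translation state to move a screen-space gaze point back into image coordinates. The forward model (rotate, then scale uniformly, then translate) and the function names `image_to_screen` and `screen_to_image` are assumptions made for illustration; the paper's actual matrix convention is not given here.

```python
import math

def image_to_screen(p, scale, theta, t):
    """Assumed forward model: rotate by theta, scale uniformly, then translate by t."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + t[0],
            scale * (s * x + c * y) + t[1])

def screen_to_image(g, scale, theta, t):
    """Invert the assumed forward model for a screen-space gaze point g."""
    dx, dy = g[0] - t[0], g[1] - t[1]
    c, s = math.cos(theta), math.sin(theta)
    # The inverse of a rotation is its transpose; the inverse of scaling is division.
    return ((c * dx + s * dy) / scale,
            (-s * dx + c * dy) / scale)

# Round trip with a hypothetical landmark and UI state.
landmark = (120.0, 80.0)
state = dict(scale=2.0, theta=math.radians(30), t=(40.0, -15.0))
on_screen = image_to_screen(landmark, **state)
recovered = screen_to_image(on_screen, **state)
```

A static mapping would treat `on_screen` as if it were already an image coordinate, which is the failure mode the core claim targets.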
If this is right
- Attention patterns can be analyzed during natural pinch-zoom and rotate interactions without losing their semantic reference to the image content.
- Static-coordinate baselines recover true gaze locations less accurately once the content is transformed.
- Calibration drift and reconstruction fragility become measurable boundaries when multiple transformations are applied together.
- The toolchain supports guided manipulation, reading, and visual search tasks on mobile devices.
Where Pith is reading between the lines
- The same synchronization approach could be tested on video playback or animated interfaces where content changes continuously.
- Real-time feedback loops might use the reconstructed gaze to adapt UI elements during user manipulations.
- Extending the method to multi-user shared screens could reveal how attention shifts when collaborators transform the same content.
Load-bearing premise
On-device gaze estimation can be kept synchronized with precise real-time UI transformation matrices without significant drift or accuracy loss when users perform combined pinch-zoom-rotate actions.
What would settle it
A controlled test showing large systematic deviation between GazeSync-reconstructed gaze points and independently verified ground-truth locations on content undergoing simultaneous scale, rotation, and translation.
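Such a test reduces to a paired-distance comparison. The sketch below computes mean Euclidean error for two sets of estimates against the same ground truth. All coordinates are made-up placeholders in image pixels, not data from the paper, and `mean_error` is a hypothetical helper name.

```python
import math

def mean_error(estimates, truths):
    """Mean Euclidean distance between paired (x, y) points."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("need equally sized, non-empty point lists")
    return sum(math.dist(e, t) for e, t in zip(estimates, truths)) / len(truths)

# Placeholder values for illustration only.
ground_truth = [(100.0, 100.0), (200.0, 150.0)]
reconstructed = [(103.0, 104.0), (200.0, 150.0)]  # content-relative estimates
static_mapped = [(130.0, 140.0), (260.0, 230.0)]  # screen coordinates reused as image coordinates

reconstructed_err = mean_error(reconstructed, ground_truth)
static_err = mean_error(static_mapped, ground_truth)
```

A large `reconstructed_err` under compound manipulation, measured against independently verified ground truth, would count against the load-bearing premise.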
Original abstract
Conventional mobile eye-tracking maps gaze to static screen coordinates, failing to capture user attention when content is dynamic. As users pinch, zoom, and rotate images, static coordinates lose their semantic meaning relative to the underlying visual content. To address this methodological gap, we present GazeSync, a reusable mobile system that synchronizes on-device gaze estimation with real-time image transformation matrices (scale, rotation, and translation). By logging gaze coordinates alongside precise UI states, GazeSync enables the accurate reconstruction of image-relative attention patterns, decoupling visual attention from device interaction. We validate our end-to-end toolchain through a formative study involving guided manipulation, reading, and visual search tasks. Our results demonstrate GazeSync's ability to recover ground-truth gaze locations on transforming content, explicitly showing how it outperforms static baselines, while also surfacing critical boundaries regarding calibration drift and reconstruction fragility under compound manipulations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents GazeSync, a reusable mobile system that synchronizes on-device gaze estimation with real-time image transformation matrices (scale, rotation, translation) to reconstruct image-relative gaze locations during dynamic manipulations such as pinch-zoom-rotate. It describes a formative study with guided manipulation, reading, and visual search tasks, claiming that the system recovers ground-truth gaze on transforming content, outperforms static baselines, and identifies boundaries like calibration drift and fragility under compound manipulations.
Significance. If the synchronization accuracy and reconstruction claims hold with supporting metrics, GazeSync would address a clear methodological gap in mobile HCI eye-tracking by enabling attention analysis decoupled from interaction on dynamic content. This could support more ecologically valid studies of visual attention during real-world mobile tasks, with the reusable toolchain as a practical contribution for the community.
major comments (3)
- [Abstract] The central claim that GazeSync 'recovers ground-truth gaze locations' and 'outperforms static baselines' is unsupported by any quantitative error metrics, participant counts, statistical tests, or reconstruction accuracy numbers, leaving the empirical validation only partially described.
- [Formative study description] The formative study section does not specify how ground-truth gaze was independently established (e.g., fiducial markers, post-hoc annotation, or eye-tracker calibration validation) nor report error rates specifically for compound pinch-zoom-rotate sequences versus single-axis changes, which directly undermines the outperformance and fragility claims.
- [System description] No details are provided on the real-time acquisition of UI transformation matrices, synchronization latency, or any ablation isolating drift accumulation during simultaneous multi-axis manipulations, which is load-bearing for the decoupling of attention from interaction.
minor comments (2)
- [Formative study] Clarify the exact number of participants, task durations, and device models used in the study to allow replication.
- [Results] Add a figure or table summarizing reconstruction error under different manipulation types to make the results more concrete.
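The per-manipulation summary requested in the last minor comment could be produced along these lines. This is a sketch only: the condition labels and error values are placeholders in arbitrary units rather than results from the study, and `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

def summarize(records):
    """Group (manipulation, error) pairs into {manipulation: (count, mean_error)}."""
    groups = defaultdict(list)
    for manipulation, error in records:
        groups[manipulation].append(error)
    return {name: (len(errs), mean(errs)) for name, errs in groups.items()}

# Placeholder errors, not measurements from the paper.
records = [("scale", 1.0), ("scale", 3.0), ("rotate", 2.0),
           ("compound", 4.0), ("compound", 8.0)]
table = summarize(records)
for name, (n, err) in sorted(table.items()):
    print(f"{name:<10} n={n}  mean_error={err:.2f}")
```

Reporting compound sequences as their own row, separate from single-axis changes, is what would let a reader check the fragility claim directly.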
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and indicate planned revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] The central claim that GazeSync 'recovers ground-truth gaze locations' and 'outperforms static baselines' is unsupported by any quantitative error metrics, participant counts, statistical tests, or reconstruction accuracy numbers, leaving the empirical validation only partially described.
  Authors: We agree that the abstract summarizes results at a high level without quantitative metrics. The full manuscript reports reconstruction errors and baseline comparisons in the evaluation section, but we will revise the abstract to include key metrics (e.g., mean error, participant count, and statistical comparisons) so the central claims are substantiated at the abstract level.
  Revision: yes
- Referee: [Formative study description] The formative study section does not specify how ground-truth gaze was independently established (e.g., fiducial markers, post-hoc annotation, or eye-tracker calibration validation) nor report error rates specifically for compound pinch-zoom-rotate sequences versus single-axis changes, which directly undermines the outperformance and fragility claims.
  Authors: We acknowledge this gap in the description. Ground-truth was established via post-hoc annotation aligned to the applied transformation matrices. We will revise the formative study section to explicitly describe this process and add separate error-rate breakdowns for compound versus single-axis manipulations to better support the outperformance and fragility claims.
  Revision: yes
- Referee: [System description] No details are provided on the real-time acquisition of UI transformation matrices, synchronization latency, or any ablation isolating drift accumulation during simultaneous multi-axis manipulations, which is load-bearing for the decoupling of attention from interaction.
  Authors: We agree these implementation details are necessary for reproducibility. We will expand the system description to cover real-time matrix acquisition from the mobile UI framework, report measured synchronization latency, and include an ablation isolating drift under simultaneous multi-axis manipulations.
  Revision: yes
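To make the synchronization question in the third exchange concrete, here is one possible alignment scheme: each gaze sample is paired with the most recent logged UI state, and the age of that state is kept as a per-sample lag. This is an assumed strategy, not the mechanism the paper describes. The timestamps, state labels, and the name `pair_samples` are hypothetical.

```python
from bisect import bisect_right

def pair_samples(gaze_samples, transform_log):
    """Pair each (time, gaze) sample with the latest (time, state) entry at or before it.

    Both inputs must be sorted by time. Returns (gaze, state, lag) tuples, where
    lag is the age of the matched state when the gaze sample was taken. Samples
    that precede the first logged state are skipped.
    """
    times = [t for t, _ in transform_log]
    paired = []
    for t, gaze in gaze_samples:
        i = bisect_right(times, t)
        if i == 0:
            continue  # no UI state has been logged yet
        state_time, state = transform_log[i - 1]
        paired.append((gaze, state, t - state_time))
    return paired

# Hypothetical timestamps in milliseconds with placeholder state labels.
transform_log = [(0, "identity"), (50, "zoomed"), (120, "zoomed+rotated")]
gaze_samples = [(10, (5.0, 5.0)), (60, (6.0, 4.0)), (130, (7.0, 3.0))]
pairs = pair_samples(gaze_samples, transform_log)
```

The distribution of the lag values is one way to report the synchronization latency the referee asks for, since a stale state during a fast gesture maps gaze through the wrong transform.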
Circularity Check
No circularity: system description plus empirical validation with no derivations or self-referential reductions
Full rationale
The paper presents GazeSync as a mobile system that logs gaze coordinates alongside UI transformation matrices (scale, rotation, translation) to reconstruct image-relative attention, validated via a formative study of guided manipulation, reading, and visual search tasks. No equations, fitted parameters, derivations, or self-citations appear in the abstract or described content. The central claim of outperforming static baselines in recovering ground-truth gaze locations rests on direct empirical comparison rather than any reduction to self-defined quantities by construction. The analysis is therefore self-contained with no load-bearing steps that collapse to inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: mobile device APIs expose real-time image transformation matrices (scale, rotation, translation) that can be logged synchronously with gaze data.
invented entities (1)
- GazeSync system: no independent evidence