Bengal-HP_RU: A Dataset of Bengal People For Head Pose Estimation

Bimal Kumar Pramanik; Md. Ahanaf Arif Khan; Md. Iqbal Aziz Khan; Md. Tawhidur Rahman; Sangeeta Biswas; Sanjoy Kumar Chakravarty; Subrata Pramanik

arxiv: 2606.24122 · v1 · pith:NEGLRILKnew · submitted 2026-06-23 · 💻 cs.CV

Pith reviewed 2026-06-26 01:45 UTC · model grok-4.3

keywords: head pose estimation · dataset · Bengali subjects · South Asian · in-the-wild images · continuous annotation · Wikimedia Commons

The pith

Bengal-HP_RU supplies the first publicly released head-pose dataset built around Bengali subjects: 12,894 images, each labeled with continuous yaw, pitch, and roll angles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Head-pose estimation models have been trained almost exclusively on Western and East Asian faces, leaving South Asian appearances underrepresented. The paper fills this gap by releasing Bengal-HP_RU, a collection of 12,894 images drawn from Wikimedia Commons and labeled with continuous yaw, pitch, and roll angles. Images were gathered under free licenses, processed by an automated pipeline, then manually corrected, and split by uploader identity to avoid train-test leakage across 296 distinct sources. The resulting set shows realistic variation in age, gender, occlusion, illumination, and background. Public release of the data at the stated DOI makes it possible for researchers to train and test models on Bengali subjects for the first time.

Core claim

Bengal-HP_RU is the first publicly available head-pose dataset centered on Bengali subjects. It contains 12,894 images annotated with continuous yaw, pitch, and roll values. The images were sourced from free-licensed Wikimedia Commons entries, labeled through an automated pipeline followed by manual correction, and partitioned by uploader identity into 10,494 training and 2,400 test images from 296 unique uploaders. The collection reflects substantial diversity in subject age, gender, occlusion, illumination, and background under in-the-wild conditions.
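The uploader-based partition described above can be illustrated with a minimal sketch. The abstract does not say how uploaders were assigned to each side, so the greedy fill, the `test_fraction` default, and the record layout of `(image_id, uploader)` pairs are all assumptions made for illustration, not the authors' procedure.

```python
import random
from collections import defaultdict

def split_by_uploader(records, test_fraction=0.2, seed=0):
    """Assign whole uploaders to train or test so no uploader spans both.

    `records` is a list of (image_id, uploader) pairs (hypothetical layout).
    """
    by_uploader = defaultdict(list)
    for image_id, uploader in records:
        by_uploader[uploader].append(image_id)

    # Shuffle uploaders reproducibly, then fill the test side first.
    uploaders = sorted(by_uploader)
    random.Random(seed).shuffle(uploaders)

    target = test_fraction * len(records)
    train, test = [], []
    for uploader in uploaders:
        bucket = test if len(test) < target else train
        bucket.extend((image_id, uploader) for image_id in by_uploader[uploader])
    return train, test

# Toy data: 30 hypothetical uploaders with 1 to 5 images each.
records = [(f"img_{u}_{i}", f"uploader_{u}")
           for u in range(30) for i in range(1 + u % 5)]
train, test = split_by_uploader(records)
train_uploaders = {u for _, u in train}
test_uploaders = {u for _, u in test}
print(len(train), len(test), train_uploaders.isdisjoint(test_uploaders))
```

Because every image from one uploader lands on the same side, a burst of near-duplicate photos from a single event cannot leak across the split. The sketch does not address the same person being photographed by two different uploaders.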

What carries the argument

The argument rests on the Bengal-HP_RU dataset itself: the first publicly available source of continuous head-pose labels drawn from Bengali subjects, partitioned by uploader to limit train-test contamination.

If this is right

Head-pose estimators can now be trained and evaluated on Bengali facial geometry and appearance for the first time.
Existing models can be tested for accuracy drop when applied to South Asian subjects using the held-out test partition.
The uploader-based split keeps every image from a given uploader on one side of the partition, which reduces the risk that the same photo or subject appears in both training and test sets.
Diversity across age, gender, occlusion, and lighting supports development of models that generalize to realistic conditions.
The public DOI allows direct download and extension by other researchers without licensing barriers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same Wikimedia sourcing and uploader partitioning strategy could be applied to create comparable datasets for other underrepresented ethnic or regional groups.
Models fine-tuned on Bengal-HP_RU may improve performance in downstream tasks such as gaze estimation or driver monitoring for South Asian populations.
If label quality holds, the dataset offers a low-cost template for rapidly expanding pose data coverage beyond currently dominant demographics.

Load-bearing premise

Wikimedia Commons images chosen for the collection plus the automated-plus-manual labeling process yield pose values and demographic coverage that accurately represent real Bengali head poses without systematic selection or annotation bias.

What would settle it

A controlled re-annotation of a random subset of the images, by multiple independent human labelers or by a calibrated 3D head tracker. Yaw, pitch, and roll values that differ from the released labels by more than a few degrees on average would undermine the premise; close agreement would support it.
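A minimal sketch of that comparison follows, assuming labels are stored as `(yaw, pitch, roll)` tuples in degrees keyed by image id. The storage layout, the function names, and the toy values are hypothetical; the paper does not describe a validation protocol.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)

def mean_abs_error(released, reannotated):
    """Per-axis mean absolute angular difference over shared image ids.

    Both arguments map image_id -> (yaw, pitch, roll) in degrees.
    """
    ids = sorted(set(released) & set(reannotated))
    if not ids:
        raise ValueError("no overlapping image ids to compare")
    totals = [0.0, 0.0, 0.0]
    for image_id in ids:
        for axis in range(3):
            totals[axis] += angular_diff(released[image_id][axis],
                                         reannotated[image_id][axis])
    return tuple(t / len(ids) for t in totals)

# Toy values only; these are not taken from the dataset.
released = {"a": (10.0, -5.0, 0.0), "b": (-170.0, 20.0, 3.0)}
reannotated = {"a": (12.0, -4.0, 1.0), "b": (178.0, 18.0, 2.0)}
print(mean_abs_error(released, reannotated))  # (7.0, 1.5, 1.0)
```

The wrap-around in `angular_diff` matters for large yaw values: -170 and 178 degrees are 12 degrees apart, not 348.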

Original abstract

Existing head pose datasets predominantly feature subjects of Western or East Asian origin, leaving South Asian populations, particularly Bengali individuals, largely underrepresented. We introduce Bengal-HP_RU, the first publicly available head pose dataset centred on Bengali subjects, comprising 12,894 labelled head images annotated with continuous yaw, pitch, and roll values. Images were collected from Wikimedia Commons under free licences and processed through an automated pipeline followed by manual label correction. The dataset is partitioned by Wikimedia uploader identity to prevent data contamination, yielding 10,494 training and 2,400 test images across 296 unique uploaders. Bengal-HP_RU exhibits substantial diversity in subject age, gender, occlusion, illumination, and background, reflecting realistic in-the-wild conditions. The dataset is publicly available at https://doi.org/10.17632/xbw9kr37jb.2.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A dataset release for Bengali head poses that fills a demographic gap but offers no evidence the labels or subject IDs are reliable.

The main thing here is a new collection of 12,894 head-pose images drawn from Wikimedia Commons, presented as the first public set focused on Bengali subjects. The authors collected the images, ran them through an automated pose estimator plus manual fixes, split the data by uploader to avoid leakage, and released it with claims of diversity in age, gender, occlusion, and lighting.

What works is the basic idea of addressing under-representation in head-pose data. South Asian faces are indeed missing from most existing sets, and pulling from free-license sources with an uploader-based split is a reasonable way to keep the test set clean. The numbers (10,494 train, 2,400 test, 296 uploaders) are concrete enough to be usable.

The soft spot is the complete absence of any check on whether the labels or the Bengali identification are actually good. The abstract describes the pipeline but gives no MAE, no inter-rater numbers, no held-out ground-truth comparison, and no details on how Bengali ethnicity was confirmed. Without those, the central claim that this is a reliable Bengali-specific dataset rests on unverified steps. That is a real gap for a data paper.

This is the kind of work that belongs in a data-focused venue or as a short note rather than a full methods paper. A serious referee could usefully press on the validation question and the selection criteria, but the contribution is narrow enough that it does not need to be a high-priority review. I would bring it to a reading group only if the group is specifically looking at dataset bias or fairness in CV.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Bengal-HP_RU, claimed as the first publicly available head-pose dataset centered on Bengali subjects. It comprises 12,894 images sourced from Wikimedia Commons under free licenses, annotated with continuous yaw, pitch, and roll values via an automated pipeline plus manual correction. The dataset is partitioned by uploader identity (10,494 train / 2,400 test across 296 uploaders) to avoid contamination and is asserted to exhibit diversity in age, gender, occlusion, illumination, and background under in-the-wild conditions. The resource is released at a DOI link.

Significance. If the subject identification and pose labels prove reliable, the dataset would address a genuine gap in head-pose estimation by providing data from an underrepresented South Asian population, supporting fairness and generalization studies in computer vision. The uploader-based partitioning is a sound practice that reduces leakage risk and aids reproducibility. Public release under an open license is also a clear strength.

major comments (2)

[Abstract] (data collection paragraph) The claim that the 12,894 images carry reliable continuous yaw/pitch/roll labels rests on an 'automated pipeline followed by manual label correction,' yet no quantitative validation (MAE, inter-rater reliability, or comparison to held-out ground truth) is reported. This directly undermines the central claim that the dataset is usable for training or benchmarking.
[Abstract] (subject selection paragraph) The criteria used to confirm that images depict Bengali subjects (caption text, uploader metadata, visual assessment, or otherwise) are unspecified. Without explicit, reproducible rules, systematic selection bias or mislabeling cannot be ruled out, which is load-bearing for the 'first Bengali-centred dataset' assertion.

minor comments (1)

[Abstract] The statement that the dataset 'exhibits substantial diversity' would be strengthened by even summary statistics (e.g., age/gender histograms or occlusion rates) rather than qualitative description alone.
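As a rough illustration of the summary statistics requested here, the sketch below bins yaw angles and computes the share of records per attribute value. The record fields `yaw` and `occluded` are hypothetical: the abstract does not say whether per-image attribute annotations are released, and the toy values are invented.

```python
from collections import Counter

def yaw_histogram(yaws, bin_width=30):
    """Count yaw angles (degrees) in fixed-width bins keyed by lower edge."""
    counts = Counter()
    for yaw in yaws:
        lower = int(yaw // bin_width) * bin_width
        counts[lower] += 1
    return dict(sorted(counts.items()))

def attribute_rates(records, attribute):
    """Fraction of records taking each value of a categorical attribute."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

# Toy records only; field names and values are assumptions.
records = [
    {"yaw": -45.0, "occluded": True},
    {"yaw": 5.0, "occluded": False},
    {"yaw": 12.0, "occluded": False},
    {"yaw": 61.0, "occluded": True},
]
print(yaw_histogram(r["yaw"] for r in records))  # {-60: 1, 0: 2, 60: 1}
print(attribute_rates(records, "occluded"))      # {True: 0.5, False: 0.5}
```

A table of such counts per split would let readers judge whether the training and test partitions cover comparable pose ranges and occlusion levels.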

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects of dataset documentation that we address point by point below.

Referee: [Abstract] (data collection paragraph) The claim that the 12,894 images carry reliable continuous yaw/pitch/roll labels rests on an 'automated pipeline followed by manual label correction,' yet no quantitative validation (MAE, inter-rater reliability, or comparison to held-out ground truth) is reported. This directly undermines the central claim that the dataset is usable for training or benchmarking.

Authors: We agree that the absence of quantitative validation metrics for the pose labels is a limitation in the current manuscript. The description of the automated pipeline plus manual correction is provided, but no MAE, agreement statistics, or held-out comparisons are reported. In the revised version we will add a dedicated subsection on the annotation procedure that includes the scale of manual corrections performed, any internal consistency checks conducted during correction, and explicit discussion of remaining limitations in label reliability.

Revision: yes

Referee: [Abstract] (subject selection paragraph) The criteria used to confirm that images depict Bengali subjects (caption text, uploader metadata, visual assessment, or otherwise) are unspecified. Without explicit, reproducible rules, systematic selection bias or mislabeling cannot be ruled out, which is load-bearing for the 'first Bengali-centred dataset' assertion.

Authors: The current manuscript states that images were selected from Wikimedia Commons to centre on Bengali subjects but does not enumerate the precise decision rules. We will revise the methods section (and update the abstract accordingly) to provide an explicit, reproducible protocol: the combination of uploader self-identification in metadata, language of captions, geographic tags, and the visual assessment criteria applied by the authors. This addition will allow readers to evaluate potential selection bias.

Revision: yes

Circularity Check

0 steps flagged

No circularity: dataset collection paper with no derivations or predictions

The paper is a data release effort that introduces Bengal-HP_RU by describing image sourcing from Wikimedia Commons, an automated-plus-manual annotation pipeline, and a train/test split by uploader identity. No equations, fitted parameters, predictions, uniqueness theorems, or ansatzes appear in the provided text. The central claim (first Bengali-centric head-pose dataset) is supported by the act of collection itself and does not reduce to any self-referential step. This matches the default expectation for non-circular papers; the reader's assigned score of 0.0 is confirmed.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Dataset release paper; contains no mathematical derivations, fitted parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5714 in / 1004 out tokens · 24038 ms · 2026-06-26T01:45:13.399395+00:00 · methodology

[77]

doi:10.48550/ARXIV.2110.10953 , author =

MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation , publisher =. doi:10.48550/ARXIV.2110.10953 , author =

work page doi:10.48550/arxiv.2110.10953
[78]

Head Pose Estimation through Keypoints Matching between Reconstructed 3D Face Model and 2D Image , volume =

Liu, Leyuan and Ke, Zeran and Huo, Jiao and Chen, Jingying , year =. Head Pose Estimation through Keypoints Matching between Reconstructed 3D Face Model and 2D Image , volume =. Sensors , publisher =. doi:10.3390/s21051841 , number =

work page doi:10.3390/s21051841
[79]

Self-Attention Mechanism-Based Head Pose Estimation Network with Fusion of Point Cloud and Image Features , volume =

Chen, Kui and Wu, Zhaofu and Huang, Jianwei and Su, Yiming , year =. Self-Attention Mechanism-Based Head Pose Estimation Network with Fusion of Point Cloud and Image Features , volume =. Sensors , publisher =. doi:10.3390/s23249894 , number =

work page doi:10.3390/s23249894
[80]

Head pose estimation with particle swarm optimization‐based contrastive learning and multimodal entangled GCN , volume =

Lian, Yuanfeng and Shi, Yinliang and Liu, Zhaonian and Jiang, Bin and Li, Xingtao , year =. Head pose estimation with particle swarm optimization‐based contrastive learning and multimodal entangled GCN , volume =. IET Image Processing , publisher =. doi:10.1049/ipr2.13142 , number =

work page doi:10.1049/ipr2.13142

