
arxiv: 2606.24122 · v1 · pith:NEGLRILKnew · submitted 2026-06-23 · 💻 cs.CV

Bengal-HP_RU: A Dataset of Bengal People For Head Pose Estimation

Pith reviewed 2026-06-26 01:45 UTC · model grok-4.3

classification 💻 cs.CV
keywords head pose estimation · dataset · Bengali subjects · South Asian · in-the-wild images · continuous annotation · Wikimedia Commons

The pith

Bengal-HP_RU is presented as the first publicly released head-pose dataset built around Bengali subjects, with 12,894 images labeled with continuous yaw, pitch, and roll.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing head-pose datasets predominantly feature Western and East Asian subjects, leaving South Asian appearances underrepresented. The paper addresses this gap by releasing Bengal-HP_RU, a collection of 12,894 head images drawn from Wikimedia Commons and labeled with continuous yaw, pitch, and roll angles. Images were gathered under free licenses, processed by an automated pipeline, and then manually corrected. The set is split by uploader identity, so that none of the 296 distinct uploaders contributes to both the training and test partitions, which limits train-test leakage. The authors describe realistic variation in age, gender, occlusion, illumination, and background. Public release at the stated DOI gives researchers an openly licensed set for training and testing models on Bengali subjects.

Core claim

Bengal-HP_RU is the first publicly available head-pose dataset centered on Bengali subjects. It contains 12,894 images annotated with continuous yaw, pitch, and roll values. The images were sourced from free-licensed Wikimedia Commons entries, labeled through an automated pipeline followed by manual correction, and partitioned by uploader identity into 10,494 training and 2,400 test images from 296 unique uploaders. The collection reflects substantial diversity in subject age, gender, occlusion, illumination, and background under in-the-wild conditions.
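The uploader-based partition can be illustrated with a minimal sketch. The paper does not say how uploaders were assigned to the two partitions, so the greedy assignment below, the `split_by_uploader` helper, the `(image_id, uploader)` record layout, and the toy data are all assumptions made for illustration. For reference, the released split places 2,400 of 12,894 images (about 18.6%) in the test set.

```python
import random
from collections import defaultdict


def split_by_uploader(records, test_fraction=0.2, seed=0):
    """Assign whole uploaders to train or test so no uploader spans both.

    `records` is a list of (image_id, uploader) pairs. This layout is
    hypothetical; it is not the dataset's documented format.
    """
    by_uploader = defaultdict(list)
    for image_id, uploader in records:
        by_uploader[uploader].append(image_id)

    # Shuffle uploaders reproducibly, then fill the test set first.
    uploaders = sorted(by_uploader)
    random.Random(seed).shuffle(uploaders)

    target = test_fraction * len(records)
    train, test = [], []
    for uploader in uploaders:
        bucket = test if len(test) < target else train
        bucket.extend(by_uploader[uploader])
    return train, test


# Toy data: 30 images from 6 hypothetical uploaders, 5 images each.
records = [(f"img_{i:03d}", f"uploader_{i % 6}") for i in range(30)]
train_ids, test_ids = split_by_uploader(records)

# Leakage check: the two partitions must not share any uploader.
uploader_of = dict(records)
train_uploaders = {uploader_of[i] for i in train_ids}
test_uploaders = {uploader_of[i] for i in test_ids}
assert train_uploaders.isdisjoint(test_uploaders)
print(len(train_ids), len(test_ids))
```

A split of this kind separates uploaders, not depicted people: two different uploaders can still photograph the same person, so a disjointness check on uploaders is a necessary check for leakage, not a sufficient one.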

What carries the argument

Bengal-HP_RU dataset, which supplies the first large-scale, publicly licensed source of continuous head-pose labels drawn from Bengali subjects and partitioned by uploader to block data contamination.

If this is right

  • Head-pose estimators can now be trained and evaluated on Bengali facial geometry and appearance for the first time.
  • Existing models can be tested for accuracy drop when applied to South Asian subjects using the held-out test partition.
  • The uploader-based split ensures that no uploader contributes images to both the training and test sets, which reduces (but does not by itself rule out) the same person or photo appearing in both.
  • Diversity across age, gender, occlusion, and lighting supports development of models that generalize to realistic conditions.
  • The public DOI allows direct download and extension by other researchers without licensing barriers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same Wikimedia sourcing and uploader partitioning strategy could be applied to create comparable datasets for other underrepresented ethnic or regional groups.
  • Models fine-tuned on Bengal-HP_RU may improve performance in downstream tasks such as gaze estimation or driver monitoring for South Asian populations.
  • If label quality holds, the dataset offers a low-cost template for rapidly expanding pose data coverage beyond currently dominant demographics.

Load-bearing premise

Wikimedia Commons images chosen for the collection plus the automated-plus-manual labeling process yield pose values and demographic coverage that accurately represent real Bengali head poses without systematic selection or annotation bias.

What would settle it

A controlled re-annotation of a random subset of the images, either by multiple independent human labelers or by a calibrated 3D head tracker. If the resulting yaw, pitch, and roll values differ from the released labels by more than a few degrees on average, the labels are unreliable; close agreement would support them.
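A re-annotation check of this kind reduces to a per-axis mean absolute error between the released labels and the new ones. The sketch below is illustrative only: the function names, the `(yaw, pitch, roll)` tuple layout in degrees, and the toy values are assumptions, and the paper reports no such comparison. Differences are wrapped so that 179° and -179° count as 2° apart, not 358°.

```python
def angular_difference(a_deg, b_deg):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def per_axis_mae(released, reannotated):
    """Mean absolute angular error per axis.

    Both arguments are equal-length lists of (yaw, pitch, roll)
    tuples in degrees (a hypothetical layout).
    """
    if not released or len(released) != len(reannotated):
        raise ValueError("need two non-empty label lists of equal length")

    totals = [0.0, 0.0, 0.0]
    for old, new in zip(released, reannotated):
        for axis in range(3):
            totals[axis] += abs(angular_difference(old[axis], new[axis]))

    n = len(released)
    return {name: totals[i] / n for i, name in enumerate(("yaw", "pitch", "roll"))}


# Toy values, not taken from the dataset.
released = [(10.0, -5.0, 2.0), (-30.0, 12.0, 0.0), (179.0, 0.0, -4.0)]
reannotated = [(12.0, -4.0, 2.0), (-27.0, 10.0, 1.0), (-179.0, 1.0, -2.0)]

mae = per_axis_mae(released, reannotated)
print(mae)
```

The acceptance threshold ("a few degrees") is left to the reader, since the page does not fix a number.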

Original abstract

Existing head pose datasets predominantly feature subjects of Western or East Asian origin, leaving South Asian populations, particularly Bengali individuals, largely underrepresented. We introduce Bengal-HP_RU, the first publicly available head pose dataset centred on Bengali subjects, comprising 12,894 labelled head images annotated with continuous yaw, pitch, and roll values. Images were collected from Wikimedia Commons under free licences and processed through an automated pipeline followed by manual label correction. The dataset is partitioned by Wikimedia uploader identity to prevent data contamination, yielding 10,494 training and 2,400 test images across 296 unique uploaders. Bengal-HP_RU exhibits substantial diversity in subject age, gender, occlusion, illumination, and background, reflecting realistic in-the-wild conditions. The dataset is publicly available at https://doi.org/10.17632/xbw9kr37jb.2.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Bengal-HP_RU, claimed as the first publicly available head-pose dataset centered on Bengali subjects. It comprises 12,894 images sourced from Wikimedia Commons under free licenses, annotated with continuous yaw, pitch, and roll values via an automated pipeline plus manual correction. The dataset is partitioned by uploader identity (10,494 train / 2,400 test across 296 uploaders) to avoid contamination and is asserted to exhibit diversity in age, gender, occlusion, illumination, and background under in-the-wild conditions. The resource is released at a DOI link.

Significance. If the subject identification and pose labels prove reliable, the dataset would address a genuine gap in head-pose estimation by providing data from an underrepresented South Asian population, supporting fairness and generalization studies in computer vision. The uploader-based partitioning is a sound practice that reduces leakage risk and aids reproducibility. Public release under open license is also a clear strength.

major comments (2)
  1. [Abstract] Abstract (data collection paragraph): the claim that the 12,894 images carry reliable continuous yaw/pitch/roll labels rests on an 'automated pipeline followed by manual label correction,' yet no quantitative validation (MAE, inter-rater reliability, or comparison to held-out ground truth) is reported. This directly undermines the central claim that the dataset is usable for training or benchmarking.
  2. [Abstract] Abstract (subject selection paragraph): criteria used to confirm that images depict Bengali subjects (caption text, uploader metadata, visual assessment, or otherwise) are unspecified. Without explicit, reproducible rules, systematic selection bias or mislabeling cannot be ruled out, which is load-bearing for the 'first Bengali-centred dataset' assertion.
minor comments (1)
  1. [Abstract] Abstract: the statement that the dataset 'exhibits substantial diversity' would be strengthened by even summary statistics (e.g., age/gender histograms or occlusion rates) rather than qualitative description alone.
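The summary statistics requested in the minor comment could be as simple as a binned pose histogram and an occlusion rate. The sketch below assumes a hypothetical per-image annotation with a `yaw` value in degrees and an `occluded` flag; the released dataset's actual fields are not described on this page, and the rows are toy values.

```python
from collections import Counter


def yaw_bin(yaw_deg, width=30):
    """Label of the half-open bin [lo, lo + width) containing yaw_deg."""
    lo = int(yaw_deg // width) * width
    return f"[{lo}, {lo + width})"


# Toy rows with hypothetical field names, not taken from the dataset.
annotations = [
    {"yaw": -42.0, "occluded": False},
    {"yaw": -5.0, "occluded": True},
    {"yaw": 3.5, "occluded": False},
    {"yaw": 28.0, "occluded": False},
]

# Count images per yaw bin and the share of occluded images.
yaw_histogram = Counter(yaw_bin(row["yaw"]) for row in annotations)
occlusion_rate = sum(row["occluded"] for row in annotations) / len(annotations)

print(dict(yaw_histogram))
print(occlusion_rate)
```

Reporting the same counts separately for the training and test partitions would also show whether the uploader-based split left the two pose distributions comparable.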

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects of dataset documentation that we address point by point below.

Point-by-point responses
  1. Referee: [Abstract] Abstract (data collection paragraph): the claim that the 12,894 images carry reliable continuous yaw/pitch/roll labels rests on an 'automated pipeline followed by manual label correction,' yet no quantitative validation (MAE, inter-rater reliability, or comparison to held-out ground truth) is reported. This directly undermines the central claim that the dataset is usable for training or benchmarking.

    Authors: We agree that the absence of quantitative validation metrics for the pose labels is a limitation in the current manuscript. The description of the automated pipeline plus manual correction is provided, but no MAE, agreement statistics, or held-out comparisons are reported. In the revised version we will add a dedicated subsection on the annotation procedure that includes the scale of manual corrections performed, any internal consistency checks conducted during correction, and explicit discussion of remaining limitations in label reliability.

    revision: yes

  2. Referee: [Abstract] Abstract (subject selection paragraph): criteria used to confirm that images depict Bengali subjects (caption text, uploader metadata, visual assessment, or otherwise) are unspecified. Without explicit, reproducible rules, systematic selection bias or mislabeling cannot be ruled out, which is load-bearing for the 'first Bengali-centred dataset' assertion.

    Authors: The current manuscript states that images were selected from Wikimedia Commons to centre on Bengali subjects but does not enumerate the precise decision rules. We will revise the methods section (and update the abstract accordingly) to provide an explicit, reproducible protocol: the combination of uploader self-identification in metadata, language of captions, geographic tags, and the visual assessment criteria applied by the authors. This addition will allow readers to evaluate potential selection bias.

    revision: yes

Circularity Check

0 steps flagged

No circularity: dataset collection paper with no derivations or predictions

Full rationale

The paper is a data release effort that introduces Bengal-HP_RU by describing image sourcing from Wikimedia Commons, an automated-plus-manual annotation pipeline, and a train/test split by uploader identity. No equations, fitted parameters, predictions, uniqueness theorems, or ansatzes appear in the provided text. The central claim (first Bengali-centric head-pose dataset) is supported by the act of collection itself and does not reduce to any self-referential step. This matches the default expectation for non-circular papers; the reader's assigned score of 0.0 is confirmed.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Dataset release paper; contains no mathematical derivations, fitted parameters, axioms, or invented entities.


