Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets
Pith reviewed 2026-05-15 11:07 UTC · model grok-4.3
The pith
A method expands limited mmWave datasets for human pose estimation by adding pseudo-labeled mmWave point clouds and translated LiDAR data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Expanding an original mmWave HPE dataset with both LiDAR-converted point clouds and pseudo-labeled mmWave point clouds raises the performance and generalization of all examined models, with measured error reductions of 15.1 percent in-domain and 18.9 percent out-of-domain.
What carries the argument
EMDUL, a two-module system whose pseudo-label estimator annotates unlabeled mmWave data while a closed-form converter translates annotated LiDAR point clouds into matching mmWave point clouds.
Load-bearing premise
The pseudo-labels assigned to unlabeled mmWave data remain accurate enough for training, and the LiDAR-to-mmWave converter keeps the essential pose geometry intact.
What would settle it
Retraining the same HPE models on the expanded dataset and finding no reduction, or even an increase, in pose estimation error relative to training on the original dataset alone.
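The check described here comes down to comparing two error figures per model. A minimal sketch of that arithmetic, assuming one scalar error value per training condition; the function name `relative_error_reduction` and the example values are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, expanded_error: float) -> float:
    """Return the percentage by which error dropped relative to the baseline.

    Positive values mean the expanded dataset helped; zero or negative values
    mean no reduction or an increase in error.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - expanded_error) / baseline_error


# Hypothetical error values (millimetres), chosen only to illustrate the arithmetic.
baseline_mm = 100.0   # model trained on the original dataset alone
expanded_mm = 84.9    # same model trained on the expanded dataset
reduction = relative_error_reduction(baseline_mm, expanded_mm)
print(f"{reduction:.1f}% reduction")
```

A result at or below zero for the retrained models would be the outcome that counts against the claim.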
Original abstract
Current millimeter-wave (mmWave) datasets for human pose estimation (HPE) are scarce and lack diversity in both point cloud (PC) attributes and human poses, hindering the generalization ability of their trained models. On the other hand, unlabeled mmWave HPE data and diverse LiDAR HPE datasets are readily available. We propose EMDUL, a novel approach to expand the volume and diversity of an existing mmWave dataset using unlabeled mmWave data and LiDAR datasets. EMDUL consists of two independent modules, namely a pseudo-label estimator to annotate unlabeled mmWave data, and a closed-form converter that translates an annotated LiDAR PC to its mmWave counterpart. Expanding the original dataset with both LiDAR-converted and pseudo-labeled mmWave PCs significantly boosts the performance and generalization ability of all the examined HPE models, reducing error by 15.1% and 18.9% in the in-domain and out-of-domain settings, respectively. Code is available at https://github.com/Shimmer93/EMDUL.
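As a rough illustration of the expansion step the abstract describes, the sketch below merges the three data sources into one training set while recording where each sample came from. It assumes each sample is a (point cloud, joint annotations) pair; the `Sample` class, the source tags, and the helper names are hypothetical and do not come from the EMDUL code base.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Pair = Tuple[Sequence[Point], Sequence[Point]]  # (point cloud, joint annotations)


@dataclass(frozen=True)
class Sample:
    points: Sequence[Point]   # mmWave-style point cloud
    joints: Sequence[Point]   # 3D joint positions (real, pseudo, or carried over from LiDAR)
    source: str               # provenance tag, kept so later ablations can filter by origin


def tag(pairs: Iterable[Pair], source: str) -> List[Sample]:
    """Wrap raw (points, joints) pairs with a provenance tag."""
    return [Sample(points, joints, source) for points, joints in pairs]


def expand_dataset(original: Iterable[Pair],
                   pseudo_labeled: Iterable[Pair],
                   lidar_converted: Iterable[Pair]) -> List[Sample]:
    """Concatenate the original set with both kinds of added samples."""
    return (tag(original, "original")
            + tag(pseudo_labeled, "pseudo_labeled")
            + tag(lidar_converted, "lidar_converted"))


def composition(dataset: Iterable[Sample]) -> Counter:
    """Count samples per provenance tag."""
    return Counter(sample.source for sample in dataset)


# Toy data: one-point clouds with one joint each (hypothetical values).
pair = ([(0.0, 0.0, 1.0)], [(0.0, 0.0, 1.0)])
expanded = expand_dataset([pair, pair], [pair], [pair])
print(composition(expanded))
```

Keeping the provenance tag makes it straightforward to retrain on any subset (original only, original plus one source, or all three) when checking which addition drives a gain.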
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces EMDUL, a two-module approach to expand scarce mmWave point-cloud datasets for human pose estimation (HPE). One module applies a pseudo-label estimator to annotate unlabeled mmWave data; the other uses a closed-form converter to map annotated LiDAR point clouds to mmWave equivalents. The authors claim that augmenting an original mmWave dataset with both types of expanded samples yields consistent error reductions of 15.1% (in-domain) and 18.9% (out-of-domain) across examined HPE models, improving generalization.
Significance. If the pseudo-labels and LiDAR-to-mmWave conversions preserve pose information without systematic distortion, the method offers a practical route to leverage abundant unlabeled mmWave and diverse LiDAR data, addressing data scarcity in mmWave HPE. The public code release supports reproducibility and allows independent verification of the empirical gains.
major comments (3)
- The central claim of 15.1%/18.9% error reduction rests on the assumption that the pseudo-label estimator produces annotations accurate enough not to degrade training. No direct validation (e.g., mean per-joint position error or precision-recall on a held-out labeled mmWave subset) is reported for this estimator, leaving open the possibility that observed gains arise from training on internally consistent but noisy labels rather than genuine diversity.
- The closed-form LiDAR-to-mmWave converter is presented without quantitative fidelity checks. No ablation compares converted LiDAR point clouds against real mmWave point clouds captured for the same poses, nor is there a sensitivity analysis showing how range-dependent sparsity or joint-level cue loss in the conversion propagates into final HPE metrics.
- Experimental details on baseline selection, validation splits, and controls for dataset-selection effects are insufficient to confirm the reported gains are robust. The soundness assessment notes the absence of these controls, which directly affects whether the in-domain and out-of-domain improvements can be attributed to the proposed expansion rather than confounding factors.
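The direct validation requested in the first comment can be illustrated with mean per-joint position error (MPJPE), computed between pseudo-labels and ground-truth joints on a held-out labeled subset. The sketch below is a plain-Python version under the assumption that poses are lists of 3D joint coordinates in a shared frame and unit; the toy values are hypothetical.

```python
import math
from typing import Sequence

Joint = Sequence[float]   # (x, y, z) in any consistent unit
Pose = Sequence[Joint]    # one 3D position per joint


def mpjpe(predicted: Sequence[Pose], ground_truth: Sequence[Pose]) -> float:
    """Mean Euclidean distance between matching joints, averaged over all frames."""
    if len(predicted) != len(ground_truth):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for pred_pose, gt_pose in zip(predicted, ground_truth):
        if len(pred_pose) != len(gt_pose):
            raise ValueError("joint counts differ")
        for p, g in zip(pred_pose, gt_pose):
            total += math.dist(p, g)  # per-joint Euclidean error
            count += 1
    if count == 0:
        raise ValueError("no joints to compare")
    return total / count


# Toy held-out subset: two frames, two joints each (hypothetical values, metres).
pseudo_labels = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                 [(0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]]
true_labels = [[(0.0, 0.0, 0.1), (1.0, 0.0, 0.0)],
               [(0.0, 1.0, 0.0), (0.0, 0.3, 2.0)]]
held_out_error = mpjpe(pseudo_labels, true_labels)
print(held_out_error)
```

Reporting this number for the estimator alone, before any downstream HPE training, would separate label quality from the effect of added diversity.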
minor comments (1)
- The abstract and method sections would benefit from explicit enumeration of the exact HPE architectures tested and the source datasets (including sizes and pose distributions) used for both the original and expanded sets.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and detailed assessment of our manuscript on EMDUL. We address each major comment below with point-by-point responses. Where the comments identify gaps in validation or experimental detail, we have revised the manuscript to incorporate the requested analyses and clarifications, strengthening the evidence for our claims of improved generalization in mmWave human pose estimation.
- Referee: The central claim of 15.1%/18.9% error reduction rests on the assumption that the pseudo-label estimator produces annotations accurate enough not to degrade training. No direct validation (e.g., mean per-joint position error or precision-recall on a held-out labeled mmWave subset) is reported for this estimator, leaving open the possibility that observed gains arise from training on internally consistent but noisy labels rather than genuine diversity.
  Authors: We agree that direct validation of the pseudo-label estimator is important to substantiate the quality of the expanded data. In the revised manuscript, we now include an evaluation of the estimator on a held-out labeled mmWave subset, reporting mean per-joint position error (MPJPE) and precision-recall metrics. These results confirm that the pseudo-labels preserve sufficient pose accuracy to contribute to the observed performance gains rather than introducing only noise.
  Revision: yes
- Referee: The closed-form LiDAR-to-mmWave converter is presented without quantitative fidelity checks. No ablation compares converted LiDAR point clouds against real mmWave point clouds captured for the same poses, nor is there a sensitivity analysis showing how range-dependent sparsity or joint-level cue loss in the conversion propagates into final HPE metrics.
  Authors: We acknowledge the absence of direct fidelity checks in the original submission. The revised version adds a quantitative comparison of converted LiDAR point clouds against real mmWave captures for identical poses, including an ablation study and sensitivity analysis on range-dependent sparsity and joint cue preservation. These additions demonstrate that the closed-form converter maintains sufficient fidelity for the reported HPE improvements.
  Revision: yes
- Referee: Experimental details on baseline selection, validation splits, and controls for dataset-selection effects are insufficient to confirm the reported gains are robust. The soundness assessment notes the absence of these controls, which directly affects whether the in-domain and out-of-domain improvements can be attributed to the proposed expansion rather than confounding factors.
  Authors: We appreciate this observation on experimental rigor. The revised manuscript now provides expanded details on baseline model selection criteria, the precise train/validation/test splits, and additional controls including cross-validation experiments and dataset composition analyses. These updates confirm that the 15.1% in-domain and 18.9% out-of-domain error reductions are attributable to the data expansion rather than confounding factors.
  Revision: yes
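One way the converter-fidelity comparison discussed in the responses could be quantified is a symmetric Chamfer distance between a converted LiDAR point cloud and a real mmWave capture of the same pose. This is an assumption about a suitable metric, not the measure the authors report; the sketch uses brute-force nearest-neighbour search in plain Python, and the toy clouds are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _mean_nearest_distance(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Average distance from each source point to its nearest target point."""
    return sum(min(math.dist(s, t) for t in target) for s in source) / len(source)


def chamfer_distance(cloud_a: Sequence[Point], cloud_b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: sum of both directed mean nearest-neighbour distances."""
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")
    return (_mean_nearest_distance(cloud_a, cloud_b)
            + _mean_nearest_distance(cloud_b, cloud_a))


# Toy clouds (hypothetical values, metres): the real capture has one extra point.
converted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
real = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (1.0, 0.0, 0.0)]
fidelity_gap = chamfer_distance(converted, real)
print(fidelity_gap)
```

The brute-force search is quadratic in the number of points, which is acceptable for sparse mmWave clouds; a spatial index would be the usual substitute for denser data.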
Circularity Check
No circularity: empirical dataset expansion with independent modules
The paper describes an empirical method (EMDUL) consisting of a pseudo-label estimator for unlabeled mmWave data and a closed-form converter from LiDAR point clouds. Performance improvements (15.1% and 18.9% error reduction) are shown via direct experiments on HPE models after dataset expansion. No equations, derivations, or self-referential definitions appear in the provided text that would reduce any claimed result to its own inputs by construction. No fitted parameters are relabeled as predictions, and no load-bearing self-citations or uniqueness theorems are invoked. The central claims rest on experimental outcomes rather than any closed logical loop.
Axiom & Free-Parameter Ledger
free parameters (1)
- Hyperparameters of pseudo-label estimator
axioms (2)
- Domain assumption: pseudo-labels from the estimator are accurate enough for effective model training.
- Domain assumption: LiDAR point clouds can be mapped to mmWave equivalents via closed-form conversion while retaining pose features.
Supplementary Material
Table 9. The entire...
More Implementation Details
This section presents more implementation details of our proposed EMDUL and a comparison scheme adapted for mmWave HPE.
9.1. PC Conversion Pipeline
We specify the parameters used in our PC conversion pipeline in Tab. 9. The parameters are chosen based on empirical results on the validation set. υ is re-sampled per instance.
9.2. ...
More Experimental Results
In this section, we show more quantitative and visualization results for EMDUL.
Figure 7. More visualization results of point cloud conversion. Left: original LiDAR PCs. Right: converted mmWave PCs. Joints with high flow values are yellow, while those with low flow values are blue.
10.1. Complete Ablation Study on PC Conversion
W...
Limitation and Future Work
While EMDUL significantly improves performance by expanding mmWave datasets with unlabeled data and LiDAR datasets, it has certain limitations that pave the way for future research. First, the PC conversion pipeline relies on empirical parameter settings, which may not be optimal for all scenarios. Future work could explore ...