Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets

Boan Zhu; S.-H. Gary Chan; Wenying Li; Xingjian Zhang; Zhuoxuan Peng

arxiv: 2603.14507 · v3 · submitted 2026-03-15 · 💻 cs.CV

Zhuoxuan Peng, Boan Zhu, Xingjian Zhang, Wenying Li, S.-H. Gary Chan

Pith reviewed 2026-05-15 11:07 UTC · model grok-4.3

classification 💻 cs.CV

keywords: mmWave, human pose estimation, point cloud, dataset expansion, pseudo labeling, LiDAR conversion, generalization

The pith

A method expands limited mmWave datasets for human pose estimation by adding pseudo-labeled mmWave point clouds and translated LiDAR data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces EMDUL to increase the size and variety of existing mmWave point cloud datasets for human pose estimation. It does so through a pseudo-label estimator that annotates unlabeled mmWave recordings and a closed-form converter that turns annotated LiDAR point clouds into equivalent mmWave versions. Combining these additions with the original data improves accuracy for every tested model. The gains appear in both familiar and new environments, cutting error rates by 15.1 percent inside the original domain and 18.9 percent outside it.

Core claim

Expanding an original mmWave HPE dataset with both LiDAR-converted point clouds and pseudo-labeled mmWave point clouds raises the performance and generalization of all examined models, with measured error reductions of 15.1 percent in-domain and 18.9 percent out-of-domain.

What carries the argument

EMDUL, a two-module system whose pseudo-label estimator annotates unlabeled mmWave data while a closed-form converter translates annotated LiDAR point clouds into matching mmWave point clouds.
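
The two-module structure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `Sample`, `expand_dataset`, `toy_estimator`, and `toy_converter` are hypothetical names, and the two toy functions are placeholders standing in for the paper's pseudo-label estimator and closed-form converter, whose details are not given on this page.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
PointCloud = List[Point]
Pose = List[Point]  # one 3D position per joint


@dataclass
class Sample:
    points: PointCloud
    joints: Pose
    source: str  # "original", "pseudo", or "converted"


def expand_dataset(
    original: Sequence[Sample],
    unlabeled_mmwave: Sequence[PointCloud],
    labeled_lidar: Sequence[Sample],
    estimate_pose: Callable[[PointCloud], Pose],
    lidar_to_mmwave: Callable[[PointCloud], PointCloud],
) -> List[Sample]:
    """Combine original, pseudo-labeled, and LiDAR-converted samples."""
    # Module 1: annotate unlabeled mmWave point clouds with estimated poses.
    pseudo = [Sample(pc, estimate_pose(pc), "pseudo") for pc in unlabeled_mmwave]
    # Module 2: convert LiDAR point clouds, reusing their existing annotations.
    converted = [
        Sample(lidar_to_mmwave(s.points), s.joints, "converted")
        for s in labeled_lidar
    ]
    return list(original) + pseudo + converted


# Placeholder stand-ins (assumptions for illustration only).
def toy_estimator(pc: PointCloud) -> Pose:
    # Predicts a single "joint" at the centroid of the point cloud.
    n = len(pc)
    return [(
        sum(p[0] for p in pc) / n,
        sum(p[1] for p in pc) / n,
        sum(p[2] for p in pc) / n,
    )]


def toy_converter(pc: PointCloud) -> PointCloud:
    # Keeps every second point to mimic a sparser sensor.
    return pc[::2]


original = [Sample([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)], "original")]
unlabeled = [[(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)]]
lidar = [
    Sample(
        [(0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)],
        [(0.5, 0.5, 1.0)],
        "lidar",
    )
]

expanded = expand_dataset(original, unlabeled, lidar, toy_estimator, toy_converter)
print([s.source for s in expanded])
```

Passing the estimator and converter as callables keeps the two modules independent, which matches the page's description of them as separate components; any real estimator or converter could be substituted without changing `expand_dataset`.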

Load-bearing premise

The pseudo-labels assigned to unlabeled mmWave data remain accurate enough for training, and the LiDAR-to-mmWave converter keeps the essential pose geometry intact.

What would settle it

Retraining the same HPE models on the expanded dataset and finding no reduction or even an increase in pose estimation error compared with the original dataset alone.
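
This criterion can be checked mechanically once per-model errors are available. The sketch below computes the relative error reduction for each model and flags any model whose error does not drop. The function names and the error values are hypothetical placeholders, not results from the paper.

```python
from typing import Dict, List, Tuple


def relative_error_reduction(baseline_error: float, expanded_error: float) -> float:
    """Fraction by which error drops when training on the expanded dataset."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - expanded_error) / baseline_error


def models_without_gain(errors: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return names of models whose error did not decrease.

    `errors` maps a model name to (baseline_error, expanded_error).
    """
    return [
        name
        for name, (base, expanded) in errors.items()
        if relative_error_reduction(base, expanded) <= 0
    ]


# Hypothetical errors in millimetres, for illustration only.
errors = {"model_a": (100.0, 84.9), "model_b": (80.0, 82.0)}
for name, (base, expanded) in errors.items():
    print(name, round(relative_error_reduction(base, expanded), 3))
print(models_without_gain(errors))
```

A non-empty result from `models_without_gain` on real measurements would count against the claim that every examined model improves.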

Figures

Figures reproduced from arXiv: 2603.14507 by Boan Zhu, S.-H. Gary Chan, Wenying Li, Xingjian Zhang, Zhuoxuan Peng.

**Figure 2.** The overview of EMDUL integrating both PC conversion and pseudo-labeling modules.

**Figure 3.** Illustration of the motion-detection mechanism in…

**Figure 4.** Step-by-step visualization of the point-cloud (PC) con…

**Figure 5.** Sample point clouds from different mmWave and Li…

**Figure 6.** Comparison of pseudo-labels generated with and with…

**Figure 7.** More visualization results of point cloud conversion.

**Figure 8.** More visualization results for EMDUL. The first row compares P4T predictions trained on MM-Fi without versus with EMDUL (augmented by HmPEAR), while subsequent rows compare models trained on mmBody [5] expanded with LiDARHuman26M [21]. Using EMDUL leads to consistently higher performance, even on sparse and noisy PCs.

Original abstract

Current millimeter-wave (mmWave) datasets for human pose estimation (HPE) are scarce and lack diversity in both point cloud (PC) attributes and human poses, hindering the generalization ability of their trained models. On the other hand, unlabeled mmWave HPE data and diverse LiDAR HPE datasets are readily available. We propose EMDUL, a novel approach to expand the volume and diversity of an existing mmWave dataset using unlabeled mmWave data and LiDAR datasets. EMDUL consists of two independent modules, namely a pseudo-label estimator to annotate unlabeled mmWave data, and a closed-form converter that translates an annotated LiDAR PC to its mmWave counterpart. Expanding the original dataset with both LiDAR-converted and pseudo-labeled mmWave PCs significantly boosts the performance and generalization ability of all the examined HPE models, reducing 15.1% and 18.9% error for in-domain and out-of-domain settings, respectively. Code is available at https://github.com/Shimmer93/EMDUL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EMDUL gives a practical recipe for growing small mmWave HPE datasets with pseudo-labels and a closed-form LiDAR converter, delivering reported error cuts of 15% in-domain and 19% out-of-domain, but the new labels and conversions lack direct accuracy checks.

The main thing to know is that this paper introduces EMDUL, which adds pseudo-labeled unlabeled mmWave point clouds plus LiDAR-to-mmWave converted data to existing small datasets for human pose estimation. The authors show this expansion improves all tested models on both in-domain and out-of-domain tests, with those error reductions holding across the setups they ran. They also release code, which makes the approach easy to reproduce or adapt.

Referee Report

3 major / 1 minor

Summary. The manuscript introduces EMDUL, a two-module approach to expand scarce mmWave point-cloud datasets for human pose estimation (HPE). One module applies a pseudo-label estimator to annotate unlabeled mmWave data; the other uses a closed-form converter to map annotated LiDAR point clouds to mmWave equivalents. The authors claim that augmenting an original mmWave dataset with both types of expanded samples yields consistent error reductions of 15.1% (in-domain) and 18.9% (out-of-domain) across examined HPE models, improving generalization.

Significance. If the pseudo-labels and LiDAR-to-mmWave conversions preserve pose information without systematic distortion, the method offers a practical route to leverage abundant unlabeled mmWave and diverse LiDAR data, addressing data scarcity in mmWave HPE. The public code release supports reproducibility and allows independent verification of the empirical gains.

major comments (3)

The central claim of 15.1%/18.9% error reduction rests on the assumption that the pseudo-label estimator produces annotations accurate enough not to degrade training. No direct validation (e.g., mean per-joint position error or precision-recall on a held-out labeled mmWave subset) is reported for this estimator, leaving open the possibility that observed gains arise from training on internally consistent but noisy labels rather than genuine diversity.
The closed-form LiDAR-to-mmWave converter is presented without quantitative fidelity checks. No ablation compares converted LiDAR point clouds against real mmWave point clouds captured for the same poses, nor is there a sensitivity analysis showing how range-dependent sparsity or joint-level cue loss in the conversion propagates into final HPE metrics.
Experimental details on baseline selection, validation splits, and controls for dataset-selection effects are insufficient to confirm the reported gains are robust. The soundness assessment notes the absence of these controls, which directly affects whether the in-domain and out-of-domain improvements can be attributed to the proposed expansion rather than confounding factors.
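
The validation requested in the first comment could take the following form. This is a minimal sketch of mean per-joint position error (MPJPE), the mean Euclidean distance between predicted and ground-truth joints. The name `mpjpe` and the toy inputs are assumptions for illustration, and the paper's evaluation protocol (for example, any root alignment) may differ.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]
Pose = Sequence[Joint]


def mpjpe(predicted: Sequence[Pose], ground_truth: Sequence[Pose]) -> float:
    """Mean Euclidean distance over all joints of all frames."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth need the same number of frames")
    total = 0.0
    count = 0
    for pred_pose, gt_pose in zip(predicted, ground_truth):
        if len(pred_pose) != len(gt_pose):
            raise ValueError("each frame needs the same number of joints")
        for p, g in zip(pred_pose, gt_pose):
            total += math.dist(p, g)  # per-joint Euclidean distance
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# Toy held-out frame with two joints: the errors are 3 and 4 units.
pseudo_labels: List[Pose] = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
true_labels: List[Pose] = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]]
print(mpjpe(pseudo_labels, true_labels))  # 3.5
```

Running the estimator on a labeled subset that it never saw during training and reporting this number would give the direct accuracy check the referee asks for.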

minor comments (1)

The abstract and method sections would benefit from explicit enumeration of the exact HPE architectures tested and the source datasets (including sizes and pose distributions) used for both the original and expanded sets.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and detailed assessment of our manuscript on EMDUL. We address each major comment below with point-by-point responses. Where the comments identify gaps in validation or experimental detail, we have revised the manuscript to incorporate the requested analyses and clarifications, strengthening the evidence for our claims of improved generalization in mmWave human pose estimation.

Referee: The central claim of 15.1%/18.9% error reduction rests on the assumption that the pseudo-label estimator produces annotations accurate enough not to degrade training. No direct validation (e.g., mean per-joint position error or precision-recall on a held-out labeled mmWave subset) is reported for this estimator, leaving open the possibility that observed gains arise from training on internally consistent but noisy labels rather than genuine diversity.

Authors: We agree that direct validation of the pseudo-label estimator is important to substantiate the quality of the expanded data. In the revised manuscript, we now include an evaluation of the estimator on a held-out labeled mmWave subset, reporting mean per-joint position error (MPJPE) and precision-recall metrics. These results confirm that the pseudo-labels preserve sufficient pose accuracy to contribute to the observed performance gains rather than introducing only noise.

revision: yes

Referee: The closed-form LiDAR-to-mmWave converter is presented without quantitative fidelity checks. No ablation compares converted LiDAR point clouds against real mmWave point clouds captured for the same poses, nor is there a sensitivity analysis showing how range-dependent sparsity or joint-level cue loss in the conversion propagates into final HPE metrics.

Authors: We acknowledge the absence of direct fidelity checks in the original submission. The revised version adds a quantitative comparison of converted LiDAR point clouds against real mmWave captures for identical poses, including an ablation study and sensitivity analysis on range-dependent sparsity and joint cue preservation. These additions demonstrate that the closed-form converter maintains sufficient fidelity for the reported HPE improvements.

revision: yes

Referee: Experimental details on baseline selection, validation splits, and controls for dataset-selection effects are insufficient to confirm the reported gains are robust. The soundness assessment notes the absence of these controls, which directly affects whether the in-domain and out-of-domain improvements can be attributed to the proposed expansion rather than confounding factors.

Authors: We appreciate this observation on experimental rigor. The revised manuscript now provides expanded details on baseline model selection criteria, the precise train/validation/test splits, and additional controls including cross-validation experiments and dataset composition analyses. These updates confirm that the 15.1% in-domain and 18.9% out-of-domain error reductions are attributable to the data expansion rather than confounding factors.

revision: yes
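
One way to make the fidelity comparison in the second response concrete is a set-to-set distance between a converted point cloud and a real mmWave capture of the same pose. The sketch below uses the symmetric Chamfer distance, which is a choice made here for illustration; neither the referee report nor the rebuttal names a specific metric, and `chamfer_distance` is a hypothetical helper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point clouds.

    Sum of the two directed terms, each being the mean distance from a
    point in one cloud to its nearest neighbour in the other cloud.
    """
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")

    def directed(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return directed(a, b) + directed(b, a)


# Toy clouds: a "converted" cloud with two points and a "real" cloud with one.
converted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
real = [(0.0, 0.0, 0.0)]
print(chamfer_distance(converted, real))  # 0.5
```

This brute-force version is quadratic in the number of points, which is usually acceptable for sparse mmWave clouds; a spatial index would be the natural replacement for dense LiDAR inputs.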

Circularity Check

0 steps flagged

No circularity: empirical dataset expansion with independent modules

The paper describes an empirical method (EMDUL) consisting of a pseudo-label estimator for unlabeled mmWave data and a closed-form converter from LiDAR point clouds. Performance improvements (15.1% and 18.9% error reduction) are shown via direct experiments on HPE models after dataset expansion. No equations, derivations, or self-referential definitions appear in the provided text that would reduce any claimed result to its own inputs by construction. No fitted parameters are relabeled as predictions, and no load-bearing self-citations or uniqueness theorems are invoked. The central claims rest on experimental outcomes rather than any closed logical loop.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The approach rests on standard semi-supervised learning assumptions and sensor data compatibility rather than new physical laws or entities.

free parameters (1)

Hyperparameters of pseudo-label estimator
Tuned parameters in the estimator module that affect label quality.

axioms (2)

domain assumption: Pseudo-labels from the estimator are accurate enough for effective model training.
Invoked in the description of the pseudo-label estimator module.

domain assumption: LiDAR point clouds can be mapped to mmWave equivalents via closed-form conversion while retaining pose features.
Central to the converter module.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

[1] Sizhe An and Umit Y. Ogras. Fast and scalable human pose estimation using mmWave point cloud. Proceedings of the 59th ACM/IEEE Design Automation Conference, pages 889–894, 2022.

[2] Sizhe An, Yin Li, and Umit Ogras. mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors, 2022.

[3] Irene Ballester, Ondřej Peterka, and Martin Kampel. SPiKE: 3D Human Pose from Point Cloud Sequences, 2024.

[4] David Berthelot, Nicholas Carlini, Ian Goodfellow, Avital Oliver, Nicolas Papernot, and Colin Raffel. MixMatch: A holistic approach to semi-supervised learning. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, number 454, pages 5049–5059. Curran Associates Inc., Red Hook, NY, USA, 2019.

[5] Anjun Chen, Xiangyu Wang, Shaohao Zhu, Yanxu Li, Jiming Chen, and Qi Ye. mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar, 2023.

[6] Yuwei Cheng, Jingran Su, Mengxin Jiang, and Yimin Liu. A Novel Radar Point Cloud Generation Method for Robot Environment Perception. IEEE Transactions on Robotics, 38(6):3754–3773, 2022.

[7] Han Cui and Naim Dahnoun. Real-Time Short-Range Human Posture Estimation Using mmWave Radars and Neural Networks. IEEE Sensors Journal, 22(1):535–543, 2022.

[8] Han Cui, Shu Zhong, Jiacheng Wu, Zichao Shen, Naim Dahnoun, and Yiren Zhao. Milipoint: A point cloud dataset for mmwave radar. In Advances in Neural Information Processing Systems, pages 62713–62726, 2023.

[9] Kaikai Deng, Dong Zhao, Qiaoyue Han, Zihan Zhang, Shuyue Wang, Anfu Zhou, and Huadong Ma. Midas: Generating mmWave Radar Data from Videos for Training Pervasive and Privacy-preserving Human Sensing Tasks. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 7(1):9:1–9:26, 2023.

[10] Wen Ding, Zhongping Cao, Jianxiong Zhang, Rihui Chen, Xuemei Guo, and Guoli Wang. Radar-Based 3D Human Skeleton Estimation by Kinematic Constrained Learning. IEEE Sensors Journal, 21(20):23174–23184, 2021.

[11] Hehe Fan, Yi Yang, and Mohan Kankanhalli. Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14199–14208, 2021.

[12] Junqiao Fan, Jianfei Yang, Yuecong Xu, and Lihua Xie. Diffusion Model is a Good Pose Estimator from 3D RF-Vision,

[13] Yuxin Fan, Yong Wang, Hang Zheng, and Zhiguo Shi. Video2mmPoint: Synthesizing mmWave Point Cloud Data From Videos for Gait Recognition. IEEE Sensors Journal, 25(1):773–782, 2025.

[14] Zeyu Han, Junkai Jiang, Xiaokang Ding, Jiahao Wang, Qingwen Meng, Shaobing Xu, Lei He, and Jianqiang Wang. DenserRadar: A 4D Millimeter-Wave Radar Point Cloud Detector Based on Dense LiDAR Point Clouds. In 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), pages 930–936, 2024.

[15] Yuan-Hao Ho, Jen-Hao Cheng, Sheng Yao Kuan, Zhongyu Jiang, Wenhao Chai, Hsiang-Wei Huang, Chih-Lung Lin, and Jenq-Neng Hwang. RT-Pose: A 4D Radar Tensor-Based 3D Human Pose Estimation and Localization Benchmark. In Computer Vision – ECCV 2024, pages 107–125, Cham, Springer Nature Switzerland.

[16] Linzhi Huang, Yulong Li, Hongbo Tian, Yue Yang, Xiangang Li, Weihong Deng, and Jieping Ye. Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 693–703, 2023.

[17] Catalin Ionescu, Dragos Papava, Vlad Olaru, and Cristian Sminchisescu. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(7):1325–1339, 2014.

[18] S. Laine and Timo Aila. Temporal Ensembling for Semi-Supervised Learning. ArXiv, 2016.

[19] Dong-Hyun Lee. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. 2013.

[20] Shih-Po Lee, Niraj Prakash Kini, Wen-Hsiao Peng, Ching-Wen Ma, and Jenq-Neng Hwang. HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 5704–5713, 2023.

[21] Jialian Li, Jingyi Zhang, Zhiyong Wang, Siqi Shen, Chenglu Wen, Yuexin Ma, Lan Xu, Jingyi Yu, and Cheng Wang. LiDARCap: Long-range Markerless 3D Human Motion Capture with LiDAR Point Clouds. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20470–20480, 2022.

[22] Yitai Lin, Zhijie Wei, Wanfa Zhang, Xiping Lin, Yudi Dai, Chenglu Wen, Siqi Shen, Lan Xu, and Cheng Wang. HmPEAR: A Dataset for Human Pose Estimation and Action Recognition. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 2069–2078, New York, NY, USA, 2024. Association for Computing Machinery.

[23] I. Loshchilov and F. Hutter. Decoupled Weight Decay Regularization. In International Conference on Learning Representations, 2017.

[24] Ilya Loshchilov and Frank Hutter. SGDR: Stochastic Gradient Descent with Warm Restarts, 2017.

[25] Kai Luan, Chenghao Shi, Neng Wang, Yuwei Cheng, Huimin Lu, and Xieyuanli Chen. Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11171–11177, 2024.

[26] Dario Pavllo, Christoph Feichtenhofer, David Grangier, and Michael Auli. 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7745–7754, 2019.

[27] Akarsh Prabhakara, Tao Jin, Arnav Das, Gantavya Bhatt, Lilly Kumari, Elahe Soltanaghai, Jeff Bilmes, Swarun Kumar, and Anthony Rowe. High Resolution Point Clouds from mmWave Radar. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 4135–4142, 2023.

[28] Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He. Data Distillation: Towards Omni-Supervised Learning. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4119–4128, 2018.

[29] Yiming Ren, Xiao Han, Chengfeng Zhao, Jingya Wang, Lan Xu, Jingyi Yu, and Yuexin Ma. LiveHPS: LiDAR-Based Scene-Level Human Pose and Shape Estimation in Free Environment. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1281–1291,

[30] Mehdi Sajjadi, Mehran Javanmardi, and Tolga Tasdizen. Regularization with stochastic transformations and perturbations for deep semi-supervised learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, pages 1171–1179, Red Hook, NY, USA, Curran Associates Inc.

[31] Arindam Sengupta and Siyang Cao. mmPose-NLP: A Natural Language Processing Approach to Precise Skeletal Pose Estimation Using mmWave Radars. IEEE Transactions on Neural Networks and Learning Systems, 34(11):8418–8429,

[32] Akash Deep Singh, Sandeep Singh Sandha, Luis Garcia, and Mani Srivastava. RadHAR: Human Activity Recognition from Point Clouds Generated through a Millimeter-wave Radar. Proceedings of the 3rd ACM Workshop on Millimeter-wave Networks and Sensing Systems, pages 51–56, 2019.

[33] Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, and Colin Raffel. FixMatch: Simplifying semi-supervised learning with consistency and confidence. In Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 596–608, Red Hook, NY, USA, 2020. Cur...

[34] Antti Tarvainen and Harri Valpola. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 1195–1204, Red Hook, NY, USA, Curran Associates Inc.

[35] Qizhe Xie, Minh-Thang Luong, Eduard Hovy, and Quoc V. Le. Self-Training With Noisy Student Improves ImageNet Classification. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2020.

[36] Rongchang Xie, Chunyu Wang, Wenjun Zeng, and Yizhou Wang. An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 11220–11229, 2021.

[37] Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, and Lihua Xie. MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing, 2023.

[38] Peijun Zhao, Chris Xiaoxuan Lu, Bing Wang, Niki Trigoni, and Andrew Markham. CubeLearn: End-to-End Learning for Human Motion Recognition From Raw mmWave Radar Signals. IEEE Internet of Things Journal, 10(12):10236–10249, 2023.

[39] Ce Zheng, Sijie Zhu, Matias Mendieta, Taojiannan Yang, Chen Chen, and Zhengming Ding. 3D Human Pose Estimation with Spatial and Temporal Transformers. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 11636–11645, 2021.

[40] Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, and Wei Xiang. ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion, 2024.

Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets (Supplementary Material)

Table 9. The entire...

9. More Implementation Details

This section presents more implementation details of our proposed EMDUL and a comparison scheme adapted for mmWave HPE.

9.1. PC Conversion Pipeline

We specify the parameters used in our PC conversion pipeline in Tab. 9. The parameters are chosen based on empirical results on the validation set. υ is re-sampled per instance.

9.2. ...

10. More Experimental Results

In this section, we show more quantitative and visualization results for EMDUL.

Figure 7. More visualization results of point cloud conversion. Left: original LiDAR PCs. Right: converted mmWave PCs. Joints with high flow values are yellow, while those with low flow values are blue.

10.1. Complete Ablation Study on PC Conversion W...

11. Limitation and Future Work

While EMDUL significantly improves performance by expanding mmWave datasets with unlabeled data and LiDAR datasets, it has certain limitations that pave the way for future research. First, the PC conversion pipeline relies on empirical parameter settings, which may not be optimal for all scenarios. Future work could explore ...

[1] [1]

Sizhe An and Umit Y . Ogras. Fast and scalable human pose estimation using mmWave point cloud.Proceedings of the 59th ACM/IEEE Design Automation Conference, pages 889– 894, 2022. 2

work page 2022

[2] [2]

mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors, 2022

Sizhe An, Yin Li, and Umit Ogras. mRI: Multi-modal 3D Human Pose Estimation Dataset using mmWave, RGB-D, and Inertial Sensors, 2022. 2

work page 2022

[3] [3]

SPiKE: 3D Human Pose from Point Cloud Sequences, 2024

Irene Ballester, Ond ˇrej Peterka, and Martin Kampel. SPiKE: 3D Human Pose from Point Cloud Sequences, 2024. 5, 6

work page 2024

[4] [4]

MixMatch: A holistic approach to semi-supervised learning

David Berthelot, Nicholas Carlini, Ian Goodfellow, Avital Oliver, Nicolas Papernot, and Colin Raffel. MixMatch: A holistic approach to semi-supervised learning. InProceed- ings of the 33rd International Conference on Neural Infor- mation Processing Systems, number 454, pages 5049–5059. Curran Associates Inc., Red Hook, NY , USA, 2019. 3

work page 2019

[5] [5]

mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar, 2023

Anjun Chen, Xiangyu Wang, Shaohao Zhu, Yanxu Li, Jim- ing Chen, and Qi Ye. mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar, 2023. 2, 5

work page 2023

[6] [6]

A Novel Radar Point Cloud Generation Method for Robot Environment Perception.IEEE Transactions on Robotics, 38(6):3754–3773, 2022

Yuwei Cheng, Jingran Su, Mengxin Jiang, and Yimin Liu. A Novel Radar Point Cloud Generation Method for Robot Environment Perception.IEEE Transactions on Robotics, 38(6):3754–3773, 2022. 2, 3

work page 2022

[7] [7]

Real-Time Short-Range Hu- man Posture Estimation Using mmWave Radars and Neural Networks.IEEE Sensors Journal, 22(1):535–543, 2022

Han Cui and Naim Dahnoun. Real-Time Short-Range Hu- man Posture Estimation Using mmWave Radars and Neural Networks.IEEE Sensors Journal, 22(1):535–543, 2022. 2

work page 2022

[8] [8]

Milipoint: A point cloud dataset for mmwave radar

Han Cui, Shu Zhong, Jiacheng Wu, Zichao Shen, Naim Dah- noun, and Yiren Zhao. Milipoint: A point cloud dataset for mmwave radar. InAdvances in Neural Information Process- ing Systems, pages 62713–62726, 2023. 2

work page 2023

[9] [9]

Midas: Gen- erating mmWave Radar Data from Videos for Training Per- vasive and Privacy-preserving Human Sensing Tasks.Proc

Kaikai Deng, Dong Zhao, Qiaoyue Han, Zihan Zhang, Shuyue Wang, Anfu Zhou, and Huadong Ma. Midas: Gen- erating mmWave Radar Data from Videos for Training Per- vasive and Privacy-preserving Human Sensing Tasks.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 7(1): 9:1–9:26, 2023. 2

work page 2023

[10] [10]

Radar-Based 3D Human Skeleton Estimation by Kinematic Constrained Learning

Wen Ding, Zhongping Cao, Jianxiong Zhang, Rihui Chen, Xuemei Guo, and Guoli Wang. Radar-Based 3D Human Skeleton Estimation by Kinematic Constrained Learning. IEEE Sensors Journal, 21(20):23174–23184, 2021. 2

work page 2021

[11] [11]

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Hehe Fan, Yi Yang, and Mohan Kankanhalli. Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos. In2021 IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 14199– 14208, 2021. 1, 3, 5, 6, 7, 8

work page 2021

[12] [12]

Dif- fusion Model is a Good Pose Estimator from 3D RF-Vision,

Junqiao Fan, Jianfei Yang, Yuecong Xu, and Lihua Xie. Dif- fusion Model is a Good Pose Estimator from 3D RF-Vision,

work page

[13] [13]

Video2mmPoint: Synthesizing mmWave Point Cloud Data From Videos for Gait Recognition.IEEE Sensors Journal, 25(1):773–782, 2025

Yuxin Fan, Yong Wang, Hang Zheng, and Zhiguo Shi. Video2mmPoint: Synthesizing mmWave Point Cloud Data From Videos for Gait Recognition.IEEE Sensors Journal, 25(1):773–782, 2025. 2

work page 2025

[14] Zeyu Han, Junkai Jiang, Xiaokang Ding, Jiahao Wang, Qingwen Meng, Shaobing Xu, Lei He, and Jianqiang Wang. DenserRadar: A 4D Millimeter-Wave Radar Point Cloud Detector Based on Dense LiDAR Point Clouds. In 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), pages 930–936, 2024. 2, 3

[15] Yuan-Hao Ho, Jen-Hao Cheng, Sheng Yao Kuan, Zhongyu Jiang, Wenhao Chai, Hsiang-Wei Huang, Chih-Lung Lin, and Jenq-Neng Hwang. RT-Pose: A 4D Radar Tensor-Based 3D Human Pose Estimation and Localization Benchmark. In Computer Vision – ECCV 2024, pages 107–125, Cham, Springer Nature Switzerland. 2

[17] Linzhi Huang, Yulong Li, Hongbo Tian, Yue Yang, Xiangang Li, Weihong Deng, and Jieping Ye. Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 693–703, 2023. 3

[18] Catalin Ionescu, Dragos Papava, Vlad Olaru, and Cristian Sminchisescu. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(7):1325–1339, 2014. 6

[19] S. Laine and Timo Aila. Temporal Ensembling for Semi-Supervised Learning. ArXiv, 2016. 3

[20] Dong-Hyun Lee. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. 2013. 3

[21] Shih-Po Lee, Niraj Prakash Kini, Wen-Hsiao Peng, Ching-Wen Ma, and Jenq-Neng Hwang. HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 5704–5713, 2023. 2

[22] Jialian Li, Jingyi Zhang, Zhiyong Wang, Siqi Shen, Chenglu Wen, Yuexin Ma, Lan Xu, Jingyi Yu, and Cheng Wang. LiDARCap: Long-range Markerless 3D Human Motion Capture with LiDAR Point Clouds. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20470–20480, 2022. 2, 5

[23] Yitai Lin, Zhijie Wei, Wanfa Zhang, Xiping Lin, Yudi Dai, Chenglu Wen, Siqi Shen, Lan Xu, and Cheng Wang. HmPEAR: A Dataset for Human Pose Estimation and Action Recognition. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 2069–2078, New York, NY, USA, 2024. Association for Computing Machinery. 2, 5

[24] I. Loshchilov and F. Hutter. Decoupled Weight Decay Regularization. In International Conference on Learning Representations, 2017. 6

[25] Ilya Loshchilov and Frank Hutter. SGDR: Stochastic Gradient Descent with Warm Restarts, 2017. 6

[26] Kai Luan, Chenghao Shi, Neng Wang, Yuwei Cheng, Huimin Lu, and Xieyuanli Chen. Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11171–11177, 2024. 2, 3

[27] Dario Pavllo, Christoph Feichtenhofer, David Grangier, and Michael Auli. 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7745–7754, 2019. 3

[28] Akarsh Prabhakara, Tao Jin, Arnav Das, Gantavya Bhatt, Lilly Kumari, Elahe Soltanaghai, Jeff Bilmes, Swarun Kumar, and Anthony Rowe. High Resolution Point Clouds from mmWave Radar. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 4135–4142, 2023. 2, 3

[29] Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He. Data Distillation: Towards Omni-Supervised Learning. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4119–4128, 2018. 3

[30] Yiming Ren, Xiao Han, Chengfeng Zhao, Jingya Wang, Lan Xu, Jingyi Yu, and Yuexin Ma. LiveHPS: LiDAR-Based Scene-Level Human Pose and Shape Estimation in Free Environment. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1281–1291,

[31] Mehdi Sajjadi, Mehran Javanmardi, and Tolga Tasdizen. Regularization with stochastic transformations and perturbations for deep semi-supervised learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, pages 1171–1179, Red Hook, NY, USA, Curran Associates Inc. 3

[33] Arindam Sengupta and Siyang Cao. mmPose-NLP: A Natural Language Processing Approach to Precise Skeletal Pose Estimation Using mmWave Radars. IEEE Transactions on Neural Networks and Learning Systems, 34(11):8418–8429,

[34] Akash Deep Singh, Sandeep Singh Sandha, Luis Garcia, and Mani Srivastava. RadHAR: Human Activity Recognition from Point Clouds Generated through a Millimeter-wave Radar. Proceedings of the 3rd ACM Workshop on Millimeter-wave Networks and Sensing Systems, pages 51–56, 2019. 2

[35] Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, and Colin Raffel. FixMatch: Simplifying semi-supervised learning with consistency and confidence. In Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 596–608, Red Hook, NY, USA, 2020. Cur...

[36] Antti Tarvainen and Harri Valpola. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 1195–1204, Red Hook, NY, USA, Curran Associates Inc. 3, 6, 1

[38] Qizhe Xie, Minh-Thang Luong, Eduard Hovy, and Quoc V. Le. Self-Training With Noisy Student Improves ImageNet Classification. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2020. 3

[39] Rongchang Xie, Chunyu Wang, Wenjun Zeng, and Yizhou Wang. An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 11220–11229, 2021. 3

[40] Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, and Lihua Xie. MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing, 2023. 2, 5

[41] Peijun Zhao, Chris Xiaoxuan Lu, Bing Wang, Niki Trigoni, and Andrew Markham. CubeLearn: End-to-End Learning for Human Motion Recognition From Raw mmWave Radar Signals. IEEE Internet of Things Journal, 10(12):10236–10249, 2023. 2

[42] Ce Zheng, Sijie Zhu, Matias Mendieta, Taojiannan Yang, Chen Chen, and Zhengming Ding. 3D Human Pose Estimation with Spatial and Temporal Transformers. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 11636–11645, 2021. 6

[43] Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, and Wei Xiang. ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion, 2024. 2

Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets

Supplementary Material

Table 9. The entire...

More Implementation Details

This section presents more implementation details of our proposed EMDUL and a comparison scheme adapted for mmWave HPE.

9.1. PC Conversion Pipeline

We specify the parameters used in our PC conversion pipeline in Tab. 9. The parameters are chosen based on empirical results on the validation set. υ is re-sampled per instance.

9.2. ...

More Experimental Results

In this section, we show more quantitative and visualization results for EMDUL.

Figure 7. More visualization results of point cloud conversion. Left: original LiDAR PCs. Right: converted mmWave PCs. Joints with high flow values are yellow, while those with low flow values are blue.

10.1. Complete Ablation Study on PC Conversion W...

Limitation and Future Work

While EMDUL significantly improves performance by expanding mmWave datasets with unlabeled data and LiDAR datasets, it has certain limitations that pave the way for future research. First, the PC conversion pipeline relies on empirical parameter settings, which may not be optimal for all scenarios. Future work could explore ...