WiFlow: A Lightweight WiFi-based Continuous Human Pose Estimation Network with Spatio-Temporal Feature Decoupling

Haiwei Zhang; Hao Liu; Lankai Zhang; Wenbo Wang; Yi Dao

arxiv: 2602.08661 · v2 · submitted 2026-02-09 · 💻 cs.CV

Pith reviewed 2026-05-16 05:35 UTC · model grok-4.3

keywords: WiFi sensing, human pose estimation, channel state information, encoder-decoder network, lightweight model, spatio-temporal features, axial attention, continuous tracking

The pith

WiFlow estimates continuous human poses from WiFi signals, reporting 97.25 percent PCK@20 on the authors' own dataset with only 2.23 million parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents WiFlow as an encoder-decoder network that turns WiFi channel state information into sequences of human body keypoint positions during ongoing motion. The encoder applies temporal and asymmetric convolutions to keep the signal's sequential order while extracting features, then uses axial attention to link keypoints according to body structure. The decoder converts those features into coordinate outputs. This design targets continuous tracking in settings where cameras are unavailable or undesirable, and the reported results come from training on 360,000 paired CSI-pose samples collected from five people performing eight everyday activities.
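The phrase "keep the signal's sequential order" can be made concrete with a causal convolution, which the Figure 1 caption names as the temporal building block. The following is a minimal pure-Python sketch, not the authors' code: the function name `dilated_causal_conv1d`, the kernel values, and the single-channel input are illustrative assumptions.

```python
def dilated_causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before the start of the sequence are treated as zero, so the
    output has the same length as the input and y[t] never depends on
    any x[s] with s > t (the causal property).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        y.append(acc)
    return y


# Hypothetical single-subcarrier amplitude trace (illustrative values).
trace = [1, 2, 3, 4, 5]
print(dilated_causal_conv1d(trace, [1.0, 1.0], dilation=2))
# → [1.0, 2.0, 4.0, 6.0, 8.0]
```

With a dilation of 2, each output mixes the current sample with the one two steps earlier, so stacking layers with growing dilation widens the temporal window without reordering or looking ahead.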

Core claim

WiFlow employs an encoder-decoder architecture. The encoder captures spatio-temporal features of CSI using temporal and asymmetric convolutions, preserving the original sequential structure of signals. It then refines the keypoint features of the tracked human bodies and captures their structural dependencies via axial attention. The decoder subsequently maps the encoded high-dimensional features into keypoint coordinates. On a self-collected dataset of 360,000 synchronized CSI-pose samples, the model reaches PCK@20 of 97.25 percent, PCK@50 of 99.48 percent, and mean per-joint error of 0.007 meters while using 2.23 million parameters.
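For readers unfamiliar with the two metrics, the sketch below shows how PCK at a threshold and mean per-joint position error are commonly computed. It assumes 2D keypoints and a caller-supplied reference length; the text here does not say which normalization the paper uses for PCK, so `ref_length` is a placeholder, and the sample coordinates are made up.

```python
import math


def pck(pred, gt, ref_length, alpha):
    """Fraction of keypoints whose error is at most alpha * ref_length."""
    hits = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        if math.hypot(px - gx, py - gy) <= alpha * ref_length:
            hits += 1
    return hits / len(gt)


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and true keypoints."""
    errors = [math.hypot(px - gx, py - gy)
              for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(errors) / len(errors)


# Illustrative values only: four keypoints with errors 0, 0.125, 0.25, 1.0.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 0.125), (2.0, 0.25), (3.0, 1.0)]

print(pck(pred, gt, ref_length=1.0, alpha=0.2))  # → 0.5
print(pck(pred, gt, ref_length=1.0, alpha=0.5))  # → 0.75
print(mpjpe(pred, gt))                           # → 0.34375
```

PCK@50 is always at least as high as PCK@20 on the same predictions, because the looser threshold admits every keypoint that passed the tighter one.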

What carries the argument

Encoder-decoder network that decouples spatio-temporal CSI features through temporal and asymmetric convolutions plus axial attention before mapping to keypoint coordinates.
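To illustrate what "axial" means here, the sketch below applies attention along a single axis of a 2D feature grid, so each row is processed on its own. This is a simplified illustration rather than the paper's block: it uses the features directly as queries, keys, and values (no learned projections or positional terms), and the grid layout and the name `attend_along_width` are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_along_width(grid):
    """grid[h][w] is a feature vector; attention mixes positions within a row.

    Rows never exchange information, which is what restricts the
    attention to one axis and keeps the cost below full 2D attention.
    """
    out = []
    for row in grid:
        dim = len(row[0])
        new_row = []
        for query in row:
            # Scaled dot-product scores against every position in this row.
            scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                      for key in row]
            weights = softmax(scores)
            # Weighted average of the row's vectors.
            mixed = [sum(w * vec[i] for w, vec in zip(weights, row))
                     for i in range(dim)]
            new_row.append(mixed)
        out.append(new_row)
    return out


# Hypothetical 2 x 2 grid of 2-dimensional features.
grid = [[[1.0, 0.0], [1.0, 0.0]],
        [[0.0, 2.0], [0.0, 4.0]]]
print(attend_along_width(grid)[0])  # → [[1.0, 0.0], [1.0, 0.0]]
```

Full 2D attention over an H x W grid compares every position with every other one; attending along one axis compares only the W positions within each row, which is one reason the technique suits small models.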

If this is right

Pose estimation becomes feasible on resource-limited IoT devices that already have WiFi radios.
Applications such as fall detection or gesture interfaces can run without line-of-sight or lighting requirements.
Model size stays small enough for on-device inference rather than cloud offload.
The same architecture could serve as a starting point for other CSI-based regression tasks beyond single-person pose.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the axial-attention block successfully encodes body structure, the same block might transfer to multi-person tracking by adding an instance-separation head.
Performance numbers measured on short activity sequences leave open whether drift accumulates over minutes-long continuous motion.
Adding a temporal smoothness term to the current loss could reduce jitter without increasing parameter count.
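One way to read the smoothness suggestion is as a penalty on frame-to-frame keypoint displacement combined with an ordinary position loss. The sketch below is an assumption about what such a term might look like, not something taken from the paper: the function names, the first-order difference, and the weight of 0.1 are all hypothetical.

```python
def position_mse(pred_seq, target_seq):
    """Mean squared keypoint error over all frames and joints."""
    total, count = 0.0, 0
    for pred, target in zip(pred_seq, target_seq):
        for (px, py), (tx, ty) in zip(pred, target):
            total += (px - tx) ** 2 + (py - ty) ** 2
            count += 1
    return total / count


def temporal_smoothness(pred_seq):
    """Mean squared displacement of each joint between consecutive frames."""
    if len(pred_seq) < 2:
        return 0.0
    total, count = 0.0, 0
    for prev, curr in zip(pred_seq, pred_seq[1:]):
        for (px, py), (cx, cy) in zip(prev, curr):
            total += (cx - px) ** 2 + (cy - py) ** 2
            count += 1
    return total / count


def combined_loss(pred_seq, target_seq, weight=0.1):
    """Position error plus a weighted smoothness penalty (weight is a guess)."""
    return position_mse(pred_seq, target_seq) + weight * temporal_smoothness(pred_seq)


# One joint over three frames: it moves by 1 unit, then by 2 units.
frames = [[(0.0, 0.0)], [(1.0, 0.0)], [(1.0, 2.0)]]
print(temporal_smoothness(frames))  # → 2.5
```

The penalty adds no trainable parameters, which is why it leaves the model size untouched. Used alone it would be minimized by a frozen pose, so it only makes sense alongside a position term.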

Load-bearing premise

Data from five subjects performing eight scripted activities in controlled indoor sequences is representative of new users, rooms, and longer unscripted motions.

What would settle it

A drop below 80 percent PCK@20 when the trained model is tested on recordings from entirely new subjects or in a different physical environment would show that the performance does not hold outside the original collection conditions.

Figures

Figures reproduced from arXiv: 2602.08661 by Haiwei Zhang, Hao Liu, Lankai Zhang, Wenbo Wang, Yi Dao.

**Figure 1.** WiFlow network architecture diagram in three stages: the first stage uses dilated causal convolution in the TCN module to extract temporal features and screen subcarriers; the second stage employs asymmetric residual blocks to extract spatial features, compressing the subcarrier dimension to the keypoint number; the third stage introduces axial attention to reinforce key features along the width direction …

**Figure 2.** Experimental environment layout demonstration.

**Figure 3.** Visual comparison of WiFi-based human pose estimation for eight daily actions. (Top row) Raw WiFi CSI amplitude heatmaps showing the temporal …

Original abstract

Human pose estimation is fundamental to intelligent perception in the Internet of Things (IoT), enabling applications ranging from smart healthcare to human-computer interaction. While WiFi-based methods have gained traction, they often struggle with continuous motion and high computational overhead. This work presents WiFlow, a novel framework for continuous human pose estimation using WiFi signals. Unlike vision-based approaches such as two-dimensional deep residual networks that treat Channel State Information (CSI) as images, WiFlow employs an encoder-decoder architecture. The encoder captures spatio-temporal features of CSI using temporal and asymmetric convolutions, preserving the original sequential structure of signals. It then refines keypoint features of human bodies to be tracked and captures their structural dependencies via axial attention. The decoder subsequently maps the encoded high-dimensional features into keypoint coordinates. Trained on a self-collected dataset of 360,000 synchronized CSI-pose samples from 5 subjects performing continuous sequences of 8 daily activities, WiFlow achieves a Percentage of Correct Keypoints (PCK) of 97.25% at a threshold of 20% (PCK@20) and 99.48% at PCK@50, with a mean per-joint position error of 0.007 m. With only 2.23M parameters, WiFlow significantly reduces model complexity and computational cost, establishing a new performance baseline for practical WiFi-based human pose estimation. Our code and datasets are available at https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

WiFlow gives a clean encoder-decoder for WiFi CSI pose estimation with low parameter count and open code, but the headline numbers rest on a five-subject dataset whose splits are not described.

The paper's real contribution is the specific encoder-decoder that keeps CSI's sequential structure with temporal and asymmetric convolutions before applying axial attention on keypoints. That is a clear step past the common trick of treating CSI matrices as images. The model stays small at 2.23M parameters and the authors release both code and the 360k-sample dataset, which makes the work usable for follow-up even if the numbers need checking. On their data they report 97.25% PCK@20 and 0.007 m mean joint error, which is the strongest quantitative result shown so far for this exact task. Those are the parts worth taking seriously.

The soft spot is the evaluation. Five subjects performing eight scripted activities in controlled sequences is a narrow base. WiFi CSI carries strong subject-specific signatures, so without an explicit leave-one-subject-out protocol or any cross-environment test the high scores could partly reflect memorization of the training people rather than transferable pose features. The abstract gives no detail on how the train/test split was made, which leaves the central generalization claim unverified.

This is the kind of paper that belongs in a reading group focused on wireless sensing or lightweight IoT perception. Readers who already work with CSI will get immediate value from the architecture and the released artifacts. It is solid enough on the technical side and open enough on the data side to deserve a full referee process rather than a desk reject, though any review should press hard on the subject-independent splits and error analysis.

Referee Report

1 major / 2 minor

Summary. The paper introduces WiFlow, a lightweight encoder-decoder architecture for continuous human pose estimation from WiFi CSI signals. The encoder applies temporal and asymmetric convolutions to extract spatio-temporal features while preserving sequential structure, uses axial attention to capture keypoint structural dependencies, and the decoder regresses keypoint coordinates. Trained on a self-collected dataset of 360,000 synchronized CSI-pose samples from 5 subjects performing 8 daily activities, WiFlow reports PCK@20 of 97.25%, PCK@50 of 99.48%, mean per-joint position error of 0.007 m, and 2.23M parameters, with code and data released.

Significance. If the reported performance holds under subject-independent evaluation, the work would establish a practical, low-complexity baseline for WiFi-based pose estimation in IoT settings. The release of code and the 360k-sample dataset is a clear strength that supports reproducibility and future comparisons.

major comments (1)

[Experimental Evaluation] The manuscript does not specify the train/test split protocol (e.g., whether it is subject-disjoint or uses leave-one-subject-out). With only 5 subjects and CSI signals known to encode subject-specific body geometry and multipath signatures, this detail is load-bearing for the central claim that the 2.23M-parameter model provides a practical baseline for new users and environments (see abstract results paragraph).

minor comments (2)

[Abstract] Add explicit details on the training procedure, hyperparameter selection, data augmentation, and per-activity error breakdown to allow full assessment of the PCK and mean joint error metrics.
[Method] Clarify the precise kernel sizes, strides, and channel dimensions of the asymmetric convolutions and the axial attention implementation for reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the importance of clearly specifying the train/test split protocol. We address this point directly below and will update the manuscript accordingly.

Referee: [Experimental Evaluation] The manuscript does not specify the train/test split protocol (e.g., whether it is subject-disjoint or uses leave-one-subject-out). With only 5 subjects and CSI signals known to encode subject-specific body geometry and multipath signatures, this detail is load-bearing for the central claim that the 2.23M-parameter model provides a practical baseline for new users and environments (see abstract results paragraph).

Authors: We agree that the train/test split protocol is critical to substantiate the generalizability claims, particularly given the subject-specific nature of CSI signals. Our experiments followed a leave-one-subject-out (LOSO) cross-validation protocol: the model was trained on synchronized CSI-pose samples from 4 subjects and evaluated on the held-out fifth subject, with the procedure repeated across all 5 subjects and results averaged. This subject-disjoint split was used to simulate performance for new users. We will revise the manuscript to explicitly describe this protocol, including the partitioning of the 360,000 samples and the averaging procedure, in the Experimental Setup and Evaluation sections.

revision: yes
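The leave-one-subject-out protocol described in the simulated rebuttal can be sketched as follows. Note that the rebuttal is machine-generated, so this shows what such a split looks like in general, not a confirmed description of the paper's experiments; the subject labels, function names, and per-fold scores are hypothetical.

```python
def loso_folds(subject_ids):
    """Yield (held_out, train_indices, test_indices) once per distinct subject.

    subject_ids[i] is the subject who produced sample i. Every sample from
    the held-out subject goes to the test set and none to training.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def mean_over_folds(scores):
    """Average a per-fold metric (for example PCK@20) across folds."""
    return sum(scores) / len(scores)


# Toy sample-to-subject assignment with five subjects.
subject_ids = ["S1", "S1", "S2", "S3", "S4", "S5", "S5"]
for held_out, train_idx, test_idx in loso_folds(subject_ids):
    print(held_out, train_idx, test_idx)
# First line printed: S1 [2, 3, 4, 5, 6] [0, 1]
```

The key property is that no subject appears on both sides of a fold, which is what separates this protocol from a random split over samples.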

Circularity Check

0 steps flagged

No circularity in architecture or performance claims

The paper describes a standard encoder-decoder neural network for CSI-to-pose regression using temporal/asymmetric convolutions and axial attention, trained end-to-end on a self-collected dataset. Reported PCK scores and joint error are empirical results on held-out test samples rather than quantities defined by or reduced to fitted parameters, self-referential equations, or load-bearing self-citations. No derivation chain exists that collapses to its inputs by construction; the model and evaluation protocol are self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on empirical training of a neural network whose weights are learned from the collected CSI-pose pairs; it introduces no new physical entities and no unstated mathematical axioms beyond standard supervised learning assumptions.

free parameters (1)

network architecture hyperparameters
Choices such as convolution kernel sizes, attention dimensions, and layer counts are selected by the authors and tuned to achieve the reported accuracy; unlike the network weights, they are not learned during training.

axioms (1)

domain assumption: WiFi CSI signals contain sufficient spatio-temporal information to reconstruct human joint positions
Invoked by the design of the encoder that processes CSI directly for keypoint regression.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 2 internal anchors

[1] M. Xu, L. Guo, and H.-C. Wu, “Robust abnormal human-posture recognition using openpose and multiview cross-information,” IEEE Sensors Journal, vol. 23, no. 11, pp. 12 370–12 379, 2023

[2] M. Kotaru and S. Katti, “Position tracking for virtual reality using commodity wifi,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 68–78

[3] Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, and Y. Sheikh, “Openpose: Realtime multi-person 2d pose estimation using part affinity fields,” IEEE transactions on pattern analysis and machine intelligence, vol. 43, no. 1, pp. 172–186, 2019

[4] F. Huang, A. Zeng, M. Liu, Q. Lai, and Q. Xu, “Deepfuse: An imu-aware network for real-time 3d human pose estimation from multi-view image,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2020, pp. 429–438

[5] D. Yi, H. Zhang, S. Feng, J. Fang, and W. Wang, “Probsparse attention with stacked group convolution for wireless signal-based human activity recognition,” in 2024 16th International Conference on Wireless Communications and Signal Processing (WCSP). IEEE, 2024, pp. 1349–1354

[6] F. Luo, S. Khan, B. Jiang, and K. Wu, “Vision transformers for human activity recognition using wifi channel state information,” IEEE Internet of Things Journal, vol. 11, no. 17, pp. 28 111–28 122, 2024

[7] D. Fan, X. Yang, N. Zhao, L. Guan, M. M. Arslan, M. Ullah, M. A. Imran, and Q. H. Abbasi, “A contactless breathing pattern recognition system using deep learning and wifi signal,” IEEE Internet of Things Journal, vol. 11, no. 13, pp. 23 820–23 834, 2024

[8] A. Alzaabi, I. Saied, and T. Arslan, “Design and evaluation of volunteer user trials of unobtrusive vital signs monitoring for older people in care using wi-fi csi sensing,” IEEE Journal of Translational Engineering in Health and Medicine, 2025

[9] H. Yan, et al., “Wi-SFDAGR: Wifi-based cross-domain gesture recognition via source-free domain adaptation,” IEEE Internet of Things Journal, 2025

[10] S.-H. Jeong, K. S. Shin, J. Park, S. Jo, and Y.-J. Suh, “Ubigest: Smartphone-based ubiquitous gesture recognition with wi-fi,” IEEE Internet of Things Journal, 2024

[11] F. Wang, S. Panev, Z. Dai, J. Han, and D. Huang, “Can WiFi estimate person pose?” arXiv preprint arXiv:1904.00277, 2019

[12] Y. Wang, L. Guo, Z. Lu, X. Wen, S. Zhou, and W. Meng, “From point to space: 3D moving human pose estimation using commodity WiFi,” IEEE Communications Letters, vol. 25, no. 7, pp. 2235–2239, 2021

[13] J. Yang, Y. Zhou, H. Huang, H. Zou, and L. Xie, “MetaFi: Device-free pose estimation via commodity WiFi for metaverse avatar simulation,” in 2022 IEEE 8th World Forum on Internet of Things (WF-IoT). IEEE, 2022, pp. 1–6

[14] Y. Zhou, A. Zhu, C. Xu, F. Hu, and Y. Li, “PerUnet: Deep signal channel attention in unet for wifi-based human pose estimation,” IEEE Sensors Journal, vol. 22, no. 20, pp. 19 750–19 760, 2022

[15] Y. Zhou, H. Huang, S. Yuan, H. Zou, L. Xie, and J. Yang, “MetaFi++: WiFi-enabled transformer-based human pose estimation for metaverse avatar simulation,” IEEE Internet of Things Journal, vol. 10, no. 16, pp. 14 128–14 136, 2023

[16] W. Jiang, et al., “Towards 3D human pose construction using WiFi,” in Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, 2020, pp. 1–14

[17] Y. Zhou, C. Xu, L. Zhao, A. Zhu, F. Hu, and Y. Li, “CSI-former: Pay more attention to pose estimation with WiFi,” Entropy, vol. 25, no. 1, p. 20, 2022

[18] S. Bai, J. Z. Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,” arXiv preprint arXiv:1803.01271, 2018

[19] H. Wang, et al., “Axial-deeplab: Stand-alone axial-attention for panoptic segmentation,” in European conference on computer vision. Springer, 2020, pp. 108–126

[20] D. Halperin, W. Hu, A. Sheth, and D. Wetherall, “Tool release: Gathering 802.11n traces with channel state information,” ACM SIGCOMM computer communication review, vol. 41, no. 1, pp. 53–53, 2011

[21] F. Miao, Y. Huang, Z. Lu, T. Ohtsuki, G. Gui, and H. Sari, “Wi-Fi sensing techniques for human activity recognition: Brief survey, potential challenges, and research directions,” ACM Computing Surveys, vol. 57, no. 5, pp. 1–30, 2025

[22] J. Yang, et al., “Mm-Fi: Multi-modal non-intrusive 4D human dataset for versatile wireless sensing,” Advances in Neural Information Processing Systems, vol. 36, pp. 18 756–18 768, 2023

[23] T. D. Gian, T. Dac Lai, T. Van Luong, K.-S. Wong, and V.-D. Nguyen, “HPE-Li: WiFi-enabled lightweight dual selective kernel convolution for human pose estimation,” in Computer Vision – ECCV 2024. Cham: Springer Nature Switzerland, 2025, pp. 93–111
