arxiv: 2606.23771 · v1 · pith:QXYJQZLGnew · submitted 2026-06-22 · 📡 eess.SP · cs.AI

Integrated Sensing and Communications for Real-time Avatar Control in XR over 5G

Pith reviewed 2026-06-26 06:35 UTC · model grok-4.3

classification 📡 eess.SP cs.AI
keywords 5G · ISAC · XR · gesture recognition · power-per-beam-pair · sEMG · avatar control · mmWave

The pith

5G mmWave signals via power-per-beam-pair combined with sEMG enable multi-scale gesture recognition for real-time XR avatar control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that standard 5G NR beam management procedures yield power-per-beam-pair values sufficient to classify coarse body-level gestures and poses. These radio-derived measurements achieve 82.2 ± 5.9% average accuracy when tested on users excluded from training. Surface electromyography sensors attached to the forearm supply the complementary fine-grained finger gesture data that radio sensing cannot resolve. The resulting multimodal system supports full-body avatar mapping in XR without cameras, controllers, or line-of-sight constraints while using the same 5G signals already required for content delivery.

Core claim

Power-per-beam-pair values computed from unmodified 5G NR beam sweeping or beam management contain enough information to derive usable coarse body-level gestures and poses in real time; when fused with sEMG forearm signals for finger-level actions, the combination produces a complete multi-scale gesture recognition pipeline that drives real-time avatar control in XR.

What carries the argument

Power-per-beam-pair (PPBP) extracted from 5G mmWave beam management procedures, fused with surface electromyography (sEMG) signals for body-to-finger scale coverage.
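
A minimal sketch of what a PPBP feature could look like in practice. The paper's exact extraction step is not reproduced here, so this assumes PPBP is simply the mean received power for each (Tx beam, Rx beam) combination visited during a sweep, expressed in dB relative to an arbitrary reference. The function names, the nested-list sample layout, and the -120 dB floor are all hypothetical choices made for illustration.

```python
import math

def ppbp_matrix(samples, floor_db=-120.0):
    """Mean received power per (Tx beam, Rx beam) pair, in dB.

    samples[t][r] is a list of complex baseband samples captured while
    Tx beam t and Rx beam r were active (hypothetical layout).
    """
    matrix = []
    for per_tx in samples:
        row = []
        for iq in per_tx:
            # Average |s|^2 over the samples captured for this beam pair.
            power = sum(abs(s) ** 2 for s in iq) / len(iq) if iq else 0.0
            # Clamp empty or all-zero captures to a floor instead of -inf.
            row.append(10.0 * math.log10(power) if power > 0.0 else floor_db)
        matrix.append(row)
    return matrix

def flatten(matrix):
    """Row-major feature vector for a downstream classifier."""
    return [value for row in matrix for value in row]

# Toy capture: 2 Tx beams x 2 Rx beams (made-up values, not paper data).
capture = [
    [[1 + 0j, 0 + 1j], [0j]],
    [[], [2 + 0j]],
]
features = flatten(ppbp_matrix(capture))
```

In a real deployment the per-pair power would more plausibly come from the receiver's existing beam measurement reports than from raw samples; the sketch only shows the shape of the resulting feature vector, one scalar per beam pair.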

If this is right

  • XR avatar control can reuse existing 5G infrastructure for body sensing without dedicated radar hardware.
  • Body-level recognition generalizes to unseen users at 82.2 ± 5.9% accuracy.
  • Fine finger interactions remain the responsibility of lightweight sEMG sensors.
  • The same millimeter-wave link simultaneously delivers high-rate XR content and extracts pose information.
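
The unseen-user figure is reported as a mean with a spread, which suggests a per-user hold-out protocol. The sketch below shows one way such a number could be produced, assuming leave-one-user-out folds and a sample standard deviation across folds; neither detail is confirmed by the abstract. The nearest-centroid classifier is a stand-in chosen for brevity, not the paper's model, and the toy data is invented.

```python
import statistics
from collections import defaultdict

def fit_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> mean vector."""
    sums, counts = {}, defaultdict(int)
    for x, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [a + b for a, b in zip(sums[label], x)]
        counts[label] += 1
    return {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}

def predict(centroids, x):
    """Label of the centroid closest to x in squared Euclidean distance."""
    def dist(lab):
        return sum((a - b) ** 2 for a, b in zip(centroids[lab], x))
    return min(centroids, key=dist)

def leave_one_user_out(data):
    """data: user -> list of (feature_vector, label). Returns user -> accuracy."""
    accuracies = {}
    for held_out, test in data.items():
        # Train on every user except the held-out one.
        train = [s for user, rows in data.items() if user != held_out for s in rows]
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, x) == y for x, y in test)
        accuracies[held_out] = correct / len(test)
    return accuracies

def summarize(accuracies):
    """Mean and sample standard deviation across held-out users."""
    values = list(accuracies.values())
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return statistics.mean(values), spread

# Invented toy data: three users, two gesture classes.
toy = {
    "u1": [([0.0, 0.0], "idle"), ([1.0, 1.0], "wave")],
    "u2": [([0.1, 0.0], "idle"), ([0.9, 1.1], "wave")],
    "u3": [([0.0, 0.2], "idle"), ([0.2, 0.1], "wave")],
}
per_user = leave_one_user_out(toy)
mean_acc, std_acc = summarize(per_user)
```

Reporting the per-user accuracies alongside the summary makes it visible when a single outlier user drives the spread, which is what happens with "u3" in the toy data.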

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Deployment cost drops because no additional spectrum or hardware is required beyond standard 5G NR operation.
  • Latency for avatar updates could be reduced by performing PPBP feature extraction at the base station rather than the device.
  • The approach may extend to dynamic environments if beam management procedures are adapted for higher update rates.

Load-bearing premise

Power-per-beam-pair values from standard 5G beam management contain enough information to classify usable coarse body gestures and poses in real time.

What would settle it

A test showing that PPBP-based classification accuracy falls below 70 percent or cannot run in real time when evaluated on diverse users performing natural XR movements in typical indoor settings.
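
The criterion above can be written down as a simple check, sketched here under stated assumptions. The 70% accuracy threshold comes from the text; the 50 ms latency budget and the use of the 95th-percentile latency as the real-time measure are hypothetical placeholders, since the page does not define what "real time" means for this system.

```python
import math
import statistics

def evaluate_claim(accuracies, latencies_ms, min_accuracy=0.70, budget_ms=50.0):
    """Check per-user accuracies and per-inference latencies against thresholds.

    budget_ms is an assumed real-time budget, not a value from the paper.
    """
    mean_accuracy = statistics.mean(accuracies)
    # Nearest-rank 95th percentile of the observed latencies.
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    p95_latency = ordered[rank - 1]
    accuracy_ok = mean_accuracy >= min_accuracy
    realtime_ok = p95_latency <= budget_ms
    return {
        "mean_accuracy": mean_accuracy,
        "p95_latency_ms": p95_latency,
        "accuracy_ok": accuracy_ok,
        "realtime_ok": realtime_ok,
        "claim_survives": accuracy_ok and realtime_ok,
    }

# Invented measurements for illustration only.
passing = evaluate_claim([0.8, 0.9, 0.7], [10.0, 20.0, 30.0, 40.0])
failing = evaluate_claim([0.6, 0.7], [10.0, 80.0])
```

A tail percentile is used rather than the mean latency because occasional slow inferences are what a user perceives as avatar lag.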

Figures

Figures reproduced from arXiv: 2606.23771 by Javad Sameri, Jeroen Famaey, Maria Torres Vega, Nabeel Nisar Bhat, Rafael Berkvens, Rreze Halili.

Figure 1: 5G ISAC system architecture for immersive XR: SSB beam sweeping
Figure 2: Confusion matrix for the in-domain setting averaged across 8 users.
Figure 4: Activity classification by means of the sEMG collected in the

Original abstract

Extended Reality (XR) presents a challenging use case for 5G and 6G networks, requiring high data rates and low-latency communication to deliver a truly immersive experience. Moreover, in order to seamlessly translate physical actions to the virtual world, accurate gesture recognition and pose estimation are required. Current XR interaction solutions based on handheld controllers and cameras cannot easily capture full-body poses, inhibit the free use of hands, and require good visibility and a clear line of sight. In this work, we propose a multimodal sensing architecture for XR that combines 5G millimeter-wave (mmWave) integrated sensing and communication (ISAC) and surface electromyography (sEMG) signals. 5G mmWave ISAC can not only be used to deliver content wirelessly to the head-mounted display (HMD), but the same communication signals can also be used to derive coarse body-level gestures and poses of the user, to support real-time avatar control. For fine-grained finger-level gestures, our architecture leverages lightweight sEMG sensors that capture forearm muscle activity. To illustrate the need for both modalities, we present evaluations of both sensing technologies. At the body level (5G), our architecture relies on power-per-beam-pair (PPBP), which can be computed from standard beam management or beam sweeping procedures of the 5G NR standard. PPBP-based sensing achieves 82.2 ± 5.9% average accuracy when evaluated on users not seen during training. For fine-grained finger-level interactions, we show that surface electromyography (sEMG) carries strong discriminative information, achieving consistent promising performance across different movement settings. Thus, combining the two modalities enables multi-scale gesture recognition, at the body level via existing 5G signals and at the finger level via lightweight sEMG sensors, forming a complete XR framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a multimodal architecture for real-time XR avatar control that combines 5G mmWave ISAC, using power-per-beam-pair (PPBP) values extracted from standard 5G NR beam management procedures for coarse body-level gesture and pose recognition, with lightweight sEMG sensors for fine-grained finger-level gestures. It reports that PPBP-based sensing achieves 82.2±5.9% average accuracy when evaluated on users not seen during training, and that sEMG provides strong discriminative information for finger movements, enabling a complete multi-scale framework.

Significance. If the empirical results hold under full scrutiny, the work could be significant for XR systems by demonstrating that existing 5G communication infrastructure can be repurposed for body sensing without hardware modifications or line-of-sight requirements, complementing sEMG for full-body avatar control. The evaluation on unseen users and reliance on unmodified standard procedures are explicit strengths that support claims of practicality and generalization.

major comments (2)
  1. [Abstract, §4] Abstract and §4 (results): The central claim that PPBP from unmodified 5G NR beam sweeping suffices for usable coarse body gestures rests on the reported 82.2±5.9% accuracy on unseen users, yet the manuscript provides insufficient detail on the exact feature extraction from PPBP, the classifier architecture, training procedure, number of users/classes, or any baseline comparisons (e.g., against other RF or vision methods); these omissions are load-bearing because they prevent assessment of whether the accuracy genuinely supports real-time XR usability.
  2. [§3] §3 (methods): The description of how PPBP is computed from standard beam management must explicitly confirm that no modifications to the 5G NR procedures are introduced, as any custom processing would undermine the claim of leveraging unmodified signals; without this verification or pseudocode, the 'parameter-free' or infrastructure-free aspect of the sensing modality cannot be evaluated.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'consistent promising performance' for sEMG is vague; replace with quantitative metrics (e.g., accuracy ranges) to match the precision given for PPBP.
  2. Throughout: Ensure all acronyms (PPBP, sEMG, HMD, XR, ISAC) are defined on first use and that figure captions include enough detail for standalone interpretation of the multimodal pipeline.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential significance of repurposing unmodified 5G infrastructure for XR sensing. We address each major comment below and indicate the revisions that will be made to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract, §4] Abstract and §4 (results): The central claim that PPBP from unmodified 5G NR beam sweeping suffices for usable coarse body gestures rests on the reported 82.2±5.9% accuracy on unseen users, yet the manuscript provides insufficient detail on the exact feature extraction from PPBP, the classifier architecture, training procedure, number of users/classes, or any baseline comparisons (e.g., against other RF or vision methods); these omissions are load-bearing because they prevent assessment of whether the accuracy genuinely supports real-time XR usability.

    Authors: We agree that additional methodological detail is required for readers to fully evaluate the reported accuracy and its implications for real-time XR. In the revised manuscript we will expand §4 (and the abstract if space permits) to specify the PPBP feature extraction process, the classifier architecture, the training procedure including the cross-user evaluation protocol, the number of users and gesture classes, and baseline comparisons against other RF or vision approaches where feasible. These additions will directly address the load-bearing omissions noted.

    Revision: yes

  2. Referee: [§3] §3 (methods): The description of how PPBP is computed from standard beam management must explicitly confirm that no modifications to the 5G NR procedures are introduced, as any custom processing would undermine the claim of leveraging unmodified signals; without this verification or pseudocode, the 'parameter-free' or infrastructure-free aspect of the sensing modality cannot be evaluated.

    Authors: The manuscript already states that PPBP values are obtained from standard 5G NR beam management procedures. To make this explicit and address the concern, we will revise §3 to include a direct confirmation that no modifications to the 5G NR beam sweeping or reporting procedures are performed, together with a short pseudocode description of the extraction step. This will reinforce the parameter-free and infrastructure-free nature of the sensing modality.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper reports empirical classification accuracies from 5G PPBP features and sEMG signals evaluated on held-out users, with no mathematical derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps. All load-bearing claims are direct experimental measurements from unmodified 5G NR beam management procedures and standard ML training/testing splits, which remain independent of the reported results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the empirical performance of an experimental multimodal system rather than new theoretical derivations; it assumes standard wireless channel and signal processing models hold for the sensing task.

axioms (1)
  • Domain assumption: power-per-beam-pair measurements from standard 5G beam management contain sufficient discriminative information for coarse body pose estimation.
    This premise is invoked when the abstract states that PPBP can be computed from beam sweeping procedures and used for gesture recognition.

