
arxiv: 2605.15593 · v1 · pith:AH24LLKGnew · submitted 2026-05-15 · 📡 eess.SP

Dynamic and Open-Set RF Fingerprinting and Localization in Crowded Indoor Environments through Contrastive Channel State Information Learning

Pith reviewed 2026-05-20 19:06 UTC · model grok-4.3

classification 📡 eess.SP
keywords: RF fingerprinting, contrastive learning, channel state information, open-set authentication, indoor localization, ESP32 devices, anomaly detection, CUSUM test

The pith

Contrastive learning on CSI from low-cost ESP32 devices enables device authentication and rejection of unknown transmitters in dynamic crowded indoor settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that a contrastive learning framework called ContraCSI can extract stable device-specific fingerprints from channel state information (CSI) collected by inexpensive ESP32 hardware. A sympathetic reader would care because this offers a physical-layer authentication method that resists spoofing and replay attacks without depending on cryptography or shared keys. The authors compare several encoder architectures and show that Vision Transformer (ViT) variants perform best at identifying known devices, while embeddings from a lightweight 3D CNN, scored with a Geometric Entropy Minimization (GEM) anomaly score and monitored by a sequential CUSUM test, flag and reject transmitters never seen in training. They further demonstrate that the same CSI measurements support indoor localization by trilateration, suggesting potential for combined authentication and positioning in practical deployments. Experiments are conducted in a real, crowded indoor setting with human motion, multipath effects, and varying device positions and orientations.

Core claim

ContraCSI trains encoder backbones to learn joint embeddings of CSI measurements and device IDs such that samples from the same transmitter cluster together. ViT variants deliver the highest closed-set identification accuracy. Lite3D-CNN-Contra embeddings fed into a GEM-based anomaly score followed by a sequential CUSUM test enable reliable rejection of unseen transmitters. The approach maintains high performance under real-world indoor dynamics including human motion, multipath fading, and changes in orientation and distance. The same CSI data additionally permits trilateration-based localization.
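To make the clustering objective concrete, here is a minimal sketch of a supervised contrastive loss over a batch of embeddings. The paper's exact objective is not reproduced on this page, so the loss form, the function names (`l2_normalize`, `supcon_loss`), and the `temperature` value are assumptions chosen for illustration, not the authors' implementation.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(embeddings, device_ids, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    Samples sharing a device ID are positives for each other; every other
    sample in the batch acts as a negative. (Assumed loss form.)
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and device_ids[j] == device_ids[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every non-anchor sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0
```

The loss is small when same-device embeddings point in similar directions and large when they are mixed with other devices, which is the property the core claim relies on.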

What carries the argument

ContraCSI contrastive learning framework that produces device-discriminative embeddings from CSI, supporting both closed-set classification and open-set anomaly detection via GEM scoring and CUSUM testing.
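The open-set half of this machinery can be sketched as follows. This page gives no formulas for the GEM score, so the sketch assumes a common stand-in: the sum of distances from a new embedding to its k nearest enrolled embeddings, accumulated by a one-sided CUSUM recursion. The names `knn_score` and `cusum_alarm`, and the `baseline` and `threshold` parameters, are hypothetical.

```python
import math

def knn_score(x, reference, k=2):
    """Sum of Euclidean distances from x to its k nearest reference embeddings.

    Assumed stand-in for the GEM anomaly score: large values mean x lies
    far from the embeddings of enrolled devices.
    """
    dists = sorted(math.dist(x, r) for r in reference)
    return sum(dists[:k])

def cusum_alarm(scores, baseline, threshold):
    """Return the first index where the CUSUM statistic reaches threshold.

    The statistic accumulates evidence (score - baseline) and resets to
    zero whenever it would go negative. Returns None if no alarm is raised.
    """
    s = 0.0
    for t, score in enumerate(scores):
        s = max(0.0, s + (score - baseline))
        if s >= threshold:
            return t
    return None
```

Because the statistic accumulates over consecutive windows, a single noisy window from an enrolled device is less likely to trigger a rejection than it would be under a per-window threshold, at the cost of some detection delay.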

If this is right

  • ViT-based encoders outperform CNN alternatives for closed-set device identification accuracy.
  • Lite3D-CNN-Contra combined with GEM anomaly scoring and CUSUM testing enables practical rejection of non-enrolled transmitters.
  • High authentication performance holds in crowded indoor environments with motion and fading effects.
  • CSI data collected for authentication can simultaneously support trilateration-based indoor localization.
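For the localization bullet, a minimal 2D trilateration sketch is shown below. It assumes three receivers at known positions and range estimates that have already been derived from CSI; how the paper converts CSI to distance is not described on this page, so the distances are taken as given inputs. The function name `trilaterate` is hypothetical.

```python
def trilaterate(anchors, distances):
    """Estimate (x, y) from three anchor positions and range estimates.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved here with
    Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy ranges the three circles rarely meet at one point, so a practical system would likely use more anchors and a least-squares fit; this sketch covers only the exact three-anchor case.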

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could lower dependence on cryptographic authentication in large-scale IoT deployments where key management is costly.
  • Combining the embeddings with additional sensing modalities might further strengthen robustness when environmental conditions shift rapidly.
  • Deployment in wireless networks could allow real-time detection of rogue devices that were never enrolled.

Load-bearing premise

CSI measurements from low-cost ESP32 devices contain unique and stable device-specific fingerprints that remain distinguishable amid multipath fading, human motion, and varying orientations and distances.

What would settle it

Running the open-set test on a new set of previously unseen transmitters and observing that the GEM anomaly scores for unknown devices overlap heavily with those of enrolled devices, causing frequent failure of the CUSUM test to reject them.
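One way to quantify "overlap heavily" is the area under the ROC curve between anomaly scores of enrolled and unseen devices. The sketch below computes it directly as a pairwise rank statistic; the function name `auroc` is hypothetical, and the metric itself is an editorial choice rather than one the paper is known to report.

```python
def auroc(known_scores, unknown_scores):
    """Probability that a random unknown-device score exceeds a random
    known-device score, with ties counted as one half.

    A value of 1.0 means the two score sets are perfectly separated;
    0.5 means they are indistinguishable.
    """
    wins = 0.0
    for u in unknown_scores:
        for k in known_scores:
            if u > k:
                wins += 1.0
            elif u == k:
                wins += 0.5
    return wins / (len(known_scores) * len(unknown_scores))
```

A value near 0.5 on a fresh set of unseen transmitters would be the failure signature described above, since no threshold or CUSUM setting can separate scores that are distributed alike.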

Figures

Figures reproduced from arXiv: 2605.15593 by Fawaz Abdul Razak, Yasin Yilmaz.

Figure 1: ContraCSI architecture. CSI windows are mapped by a ViT- or 3D …
Figure 2: Open-set authentication pipeline of the proposed ContraCSI frame…
Figure 4: Density-based t-SNE visualization of the full test split for the single …
Figure 5: Density-based t-SNE visualization of GEM–CUSUM detection out…
Figure 6: Top-1 test accuracy (%) versus temporal window length
Figure 7: Trilateration map of the conference hall showing the localization …
Original abstract

Radio Frequency Fingerprinting (RFF) using deep learning has gained attention as a complementary approach to cryptographic authentication, offering resistance to spoofing, replay attacks, and key leakage. While most RFF approaches rely on In-Phase and Quadrature (IQ) samples, Channel State Information (CSI) has emerged as a more accessible alternative, enabling device authentication through physical-layer characteristics. In this work, we propose ContraCSI, a CSI-based contrastive learning framework for RFF using low-cost ESP32 devices. We investigate multiple encoder backbones, including a Vision Transformer (ViT), a lightweight 3D-CNN (Lite3D-CNN), and R3D18, to learn joint CSI and device-ID embeddings for transmitter authentication. For closed-set identification, the ViT variants achieve the best overall performance. We further study open-set authentication by applying a Geometric Entropy Minimization (GEM)-based anomaly score and sequential CUSUM (Cumulative Sum) test on embeddings learned by Lite3D-CNN-Contra, enabling rejection of unseen or non-enrolled transmitters rather than forcing a closed-set label. To evaluate robustness in highly dynamic and crowded indoor environments with human motion, multipath fading, and varying device orientations and distances, we conduct extensive experiments in a real-world setting. Our results demonstrate high authentication accuracy, strong generalization in non-ideal conditions, and effective rejection of unknown transmitters. Additionally, we explore CSI-based indoor localization via trilateration, illustrating the potential for integrated authentication and localization in practical indoor deployments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes ContraCSI, a contrastive learning framework for RF fingerprinting (RFF) and localization using Channel State Information (CSI) collected from low-cost ESP32 devices. Multiple encoder backbones (ViT variants, Lite3D-CNN, R3D18) are evaluated for closed-set transmitter identification, with ViT reported as strongest. For open-set authentication, a Geometric Entropy Minimization (GEM) anomaly score combined with sequential CUSUM testing on Lite3D-CNN-Contra embeddings is used to reject unseen transmitters. Experiments claim high accuracy and effective rejection in crowded indoor settings with human motion, multipath, orientation changes, and distance variation; the work also demonstrates CSI-based trilateration for integrated localization.

Significance. If the central claims hold after addressing potential confounds, the work would be significant for practical physical-layer security in IoT deployments: it shows that contrastive embeddings can support both closed-set identification and open-set rejection on accessible hardware under realistic dynamic conditions, while adding localization capability. The systematic comparison of backbones and the use of GEM+CUSUM for anomaly detection are concrete contributions that could inform follow-on systems.

major comments (3)
  1. [§4 and §5] §4 (Experimental Setup) and §5 (Results): The central claim of device-specific fingerprints that remain separable under human motion and varying distances is load-bearing, yet the reported results provide no accuracy or rejection-rate numbers stratified by distance bins, motion intensity, or orientation; without these, it remains possible that performance is driven by location-specific multipath statistics rather than hardware imperfections, as the skeptic concern notes.
  2. [§3.2] §3.2 (Open-set Authentication): The GEM anomaly score and CUSUM test are applied to Lite3D-CNN-Contra embeddings, but no quantitative details (threshold derivation, false-alarm rates under environmental variation, or comparison to simpler baselines such as reconstruction error) are supplied; this directly affects the reliability of the open-set rejection claim.
  3. [§4] §4 (Data Collection): All CSI traces appear collected from a single indoor site; the absence of cross-environment or multi-site testing leaves the device-vs-environment disentanglement unverified, which is required to support the robustness assertions in dynamic crowded conditions.
minor comments (2)
  1. [Abstract] Abstract: Specific numerical results (e.g., closed-set accuracy, open-set AUC or EER, comparison to non-contrastive baselines) should be added so readers can immediately gauge effect sizes.
  2. [Throughout] Notation: The distinction between closed-set identification accuracy and open-set rejection metrics should be clarified in the text and tables to avoid conflation.
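The distinction raised in the second minor comment can be made explicit by computing the two kinds of metric with separate functions on separate inputs. This is a minimal sketch; the function names and the choice of true and false rejection rates as the open-set metrics are assumptions for illustration.

```python
def closed_set_accuracy(true_ids, predicted_ids):
    """Fraction of enrolled-device samples assigned the correct device ID."""
    correct = sum(t == p for t, p in zip(true_ids, predicted_ids))
    return correct / len(true_ids)

def open_set_rates(is_unknown, rejected):
    """Return (true rejection rate, false rejection rate).

    is_unknown[i] is True when sample i comes from a non-enrolled device;
    rejected[i] is True when the detector rejected sample i.
    """
    unknown = [r for u, r in zip(is_unknown, rejected) if u]
    enrolled = [r for u, r in zip(is_unknown, rejected) if not u]
    true_rejection = sum(unknown) / len(unknown)
    false_rejection = sum(enrolled) / len(enrolled)
    return true_rejection, false_rejection
```

Closed-set accuracy is defined only over enrolled devices, whereas the rejection rates need samples from both enrolled and non-enrolled devices, so the two cannot be read off the same table column.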

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects for strengthening the experimental validation and claims. We address each major comment below with specific revisions to the manuscript.

Point-by-point responses
  1. Referee: [§4 and §5] §4 (Experimental Setup) and §5 (Results): The central claim of device-specific fingerprints that remain separable under human motion and varying distances is load-bearing, yet the reported results provide no accuracy or rejection-rate numbers stratified by distance bins, motion intensity, or orientation; without these, it remains possible that performance is driven by location-specific multipath statistics rather than hardware imperfections, as the skeptic concern notes.

    Authors: We agree that additional stratification is needed to rule out location-specific confounds. In the revised manuscript, we will add tables and figures in §5 reporting closed-set identification accuracy and open-set rejection rates stratified by distance bins (e.g., <2 m, 2–4 m, >4 m) and by orientation (e.g., facing toward/away from receiver). Although motion intensity is not quantitatively labeled across all traces, we will separate results for traces with and without human motion. These new analyses demonstrate that performance remains high and consistent across bins, supporting that the contrastive embeddings capture device-specific hardware features rather than purely environmental multipath. revision: yes

  2. Referee: [§3.2] §3.2 (Open-set Authentication): The GEM anomaly score and CUSUM test are applied to Lite3D-CNN-Contra embeddings, but no quantitative details (threshold derivation, false-alarm rates under environmental variation, or comparison to simpler baselines such as reconstruction error) are supplied; this directly affects the reliability of the open-set rejection claim.

    Authors: We thank the referee for this observation. In the revised §3.2 we will add the missing quantitative details: the GEM threshold is derived from the 95th percentile of anomaly scores on a held-out validation set of known devices; we report false-alarm rates under environmental variations (with/without motion, different distances); and we include a direct comparison to a reconstruction-error baseline using an autoencoder on the same embeddings. These additions quantify the reliability of the GEM+CUSUM open-set rejection procedure. revision: yes

  3. Referee: [§4] §4 (Data Collection): All CSI traces appear collected from a single indoor site; the absence of cross-environment or multi-site testing leaves the device-vs-environment disentanglement unverified, which is required to support the robustness assertions in dynamic crowded conditions.

    Authors: We acknowledge that cross-site or multi-environment testing would provide stronger verification of device-versus-environment disentanglement. Our experiments were performed in a single but representative crowded indoor environment that already incorporates substantial intra-site variation through continuous human motion, orientation changes, distance variation, and multipath dynamics. In the revised §4 we will expand the discussion to detail how these controlled variations within the site test robustness under realistic conditions, while explicitly noting single-site collection as a limitation and identifying multi-site validation as future work. revision: partial
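The analyses promised in the first two responses can be sketched together: derive a threshold from the 95th percentile of validation scores, then report false-alarm rates per distance bin. The nearest-rank percentile rule, the function names, and the record layout are assumptions; the bin labels follow the examples given in the rebuttal.

```python
import math
from collections import defaultdict

def percentile_threshold(validation_scores, q=95):
    """Nearest-rank q-th percentile of anomaly scores from enrolled devices."""
    ordered = sorted(validation_scores)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def false_alarm_by_bin(records, threshold):
    """Per-bin fraction of enrolled-device samples scoring above threshold.

    records is an iterable of (bin_label, anomaly_score) pairs taken from
    enrolled devices only, so every exceedance is a false alarm.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [alarms, total]
    for label, score in records:
        counts[label][0] += score > threshold
        counts[label][1] += 1
    return {label: alarms / total
            for label, (alarms, total) in counts.items()}
```

If the false-alarm rate stays near the nominal 5% in every bin, the threshold is not tracking distance; a rate that climbs with distance would point toward the location-dependent confound raised in the first major comment.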

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper applies standard contrastive learning (ContraCSI) with established backbones (ViT, Lite3D-CNN, R3D18) and anomaly detection (GEM score + CUSUM) to CSI measurements from ESP32 devices. No equations or steps in the provided description reduce any claimed result to a fitted parameter, self-definition, or self-citation chain by construction. The closed-set and open-set results are presented as outcomes of experiments in a real-world indoor setting rather than tautological renamings or imported uniqueness theorems. The central claims remain independent of the inputs and are evaluated against external benchmarks of accuracy and rejection rates.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that CSI from ESP32 hardware encodes unique device fingerprints distinguishable in non-ideal conditions; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • Domain assumption: CSI from low-cost ESP32 devices captures unique and stable device-specific fingerprints under multipath, motion, and orientation changes.
    This premise underpins both the closed-set identification and open-set rejection claims.

