WiLoc: Massive Measured Dataset of Wi-Fi Channel State Information with Application to Machine-Learning Based Localization
Pith reviewed 2026-05-16 05:00 UTC · model grok-4.3
The pith
WiLoc is presented as the largest public dataset of Wi-Fi channel state information (CSI) of its kind, with more than 12 million UE locations and more than 3,000 access points, released to improve machine-learning-based localization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that WiLoc is the largest CSI dataset of its kind. It was obtained from a series of measurement campaigns spanning three months and contains more than 12 million UE locations and more than 3,000 APs across 16 buildings and more than 30 streets. The paper describes the dataset structure, environments, protocols, and validations, then shows through case studies that large-scale data improves ML-driven localization performance in both standard and transfer-learning settings. The authors position the release as a standard resource for researchers developing accurate and robust localization algorithms.
What carries the argument
The WiLoc dataset of paired CSI measurements and precise location labels collected from multiple APs for millions of UE positions.
If this is right
- ML localization models achieve higher accuracy and robustness when trained on datasets with millions of locations rather than smaller collections.
- Transfer learning across indoor and outdoor environments succeeds more reliably with the diversity provided by 16 buildings and more than 30 streets.
- Researchers can benchmark new algorithms directly against this public resource without repeating large measurement campaigns.
- Both standard supervised learning and transfer-learning strategies for Wi-Fi positioning benefit from the scale and coverage.
- The dataset lowers the cost barrier for developing practical ML-based localization systems.
Where Pith is reading between the lines
- The scale of CSI patterns may expose location-specific signatures not visible in smaller datasets, enabling new feature designs.
- Future work could combine WiLoc with measurements from additional cities to create even broader multi-environment benchmarks.
- The public release could serve as a testbed for studying domain shift and adaptation techniques specific to wireless channels.
- Integration with other radio technologies might produce unified multi-band localization models trained on combined large datasets.
Load-bearing premise
The measured buildings, streets, and three-month collection period capture conditions representative enough for models to generalize to other real-world sites.
What would settle it
If an ML model trained solely on WiLoc data showed a large accuracy drop when tested in an unmeasured building or street, that would indicate the dataset does not support broad generalization.
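A minimal sketch of how such a held-out-environment test could be set up, written in Python with NumPy. Everything here is an assumption made for illustration: the data is synthetic, the helper names (`leave_one_group_out`, `knn_predict`, `mean_error`, `demo`) are hypothetical, and the nearest-neighbor regressor is a stand-in for whatever model one would actually train. It does not reflect the WiLoc file format or the authors' evaluation code.

```python
import numpy as np


def leave_one_group_out(groups):
    """Yield (held_out_label, train_idx, test_idx) for each distinct group."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        test_idx = np.flatnonzero(groups == g)
        train_idx = np.flatnonzero(groups != g)
        yield g, train_idx, test_idx


def knn_predict(train_x, train_y, test_x, k=3):
    """Predict a position as the mean label of the k nearest training features."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)


def mean_error(pred, truth):
    """Mean Euclidean distance between predicted and true positions."""
    return float(np.linalg.norm(pred - truth, axis=1).mean())


def demo(seed=0):
    """Synthetic stand-in: each 'building' maps position to features differently."""
    rng = np.random.default_rng(seed)
    n_buildings, n_per, n_feat = 3, 50, 8
    building = np.repeat(np.arange(n_buildings), n_per)
    pos = rng.uniform(0.0, 20.0, size=(n_buildings * n_per, 2))
    # Hypothetical per-building linear map from 2-D position to a feature vector.
    mix = rng.normal(size=(n_buildings, 2, n_feat))
    feats = np.einsum("nd,ndf->nf", pos, mix[building])
    for held_out, train, test in leave_one_group_out(building):
        pred = knn_predict(feats[train], pos[train], feats[test])
        err = mean_error(pred, pos[test])
        print(f"held-out building {held_out}: mean error {err:.2f}")


if __name__ == "__main__":
    demo()
```

The design point is the split itself: every sample from the held-out building is excluded from training, so the reported error measures generalization to an unseen environment rather than interpolation between nearby points in a known one. Comparing this error against a random train/test split over the same data would show how large the drop is.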
Original abstract
Localization is a key component of the wireless ecosystem. Machine learning (ML)-based localization using channel state information (CSI) is one of the most popular methods for achieving high-accuracy localization with low cost. However, to be accurate and robust, ML-based algorithms need to be trained and tested with large amounts of data, covering not only many user equipment (UE)/target locations, but also many different access points (APs) locations to which the UEs connect, in a variety of different environment types. This paper presents a massive-sized CSI dataset, WiLoc (Wi-Fi Localization), and makes it publicly available. WiLoc is obtained by a series of precision measurement campaigns that span three months, and it is massive in all the above-mentioned three dimensions: > 12 million UE locations, > 3,000 APs, covering 16 buildings for indoor localization, and > 30 streets for outdoor use. The paper describes the dataset structure, measurement environments, measurement protocols, and the dataset validations. Comprehensive case studies validate the advantages of large datasets in ML-driven localization strategies for both "standard" and transfer learning. We envision this dataset, which is by far the largest of its kind, to become a standard resource for researchers in the field of ML-based localization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents WiLoc, a publicly released massive CSI dataset for Wi-Fi localization collected via precision measurement campaigns over three months. It covers >12 million UE locations, >3000 APs, 16 indoor buildings, and >30 outdoor streets, with descriptions of dataset structure, environments, protocols, validations, and case studies demonstrating benefits of large-scale data for standard and transfer-learning ML localization.
Significance. If the reported scale and coverage hold, the dataset would be a substantial community resource as the largest CSI collection of its kind, supporting improved ML model training and validation for localization tasks while providing empirical evidence of performance scaling with data volume.
minor comments (1)
- [Abstract] The abstract and introduction would benefit from a brief explicit statement of the exact CSI dimensions (e.g., number of subcarriers, antennas per AP/UE) to allow immediate assessment of compatibility with existing ML pipelines.
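To illustrate why the referee asks for explicit CSI dimensions, the following Python sketch shows how the input width of a typical ML pipeline depends on them. The array layout `(n_samples, n_rx, n_tx, n_subcarriers)` and the function names are assumptions chosen for illustration; the paper's actual storage format is not stated in this review.

```python
import numpy as np


def expected_feature_dim(n_rx, n_tx, n_subcarriers):
    """Width of a real-valued feature vector built from amplitude and phase."""
    return 2 * n_rx * n_tx * n_subcarriers


def csi_to_features(csi):
    """Flatten complex CSI into [amplitudes, phases] per sample.

    Assumes csi has shape (n_samples, n_rx, n_tx, n_subcarriers);
    this layout is hypothetical, not the documented WiLoc format.
    """
    csi = np.asarray(csi)
    if csi.ndim != 4:
        raise ValueError(f"expected a 4-D CSI array, got shape {csi.shape}")
    n_samples = csi.shape[0]
    amplitude = np.abs(csi).reshape(n_samples, -1)
    phase = np.angle(csi).reshape(n_samples, -1)
    return np.concatenate([amplitude, phase], axis=1)
```

A model whose first layer was sized for one antenna and subcarrier configuration will reject features built from another, which is why stating these dimensions up front lets readers judge compatibility without downloading the data.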
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript and for recommending acceptance. We are pleased that the scale, coverage, and potential utility of the WiLoc dataset for the community are recognized.
Circularity Check
No significant circularity: dataset release paper with no derivations
This is a dataset release paper describing measurement campaigns, protocols, and case studies on collected Wi-Fi CSI data. No mathematical derivations, predictions, or fitted parameters are present that could reduce to inputs by construction. The central claims concern the scale (>12M locations, >3000 APs) and utility of the measured data, validated empirically within the dataset itself. No self-citation chains or ansatzes are load-bearing for any result.