FAWN: A MultiEncoder Fusion-Attention Wave Network for Integrated Sensing and Communication Indoor Scene Inference
Pith reviewed 2026-05-21 21:47 UTC · model grok-4.3
The pith
FAWN fuses Wi-Fi and 5G signals to achieve sub-meter accuracy in passive indoor scene inference.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FAWN, a MultiEncoder Fusion-Attention Wave Network based on the original Transformer architecture, fuses information from Wi-Fi and 5G so that the network can perceive the physical world without interfering with ongoing communication. A real-scenario prototype reports errors below 0.6 m around 84% of the time.
What carries the argument
The MultiEncoder Fusion-Attention Wave Network, a transformer-based architecture that uses multiple encoders and attention mechanisms to integrate signals from different wireless technologies for indoor scene inference.
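The page does not spell out how the fusion-attention layers combine the two encoder outputs, so the following is only a minimal sketch of one plausible reading: each modality attends to the other with standard scaled dot-product attention, and the two attended summaries are pooled and concatenated. The function `fuse`, the token shapes, and the random stand-in inputs are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention as defined in the original Transformer."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)  # shape (n_queries, n_keys)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ values, weights

def fuse(wifi_tokens, nr_tokens):
    """Hypothetical fusion step (an assumption, not the paper's layer):
    each modality queries the other, then both results are mean-pooled
    over their tokens and concatenated into one feature vector."""
    wifi_ctx, _ = attention(wifi_tokens, nr_tokens, nr_tokens)
    nr_ctx, _ = attention(nr_tokens, wifi_tokens, wifi_tokens)
    return np.concatenate([wifi_ctx.mean(axis=0), nr_ctx.mean(axis=0)])

# Random stand-ins for the outputs of a Wi-Fi encoder and a 5G encoder.
rng = np.random.default_rng(0)
wifi_tokens = rng.normal(size=(8, 16))  # 8 tokens, model width 16
nr_tokens = rng.normal(size=(5, 16))    # 5 tokens, model width 16

fused = fuse(wifi_tokens, nr_tokens)
_, example_weights = attention(wifi_tokens, nr_tokens, nr_tokens)
print(fused.shape)
```

A real model would add learned query, key, and value projections, multiple heads, and a regression head on top of the fused vector; they are left out here to keep the fusion step itself visible.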
If this is right
- Combining Wi-Fi and 5G increases the accuracy reachable for passive indoor sensing beyond single-technology limits.
- The passive approach reuses existing communications to sense the environment without dedicated hardware or interference.
- Using technologies that operate in different frequency bands extends the coverage area for scene inference tasks.
- Integration into a real prototype confirms the architecture works with current wireless infrastructure.
Where Pith is reading between the lines
- The fusion technique could apply to other wireless standards to further boost sensing performance.
- Such systems might support practical uses like indoor navigation or smart environment monitoring.
- Testing in additional building types would help identify where the accuracy gains hold or need adjustment.
Load-bearing premise
A single real-scenario prototype is sufficient to show that fusing Wi-Fi and 5G via this architecture reliably augments coverage and accuracy for general indoor scene inference tasks.
What would settle it
Repeated tests across multiple varied indoor environments where errors exceed 0.6 m in more than 16 percent of cases would indicate the fusion method does not deliver the claimed general reliability.
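The check described above can be sketched as a small script, assuming per-environment lists of positioning errors in meters are available. The environment names and error values below are made up for illustration, and the helper names are hypothetical.

```python
def exceedance_rate(errors_m, threshold_m=0.6):
    """Fraction of errors strictly greater than the threshold."""
    if not errors_m:
        raise ValueError("need at least one error measurement")
    return sum(e > threshold_m for e in errors_m) / len(errors_m)

def environments_breaking_claim(errors_by_env, threshold_m=0.6, max_rate=0.16):
    """Names of environments where more than max_rate of errors exceed the threshold."""
    return [
        name
        for name, errors in errors_by_env.items()
        if exceedance_rate(errors, threshold_m) > max_rate
    ]

# Made-up measurements, not data from the paper.
errors_by_env = {
    "office": [0.2, 0.3, 0.5, 0.4, 0.7, 0.1, 0.3, 0.2, 0.4, 0.5],
    "corridor": [0.9, 0.3, 0.8, 0.4, 0.7, 0.2, 0.5, 0.6, 0.3, 0.4],
}

failing = environments_breaking_claim(errors_by_env)
print(failing)
```

This compares point estimates only. With few test points per environment, a rate slightly above 16 percent may still be within sampling noise, so a real evaluation would also report the number of measurements.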
Original abstract
The upcoming generations of wireless technologies promise an era where everything is interconnected and intelligent. As the need for intelligence grows, networks must learn to better understand the physical world. However, deploying dedicated hardware to perceive the environment is not always feasible, mainly due to costs and/or complexity. Integrated Sensing and Communication (ISAC) has made a step forward in addressing this challenge. Within ISAC, passive sensing emerges as a cost-effective solution that reuses wireless communications to sense the environment, without interfering with existing communications. Nevertheless, the majority of current solutions are limited to one technology (mostly Wi-Fi or 5G), constraining the maximum accuracy reachable. As different technologies work with different spectrums, we see a necessity in integrating more than one technology to augment the coverage area. Hence, we take the advantage of ISAC passive sensing, to present FAWN, a MultiEncoder Fusion-Attention Wave Network for ISAC indoor scene inference. FAWN is based on the original transformers architecture, to fuse information from Wi-Fi and 5G, making the network capable of understanding the physical world without interfering with the current communication. To test our solution, we have built a prototype and integrated it in a real scenario. Results show errors below 0.6 m around 84% of times.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes FAWN, a transformer-based MultiEncoder Fusion-Attention Wave Network for passive Integrated Sensing and Communication (ISAC) indoor scene inference. It fuses Wi-Fi and 5G signals to enable environment perception without dedicated hardware or interference with existing communications, claiming augmented coverage and accuracy. A real-scenario prototype is reported to achieve positioning errors below 0.6 m in approximately 84% of cases.
Significance. If the fusion mechanism proves robust, the work could advance multi-technology passive ISAC by demonstrating practical integration of heterogeneous wireless signals for scene inference using existing infrastructure. The prototype provides an initial feasibility demonstration, though broader validation would be needed to establish general impact.
Major comments (2)
- Abstract and prototype evaluation: the central claim that FAWN reliably augments coverage and accuracy via Wi-Fi/5G fusion rests on results from a single real-scenario prototype. No information is supplied on experimental design (room dimensions, device placements, number of trials or test points, multipath profiles, or material properties), baselines, dataset size, or statistical tests, so it is not possible to determine whether the 84% figure for sub-0.6 m errors generalizes beyond the specific tested geometry and hardware configuration.
- Results section: absence of single-technology baselines (Wi-Fi-only or 5G-only) or alternative fusion architectures prevents isolating the contribution of the proposed multi-encoder fusion-attention mechanism from potential benefits of simply using multiple bands.
Minor comments (2)
- Specify the exact performance metric (e.g., CDF value at 0.6 m) and total number of measurements rather than the phrasing 'around 84% of times'.
- Clarify notation for the wave-network components and how the fusion-attention layers combine encoder outputs from the two technologies.
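One way to report the metric precisely, as the first minor comment asks, is to give the empirical CDF value at 0.6 m together with the sample size and an interval. The sketch below uses a Wilson score interval, which is a choice made here and not something the paper states. The 100 synthetic errors are invented so that exactly 84 fall at or below the threshold.

```python
import math

def cdf_at(errors_m, threshold_m):
    """Empirical CDF value F(threshold) = share of errors <= threshold.
    Returns (count at or below threshold, total count, share)."""
    n = len(errors_m)
    if n == 0:
        raise ValueError("need at least one error measurement")
    k = sum(e <= threshold_m for e in errors_m)
    return k, n, k / n

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Synthetic errors in meters, not data from the paper.
errors_m = [0.3] * 84 + [0.9] * 16

k, n, share = cdf_at(errors_m, 0.6)
low, high = wilson_interval(k, n)
print(f"F(0.6 m) = {share:.2f} with n = {n}, 95% interval [{low:.3f}, {high:.3f}]")
```

With only 100 points the interval spans roughly 0.76 to 0.90, which shows why the total number of measurements matters for interpreting an "around 84%" figure.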
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of our experimental results and the contribution of the fusion mechanism. We address each major comment below and indicate the revisions we will make to the manuscript.
Point-by-point responses
- Referee: Abstract and prototype evaluation: the central claim that FAWN reliably augments coverage and accuracy via Wi-Fi/5G fusion rests on results from a single real-scenario prototype. No information is supplied on experimental design (room dimensions, device placements, number of trials or test points, multipath profiles, or material properties), baselines, dataset size, or statistical tests, so it is not possible to determine whether the 84% figure for sub-0.6 m errors generalizes beyond the specific tested geometry and hardware configuration.
  Authors: We agree that additional details on the experimental design are required to support claims of generalizability. The full manuscript describes the prototype setup at a high level, but we will expand the relevant section to include room dimensions, device placements, number of trials and test points, multipath profiles, material properties, dataset size, and any statistical tests. These additions will allow readers to better evaluate the 84% figure for sub-0.6 m errors.
  Revision: yes
- Referee: Results section: absence of single-technology baselines (Wi-Fi-only or 5G-only) or alternative fusion architectures prevents isolating the contribution of the proposed multi-encoder fusion-attention mechanism from potential benefits of simply using multiple bands.
  Authors: We acknowledge that single-technology baselines are necessary to isolate the benefit of the proposed fusion-attention mechanism. In the revised manuscript we will add Wi-Fi-only and 5G-only results to the evaluation. For alternative fusion architectures, we will include a discussion of simpler multi-band fusion approaches and, space permitting, comparative results to highlight the specific advantages of the multi-encoder design.
  Revision: yes
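Once single-technology baselines exist, the comparison the referee asks for could be made on the same test points with a paired bootstrap, sketched below. This is an illustrative analysis chosen here, not one the authors describe. The error lists are invented and `paired_bootstrap_mean_diff` is a hypothetical helper.

```python
import random
from statistics import mean

def paired_bootstrap_mean_diff(errors_a, errors_b, n_resamples=2000, seed=0):
    """Mean of (a - b) over paired test points, plus a 95% percentile
    bootstrap interval obtained by resampling the paired differences."""
    if len(errors_a) != len(errors_b):
        raise ValueError("both systems must be evaluated on the same test points")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    resampled = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    low = resampled[int(0.025 * n_resamples)]
    high = resampled[int(0.975 * n_resamples) - 1]
    return mean(diffs), (low, high)

# Invented per-point errors in meters, not results from the paper.
wifi_only = [0.80, 0.90, 0.70, 1.10, 0.60, 0.90, 1.00, 0.75]
fused = [0.40, 0.50, 0.45, 0.60, 0.30, 0.50, 0.55, 0.35]

observed, (low, high) = paired_bootstrap_mean_diff(wifi_only, fused)
print(f"mean error reduction: {observed:.3f} m, 95% interval [{low:.3f}, {high:.3f}]")
```

An interval that stays above zero would suggest the fused model reduces error relative to the baseline on these points. It would not by itself separate the effect of the fusion-attention design from the effect of simply having two bands, which needs the alternative-fusion comparison as well.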
Circularity Check
No significant circularity; empirical prototype results are direct measurements
Full rationale
The paper introduces the FAWN multi-encoder fusion-attention architecture (based on transformers) to fuse Wi-Fi and 5G signals for passive ISAC indoor scene inference. It then describes building a prototype and integrating it in a real scenario, reporting empirical error statistics (below 0.6 m in ~84% of cases). No derivation chain, first-principles equations, or predictions are present that reduce by construction to fitted parameters, self-citations, or renamed inputs. Performance numbers are presented as direct outcomes from the built system rather than quantities defined in terms of the model itself. This is the most common honest finding for an empirical systems paper; the central claim rests on external falsifiable measurements, not tautological reduction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Network hyperparameters and weights
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: FAWN is based on the original transformers architecture, to fuse information from Wi-Fi and 5G... Multi-encoder attention based on transformer architecture.
- IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: Results show errors below 0.6 m around 84% of times.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.