FusionSense: Tri-Stage Near-Sensor Learning for Runtime-Adaptive Multimodal Edge Intelligence

Hyunwoo Oh; Minhyoung Na; Mohsen Imani; Ryozo Masukawa; Sanggeon Yun; Sungheon Jeong; Wenjun Huang; Yoshiki Yamaguchi

arxiv: 2605.22868 · v1 · pith:R5TQK2EVnew · submitted 2026-05-19 · 💻 cs.LG

Sanggeon Yun, Ryozo Masukawa, Minhyoung Na, Hyunwoo Oh, Yoshiki Yamaguchi, Wenjun Huang, SungHeon Jeong, Mohsen Imani

Pith reviewed 2026-05-25 05:40 UTC · model grok-4.3

classification 💻 cs.LG

keywords multimodal sensing · near-sensor learning · edge intelligence · fusion-aware filtering · energy efficiency · data reduction · runtime adaptivity · autonomous systems

The pith

FusionSense trains near-sensor classifiers in three stages using server fusion insights to decide which modalities to transmit, sustaining task quality at far higher data reduction rates than uni-modal methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FusionSense as a framework that splits learning across server and edge to handle multimodal sensors under tight energy budgets. A server model first masters the downstream task on fused data. Filter-out-safe labels then mark when each modality is truly required. These labels train a compact edge model that makes runtime decisions on what to compute or send. The result is linear scaling with sensor count and large efficiency gains on dual-modality setups, which matters because autonomous systems increasingly face limits on power and bandwidth while needing reliable event detection.
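
As a rough illustration of the runtime decision described above, the sketch below gates each modality independently from a near-sensor score. The function name `gate_modalities`, the score dictionary, and the 0.5 threshold are assumptions made for illustration; the paper's actual decision rule is not given in the text reviewed here.

```python
from typing import Dict

def gate_modalities(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Return, per modality, whether its frame should be transmitted.

    scores: hypothetical near-sensor classifier outputs in [0, 1], read as
    the estimated probability that the modality is needed for the fused
    decision. Each modality is checked once, so the cost is linear in the
    number of sensors.
    """
    return {name: score >= threshold for name, score in scores.items()}

decisions = gate_modalities({"rgb": 0.91, "depth": 0.12})
print(decisions)  # {'rgb': True, 'depth': False}
```

No cross-modal logic appears in the gate itself because, per the summary above, those dependencies are meant to be absorbed at training time.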

Core claim

FusionSense establishes that a tri-stage procedure produces runtime decisions that jointly reduce compute and communication while preserving downstream task quality. The three stages are (i) server-side fusion model training, (ii) generation of filter-out-safe labels that quantify each modality's necessity relative to the fused decision, and (iii) compaction of an edge fusion model by injecting near-sensor predictions as auxiliary signals. On dual RGB plus Depth/LiDAR setups with SynDrone, the paper reports up to 33x lower energy at 1% frame-of-interest (FoI) prevalence, 11x at 10%, and a 92.3% reduction in quality loss at a fixed 30% data reduction.

What carries the argument

The filter-out-safe (FoS) labels that quantify each modality's necessity relative to the fused decision, used to guide compaction of the edge model with auxiliary near-sensor predictions.
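
One plausible reading of FoS labelling, offered as a sketch rather than the authors' algorithm: a modality is marked filter-out-safe for a sample when the server model's decision on the remaining modalities matches its decision on the fused input. The `predict` callable, the use of `None` to drop a modality, and the toy server model are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Sample = Dict[str, Optional[float]]  # modality name -> feature (None if dropped)

def fos_labels(predict: Callable[[Sample], int],
               samples: Sequence[Sample],
               modalities: Sequence[str]) -> List[Dict[str, bool]]:
    """Label each modality of each sample as filter-out-safe (True) or not.

    Assumption: 'safe' means dropping the modality leaves the server
    model's predicted class unchanged.
    """
    labels = []
    for sample in samples:
        fused = predict(sample)
        row = {}
        for m in modalities:
            ablated = dict(sample)
            ablated[m] = None  # hypothetical way to drop a modality
            row[m] = predict(ablated) == fused
        labels.append(row)
    return labels

def toy_server_model(sample: Sample) -> int:
    """Toy stand-in for the server fusion model: flags a frame of interest
    when the summed evidence of the available modalities reaches 1.0."""
    total = sum(v for v in sample.values() if v is not None)
    return int(total >= 1.0)

data = [{"rgb": 1.2, "depth": 0.1}, {"rgb": 0.6, "depth": 0.6}, {"rgb": 0.1, "depth": 0.2}]
labels = fos_labels(toy_server_model, data, ["rgb", "depth"])
print(labels)
```

In the second toy sample neither modality is safe to drop even though neither would trigger a detection alone, which is the kind of cross-modal dependency a uni-modal filter cannot see.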

If this is right

The approach sustains task quality at substantially higher data-reduction rates than uni-modal filters.
End-to-end energy use drops by up to 33 times at 1% event prevalence and 11 times at 10% prevalence.
Quality loss falls by 92.3% at a fixed 30% data reduction compared with prior filtering baselines.
The decision layer scales linearly with the number of sensors because cross-modal dependencies are handled at training time.
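
To see why the energy gain grows as events become rarer, consider a simple per-frame energy model. This is a sketch under assumptions not taken from the paper: every frame pays a sensing cost and a small near-sensor inference cost, and only forwarded frames pay transmission and server cost. All cost values are hypothetical and are not meant to reproduce the 33x and 11x figures.

```python
def energy_ratio(p_foi: float, e_sense: float, e_gate: float, e_downstream: float) -> float:
    """Conventional energy divided by filtered energy, per frame.

    p_foi: fraction of frames forwarded (idealised: exactly the frames of interest).
    e_sense: sensing cost paid by every frame.
    e_gate: near-sensor classifier cost paid by every frame when filtering.
    e_downstream: transmission plus server cost, paid only by forwarded frames.
    """
    conventional = e_sense + e_downstream
    filtered = e_sense + e_gate + p_foi * e_downstream
    return conventional / filtered

# Hypothetical costs in arbitrary units.
for p in (0.01, 0.10):
    print(p, round(energy_ratio(p, e_sense=1.0, e_gate=0.5, e_downstream=100.0), 1))
```

Under this model the ratio approaches `(e_sense + e_downstream) / (e_sense + e_gate)` as prevalence goes to zero, so the near-sensor overhead caps the achievable gain.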

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If FoS labels remain reliable on new tasks, the same labels could support dynamic sensor activation when environmental conditions change.
The linear scaling property suggests the method could handle three or more modalities without a proportional rise in edge compute.
Deployment on physical hardware would allow direct measurement of latency reductions that simulation alone cannot capture.

Load-bearing premise

The filter-out-safe labels produced by the server-side fusion model accurately capture each modality's necessity for the downstream task without introducing bias that would degrade the compacted edge model.

What would settle it

Running the compacted edge model on held-out SynDrone sequences and checking whether its quality loss at 30% data reduction is still about 92.3% lower than that of the prior filtering baselines would settle the performance claims.
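
The headline 92.3% figure is a relative reduction in quality loss, so the check comes down to the arithmetic below. The function names and the example scores are hypothetical; they are chosen only to show how the quantities relate.

```python
def quality_loss(full_score: float, filtered_score: float) -> float:
    """Relative drop in the server-side task score when run on filtered data."""
    return (full_score - filtered_score) / full_score

def loss_reduction(baseline_loss: float, method_loss: float) -> float:
    """Fraction by which a method shrinks the baseline's quality loss."""
    return 1.0 - method_loss / baseline_loss

# Hypothetical scores, both measured at the same data-reduction rate.
baseline = quality_loss(full_score=0.80, filtered_score=0.60)  # prior filter
method = quality_loss(full_score=0.80, filtered_score=0.78)    # fusion-aware filter
print(f"{loss_reduction(baseline, method):.1%}")
```

Both losses must be measured at the same data-reduction rate and on the same held-out split for the comparison to mean anything.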

Figures

Figures reproduced from arXiv: 2605.22868 by Hyunwoo Oh, Minhyoung Na, Mohsen Imani, Ryozo Masukawa, Sanggeon Yun, Sungheon Jeong, Wenjun Huang, Yoshiki Yamaguchi.

**Figure 1.** Comparison of our proposed sensing and information processing pipeline with other approaches: (a) Conventional approach, (b) Compression-based approach, (c) Using a previously proposed filter-out approach designed for a single sensor environment, and (d) ours.

**Figure 2.** Overview of the Proposed Three-step Training Method: (a) presents a schematic representation of the entire Three-step Training process. The process begins with the initial training phase depicted in (b), proceeds to the secondary training phase illustrated in (c), and concludes with the tertiary training phase outlined in (d). … for RGB and depth modality are retrieved for all data points, augmenting the da…

**Figure 3.** Comparative distribution of energy consumption across four methods: the conventional method, the compressive near-sensor approach, the previous filtering-out approach using individual near-sensor models, and our proposed method, across varying probabilities of FoI. The total energy consumption values are normalized to the total of the conventional method and displayed at the center of each distribution. Qu…

**Figure 4.** Trade-off relationship between data efficiency, indicating the saved portion of the data in size, and quality loss, which is the performance drop rate of the server-side model when using filtered-out data. … interest as a frame of interest. The vehicle coarse class contains 6 subclasses: Car, Truck, Bus, Train, Motorcycle, and Bicycle. In our evaluation, we assume a scenario of the edge-side or server-side fusi…

Original abstract

Autonomous systems and smart-industry deployments increasingly split computation across near-sensor, edge, and cloud resources, where tight energy, latency, and reliability budgets demand run-time adaptivity. In practice, deciding what to compute and transmit at each point is pivotal; yet as multimodal sensor suites (cameras, LiDAR/depth, etc.) proliferate at the edge, most prior approaches either (i) fuse modalities on powerful servers or (ii) apply uni-modal near-sensor filters that ignore cross-modal dependencies, leading to redundant transmissions or missed events. We present FusionSense, a fusion-aware intelligent sensing framework for energy-constrained autonomous edge systems. Lightweight near-sensor classifiers are trained via a three-step procedure: (i) a server-side fusion model learns the downstream task, (ii) filter-out-safe (FoS) labels quantify each modality's necessity relative to the fused decision, and (iii) an edge-side fusion model is compacted by injecting near-sensor predictions as auxiliary signals. The result is a run-time decision layer that jointly reduces compute and communication while scaling linearly with sensor count. On a dual-modality (RGB+Depth/LiDAR) setup with SynDrone, FusionSense sustains task quality at substantially higher data-reduction rates than uni-modal filters and delivers large end-to-end gains: up to 33x lower energy at 1% FoI prevalence, 11x at 10%, a 92.3% reduction in quality loss at a fixed 30% data reduction, and roughly 1.5x higher energy savings than the best prior filtering baseline.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

FusionSense's tri-stage pipeline for multimodal edge sensing looks practically motivated, but the big energy claims rest on details not visible in the abstract.

The core idea is a three-step training flow: train a server-side fusion model on the task, derive filter-out-safe labels from it to mark per-modality necessity, then compact an edge-side model by feeding near-sensor predictions as extra signals. This targets the gap between crude uni-modal filters and full server fusion for things like drones or vehicles where bandwidth and power are tight. On the SynDrone dual-modality case it reports large gains in energy and data reduction while holding task quality, which is the kind of engineering result that could matter for deployment if it holds up. The approach is new as a packaged combination even if pieces echo distillation or filtering work already in the literature. The stress-test worry about FoS labels carrying server-model bias is reasonable to check, but the abstract alone gives no way to test whether the labels actually match modality importance or introduce correlated noise. No ablations, error bars, or dataset splits are shown, so the 33x energy number at 1% prevalence cannot be evaluated yet. This is the sort of paper that belongs in a reading group focused on edge multimodal systems; the idea is clear enough to discuss even if the numbers need scrutiny. A serious editor should send it to review once the full methods and results are in hand, because the practical framing is worth referee time if the experiments are reproducible.

Referee Report

2 major / 2 minor

Summary. The paper introduces FusionSense, a tri-stage near-sensor learning framework for runtime-adaptive multimodal edge intelligence. A server-side fusion model is trained on the downstream task; filter-out-safe (FoS) labels are then derived to quantify each modality's necessity relative to the fused output; finally, an edge-side model is compacted by injecting near-sensor predictions as auxiliary signals. On a dual-modality (RGB + Depth/LiDAR) SynDrone setup, the approach is claimed to sustain task quality at higher data-reduction rates than uni-modal filters, yielding up to 33× lower energy at 1% FoI prevalence, 11× at 10%, a 92.3% reduction in quality loss at 30% data reduction, and ~1.5× higher energy savings than the best prior baseline.

Significance. If the quantitative claims are reproducible and the FoS-label transfer is shown to be unbiased, the work would offer a practical advance for energy- and bandwidth-constrained multimodal edge systems by exploiting cross-modal dependencies rather than treating modalities independently. The linear scaling with sensor count and the explicit three-stage training recipe are potentially useful for deployment.

major comments (2)

[Tri-stage procedure (server fusion → FoS labeling → edge compaction)] The central performance claims (33× energy reduction, 92.3% quality-loss reduction, etc.) rest on the correctness of the FoS-label generation step. No quantitative validation is supplied that FoS labels agree with modality importance measured by ablation or leave-one-modality-out experiments; if the server fusion surface over-weights one modality or the implicit thresholding introduces correlated label noise, the compacted edge model will inherit the same bias and the reported data-reduction gains will not generalize. This assumption is load-bearing for all end-to-end numbers.
[Experiments on SynDrone] The experimental section reports aggregate energy and quality metrics but supplies no dataset splits, number of runs, error bars, or explicit baseline implementations. Without these, the claimed superiority over “uni-modal filters” and “best prior filtering baseline” cannot be assessed for statistical significance or implementation fairness.
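
The validation requested in the first major comment could look roughly like the following: compare FoS labels against a leave-one-modality-out reference and report agreement and precision. The label encoding (True means the modality can be dropped) and the example lists are assumptions for illustration, not data from the paper.

```python
from typing import Sequence, Tuple

def agreement_and_precision(fos: Sequence[bool],
                            reference: Sequence[bool]) -> Tuple[float, float]:
    """Compare FoS labels with leave-one-modality-out reference labels.

    Both sequences use True for 'modality can be dropped without changing
    the outcome'. Returns (agreement rate, precision of the 'safe' label).
    """
    if len(fos) != len(reference) or not fos:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(f == r for f, r in zip(fos, reference))
    predicted_safe = sum(fos)
    true_safe = sum(f and r for f, r in zip(fos, reference))
    precision = true_safe / predicted_safe if predicted_safe else 0.0
    return matches / len(fos), precision

# Hypothetical labels for five (sample, modality) pairs.
fos = [True, True, False, True, False]
ref = [True, False, False, True, False]
agreement, precision = agreement_and_precision(fos, ref)
print(agreement, precision)
```

Precision of the "safe" label is the more telling number here, since a false "safe" means a modality that was actually needed gets filtered out.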

minor comments (2)

[Method] Notation for FoS label extraction (thresholding relative to fused output) is described only at a high level; a precise algorithmic statement or pseudocode would improve reproducibility.
[Abstract] The abstract states “roughly 1.5× higher energy savings” without specifying the exact prior baseline or the operating point at which the comparison is made.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of validation and reproducibility. We address each major comment below and will revise the manuscript to incorporate the suggested improvements.

Referee: The central performance claims (33× energy reduction, 92.3% quality-loss reduction, etc.) rest on the correctness of the FoS-label generation step. No quantitative validation is supplied that FoS labels agree with modality importance measured by ablation or leave-one-modality-out experiments; if the server fusion surface over-weights one modality or the implicit thresholding introduces correlated label noise, the compacted edge model will inherit the same bias and the reported data-reduction gains will not generalize. This assumption is load-bearing for all end-to-end numbers.

Authors: We agree that direct quantitative validation of FoS labels against ablation studies is necessary to confirm they accurately capture cross-modal necessity without bias. In the revised manuscript we will add leave-one-modality-out ablation experiments on the server fusion model and report agreement metrics (e.g., rank correlation or precision of modality importance) between these results and the derived FoS labels. This will strengthen the claims and allow readers to assess potential bias. revision: yes
Referee: The experimental section reports aggregate energy and quality metrics but supplies no dataset splits, number of runs, error bars, or explicit baseline implementations. Without these, the claimed superiority over “uni-modal filters” and “best prior filtering baseline” cannot be assessed for statistical significance or implementation fairness.

Authors: We acknowledge that the current experimental reporting lacks sufficient detail for reproducibility. In the revision we will explicitly state the SynDrone train/validation/test splits, the number of independent runs performed, include error bars or standard deviations on all reported metrics, and provide implementation details or references for the uni-modal filters and prior baselines to enable fair comparison and statistical assessment. revision: yes
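
The multi-run reporting promised in the second response needs little more than a mean and a sample standard deviation per metric. A minimal sketch follows, assuming each run yields a single scalar score; the values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

def summarize(runs: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of a metric over independent runs."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return mean(runs), stdev(runs)

# Hypothetical task scores from three independent runs.
m, s = summarize([0.70, 0.72, 0.74])
print(f"{m:.3f} ± {s:.3f}")
```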

Circularity Check

0 steps flagged

No circularity: pipeline described without equations or self-referential reductions

The paper describes a tri-stage procedure (server fusion model, FoS label generation, edge compaction) in prose only. No equations, derivations, fitted parameters renamed as predictions, or self-citations appear in the provided text. The FoS labeling step is presented as an independent quantification step rather than a self-definition or fitted-input prediction. The central claims rest on empirical gains on SynDrone rather than any load-bearing mathematical reduction to inputs. This is the normal self-contained case.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The FoS labeling step implicitly requires an unstated decision threshold or safety criterion that is fitted or chosen to produce the reported energy numbers.
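
To make the point about the unstated threshold concrete, the sketch below shows how an operating point such as "30% data reduction" follows directly from a chosen cut-off on near-sensor scores. The scores and thresholds are hypothetical, and the rule "drop frames scoring below the threshold" is an assumption, not the paper's stated criterion.

```python
from typing import Sequence

def data_reduction(scores: Sequence[float], threshold: float) -> float:
    """Fraction of frames filtered out when frames scoring below the
    threshold are dropped."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(s < threshold for s in scores) / len(scores)

# Hypothetical near-sensor scores for ten frames.
scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
sweep = {t: data_reduction(scores, t) for t in (0.3, 0.5, 0.7)}
print(sweep)
```

Because the reported energy and quality numbers are tied to such an operating point, the threshold behaves like a free parameter even though none is listed in the ledger.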

pith-pipeline@v0.9.0 · 5846 in / 1224 out tokens · 17506 ms · 2026-05-25T05:40:19.700233+00:00 · methodology


[1] [1]

Hamidreza Alikhani, Anil Kanduri, Pasi Liljeberg, Amir M Rahmani, and Nikil Dutt. 2023. DynaFuse: dynamic fusion for resource efficient multimodal machine learning inference.IEEE Embedded Systems Letters15, 4 (2023), 222–225

work page 2023

[2] [2]

Hamidreza Alikhani, Ziyu Wang, Anil Kanduri, Pasi Lilieberg, Amir M Rah- mani, and Nikil Dutt. 2024. SEAL: Sensing efficient active learning on wearables through context-awareness. In2024 Design, Automation & Test in Europe Confer- ence & Exhibition (DATE). IEEE, 1–2

work page 2024

[3] [3]

Alan Baade, Puyuan Peng, and David Harwath. 2022. Mae-ast: Masked autoen- coding audio spectrogram transformer.arXiv preprint arXiv:2203.16691(2022)

work page arXiv 2022

[4] [4]

Safa Bahri, Nesrine Zoghlami, Mourad Abed, and João Manuel RS Tavares. 2018. Big data for healthcare: a survey.IEEE access7 (2018), 7397–7408

work page 2018

[5] [5]

Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. 2020. nuScenes: A Multimodal Dataset for Autonomous Driving. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

work page 2020

[6] [6]

Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-K˜irkpatrick, and Shlomo Dubnov. 2022. HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection. InICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 646–650. doi:10. 1109/ICASSP43922.2022.9746312

work page arXiv 2022

[7] [7]

Zhuo Chen, Kui Fan, Shiqi Wang, Lingyu Duan, Weisi Lin, and Alex Chichung Kot. 2020. Toward Intelligent Sensing: Intermediate Deep Feature Compression. IEEE Transactions on Image Processing29 (2020), 2230–2243. doi:10.1109/TIP.2019. 2941660

work page doi:10.1109/tip.2019 2020

[8] [8]

Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, and Ishan Misra. 2023. Imagebind: One embedding space to bind them all. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 15180–15190

work page 2023

[9] [9]

H Hu, F Wang, J Su, Y Wang, L Hu, W Fang, J Xu, and Z Zhang. 2023. EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection.arXiv preprint arXiv:2303.178952 (2023). FusionSense: Tri-Stage Near-Sensor Learning for Runtime-Adaptive Multimodal Edge Intelligence

work page arXiv 2023

[10]

Wenjun Huang, Arghavan Rezvani, Hanning Chen, Yang Ni, Sanggeon Yun, Sungheon Jeong, and Mohsen Imani. 2024. A Plug-in Tiny AI Module for Intelligent and Selective Sensor Data Transmission. arXiv preprint arXiv:2402.02043 (2024).

[11]

Sang-Ho Hwang, Kyung-Min Kim, Sungho Kim, and Jong Wook Kwak. 2023. Lossless Data Compression for Time-Series Sensor Data Based on Dynamic Bit Packing. Sensors 23, 20 (2023), 8575.

[12]

Samuel Isuwa, David Amos, Amit Kumar Singh, Bashir M Al-Hashimi, and Geoff V Merrett. 2023. Content- and lighting-aware adaptive brightness scaling for improved mobile user experience. In 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1–2.

[13]

Abbas Javed, Hadi Larijani, Ali Ahmadinia, Rohinton Emmanuel, Mike Mannion, and Des Gibson. 2017. Design and Implementation of a Cloud Enabled Random Neural Network-Based Decentralized Smart Controller With Intelligent Sensor Nodes for HVAC. IEEE Internet of Things Journal 4, 2 (2017), 393–403. doi:10.1109/JIOT.2016.2627403

[14]

Bushra Khalid, Kashif Naseer Qureshi, Kayhan Zrar Ghafoor, and Gwanggil Jeon. 2023. An improved biometric based user authentication and key agreement scheme for intelligent sensor based wireless communication. Microprocessors and Microsystems 96 (2023), 104722.

[16]

Ohn Kim, Junwon Seo, Seongyong Ahn, and Chong Hui Kim. 2024. UFO: Uncertainty-aware LiDAR-image Fusion for Off-road Semantic Terrain Map Estimation. arXiv preprint arXiv:2403.02642 (2024).

[17]

Yecheol Kim, Konyul Park, Minwook Kim, Dongsuk Kum, and Jun Won Choi. 2022. 3D dual-fusion: Dual-domain dual-query camera-LIDAR fusion for 3D object detection. arXiv preprint arXiv:2211.13529 (2022).

[19]

Brett Koonce. 2021. MobileNetV3. Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization (2021), 125–144.

[20]

Michelle A. Lee, Matthew Tan, Yuke Zhu, and Jeannette Bohg. 2021. Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors. In 2021 IEEE International Conference on Robotics and Automation (ICRA). 909–916. doi:10.1109/ICRA48506.2021.9561847

[21]

Jinglong Li and Han Han. 2022. Emotional Design Strategy of Smart Furniture for Small Households Based on User Experience. In International Conference on Human-Computer Interaction. Springer, 311–320.

[22]

Konstantinos G Liakos, Patrizia Busato, Dimitrios Moshou, Simon Pearson, and Dionysis Bochtis. 2018. Machine learning in agriculture: A review. Sensors 18, 8 (2018), 2674.

[23]

Clemens Linnhoff, Kristof Hofrichter, Lukas Elster, Philipp Rosenberger, and Hermann Winner. 2022. Measuring the influence of environmental conditions on automotive lidar sensors. Sensors 22, 14 (2022), 5266.

[24]

Guan-Horng Liu, Avinash Siravuru, Sai Prabhakar, Manuela Veloso, and George Kantor. 2017. Learning end-to-end multimodal sensor policies for autonomous navigation. In Conference on Robot Learning. PMLR, 249–261.

[25]

Zheyu Liu, Erxiang Ren, Fei Qiao, Qi Wei, Xinjun Liu, Li Luo, Huichan Zhao, and Huazhong Yang. 2020. NS-CIM: A Current-Mode Computation-in-Memory Architecture Enabling Near-Sensor Processing for Intelligent IoT Vision Nodes. IEEE Transactions on Circuits and Systems I: Regular Papers 67, 9 (2020), 2909–2922. doi:10.1109/TCSI.2020.2984161

[26]

Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L Rus, and Song Han. 2023. Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation. In 2023 IEEE international conference on robotics and automation (ICRA). IEEE, 2774–2781.

[27]

R Madhusudhan and P Pravisha. 2024. Blockchain Based Artificial Intelligence of Things (AIoT) for Wildlife Monitoring. In International Conference on Advanced Information Networking and Applications. Springer, 25–36.

[28]

Yang Ni, Yeseong Kim, Tajana Rosing, and Mohsen Imani. 2022. Online performance and power prediction for edge TPU via comprehensive characterization. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 612–615.

[29]

Shahriar Nirjon, Robert F Dickerson, Philip Asare, Qiang Li, Dezhi Hong, John A Stankovic, Pan Hu, Guobin Shen, and Xiaofan Jiang. 2013. Auditeur: A mobile-cloud service platform for acoustic event detection on smartphones. In Proceeding of the 11th annual international conference on Mobile systems, applications, and services. 403–416.

[30]

Sabyasachi Pramanik, Digvijay Pandey, Subhankar Joardar, M Niranjanamurthy, Binay Kumar Pandey, and Jaspinder Kaur. 2023. An overview of IoT privacy and security in smart cities. In AIP Conference Proceedings, Vol. 2495. AIP Publishing.

[31]

Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, and Piotr Dollár. 2020. Designing network design spaces. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 10428–10436.

[32]

Giulia Rizzoli, Francesco Barbato, Matteo Caligiuri, and Pietro Zanuttigh. 2023. SynDrone-Multi-Modal UAV Dataset for Urban Scenarios. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 2210–2220.

[33]

Takami Sato, Yuki Hayakawa, Ryo Suzuki, Yohsuke Shiiki, Kentaro Yoshioka, and Qi Alfred Chen. 2023. Revisiting LiDAR Spoofing Attack Capabilities against Object Detection: Improvements, Measurement, and New Attack. arXiv preprint arXiv:2303.10555 (2023).

[34]

Abhishek Sharma, Vaidehi Sharma, Mohita Jaiswal, Hwang-Cheng Wang, Dushantha Nalin K Jayakody, Chathuranga M Wijerathna Basnayaka, and Ammar Muthanna. 2022. Recent trends in AI-based intelligent sensing. Electronics 11, 10 (2022), 1661.

[35]

Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. 2016. Edge Computing: Vision and Challenges. IEEE Internet of Things Journal 3, 5 (2016), 637–646. doi:10.1109/JIOT.2016.2579198

[36]

Kailai Sun, Xinwei Wang, and Qianchuan Zhao. 2023. A Review of AIoT-based Edge Devices and Lightweight Deployment. Authorea Preprints (2023).

[37]

Kuniyuki Takahashi and Jethro Tan. 2019. Deep visuo-tactile learning: Estimation of tactile properties from images. In 2019 International Conference on Robotics and Automation (ICRA). IEEE, 8951–8957.

[38]

Zain Taufique, Aman Vyas, Antonio Miele, Pasi Liljeberg, and Anil Kanduri. 2025. HiDP: Hierarchical DNN Partitioning for Distributed Inference on Heterogeneous Edge Platforms. In 2025 Design, Automation & Test in Europe Conference (DATE). IEEE, 1–7.

[39]

Dequn Teng. 2021. AIoT Powered Wild Animal Tracing and Protection System Research Proposal for MRes in Engineering Science Supervised By Niki Trigoni. (2021).

[40]

Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, and Ruslan Salakhutdinov. 2018. Learning factorized multimodal representations. arXiv preprint arXiv:1806.06176 (2018).

[41]

Delia Velasco-Montero, Jorge Fernández-Berni, Ricardo Carmona-Galán, and Ángel Rodríguez-Vázquez. 2018. Performance analysis of real-time DNN inference on Raspberry Pi. In Real-Time Image and Video Processing 2018, Vol. 10670. SPIE, 115–123.

[42]

Shurun Wang, Shiqi Wang, Wenhan Yang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, and Wen Gao. 2022. Towards Analysis-Friendly Face Representation With Scalable Feature and Texture Compression. IEEE Transactions on Multimedia 24 (2022), 3169–3181. doi:10.1109/TMM.2021.3094300

[43]

Manuel Woschank, Erwin Rauch, and Helmut Zsifkovits. 2020. A review of further directions for artificial intelligence, machine learning, and deep learning in smart logistics. Sustainability 12, 9 (2020), 3760.

[44]

Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, and Juan Pablo Bello. 2022. Wav2clip: Learning robust audio representations from clip. In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 4563–4567.

[45]

Zihui Xue and Radu Marculescu. 2023. Dynamic Multimodal Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2575–2584.

[46]

Lei Xun, Mingyu Hu, Hengrui Zhao, Amit Kumar Singh, Jonathon Hare, and Geoff V Merrett. 2024. Fluid dynamic DNNs for reliable and adaptive distributed inference on edge devices. In 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1–2.

[47]

Xinghua Yang, Zheyu Liu, Kechao Tang, Xunzhao Yin, Cheng Zhuo, Qi Wei, and Fei Qiao. 2023. Breaking the energy-efficiency barriers for smart sensing applications with “Sensing with Computing” architectures. Science China Information Sciences 66, 10 (2023), 200409.

[48]

Sanggeon Yun, Hanning Chen, Ryozo Masukawa, Hamza Errahmouni Barkam, Andrew Ding, Wenjun Huang, Arghavan Rezvani, Shaahin Angizi, and Mohsen Imani. 2024. HyperSense: Hyperdimensional Intelligent Sensing for Energy-Efficient Sparse Data Processing. Advanced Intelligent Systems 6, 12 (2024), 2400228.

[49]

Sanggeon Yun, Ryozo Masukawa, Hanning Chen, Sungheon Jeong, Wenjun Huang, Arghavan Rezvani, Minhyoung Na, Yoshiki Yamaguchi, and Mohsen Imani. 2025. Hyperdimensional intelligent sensing for efficient real-time audio processing on extreme edge. IEEE Access (2025).

[50]

Sanggeon Yun, Ryozo Masukawa, Raheeb Hassan, Minhyoung Na, and Mohsen Imani. 2026. Contextual Fusion Strategies for Multimodal GNN-based Reasoning: Performance and Computational Trade-offs. IEEE Access (2026).

[51]

Sanggeon Yun, Ryozo Masukawa, Minhyoung Na, and Mohsen Imani. 2025. Missiongnn: Hierarchical multimodal gnn-based weakly supervised video anomaly recognition with mission-specific knowledge graph generation. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 4736–4745.

[52]

Sanggeon Yun, Hyunwoo Oh, Ryozo Masukawa, and Mohsen Imani. 2026. DecoHD: Decomposed Hyperdimensional Classification under Extreme Memory Budgets. In 2026 Design, Automation & Test in Europe Conference (DATE). IEEE.

[53]

Sanggeon Yun, Hyunwoo Oh, Ryozo Masukawa, Pietro Mercati, Nathaniel D Bastian, and Mohsen Imani. 2026. LogHD: Robust Compression of Hyperdimensional Classifiers via Logarithmic Class-Axis Reduction. In 2026 Design, Automation & Test in Europe Conference (DATE). IEEE.

[54]

Tan Zhi-Xuan, Harold Soh, and Desmond Ong. 2020. Factorized inference in deep markov models for incomplete multimodal time series. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 10334–10341.