ConSensus: Multi-Agent Collaboration for Multimodal Sensing

Hyungjun Yoon, Mohammad Malekzadeh, Sung-Ju Lee, Fahim Kawsar, Lorena Qendro · 2026 · cs.AI · arXiv 2601.06453

3 Pith papers cite this work. Polarity classification is still indexing.



abstract

Large language models (LLMs) are increasingly grounded in sensor data to perceive and reason about human physiology and the physical world. However, accurately interpreting heterogeneous multimodal sensor data remains a fundamental challenge. We show that a single monolithic LLM often fails to reason coherently across modalities, leading to incomplete interpretations and prior-knowledge bias. We introduce ConSensus, a training-free multi-agent collaboration framework that decomposes multimodal sensing tasks into specialized, modality-aware agents. To aggregate agent-level interpretations, we propose a hybrid fusion mechanism that balances semantic aggregation, which enables cross-modal reasoning and contextual understanding, with statistical consensus, which provides robustness through agreement across modalities. While each approach has complementary failure modes, their combination enables reliable inference under sensor noise and missing data. We evaluate ConSensus on five diverse multimodal sensing benchmarks, demonstrating an average accuracy improvement of 7.1% over the single-agent baseline. Furthermore, ConSensus matches or exceeds the performance of iterative multi-agent debate methods while achieving a 12.7 times reduction in average fusion token cost through a single-round hybrid fusion protocol, yielding a robust and efficient solution for real-world multimodal sensing tasks. The source code is available at https://github.com/nokia/multi-agent-collaboration-for-multimodal-sensing.
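The abstract describes the hybrid fusion only at a high level, so the following is a minimal sketch of one way statistical consensus and semantic aggregation could be combined, not the authors' implementation. The function names, the `min_agreement` threshold, and the rule "use the majority vote when agreement is high, otherwise fall back to the semantic aggregator" are all assumptions made for illustration. In a real system the semantic aggregator would presumably be an LLM call over the agents' textual interpretations; here it is a plain callable.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Tuple

# Hypothetical type: maps a modality name to that agent's predicted label,
# or None when the modality is missing or the agent abstained.
Votes = Dict[str, Optional[str]]


def statistical_consensus(votes: Votes) -> Tuple[Optional[str], float]:
    """Majority vote over the agents that returned a label.

    Returns the winning label and the fraction of non-missing agents
    that agreed with it. Returns (None, 0.0) if no agent voted.
    """
    labels = [label for label in votes.values() if label is not None]
    if not labels:
        return None, 0.0
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def hybrid_fuse(
    votes: Votes,
    semantic_aggregator: Callable[[Votes], str],
    min_agreement: float = 0.6,  # assumed threshold, not taken from the paper
) -> str:
    """Use the majority label when agreement is high enough; otherwise
    defer to the semantic aggregator (assumed to reason across modalities)."""
    label, agreement = statistical_consensus(votes)
    if label is not None and agreement >= min_agreement:
        return label
    return semantic_aggregator(votes)


if __name__ == "__main__":
    votes = {
        "accelerometer": "walking",
        "gyroscope": "walking",
        "heart_rate": None,  # missing modality
        "gps": "running",
    }
    # Stand-in for an LLM-based aggregator (hypothetical).
    print(hybrid_fuse(votes, semantic_aggregator=lambda v: "unknown"))
```

In this sketch, missing modalities are simply excluded from the vote, which is one simple way to tolerate missing data; how ConSensus actually weighs the two fusion paths is specified in the paper and its source code, not here.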

representative citing papers

SensingAgents: A Multi-Agent Collaborative Framework for Robust IMU Activity Recognition

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

SensingAgents is a multi-agent LLM framework that reaches 79.5% zero-shot accuracy on IMU activity recognition by using position-specific analysts, debating advocates, and a final decision agent, beating prior agent and deep-learning baselines.

EduGage: Methods and Dataset for Sensor-Based Momentary Assessment of Engagement in Self-Guided Video Learning

cs.HC · 2026-05-02 · unverdicted · novelty 6.0

EduGage releases a multimodal sensor dataset and models for estimating learner engagement in self-guided video learning. In a study with 16 participants, the models report an MAE of 0.81 and outperform baselines.

From 'What' to 'How' and 'Why': Sharing LLM-Generated Retrospective Summaries of Older Adults' Passive Tracking Data with Remote Family Members

cs.HC · 2026-06-02 · unverdicted · novelty 5.0

Redesigned LLM summaries of older adults' tracking data, structured as multi-layer narratives, were rated higher in satisfaction, helpfulness, trust, and willingness by 11 remote family members in a survey.

citing papers explorer

Showing 3 of 3 citing papers after filters.

SensingAgents: A Multi-Agent Collaborative Framework for Robust IMU Activity Recognition
cs.AI · 2026-05-06 · unverdicted · none · ref 28 · internal anchor

EduGage: Methods and Dataset for Sensor-Based Momentary Assessment of Engagement in Self-Guided Video Learning
cs.HC · 2026-05-02 · unverdicted · none · ref 87 · internal anchor

From 'What' to 'How' and 'Why': Sharing LLM-Generated Retrospective Summaries of Older Adults' Passive Tracking Data with Remote Family Members
cs.HC · 2026-06-02 · unverdicted · none · ref 89 · internal anchor
