Harmoni: Multimodal personalization of multi-user human-robot interactions with LLMs

2026 · arXiv 2601.19839

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models

cs.HC · 2026-04-27 · unverdicted · novelty 7.0

IntentVLM uses forward-inverse modeling in a two-stage video-language setup to reach up to 80% accuracy on open-vocabulary intention recognition benchmarks, beating baselines by 30% and matching human performance.

citing papers explorer

Showing 1 of 1 citing paper.

IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models

cs.HC · 2026-04-27 · unverdicted · none · ref 2

IntentVLM uses forward-inverse modeling in a two-stage video-language setup to reach up to 80% accuracy on open-vocabulary intention recognition benchmarks, beating baselines by 30% and matching human performance.

