VA-FastNavi-MARL: Real-Time Robot Control with Multimedia-Driven Meta-Reinforcement Learning

Fengxiang Wang; Hong Wang; Shengxi Jing; Yang Zhang; Yuan Feng

arXiv: 2604.03998 · v1 · submitted 2026-04-05 · cs.RO

keywords: real-time, va-fastnavi-marl, control, learning, meta-reinforcement, multimedia, adaptation, aligns

Interpreting dynamic, heterogeneous multimedia commands with real-time responsiveness is critical for Human-Robot Interaction. We present VA-FastNavi-MARL, a framework that aligns asynchronous audio-visual inputs into a unified latent representation. By treating diverse instructions as a distribution of navigable goals via Meta-Reinforcement Learning, our method enables rapid adaptation to unseen directives with negligible inference overhead. Unlike approaches bottlenecked by heavy sensory processing, our modality-agnostic stream ensures seamless, low-latency control. Validation on a multi-arm workspace confirms that VA-FastNavi-MARL significantly outperforms baselines in sample efficiency and maintains robust, real-time execution even under noisy multimedia streams.
