Expertise need not monopolize: Action-specialized mixture of experts for vision-language-action learning

· 2025 · arXiv 2510.14300

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies

cs.CV · 2026-06-18 · unverdicted · novelty 6.0

EventVLA introduces foundational visual anchors and a Keyframe Evidence Memory module that predicts future keyframe probabilities from VLA embeddings to improve long-horizon task success by an average of 40% on 17 simulation and 4 real-world tasks.

SARM2: Multi-Task Stage Aware Reward Modeling for Self Improving Robotic Manipulation

cs.RO · 2026-06-09 · unverdicted · novelty 6.0

SARM2 presents RM, a multi-task stage-aware reward model achieving 80% lower value-estimation MSE, which when used in SPIRAL boosts manipulation task success from ~50% to near-perfect on several benchmarks.

TORL-VLA: Tactile Guided Online Reinforcement Learning for Contact-Rich Manipulation

cs.RO · 2026-06-08 · unverdicted · novelty 6.0

TORL-VLA couples a tactile wrench-aware VLA policy with a lightweight online RL module and an intervention-censored critic to improve success and efficiency on contact-rich robotic tasks.

ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation

cs.RO · 2026-05-28 · unverdicted · novelty 6.0

ELAN4D introduces plug-and-play 4D keypoint track supervision from forward kinematics to enhance VLA policy generalization in robotic manipulation tasks.

GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization

cs.RO · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

GuidedVLA improves VLA generalization by supervising individual attention heads with manually defined auxiliary signals for three task-relevant factors.

Continually Evolving Skill Knowledge in Vision Language Action Model

cs.RO · 2025-11-22 · unverdicted · novelty 6.0

Stellar VLA achieves continual learning in VLA models by maintaining a growing knowledge space and routing tasks to specialized experts conditioned on semantic relations, delivering strong LIBERO benchmark results with only 1% data replay and successful real-world transfer on dual-arm hardware.

citing papers explorer

Showing 6 of 6 citing papers.

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies cs.CV · 2026-06-18 · unverdicted · none · ref 23
EventVLA introduces foundational visual anchors and a Keyframe Evidence Memory module that predicts future keyframe probabilities from VLA embeddings to improve long-horizon task success by an average of 40% on 17 simulation and 4 real-world tasks.
SARM2: Multi-Task Stage Aware Reward Modeling for Self Improving Robotic Manipulation cs.RO · 2026-06-09 · unverdicted · none · ref 44
SARM2 presents RM, a multi-task stage-aware reward model achieving 80% lower value-estimation MSE, which when used in SPIRAL boosts manipulation task success from ~50% to near-perfect on several benchmarks.
TORL-VLA: Tactile Guided Online Reinforcement Learning for Contact-Rich Manipulation cs.RO · 2026-06-08 · unverdicted · none · ref 32
TORL-VLA couples a tactile wrench-aware VLA policy with a lightweight online RL module and an intervention-censored critic to improve success and efficiency on contact-rich robotic tasks.
ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation cs.RO · 2026-05-28 · unverdicted · none · ref 41
ELAN4D introduces plug-and-play 4D keypoint track supervision from forward kinematics to enhance VLA policy generalization in robotic manipulation tasks.
GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization cs.RO · 2026-05-12 · unverdicted · none · ref 75 · 2 links
GuidedVLA improves VLA generalization by supervising individual attention heads with manually defined auxiliary signals for three task-relevant factors.
Continually Evolving Skill Knowledge in Vision Language Action Model cs.RO · 2025-11-22 · unverdicted · none · ref 35
Stellar VLA achieves continual learning in VLA models by maintaining a growing knowledge space and routing tasks to specialized experts conditioned on semantic relations, delivering strong LIBERO benchmark results with only 1% data replay and successful real-world transfer on dual-arm hardware.

Expertise need not monopolize: Action-specialized mixture of experts for vision-language-action learning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer