Optimus-3: Dual-router aligned mixture-of-experts agent with dual-granularity reasoning-aware policy optimization

Li, Z · 2025 · arXiv 2506.10357

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SPIKE: An Adaptive Dual Controller Framework for Cost-Efficient Long-Horizon Game Agents

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

SPIKE dual-controller framework raises success rates 5-9 points and cuts tokens 55% in StarDojo agents by reusing strategic plans across stable segments and escalating only at detected events.

ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation

cs.RO · 2026-05-06 · unverdicted · novelty 6.0

ConsisVLA-4D adds cross-view semantic alignment, cross-object geometric fusion, and cross-scene dynamic reasoning to VLA models, delivering 21.6% and 41.5% gains plus 2.3x and 2.4x speedups on LIBERO and real-world tasks.

Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation

cs.RO · 2026-02-22 · unverdicted · novelty 6.0

OptimusVLA augments hierarchical VLA models with Global Prior Memory for shorter generative paths and Local Consistency Memory for temporal coherence, yielding higher success rates and 2.9x faster inference on simulation and real-world robotic benchmarks.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

cs.CL · 2025-11-25 · unverdicted · novelty 6.0

Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.

From Abstraction to Instantiation: Learning Behavioral Representation for Vision-Language-Action Model

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

BehaviorVLA introduces a symmetric encoder-decoder architecture with causal Mamba and phase conditioning to learn unified long-horizon behavioral representations for improved generalization in VLA models.

ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement

cs.RO · 2026-04-22 · unverdicted · novelty 5.0

ALAS disentangles environment and self-state streams via bio-inspired modules to deliver 23% higher subtask success and 29% better execution efficiency on long-horizon HSI tasks.

A Survey of Reinforcement Learning for Large Reasoning Models

cs.CL · 2025-09-10 · accept · novelty 3.0

A survey compiling RL methods, challenges, data resources, and applications for enhancing reasoning in large language models and large reasoning models since DeepSeek-R1.

citing papers explorer

Showing 7 of 7 citing papers.

SPIKE: An Adaptive Dual Controller Framework for Cost-Efficient Long-Horizon Game Agents cs.CV · 2026-05-18 · unverdicted · none · ref 14
SPIKE dual-controller framework raises success rates 5-9 points and cuts tokens 55% in StarDojo agents by reusing strategic plans across stable segments and escalating only at detected events.
ConsisVLA-4D: Advancing Spatiotemporal Consistency in Efficient 3D-Perception and 4D-Reasoning for Robotic Manipulation cs.RO · 2026-05-06 · unverdicted · none · ref 43
ConsisVLA-4D adds cross-view semantic alignment, cross-object geometric fusion, and cross-scene dynamic reasoning to VLA models, delivering 21.6% and 41.5% gains plus 2.3x and 2.4x speedups on LIBERO and real-world tasks.
Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation cs.RO · 2026-02-22 · unverdicted · none · ref 22
OptimusVLA augments hierarchical VLA models with Global Prior Memory for shorter generative paths and Local Consistency Memory for temporal coherence, yielding higher success rates and 2.9x faster inference on simulation and real-world robotic benchmarks.
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory cs.CL · 2025-11-25 · unverdicted · none · ref 259
Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.
From Abstraction to Instantiation: Learning Behavioral Representation for Vision-Language-Action Model cs.CV · 2026-05-21 · unverdicted · none · ref 14
BehaviorVLA introduces a symmetric encoder-decoder architecture with causal Mamba and phase conditioning to learn unified long-horizon behavioral representations for improved generalization in VLA models.
ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement cs.RO · 2026-04-22 · unverdicted · none · ref 18
ALAS disentangles environment and self-state streams via bio-inspired modules to deliver 23% higher subtask success and 29% better execution efficiency on long-horizon HSI tasks.
A Survey of Reinforcement Learning for Large Reasoning Models cs.CL · 2025-09-10 · accept · none · ref 294
A survey compiling RL methods, challenges, data resources, and applications for enhancing reasoning in large language models and large reasoning models since DeepSeek-R1.

Optimus-3: Dual-router aligned mixture-of-experts agent with dual-granularity reasoning-aware policy optimization

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer