hub Mixed citations

arXiv preprint arXiv:2302.00111

2023 · arXiv 2302.00111

Mixed citation behavior. The most common role is background (60% of classified citations).

16 Pith papers cite it.


citation-role summary

background 3 baseline 2

citation-polarity summary

background 3 baseline 2

representative citing papers

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

cs.CL · 2023-09-28 · unverdicted · novelty 8.0

Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.

OA-WAM: Object-Addressable World Action Model for Robust Robot Manipulation

cs.RO · 2026-05-07 · unverdicted · novelty 7.0

OA-WAM uses persistent address vectors and dynamic content vectors in object slots to enable addressable world-action prediction, improving robustness on manipulation benchmarks under scene changes.

Action Agent: Agentic Video Generation Meets Flow-Constrained Diffusion

cs.RO · 2026-05-02 · unverdicted · novelty 7.0

Action Agent pairs LLM-driven video generation with a flow-constrained diffusion transformer to produce velocity commands, raising video success to 86% and delivering 64.7% real-world navigation on a Unitree G1 humanoid.

Privileged Foresight Distillation: Zero-Cost Future Correction for World Action Models

cs.RO · 2026-04-28 · unverdicted · novelty 7.0

Privileged Foresight Distillation distills the residual difference in action predictions with versus without future context into a current-only adapter, yielding consistent gains on LIBERO and RoboTwin benchmarks.

RoboDreamer: Learning Compositional World Models for Robot Imagination

cs.RO · 2024-04-18 · unverdicted · novelty 7.0

RoboDreamer factorizes video generation using language primitives to achieve compositional generalization in robot world models, outperforming monolithic baselines on unseen goals in RT-X.

Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models

cs.RO · 2023-10-16 · conditional · novelty 7.0

SuSIE uses a finetuned InstructPix2Pix diffusion model to propose subgoal images that guide a low-level goal-conditioned policy, achieving SOTA zero-shot performance on CALVIN and real-world manipulation.

Learning Interactive Real-World Simulators

cs.AI · 2023-10-09 · conditional · novelty 7.0

UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.

Fast-WAM: Do World Action Models Need Test-time Future Imagination?

cs.CV · 2026-03-17 · unverdicted · novelty 6.0

Fast-WAM shows that explicit future imagination at test time is not required for strong WAM performance; video modeling during training provides the main benefit.

Flexible Multitask Learning with Factorized Diffusion Policy

cs.RO · 2025-12-26 · unverdicted · novelty 6.0

A factorized modular diffusion policy improves fitting of multimodal robot actions and enables flexible task adaptation without catastrophic forgetting.

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

cs.RO · 2025-09-08 · unverdicted · novelty 6.0

F1 integrates next-scale visual foresight prediction into a Mixture-of-Transformer VLA architecture to reformulate action generation as foresight-guided inverse dynamics, achieving higher success rates on 136 tasks.

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

cs.RO · 2024-11-07 · unverdicted · novelty 6.0

DINO-WM builds world models on pre-trained DINOv2 features to enable zero-shot planning from offline data without rewards or demonstrations.

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations

cs.RO · 2024-02-16 · conditional · novelty 6.0

3D Diffuser Actor unifies diffusion policies with 3D scene features to set new state-of-the-art results on RLBench and CALVIN robot benchmarks.

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

cs.RO · 2023-12-20 · conditional · novelty 6.0

A GPT-style model pre-trained on large video datasets achieves 94.9% success on CALVIN multi-task manipulation and 85.4% zero-shot generalization, outperforming prior baselines.

Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own

cs.RO · 2023-10-04 · unverdicted · novelty 5.0

RLFP and the FAC algorithm combine foundation-model priors for policy, value, and rewards to produce sample-efficient robotic RL that reaches 86% real-robot success after one hour and 100% success on 7/8 Meta-world tasks in under 100k frames.

World Action Models: The Next Frontier in Embodied AI

cs.RO · 2026-05-12 · unverdicted · novelty 4.0

The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.

Active Inference: A method for Phenotyping Agency in AI systems?

cs.AI · 2026-04-25 · unverdicted · novelty 4.0

Active inference offers a variational way to phenotype agency in AI systems by measuring empowerment in generative models via a T-maze paradigm.

citing papers explorer

Showing 16 of 16 citing papers.

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
cs.CL · 2023-09-28 · unverdicted · none · ref 97

OA-WAM: Object-Addressable World Action Model for Robust Robot Manipulation
cs.RO · 2026-05-07 · unverdicted · none · ref 16

Action Agent: Agentic Video Generation Meets Flow-Constrained Diffusion
cs.RO · 2026-05-02 · unverdicted · none · ref 26

Privileged Foresight Distillation: Zero-Cost Future Correction for World Action Models
cs.RO · 2026-04-28 · unverdicted · none · ref 5

RoboDreamer: Learning Compositional World Models for Robot Imagination
cs.RO · 2024-04-18 · unverdicted · none · ref 38

Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models
cs.RO · 2023-10-16 · conditional · none · ref 16

Learning Interactive Real-World Simulators
cs.AI · 2023-10-09 · conditional · none · ref 150

Fast-WAM: Do World Action Models Need Test-time Future Imagination?
cs.CV · 2026-03-17 · unverdicted · none · ref 10

Flexible Multitask Learning with Factorized Diffusion Policy
cs.RO · 2025-12-26 · unverdicted · none · ref 16

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
cs.RO · 2025-09-08 · unverdicted · none · ref 11

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
cs.RO · 2024-11-07 · unverdicted · none · ref 14

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
cs.RO · 2024-02-16 · conditional · none · ref 45

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
cs.RO · 2023-12-20 · conditional · none · ref 101

Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own
cs.RO · 2023-10-04 · unverdicted · none · ref 25

World Action Models: The Next Frontier in Embodied AI
cs.RO · 2026-05-12 · unverdicted · none · ref 8

Active Inference: A method for Phenotyping Agency in AI systems?
cs.AI · 2026-04-25 · unverdicted · none · ref 19
