Meta-world: A benchmark and evaluation for multi-task and meta reinforcement learning

Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, Sergey Levine · 2020

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

citation-role summary

dataset 2 background 1

citation-polarity summary

use dataset 2 background 1

representative citing papers

Why Does Agentic Safety Fail to Generalize Across Tasks?

cs.LG · 2026-05-07 · conditional · novelty 6.0

Agentic safety fails to generalize across tasks because the task-to-safe-controller mapping has a higher Lipschitz constant than the task-to-controller mapping alone, as proven in linear-quadratic control and demonstrated in quadcopter and LLM experiments.

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

cs.RO · 2026-04-24 · unverdicted · novelty 6.0

A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.

Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model

cs.RO · 2026-04-03 · conditional · novelty 6.0

MV-VDP jointly predicts multi-view RGB and heatmap videos via diffusion to achieve data-efficient, robust robotic manipulation policies.

Hyper-DP3: Frequency-Aware Right-Sizing of 3D Diffusion Policies for Visuomotor Control

cs.RO · 2026-05-02 · 3 refs

citing papers explorer

Showing 4 of 4 citing papers.

Why Does Agentic Safety Fail to Generalize Across Tasks? cs.LG · 2026-05-07 · conditional · none · ref 123
Agentic safety fails to generalize across tasks because the task-to-safe-controller mapping has a higher Lipschitz constant than the task-to-controller mapping alone, as proven in linear-quadratic control and demonstrated in quadcopter and LLM experiments.
dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model cs.RO · 2026-04-24 · unverdicted · none · ref 47
A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.
Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model cs.RO · 2026-04-03 · conditional · none · ref 18
MV-VDP jointly predicts multi-view RGB and heatmap videos via diffusion to achieve data-efficient, robust robotic manipulation policies.
Hyper-DP3: Frequency-Aware Right-Sizing of 3D Diffusion Policies for Visuomotor Control cs.RO · 2026-05-02 · unreviewed · ref 33 · 3 links

Meta-world: A benchmark and evaluation for multi-task and meta reinforcement learning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer