Prioritized level replay

Minqi Jiang, Edward Grefenstette, Tim Rocktäschel · 2020 · arXiv 2010.03934

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Robots Need More than VLA and World Models

cs.RO · 2026-06-04 · unverdicted · novelty 5.0

The paper identifies four missing interfaces (data autolabelling, embodiment retargeting, physics-grounded world models, and video-based reward inference) as the central bottleneck beyond VLA scaling for robot intelligence.

Trading Human Curation for Synthetic Augmentation in RLVR

cs.LG · 2026-06-02 · unverdicted · novelty 4.0

Gated synthetic augmentations can substitute for additional human-authored RLVR tasks at a cost-adjusted trade rate of 1.4x-11.6x while retaining held-out generalization on ten benchmarks spanning code, instruction following, reasoning, and agentic function calling.

Learning to Reason at the Frontier of Learnability

cs.LG · 2025-02-17 · unverdicted · novelty 4.0

A curriculum sampling questions with high variance in success rate improves reinforcement learning performance for LLM reasoning tasks.

citing papers explorer

Robots Need More than VLA and World Models · cs.RO · 2026-06-04 · unverdicted · none · ref 199
Trading Human Curation for Synthetic Augmentation in RLVR · cs.LG · 2026-06-02 · unverdicted · none · ref 26
