Never give up: Learning directed exploration strategies. arXiv preprint arXiv:2002.06038, 2020

Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, et al. · 2020 · arXiv 2002.06038

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Baba in Wonderland: Online Self-Supervised Dynamics Discovery for Executable World Models

cs.AI · 2026-05-16 · unverdicted · novelty 7.0

Alice uses preservation conflicts from failed candidate updates to create class-stratified hypotheses and guide exploration, improving executable world-model learning under prior misalignment.

Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring

cs.LG · 2025-09-29 · unverdicted · novelty 7.0

LPM uses a dual-network design to compute intrinsic rewards from the change in prediction error across iterations, providing a noise-robust signal that is theoretically linked to information gain.

citing papers explorer

Showing 2 of 2 citing papers.

Baba in Wonderland: Online Self-Supervised Dynamics Discovery for Executable World Models · cs.AI · 2026-05-16 · unverdicted · none · ref 1
Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring · cs.LG · 2025-09-29 · unverdicted · none · ref 2