{"paper":{"title":"DQN with model-based exploration: efficient learning on environments with sparse rewards","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["stat.ML"],"primary_cat":"cs.LG","authors_text":"Stephen Zhen Gou, Yuyang Liu","submitted_at":"2019-03-22T01:41:50Z","abstract_excerpt":"We propose Deep Q-Networks (DQN) with model-based exploration, an algorithm combining both model-free and model-based approaches that explores better and learns environments with sparse rewards more efficiently. DQN is a general-purpose, model-free algorithm and has been proven to perform well in a variety of tasks including Atari 2600 games since it's first proposed by Minh et el. However, like many other reinforcement learning (RL) algorithms, DQN suffers from poor sample efficiency when rewards are sparse in an environment. As a result, most of the transitions stored in the replay memory ha"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1903.09295","kind":"arxiv","version":1},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}