A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419):1140–1144

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis · 2018

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Heterogeneous Self-Play for Realistic Highway Traffic Simulation

cs.AI · 2026-03-31 · accept · novelty 6.0

PHASE uses heterogeneous self-play and context-conditioned policies to achieve realistic, zero-shot highway traffic simulation that outperforms traditional rule-based and self-play models on real-world datasets.

citing papers explorer

Showing 1 of 1 citing paper.

Heterogeneous Self-Play for Realistic Highway Traffic Simulation · cs.AI · 2026-03-31 · accept · none · ref 20
