Tesauro, Temporal difference learning and TD-Gammon, Communications of the ACM 58–68 (1995)

· 1995

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it


representative citing papers

Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning

cs.RO · 2026-05-21 · conditional · novelty 6.0

Multi-agent RL with league self-play trains quadrotors to exceed champion human performance in multi-player races above 22 m/s while cutting collisions by 50% and generalizing zero-shot to safer human interaction.

citing papers explorer

Showing 1 of 1 citing paper.

Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning · cs.RO · 2026-05-21 · conditional · none · ref 16
