The reciprocity gradient allows agents to learn near-optimal context-sensitive policies by analytically propagating reward gradients through reputation chains in multi-agent settings.
Bottom-up reputation promotes cooperation with multi-agent reinforcement learning.arXiv preprint arXiv:2502.01971
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
unclear 1representative citing papers
Coupling exploration rates to local reputation differences and using asymmetric reputation updates in Q-learning promotes the evolution of cooperation in multi-agent evolutionary games.
citing papers explorer
-
The Reciprocity Gradient
The reciprocity gradient allows agents to learn near-optimal context-sensitive policies by analytically propagating reward gradients through reputation chains in multi-agent settings.
-
Reinforcement learning with reputation-based adaptive exploration promotes the evolution of cooperation
Coupling exploration rates to local reputation differences and using asymmetric reputation updates in Q-learning promotes the evolution of cooperation in multi-agent evolutionary games.