Finite-time analysis of decentralized temporal-difference learning with linear function approximation
1 Pith paper cites this work. Polarity classification is still being indexed.
fields: math.OC (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Distributed TD Tracking with Linear Function Approximation over Directed Communication Networks
PP-DTD achieves linear convergence to a neighborhood of the optimum under constant step-sizes and O(T^{-1}) under decaying step-sizes for distributed TD policy evaluation in MARL over directed graphs, claimed as the first with rates comparable to single-agent TD.