Modeling other players with Bayesian beliefs for games with incomplete information. arXiv preprint arXiv:2405.14122
3 Pith papers cite this work.
fields: cs.LG (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry
  MSRL represents trajectory segments as PSD matrices to prove additive composition properties and bootstrap value functions for better transfer, reaching 0.73 AUC versus 0.57-0.65 baselines.
- Metric-Gradient Projection for Stable Multi-Agent Policy Learning
  HPML projects multi-agent update fields onto the closest metric-gradient potential flow via Hodge decomposition, yielding Lyapunov potentials and equilibrium-gap bounds.
- Operator-Guided Invariance Learning for Continuous Reinforcement Learning
  VPSD-RL discovers exact and approximate value-preserving Lie-group operators in continuous RL to stabilize learning via transition augmentation and consistency regularization.