Accelerating multi-task temporal difference learning under low-rank representation
1 Pith paper cites this work. Polarity classification is still indexing.
Citation-role summary: background 1
Fields: cs.LG 1
Years: 2026 1
Verdicts: UNVERDICTED 1
Roles: background 1
Polarities: background 1
Representative citing papers
Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards
A low-rank matrix estimation method in a reward-free RL framework learns shared representations across linear MDPs and yields near-optimal policies with characterized regret bounds under relaxed feature assumptions.