Few-shot learning via learning the representation, provably
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 2
citation-polarity summary
fields: cs.LG 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 2
polarities: background 2
representative citing papers
citing papers explorer
- Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards
A low-rank matrix estimation method in a reward-free RL framework learns shared representations across linear MDPs and yields near-optimal policies with characterized regret bounds under relaxed feature assumptions.
- Multi-Task Representation Learning for Conservative Linear Bandits
CMTRL recovers a shared low-rank feature matrix for T constrained linear bandit tasks in d dimensions using Safe-AltGDmin and provides regret and sample complexity bounds.