Kernel-based reinforcement learning: A finite-time analysis
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- Generalisation in Multitask Fitted Q-Iteration and Offline Q-learning
Multitask offline fitted Q-iteration achieves 1/sqrt(nT) generalization rates under shared low-rank structure and reduces complexity for new tasks by reusing the upstream representation.