DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
Multi-task learning as a bargaining game
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
APT augments multi-task learning by adapting advanced optimizers via momentum balancing and light direction preservation, delivering performance gains on four standard MTL datasets.
A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.
citing papers explorer
-
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
-
Delve into the Applicability of Advanced Optimizers for Multi-Task Learning
APT augments multi-task learning by adapting advanced optimizers via momentum balancing and light direction preservation, delivering performance gains on four standard MTL datasets.
-
Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.