3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Citing papers:
- Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
  DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
- Balancing Efficiency and Fairness in Traffic Light Control through Deep Reinforcement Learning
  A deep RL traffic light controller dynamically balances vehicle and pedestrian flows to cut congestion while delivering equitable service to both user types.
- Pricing, Matching, and Bundling: an Equilibrium Analysis of Online Platforms
  Pricing, matching, and bundling act as complementary levers that platforms can adjust to balance their own profitability against overall market welfare in equilibrium.