Chen-Yu Wei

Identifiers

  • name variant Chen-Yu Wei 0.60 · backfill

Papers (47)

  1. On the Complexity of Offline Reinforcement Learning with $Q^\star$-Approximation and Partial Coverage cs.LG · 2026 · author #3
  2. An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction cs.LG · 2025 · author #4
  3. Decision Making in Hybrid Environments: A Model Aggregation Approach cs.LG · 2025 · author #2
  4. Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback cs.LG · 2024 · author #3
  5. How Does Variance Shape the Regret in Contextual Bandits? cs.LG · 2024 · author #4
  6. Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification cs.LG · 2024 · author #4
  7. Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data cs.LG · 2024 · author #4
  8. On Tractable $\Phi$-Equilibria in Non-Concave Games cs.GT · 2024 · author #4
  9. Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games cs.LG · 2024 · author #3
  10. Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback cs.LG · 2023 · author #2
  11. Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits cs.LG · 2023 · author #2
  12. Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs math.OC · 2023 · author #2
  13. No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions cs.LG · 2023 · author #5
  14. First- and Second-Order Bounds for Adversarial Linear Contextual Bandits cs.LG · 2023 · author #5
  15. Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback cs.GT · 2023 · author #3
  16. A Blackbox Approach to Best of Both Worlds in Bandits and Beyond cs.LG · 2023 · author #2
  17. Best of Both Worlds Policy Optimization cs.LG · 2023 · author #2
  18. Refined Regret for Adversarial MDPs with Linear Function Approximation cs.LG · 2023 · author #3
  19. A Unified Algorithm for Stochastic Path Problems cs.LG · 2022 · author #2
  20. Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning stat.ML · 2022 · author #2
  21. Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence cs.LG · 2022 · author #2
  22. Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure cs.LG · 2021 · author #2
  23. A Model Selection Approach for Corruption Robust Reinforcement Learning cs.LG · 2021 · author #1
  24. Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses cs.LG · 2021 · author #2
  25. Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously cs.LG · 2021 · author #3
  26. Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach cs.LG · 2021 · author #1
  27. Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games cs.LG · 2021 · author #1
  28. Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications cs.LG · 2021 · author #3
  29. Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition cs.LG · 2020 · author #3
  30. Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation cs.LG · 2020 · author #1
  31. Linear Last-iterate Convergence in Constrained Saddle-point Optimization cs.LG · 2020 · author #1
  32. Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs cs.LG · 2020 · author #3
  33. A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret cs.LG · 2020 · author #2
  34. Federated Residual Learning cs.LG · 2020 · author #3
  35. Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds cs.LG · 2020 · author #2
  36. Taking a hint: How to leverage loss predictors in contextual bandits? cs.LG · 2020 · author #1
  37. Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes cs.LG · 2019 · author #1
  38. Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator cs.LG · 2019 · author #3
  39. Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case cs.LG · 2019 · author #5
  40. A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free cs.LG · 2019 · author #4
  41. Improved Path-length Regret Bounds for Bandits cs.LG · 2019 · author #4
  42. Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously cs.LG · 2019 · author #3
  43. Efficient Online Portfolio with Logarithmic Regret cs.LG · 2018 · author #2
  44. More Adaptive Algorithms for Adversarial Bandits cs.LG · 2018 · author #1
  45. Online Reinforcement Learning in Stochastic Games cs.LG · 2017 · author #1
  46. Tracking the Best Expert in Non-stationary Stochastic Environments cs.LG · 2017 · author #1
  47. Efficient Contextual Bandits in Non-stationary Worlds cs.LG · 2017 · author #2

Mentions

  • 2502.05974 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2403.08171 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2110.03580 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2410.12713 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2411.06739 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2410.07533 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2401.15240 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2403.17091 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2306.11700 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2303.02738 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2305.17380 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2310.11550 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2309.00814 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2301.12942 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2305.00832 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2302.09739 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2302.09408 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2210.09255 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2202.04129 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2202.05318 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.01046 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2111.00781 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.05406 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2107.08346 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.04540 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2012.04053 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.05858 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2007.11849 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.03490 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.09517 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.04354 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.08040 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.01922 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.12880 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1910.07072 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1910.01249 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1901.08779 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1712.00578 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1902.02244 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1902.00980 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1901.10604 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1708.01799 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1805.07430 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1801.03265 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1712.00579 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2602.12107 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2508.11931 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei

Frequent Coauthors