Chen-Yu Wei

Identifiers

  • name variant Chen-Yu Wei 0.60 · backfill

Papers (44)

  1. On the Complexity of Offline Reinforcement Learning with $Q^\star$-Approximation and Partial Coverage cs.LG · 2026 · author #3
  2. An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction cs.LG · 2025 · author #4
  3. Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback cs.LG · 2024 · author #3
  4. How Does Variance Shape the Regret in Contextual Bandits? cs.LG · 2024 · author #4
  5. Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification cs.LG · 2024 · author #4
  6. Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data cs.LG · 2024 · author #4
  7. Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games cs.LG · 2024 · author #3
  8. Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback cs.LG · 2023 · author #2
  9. Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits cs.LG · 2023 · author #2
  10. Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs math.OC · 2023 · author #2
  11. No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions cs.LG · 2023 · author #5
  12. First- and Second-Order Bounds for Adversarial Linear Contextual Bandits cs.LG · 2023 · author #5
  13. Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback cs.GT · 2023 · author #3
  14. A Blackbox Approach to Best of Both Worlds in Bandits and Beyond cs.LG · 2023 · author #2
  15. Best of Both Worlds Policy Optimization cs.LG · 2023 · author #2
  16. Refined Regret for Adversarial MDPs with Linear Function Approximation cs.LG · 2023 · author #3
  17. A Unified Algorithm for Stochastic Path Problems cs.LG · 2022 · author #2
  18. Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning stat.ML · 2022 · author #2
  19. Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence cs.LG · 2022 · author #2
  20. Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure cs.LG · 2021 · author #2
  21. Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses cs.LG · 2021 · author #2
  22. Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously cs.LG · 2021 · author #3
  23. Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach cs.LG · 2021 · author #1
  24. Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games cs.LG · 2021 · author #1
  25. Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications cs.LG · 2021 · author #3
  26. Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition cs.LG · 2020 · author #3
  27. Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation cs.LG · 2020 · author #1
  28. Linear Last-iterate Convergence in Constrained Saddle-point Optimization cs.LG · 2020 · author #1
  29. Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs cs.LG · 2020 · author #3
  30. A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret cs.LG · 2020 · author #2
  31. Federated Residual Learning cs.LG · 2020 · author #3
  32. Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds cs.LG · 2020 · author #2
  33. Taking a hint: How to leverage loss predictors in contextual bandits? cs.LG · 2020 · author #1
  34. Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes cs.LG · 2019 · author #1
  35. Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator cs.LG · 2019 · author #3
  36. Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case cs.LG · 2019 · author #5
  37. A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free cs.LG · 2019 · author #4
  38. Improved Path-length Regret Bounds for Bandits cs.LG · 2019 · author #4
  39. Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously cs.LG · 2019 · author #3
  40. Efficient Online Portfolio with Logarithmic Regret cs.LG · 2018 · author #2
  41. More Adaptive Algorithms for Adversarial Bandits cs.LG · 2018 · author #1
  42. Online Reinforcement Learning in Stochastic Games cs.LG · 2017 · author #1
  43. Tracking the Best Expert in Non-stationary Stochastic Environments cs.LG · 2017 · author #1
  44. Efficient Contextual Bandits in Non-stationary Worlds cs.LG · 2017 · author #2

Mentions

  • 2410.12713 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2411.06739 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2410.07533 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2401.15240 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2403.17091 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2306.11700 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2303.02738 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2305.17380 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2310.11550 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2309.00814 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2301.12942 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2305.00832 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2302.09739 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2302.09408 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2210.09255 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2202.04129 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2202.05318 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.01046 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2111.00781 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.05406 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2107.08346 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.04540 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2012.04053 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2102.05858 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2007.11849 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.03490 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.09517 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.04354 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2006.08040 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.01922 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2003.12880 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1910.07072 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1910.01249 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1901.08779 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1712.00578 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1902.02244 #5 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1902.00980 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1901.10604 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1708.01799 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1805.07430 #2 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1801.03265 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 1712.00579 #1 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2602.12107 #3 · arxiv_oai · confidence 0.70 Chen-Yu Wei
  • 2508.11931 #4 · arxiv_oai · confidence 0.70 Chen-Yu Wei

Frequent Coauthors