Bingxiang He

Identifiers

  • name variant Bingxiang He 0.60 · backfill

Papers (9)

  1. Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe cs.LG · 2026 · author #3
  2. Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization cs.AI · 2026 · author #9
  3. CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing cs.AI · 2026 · author #4
  4. CPMobius: Iterative Coach-Player Reasoning for Data-Free Reinforcement Learning cs.CL · 2026 · author #4
  5. CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents cs.AI · 2025 · author #6
  6. MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe cs.LG · 2025 · author #21
  7. A Survey of Reinforcement Learning for Large Reasoning Models cs.CL · 2025 · author #3
  8. Process Reinforcement through Implicit Rewards cs.LG · 2025 · author #8
  9. UltraFeedback: Boosting Language Models with Scaled AI Feedback cs.CL · 2023 · author #5

Mentions

  • 2602.02979 #4 · arxiv_oai · confidence 0.70 Bingxiang He
  • 2509.08827 #3 · arxiv_oai · confidence 0.70 Bingxiang He
  • 2310.01377 #5 · arxiv_oai · confidence 0.70 Bingxiang He
  • 2509.18154 #21 · arxiv_oai · confidence 0.70 Bingxiang He

Frequent Coauthors