Wenhu Chen

Identifiers

  • name variant Wenhu Chen 0.60 · backfill

Papers (38)

  1. The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence cs.AI · 2026 · author #129
  2. Starve to Perceive: Taming Lazy Perception in VLMs with Constrained Visual Bandwidth cs.CV · 2026 · author #4
  3. Bad Seeing or Bad Thinking? Rewarding Perception for Multimodal Reasoning cs.AI · 2026 · author #6
  4. WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors cs.CV · 2026 · author #13
  5. RewardHarness: Self-Evolving Agentic Post-Training cs.AI · 2026 · author #12
  6. Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction cs.IR · 2026 · author #16
  7. Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling cs.CV · 2026 · author #22
  8. Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation cs.CV · 2026 · author #12
  9. ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning cs.CV · 2026 · author #5
  10. MMEB-V3: Measuring the Performance Gaps of Omni-Modality Embedding Models cs.IR · 2026 · author #9
  11. RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time cs.AI · 2026 · author #6
  12. ClawBench: Can AI Agents Complete Everyday Online Tasks? cs.CL · 2026 · author #20
  13. VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction cs.CV · 2026 · author #5
  14. VisCoder2: Building Multi-Language Visualization Coding Agents cs.SE · 2025 · author #11
  15. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities cs.CL · 2025 · author #93
  16. VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents cs.CV · 2025 · author #12
  17. StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs cs.SE · 2025 · author #20
  18. Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning cs.CV · 2025 · author #5
  19. VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning cs.LG · 2025 · author #6
  20. VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks cs.CV · 2024 · author #6
  21. MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark cs.CL · 2024 · author #12
  22. MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark cs.CL · 2024 · author #17
  23. MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI cs.CL · 2023 · author #22
  24. MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning cs.CL · 2023 · author #8
  25. Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks cs.CL · 2022 · author #1
  26. Global Textual Relation Embedding for Relational Understanding cs.CL · 2019 · author #4
  27. Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention cs.CL · 2019 · author #1
  28. How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection cs.CL · 2019 · author #1
  29. A Variational Dirichlet Framework for Out-of-Distribution Detection cs.LG · 2018 · author #1
  30. Approximate Distribution Matching for Sequence-to-Sequence Learning cs.CL · 2018 · author #1
  31. XL-NBT: A Cross-lingual Neural Belief Tracking Framework cs.CL · 2018 · author #1
  32. Triangular Architecture for Rare Language Translation cs.CL · 2018 · author #2
  33. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling cs.CL · 2018 · author #2
  34. Variational Knowledge Graph Reasoning cs.AI · 2018 · author #1
  35. Video Captioning via Hierarchical Reinforcement Learning cs.CV · 2017 · author #2
  36. Generative Bridging Network in Neural Sequence Prediction cs.AI · 2017 · author #1
  37. A Semi-supervised Framework for Image Captioning cs.CV · 2016 · author #1
  38. Guided Alignment Training for Topic-Aware Neural Machine Translation cs.CL · 2016 · author #1

Mentions

  • 2605.14054 #6 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2605.26494 #129 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2602.13294 #5 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2605.18603 #4 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2604.24763 #12 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2507.04590 #12 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2309.05653 #8 · arxiv_oai · confidence 0.70 Wenhu Chen
  • 2410.05160 #6 · arxiv_oai · confidence 0.70 Wenhu Chen

Frequent Coauthors