Yuhang Zang

Identifiers

  • name variant Yuhang Zang 0.60 · backfill

Papers (21)

  1. CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning cs.CV · 2026 · author #4
  2. AdaGRPO: A Capability-Aware Adaptive Enhancement for Flow-based GRPO cs.CV · 2026 · author #5
  3. OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs cs.CV · 2026 · author #3
  4. Pave-GRPO: Beyond Instantaneous Guidance through Principled Average Velocity Decomposition cs.CV · 2026 · author #9
  5. Skill-as-Pseudocode: Refactoring Skill Libraries to Pseudocode for LLM Agents cs.PL · 2026 · author #2
  6. ETCHR: Editing To Clarify and Harness Reasoning cs.CV · 2026 · author #4
  7. SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction cs.CV · 2026 · author #4
  8. WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation cs.CL · 2026 · author #17
  9. Visual-ERM: Reward Modeling for Visual Equivalence cs.CV · 2026 · author #10
  10. GraphThinker: Reinforcing Temporally Grounded Video Reasoning with Event Graph Thinking cs.CV · 2026 · author #4
  11. MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing cs.CV · 2025 · author #43
  12. Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning cs.CV · 2025 · author #3
  13. Unified Reward Model for Multimodal Understanding and Generation cs.CV · 2025 · author #2
  14. Visual-RFT: Visual Reinforcement Fine-Tuning cs.CV · 2025 · author #3
  15. PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction cs.CV · 2024 · author #6
  16. InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output cs.CV · 2024 · author #3
  17. Are We on the Right Way for Evaluating Large Vision-Language Models? cs.CV · 2024 · author #5
  18. InternLM2 Technical Report cs.CL · 2024 · author #79
  19. RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition cs.CV · 2024 · author #3
  20. InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model cs.CV · 2024 · author #3
  21. Scene Text Detection with Supervised Pyramid Context Network cs.CV · 2018 · author #2

Mentions

  • 2606.09393 #4 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2606.06828 #5 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2606.03890 #3 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2606.01636 #9 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2605.27955 #2 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2605.23897 #4 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2605.20110 #4 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2403.13805 #3 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2509.22186 #43 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2407.03320 #3 · arxiv_oai · confidence 0.70 Yuhang Zang
  • 2401.16420 #3 · arxiv_oai · confidence 0.70 Yuhang Zang

Frequent Coauthors