Zhuoyang Zhang

Identifiers

  • name variant Zhuoyang Zhang 0.60 · backfill

Papers (9)

  1. Grounded 3D-Aware Spatial Vision-Language Modeling cs.CV · 2026 · author #6
  2. JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search cs.CV · 2026 · author #2
  3. Hide to Guide: Learning via Semantic Masking cs.LG · 2026 · author #7
  4. $\pi_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities cs.LG · 2026 · author #87
  5. Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization cs.LG · 2026 · author #8
  6. CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models cs.CV · 2025 · author #5
  7. NVILA: Efficient Frontier Visual Language Models cs.CV · 2024 · author #4
  8. VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation cs.CV · 2024 · author #2
  9. Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model cs.CV · 2023 · author #3

Mentions

  • 2605.30307 #6 · arxiv_oai · confidence 0.70 Zhuoyang Zhang
  • 2605.26636 #2 · arxiv_oai · confidence 0.70 Zhuoyang Zhang
  • 2605.25198 #7 · arxiv_oai · confidence 0.70 Zhuoyang Zhang
  • 2310.15110 #3 · arxiv_oai · confidence 0.70 Zhuoyang Zhang
  • 2503.22020 #5 · arxiv_oai · confidence 0.70 Zhuoyang Zhang
  • 2409.04429 #2 · arxiv_oai · confidence 0.70 Zhuoyang Zhang

Frequent Coauthors