Shuai Bai
Identifiers
- name variant: Shuai Bai · 0.60 · backfill
Papers (24)
- Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification · cs.CV · 2026 · author #10
- Qwen-RobotNav Technical Report: A Scalable Navigation Model Designed for an Agentic Navigation System · cs.RO · 2026 · author #30
- Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments · cs.RO · 2026 · author #15
- FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies · cs.RO · 2026 · author #13
- CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents · cs.AI · 2026 · author #10
- MPDocBench-Parse: Benchmarking Practical Multi-page Document Parsing · cs.AI · 2026 · author #8
- Qwen-Image-2.0 Technical Report · cs.CV · 2026 · author #56
- Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding · cs.CV · 2026 · author #6
- CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing · cs.CL · 2026 · author #12
- Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking · cs.CL · 2026 · author #6
- VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models · cs.CV · 2026 · author #8
- Qwen3-VL Technical Report · cs.CV · 2025 · author #1
- Soft Adaptive Policy Optimization · cs.LG · 2025 · author #8
- Unify Robot Actions in Camera Frame · cs.RO · 2025 · author #10
- Revisiting Multimodal Positional Encoding in Vision-Language Models · cs.CV · 2025 · author #7
- Qwen3-Omni Technical Report · cs.CL · 2025 · author #24
- Qwen-Image Technical Report · cs.CV · 2025 · author #8
- Qwen2.5-Omni Technical Report · cs.CL · 2025 · author #6
- Qwen2.5-VL Technical Report · cs.CV · 2025 · author #1
- Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution · cs.CV · 2024 · author #2
- Qwen2 Technical Report · cs.CL · 2024 · author #40
- Qwen Technical Report · cs.CL · 2023 · author #2
- Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond · cs.CV · 2023 · author #2
- Multi-hierarchical Independent Correlation Filters for Visual Tracking · cs.CV · 2018 · author #1
Mentions
- 2606.18112 #30 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2606.18249 #10 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2601.03309 #8 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2605.30280 #15 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2605.27284 #13 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2605.25624 #10 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2605.22100 #8 · arxiv_oai · confidence 0.70 · Shuai Bai
- 2407.10671 #40 · backfill · confidence 0.70 · Shuai Bai
- 2511.21631 #1 · backfill · confidence 0.70 · Shuai Bai
Frequent Coauthors
- Junyang Lin · 17 shared papers
- Dayiheng Liu · 12 shared papers
- Jingren Zhou · 12 shared papers
- An Yang · 8 shared papers
- Peng Wang · 8 shared papers
- Zhibo Yang · 8 shared papers
- Keqin Chen · 7 shared papers
- Xuejing Liu · 7 shared papers
- Bowen Yu · 6 shared papers
- Kai Dang · 6 shared papers
- Rui Men · 6 shared papers
- Jin Xu · 5 shared papers
- Mingsheng Li · 5 shared papers
- Qiuyue Wang · 5 shared papers
- Shijie Wang · 5 shared papers
- Wenbin Ge · 5 shared papers
- Xuancheng Ren · 5 shared papers
- Yang Fan · 5 shared papers
- Chang Zhou · 4 shared papers
- Chenfei Wu · 4 shared papers