Yu Qiao
Identifiers
- name variant Yu Qiao 0.60 · backfill
Papers (105)
- Faithful, Enriched, and Precise: Benchmarking Natural-Science Illustration Generation by T2I models cs.CV · 2026 · author #10
- Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction cs.CV · 2026 · author #7
- CauTion: Knowing When to Trust LLMs for Ensemble Causal Discovery cs.LG · 2026 · author #5
- Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO cs.LG · 2026 · author #10
- PARE: Pruning and Adaptive Routing for Efficient Video Generation cs.CV · 2026 · author #4
- Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling cs.AI · 2026 · author #25
- MARBLE: Multi-Aspect Reward Balance for Diffusion RL cs.CV · 2026 · author #4
- Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning cs.CL · 2026 · author #7
- StableI2I: Spotting Unintended Changes in Image-to-Image Transition cs.CV · 2026 · author #7
- FedDAP: Domain-Aware Prototype Learning for Federated Learning under Domain Shift cs.CV · 2026 · author #3
- Domain-Aware Hybrid Quantum Learning via Correlation-Guided Circuit Design for Crime Pattern Analytics cs.LG · 2026 · author #4
- MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale cs.CV · 2026 · author #40
- SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond cs.LG · 2026 · author #16
- Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models cs.CV · 2026 · author #14
- RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling cs.CV · 2025 · author #9
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy cs.RO · 2025 · author #10
- Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark cs.CV · 2025 · author #9
- MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing cs.CV · 2025 · author #57
- GenExam: A Multidisciplinary Text-to-Image Exam cs.CV · 2025 · author #5
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency cs.CV · 2025 · author #73
- A Survey on Foundation Models for Personalized Federated Intelligence cs.AI · 2025 · author #1
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models cs.CV · 2025 · author #49
- VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning cs.CV · 2025 · author #8
- VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness cs.CV · 2025 · author #11
- MM-Eureka: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning cs.CV · 2025 · author #13
- AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems cs.RO · 2025 · author #26
- InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling cs.CV · 2025 · author #14
- VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling cs.CV · 2024 · author #11
- Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling cs.CV · 2024 · author #40
- Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization cs.CL · 2024 · author #10
- OS-ATLAS: A Foundation Action Model for Generalist GUI Agents cs.CL · 2024 · author #11
- Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation cs.CV · 2024 · author #9
- MinerU: An Open-Source Solution for Precise Document Content Extraction cs.CV · 2024 · author #16
- InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output cs.CV · 2024 · author #25
- How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites cs.CV · 2024 · author #33
- Are We on the Right Way for Evaluating Large Vision-Language Models? cs.CV · 2024 · author #9
- InternLM2 Technical Report cs.CL · 2024 · author #99
- InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model cs.CV · 2024 · author #21
- Latte: Latent Diffusion Transformer for Video Generation cs.CV · 2024 · author #8
- InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks cs.CV · 2023 · author #14
- MVBench: A Comprehensive Multi-modal Video Understanding Benchmark cs.CV · 2023 · author #12
- SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models cs.CV · 2023 · author #16
- InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition cs.CV · 2023 · author #19
- InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation cs.CV · 2023 · author #16
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning cs.CV · 2023 · author #6
- Faster Segment Anything: Towards Lightweight SAM for Mobile Applications cs.CV · 2023 · author #3
- Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory cs.AI · 2023 · author #11
- VideoChat: Chat-Centric Video Understanding cs.CV · 2023 · author #9
- LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model cs.CV · 2023 · author #12
- LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention cs.CV · 2023 · author #10
- InternVideo: General Video Foundation Models via Generative and Discriminative Learning cs.CV · 2022 · author #17
- Product Image Recognition with Guidance Learning and Noisy Supervision cs.CV · 2019 · author #6
- Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression cs.CV · 2019 · author #6
- Suppressing Model Overfitting for Image Super-Resolution Networks cs.CV · 2019 · author #3
- P2SGrad: Refined Gradients for Optimizing Deep Face Models cs.CV · 2019 · author #5
- AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations cs.CV · 2019 · author #3
- Modulating Image Restoration with Continual Levels via Adaptive Feature Modification Layers cs.CV · 2019 · author #3
- Gluing action groupoids: Fredholm conditions and layer potentials math.OA · 2018 · author #3
- Super-Identity Convolutional Neural Network for Face Hallucination cs.CV · 2018 · author #5
- PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report cs.CV · 2018 · author #42
- ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks cs.CV · 2018 · author #8
- Fredholm Groupoids and Layer Potentials on Conical Domains math.OA · 2018 · author #2
- Prostate Segmentation using 2D Bridged U-net cs.CV · 2018 · author #4
- Knowledge-based Fully Convolutional Network and Its Application in Segmentation of Lung CT Images cs.CV · 2018 · author #2
- Boosting up Scene Text Detectors with Guided CNN cs.CV · 2018 · author #6
- SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters cs.CV · 2018 · author #5
- An end-to-end TextSpotter with Explicit Alignment and Attention cs.CV · 2018 · author #5
- LSTD: A Low-Shot Transfer Detector for Object Detection cs.CV · 2018 · author #4
- Structured Triplet Learning with POS-tag Guided Attention for Visual Question Answering cs.CV · 2018 · author #5
- FOTS: Fast Oriented Text Spotting with a Unified Network cs.CV · 2018 · author #5
- Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward cs.CV · 2017 · author #2
- Deep Embedding Convolutional Neural Network for Synthesizing CT Image from T1-Weighted MR Image cs.CV · 2017 · author #5
- Single Shot Text Detector with Regional Attention cs.CV · 2017 · author #5
- Temporal Segment Networks for Action Recognition in Videos cs.CV · 2017 · author #4
- Fredholm conditions on non-compact manifolds: theory and examples math.OA · 2017 · author #3
- Analysis of the Mean Field Free Energy Functional of Electrolyte Solution with Non-zero Boundary Conditions and the Generalized PB/PNP Equations with Inhomogeneous Dielectric Permittivity cond-mat.soft · 2017 · author #2
- Range Loss for Deep Face Recognition with Long-tail cs.CV · 2016 · author #5
- Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs cs.CV · 2016 · author #5
- Detecting Text in Natural Image with Connectionist Text Proposal Network cs.CV · 2016 · author #5
- Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images cs.CV · 2016 · author #3
- Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition cs.CV · 2016 · author #5
- Temporal Segment Networks: Towards Good Practices for Deep Action Recognition cs.CV · 2016 · author #4
- CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016 cs.CV · 2016 · author #8
- DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification cs.CV · 2016 · author #2
- Real-time Action Recognition with Enhanced Motion Vector CNNs cs.CV · 2016 · author #4
- Actionness Estimation Using Hybrid Fully Convolutional Networks cs.CV · 2016 · author #2
- Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks cs.CV · 2016 · author #4
- Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network cs.CV · 2016 · author #3
- Locally-Supervised Deep Hybrid Model for Scene Recognition cs.CV · 2016 · author #4
- Improvements in continuum modeling for biomolecular systems physics.bio-ph · 2015 · author #1
- Better Exploiting OS-CNNs for Better Event Recognition in Images cs.CV · 2015 · author #4
- Text-Attentional Convolutional Neural Networks for Scene Text Detection cs.CV · 2015 · author #3
- Local Multi-Grouped Binary Descriptor with Ring-based Pooling Configuration and Optimization cs.CV · 2015 · author #3
- A local approximation of fundamental measure theory incorporated into three dimensional Poisson-Nernst-Planck equations to account for hard sphere repulsion among ions physics.chem-ph · 2015 · author #1
- Places205-VGGNet Models for Scene Recognition cs.CV · 2015 · author #4
- Local Color Contrastive Descriptor for Image Classification cs.CV · 2015 · author #3
- Towards Good Practices for Very Deep Two-Stream ConvNets cs.CV · 2015 · author #4
- Reading Scene Text in Deep Convolutional Sequences cs.CV · 2015 · author #3
- Boosting Optical Character Recognition: A Super-Resolution Approach cs.CV · 2015 · author #5
- Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors cs.CV · 2015 · author #2
- Object-Scene Convolutional Neural Networks for Event Recognition in Images cs.CV · 2015 · author #4
- Bag of Visual Words and Fusion Methods for Action Recognition: Comprehensive Study and Good Practice cs.CV · 2014 · author #4
- A Study on Unsupervised Dictionary Learning and Feature Encoding for Action Classification cs.CV · 2013 · author #3
- Uniform shift estimates for transmission problems and optimal rates of convergence for the parametric Finite Element Method math.NA · 2012 · author #3
- Layer potentials C*-algebras of domains with conical points math.OA · 2011 · author #2
Mentions
- 1509.06557 #3 · backfill · confidence 0.70 Yu Qiao
- 2606.05949 #10 · arxiv_oai · confidence 0.70 Yu Qiao
- 2606.05769 #7 · arxiv_oai · confidence 0.70 Yu Qiao
- 1508.06427 #1 · backfill · confidence 0.70 Yu Qiao
- 1508.01667 #4 · backfill · confidence 0.70 Yu Qiao
- 1508.00307 #3 · backfill · confidence 0.70 Yu Qiao
- 1507.02159 #4 · backfill · confidence 0.70 Yu Qiao
- 1506.04395 #3 · backfill · confidence 0.70 Yu Qiao
- 1212.6287 #3 · arxiv_oai · confidence 0.70 Yu Qiao
- 1506.02211 #5 · backfill · confidence 0.70 Yu Qiao
- 1505.04868 #2 · backfill · confidence 0.70 Yu Qiao
- 1505.00296 #4 · backfill · confidence 0.70 Yu Qiao
- 2606.03602 #5 · arxiv_oai · confidence 0.70 Yu Qiao
- 2605.30789 #10 · arxiv_oai · confidence 0.70 Yu Qiao
- 1405.4506 #4 · backfill · confidence 0.70 Yu Qiao
- 1309.0309 #3 · backfill · confidence 0.70 Yu Qiao
- 2605.27336 #4 · arxiv_oai · confidence 0.70 Yu Qiao
- 1212.6287 #3 · backfill · confidence 0.70 Yu Qiao
- 1111.5754 #2 · backfill · confidence 0.70 Yu Qiao
- 2505.06907 #1 · arxiv_oai · confidence 0.70 Yu Qiao
- 2410.05363 #9 · arxiv_oai · confidence 0.70 Yu Qiao
- 2501.00574 #11 · arxiv_oai · confidence 0.70 Yu Qiao
- 2306.14289 #3 · arxiv_oai · confidence 0.70 Yu Qiao
- 2311.17005 #12 · arxiv_oai · confidence 0.70 Yu Qiao
- 2309.15112 #19 · arxiv_oai · confidence 0.70 Yu Qiao
- 2509.22186 #57 · arxiv_oai · confidence 0.70 Yu Qiao
- 2407.03320 #25 · arxiv_oai · confidence 0.70 Yu Qiao
- 2401.16420 #21 · arxiv_oai · confidence 0.70 Yu Qiao
- 2311.07575 #16 · arxiv_oai · confidence 0.70 Yu Qiao
- 2501.12386 #14 · arxiv_oai · confidence 0.70 Yu Qiao
- 2212.03191 #17 · arxiv_oai · confidence 0.70 Yu Qiao
- 2411.10442 #10 · arxiv_oai · confidence 0.70 Yu Qiao
- 2409.18839 #16 · arxiv_oai · confidence 0.70 Yu Qiao
- 2504.06958 #8 · arxiv_oai · confidence 0.70 Yu Qiao
- 2305.17144 #11 · arxiv_oai · confidence 0.70 Yu Qiao
Frequent Coauthors
- Limin Wang 26 shared papers
- Dahua Lin 16 shared papers
- Conghui He 13 shared papers
- Weilin Huang 11 shared papers
- Wenhai Wang 11 shared papers
- Yi Wang 11 shared papers
- Kai Chen 10 shared papers
- Yinan He 10 shared papers
- Zhe Wang 10 shared papers
- Bin Wang 9 shared papers
- Jifeng Dai 9 shared papers
- Kaipeng Zhang 9 shared papers
- Wei Li 9 shared papers
- Yali Wang 9 shared papers
- Jiaqi Wang 7 shared papers
- Linke Ouyang 7 shared papers
- Ping Luo 7 shared papers
- Xiaoou Tang 7 shared papers
- Xiaoyi Dong 7 shared papers
- Xizhou Zhu 7 shared papers