Yu Qiao

Identifiers

name variant · Yu Qiao · 0.60 · backfill

Papers (105)

Faithful, Enriched, and Precise: Benchmarking Natural-Science Illustration Generation by T2I models cs.CV · 2026 · author #10
Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction cs.CV · 2026 · author #7
CauTion: Knowing When to Trust LLMs for Ensemble Causal Discovery cs.LG · 2026 · author #5
Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO cs.LG · 2026 · author #10
PARE: Pruning and Adaptive Routing for Efficient Video Generation cs.CV · 2026 · author #4
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling cs.AI · 2026 · author #25
MARBLE: Multi-Aspect Reward Balance for Diffusion RL cs.CV · 2026 · author #4
Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning cs.CL · 2026 · author #7
StableI2I: Spotting Unintended Changes in Image-to-Image Transition cs.CV · 2026 · author #7
FedDAP: Domain-Aware Prototype Learning for Federated Learning under Domain Shift cs.CV · 2026 · author #3
Domain-Aware Hybrid Quantum Learning via Correlation-Guided Circuit Design for Crime Pattern Analytics cs.LG · 2026 · author #4
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale cs.CV · 2026 · author #40
SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond cs.LG · 2026 · author #16
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models cs.CV · 2026 · author #14
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling cs.CV · 2025 · author #9
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy cs.RO · 2025 · author #10
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark cs.CV · 2025 · author #9
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing cs.CV · 2025 · author #57
GenExam: A Multidisciplinary Text-to-Image Exam cs.CV · 2025 · author #5
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency cs.CV · 2025 · author #73
A Survey on Foundation Models for Personalized Federated Intelligence cs.AI · 2025 · author #1
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models cs.CV · 2025 · author #49
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning cs.CV · 2025 · author #8
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness cs.CV · 2025 · author #11
MM-Eureka: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning cs.CV · 2025 · author #13
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems cs.RO · 2025 · author #26
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling cs.CV · 2025 · author #14
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling cs.CV · 2024 · author #11
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling cs.CV · 2024 · author #40
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization cs.CL · 2024 · author #10
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents cs.CL · 2024 · author #11
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation cs.CV · 2024 · author #9
MinerU: An Open-Source Solution for Precise Document Content Extraction cs.CV · 2024 · author #16
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output cs.CV · 2024 · author #25
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites cs.CV · 2024 · author #33
Are We on the Right Way for Evaluating Large Vision-Language Models? cs.CV · 2024 · author #9
InternLM2 Technical Report cs.CL · 2024 · author #99
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model cs.CV · 2024 · author #21
Latte: Latent Diffusion Transformer for Video Generation cs.CV · 2024 · author #8
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks cs.CV · 2023 · author #14
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark cs.CV · 2023 · author #12
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models cs.CV · 2023 · author #16
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition cs.CV · 2023 · author #19
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation cs.CV · 2023 · author #16
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning cs.CV · 2023 · author #6
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications cs.CV · 2023 · author #3
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory cs.AI · 2023 · author #11
VideoChat: Chat-Centric Video Understanding cs.CV · 2023 · author #9
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model cs.CV · 2023 · author #12
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention cs.CV · 2023 · author #10
InternVideo: General Video Foundation Models via Generative and Discriminative Learning cs.CV · 2022 · author #17
Product Image Recognition with Guidance Learning and Noisy Supervision cs.CV · 2019 · author #6
Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression cs.CV · 2019 · author #6
Suppressing Model Overfitting for Image Super-Resolution Networks cs.CV · 2019 · author #3
P2SGrad: Refined Gradients for Optimizing Deep Face Models cs.CV · 2019 · author #5
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations cs.CV · 2019 · author #3
Modulating Image Restoration with Continual Levels via Adaptive Feature Modification Layers cs.CV · 2019 · author #3
Gluing action groupoids: Fredholm conditions and layer potentials math.OA · 2018 · author #3
Super-Identity Convolutional Neural Network for Face Hallucination cs.CV · 2018 · author #5
PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report cs.CV · 2018 · author #42
ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks cs.CV · 2018 · author #8
Fredholm Groupoids and Layer Potentials on Conical Domains math.OA · 2018 · author #2
Prostate Segmentation using 2D Bridged U-net cs.CV · 2018 · author #4
Knowledge-based Fully Convolutional Network and Its Application in Segmentation of Lung CT Images cs.CV · 2018 · author #2
Boosting up Scene Text Detectors with Guided CNN cs.CV · 2018 · author #6
SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters cs.CV · 2018 · author #5
An end-to-end TextSpotter with Explicit Alignment and Attention cs.CV · 2018 · author #5
LSTD: A Low-Shot Transfer Detector for Object Detection cs.CV · 2018 · author #4
Structured Triplet Learning with POS-tag Guided Attention for Visual Question Answering cs.CV · 2018 · author #5
FOTS: Fast Oriented Text Spotting with a Unified Network cs.CV · 2018 · author #5
Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward cs.CV · 2017 · author #2
Deep Embedding Convolutional Neural Network for Synthesizing CT Image from T1-Weighted MR Image cs.CV · 2017 · author #5
Single Shot Text Detector with Regional Attention cs.CV · 2017 · author #5
Temporal Segment Networks for Action Recognition in Videos cs.CV · 2017 · author #4
Fredholm conditions on non-compact manifolds: theory and examples math.OA · 2017 · author #3
Analysis of the Mean Field Free Energy Functional of Electrolyte Solution with Non-zero Boundary Conditions and the Generalized PB/PNP Equations with Inhomogeneous Dielectric Permittivity cond-mat.soft · 2017 · author #2
Range Loss for Deep Face Recognition with Long-tail cs.CV · 2016 · author #5
Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs cs.CV · 2016 · author #5
Detecting Text in Natural Image with Connectionist Text Proposal Network cs.CV · 2016 · author #5
Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images cs.CV · 2016 · author #3
Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition cs.CV · 2016 · author #5
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition cs.CV · 2016 · author #4
CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016 cs.CV · 2016 · author #8
DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification cs.CV · 2016 · author #2
Real-time Action Recognition with Enhanced Motion Vector CNNs cs.CV · 2016 · author #4
Actionness Estimation Using Hybrid Fully Convolutional Networks cs.CV · 2016 · author #2
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks cs.CV · 2016 · author #4
Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network cs.CV · 2016 · author #3
Locally-Supervised Deep Hybrid Model for Scene Recognition cs.CV · 2016 · author #4
Improvements in continuum modeling for biomolecular systems physics.bio-ph · 2015 · author #1
Better Exploiting OS-CNNs for Better Event Recognition in Images cs.CV · 2015 · author #4
Text-Attentional Convolutional Neural Networks for Scene Text Detection cs.CV · 2015 · author #3
Local Multi-Grouped Binary Descriptor with Ring-based Pooling Configuration and Optimization cs.CV · 2015 · author #3
A local approximation of fundamental measure theory incorporated into three dimensional Poisson-Nernst-Planck equations to account for hard sphere repulsion among ions physics.chem-ph · 2015 · author #1
Places205-VGGNet Models for Scene Recognition cs.CV · 2015 · author #4
Local Color Contrastive Descriptor for Image Classification cs.CV · 2015 · author #3
Towards Good Practices for Very Deep Two-Stream ConvNets cs.CV · 2015 · author #4
Reading Scene Text in Deep Convolutional Sequences cs.CV · 2015 · author #3
Boosting Optical Character Recognition: A Super-Resolution Approach cs.CV · 2015 · author #5
Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors cs.CV · 2015 · author #2
Object-Scene Convolutional Neural Networks for Event Recognition in Images cs.CV · 2015 · author #4
Bag of Visual Words and Fusion Methods for Action Recognition: Comprehensive Study and Good Practice cs.CV · 2014 · author #4
A Study on Unsupervised Dictionary Learning and Feature Encoding for Action Classification cs.CV · 2013 · author #3
Uniform shift estimates for transmission problems and optimal rates of convergence for the parametric Finite Element Method math.NA · 2012 · author #3
Layer potentials C*-algebras of domains with conical points math.OA · 2011 · author #2

Mentions

1509.06557 #3 · backfill · confidence 0.70 Yu Qiao
2606.05949 #10 · arxiv_oai · confidence 0.70 Yu Qiao
2606.05769 #7 · arxiv_oai · confidence 0.70 Yu Qiao
1508.06427 #1 · backfill · confidence 0.70 Yu Qiao
1508.01667 #4 · backfill · confidence 0.70 Yu Qiao
1508.00307 #3 · backfill · confidence 0.70 Yu Qiao
1507.02159 #4 · backfill · confidence 0.70 Yu Qiao
1506.04395 #3 · backfill · confidence 0.70 Yu Qiao
1212.6287 #3 · arxiv_oai · confidence 0.70 Yu Qiao
1506.02211 #5 · backfill · confidence 0.70 Yu Qiao
1505.04868 #2 · backfill · confidence 0.70 Yu Qiao
1505.00296 #4 · backfill · confidence 0.70 Yu Qiao
2606.03602 #5 · arxiv_oai · confidence 0.70 Yu Qiao
2605.30789 #10 · arxiv_oai · confidence 0.70 Yu Qiao
1405.4506 #4 · backfill · confidence 0.70 Yu Qiao
1309.0309 #3 · backfill · confidence 0.70 Yu Qiao
2605.27336 #4 · arxiv_oai · confidence 0.70 Yu Qiao
1212.6287 #3 · backfill · confidence 0.70 Yu Qiao
1111.5754 #2 · backfill · confidence 0.70 Yu Qiao
2505.06907 #1 · arxiv_oai · confidence 0.70 Yu Qiao
2410.05363 #9 · arxiv_oai · confidence 0.70 Yu Qiao
2501.00574 #11 · arxiv_oai · confidence 0.70 Yu Qiao
2306.14289 #3 · arxiv_oai · confidence 0.70 Yu Qiao
2311.17005 #12 · arxiv_oai · confidence 0.70 Yu Qiao
2309.15112 #19 · arxiv_oai · confidence 0.70 Yu Qiao
2509.22186 #57 · arxiv_oai · confidence 0.70 Yu Qiao
2407.03320 #25 · arxiv_oai · confidence 0.70 Yu Qiao
2401.16420 #21 · arxiv_oai · confidence 0.70 Yu Qiao
2311.07575 #16 · arxiv_oai · confidence 0.70 Yu Qiao
2501.12386 #14 · arxiv_oai · confidence 0.70 Yu Qiao
2212.03191 #17 · arxiv_oai · confidence 0.70 Yu Qiao
2411.10442 #10 · arxiv_oai · confidence 0.70 Yu Qiao
2409.18839 #16 · arxiv_oai · confidence 0.70 Yu Qiao
2504.06958 #8 · arxiv_oai · confidence 0.70 Yu Qiao
2305.17144 #11 · arxiv_oai · confidence 0.70 Yu Qiao

Frequent Coauthors

Limin Wang 26 shared papers
Dahua Lin 16 shared papers
Conghui He 13 shared papers
Weilin Huang 11 shared papers
Wenhai Wang 11 shared papers
Yi Wang 11 shared papers
Kai Chen 10 shared papers
Yinan He 10 shared papers
Zhe Wang 10 shared papers
Bin Wang 9 shared papers
Jifeng Dai 9 shared papers
Kaipeng Zhang 9 shared papers
Wei Li 9 shared papers
Yali Wang 9 shared papers
Jiaqi Wang 7 shared papers
Linke Ouyang 7 shared papers
Ping Luo 7 shared papers
Xiaoou Tang 7 shared papers
Xiaoyi Dong 7 shared papers
Xizhou Zhu 7 shared papers