Dahua Lin

Identifiers

name variant · Dahua Lin · 0.60 · backfill

Papers (91)

Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent cs.CL · 2026 · author #18
MemoryWAM: Efficient World Action Modeling with Persistent Memory cs.RO · 2026 · author #9
Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games cs.CV · 2026 · author #5
SpecGen: Accelerating Agentic Kernel Optimization with Speculative Generation cs.DC · 2026 · author #7
PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory cs.CV · 2026 · author #5
CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning cs.CV · 2026 · author #13
AdaGRPO: A Capability-Aware Adaptive Enhancement for Flow-based GRPO cs.CV · 2026 · author #11
ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning cs.AI · 2026 · author #8
AMix-2: Establishing Protein as a Native Modality in Large Language Models q-bio.BM · 2026 · author #20
SGMD: Score Gradient Matching Distillation for Few-Step Video Diffusion Distillation cs.CV · 2026 · author #7
From Pixels to Words -- Towards Native One-Vision Models at Scale cs.CV · 2026 · author #20
ETCHR: Editing To Clarify and Harness Reasoning cs.CV · 2026 · author #6
NanoCP: Request-Level Dynamic Context Parallelism for Data-Expert Parallel Decoding cs.DC · 2026 · author #12
Beyond Mode Collapse: Distribution Matching for Diverse Reasoning cs.AI · 2026 · author #10
What and When to Distill: Selective Hindsight Distillation for Multi-Turn Agents cs.AI · 2026 · author #8
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture cs.CV · 2026 · author #58
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation cs.CL · 2026 · author #15
ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism cs.DC · 2026 · author #7
OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis cs.AI · 2026 · author #14
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs cs.AI · 2026 · author #12
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale cs.CV · 2026 · author #42
Demystifying Video Reasoning cs.CV · 2026 · author #12
Visual-ERM: Reward Modeling for Visual Equivalence cs.CV · 2026 · author #9
Robo3R: Enhancing Robotic Manipulation with Accurate Feed-Forward 3D Reconstruction cs.RO · 2026 · author #6
EAG-PT: Emission-Aware Gaussians and Path Tracing for Diffuse Indoor Scene Reconstruction and Editing cs.GR · 2026 · author #7
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling cs.CV · 2025 · author #8
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing cs.CV · 2025 · author #59
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency cs.CV · 2025 · author #68
InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling cs.CL · 2025 · author #15
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence cs.CV · 2025 · author #11
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models cs.CV · 2025 · author #48
Visual-RFT: Visual Reinforcement Fine-Tuning cs.CV · 2025 · author #7
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation cs.RO · 2024 · author #5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling cs.CV · 2024 · author #39
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction cs.CV · 2024 · author #11
MinerU: An Open-Source Solution for Precise Document Content Extraction cs.CV · 2024 · author #17
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output cs.CV · 2024 · author #26
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites cs.CV · 2024 · author #32
Are We on the Right Way for Evaluating Large Vision-Language Models? cs.CV · 2024 · author #10
InternLM2 Technical Report cs.CL · 2024 · author #100
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition cs.CV · 2024 · author #8
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model cs.CV · 2024 · author #22
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions cs.CV · 2023 · author #8
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition cs.CV · 2023 · author #20
MMBench: Is Your Multi-modal Model an All-around Player? cs.CV · 2023 · author #12
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning cs.CV · 2023 · author #8
MMDetection: Open MMLab Detection Toolbox and Benchmark cs.CV · 2019 · author #25
POPQORN: Quantifying Robustness of Recurrent Neural Networks cs.LG · 2019 · author #6
Learning to Cluster Faces on an Affinity Graph cs.CV · 2019 · author #6
Libra R-CNN: Towards Balanced Learning for Object Detection cs.CV · 2019 · author #6
Self-Supervised Learning via Conditional Motion Propagation cs.CV · 2019 · author #4
WIDER Face and Pedestrian Challenge 2018: Methods and Results cs.CV · 2019 · author #2
Hybrid Task Cascade for Instance Segmentation cs.CV · 2019 · author #12
Region Proposal by Guided Anchoring cs.CV · 2019 · author #5
Monocular 3D Pose Recovery via Nonconvex Sparsity with Theoretical Analysis cs.CV · 2018 · author #2
A Neural Compositional Paradigm for Image Captioning cs.CV · 2018 · author #3
Improving On-policy Learning with Statistical Reward Accumulation cs.LG · 2018 · author #3
Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition cs.CV · 2018 · author #4
Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation cs.CV · 2018 · author #5
Generative Adversarial Frontal View to Bird View Synthesis cs.CV · 2018 · author #5
Pose Guided Human Video Generation cs.CV · 2018 · author #6
Person Search in Videos with One Portrait Through Visual and Temporal Links cs.CV · 2018 · author #3
Move Forward and Tell: A Progressive Generator of Video Descriptions cs.CV · 2018 · author #3
Rethinking the Form of Latent States in Image Captioning cs.CV · 2018 · author #3
Probabilistic Ensemble of Collaborative Filters cs.IR · 2018 · author #2
From Trailers to Storylines: An Efficient Way to Learn from Movies cs.CV · 2018 · author #5
Unifying Identification and Context Learning for Person Recognition cs.CV · 2018 · author #3
Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination cs.CV · 2018 · author #4
Optimizing Video Object Detection via a Scale-Time Lattice cs.CV · 2018 · author #7
Low-Latency Video Semantic Segmentation cs.CV · 2018 · author #3
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition cs.CV · 2018 · author #3
Accelerated Training for Massive Classification via Dynamic Class Selection cs.CV · 2018 · author #4
Peephole: Predicting Network Performance Before Training cs.LG · 2017 · author #3
Learning Sparse Visual Representations with Leaky Capped Norm Regularizers cs.LG · 2017 · author #2
Be Your Own Prada: Fashion Synthesis with Structural Coherence cs.CV · 2017 · author #4
Contrastive Learning for Image Captioning cs.CV · 2017 · author #2
Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data stat.ML · 2017 · author #2
Integrating Specialized Classifiers Based on Continuous Time Markov Chain cs.LG · 2017 · author #2
Discover and Learn New Objects from Documentaries cs.CV · 2017 · author #4
Temporal Segment Networks for Action Recognition in Videos cs.CV · 2017 · author #5
Temporal Action Detection with Structured Segment Networks cs.CV · 2017 · author #6
Detecting Visual Relationships with Deep Relational Networks cs.CV · 2017 · author #3
Towards Diverse and Natural Image Descriptions via a Conditional GAN cs.CV · 2017 · author #4
UntrimmedNets for Weakly Supervised Action Recognition and Detection cs.CV · 2017 · author #3
A Pursuit of Temporal Accuracy in General Activity Detection cs.CV · 2017 · author #4
PolyNet: A Pursuit of Structural Diversity in Very Deep Networks cs.CV · 2016 · author #4
Deep Markov Random Field for Image Modeling cs.CV · 2016 · author #2
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition cs.CV · 2016 · author #5
CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016 cs.CV · 2016 · author #7
Adjustable Bounded Rectifiers: Towards Deep Binary Representations cs.LG · 2015 · author #2
Generating Multi-Sentence Lingual Descriptions of Indoor Scenes cs.CV · 2015 · author #1

Mentions

2512.15702 #8 · arxiv_oai · confidence 0.70 Dahua Lin
2606.30616 #18 · arxiv_oai · confidence 0.70 Dahua Lin
2606.20562 #9 · arxiv_oai · confidence 0.70 Dahua Lin
2606.19338 #5 · arxiv_oai · confidence 0.70 Dahua Lin
2606.17518 #7 · arxiv_oai · confidence 0.70 Dahua Lin
2606.16449 #5 · arxiv_oai · confidence 0.70 Dahua Lin
2606.09393 #13 · arxiv_oai · confidence 0.70 Dahua Lin
2606.06828 #11 · arxiv_oai · confidence 0.70 Dahua Lin
1511.06201 #2 · backfill · confidence 0.70 Dahua Lin
1711.02857 #2 · arxiv_oai · confidence 0.70 Dahua Lin
2606.03503 #8 · arxiv_oai · confidence 0.70 Dahua Lin
1503.00064 #1 · backfill · confidence 0.70 Dahua Lin
2605.30963 #20 · arxiv_oai · confidence 0.70 Dahua Lin
2605.30116 #7 · arxiv_oai · confidence 0.70 Dahua Lin
2605.28820 #20 · arxiv_oai · confidence 0.70 Dahua Lin
2603.16870 #12 · arxiv_oai · confidence 0.70 Dahua Lin
2505.23764 #11 · arxiv_oai · confidence 0.70 Dahua Lin
2605.23897 #6 · arxiv_oai · confidence 0.70 Dahua Lin
2412.15109 #5 · arxiv_oai · confidence 0.70 Dahua Lin
2605.21100 #12 · arxiv_oai · confidence 0.70 Dahua Lin
2508.08636 #15 · arxiv_oai · confidence 0.70 Dahua Lin
2605.19461 #10 · arxiv_oai · confidence 0.70 Dahua Lin
2605.19447 #8 · arxiv_oai · confidence 0.70 Dahua Lin
2403.13805 #8 · arxiv_oai · confidence 0.70 Dahua Lin
2309.15112 #20 · arxiv_oai · confidence 0.70 Dahua Lin
2509.22186 #59 · arxiv_oai · confidence 0.70 Dahua Lin
2407.03320 #26 · arxiv_oai · confidence 0.70 Dahua Lin
2401.16420 #22 · arxiv_oai · confidence 0.70 Dahua Lin
2409.18839 #17 · arxiv_oai · confidence 0.70 Dahua Lin

Frequent Coauthors

Jiaqi Wang 23 shared papers
Kai Chen 22 shared papers
Yu Qiao 16 shared papers
Conghui He 15 shared papers
Yuhang Zang 14 shared papers
Chen Change Loy 12 shared papers
Xiaoyi Dong 12 shared papers
Yuanjun Xiong 12 shared papers
Haodong Duan 11 shared papers
Xingcheng Zhang 11 shared papers
Wei Li 10 shared papers
Limin Wang 9 shared papers
Ziwei Liu 9 shared papers
Bin Wang 8 shared papers
Bo Dai 8 shared papers
Jiangmiao Pang 8 shared papers
Pan Zhang 8 shared papers
Xiaoou Tang 8 shared papers
Jianping Shi 7 shared papers
Linke Ouyang 7 shared papers