InternVL: Scaling up vision foundation models and aligning for generic visual-linguistic tasks

Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation

cs.CV · 2025-04-21 · unverdicted · novelty 6.0

Introduces FG-BMK benchmark and evaluates twelve LVLMs on fine-grained semantic recognition and feature tasks, identifying influences from training paradigms and perturbation sensitivity.

citing papers explorer

Showing 1 of 1 citing paper.

Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation

cs.CV · 2025-04-21 · unverdicted · none · ref 8
